logo_ipparis.png     TelecomParis_endossem_IPP_RVB_100pix.png Telecom Paris
Dep. Informatique & Réseaux

Dessalles_2018.png J-L. DessallesHome page

April 2021

LKR.png Logic and Knowledge Representation
with             Etienne Houzé
                                            other AI courses


Formal logic and automatic theorem proving

This week we’ll continue to study the mechanization of reasoning. Prolog is based on the resolution algorithm, which will be presented first for propositional logic. But Prolog is also able to manipulate variables, something that propositional logic can’t do. We’ll then introduce predicate logic, i.e. a logic with variables about which we "publicly say" something ('predicate’, from the latin verb prae-dicere), and we will focus specifically on first-order logic (FOL). Fortunately, the resolution method also works with first-order logic. The presence of quantifiers ("for all", "there exists") requires however additional mechanisms, in particular the unification method, which is present in Prolog.

We will concentrate for a moment to the notion of proof by refutation, before swiching to predicate logic.

Proof by refutation

Prolog and refutation

Prolog works by refutation. This means that to prove p, it attempts to show that ¬p leads to an absurdity (i.e. it is not satisfiable). How? A prolog program is proximate to a conjunctive normal form (CNF). For instance, a Prolog clause such as:

grand_parent(X,Y) :- parent(X,Z), parent(Z,Y).

is a Horn clause that, rewritten as standard disjunctive clause, becomes :

[grand_parent(X,Y), ¬parent(X,Z), ¬parent(Z,Y)]

A Prolog program consists in the conjunction of its clauses. To prove grand_parent(tom, pat), Prolog temporarily adds the clause [¬grand_parent(tom, pat)] to the program and then tries to produce the empty clause [ ] (which is unsatisfiable) through the process of resolution, discovered by John Alan Robinson in 1965.     

Resolution algorithm

The basic idea of the resolution algorithm is:
For example, consider the following set of clauses:
  1. [a, ¬b, ¬c]
  2. [c, ¬e]
  3. [¬b, e]
  4. [d]
  5. [¬d, b]

Prolog would process these formula as a program:

a :- b, c.
c :- e.
e :- b.
b :- d.

This program can prove a:

?- a.

Let us translate the underlying functioning in Robinson’s terms. The resolution algorithm consists in "annihilating" two opposing terms when two clauses are concatenated. For example, the proposition ¬c in clause 1, [a, ¬b, ¬c] "annihilates" with c in clause 2: [c, ¬e] to give a new clause [a, ¬b, ¬e]. This new clause can be added to the set without modifying its satisfiability.

The idea of the resolution can be paraphrased as:

Let’s continue the resolution process. The last clause, [a, ¬b, ¬e] and clause 3: [¬b, e] resolve into [a, ¬b, ¬b], which resolves with clause 5: [¬d, b] to give [a, ¬d], and finally we end up with clause [a] by resolving with clause 4. The question ?- a. asked to Prolog translates, because we are in refutation mode, into the clause [¬a] being temporarily added to the program. Together with the clause that we have just produced, it resolves into the empty clause [ ]. This explains how Prolog manages to answer True to the question ?- a.

More generally, the resolution algorithm proceeds as follows:

In other words, X is a tautology if and only if a sequence growing from [¬X] contains the empty clause [ ]).
Using the resolution algorithm, prove that:
(((A ⊃ B) ∧ (B ⊃ C)) ⊃ ¬(¬C ∧ A))
To do so, start from an single clause containing the negation of the formula to prove.


We denote by SX the fact that X can be proven from a set S of formulas, which is equivalent to saying that the empty clause can be derived by resolution starting from the conjunction of the elements of S ∪ {¬X}. We denote by ⊢ X the fact that X admits a proof by resolution.
Syntactic Deduction
Show that if S ∪ {X} ⊢ Y, then S ⊢ (X ⊃ Y).
(after answering, read the proof of the reciprocal in the solution).


Note: never forget the difference in nature between the semantic (e.g. propositional) consequence denoted with ⊨ and the syntaxic (proof) consequence, denoted with ⊢!

Soundness and Completeness

The first concern, when one has a proof method such as the resolution algorithm, is to show that it is sound (it only demonstrates tautologies) and complete (it demonstrates all tautologies). Propositional logic admits correct and complete proof methods. The resolution method is one of them; other methods are: truth tables, obviously, but impracticable in predicate logic; the tableau method, which is dual to the resolution method; axiomatic methods, well appreciated in mathematics, easy to automate, but not so functional in practice because they are unable to answer a question given in advance.

The soundness of the method by resolution is verified as follows. We begin by showing that the satisfiability of a sequence (such as the conjunction of the clauses of the current proof) does not change when the sequence grows. In effect, the addition of a new clause by applying an alpha rule or a beta rule does not change the set of evaluations that send the sequence to T. We then observe that a sequence containing the empty clause is not satisfiable (remember that v([]) = F). Finally, we observe that if X admits a proof by resolution, then {¬X} is not satisfiable: since the proof exists, a sequence exists that derives from the clause [¬X] leading to the empty clause, which is unsatisfiable; the sequence was therefore unsatisfiable at all stages of its growth, in particular at its first line [¬X]. If {¬X} is not satisfiable, then X is a tautology, and therefore the method by resolution is sound.

There are also complete proof methods in propositional logic. A complete method is a method that predicts the existence of a proof for each tautology. The truth-table technique provides a manifestly complete method (note that it also provides a decidable procedure, i.e. a mechanical procedure which always leads, in finite time, to a decision concerning the tautological character).

To prove the completeness of the resolution algorithm, it is usually shown that a sequence that cannot lead to the empty clause is satisfiable. However, there are some constructive demonstrations of completeness.

The next exercise uses Prolog to rewrite the resolution algorithm.

Write a automatic prover based on resolution, building upon the predicate cnf used last week. Start from:

prove(F) :-
    cnf([[ -F ]], CNF),
    write('CNF of -'), write(F), write(' = '),
    write(CNF), nl,

We need to write the predicate resolve:

resolve(CNF) :-
    member([ ], CNF),
    write('This is a true formula'), nl.
resolve(CNF) :-
    write('Examining '), write(CNF), nl,
    get0(_),    % waits for user action
    select(C1, CNF, _),            % forgetting this parent clause
    select(C2, CNF, RCNF),    % keeping this parent clause
    . . .    % user 'remove' from cnf program

The predicate resolve has to extract two clauses C1 and C2 from CNF, then to extract P and -P respectively from these two clauses, then to recall itself after having added the concatenation of the remaining parts of these two clauses to CNF (it is recommended to forget one of the two parent clauses). Evidently, the presence of the empty clause has to be checked, and, in the affirmative case, stop.

Try it on ((a ⊃ (b ⊃ c)) ⊃ ((a ⊃ b) ⊃ (a ⊃ c))).

/* TEST */
go :-
    prove( ((a imp (b imp c)) imp ((a imp b) imp (a imp c))) ).

Normally you have to remove all occurrences of P and of -P from the clauses. What do you think about the risks of entering in loops? Which remedies would you think of?     


Predicate logic (first order logic)

A logic for predication

Philosophers of the late nineteenth century noted that a phrase like "All human beings are mortal" was not an atomic proposition. In addition, it meant more than the conjunction of N propositions "xi-is-mortal" describing the status of each living human being who lived or was to live on Earth. The sentence actually expresses an implication: if an entity is a human being, then it is mortal. This sentence thus plays the role of a constraint on the knowledge we have of the world, independently of our ability to enumerate all people. Propositional logic is incapable of describing such a constraint. This is the reason why formalisms as first-order logic have been developed.

The fundamental discovery associated with predicate logic is quantification. Logic, before Frege, struggled with paradoxes such as the non-equivalence of the two utterances: "every instant is preceded by an instant" and "an instant precedes any instant", although equivalence holds for an analogous pair: "4 is preceded by 2" And "2 precedes 4". Unquantified logic was also unable to represent statements about the absence of quality objects, such as "No human being is immortal." The solution consisting in symbolically representing nothingness (and at the same time in "believing" in its existence!) was far from satisfactory.

The introduction of quantification solved most of these problems in a rather remarkable way. The automatic permutation, when using the negation operation, of quantifiers ∀ and ∃, whatever the complexity of the negated expression, is perfectly satisfying from a semantic point of view. Here is an example :

    "If there is an interpreter, then any two individuals communicate"
(x) (y) (l1) (l2) ((s(x, l1)  (s(y, l2)  (z) (s(z, l1)  s(z, l2)))) ⊃ c(x, y))

x, y, z refer to persons; l1, l2 to languages; s(x, l) is a predicate standing for "x speaks l", c(x, y) for "x communicates with y".

    Negated version:
(x) (y) (l1) (l2) ((s(x, l1)  (s(y, l2)  (z) (s(z, l1)  s(z, l2))))  ¬c(x, y))

This negated formula reads: "There are two individuals who do not communicate despite the existence of an interpreter"
It makes sense!


The syntax of predicate logic is defined by the following recursive constructs:

The introduction of quantifiers, and hence of variables, into logic forces us to consider the set in which these variables take their values. For Russell and Frege, this set was unquestionably the "real" world as a whole, considered as an absolute objective reference. In its modern version (model theory), first order logic is interpreted in a given "domain". In other words, it is the chosen domain that determines the semantics.


The definition of semantics is more complex than in propositional logic, as it goes through the definition of a model. A model links logical symbols to a "world" in which these symbols are interpreted.     

Let us consider, for example, a FOL language containing the constants {f,  v, j} and the binary predicate c. In this language, we can evaluate the formula F: (X) (Y) (Z) (c(X, Y) ⊃ c(Z, X)).

Consider the domain = {’France’, ’Vatican’, ’Japan’} and the relation ‘to have diplomatic relations with’. In the interpretation I which matches f to fI = ’France’, etc. and c to the unique relation of D, the validity of F can be examined. One can imagine a completely different field, relating to people.     
Give two different models for a language with the constants {a, b, c}, the binary predicate ‘<’ and the ternary predicate ‘+’.


Attitudes (truth conditions)

At this point, we have anchored the symbols of the formalism on objects of a world (the domain D). We can now define truth values which will correspond, for each predicative symbol, to the fact that the corresponding relation is verified in the domain. This step amounts to defining the attitudes for FOL: it is no longer a matter of seeking the reference of symbolic utterances, that is, their meaning, but of adopting an attitude (here truth or falsehood) about these statements. Note that (unfortunately) current usages in logic do not yet make this semantics/attitudes distinction explicit.

The definition of truth conditions of predicates consists in using the domain as a database:

The truth of a formula F is then defined after we have given a meaning (by association with objects of the domain) to symbols (through interpretation I) and to variables (through assignment A). However, it is possible to define the semantics of formulas more generally:     

The notion of validity replaces the notion of tautology of the logic of propositions.
Note that a formula F is valid if and only if {¬F} is not satisfiable.
Models and Validity
Give a model for a language consisting of the constants {a, b, c} and the ternary predicate ‘+’ in which (∀x) +(a, x, x) is true (x is a variable). Give another model in which the same formula is not true.


What do you think of:
((x) (y) P(x, y))  ((x) (y) ¬P(x, y))


The semantics which has just been defined relies entirely on the existence of a set D, the domain, which must nevertheless enjoy a certain number of properties: as many relations (with the correct arity) as there are predicates, as many functors as there are functors in language, etc. Of course, when it comes to applying logic to a real domain, it is the domain that pre-exists, whereas the language and the formulas that are tailor-made. But when one is interested in first-order logic in itself, as it is the case here, notably to establish properties as the validity of a formula, it is indeed the language and the set of formulas to be considered that are regarded as pre-existing, not the domain.

Proof by refutation in predicate logic

The resolution method is perfectly adaptable to predicate logic (this is what makes it so interesting).

Implementation of resolution in FOL

The difficulty of implementing proof algorithms for FOL comes from the universal quantifier. It presupposes that we find a suitable closed term t to instantiate the universally quantified variable. Obviously, some choices for t will be better than others. A forward-chaining approach would consist in making choices and then testing the possible success of resolution. We’ll rather consider the converse technique: replace the universally quantified variables by free variables, and thus tolerate non-closed terms in the resolution procedure. Then, as the procedure proceeds, we look for instantiations likely to lead to the empty clause. Prolog follows this latter method.

The implementation of the resolution procedure is greatly facilitated by standardizing the presentation of the starting formulas. We have already presented a standardization procedure in propositional logic, the conjunctive normal form (CNF). Now, the issue is to process quantifiers as well. For example, the formula ((x) (y) R(x,y) ⊃ (y) (x) R(x,y)) can be rewritten as (x) (y) (z) (w) (R(x,y) ⊃ R(w,z)).

Below are transformations that transport quantifiers to the front of formulas, producing what is called their prenex form. In the following formulas, the variable x is assumed to have no occurrence in the part of the formula that is outside the scope of the quantifier; this can be achieved by renaming the variables. The following formulas are valid:

  1. ¬(x) A(x) ≡ (x) ¬A(x)
  2. ¬(x) A(x) ≡ (x) ¬A(x)
  3. ((x) A(x) ∧ B) ≡ (x) (A(x)  ∧ B)    we suppose that B does not depend on x
  4. (A ∧ (x) B(x)) ≡ (x) (A ∧ B(x))    we suppose A does not depend on x
  5. ((x) A(x) ∧ B) ≡ (x) (A(x) ∧ B)
  6. (A ∧ (x) B(x)) ≡ (x) (A ∧ B(x))
  7. ((x) A(x) ⊃ B) ≡ (x) (A(x) ⊃ B)    warning: the antecedent of ⊃ includes a negation
  8. (A ⊃ (x) B(x)) ≡ (x) (A ⊃ B(x))
  9. ((x) A(x) ⊃ B) ≡ (x) (A(x) ⊃ B)    warning: the antecedent of ⊃ includes a negation
  10. (A ⊃ (x) B(x)) ≡ (x) (A ⊃ B(x))

Writing a formula in prenex form is a non-deterministic procedure. A good practice is to get existential quantifiers to the leftmost possible positions.

The method then consists in eliminating all the quantifiers. Let’s start with the existential ones. To eliminate the ‘∃’ in the formula, we use "Skolem functions". Skolem functions are necessary because existential variables may be under the scope of universal quantifiers. The existential quantifier and the corresponding variable are replaced by a function that notes variable dependencies.

Warning: The Skolemisation procedure is non-deterministic. A good practice is to treat the most external existential quantifiers first.

Skolem functions enable us to process embedded existential quantifiers. Skolem functions make it possible to note dependencies between quantifiers, before removing them. When Skolemization has been performed, we dispose of the universal quantifiers. The method goes like this:

Automated resolution

  1. Transform the formula in prenex form, renaming variables if necessary
  2. Remove existential quantifiers through skolemization
  3. Remove universal quantifiers
    (this amounts to transforming the associated quantified variables into free variables)
  4. Transform the formula into conjunctive normal form (CNF)
  5. Use the resolution algorithm with unification.
  6. If the empty clause is obtained through this process starting from ¬F, then {¬F} is not satisfiable, and F is valid.

Note that step 4 and step 5 can be used in conjunction to bypass some elementary steps.

The last step in the above mentioned procedure, unification, is one of the main tools of Prolog. The unification algorithm allows us to delay the instantiation of the variables as long as possible. At the moment of computing the resolving clause between two clauses, we adopt the instantiation that suits us best by attempting to unify two terms that will become opposite.

Example: proof by resolution of ((x) (P(x) ∨ Q(x)) ⊃ ((x) P(x) ∨ (x) Q(x)))
  1. [¬((x) (P(x) ∨ Q(x)) ⊃ ((x) P(x) ∨ (x) Q(x)))]    we start from the negated formula (refutation)
  2. [(x) (P(x) ∨ Q(x))]    (development of 1. (α-rule).)
  3. [¬((x) P(x) ∨    (x) Q(x))] (development of 1. (α-rule).)
  4. [¬(x) P(x)]    (development of 3. (α-rule).)
  5. [¬(x) Q(x)]    (development of 3. (α-rule).)
  6. [¬Q(c)]    (a version of 5. (c = constante de Skolem).)
  7. [¬P(x))])    (= 4. freed from univeral quantifier.)
  8. [(P(x) ∨ Q(x))]    (= 2. freed from univeral quantifier.)
  9. [P(x), Q(x)])    (modification of 8. (β-rule).)
  10. [Q(x)]    (resolving clause of 7. and 9.)
  11. [ ]    (resolving clause of 6. and 10. after unification = c.)

Counter-example: If we try to prove ((x) (y) P(x, y)) ⊃ ((x) (y) P(x, y)) by resolution, it starts as:
  1. [¬((x) (y) P(x, y)) ⊃ ((x) (y) P(x, y))]
  2. [(x) (y) P(x, y)]
  3. [¬((x) (y) P(x, y))]
  4. [(x) (y) ¬P(x, y)]
    It is tempting to say that 2. and 4. resolve into the empty clause. This would be wrong.
  5. [(x) P(x, sk1(x))]    skolemization of 2.
  6. [(x) ¬P(x, sk2(x))]    skolemization of 4.
  7. [P(x, sk1(x))]
  8. [¬P(x, sk2(x))]
    the Skolem functions are constants that block unification.
    By the way, it is easy to see that the initial formula isn’t valid (consider P as ‘greater than’ in the domain of integer numbers)
Resolution Order (1)
Using the resolution method, show that the following formula is valid:     (x) (y) (A(y) ⊃ A(x))


Write the following formula in prenex and skolemized form:
((x) (y) (p(x) ∧ p(y)) ⊃ (x) (y) (p(x) ∨ p(y)))
(consider renaming variables when they do not refer to the same entity)


Here is a less simple example. We want to show through the resolution method that:
(y) (z) (P(z, y) ≡ ¬(x) (P(z, x) ∧ P(x, z)))
is a contradiction, which means that it can’t be true. We start by the formula itself (since we want to show that it is not satisfiable).
  1. [(y) (z) (P(z, y) ≡ ¬(x) (P(z, x) ∧ P(x, z)))]
  2. [(z) (P(z, c) ≡ ¬(x) (P(z, x) ∧ P(x, z)))]     skolemization of y
  3. [(z) ((P(z, c) ⊃ ¬(x) (P(z, x) ∧ P(x, z))) ∧ (¬(x) (P(z, x) ∧ P(x, z)) ⊃ P(z, c)))] transformation of 2
  4. [(z) ((P(z, c) ⊃ ¬(x) (P(z, x) ∧ P(x, z))) ∧ (x) (¬(P(z, x) ∧ P(x, z)) ⊃ P(z, c)))] prenexization of x on the right
  5. [(z) ((P(z, c) ⊃ ¬(x) (P(z, x) ∧ P(x, z))) ∧ (¬(P(z, f(z)) ∧ P(f(z), z)) ⊃ P(z, c)))] skolemization of x on the right
  6. [(z) (x) ((P(z, c) ⊃ ¬(P(z, x) ∧ P(x, z))) ∧ (¬(P(z, f(z)) ∧ P(f(z), z)) ⊃ P(z, c)))] prenexization of x
  7. [((P(z, c) ⊃ ¬(P(z, x) ∧ P(x, z))) ∧ (¬(P(z, f(z)) ∧ P(f(z), z)) ⊃ P(z, c)))] elimination of quantifiers
  8. [(P(z, c) ⊃ ¬(P(z, x) ∧ P(x, z)))]    development of 7
  9. [(¬(P(z, f(z)) ∧ P(f(z), z)) ⊃ P(z, c))] development of 7
  10. [¬P(z, c), ¬P(z, x), ¬P(x, z)] development of 8
  11. [(P(z, f(z)) ∧ P(f(z), z)), P(z, c)] development of 9
  12. [P(z, f(z)), P(z, c)] development of 11
  13. [P(f(z), z), P(z, c)] development of 11
  14. [¬P(c, c), P(c, f(c))] resolving clause of 10 and 12 : x=z, z=c
  15. [P(f(c), c)] resolving clause of 10 and 13 : x=z, z=c
  16. [¬P(f(c), c)]    resolving clause of 10 and 15 : = f(c), × = c
  17. [ ] resolving clause of 15 and 16.

Warning: some dependencies between quantifiers are artificial! For instance, in the formula (x) (y) ¬(A(x) ⊃ A(y)), the order of quantifiers can be inversed. To check it, try to put the quantifiers back deeper into the formula, to make them go out in a different order.

(x) (y) ¬(A(x) ⊃ A(y))
(x) ¬(y) (A(x) ⊃ A(y))
(x) ¬(A(x) ⊃ (y) A(y))
(x) (A(x) ⊃ (y) A(y))
((x) A(x) ⊃ (y) A(y)) 
(y) ((x) A(x) ⊃ A(y))
(y) (x) (A(x) ⊃ A(y))
(y) (x) ¬(A(x) ⊃ A(y))

These transformations are possible because of the limited scope of quantifiers. In a formula like:

(x) (y) ¬(A(x, y) ⊃ B(y))

it is impossible for the quantifier that controls y to "enter" deeper into the formula.
Resolution with trap
Test the validity of the following formula:
(x) (y) (z) (w) (R(x,y) ⊃ R(w,z))
(There is a trap here: the presented formula is in prenex form, but not as desired: we would like to see the existential quantifiers in front).


Proving with Prolog

A Prolog program consists of Horn clauses, i.e. clauses of the type:

a :- b, c.

that can be written in disjunctive form, e.g. [a, ¬b, ¬c]

A Horn clause has (at most) only one non-negated element. These elements are predicates (with variables). The variables are assumed to be universally quantified in Prolog clauses.

grand_parent(X, Y) :- parent(X, Z), parent(Z, Y).

means that for any X, for any Y and for any Z, if parent(X, Z) and parent(Z, Y), then grand_parent(X, Y). Written in a logic form:
(x) (y) (z) ((parent(x, z) ∧ parent(z, y)) ⊃ grandparent(x, y))

Note that the variable z only occurs in the antecedent of the implication (i.e., in the tail of the Prolog clause). We can therefore quantify it locally. Warning: the ∀ becomes an ∃, since the antecedent of an implication is a negated term: ((A ⊃ B) ≡ (¬A ∨ B)).
(x) (y) ((z) (parent(x, z)  parent(z, y)) ⊃ grandparent(x, y))

For any x et y, x is the grand-parent of y if there is a z such that . . .

In other words, variables that only appear in the tail of the Prolog clause can be considered to be locally existentially quantified. Prolog, through unification and backtracking, will look for an instantiation of these variables that allows to unify a term of the tail with the head of another clause.
Order 1 Prover
Use the predicate prove (see first exercise) with the following formula :
( (X) a(X) ⊃ (b ⊃ c)) ⊃ (( (Y) a(Y) ⊃ b) ⊃ (a(3) ⊃ c))
Here there are only universal quantifiers. To translate the formula in Prolog, simply remove them, because all variables are supposed to be universally quantified in Prolog. Note that Prolog unify X and Y with 3 to achieve the resolution.




Back to the main page