Artificial Intelligence - foundations of computational agents -- 12.5 Function Symbols

Third edition of Artificial Intelligence: foundations of computational agents, Cambridge University Press, 2023 is now available (including the full text).

12.5 Function Symbols

Datalog requires a name, using a constant, for every individual about which the system reasons. Often it is simpler to identify an individual in terms of its components, rather than requiring a separate constant for each individual.

Example 12.23: In many domains, you want to be able to refer to a time as an individual. You may want to say that some course is held at 11:30 a.m. You do not want a separate constant for each possible time. It is better to define times in terms of, say, the number of hours past midnight and the number of minutes past the hour. Similarly, you may want to reason with facts that mention particular dates. You do not want to give a constant for each date. It is easier to define a date in terms of the year, the month, and the day.

Using a constant to name each individual means that the knowledge base can only represent a finite number of individuals, and the number of individuals is fixed when the knowledge base is designed. However, many cases exist in which you want to reason about a potentially infinite set of individuals.

Example 12.24: Suppose you want to build a system that takes questions in English and that answers them by consulting an online database. In this case, each sentence is considered an individual. You do not want to have to give each sentence its own name, because there are too many English sentences to name them all. It may be better to name the words and then to specify a sentence in terms of the sequence of words in the sentence. This approach may be more practical because there are far fewer words to name than sentences, and each word has it own natural name. You may also want to specify the words in terms of the letters in the word or in terms of their constituent parts.

Example 12.25: You may want to reason about lists of students. For example, you may be required to derive the average mark of a class of students. A class list of students is an individual that has properties, such as its length and its seventh element. Although it may be possible to name each list, it is very inconvenient to do so. It is much better to have a way to describe lists in terms of their elements.

Function symbols allow you to describe individuals indirectly. Rather than using a constant to describe an individual, an individual is described in terms of other individuals.

Syntactically a function symbol is a word starting with a lower-case letter. We extend the definition of a term so that a term is either a variable, a constant, or of the form f(t₁,...,t_n), where f is a function symbol and each t_i is a term. Apart from extending the definition of terms, the language stays the same.

Terms only appear within predicate symbols. You do not write clauses that imply terms. You may, however, write clauses that include atoms that use function symbols as ways to describe individuals.

The semantics must be changed to reflect the new syntax. The only thing we change is the definition of φ. We extend φ so that φ is a mapping that assigns to each constant an element of D and to each n-ary function symbol a function from Dⁿ into D. Thus, φ specifies the mapping denoted by each function symbol. In particular, φ specifies which individual is denoted by each ground term.

A knowledge base consisting of clauses with function symbols can compute any computable function. Thus, a knowledge base can be interpreted as a program, called a logic program.

This slight expansion of the language has a major impact. With just one function symbol and one constant, infinitely many different terms and infinitely many different atoms exist. The infinite number of terms can be used to describe an infinite number of individuals.

Example 12.26: Suppose you want to define times during the day as in Example 12.23. You can use the function symbol am so that am(H,M) denotes the time H:M a.m., when H is an integer between 1 and 12 and M is an integer between 0 and 59. For example, am(10,38) denotes the time 10:38 a.m.; am denotes a function from pairs of integers into times. Similarly, you can define the symbol pm to denote the times after noon.

The only way to use the function symbol is to write clauses that define relations using the function symbol. There is no notion of defining the am function; times are not in a computer any more than people are.

To use function symbols, you can write clauses that are quantified over the arguments of the function symbol. For example, the following defines the before(T₁,T₂) relation that is true if time T₁ is before time T₂ in a day:

before(am(H1,M1),pm(H2,M2)).
before(am(12,M1),am(H2,M2)) ←
      H2<12.
before(am(H1,M1),am(H2,M2)) ←
      H1<H2 ∧
      H2<12.
before(am(H,M1),am(H,M2)) ←
      M1<M2.
before(pm(12,M1),pm(H2,M2)) ←
      H2<12.
before(pm(H1,M1),pm(H2,M2)) ←
      H1<H2 ∧
      H2<12.
before(pm(H,M1),pm(H,M2)) ←
      M1<M2.

This is complicated because the morning and afternoon hours start with 12, then go to 1, so that, for example, 12:37 a.m. is before 1:12 a.m.

Function symbols are used to build data structures.

Example 12.27: A tree is a useful data structure. You could use a tree to build a syntactic representation of a sentence for a natural language processing system. We could decide that a labeled tree is either of the form node(N,LT,RT) or of the form leaf(L). Thus, node is a function from a name, a left tree, and a right tree into a tree. The function symbol leaf denotes a function from a node into a tree.

The relation at_leaf(L,T) is true if label L is the label of a leaf in tree T. It can be defined by

at_leaf(L,leaf(L)).
at_leaf(L,node(N,LT,RT)) ←
at_leaf(L,LT).
at_leaf(L,node(N,LT,RT)) ←
at_leaf(L,RT).

This is an example of a structural recursive program. The rules cover all of the cases for each of the structures representing trees.

The relation in_tree(L,T), which is true if label L is the label of an interior node of tree T, can be defined by

in_tree(L,node(L,LT,RT)).
in_tree(L,node(N,LT,RT)) ←
in_tree(L,LT).
in_tree(L,node(N,LT,RT)) ←
in_tree(L,RT).

Example 12.28: You can reason about lists without any notion of a list being built in. A list is either the empty list or an element followed by a list. You can invent a constant to denote the empty list. Suppose you use the constant nil to denote the empty list. You can choose a function symbol, say cons(Hd,Tl), with the intended interpretation that it denotes a list with first element Hd and rest of the list Tl. The list containing the elements a, b, c would then be represented as

cons(a,cons(b,cons(c,nil))).

To use lists, one must write predicates that do something with them. For example, the relation append(X,Y,Z) that is true when X, Y, and Z are lists, such that Z contains the elements of X followed by the elements of Z, can be defined recursively by

append(nil,L,L).
append(cons(Hd,X),Y,cons(Hd,Z)) ←
append(X,Y,Z).

There is nothing special about cons or nil; we could have just as well used foo and bar.

First-Order and Second-Order Logic

First-order predicate calculus is a logic that extends propositional calculus to include atoms with function symbols and logical variables. All logical variables must have explicit quantification in terms of "for all" (∀) and "there exists" (exist ). The semantics of first-order predicate calculus is like the semantics of logic programs presented in this chapter, but with a richer set of operators.

The language of logic programs forms a pragmatic subset of first-order predicate calculus, which has been developed because it is useful for many tasks. First-order predicate calculus can be seen as a language that adds disjunction and explicit quantification to logic programs.

First-order logic is of first order because it allows quantification over individuals in the domain. First-order logic allows neither predicates as variables nor quantification over predicates.

Second-order logic allows for quantification over first-order relations and predicates whose arguments are first-order relations. These are second-order relations. For example, the second-order logic formula

∀R symmetric(R) ↔(∀X ∀Y R(X,Y) →R(Y,X))

defines the second-order relation symmetric, which is true if its argument is a symmetric relation.

Second-order logic seems necessary for many applications because transitive closure is not first-order definable. For example, suppose you want before to be the transitive closure of next, where next(X,s(X)) is true. Think of next meaning the "next millisecond" and before denoting "before." The natural first-order definition would be the definition

∀X ∀Y before(X,Y) ↔( Y=s(X) ∨ before(s(X),Y)).

This expression does not accurately capture the definition, because, for example,

∀X ∀Y before(X,Y) →exist W Y=s(W)

does not logically follow from Formula (12.28), because there are nonstandard models of Formula (12.28) with Y denoting infinity. To capture the transitive closure, you require a formula stating that before is the minimal predicate that satisfies the definition. This can be stated using second-order logic.

First-order logic is recursively enumerable, which means that a sound and complete proof procedure exists in which every true statement can be proved by a sound proof procedure on a Turing machine. Second-order logic is not recursively enumerable, so there does not exist a sound and complete proof procedure that can be implemented on a Turing machine.

12.5.1 Proof Procedures with Function Symbols