The Complexity of Closed World Reasoning in Constraint-Based Grammar Theories ----------------------------------------------------------------------------- Bob CARPENTER Carnegie Mellon University carp+@cmu.edu Paul KING University of Tuebingen king@hocket.sns.neuphilologie.uni-tuebingen.de MOTIVATION Linguistic theories have traditionally been interpreted in a closed world fashion. For instance, if a list of phrase structure rules was enumerated, then they were assumed to be exhaustive in the sense that any string that couldn't be generated by them is classified as ungrammatical. In constraint-based grammar theories, the issue of choice between alternatives is often reduced to a choice of subtypes to which to resolve a type. For instance, a referent must be resolved as to gender, number and person, each of which involves a choice among a mutually exclusive set of alternatives. Analogously to phrase structure grammars, constraint-based theories such as head-driven phrase structure grammar (HPSG) classify phrases into one of several subtypes such as head-complement, head-adjunct, head-specifier, and so on; every phrase must be realized as exactly one of these types. In the combination of Rounds/Kasper logic and inheritance presented in (Carpenter 1992), type declarations are read in an open-world fashion. Following Smolka (1986), Carpenter allowed objects in models to be assigned types which were not maximally specific. On the other hand, King (1989) followed the spirit of HPSG in requiring every structure to be assigned to a maximally specific type, which he called a species. King enforced the closed world condition by defining types to be sets of species, thus ensuring boolean closure. Carpenter (1992) relaxed King's restriction on boolean closure, simply requiring type hierarchies to be closed under conjunction, and showed how the inheritance notation of HPSG (Pollard and Sag 1987) could be expressed in this model. Carpenter presented a complete proof theory for his models, but failed to establish a proof theory for the closed world case. In this paper, we use techniques of King (1989) to establish the conjecture of Carpenter (1992) that the logic of closed world models is captured by an axiom equating the information in a type to the disjunction of its subtypes: type <==> subtype1 OR ... OR subtypeN Furthermore, we show that the resulting logic is decidable, and that the problem of type soundness and thus conjunctive description satisfiability is NP-complete. This renders type inference intractable in practice, as contrasted with the open world case, which Carpenter (1992) established to be quasi-linear. Nevertheless, the grammar processing project TROLL (Gerdemann, Hinrichs, King, and Goetz 1994) is developing type inference algorithms for closed world reasoning that have been shown to be tractable in a great many realistic applications to grammatical encoding. CLOSED-WORLD LOGIC We assume Carpenter's (1992) type theory: Type is a finite set of types which form a bounded complete partial order under the subsumption ordering <. Feat = {f1,...,fn>} is a finite set of features. A partial appropriateness specification function Approp : Feat x Type -> Type is given that obeys the inheritance and minimal type introduction conditions. The inheritance condition ensures that if the feature f is appropriate for a type s1, so that Approp(f,s1) is defined, and if s2 is a subtype of s1, s1 <= s2, then Approp(f,s1) <= Approp(f,s2). The minimal type introduction condition ensures that for every feature f, there is a most general type s such that Approp(f,s) is defined. A model is a structure where A is a set of objects, theta:A -> Type is a total typing function, and where Fi:A -> A is a partial feature value function interpreting the feature fi, which respects the appropriateness conditions: if Fi(a1) = a2, then Approp(fi,theta(a1)) <= theta(a2), and if Approp(fi,theta(a)) is defined, then Fi(a) is defined. The logical description language is as follows: Desc ::= Type | Path1 == Path2 | f:Desc | Desc AND Desc | Desc OR Desc A Path is a sequence of features and Path(a) is defined by composing the interpretations of the features in Path. (Our complexity result remains unchanged if we enhance our logic with variables and our modelling relation with assignments.) Satisfaction is defined between objects in the model and descriptions as follows: a |= t iff t < theta(a) a |= fi:Phi iff Fi(a) |= Phi a |= Path1 == Path2 iff Path1(a) = Path2(a) a |= Phi1 AND Phi2 iff a |= Phi1 and a |= Phi2 a |= Phi1 OR Phi2 iff a |= Phi1 or a |= Phi2 Carpenter (1992) provided a sound and complete proof theory for the logical equivalence of descriptions over this class of models. The issue of both logical equivalence and satisfiability were shown to be quasi-linear if disjunction is disallowed. Rounds and Kasper (1986) established that the problem is NP-complete in the face of disjunction --- 3-SAT is trivial to encode in this case. A model is said to be a closed world model if theta(a) is a maximally specific type, a species, for every a. Carpenter conjectured that his open world logic could be completed by the axiom family mentioned above: type <==> subtype_1 OR ... OR subtype_m where subtype_1, ..., subtype_m are the subtypes of type (either the immediate subtypes or maximally specific subtypes will suffice). TYPE INFERENCE, COMPLETENESS AND DECIDABILITY In this talk, we establish a proof of Carpenter's conjecture as well as establishing the complexity of the resulting logic. The key insight derives from the operational logic of the TROLL system. King noticed that Carpenter's logical inference operators over feature structure representations, TypInf and Fill, which respectively raise all value types to appropriate values and add in missing appropriate, can be commuted with an operation that tests each substructure for all of its maximal instantiations. To determine whether a given feature structure F can be extended to a maximal one, in other words to perform type inference, each each existing type is raised to all of its possible subtypes and TypInf is applied to see if the result is consistent. The crucial observation is that both of these operations involve only finitely many possibilities. If they succeed, we know the structure is extendable, because Fill is a total operation given our conditions on type hierarchies. Thus even if there are cyclic type declarations such as person < male, female Approp(FATHER, person) = male we do not run into infinite loops in evaluating descriptions like: female AND NAME:sandy AND FATHER:(male AND NAME:terry) The operation of filling in extra features (such as Terry's father above), can be done lazily. The addition of the logical axiom for choice among subtypes mentioned above completes Carpenter's logic. We establish this result by correspondence; commuting in the logical operators of filling and type inferring in the feature structure system can be reflected in the logic by ordering inference operations appropriately. By expanding each structure only insofar as the other one is instantiated, two structures can be tested for logical equivalence in terms of their extensions to closed world structures. In the logic, we simply note that the axioms can be unfolded in just this order and equality determined in finitely many steps if in fact, the descriptions are logically equivalent. COMPLEXITY While this establishes the decidability of type inference by limiting the search space to a finite one, the choices faced are exponential. We will show a relatively straightforward reduction of the 3-SAT problem to the problem of satisfiability of a disjunction-free description in the closed world logic. The key idea is to represent a formula as a feature structure, and force every structure representing a propositional symbol to be resolved to either true or false, and every feature structure representing a conjunct, disjunct, or negation to be resolved for truth value according to the truth values of its conjuncts. The appropriateness conditions are as follows: bot < bool, formula bool < true, false formula < propsymbol, conj, disj, neg, trueform, falseform TRUTHVAL:bool propsymbol < truepropsym, falsepropsym truepropsym TRUTHVAL:true falsepropsym TRUTHVAL:false conj < trueconj, falseconj1, falseconj2 CONJ1:formula CONJ2:formula trueconj CONJ1:trueform CONJ2:trueform falseconj1 CONJ1:falseform falseconj2 CONJ2:falseform ... trueform < truepropsym, trueconj, trueneg, truedisj falseform < falsepropsym, falseconj, falseneg, falsedisj A propositional formula is satisfiable if and only if its encoding in attribute value logic is satisfiable in a closed world model. Clearly the problem is in NP, thus rendering it NP-complete. The fact that we used a fixed inheritance hierarchy in the reduction shows that there are NP-complete instances of the universal problem. CONCLUSION Most linguistic theories, such as HPSG, assume closed-world reasoning in their models. We have established a complete logic for closed world reasoning, and shown the satisfiability problem to be NP-complete in particular cases. What remains to be studied further is the complexity of particular cases.