Grounding FO and FO(ID) with Bounds

Grounding is the task of reducing a first-order theory and finite domain to an equivalent propositional theory. It is used as a preprocessing phase in many logic-based reasoning systems. Such systems provide a rich first-order input language to a user and can rely on efficient propositional solvers to perform the actual reasoning. Besides a first-order theory and finite domain, the input for grounders in many applications also contains additional data. By exploiting this data, the size of the grounder's output can often be reduced significantly. A common practice to improve the efficiency of a grounder in this context is to manually add semantically redundant information to the input theory, indicating where and when the grounder should exploit the data. In this paper we present a method to compute and add such redundant information automatically. Our method therefore simplifies the task of writing input theories that can be grounded efficiently by current systems. We first present our method for classical first-order logic (FO) theories. Then we extend it to FO(ID), the extension of FO with inductive definitions, which allows for more concise and comprehensive input theories. We discuss implementation issues and experimentally validate the practical applicability of our method.


Introduction
Grounding, or propositionalization, is the task of reducing a first-order theory and finite domain to an equivalent propositional theory, called a grounding. Grounding is used as a preprocessing phase in many logic-based reasoning systems. It serves to provide the user with a rich input language, while enabling the system to rely on efficient propositional solvers to perform the actual reasoning.
A basic (naive) grounding method instantiates the variables in the input theory by all possible combinations of domain elements. Grounding in this way is polynomial in the size of the domain but exponential in the maximum width of a formula in the input theory, and may easily produce groundings of unwieldy size. Several techniques have been developed to efficiently produce smaller groundings. There are two main categories of such techniques. In the first, the input theory is rewritten such that the maximum width of the formulas decreases. Methods like clause splitting (Schulz, 2002) and partitioning (Ramachandran & Amir, 2005) belong to this category.
The second type of techniques is applicable when, besides the finite domain, additional data is available. This is often the case in practical model generation problems, such as the ones that are typical in ASP. In a graph problem the data could be an encoding of the input graph; in the context of planning, it could be a description of the initial and goal state, etc. Sometimes the data is explicitly available, e.g., in the form of a database, sometimes it is implicit, e.g., as a set of ground facts in the input theory. The second type of techniques aims at efficiently computing small groundings by taking the data into account.
Observe that both types of techniques can be combined in a grounder. In this paper we mainly focus on a technique of the second category. To explain the intuition underlying our method, consider the following model generation problem.
Example 1. Let T_1 be the first-order logic theory over the vocabulary {Edge, Sub}, consisting of the two sentences
∀u∀v (Sub(u, v) ⊃ Edge(u, v)) (1)
∀x∀y∀z (Sub(x, y) ∧ Sub(x, z) ⊃ y = z) (2)
T_1 expresses that Sub is a subgraph of Edge with at most one outgoing edge in each vertex. Computing such a subgraph of a given graph G = ⟨V, E⟩ can be cast as a model generation problem with input theory T_1 and data G. The data can be represented as a structure I_σ for the subvocabulary σ_1 = {Edge} with domain V and Edge^{I_σ} = E. A solution can be obtained by generating a model of T_1 that expands I_σ with an interpretation of Sub.
Applying the naive grounding algorithm produces |V|^2 instantiations of (1) and |V|^3 instantiations of (2). By taking the data into account, atoms over 'Edge' and '=' can be substituted by their truth value in I_σ. Simplifying the resulting grounding then eliminates |E| instantiations of (1) and |V|^2 instantiations of (2). Smart grounding algorithms interleave this substitution and simplification with the grounding process in order to avoid creating unnecessary parts of the grounding.
Observe that substituting atoms over σ_1 and then simplifying still produces a grounding of size O(|V|^3). Indeed, the simplified grounding of (2) is the set of binary clauses ¬Sub(i, j) ∨ ¬Sub(i, k) such that i, j, k ∈ V and j ≠ k. This set has size |V|^3 − |V|^2.
Some grounders apply reasoning on the ground theory to reduce it even further. In the example, the simplified grounding of (1) consists of the clauses ¬Sub(i, j) such that (i, j) ∉ E. Since these are unit clauses, each of them is certainly true in every model of the ground theory. It follows that each binary clause ¬Sub(i, j) ∨ ¬Sub(i, k) such that either ¬Sub(i, j) or ¬Sub(i, k) belongs to the simplified grounding of (1) is certainly true in every model of the ground theory and thus can be omitted from the simplified grounding of (2). The result is a grounding of size |E ⋈_{1=1} E|, where ⋈_{1=1} denotes the natural join matching the first columns. For a sparse graph, |E ⋈_{1=1} E| is much smaller than |V|^3. However, since reasoning on the ground theory does not avoid creating all instantiations of a formula, it does not significantly speed up the grounding process.
One way to avoid a large grounding without relying on reasoning on the ground theory is by adding redundant information to formulas. This method is frequently used in ASP. For example,
∀x∀y∀z (Edge(x, y) ∧ Sub(x, y) ∧ Edge(x, z) ∧ Sub(x, z) ⊃ y = z) (3)
is equivalent to (2) given (1), but its grounding (without reasoning on the ground theory) is equal to the one obtained by the kind of reasoning on the ground theory illustrated above. This illustrates how adding redundant information may sometimes dramatically reduce the size of the grounding.
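The effect of the redundant Edge atoms can be made concrete with a small Python sketch. It only counts the instances that survive once 'Edge' and '=' are evaluated in the data; the function names and the toy graph are illustrative assumptions, not part of any grounder.

```python
from itertools import product

def surviving_instances_of_2(vertices):
    """Instances (i, j, k) of sentence (2) that remain after the atom
    y = z has been evaluated: only those with j != k are not trivially true."""
    return [(i, j, k) for i, j, k in product(vertices, repeat=3) if j != k]

def surviving_instances_of_3(edges):
    """Instances of sentence (3) that remain after Edge and '=' have been
    evaluated: both Edge atoms must be true and j must differ from k."""
    successors = {}
    for i, j in edges:
        successors.setdefault(i, []).append(j)
    return [(i, j, k)
            for i, succ in successors.items()
            for j in succ
            for k in succ
            if j != k]

# Hypothetical sparse input graph on six vertices.
vertices = range(6)
edges = {(0, 1), (0, 2), (1, 2), (3, 4)}

count_2 = len(surviving_instances_of_2(vertices))
count_3 = len(surviving_instances_of_3(edges))
```

On this toy graph only vertex 0 has two outgoing edges, so sentence (3) leaves two instances where sentence (2) leaves 180. The second function never enumerates all triples, which is the behaviour the text attributes to grounders that are optimized for formulas like (3).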
Since current grounders are optimized to ground formulas like (3) without trying all instances, grounding may also speed up a lot. However, manually adding redundancy to formulas has its disadvantages: it leads to more complex and hence less readable theories. Worse, it might introduce errors. It requires a good understanding of the grounder being used, since it depends on the grounder what information is beneficial to add and where. Also, a human developer could easily miss useful information.
The above motivates a study of automated methods for deriving such redundant information and of principled ways of adding it to formulas. We develop an algorithm that, given a model generation problem with input theory T and input data I_σ, derives such redundant information, in the form of a pair of a symbolic upper and lower bound for each subformula of T. Each of these bounds is a formula over the vocabulary of I_σ. For instance, for Example 1, our algorithm will compute Edge(x, y) as upper bound for Sub(x, y), meaning that if Edge(x, y) is not true, then Sub(x, y) is not true either. We also show how to insert these bounds in the formulas of T. For example, inserting the upper bound Edge(x, y) for Sub(x, y) and the upper bound Edge(x, z) for Sub(x, z) transforms (2) into (3).
The rest of this paper is organized as follows. In the next section we recall some notions from first-order logic (FO) and we introduce the notations used throughout the paper. In Section 3 we formally define grounding and model generation with additional data. In Section 4 we introduce upper and lower bounds for formulas. We present an any-time algorithm to compute them in the context of FO input theories. We show how the bounds can be used to rewrite the input theory to an equivalent theory that has a smaller grounding.
Although many search problems can be cast concisely and naturally as FO model generation problems, some problems require richer logics than FO. One such logic is FO(ID), an extension of FO with inductive definitions. Such definitions can be used to represent, e.g., the concept of reachability in a graph. In Section 5 we extend our rewriting method to FO(ID).
In Section 6 we discuss how to implement our algorithm to compute bounds. As a case study, we show for one particular grounding algorithm how it can be adapted to exploit bounds directly. We also present experimental results that indicate the impact of our method on grounding size and time. We end with related work and conclusions.
The current paper extends our previous work (Wittocx, Mariën, & Denecker, 2008c). Besides proofs for all main propositions and a more thorough experimental validation, the following parts were added:
• The theoretical result stating that our rewriting method certainly yields smaller groundings (Proposition 23);
• The extension of the rewriting method to FO(ID) (Section 5);
• The section about implementation issues (Section 6).

Preliminaries
In this section, we introduce the conventions and notations used in this paper. We assume the reader is familiar with FO.

First-Order Logic
A vocabulary Σ is a tuple ⟨Σ_P, Σ_F, Σ_V⟩ where Σ_P, Σ_F and Σ_V are respectively sets of predicate symbols, function symbols and variables. We identify constants with zero-arity function symbols.
Abusing notation, we will often leave out Σ_V and simply write ⟨Σ_P, Σ_F⟩ to represent Σ. A vocabulary σ is a subvocabulary of Σ, denoted σ ⊆ Σ, if σ_P ⊆ Σ_P, σ_F ⊆ Σ_F and σ_V ⊆ Σ_V. Throughout this paper variables are denoted by lowercase letters, predicate and function symbols by uppercase letters. Each predicate and function symbol has an associated arity n ∈ ℕ. We often denote a predicate symbol P by P/n and a function symbol F by F/n to indicate their arities.
Tuples and sets of variables are denoted by x, y, z. A term over Σ is inductively defined by
• A variable x ∈ Σ_V is a term;
• If F/n is a function symbol of Σ and t_1, . . ., t_n are terms over Σ, then F(t_1, . . ., t_n) is a term.
Tuples of terms are denoted by t, t_1, t_2, . . . . A first-order logic formula over Σ is inductively defined by
• If P/n is a predicate symbol and t_1, . . ., t_n are terms, then P(t_1, . . ., t_n) is a formula;
• If t_1 and t_2 are two terms, then t_1 = t_2 is a formula;
• If ϕ and ψ are formulas and x is a variable, then ¬ϕ, ϕ ∧ ψ, ϕ ∨ ψ, ∀x ϕ and ∃x ϕ are formulas.
We use ϕ ⊃ ψ, ϕ ≡ ψ and t_1 ≠ t_2 as shorthands for respectively ¬ϕ ∨ ψ, (ϕ ⊃ ψ) ∧ (ψ ⊃ ϕ) and ¬(t_1 = t_2). An atom is a formula of the form P(t) or t_1 = t_2. A literal is an atom or the negation of an atom. An occurrence of a formula ϕ as subformula in a formula ψ is positive, respectively negative, if it occurs in the scope of an even, respectively odd, number of negations.
For a formula ϕ, we often write ϕ[x] to indicate that x are its free variables. That is, if y ∈ x, then y occurs in ϕ, but not in the scope of a quantifier ∀y or ∃y in ϕ. For a variable x and a term t, the formula ϕ[x/t] denotes the result of replacing all free occurrences of x in ϕ by t. This notation is extended to tuples of variables and terms of the same length. A sentence is a formula without free variables. A theory is a finite set of sentences.
A Σ-interpretation I consists of a domain D and
• a relation P^I ⊆ D^n for each predicate symbol P/n ∈ Σ_P;
• a function F^I : D^n → D for each function symbol F/n ∈ Σ_F;
• a domain element x^I ∈ D for each variable x ∈ Σ_V.
A Σ-structure is an interpretation of only the relation and function symbols of Σ. The restriction of a Σ-interpretation I to a vocabulary σ ⊆ Σ is denoted by I|_σ. Vice versa, I is called an expansion of I|_σ to Σ. For a variable x and domain element d, I[x/d] is the interpretation that assigns d to x and corresponds to I on all other symbols. This notation is extended to tuples of variables and domain elements of the same length. An interpretation I is called finite if its domain is finite.
The value t^I of a term t in an interpretation I, and the satisfaction relation |= are defined as usual (e.g., Enderton, 2001). I is called a model of a formula ϕ if I |= ϕ. We denote by T_1 |= T_2 that every model of theory T_1 is also a model of theory T_2.
A query is an expression of the form {x | ϕ}, where the free variables of ϕ are among x. A tuple d of domain elements is an answer to {x | ϕ} in a structure I if I[x/d] |= ϕ. The set of all answers to {x | ϕ} in I is denoted by {x | ϕ}^I.

Rewriting and Term Normal Form
In this paper we will use the following well-known equivalences to rewrite formulas to logically equivalent formulas.
To facilitate the presentation, we will sometimes require that formulas are in term normal form (TNF). We say that a formula ϕ is in TNF if every atomic subformula of ϕ is of the form P(x), F(x) = y or x = y, and all negations occur directly in front of atoms. Using (10)-(14), every formula can be transformed into an equivalent formula in TNF. We say that a theory is in TNF if all its sentences are.

SAT
A vocabulary Σ is propositional if Σ_F = ∅ and every predicate symbol in Σ_P has arity zero. A propositional theory (PC theory) is a theory over a propositional vocabulary. A propositional clause is a disjunction of propositional literals. A PC theory is in conjunctive normal form (CNF) if all its sentences are clauses. The Boolean satisfiability problem (SAT) is the NP-complete problem of deciding for a PC theory whether it is satisfiable. The NP search problem corresponding to a SAT problem is the problem of computing a witness of the decision problem in the form of a model of the theory. SAT solvers typically operate by constructing such a model.
Contemporary SAT solvers exhibit impressive performance. As such, many NP problems can be solved efficiently by translating them to SAT. For instance, this is done in the areas of model generation (Claessen & Sörensson, 2003; McCune, 2003), planning (Kautz & Selman, 1996) and relational data mining (Krogel et al., 2003). Most modern SAT solvers expect a CNF theory as input, instead of a general PC theory. When the input is a satisfiable theory, they return a model as a witness to their answer.

Model Generation and Grounding
Model generation is the problem of computing a model of a logic theory T, usually in the context of a given finite domain, typically the Herbrand universe. A model generator makes it possible to decide the satisfiability of the theory in the context of this fixed domain. This is useful, e.g., in the context of lightweight verification (Jackson, 2006). Beyond determining satisfiability, there is a broad class of problems of which the answers are naturally given by the models of a declarative domain theory. For example, the model of a theory specifying a scheduling domain typically contains a (correct) schedule. Thus, a model generator applied to this theory will solve the scheduling problem for this domain. This idea of model generation as a declarative problem solving paradigm has been pioneered in the area of ASP (Marek & Truszczyński, 1999; Niemelä, 1999). In this area, answers to a problem are given by the models of an ASP theory.
As mentioned in the introduction, many practical model generation problems contain additional data besides the input theory and finite domain. This data can be implicit in the input theory. For example, ASP problems can be split into two parts: a non-ground theory and a list of ground facts. The latter part essentially represents given data. In other contexts (Mitchell & Ternovska, 2005; Torlak & Jackson, 2007; Wittocx et al., 2008d), the data is given as a (partial) structure interpreting part of the vocabulary of the input theory. In this paper we assume without loss of generality that the data is represented by a structure. In practice, it is often the case that some preprocessing, e.g., materializing a view on a database, needs to be done before the data is in this format (see also Section 5.3.2).

The Model Expansion Search Problem
Model generation with an input theory and input structure is called model expansion. Model expansion for a logic L, denoted MX(L), is defined as follows.
Definition 1. Let T be an L-theory over a vocabulary Σ, σ a subvocabulary of Σ and I_σ a finite σ-structure. The model expansion search problem with input ⟨T, I_σ⟩ is the problem of computing a Σ-structure M such that M |= T and M|_σ = I_σ.
The vocabulary σ is called the input vocabulary of the problem, the vocabulary Σ \ σ the expansion vocabulary. I_σ is called the input structure. We denote by M |=_{I_σ} T that M is a solution to the model expansion search problem with input ⟨T, I_σ⟩. Similarly, for a formula ϕ over Σ we denote by M |=_{I_σ} ϕ that M expands I_σ to Σ and satisfies ϕ.
Observe that if σ = Σ, model expansion reduces to model checking, while if σ = ⟨∅, ∅⟩, it reduces to model generation for T with a given finite size. Also, if T is a theory over a vocabulary Σ containing no function symbols of arity greater than zero, Herbrand model generation for T can be simulated by model expansion. Indeed, let σ = ⟨∅, Σ_F⟩, and I_σ the structure whose domain is the Herbrand universe of T such that C^{I_σ} = C for every constant C ∈ Σ_F.
We illustrate model expansion by two examples. In the examples in this paper, we often use many-sorted FO, since this leads to more concise and readable sentences. In many-sorted FO, the domain of an interpretation is partitioned in sorts (or types), each variable has an associated sort, each n-ary predicate symbol has an n-tuple of associated sorts and each n-ary function symbol an associated (n + 1)-tuple of sorts. If I is an interpretation and variable x has associated sort s, then x^I ∈ s^I, where s^I denotes the set of domain elements of sort s. Similarly, if P/n has associated sorts (s_1, . . ., s_n), then P^I ⊆ s_1^I × . . . × s_n^I. We often denote P by P(s_1, . . ., s_n) and F by F(s_1, . . ., s_n) : s_{n+1} to indicate their associated sorts.
Example 2 (Graph Colouring). The graph colouring problem is the problem of colouring a given graph with a given set of colours such that adjacent vertices have different colours. To express this problem in MX(FO), let Vtx and Col be sorts and let σ = ⟨{Edge(Vtx, Vtx)}, ∅⟩. The sort Col denotes the given set of colours; the given graph is represented by Vtx and Edge. Let Σ be the vocabulary ⟨σ_P, {Colour(Vtx) : Col}⟩ and T the theory that consists of the sentence
∀v_1∀v_2 (Edge(v_1, v_2) ⊃ Colour(v_1) ≠ Colour(v_2)).
Then model expansion with input theory T and input vocabulary σ expresses the graph colouring problem. Indeed, for any M |=_{I_σ} T, Colour^M is a proper colouring of the graph represented by I_σ.
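To make the notion of model expansion tangible, the following Python sketch solves Example 2 by brute force: it enumerates every interpretation of Colour over the given sorts and keeps those in which adjacent vertices receive different colours. This is only an illustration of the search problem, not of how a grounder or solver works; the function name and the toy input are assumptions.

```python
from itertools import product

def expand_colouring(vertices, edges, colours):
    """Yield every interpretation of Colour (a dict from vertices to colours)
    that expands the input structure and satisfies the colouring constraint."""
    vertices = list(vertices)
    for choice in product(colours, repeat=len(vertices)):
        colour = dict(zip(vertices, choice))
        # Adjacent vertices must have different colours.
        if all(colour[v] != colour[w] for v, w in edges):
            yield colour

# Hypothetical input structure: a triangle on the vertices a, b, c.
triangle = {("a", "b"), ("b", "c"), ("a", "c")}
three_colourings = list(expand_colouring("abc", triangle, ["r", "g", "b"]))
two_colourings = list(expand_colouring("abc", triangle, ["r", "g"]))
```

With three colours the triangle has six expansions, one per permutation of the colours; with two colours there is none, so the corresponding model expansion problem has no solution.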
Example 3 (SAT). To represent the SAT problem in MX(FO), let σ be a vocabulary containing the two sorts Atom and Clause, representing the atoms and the clauses of the input CNF theory, and the two predicates PosIn(Atom, Clause) and NegIn(Atom, Clause), to represent the positive, respectively negative, occurrences of atoms in clauses. The theory given by
∀c ∃a ((PosIn(a, c) ∧ True(a)) ∨ (NegIn(a, c) ∧ ¬True(a)))
over Σ = ⟨σ_P ∪ {True(Atom)}, ∅⟩ expresses the SAT problem: for any M |=_{I_σ} T, the propositional structure represented by True^M is a model of the CNF theory represented by I_σ. Indeed, the theory forces that every clause contains at least one true literal.
As shown by Mitchell and Ternovska (2005), it follows from Fagin's (1974) theorem that model expansion for FO captures NP, in the following sense:
• For any fixed T and σ the problem of deciding whether there exists a model of T expanding an input structure I_σ is in NP.
• Vice versa, for any NP decision problem X on the class of finite σ-structures there is a vocabulary Σ ⊇ σ and a first-order Σ-theory T such that model expansion with input theory T expresses X, i.e., I_σ belongs to X iff there exists a Σ-structure M such that M |=_{I_σ} T.
This result proves that any NP problem X can be expressed by an MX(FO) problem, and hence shows the broad applicability of MX(FO) solvers to solve NP problems.
As illustrated by the examples above, it is the intention that the theory T is an intuitive representation of a problem X. Not all NP problems can be represented in a natural manner in MX(FO). For instance, the problem of deciding whether a graph is connected can be expressed in MX(FO), but this requires a non-trivial encoding of a fixpoint operator in FO. Model expansion for richer logics than FO is better suited for such problems. In Section 5 we consider MX for FO(ID), an extension of FO with inductive definitions.

Reducing MX(FO) to SAT
For the rest of this paper, let T be a theory over a vocabulary Σ, σ a subvocabulary of Σ and I_σ a finite σ-structure with domain D.
Since for every FO theory T, deciding whether T has a model expanding I_σ is in NP, this problem can be reduced to a SAT problem T_prop in polynomial time. However, if we want to find models of T expanding I_σ by using a SAT solver, we need a method to translate models of T_prop into models of T. Moreover, if we are interested in finding all models of T expanding I_σ, a one-to-one correspondence between these models and the models of T_prop is needed. In this paper we focus on reductions that preserve all models, which is the setting in the ASP paradigm (Marek & Truszczyński, 1999; Niemelä, 1999).
Let τ be the vocabulary of T_prop. To have a one-to-one correspondence between the models of T expanding I_σ and the models of T_prop, it should be possible to represent Σ-structures expanding I_σ by τ-structures. The most natural way to accomplish this is by choosing τ such that it contains a symbol P_d for every P/n ∈ Σ_P and d ∈ D^n, and a symbol F_{d,d′} for every F/n ∈ Σ_F and (d, d′) ∈ D^{n+1}. A τ-structure making P_d, respectively F_{d,d′}, true then corresponds to a Σ-structure M such that d ∈ P^M, respectively F^M(d) = d′. In this manner, every Σ-structure expanding I_σ has a corresponding τ-structure. Vice versa, every τ-structure A satisfying the requirement that for every function symbol F/n and d ∈ D^n, there is exactly one d′ ∈ D such that F_{d,d′} is true in A, corresponds to a Σ-structure with the same domain as I_σ. That is, there is a one-to-one correspondence between the τ-structures satisfying for every function symbol F/n and d ∈ D^n the formula
(⋁_{d′ ∈ D} F_{d,d′}) ∧ ⋀_{d′_1 ≠ d′_2} (¬F_{d,d′_1} ∨ ¬F_{d,d′_2}) (15)
and the Σ-structures with domain D.
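The "exactly one value" requirement on the propositional symbols for a function symbol can be written out as clauses. The Python sketch below generates one possible CNF encoding (an at-least-one clause plus pairwise at-most-one clauses); the representation of literals and the function name are illustrative assumptions rather than the encoding used by any particular system.

```python
from itertools import combinations, product

def function_clauses(name, arity, domain):
    """Return CNF clauses forcing the symbols F_{d,d'} to encode a total function.

    A propositional symbol is the triple (name, args, value); a literal is a
    pair (sign, symbol) where sign is True for a positive literal.
    """
    domain = list(domain)
    clauses = []
    for args in product(domain, repeat=arity):
        # At least one value d' is assigned to F(args).
        clauses.append([(True, (name, args, value)) for value in domain])
        # No two distinct values are assigned to F(args).
        for v1, v2 in combinations(domain, 2):
            clauses.append([(False, (name, args, v1)), (False, (name, args, v2))])
    return clauses

# A unary function symbol over a three-element domain.
clauses = function_clauses("F", 1, [1, 2, 3])
```

For a domain of size m and arity n this produces m^n · (1 + m(m − 1)/2) clauses, here 3 · (1 + 3) = 12.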
Denote by Σ^{dom(I_σ)} the vocabulary Σ extended with a new constant symbol d for every d ∈ D. We call these new constants domain constants. Abusing notation, we will denote both domain elements and their corresponding domain constants by d. For a formula ϕ[x] and a tuple d of domain constants, we call ϕ[x/d] an instance of ϕ. For a Σ-interpretation M expanding I_σ and a formula ϕ containing domain constants, we denote by M |= ϕ that the expansion of M to Σ^{dom(I_σ)} defined by interpreting every domain constant by its corresponding domain element, satisfies ϕ.
Definition 2. Two formulas ϕ_1 and ϕ_2 over Σ^{dom(I_σ)} are I_σ-equivalent if M |= ϕ_1 iff M |= ϕ_2 for every Σ-interpretation M expanding I_σ.
The following are some straightforward results about I_σ-equivalence.
5. If ψ is a subformula of ϕ and is I_σ-equivalent to ψ′, then the result of replacing ψ by ψ′ in ϕ is I_σ-equivalent to ϕ.
A formula is in ground normal form (GNF) if it contains no quantifiers and all its atomic subformulas are of the form P(d_1, . . ., d_n), F(d_1, . . ., d_n) = d or d_1 = d_2, where d, d_1, . . ., d_n are domain constants. By replacing in a GNF theory every atom P(d) by P_d, every atom F(d) = d′ by F_{d,d′} and every atom d_i = d_j by ⊤ or ⊥ if, respectively, i = j or i ≠ j, and adding the formula (15) for every function symbol F/n and d ∈ D^n, we obtain a propositional theory T_prop such that the models of T and T_prop correspond. Also note the similarity between GNF and TNF theories.
Definition 4. A grounding for T with respect to I_σ is a GNF theory T_g over Σ^{dom(I_σ)} such that T and T_g are I_σ-equivalent. T_g is called reduced if it does not contain symbols of σ.

Grounding Algorithms
For the rest of this section, we assume that T is a theory in TNF. As explained in Section 2.2, we can make this assumption without loss of generality. Below we introduce, as a reference, the grounding for T with respect to I_σ obtained by the naive grounding algorithm mentioned in the introduction. We call this grounding the full grounding and define it formally by induction.
Definition 5. The full grounding Gr_full(ϕ, I_σ) of a TNF sentence ϕ with respect to I_σ is defined by
Gr_full(ϕ, I_σ) =
  ϕ, if ϕ is a literal;
  Gr_full(ψ, I_σ) ∧ Gr_full(χ, I_σ), if ϕ = ψ ∧ χ;
  Gr_full(ψ, I_σ) ∨ Gr_full(χ, I_σ), if ϕ = ψ ∨ χ;
  ⋀_{d ∈ D} Gr_full(ψ[x/d], I_σ), if ϕ = ∀x ψ[x];
  ⋁_{d ∈ D} Gr_full(ψ[x/d], I_σ), if ϕ = ∃x ψ[x]. (16)
The full grounding for T with respect to I_σ is the theory consisting of the full groundings of all sentences in T with respect to I_σ.
We denote the full grounding by Gr_full(T, I_σ), or by Gr_full(T) if I_σ is clear from the context. It follows directly from Lemma 3 that Gr_full(T, I_σ) is indeed a grounding for T with respect to I_σ. The size of the full grounding is exponential in the maximal nesting depth of quantifiers in sentences of T, and polynomial in the domain size of I_σ.
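The inductive construction of the full grounding can be sketched in a few lines of Python. The formula representation is an assumption made for this illustration: formulas are nested tuples, variables are strings, and domain constants are integers so that they cannot be confused with variables.

```python
def substitute(formula, var, const):
    """Replace the free occurrences of variable var in formula by const."""
    kind = formula[0]
    if kind == "atom":
        _, name, args = formula
        return ("atom", name, tuple(const if a == var else a for a in args))
    if kind == "not":
        return ("not", substitute(formula[1], var, const))
    if kind in ("and", "or"):
        return (kind, [substitute(f, var, const) for f in formula[1]])
    _, bound, body = formula            # ("forall" | "exists", variable, body)
    if bound == var:                    # var is rebound here, so nothing is free below
        return formula
    return (kind, bound, substitute(body, var, const))

def ground_full(formula, domain):
    """Full grounding of a TNF sentence: quantifiers become big conjunctions
    or disjunctions over all domain constants; literals are left untouched."""
    kind = formula[0]
    if kind in ("atom", "not"):         # in TNF, negation only occurs in front of atoms
        return formula
    if kind in ("and", "or"):
        return (kind, [ground_full(f, domain) for f in formula[1]])
    _, var, body = formula
    connective = "and" if kind == "forall" else "or"
    return (connective, [ground_full(substitute(body, var, d), domain) for d in domain])

def count_atoms(ground):
    """Number of atom occurrences in a ground formula."""
    if ground[0] == "atom":
        return 1
    if ground[0] == "not":
        return count_atoms(ground[1])
    return sum(count_atoms(f) for f in ground[1])

# Sentence (1) of Example 1 in TNF: forall u forall v (not Sub(u, v) or Edge(u, v)).
sentence_1 = ("forall", "u", ("forall", "v", ("or", [
    ("not", ("atom", "Sub", ("u", "v"))),
    ("atom", "Edge", ("u", "v")),
])))
grounding = ground_full(sentence_1, [0, 1, 2])
```

For a domain of three elements the result contains 3 · 3 instances with two atoms each, which illustrates the growth that is exponential in the quantifier nesting depth.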
An inductive definition like (16) can be evaluated in a top-down or bottom-up way. Both approaches are applied in current grounders. On the one hand, there are grounders that go top-down through the syntax trees of the sentences in T. When a subformula ϕ of the form ∀x ψ[x], respectively ∃x ψ[x], is reached, the grounding of ψ[x/d] is constructed for every domain constant d, and then ϕ is replaced by the conjunction, respectively disjunction, of all these groundings. The grounder of the dlv system (Perri et al., 2007) and the grounders gringo (Gebser et al., 2007) and GidL (Wittocx, Mariën, & Denecker, 2008b) take this approach.
Other grounders go bottom-up through the syntax trees. For each subformula ϕ[x] a table is computed consisting of tuples d and corresponding groundings of ϕ[x/d]. These tables are computed first for atomic formulas and subsequently for compound formulas. For example, let ϕ[x, y, z] be the formula ψ[x, y] ∧ χ[y, z] and assume the tables for ψ and χ have been computed. Then the table for ϕ is computed by taking the natural join of the tables for ψ and χ on the value for y, and constructing the grounding for ϕ[x/d_x, y/d_y, z/d_z] as the (possibly simplified) conjunction of the groundings for ψ[x/d_x, y/d_y] and χ[y/d_y, z/d_z]. Examples of grounders with a bottom-up approach are lparse (Syrjänen, 2000; Syrjänen, 2009), kodkod (Torlak & Jackson, 2007) and mxg (Mitchell et al., 2006).
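The join step for ϕ[x, y, z] = ψ[x, y] ∧ χ[y, z] can be sketched as a hash join in Python. The sketch assumes that tables are dictionaries from tuples of domain elements to groundings and that a tuple missing from a table has grounding ⊥, so that it contributes nothing to a conjunction; both conventions and all names are assumptions of this illustration.

```python
def join_tables(psi_table, chi_table):
    """Join the table for psi[x, y] with the table for chi[y, z] on y.

    psi_table maps (dx, dy) to a grounding of psi[x/dx, y/dy]; chi_table maps
    (dy, dz) to a grounding of chi[y/dy, z/dz]. The result maps (dx, dy, dz)
    to the conjunction of the two groundings.
    """
    # Index the second table on the shared variable y.
    by_y = {}
    for (dy, dz), grounding in chi_table.items():
        by_y.setdefault(dy, []).append((dz, grounding))

    joined = {}
    for (dx, dy), g_psi in psi_table.items():
        for dz, g_chi in by_y.get(dy, []):
            joined[(dx, dy, dz)] = ("and", g_psi, g_chi)
    return joined

psi_table = {(1, 2): "P(1,2)", (1, 3): "P(1,3)"}
chi_table = {(2, 5): "Q(2,5)", (4, 5): "Q(4,5)"}
phi_table = join_tables(psi_table, chi_table)
```

Only the pair of rows that agree on y = 2 produces an entry, so the table for ϕ has a single row even though the domain may be much larger.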
To obtain a reduced grounding for T with respect to I_σ one could first construct the full grounding and then replace every subformula ϕ over σ^{dom(I_σ)} in it by ⊤ if I_σ |= ϕ and by ⊥ otherwise. The result can further be simplified by recursively replacing ⊥ ∧ ψ by ⊥, ⊤ ∧ ψ by ψ, etc. The resulting grounding is the one computed by most current grounding algorithms and is often a lot smaller than the full grounding. We denote it by Gr_red(T, I_σ), or by Gr_red(T) if I_σ is clear from the context.
Smart grounding algorithms do not use the approach outlined above, but try to avoid creating the full grounding by substituting ground formulas over the input vocabulary σ as soon as possible. For example, a grounder with a top-down approach constructs the grounding of ∀x ψ[x] by grounding all instances ψ[x/d] one by one and then making the conjunction. During this process, all instances ψ[x/d] that are detected to be certainly true are omitted. As soon as an instance ψ[x/d] is detected to be certainly false, ⊥ is returned as grounding for ∀x ψ[x].
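The top-down treatment of a universal quantifier, with true instances dropped and an early exit on the first false instance, can be sketched as follows. The constants TRUE and FALSE, the callback ground_instance and the toy data are assumptions of this illustration.

```python
TRUE, FALSE = "TRUE", "FALSE"

def ground_forall(domain, ground_instance):
    """Ground 'forall x psi[x]' top-down.

    ground_instance(d) returns the (already simplified) grounding of psi[x/d],
    which may be TRUE, FALSE, or a ground formula.
    """
    conjuncts = []
    for d in domain:
        g = ground_instance(d)
        if g == FALSE:
            return FALSE          # one false instance falsifies the whole conjunction
        if g != TRUE:
            conjuncts.append(g)   # certainly true instances are omitted
    return ("and", conjuncts) if conjuncts else TRUE

# Instances of 'forall v (Sub(0, v) implies Edge(0, v))' with Edge taken from the data.
edges = {(0, 1)}

def instance(v):
    return TRUE if (0, v) in edges else ("not", ("Sub", 0, v))

result = ground_forall(range(3), instance)
```

Here the instance for v = 1 is certainly true and is dropped, leaving the two unit clauses for the non-edges (0, 0) and (0, 2).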
A grounder using the bottom-up approach can reduce the size of the tables it computes by not storing tuples that have some default value, e.g., ⊤, as corresponding grounding. In particular, if ϕ[x] is a formula over σ, it only stores the tuples d such that I_σ |= ϕ[x/d]. By reducing the size of the tables in this way, the reduced grounding can be obtained much more efficiently.

Grounding with Bounds
In this section we present our method for reducing grounding size. As mentioned in the introduction, it is based on computing bounds for subformulas of the input theory T. Each bound for a subformula ϕ[x] is a formula over the input vocabulary σ. It describes a set of tuples d for which ϕ[x/d] is certainly true (false) in every model of T expanding any I_σ. The larger the set described by a bound, the more precise the bound is. Observe that the fact that bounds are formulas over σ means that they can be evaluated using the given structure I_σ.
In Section 4.1, we formally define bounds. Then we indicate how bounds can be inserted in T to obtain a new theory T′. The reduced grounding of T′ is often a lot smaller than the reduced grounding of T. The more precise the inserted bounds are, the smaller the grounding of T′ becomes. However, we will see that T′ is in general weaker than T and that additional axioms have to be added to T′ to obtain equivalence with T. These additional axioms need to be grounded as well so that, if we are not careful, the total size of the grounded theory does not decrease at all. In Section 4.3, we search for sufficient conditions on the bounds to guarantee a smaller grounding.
In Section 4.4, we show how to derive bounds. Our method works in two stages. First, bounds for all subformulas of T are computed using an any-time algorithm. The longer the algorithm runs, the more precise bounds are derived. Often, the bounds derived at this stage do not lead to smaller groundings, for the reason explained in the previous paragraph. In the second stage, bounds that satisfy the conditions to guarantee smaller groundings are derived from the ones computed in the first stage.

Bounds
We distinguish between two kinds of bounds.
Definition 6. A certainly true bound (ct-bound) over σ with respect to T for a formula ϕ[x] is a formula ϕ^ct[y] over σ such that y ⊆ x and T |= ∀x (ϕ^ct[y] ⊃ ϕ[x]). Vice versa, a certainly false bound (cf-bound) over σ with respect to T for ϕ[x] is a formula ϕ^cf[z] over σ such that z ⊆ x and T |= ∀x (ϕ^cf[z] ⊃ ¬ϕ[x]). We do not mention σ and T if they are clear from the context.
Intuitively, a ct-bound ϕ^ct for ϕ[x] provides for every structure I_σ a lower bound for the set of tuples for which ϕ is true in every model of T expanding I_σ. Indeed, for every M |=_{I_σ} T we have that {x | ϕ^ct}^{I_σ} ⊆ {x | ϕ}^M. Vice versa, a cf-bound ϕ^cf provides a lower bound on the set of tuples for which ϕ is false: {x | ϕ^cf}^{I_σ} ⊆ {x | ¬ϕ}^M for every M |=_{I_σ} T. Observe that the negation of a ct-bound, respectively cf-bound, gives an upper bound on the set of tuples for which ϕ is false, respectively true, in at least one model of T expanding I_σ.
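The inclusion between the answers to a bound in the input structure and the answers to the bounded formula in a model can be checked directly on Example 1. The Python sketch below evaluates the cf-bound ¬Edge(x, y) for Sub(x, y) in a small input structure and compares it with one particular model; the helper answers and the concrete graph are assumptions of this illustration.

```python
from itertools import product

def answers(domain, arity, holds):
    """Set of answers to a query: all tuples over the domain that satisfy holds."""
    return {t for t in product(domain, repeat=arity) if holds(*t)}

domain = [0, 1, 2]
edge = {(0, 1), (0, 2), (1, 2)}   # input structure: interpretation of Edge
sub = {(0, 1), (1, 2)}            # one model: a subgraph with out-degree at most one

# Answers to the cf-bound 'not Edge(x, y)', evaluated in the input structure only.
certainly_false = answers(domain, 2, lambda x, y: (x, y) not in edge)
# Tuples for which Sub(x, y) is false in the chosen model.
false_in_model = answers(domain, 2, lambda x, y: (x, y) not in sub)

bound_is_respected = certainly_false <= false_in_model
```

The bound singles out six of the nine pairs without looking at Sub at all, while seven pairs are actually false in this model; the remaining pair (0, 2) is an edge that the model happens not to select.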
Observe that ⊤ is a ct-bound for every sentence of T. Indeed, for every sentence ϕ of T, T |= ϕ and therefore T |= ⊤ ⊃ ϕ. Also, ⊥ is a ct-bound as well as a cf-bound for every formula. We call ⊥ the trivial bound. Intuitively, the trivial bound contains no information at all: {x | ⊥}^{I_σ} = ∅ for every I_σ and x. According to the following definition, it is the least precise bound.
If ψ is a more precise bound for ϕ[x] than χ, ψ provides a larger lower bound because {x | χ}^{I_σ} ⊆ {x | ψ}^{I_σ} for every I_σ.
Definition 8. A c-map C for T over σ is a mapping from all subformulas ϕ of T to tuples (C^ct(ϕ), C^cf(ϕ)), where C^ct(ϕ) and C^cf(ϕ) are respectively a ct- and cf-bound for ϕ over σ with respect to T.

The notion of precision pointwise extends to c-maps.
Let M be a model of T and C a c-map for T over σ. From the definition of ct- and cf-bounds it follows immediately that for every subformula ϕ[x] of T, M |= ∀x (C^ct(ϕ) ⊃ ϕ) and M |= ∀x (C^cf(ϕ) ⊃ ¬ϕ). We say that a structure satisfies C if it has precisely this property.
Definition 9. Let C be a c-map for T over σ. Then the theory C̃ is defined by
C̃ = {∀x (C^ct(ϕ) ⊃ ϕ) | ϕ[x] is a subformula of T} ∪ {∀x (C^cf(ϕ) ⊃ ¬ϕ) | ϕ[x] is a subformula of T}.
A c-map is inconsistent if some formula ϕ is both certainly true and certainly false for some tuple, according to that c-map; it is I_σ-inconsistent if this is the case when the bounds are evaluated in I_σ.
Proposition 11. If there exists an I_σ-inconsistent c-map for T over σ, then M ⊭_{I_σ} T for every M. If there exists an inconsistent c-map for T over σ, then M ⊭_{I_σ} T for every M and I_σ.
Proof. Let C be an I_σ-inconsistent c-map for T over σ and ϕ[x] a subformula of T such that I_σ |= ∃x (C^ct(ϕ) ∧ C^cf(ϕ)). Then there exists a tuple of domain elements d such that I_σ |= C^ct(ϕ)[x/d] and I_σ |= C^cf(ϕ)[x/d]. Any M with M |=_{I_σ} T would therefore satisfy both ϕ[x/d] and ¬ϕ[x/d], which is impossible. To prove the second statement, let C be an inconsistent c-map for T over σ. Then C is also an I_σ-inconsistent c-map for every σ-structure I_σ. As such, for any I_σ there is no model of T expanding I_σ.

C-Transformation
For the rest of this section, fix a c-map C for T over σ. We now show how to insert the bounds of C into the sentences of T. This insertion is based on the following lemma.
Lemma 13. Let ψ be a sentence of T and ϕ a subformula of ψ. If ψ′ is the result of replacing the subformula ϕ in ψ by ϕ then both ϕ ∨ C ct (ϕ) and ϕ ∧ ¬C cf (ϕ) are logically equivalent to ϕ. Hence, in this case the sentence ψ′ in Lemma 13 is essentially the sentence ψ. Intuitively, adding trivial bounds to a sentence ψ does not change the sentence at all.
The bounds assigned by C can be "inserted" in T by applying the transformation of Lemma 13 to all subformulas of T. The result is called a c-transformation of T, and is formally defined as follows.
Definition 14 (c-transformation). A c-transformation of a subformula ϕ of T with respect to C, denoted C ϕ , is the formula (ϕ′ ∧ ¬C cf (ϕ)) ∨ C ct (ϕ), where ϕ′ is ϕ itself if ϕ is an atom and otherwise results from ϕ by replacing each of its direct subformulas by its c-transformation with respect to C. A c-transformation C T of T with respect to C consists of a c-transformation with respect to C of every sentence of T.
From Lemma 13, we derive the following.
Lemma 15.T and C T are C-equivalent.
In general T and C T are not logically equivalent. C T may have models that do not satisfy C, and therefore cannot be models of T. For example, let C be the c-map that assigns (⊤, ⊥) to every sentence and (⊥, ⊥) to every other subformula of T. Then all sentences in C T are of the form ϕ ∨ ⊤ and hence C T simplifies to ⊤, which is in general not equivalent to T. To obtain from C T a theory that is equivalent to T, we must add C.

Atom-Based and Atom-Equal C-Maps
Corollary 17 implies that we can compute a grounding for T with respect to I σ by first computing a c-map C for T over σ and then grounding C T ∪ C. This approach is beneficial if the reduced grounding of C T ∪ C is smaller than the reduced grounding of T, and can be constructed at least as fast. In general these conditions are not satisfied. The more precise the c-map C is, the smaller the reduced grounding of C T becomes, but the larger the reduced grounding of C is: Moreover, every subformula that occurs in Gr red (C 1 T ) also occurs in Gr red (C 2 T ).
Proof. (Sketch) Let ϕ[x] be a subformula of T and d a tuple of domain elements. It suffices to show that if C 2 ϕ [x/d] is replaced by ⊤, respectively ⊥, when grounding, then this is also the case for C 1 ϕ [x/d]. This can be proven by induction. For the base case, assume ϕ is an atom. Then If this formula is replaced by ⊤ or ⊥ when grounding, there are three possibilities: ] is replaced by ⊤ or ⊥ when grounding, then this is also the case for
A c-map that is useful to reduce grounding size should therefore not be too precise, in order to avoid a large theory Gr red (C), but still be precise enough to decrease the size of Gr red (C T ). In this section, we present sufficient conditions to ensure these properties. We first define a class of c-maps that "avoid" a blow-up of Gr red (C) by ensuring C can be replaced by an equivalent, smaller and easy-to-find theory C A. As such, Gr red (C) can be replaced by the smaller theory Gr red (C A ). In the class we present, C A is a subset of C, namely the set of sentences in C that stem from the atomic subformulas of T:
Definition 20. Define the theory C A by
Example 5 (Example 1 ctd.). Let C 2 be the c-map that assigns (⊥, ¬(Edge(x, y) ∧ Edge(x, z))) to Sub(x, y) ∧ Sub(x, z) and (⊥, ⊥) to every other subformula. C 2 is not atom-based, since (C 2 ) A is equivalent to ⊤, while C 2 contains the sentence
∀x∀y∀z (¬(Edge(x, y) ∧ Edge(x, z)) ⊃ ¬(Sub(x, y) ∧ Sub(x, z))). (17)
Let C 3 be the c-map that assigns (⊥, ¬Edge(x, y)) to Sub(x, y), (⊥, ¬Edge(x, z)) to Sub(x, z) and corresponds to C 2 on all other subformulas of T 1. C 3 is atom-based. Indeed, (C 3 ) A consists of the (equivalent) sentences
∀x∀y (¬Edge(x, y) ⊃ ¬Sub(x, y)), (18)
∀x∀z (¬Edge(x, z) ⊃ ¬Sub(x, z)), (19)
and C 3 consists of the sentences (17), (18) and (19). Both (18) and (19) imply (17), and therefore (C 3 ) A implies C 3.
Clearly, a c-map assigning (⊥, ⊥) to every non-atomic subformula of T is an example of an atom-based c-map. As such, any c-map can be transformed into an atom-based one by replacing every bound assigned to a non-atomic subformula by ⊥. In the next section, we show how to compute more interesting atom-based c-maps.
Observe that Gr red (C A ) contains only unit clauses. Combining the definition of atom-based c-map and Theorem 16 immediately gives the following result.
Proposition 21. Let C be an atom-based c-map for T over σ. Then T and C T ∪ C A are equivalent, and hence I σ -equivalent for every σ-structure I σ.
To obtain small groundings using bounds, it is important that the information in the bounds is exploited wherever possible. In particular, if a ct- or cf-bound ψ is assigned to an atom P (x), then a similar bound should be assigned to every other atom of the form P (y). We call a c-map atom-equal if it has exactly this property for all atomic subformulas of T. That is, C is atom-equal if it assigns essentially the same bounds to atomic subformulas over the same predicate or function symbol:
Definition 22. A c-map C for a TNF theory T over σ is atom-equal if for every predicate symbol P/n there exist formulas ϕ ct P [x 1 , . . ., x n ] and ϕ cf P [x 1 , . . ., x n ] such that for every atom and similarly for function symbols.
Note that if no predicate or function symbol occurs more than once in a theory T , then every c-map for T is atom-equal.
Example 6 (Example 1 ctd.).Let T 2 be the theory obtained by adding the sentence ∃w Sub(w, w) to T 1 .The only predicate that occurs more than once in T 2 is the predicate Sub.Let C 4 be a c-map for T 2 that assigns the following bounds to the atomic subformulas of T 2 over Sub: (⊥, ¬Edge(u, v)) to Sub(u, v), (⊥, ¬Edge(x, y)) to Sub(x, y), (⊥, ¬Edge(x, z)) to Sub(x, z) and (⊥, ¬Edge(w, w)) to Sub(w, w).Then C 4 is atom-equal.Indeed, if we take ϕ ct Sub = ⊥ and ϕ cf Sub = ¬Edge(x 1 , x 2 ), then the conditions of Definition 22 are satisfied for predicate Sub.
For an atom-equal c-map C, C A in general contains many equivalent sentences. For example, for the c-map C 4 as in Example 6, (C 4 ) A contains, among others, the equivalent sentences (18) and (19). It also contains ∀w (¬Edge(w, w) ⊃ ¬Sub(w, w)), which is implied by (18). As a result, if C is an atom-equal c-map, grounding C A in a naive way yields a grounding that contains several formulas more than once. In the following proposition, we assume this redundancy is removed. In other words, we assume a grounding algorithm for C A that never adds the same GNF formula more than once to the grounding. This can be accomplished by grounding, instead of C A , the sentences ∀x (ϕ ct P ⊃ P (x)) and ∀x (ϕ cf P ⊃ ¬P (x)) for every predicate symbol P, where ϕ ct P and ϕ cf P are as in Definition 22, and similarly for function symbols.
Proposition 23. Let C be an atom-based, atom-equal c-map for a TNF theory T. If T has a model expanding I σ , then Gr red (C T ∪ C A ) is at most as large as Gr red (T ).
In the proof, we denote the size of a theory T g by |T g |.
Proof. The outline of this proof is as follows. First, we show that every subformula that occurs in Gr red (C T ) occurs in Gr red (T ). Then, we prove that no atom occurring in Gr red (C A ) occurs in Gr red (C T ). Next, we show that every atom occurring in Gr red (C A ) occurs at least once in Gr red (T ). Since we assumed Gr red (C A ) does not contain any formula more than once, it follows that |Gr red (C T ∪ C A )| ≤ |Gr red (T )|.
We can directly apply Proposition 18 to show that every subformula of Gr red (C T ) occurs in Gr red (T ): if C′ is the trivial c-map, then Gr red (T ) is equal to Gr red (C′ T ), and clearly C is more precise than C′.
We now show that none of the atoms occurring in Gr red (C A ) occur in Gr red (C T ). Let P (d) be an atom occurring in Gr red (C T ). Then there is an atomic subformula
It remains to show that every atom that occurs in Gr red (C A ) also occurs in Gr red (T ). Let M be a model of Gr red (T ). Such a model exists because we assumed that T has a model expanding I σ. Let P (d) be an atom that does not occur in Gr red (T ). If P is a predicate of the input vocabulary, then P (d) does not occur in Gr red (C A ) either. If on the other hand P is in the expansion vocabulary, then the structure M′ obtained from M by swapping the truth value of P (d) is also a model of Gr red (T ). Since Gr red (C T ∪ C A ) is I σ -equivalent to Gr red (T ) and P ∉ σ, it follows that M |= Gr red (C A ) and M′ |= Gr red (C A ). Because Gr red (C A ) only contains unit clauses, we conclude that P (d) does not occur in Gr red (C A ).
We now have the following algorithm to create a small grounding for T with respect to I σ : first compute an atom-based, atom-equal c-map C for T over σ (We will present an algorithm for this in Section 4.4).If C is I σ -inconsistent, output ⊥ and stop.Else, output Gr red (C T ∪ C A ).
It follows from Propositions 11 and 21 that the result of this algorithm is indeed a grounding for T with respect to I σ .Observe that the first step of this algorithm is independent of I σ .If one has to solve several model expansion problems with a fixed input theory T and input vocabulary σ, but varying I σ , it suffices to compute C only once.
To perform the last step of the algorithm, one could apply any off-the-shelf grounder on input C T ∪ C A .
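The effect of this algorithm on grounding size can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the sentence ∀x∀y∀z (Sub(x, y) ∧ Sub(x, z) ⊃ y = z), the function names and the example graph are hypothetical, and the cf-bound ¬Edge for atoms over Sub is the one discussed in Examples 5 and 6.

```python
from itertools import product

def ground_functionality(domain, edge, use_bounds):
    """Ground  forall x,y,z (not Sub(x,y) or not Sub(x,z) or y = z)  over `domain`.

    Each returned clause is the pair of Sub atoms that may not both hold.
    With use_bounds=True the cf-bound 'not Edge' of each Sub atom is used:
    an instance containing a certainly false Sub atom is already satisfied.
    """
    clauses = []
    for x, y, z in product(domain, repeat=3):
        if y == z:
            continue  # the equality is true, so the instance is dropped
        if use_bounds and ((x, y) not in edge or (x, z) not in edge):
            continue  # some Sub atom is certainly false: instance is true
        clauses.append((("Sub", x, y), ("Sub", x, z)))
    return clauses

def ground_bound_units(domain, edge):
    """Atoms of the unit clauses 'not Sub(u, v)' obtained by grounding
    forall u,v (not Edge(u,v) -> not Sub(u,v))."""
    return [("Sub", u, v) for u, v in product(domain, repeat=2)
            if (u, v) not in edge]

domain = [0, 1, 2, 3]
edge = {(0, 1), (0, 2), (1, 2)}
naive = ground_functionality(domain, edge, use_bounds=False)
bounded = ground_functionality(domain, edge, use_bounds=True)
units = ground_bound_units(domain, edge)
print(len(naive), len(bounded), len(units))
```

On this input the sentence grounds to 48 clauses without bounds and to 2 clauses with bounds, at the cost of 13 unit clauses.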

Computing Bounds
We now present an algorithm to compute a (non-trivial) c-map C. It is based on our work on approximate reasoning for FO (Wittocx, Mariën, & Denecker, 2008a). In general the resulting c-map is neither atom-based nor atom-equal, but an atom-based, atom-equal c-map can be derived from it.

Refining C-Maps
Constructing a non-trivial c-map can be done by starting from the least precise c-map, i.e., the one that assigns (⊥, ⊥) to every subformula of T, and then gradually refining it. Each refinement step consists of three operations:
1. Choose a subformula ϕ of T.
2. Compute from the current c-map C a new ct-bound ϕ r ct or cf-bound ϕ r cf for ϕ.
3. Assign C ct (ϕ) ∨ ϕ r ct , respectively C cf (ϕ) ∨ ϕ r cf , as the new ct-bound, respectively cf-bound, of ϕ.
We elaborate on the second step below: we present six different ways to obtain new ct- or cf-bounds, called refinement bounds, from T and C. If the sentences of T are represented by their "syntax trees", each node corresponds to a subformula of T. Bottom-up refinement bounds are bounds for a node computed by considering the bounds assigned by C to its children. Vice versa, top-down refinement bounds are computed by looking at the parents and siblings of a node. Axiom refinement bounds are bounds for the roots, i.e., for the sentences of T, while input, copy and functional refinement bounds are in practice mainly bounds for atomic subformulas of T.
According to the following lemma, a refinement step yields a new bound for ϕ that is more precise than the one assigned by C.
Lemma 24.If ψ and χ are two ct-bounds for ϕ with respect to T , then ψ ∨ χ is also a ct-bound for ϕ.Moreover, ψ ∨ χ is more precise than ψ and more precise than χ.The same holds for cf-bounds.
We conclude that repeatedly applying refinement steps leads to a more and more precise c-map.The resulting algorithm is an any-time algorithm.In Section 6 we will discuss a stop criterion for the algorithm.We will also give examples where it can reach a fixpoint, and examples where it cannot.
We now present the different ways to obtain refinement bounds.
Input Refinement Let ϕ[x] be a formula over the input vocabulary σ. Then ϕ is a ct-bound and ¬ϕ is a cf-bound for ϕ. We call these input refinement ct- and cf-bounds.
Axiom Refinement If ϕ is a sentence of T, then ⊤ is an axiom refinement ct-bound for ϕ. This refinement bound states that a sentence of T is true in every model of T.
Bottom-Up Refinement For a compound subformula ϕ, depending on its structure, Table 1 gives the bottom-up refinement ct-bound ϕ r ct and cf-bound ϕ r cf for ϕ with respect to C. It is rather straightforward to obtain these formulas. For instance, the formula in the bottom-right of the table indicates that if ϕ is the formula ψ ∨ χ, then ϕ is certainly false for those tuples for which both ψ and χ are certainly false. Or, more formally, if both
Top-Down Refinement In the case of top-down refinements, the bounds of a formula ψ are used to construct refinement bounds for one of its direct subformulas ϕ (i.e., ϕ is one of ψ's children in the syntax tree). The top-down refinement ct-bounds ϕ r ct and cf-bounds ϕ r cf for ϕ are given in Table 2. In this table, the tuple y denotes the free variables of ψ that do not occur in ϕ and x′ denotes a new variable. We illustrate some of these refinement bounds. For further explanation why these bounds are in a certain sense the most precise ones that can be obtained, we refer to our work on approximate reasoning (Wittocx et al., 2008a).
Table 2: Top-down refinement bounds
Let ψ be the formula ∀x P (x, y). Recall that intuitively, the ct-bound C ct (ψ) indicates for which domain elements d, ∀x P (x, d) is certainly true. For such a d and an arbitrary d′ ∈ D, P (d′, d) must be true. Hence, C ct (ψ) is a ct-bound for ϕ. Indeed, since x does not occur free in C ct (ψ), T |= ∀x∀y (C ct (ψ) ⊃ P (x, y)) follows from T |= ∀y (C ct (ψ) ⊃ ∀x P (x, y)).
Now let ψ be the formula P (x) ∧ Q(x, y). If we know that
Let ψ be the formula ∃x P (x, y) and assume that ∃x P (x, d y ) is certainly true, but for all d′ x , except d x , P (d′ x , d y ) is certainly false. Then we can conclude that P (d x , d y ) must be true. This is precisely what is expressed by the formula
Functional Refinement If ϕ[x, y] is the formula F (x) = y, functional refinement bounds for ϕ take into account that F is a function. The functional refinement ct-bound ϕ r ct and cf-bound ϕ r cf are given by ∀y′ (y′ ≠ y ⊃ C cf (ϕ)[y/y′]) and ∃y′ (y′ ≠ y ∧ C ct (ϕ)[y/y′]), respectively, where y′ is a new variable. Informally, the first of these formulas indicates that F (x) is certainly equal to y if for every y′ ≠ y, F (x) is certainly not equal to y′. The second one says that F (x) is certainly not equal to y if F (x) is certainly equal to y′ for some y′ ≠ y.
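As a rough illustration of the bottom-up case, the following Python sketch computes bottom-up refinement bounds over formulas encoded as nested tuples. The encoding and the function name `bottom_up` are hypothetical. The rules for ∧ and ∨ agree with the cases spelled out in the text; the rules for ¬, ∀ and ∃ are the usual dual ones and are an assumption here, since Table 1 is not reproduced.

```python
def bottom_up(phi, c_map):
    """Return the (ct, cf) bottom-up refinement bounds of a compound formula.

    Formulas are nested tuples such as ("and", f, g) or ("forall", "x", f);
    c_map maps every direct subformula of phi to its pair (ct, cf).
    """
    op = phi[0]
    if op == "not":
        ct, cf = c_map[phi[1]]
        return cf, ct  # assumed rule: the two bounds swap under negation
    if op in ("and", "or"):
        (ct1, cf1), (ct2, cf2) = c_map[phi[1]], c_map[phi[2]]
        dual = "or" if op == "and" else "and"
        return (op, ct1, ct2), (dual, cf1, cf2)
    if op in ("forall", "exists"):
        _, var, body = phi
        ct, cf = c_map[body]
        dual = "exists" if op == "forall" else "forall"
        return (op, var, ct), (dual, var, cf)  # assumed dual quantifier rule
    raise ValueError(f"not a compound formula: {phi!r}")

BOT = ("bot",)
sub_xy, sub_xz = ("Sub", "x", "y"), ("Sub", "x", "z")
c_map = {
    sub_xy: (BOT, ("not", ("Edge", "x", "y"))),
    sub_xz: (BOT, ("not", ("Edge", "x", "z"))),
}
ct, cf = bottom_up(("and", sub_xy, sub_xz), c_map)
print(cf)
```

For the conjunction Sub(x, y) ∧ Sub(x, z), the computed cf-bound is the disjunction ¬Edge(x, y) ∨ ¬Edge(x, z) of the cf-bounds of its two conjuncts.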
Copy Refinement Let ϕ[x 1 , . . ., x n ] and ψ[y 1 , . . ., y m ] be two formulas such that ϕ[x 1 /z, . . ., x n /z] and ψ[y 1 /z, . . ., y m /z] are the same, modulo a renaming of their non-free variables.That is, ϕ and ψ have exactly the same syntax tree, but their variables may differ.Denote by E(ϕ, ψ) the set of all equalities x i = y j such that for some occurrence of x i in ϕ, y j occurs in the corresponding position in ψ.Then the formula ∃y 1 . . .∃y m (C ct (ψ) ∧ E(ϕ, ψ)) is a copy refinement ct-bound for ϕ and the formula ∃y 1 . . .∃y m (C cf (ψ) ∧ E(ϕ, ψ)) is a copy refinement cf-bound for ϕ.We also say that these are the copy-refinement bounds from ψ to ϕ.
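The set E(ϕ, ψ) can be obtained by walking the two syntax trees in parallel. The Python sketch below shows one way to do this; the tuple encoding of formulas and the name `equalities` are hypothetical, and only variables (no nested terms) are assumed as atom arguments.

```python
def equalities(phi, psi, bound_phi=frozenset(), bound_psi=frozenset()):
    """Collect the pairs (x_i, y_j) of free variables that occur at
    corresponding positions of phi and psi.

    Atoms are ("atom", predicate, (arg, ...)); quantifiers are
    ("forall" | "exists", variable, body); connectives are (op, sub, ...).
    """
    if phi[0] != psi[0] or len(phi) != len(psi):
        raise ValueError("formulas do not have the same syntax tree")
    op = phi[0]
    if op == "atom":
        (_, p, args_phi), (_, q, args_psi) = phi, psi
        if p != q or len(args_phi) != len(args_psi):
            raise ValueError("atoms over different predicates")
        # Bound variables are skipped: they are renamed, not equated.
        return {(a, b) for a, b in zip(args_phi, args_psi)
                if a not in bound_phi and b not in bound_psi}
    if op in ("forall", "exists"):
        return equalities(phi[2], psi[2],
                          bound_phi | {phi[1]}, bound_psi | {psi[1]})
    pairs = set()
    for sub_phi, sub_psi in zip(phi[1:], psi[1:]):
        pairs |= equalities(sub_phi, sub_psi, bound_phi, bound_psi)
    return pairs

phi = ("and", ("atom", "P", ("x1", "x1")),
       ("forall", "s", ("atom", "Q", ("x2", "s"))))
psi = ("and", ("atom", "P", ("y1", "y2")),
       ("forall", "t", ("atom", "Q", ("y2", "t"))))
print(sorted(equalities(phi, psi)))
```

The two formulas used here are P (x 1 , x 1 ) ∧ ∀s Q(x 2 , s) and P (y 1 , y 2 ) ∧ ∀t Q(y 2 , t).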
Example 7. Let ϕ be the formula P (x 1 , x 1 ) ∧ ∀s Q(x 2 , s) and ψ the formula P (y 1 , y 2 ) ∧ ∀t Q(y 2 , t). Because ϕ[x 1 /z, x 2 /z] is equal to ψ[y 1 /z, y 2 /z] modulo the renaming of s by t, these formulas satisfy the requirement for copy refinement. The set E(ϕ, ψ) is given by {x 1 = y 1 , x 1 = y 2 , x 2 = y 2 } and hence, ∃y 1 ∃y 2 (C ct (ψ) ∧ x 1 = y 1 ∧ x 1 = y 2 ∧ x 2 = y 2 ) is a copy refinement ct-bound for ϕ. Observe that if C ct (ψ) does not contain bound occurrences of x 1 or x 2 , this formula is equivalent to the simpler formula x 1 = x 2 ∧ C ct (ψ)[y 1 /x 1 , y 2 /x 2 ].
One-Step Refinements We call ϕ r ct (ϕ r cf ) a refinement ct-bound (cf-bound) for ϕ with respect to C if it is an input, axiom, bottom-up, top-down, functional or copy refinement ct-bound (cf-bound) for ϕ with respect to C. Lemma 25 states that a refinement ct-bound (cf-bound) is indeed a ct-bound (cf-bound).

Lemma 25. If ϕ r ct is a refinement ct-bound for ϕ with respect to C, then it is a ct-bound for ϕ. Similarly for cf-bounds.
Proof.The proof consists of a simple analysis of all cases.We proved some of the cases when we introduced input, bottom-up and top-down refinement.The proof of the other cases is similar.
Definition 26. Let C be a c-map for T over σ, ϕ a subformula of T, ϕ r ct a refinement ct-bound and ϕ r cf a refinement cf-bound for ϕ with respect to C. An assignment C r that corresponds to C, except that it assigns (C ct (ϕ) ∨ ϕ r ct , C cf (ϕ)) or (C ct (ϕ), C cf (ϕ) ∨ ϕ r cf ) to ϕ, is called a one-step refinement of C.
From Lemmas 24 and 25 we obtain the following result.
Proposition 27.Every one-step refinement of a c-map for T over σ is a c-map for T over σ.
As already mentioned at the beginning of this section, one can compute a c-map for T over σ by first assigning (⊥, ⊥) to every subformula of T and then repeatedly applying one-step refinements.We call this nondeterministic any-time algorithm the refinement algorithm.
Example 8 (Example 1 ctd.). Figure 1 shows a possible run of the refinement algorithm for input T and σ.Here, the sentences of T 1 are represented by their syntax trees.The numbers indicate at which step the bounds are refined.The trivial bounds are not shown.
In step (1), ct-bound ⊥ for the first sentence is replaced by ⊥ ∨ ⊤ using axiom refinement. Of course, this new bound can be simplified to ⊤. For all following steps, the figure shows simplified bounds. In steps (2) and (3) the bounds of subformula Edge(u, v) are refined by input refinement. Then, top-down refinement is used to set the ct-bound of ¬Sub(u, v) ∨ Edge(u, v) to ⊤. Next, by top-down refinement, ¬Edge(u, v) becomes the ct-bound for ¬Sub(u, v) and then the cf-bound for Sub(u, v).
At this step, a fixpoint is reached: every one-step refinement that can be performed yields a bound that is logically equivalent to the one it tries to refine.
Example 9. Consider a simplified planning problem, where actions should be scheduled such that if an action a p is a precondition of an action a 0 , then a p is performed at an earlier time point than a 0. This problem is described by the theory T 3 , consisting of the sentence ∀a 0 ∀a p ∀t 0 (Prec(a p , a 0 ) ∧ Do(a 0 , t 0 ) ⊃ ∃t p (t p < t 0 ∧ Do(a p , t p ))).
From this sentence, it follows that if a chain of i actions must be executed before a 0 can be executed, then a 0 cannot be executed before the ith time point. Therefore, for any i > 0, the following formula is a cf-bound for Do(a 0 , t 0 ) over σ 2 = {Prec, <}: Denote this formula by χ i. For any n > 0 and a sufficient number of steps, the refinement algorithm can derive that ψ n := χ 1 ∨ . . . ∨ χ n is a cf-bound for Do(a 0 , t 0 ). Clearly, for n 1 ≠ n 2 , ψ n1 is not logically equivalent to ψ n2. This indicates that the refinement algorithm will not reach a fixpoint for input T 3 and σ 2.
As shown by the examples, there are several issues concerning the practical implementation of the refinement algorithm.
1. Due to the non-deterministic nature of the algorithm, a heuristic is needed to choose which bounds to refine and which kind of refinement to apply.A reasonable choice is to first apply all possible axiom and input refinements.Then, top-down refinement for formula ϕ is applied only if a bound for its parent or one of its siblings in the syntax tree has recently been refined.Similarly, bottom-up refinement is applied if a bound for one of ϕ's children has been refined.Such a strategy was used in Example 1.
2. The bounds should be simplified at regular time points, i.e., they should be replaced by equivalent but smaller formulas.If bounds are not simplified, they can only grow in size, rapidly leading to formulas of unwieldy size.A simplification algorithm is discussed in Section 6.
3. To be able to detect that a fixpoint has been reached, one needs to find out that two bounds are equivalent. In general this is undecidable. To detect a fixpoint in at least some cases, one could use an FO theorem prover (and restrict its running time).
In case a fixpoint cannot be reached or detected, another stop criterion is needed. For example, one could restrict the number of one-step refinements, or the total time the refinement algorithm can use. Another stop criterion and a simple fixpoint check are discussed in Section 6.
Figure 1: Refining a c-map
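The control flow of such a bounded refinement loop might look as follows in Python. This is a minimal sketch under simplifying assumptions, not the implementation discussed in Section 6: each bound is stored as a set of disjuncts (the empty set stands for ⊥), the fixpoint check is purely syntactic, and `refine`, `toy_proposals` and the string encoding of formulas are hypothetical.

```python
def refine(c_map, proposals, max_steps=100):
    """Apply one-step refinements until no proposal adds a new disjunct
    or until max_steps refinements have been performed.

    c_map maps a node to {"ct": set_of_disjuncts, "cf": set_of_disjuncts}.
    proposals(c_map) yields (node, kind, disjunct) refinement bounds.
    Returns (c_map, fixpoint_detected).
    """
    steps = 0
    while steps < max_steps:
        progressed = False
        for node, kind, disjunct in list(proposals(c_map)):
            if disjunct in c_map[node][kind]:
                continue  # syntactically nothing new for this bound
            c_map[node][kind].add(disjunct)  # new bound is old OR disjunct
            steps += 1
            progressed = True
            if steps >= max_steps:
                break
        if not progressed:
            return c_map, True
    return c_map, False

def toy_proposals(c_map):
    """Hypothetical refinement rules for one sentence and one atom."""
    yield ("sentence", "ct", "true")  # axiom refinement
    if "true" in c_map["sentence"]["ct"]:
        yield ("Sub(u,v)", "cf", "not Edge(u,v)")  # stands in for top-down steps

def trivial_c_map():
    return {n: {"ct": set(), "cf": set()} for n in ("sentence", "Sub(u,v)")}

result, at_fixpoint = refine(trivial_c_map(), toy_proposals)
print(at_fixpoint, result["Sub(u,v)"]["cf"])
```

A syntactic check can miss fixpoints that are only visible after simplification, which is one reason why the step budget is kept as a separate stop criterion.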

Extracting an Atom-Based and Atom-Equal C-Map
The c-maps obtained by the refinement algorithm are in general neither atom-based nor atom-equal.
To derive from an arbitrary c-map C an atom-equal c-map that is at least as precise as C, we first collect for each predicate P all bounds that are assigned to occurrences of P in the theory. Then the disjunction of these bounds is assigned as new bound to each occurrence of P. Because all bounds assigned to atoms over P are then essentially the same, we have an atom-equal c-map. We now present this method more formally:
Definition 28. Let C be a c-map for a TNF theory T and P/n a predicate. Let P (x 11 , . . ., x 1n ), . . ., P (x m1 , . . ., x mn ) be all occurrences of P in T and let y 1 , . . ., y n be n new variables. Denote by ϕ i ct , respectively ϕ i cf , the formulas where the variables x ij are new variables. The ct-copy closure of P (x k1 , . . ., x kn ) with respect to C is the disjunction ⋁ 1≤i≤m ϕ i ct [y 1 /x k1 , . . ., y n /x kn ]. The cf-copy closure of P (x k1 , . . ., x kn ) is the formula ⋁ 1≤i≤m ϕ i cf [y 1 /x k1 , . . ., y n /x kn ]. The copy-closure for atoms of the form F (x) = y is defined similarly.
We denote the ct-copy closure of an atom ϕ by copy C ct (ϕ), and its cf-copy closure by copy C cf (ϕ).
Definition 29. The copy-closure of C is the c-map that assigns (copy C ct (ϕ), copy C cf (ϕ)) to every atomic subformula ϕ of T, and corresponds to C on all other subformulas.
Proposition 30. The copy-closure of a c-map is an atom-equal c-map.
Proof. This follows immediately from the definition of atom-equal c-map since for every predicate symbol P (or function symbol F ), the same bounds, namely the formulas ⋁ 1≤i≤m ϕ i ct and ⋁ 1≤i≤m ϕ i cf mentioned in Definition 28, are assigned to every atom over P (respectively F ).
Recall that a c-map C is atom-based if C is implied by C A , i.e., by all sentences in C that stem from bounds for atomic subformulas of T. A method to derive an atom-based c-map from an arbitrary c-map is based on the following observation. Let C be a c-map for T over σ and let ϕ[x] be the subformula χ ∧ ψ of T. If C ct (ϕ) is the formula C ct (χ) ∧ C ct (ψ), i.e., it is the bottom-up refinement ct-bound for ϕ with respect to C, then the sentences ∀x (C ct (χ) ⊃ χ) and ∀x (C ct (ψ) ⊃ ψ) together imply ∀x (C ct (ϕ) ⊃ ϕ). It is easy to check that the same property holds for all other bottom-up refinement bounds:
Lemma 31. Let C be a c-map for T over σ and ϕ[x] a subformula of T, and let ϕ r ct and ϕ r cf be the bottom-up refinement bounds for ϕ with respect to C. If S is the set of direct subformulas of ϕ, i.e., its children in the syntax tree, and T is the theory given by
Observe that a bottom-up c-map C for T is completely determined by the bounds it assigns to the atomic subformulas of T. Hence, given a c-map, one can derive a bottom-up c-map from it by retaining the bounds for the atomic subformulas and then computing the corresponding bottom-up c-map. We conclude that we can derive an atom-based, atom-equal c-map from an arbitrary c-map by deriving an atom-based c-map from its copy-closure.
Example 11 (Example 1 ctd.). Let C 6 be the fixpoint shown in Figure 1. This c-map is atom-equal (and equivalent to its copy-closure). The bottom-up c-map derived from C 6 is shown in Figure 2. Observe that this c-map is less precise than C 6. For instance, the cf-bound assigned by C 6 to the conjunction Sub(x, y) ∧ Sub(x, z) is a disjunction of two bounds, namely bound y ≠ z, obtained by top-down refinement, and bound ¬Edge(x, y) ∨ ¬Edge(x, z), obtained by bottom-up refinement. In the c-map of Figure 2, only the latter bound is present.
This formula contains repeated constraints Edge(x, y) and Edge(x, z) on the variables x, y and z.
In general bottom-up c-maps produce many such repetitions.These could easily be eliminated to speed up the grounding process, but it depends on the used grounding algorithm which ones are best deleted.

Inductive Definitions
Although all NP problems can be cast as MX(FO) problems, modelling such problems using pure FO can be extremely complex. In practice, modelling is often enhanced considerably by using extensions of FO with constructs such as inductive definitions, subsorts, aggregates, partial functions and arithmetic. For this enriched language we have implemented the model generator idp (Wittocx et al., 2008b; Wittocx & Mariën, 2008). In this paper we focus on grounding of the extension of FO with inductive definitions. It is well-known that in arbitrary domains, inductively definable concepts such as "reachability" are not FO-expressible. In finite domains however, they can be encoded (e.g., by encoding the fixpoint construction), but the process is tedious and leads to large theories. In this section we will extend the refinement algorithm to FO(ID) (Denecker, 2000; Denecker & Ternovska, 2008). This language extends FO with a construct for representing some of the most common types of inductive definitions: monotone induction and non-monotone induction such as induction over a well-founded order and iterated inductive definitions. Such definitions have many applications in real-life computational problems, e.g., in planning problems or problems involving reachability or dynamic systems (Denecker & Ternovska, 2008, 2007). At the same time, FO(ID) is also an integration of FO and logic programming.

Three-Valued Structures
While FO(ID) has a standard two-valued semantics, three-valued structures are used in the formal semantics of definitions.Indeed, an inductive definition defines a set by describing how to construct it.In the semantics, the intermediate stages of the construction are recorded by three-valued sets, representing for any object whether it belongs to the set or not, or whether this has not yet been derived.We therefore recall the basic concepts of three-valued logic.
We denote the truth values true, false and unknown by respectively t, f and u. A three-valued Σ-interpretation Ĩ consists of a domain D and
• a domain element x Ĩ ∈ D for each variable x;
• a function P Ĩ : D n → {t, f, u} for each predicate symbol P/n;
• a function F Ĩ : D n → D for each function symbol F/n.
If P Ĩ (d) ≠ u for every tuple d of domain elements and predicate symbol P, then Ĩ is two-valued: it corresponds to the interpretation I that assigns d ∈ P I iff P Ĩ (d) = t for every predicate P and corresponds to Ĩ on all other symbols.
The truth order ≤ on the set of truth values is induced by f < u < t, the precision order ≤ p is induced by u < p f and u < p t. These orders are extended to three-valued Σ-structures: if Ĩ and J̃ correspond on Σ F , then we define
• Ĩ ≤ J̃ iff P Ĩ (d) ≤ P J̃ (d) for every d and P;
• Ĩ ≤ p J̃ iff P Ĩ (d) ≤ p P J̃ (d) for every d and P.
Observe that two-valued structures are maximally precise three-valued structures.On the other hand, the least precise three-valued structure assigns P Ĩ (d) = u for every d and P .
We define the truth value Ĩ(ϕ) of a formula ϕ in a three-valued interpretation Ĩ with domain D by the standard Kleene semantics:
• Ĩ(P (t 1 , . . ., t n )) := P Ĩ (t Ĩ 1 , . . ., t Ĩ n );
An atom of the form P (d), where d is a tuple of domain constants, is called a domain atom. For a truth value v and a domain atom P (d), we denote by Ĩ[P (d)/v] the interpretation that assigns v to P (d) and corresponds to Ĩ on all other symbols. This notation is extended to sets of domain atoms.
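For readers who want to experiment, the following Python sketch evaluates formulas under the standard (strong) Kleene semantics. The rules for the connectives and quantifiers are the textbook ones and are assumed here, since only the atomic case is listed above; the tuple encoding and the name `evaluate` are hypothetical, and the domain is assumed to be non-empty.

```python
F, U, T = 0, 1, 2  # encodes the truth order f < u < t

def evaluate(phi, interp, domain, env=None):
    """Truth value of phi in a three-valued interpretation.

    interp maps a predicate name to a dict from argument tuples to F, U or T;
    env maps variable names to domain elements.
    """
    env = env or {}
    op = phi[0]
    if op == "atom":
        _, pred, args = phi
        return interp[pred][tuple(env[a] for a in args)]
    if op == "not":
        return 2 - evaluate(phi[1], interp, domain, env)  # swaps f and t
    if op == "and":
        return min(evaluate(p, interp, domain, env) for p in phi[1:])
    if op == "or":
        return max(evaluate(p, interp, domain, env) for p in phi[1:])
    if op in ("forall", "exists"):
        _, var, body = phi
        values = [evaluate(body, interp, domain, {**env, var: d})
                  for d in domain]
        return min(values) if op == "forall" else max(values)
    raise ValueError(f"unknown connective: {op!r}")

domain = [0, 1]
interp = {"P": {(0,): T, (1,): U}}
p_x = ("atom", "P", ("x",))
print(evaluate(("forall", "x", p_x), interp, domain),
      evaluate(("exists", "x", p_x), interp, domain))
```

With P true for 0 and unknown for 1, the universal statement evaluates to u and the existential one to t.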

Inductive Definitions
An FO(ID) theory is a set of FO sentences and definitions. A definition ∆ is a finite set of rules of the form ∀x (P (x) ← ϕ), where P is a predicate and ϕ an FO formula. The free variables of ϕ should be among x. P (x) is called the head of the rule, ϕ the body. Predicates that occur in the head of a rule of ∆ are called defined predicates of ∆. The set of all defined predicates of ∆ is denoted Def(∆). All other symbols are called open with respect to ∆. The set of open symbols of ∆ is denoted by Open(∆).
Observe that an FO(ID) theory has the appearance of an FO theory augmented with a collection of logic programs.As illustrated by Denecker and Ternovska (2008), this entails that FO(ID)'s definitions can not only be used to represent mathematical concepts, but also for the sort of common sense knowledge that is often represented by logic programs, such as (local forms of) the closed world assumption, inheritance, exceptions, defaults, causality, etc.
The semantics of definitions is given by their well-founded model (Van Gelder, Ross, & Schlipf, 1991).As argued by Denecker and Ternovska (2008), the well-founded semantics correctly formalizes the semantics of all of the above mentioned types of inductive definitions in mathematics.We borrow the presentation of this semantics from Denecker and Vennekens (2007).
Definition 34. Let ∆ be a definition and Ĩ a three-valued structure. A well-founded induction for ∆ above Ĩ is a sequence ⟨J̃ ξ ⟩ 0≤ξ≤α of three-valued structures such that
1. J̃ 0 assigns P J̃0 (d) = u if P is a defined predicate, and corresponds to Ĩ on the open symbols;
2. For each limit ordinal λ ≤ α, J̃ λ = lub ≤p { J̃ ξ | ξ < λ};
3. For every ordinal ξ, J̃ ξ+1 relates to J̃ ξ in one of the following ways:
(a) J̃ ξ+1 = J̃ ξ [P (d)/t] for some domain atom P (d) such that P J̃ξ (d) = u and for some rule ∀x (P (x) ← ϕ) in ∆, J̃ ξ (ϕ[x/d]) = t;
(b) J̃ ξ+1 = J̃ ξ [U/f], where U is a set of domain atoms, such that for each P (d) ∈ U, P J̃ξ (d) = u and for all rules ∀x (P (x) ← ϕ) in ∆, J̃ ξ+1 (ϕ[x/d]) = f.
Intuitively, (a) says that a domain atom P (d) can be made true if there is a rule with P (x) as head and body ϕ such that ϕ[x/d] is already true. On the other hand (b) explains that P (d) can be made false if there is no possibility of making a corresponding body true, except by circular reasoning. The set U, commonly called an unfounded set, is a witness to this: making all atoms in U false also makes all corresponding bodies false.
A well-founded induction is called terminal if it cannot be extended anymore. The limit of a terminal well-founded induction is its last element. Denecker and Vennekens (2007) show that each terminal well-founded induction for ∆ above Ĩ has the same limit, which corresponds to the well-founded model of ∆ extending Ĩ| Open(∆) , and is denoted by wfm ∆ ( Ĩ). The well-founded model is three-valued in general.
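To make the construction concrete, the sketch below runs a terminal well-founded induction for a propositional definition in Python. It is an illustration under simplifying assumptions: rule bodies are conjunctions of literals rather than arbitrary FO formulas, every defined atom is a key of `rules`, step (b) always uses the largest unfounded set, and all names are hypothetical.

```python
F, U, T = 0, 1, 2  # encodes the truth order f < u < t

def body_value(body, J):
    """Kleene value of a conjunction of literals (atom, is_positive)."""
    return min((J[a] if pos else 2 - J[a] for a, pos in body), default=T)

def well_founded_model(rules, opens):
    """rules: dict defined_atom -> list of bodies; opens: dict atom -> F or T."""
    J = dict(opens)
    J.update({atom: U for atom in rules})  # defined atoms start as unknown
    changed = True
    while changed:
        changed = False
        # Step (a): make an unknown atom true if one of its bodies is true.
        for atom, bodies in rules.items():
            if J[atom] == U and any(body_value(b, J) == T for b in bodies):
                J[atom] = T
                changed = True
        # Step (b): shrink a candidate set until it is unfounded, i.e. all
        # bodies of its atoms are false once the whole set is made false.
        unfounded = {atom for atom in rules if J[atom] == U}
        while True:
            trial = {**J, **{atom: F for atom in unfounded}}
            keep = {atom for atom in unfounded
                    if all(body_value(b, trial) == F for b in rules[atom])}
            if keep == unfounded:
                break
            unfounded = keep
        for atom in unfounded:
            J[atom] = F
            changed = True
    return J

# p <- q.  q <- p.  r <- not p.
rules = {"p": [[("q", True)]], "q": [[("p", True)]], "r": [[("p", False)]]}
model = well_founded_model(rules, opens={})
print(model)
```

In this example {p, q} is an unfounded set, so both atoms become false, after which r becomes true. For the definition consisting of a ← ¬b and b ← ¬a, the same procedure leaves both atoms unknown, which illustrates that the well-founded model is three-valued in general.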
A two-valued structure I satisfies a definition ∆ if I = wfm ∆ (I). An FO(ID) theory T is a finite set of FO sentences and definitions. I satisfies T if it satisfies all definitions and sentences in T. If ∆ is a definition over Σ and J a Σ| Open(∆) -structure, there exists at most one expansion I of J to Σ such that I |= ∆. A definition is called total if for any Σ| Open(∆) -structure J there is precisely one expansion I of J to Σ that satisfies ∆. Intuitively, total definitions correspond to well-formed definitions: for every defined predicate P, they define for each tuple d of domain elements whether d belongs to the relation denoted by P or not. If a definition is not total, this typically indicates an error. Hence in practice, all definitions that occur in MX(FO(ID)) specifications are total. For example, this is the case for all MX(FO(ID)) specifications used in the second ASP competition (Denecker, Vennekens, Bond, Gebser, & Truszczyński, 2009). In general, checking whether a definition is total is undecidable. However, there are several broad and easily recognizable classes of total definitions. For example, all monotone and stratified definitions are total.
We give some examples of definitions and MX(FO(ID)) problems.
Example 12. Definition ∆ 1 defines relation T C to be the transitive closure of relation R.
A well-known concept that we will use later on in this section is the completion of a definition. The completion of a definition ∆ is an FO theory that is weaker than ∆, and is defined as follows.
Definition 35. The completion of a definition ∆ is the FO theory that contains for every P ∈ Def(∆) the sentence ∀x (P (x) ⇔ ∃y 1 (x = y 1 ∧ ϕ 1 ) ∨ . . . ∨ ∃y n (x = y n ∧ ϕ n )), where ∀y 1 (P (y 1 ) ← ϕ 1 ), . . ., ∀y n (P (y n ) ← ϕ n ) are the rules in ∆ with P in the head.
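As a worked illustration of why the completion is weaker than the definition, assume a definition of transitive closure with the two rules shown below. This particular rule set is an assumption made for the example and is not necessarily the one used in Example 12.

```latex
\Delta = \left\{
\begin{array}{l}
\forall x \forall y\, (TC(x,y) \leftarrow R(x,y)), \\
\forall x \forall y\, (TC(x,y) \leftarrow \exists z\, (TC(x,z) \land TC(z,y)))
\end{array}
\right\}

% Completion of \Delta, after eliminating the equalities x = y_i:
\forall x \forall y\, \bigl( TC(x,y) \Leftrightarrow
    R(x,y) \lor \exists z\, (TC(x,z) \land TC(z,y)) \bigr)

% Structure with domain D = \{a\}, R = \emptyset, TC = \{(a,a)\}:
%   left-hand side:  TC(a,a) is true;
%   right-hand side: TC(a,a) \land TC(a,a) is true for z = a.
% The completion is satisfied, but the well-founded model of \Delta
% for R = \emptyset makes TC empty, so \Delta itself is not satisfied.
```

The structure in the last part supports TC(a, a) only through circular reasoning, which the completion accepts and the well-founded semantics rejects.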
We denote the completion of ∆ by Comp(∆).Clearly, every body of a rule in ∆ occurs in Comp(∆).If T is a theory then we denote by Comp(T ) the result of replacing in T all definitions by their completion.The following result states that the completion of T is weaker than T .
The SAT(ID) problem is the problem of deciding whether a given propositional FO(ID) theory is satisfiable.Currently there exist three SAT(ID) solvers.IDsat (Pelov & Ternovska, 2005) works by translating a SAT(ID) problem into an equivalent SAT problem and then calls a SAT solver.MidL (Mariën, Wittocx, & Denecker, 2007) and MiniSAT(ID) (Mariën, Wittocx, Denecker, & Bruynooghe, 2008) take a native approach.Mariën (2009) provides details on the specific form of propositional FO(ID) theories accepted by these solvers, and a method to transform arbitrary propositional FO(ID) theories into this form.

Grounding Inductive Definitions
Like MX(FO) problems, MX(FO(ID)) problems can be reduced to SAT(ID) problems by grounding.In this section we extend grounding and the refinement algorithm of Section 4 to FO(ID).Without loss of generality (Mariën, Gilis, & Denecker, 2004), we assume that none of the predicates of the input vocabulary σ is defined by a definition in T , and no predicate is defined by more than one definition.Moreover, we assume that every rule body is in TNF.

Full and Reduced Grounding
Let T be an FO(ID) theory. As for FO, a grounding T g for T with respect to I σ is a propositional FO(ID) theory that is I σ -equivalent to T. We extend the notion of full and reduced grounding to definitions.
Definition 37. The full grounding of a rule ∀x P (x) ← ϕ with respect to I σ is the set where n is the number of variables in x.Similarly, the reduced grounding of ∀x The full (reduced) grounding of a definition ∆ is the union of the full (reduced) groundings of all rules in ∆.
The full (reduced) grounding of an FO(ID) theory T is the set of the full (reduced) groundings of all sentences and definitions in T .

Definitions Depending Only on σ
We say that a definition ∆ depends on expansion symbols if Open(∆) ⊈ σ. If ∆ does not depend on expansion symbols, then the interpretation of every predicate in Def(∆) is the same in every model M of T expanding I σ. Indeed, for such a definition and any M |= Iσ T, M | Open(∆) is completely determined by I σ. Therefore also wfm ∆ (M ) only depends on I σ.
The deductive database literature describes several algorithms to compute wfm ∆ (M ) for a definition that does not depend on expansion symbols. Most of them are only defined for definitions where every rule body is a conjunction of atoms. Some of them, however, such as the Rete algorithm (Forgy, 1982) and the semi-naive evaluation technique (Ullman, 1988), can easily be adapted to handle full FO bodies.
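As an illustration of such bottom-up evaluation, the following minimal Python sketch computes the transitive closure of an input relation R with a semi-naive loop. It assumes the usual two-rule definition of the transitive closure (TC(x, y) ← R(x, y) and TC(x, y) ← ∃z (R(x, z) ∧ TC(z, y))); the function name and the representation of relations as sets of pairs are our own choices and are not taken from any of the cited systems.

```python
def transitive_closure(r):
    """Semi-naive evaluation: each round joins R only with the tuples
    that were newly derived in the previous round (delta)."""
    tc = set(r)
    delta = set(r)
    while delta:
        # Apply the recursive rule, restricted to the newest TC tuples.
        new = {(x, w) for (x, y) in r for (z, w) in delta if y == z}
        delta = new - tc          # keep only tuples not derived before
        tc |= delta
    return tc


R = {("a", "b"), ("b", "c"), ("c", "d")}
TC = transitive_closure(R)
```

Because the definition depends only on the input relation R, the resulting table for TC can be computed once, before grounding, and treated as part of the input structure afterwards.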
Assume ∆ is a definition that does not depend on expansion symbols. Let τ be the vocabulary ⟨σ P ∪ Def(∆), σ F ⟩ and I τ the τ -structure such that I τ | σ = I σ and I τ |= ∆. Then clearly, M |= Iσ T iff M |= Iτ T for any structure M. However, a grounding for T \ ∆ with respect to τ can be obtained more efficiently, since Gr red (T \ ∆, I τ ) is necessarily smaller than Gr red (T, I σ ). Indeed, T \ ∆ is a subtheory of T, and Gr red (T \ ∆, I τ ) does not contain symbols of Def(∆), while Gr red (T, I σ ) does.
Observe also that the set of c-maps for T over τ is a superset of the set of c-maps for T over σ, since the bounds assigned by the former c-maps are formulas over τ, instead of only over σ. As such, c-maps computed by the refinement algorithm for T over τ might yield more efficient grounding compared to c-maps computed for T over σ.

Bounds for Definitions
We now extend the refinement algorithm to FO(ID).
Definition 38. A formula ϕ is a subformula of an FO(ID) theory T if it is a subformula of a sentence in T or a subformula of a rule body in a definition of T. A c-map for T over σ is an assignment of a ct- and cf-bound over σ to every subformula of T.
Note that a c-map does not assign bounds to heads of rules in a definition. Our strategy to compute a c-map for an FO(ID) theory T is simple: construct the completion of T and apply the refinement algorithm on Comp(T ) to obtain a c-map C for Comp(T ). The restriction of C to the subformulas of T is a c-map for T. Indeed, every subformula ϕ of T occurs in Comp(T ), and since Comp(T ) is weaker than T, every bound for ϕ with respect to Comp(T ) is also a bound for ϕ with respect to T.
In order to use a c-map for grounding, we lift the definition of c-transformation to FO(ID) theories.
Definition 39. Let C be a c-map for a theory T and ∆ a definition in T. The c-transformation of a rule ∀x (P (t) ← ϕ) of ∆ is given by ∀x (P (t) ← C ϕ ). The c-transformation C ∆ of a definition ∆ is the set of c-transformations of rules in ∆. The c-transformation of T is the set of the c-transformations of the formulas and definitions in T.
We also lift the notion of C-equivalence to definitions.
Definition 40. Two definitions ∆ 1 and ∆ 2 are C-equivalent if for every structure I that satisfies C, I |= ∆ 1 iff I |= ∆ 2.
However, Lemma 15 does not hold for FO(ID) theories: for a definition ∆, C ∆ is not necessarily C-equivalent to ∆.
Example 14. Let σ be the empty vocabulary and T the theory {P, {P ← P }}. This theory is unsatisfiable because the definition {P ← P } has only one model, in which P is false. This contradicts the sentence in T. Clearly, ⊤ is a ct-bound for P. If C is a c-map for T over σ assigning (⊤, ⊥) to P, then C {P ← P } = {P ← (P ∧ ¬⊥) ∨ ⊤}, which is equivalent to {P ← ⊤}. This definition has only one model, which assigns true to P. Since this model also satisfies C, we conclude that {P ← P } and C {P ← P } are not C-equivalent.
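The difference between the two definitions of Example 14 can be checked mechanically. The sketch below relies on the fact that, for a propositional definition without negation in rule bodies, the well-founded model coincides with the least model, which can be computed by forward chaining. The rule representation (a head paired with a list of body atoms, an empty list standing for a true body) is an assumption made for this illustration only.

```python
def least_model(rules):
    """Least model of a negation-free propositional definition.
    rules: list of (head, body); body is a list of atoms, [] means 'true'."""
    true_atoms = set()
    changed = True
    while changed:
        changed = False
        for head, body in rules:
            if head not in true_atoms and all(b in true_atoms for b in body):
                true_atoms.add(head)
                changed = True
    return true_atoms


original = [("P", ["P"])]    # the definition {P <- P}
transformed = [("P", [])]    # its c-transformation, equivalent to {P <- true}

model_original = least_model(original)        # P is false
model_transformed = least_model(transformed)  # P is true
```

In the first definition P has no external support and stays false; in the second, the ct-bound has turned the rule into a fact, so P becomes true.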
Definition 41. Let ∆ be a definition of T. We call a c-map C for T ∆-tolerant if C ∆ and ∆ are C-equivalent. We call C T -tolerant if it is ∆-tolerant for every definition ∆ of T.
In the following, we say that a formula occurs positively (negatively) in a definition ∆ if it occurs positively (negatively) in a body of a rule in ∆.
Proposition 42.Let ∆ be a definition of a theory T .Then a c-map C for T over σ is ∆-tolerant if for every subformula ϕ of ∆ that contains a predicate P ∈ Def(∆), the following hold: 2. If ϕ occurs positively in ∆ and P occurs positively in ϕ, then C ct (ϕ) = ⊥.
Note that the c-map of Example 14 violates the second condition. We will prove Proposition 42 by inductively constructing, for any structure I that satisfies C, a sequence of three-valued structures that is a well-founded induction above I for both ∆ and C ∆. If I |= ∆, we show that a terminal sequence with this property can be constructed, proving that I also satisfies C ∆. If I ⊭ ∆, a sequence with this property can be constructed such that its last element is not less precise than I. This shows that I does not satisfy C ∆ either. To construct a well-founded induction for both ∆ and C ∆, we prove that each step that extends a well-founded induction for ∆ is also a valid step to extend it for C ∆. Step (3a) in Definition 34 is covered by Lemma 43, step (3b) by Lemma 44.
Lemma 43. Let I be a structure that satisfies a c-map C for T over σ and let J ≤ p I be a three-valued interpretation such that J| σ is two-valued. Then J(ϕ) ≤ p J(C ϕ ) for every subformula ϕ of T.
Proof. We prove this lemma by induction. First assume ϕ[x] is an atom. Then C ϕ is the formula (ϕ ∧ ¬C cf (ϕ)) ∨ C ct (ϕ). The inductive cases are all very similar to the base case. We prove one of them. Assume ϕ is the formula ψ ∨ χ. If J(ϕ) = f, then J(ψ) = J(χ) = f and by induction J(C ψ ) = J(C χ ) = f. Since also J(C ct (ϕ)) = f, we conclude that J(C ϕ ) = f. If on the other hand J(ϕ) = t, then J(C cf (ϕ)) = f. Also J(ψ) = t or J(χ) = t, and therefore J(C ψ ) = t or J(C χ ) = t. Hence J(C ϕ ) = t.
Lemma 44. Let ∆ be a definition of T and C a c-map for T over σ that satisfies the three conditions of Proposition 42. Let I be a structure that satisfies C and J ≤ p I a three-valued interpretation such that J| σ is two-valued. If U is a set of domain atoms defined in ∆ and unknown in J, then for every subformula ϕ of ∆ such that J[U/f](ϕ) ≠ u, the following hold:
Proof. Denote H := J[U/f]. If J(ϕ) ≠ u, the result follows immediately from Lemma 43.
We prove the case where J(ϕ) = u by induction. Assume that ϕ is an atom P (x). Since J(ϕ) = u and H(ϕ) ≠ u, we know that P (x J ) ∈ U and H(ϕ) = f. If ϕ occurs negatively in ∆, then we have to prove that H(ϕ) ≤ H(C ϕ ). Since H(ϕ) = f, this inequality holds regardless of the value of C ct (ϕ) and C cf (ϕ) in H. If on the other hand, ϕ occurs positively, we have to prove that H We omit the inductive cases, since they are very similar to the base case.
Proof of Proposition 42. Let I be a structure that satisfies C. We have to prove that I |= ∆ iff I |= C ∆. If ∆ is not total, the proof is trivial, since then ∆ and C ∆ are equivalent. Now assume that ∆ is total and let ⟨J ξ ⟩ 0≤ξ≤α be a well-founded induction for both ∆ and C ∆ above I. We will prove that if J α is not two-valued and J α < p I, there exists a J α+1 such that ⟨J ξ ⟩ 0≤ξ≤α+1 is again a well-founded induction for ∆ and C ∆. Also observe that if λ is a limit ordinal and ⟨J ξ ⟩ 0≤ξ<λ is a well-founded induction for both ∆ and C ∆, then the same holds for ⟨J ξ ⟩ 0≤ξ≤λ. This is sufficient to conclude the proof. Indeed, if I |= ∆, we can keep on extending the sequence until we end up in I, and derive that I |= C ∆. If I ⊭ ∆, then we will eventually extend the well-founded induction with a structure J α+1 ≰ p I. But then, the well-founded model of C ∆ will also be more precise than J α+1, which shows that I ⊭ C ∆.
Assume that Jα is not two-valued and Jα < p I. Because ∆ is total, there exists a Jα+1 such that Jξ 0≤ξ≤α+1 is a well-founded induction for ∆.We have to prove that it is also a well-founded induction for C ∆ .There are two possibilities: • Jα+1 = Jα [P (d)/t] for some domain atom P (d) and there is a rule ∀x Hence, Jξ 0≤ξ≤α+1 is a well-founded induction for C ∆ .
From Proposition 42 we derive the following procedure to compute a T -tolerant c-map for a theory T .First compute a c-map C for T that is not necessarily T -tolerant.Then, for every definition ∆ of T and every subformula ϕ of ∆, replace C ct (ϕ) and C cf (ϕ) by ⊥, if this is required to satisfy the conditions of Proposition 42.
We conclude that the following algorithm produces a correct grounding for an FO(ID) theory T:
1. Compute a c-map C for T over σ.
2. If C is inconsistent with respect to I σ , output ⊥ and stop.
3. Else, derive an atom-based, T-tolerant c-map C′ from C.
4. Output Gr red (C′ T ∪ C′ A ), using any off-the-shelf grounder for FO(ID).

Implementation and Experiments
So far we have focussed mostly on grounding size. Proposition 23 guaranteed that grounding with bounds produces smaller groundings. In this section we are concerned with the efficiency and practical implementation of grounding with bounds. A first issue was mentioned at the end of Section 4.4.2: an atom-based c-map C computed by the refinement algorithm contains many repeated constraints on variables. To ground C T efficiently, such repetitions should be avoided as much as possible. Secondly, an efficient grounder consults bounds as soon as possible. In particular, it should use bounds to avoid unnecessary instantiations of variables, rather than to remove these instantiations afterwards. As a case study, we will show in detail how to adapt a basic "top-down style" grounding algorithm to efficiently exploit bounds. We sketch how the same principles can be applied for a "bottom-up style" grounder.
In the second part of this section we discuss some aspects of implementing the refinement algorithm. As we mentioned in Section 4.4.1, there are several issues concerning the practical implementation of this algorithm. In particular, a method to simplify bounds is needed, as well as a good stop criterion. We show how these issues can be addressed by representing bounds as first-order binary decision diagrams.
Finally, we report on our implementation, called GidL, of the refinement and grounding algorithm.We present experimental results that show the impact of using bounds on grounding size and time.

Case Study: Top-Down Grounding with Bounds
For the rest of this section, assume T is in TNF and fix an I σ -consistent, atom-based c-map C for T over σ. We call a formula of the form ϕ ∨ ψ or ∃x ϕ a disjunctive formula. Vice versa, a conjunctive formula is a formula of the form ϕ ∧ ψ or ∀x ϕ.
We now present a simple "top-down style" grounding algorithm that exploits bounds without constructing C T ∪ C A explicitly. The algorithm is shown in Algorithm 1. Basically, it consults the bounds assigned by C whenever it substitutes the free variables of a formula ϕ[x] by domain constants d. If according to the bounds, ϕ[x/d] is certainly true, i.e., I σ [x/d] |= C ct (ϕ), then the grounding of ϕ[x/d] is not computed. Instead, the algorithm then proceeds as if ϕ[x/d] is equal to ⊤. Similarly if ϕ[x/d] is certainly false. In this way, the algorithm avoids creating unnecessary instantiations. One can check that if C is the trivial c-map, Algorithm 1 reduces to a straightforward top-down style grounding algorithm that produces Gr full (T ).
Line 1 of Algorithm 1 checks whether one of the sentences of T is certainly false. If this is the case, then clearly T is unsatisfiable (cf. Definition 10), and this can be reported immediately. Before a sentence is grounded, line 4 checks whether this sentence is certainly true according to C. Only sentences that are not certainly true are grounded. Observe that both checks are simple syntactic checks and can be executed in constant time.
Function groundConj gets as input a formula ϕ[x] and returns a grounding for ∀x ϕ[x]. In particular, if ϕ is a sentence, then the result of applying groundConj to ϕ is a grounding for ϕ.
In groundConj, universal quantifiers are implicitly pushed inside conjunctions. That is, if ϕ[x] is a conjunction ψ 1 ∧ . . . ∧ ψ n , then for every i ∈ [1, n], the grounding of ∀x ψ i is computed by applying groundConj to ψ i. The conjunction of these groundings is returned as grounding for ∀x ϕ. According to equivalence (6) of Section 2.2, this transformation yields an equivalent formula.
Function groundConj only consults the c-map when variables are substituted by domain constants or when the input formula is an atom. As such, groundConj ignores ("eliminates") the bounds assigned to conjunctive formulas. As we mentioned at the end of Section 4.4.2, this is important to avoid repeated constraints on a variable.
In groundConj(ϕ[x]), only those substitutions ϕ[x/d] for which I σ [x/d] ⊭ C ct (ϕ) are grounded (see, e.g., line 12). Indeed, the other substitutions yield a formula that is certainly true in all models of T expanding I σ , and can therefore be omitted from the ground conjunction C that is computed.
Before ϕ[x/d] is grounded, it is checked whether this substitution yields a formula that is certainly false (see, e.g., line 13). If this is the case, the whole conjunction C will certainly be false, and therefore ⊥ is returned immediately. Observe that implicitly the formula C ct (ϕ) ∨ (¬C cf (ϕ) ∧ ϕ) is grounded. Hence the correctness of groundConj follows from Lemma 13.
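The way groundConj uses the two bounds when instantiating variables can be sketched in a few lines of Python. This is not Algorithm 1 itself but a simplified illustration under our own assumptions: the bounds are passed as Python predicates over tuples of domain elements, ground formulas are plain strings, and the function name ground_forall is hypothetical. For brevity the sketch enumerates all tuples and tests the bounds, whereas an efficient grounder would query the bounds to enumerate only the relevant tuples.

```python
from itertools import product

FALSE = "false"


def ground_forall(num_vars, domain, ct, cf, ground_body):
    """Ground 'forall x: phi[x]' as a list of conjuncts.
    ct(d) / cf(d): truth of the ct- and cf-bound of phi for tuple d in the
    input structure; ground_body(d): ground formula for phi[x/d].
    An empty list denotes 'true'; [FALSE] denotes 'false'."""
    conjuncts = []
    for d in product(domain, repeat=num_vars):
        if ct(d):           # certainly true: omit this instance
            continue
        if cf(d):           # certainly false: whole conjunction is false
            return [FALSE]
        conjuncts.append(ground_body(d))
    return conjuncts


# forall x, y: Edge(x, y) implies not Same(x, y), with Edge given as input.
edge = {(1, 2), (2, 3)}
grounding = ground_forall(
    2, [1, 2, 3],
    ct=lambda d: d not in edge,        # ct-bound: not Edge(x, y)
    cf=lambda d: False,                # trivial cf-bound
    ground_body=lambda d: f"-Same({d[0]},{d[1]})",
)
```

Of the nine candidate instances, only the two that correspond to edges are ever grounded; the remaining seven are skipped without constructing any formula for them.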
Function groundDisj is dual to groundConj. On input ϕ[x], it returns a grounding for ∃x ϕ[x]. It implicitly pushes existential quantifiers through disjunctions and eliminates the bounds assigned to disjunctive formulas.
Function groundDef returns a grounding for its input definition ∆. It grounds the rules of ∆ one-by-one. For each rule ∀x (P (x) ← ϕ[y]), only those substitutions ϕ[y/d] that are possibly true are tried (line 4). If ϕ[y/d] is certainly true, it is replaced by ⊤ (line 5).
In lines 7-11 of Algorithm 1, the theory C A is grounded. Recall that this is necessary to obtain a grounding that is I σ -equivalent to T (see Proposition 21). Observe that if C is the trivial c-map, no output is produced when lines 7-11 are executed.
The computationally expensive steps in Algorithm 1 are the steps where the truth values in I σ of (some of the) bounds assigned by C are computed. For large bounds, these steps can become infeasible. Indeed, the expression complexity of FO is PSPACE-complete (Stockmeyer, 1974). As such, grounding with too complex bounds may take more time and space than constructing the full grounding and simplifying it afterwards. The stop criterion of Section 6.2.3 for the refinement algorithm is designed to avoid too complex bounds. Our experiments in Section 6.3 show that carefully restricting the complexity of the bounds leads to faster grounding.
We stress that Algorithm 1 is just one example of a grounding algorithm that exploits bounds. The principle of consulting bounds as soon as possible can be applied to adapt other grounding algorithms as well. For example, recall that a bottom-up style grounder starts by storing all instances of atomic subformulas of T in a table. To exploit bounds efficiently, a bottom-up grounder should consult the bounds while constructing these tables and leave out, e.g., all instances that are certainly false. As such, it avoids unnecessarily large tables, which in turn improves the speed of the subsequent grounding steps.

Implementing the Refinement Algorithm and Querying Bounds
In this section we discuss some aspects of implementing the refinement algorithm. As mentioned above, applying a simplification method for first-order formulas to simplify the bounds at regular time points is essential for a good implementation. One can use Goubault's (1995) method for this purpose. To this end, the bounds need to be represented by first-order binary decision diagrams. We show in this section that such a representation can be applied without too much overhead when applying one-step refinements. Moreover, using binary decision diagrams leads to extra benefits: we obtain a cheap equivalence check for bounds and an elegant algorithm to query bounds, which is needed to implement Algorithm 1. At the end of this section we discuss a stop criterion for the refinement algorithm and we discuss an implementation.
Definition 45 (Goubault, 1995). FO binary decision trees (BDTs) and kernels are defined by simultaneous induction:
• If ϕ is a BDT and x a variable, then ∃x ϕ is a kernel;
• ⊤ and ⊥ are BDTs;
• If ϕ is a kernel and ψ 1 and ψ 2 are BDTs, then ϕ → ψ 1 ; ψ 2 is a BDT.
Observe that the graph representation of a BDT is a tree whose nodes are atoms or existentially quantified BDTs. Goubault (1995) showed that for every FO formula ϕ there exists a BDT ϕ′ such that ϕ and ϕ′ are equivalent. In an actual implementation, sharing, reducing and ordering are applied to obtain a simplified and compact representation of BDTs. Such representations are called reduced ordered binary decision diagrams (BDDs). Sharing means that isomorphic subtrees are stored at the same address in memory. Reducing involves exhaustively replacing subtrees of the form ϕ → ψ; ψ by ψ. A BDT ϕ is ordered if the kernels appear in some fixed order on every path in the graph representation of ϕ.
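Sharing and reducing are commonly implemented with a table of already constructed nodes (often called hash-consing). The sketch below illustrates the idea in Python; the class name, the use of strings as kernels and the tuple encoding of nodes are assumptions made for this illustration, and kernel ordering is deliberately left out.

```python
class BDDManager:
    """Creates shared, reduced decision-diagram nodes (ordering omitted)."""

    def __init__(self):
        self.true = ("T",)
        self.false = ("F",)
        self._table = {}   # (kernel, id(then), id(else)) -> node

    def node(self, kernel, then_child, else_child):
        # Reducing: a test whose two branches coincide is redundant.
        if then_child is else_child:
            return then_child
        # Sharing: structurally equal nodes are created only once, so
        # equality of diagrams becomes an identity check.
        key = (kernel, id(then_child), id(else_child))
        if key not in self._table:
            self._table[key] = (kernel, then_child, else_child)
        return self._table[key]


m = BDDManager()
n1 = m.node("P(x)", m.true, m.false)
n2 = m.node("P(x)", m.true, m.false)   # same object as n1
n3 = m.node("Q(x)", n1, n1)            # reduced to n1
```

With this representation, comparing two diagrams built by the same manager amounts to comparing two references (n1 is n2), which is a constant-time operation.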
As mentioned above, there are several important benefits of using BDDs to represent bounds for a formula:
• An implementation of the refinement algorithm using BDDs allows us to use the simplification algorithm for BDDs of Goubault (1995).
• As explained in Section 4.4, to detect that the refinement algorithm has reached a fixpoint, one needs to check the equivalence of bounds. Often, the BDDs representing two equivalent formulas will be equal. Hence, a cheap (but necessarily incomplete) equivalence check for two bounds consists of checking the syntactic equality of the two BDDs representing them. Since equal BDDs are stored at the same address, this check is done in constant time.
• As we will show in Section 6.2.2, querying a bound ϕ[x], i.e., finding all tuples d such that I σ [x/d] |= ϕ, can easily be implemented directly on a BDD representation of ϕ. Querying a bound is one of the main operations performed by a grounding algorithm that exploits bounds directly (such as Algorithm 1).
On the other hand, using BDDs does not result in too much overhead when computing a c-map. If ϕ, ψ and χ[x, y] are represented by BDDs, then a BDD representing ¬ϕ, ∃x ϕ, ∀x ϕ, ϕ ∧ ψ, ϕ ∨ ψ and χ[x/x′, y] can be computed efficiently (Bryant, 1986; Goubault, 1995). This implies that every one-step refinement on a c-map C can be implemented efficiently, even if the bounds assigned by C are BDDs.

Querying a Bound
In Algorithm 1, the main operation performed on a bound ϕ[x] is querying it, i.e., finding tuples d such that I σ [x/d] |= ϕ.
[Figure 4: the BDD ϕ[x, y] used in Example 15.]
We illustrate the query algorithm on an example.
Example 15. Let ϕ[x, y] be the BDD shown in Figure 4, and let {a, b} be the domain of I σ. To find an answer for ϕ[x, y], the query algorithm starts at the root P (x). Since none of its children are equal to ⊥, every domain constant is tried. Assume domain constant a is tried first. Because a ∈ P Iσ , the algorithm continues with node R(a) → ⊤; ⊥. Because the "else" child of this node is ⊥ and a ∉ R Iσ , the algorithm returns to the root and tries domain element b. Since b ∉ P Iσ , it goes to node Q(b, y) → ⊤; ⊥. Since the "else" child of this node is ⊥, the algorithm tries those substitutions d for y such that (b, d) ∈ Q Iσ. Thus, y is substituted by b. Finally, answer [x/b, y/b] is returned.
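A query procedure of this kind can be sketched as a recursive traversal of the diagram. The Python code below is a simplified illustration, not the algorithm of the paper: nodes are tuples (predicate, variables, then-child, else-child), leaves are the Python constants True and False, kernels are restricted to atoms over variables, and the optimisation of enumerating only the tuples of a relation when the "else" child is ⊥ is omitted. The shape of the diagram and the interpretation of P, R and Q are assumptions, chosen so that the run ends with the answer [x/b, y/b] of Example 15.

```python
from itertools import product


def answers(node, interp, domain, env=None):
    """Yield assignments (dicts) under which the diagram evaluates to true."""
    env = {} if env is None else env
    if node is True:
        yield dict(env)
        return
    if node is False:
        return
    pred, args, then_child, else_child = node
    # Variables of this kernel that have not been instantiated yet.
    unbound = [v for v in dict.fromkeys(args) if v not in env]
    for values in product(domain, repeat=len(unbound)):
        ext = {**env, **dict(zip(unbound, values))}
        holds = tuple(ext[v] for v in args) in interp[pred]
        child = then_child if holds else else_child
        yield from answers(child, interp, domain, ext)


# Assumed diagram: P(x) -> (R(x) -> true; false) ; (Q(x, y) -> true; false)
bdd = ("P", ("x",),
       ("R", ("x",), True, False),
       ("Q", ("x", "y"), True, False))
interp = {"P": {("a",)}, "R": set(), "Q": {("b", "b")}}
result = list(answers(bdd, interp, ["a", "b"]))
```

For x = a the traversal reaches the node for R and fails; for x = b it moves to the node for Q and succeeds only with y = b.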

A Stop Criterion for the Refinement Algorithm
As shown in Section 4.4, the c-map refinement algorithm does not reach a fixpoint on certain inputs. Also, even in the case a fixpoint can be found, computing it may take a long time, and the bounds assigned by the fixpoint can be so complex that querying becomes very inefficient. Using such bounds may severely slow down grounding. This indicates the need for a good stop criterion.
Simple Stop Criteria A very simple stop criterion limits the number of one-step refinements that may be performed to a given maximum number m. This m may depend on the theory T. For instance, m can be set to C × (number of subformulas in T ), where C is some fixed constant.
A slightly less naive technique, which can be combined with the previous, limits the "complexity" of the bounds by putting a fixed upper bound N on the number of nodes the BDD representation of a bound may have. If a one-step refinement would lead to a new bound with more nodes than N, this refinement is not performed. As this limits the number of applicable one-step refinements, the probability of reaching a fixpoint increases.

Stop Criteria via Estimators
The experiments we present in Section 6.3 indicate that there exist appropriate values for C and N that produce positive results on most of the examples. Still, on some problems, grounding slows down severely, while the size of the produced grounding does not decrease. One of these problems is the following clique problem (entry 6 in Table 4):
∀x ((∀y (Clique(y) ∧ x ≠ y ⊃ Edge(x, y))) ⊃ Clique(x)).
If Edge Iσ is symmetric, i.e., I σ represents an undirected graph, a model of T expanding I σ is a clique in I σ that is not contained in a strictly larger clique in I σ. Within a small number of iterations, the refinement algorithm finds for Clique(x) the ct-bound ∀x′ (x ≠ x′ ⊃ Edge(x, x′)). This formula expresses that Clique(x) is certainly true in every solution if x is directly connected to every other vertex in the input graph. Clearly, for most graphs, no vertex satisfies this condition. So, for most graphs, ⊥ would be an equally precise ct-bound, but would allow much faster querying.
The situation is worse for the cf-bound for Clique(x). Since for an undirected graph, every single vertex is a clique, and thus occurs in at least one of the solutions, the cf-bound is necessarily unsatisfiable with respect to T. Yet, our implementation of the refinement algorithm came up with ∃x′ (¬Edge(x, x′) ∧ x ≠ x′ ∧ (∀x″ (x′ ≠ x″ ⊃ Edge(x′, x″)))) as cf-bound. The query algorithm outlined above takes cubic time in the number of vertices to find out that no x satisfies this formula.
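The behaviour of these two bounds can be checked on a small input graph. The Python sketch below evaluates both formulas by brute force, reading the comparisons between variables in the bounds as inequalities between distinct vertices, as the informal description ("every other vertex") requires. The function names and the example graph are our own.

```python
def ct_clique(x, vertices, edge):
    # forall x' (x != x'  implies  Edge(x, x'))
    return all(x == y or (x, y) in edge for y in vertices)


def cf_clique(x, vertices, edge):
    # exists x' (not Edge(x, x') and x != x'
    #            and forall x'' (x' != x'' implies Edge(x', x'')))
    return any(
        (x, y) not in edge and x != y and ct_clique(y, vertices, edge)
        for y in vertices
    )


# Undirected path 1 - 2 - 3, stored as a symmetric relation.
vertices = [1, 2, 3]
edge = {(1, 2), (2, 1), (2, 3), (3, 2)}

certainly_in = [v for v in vertices if ct_clique(v, vertices, edge)]
certainly_out = [v for v in vertices if cf_clique(v, vertices, edge)]
```

Evaluating cf_clique for all vertices involves three nested iterations over the vertex set, matching the cubic cost mentioned above, yet on a symmetric edge relation it never produces an answer.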
To avoid the problems illustrated by the example above, one could estimate the reward of a bound versus the cost of evaluating it. Recall that more precise bounds yield smaller grounding sizes. Therefore, the reward of a bound ψ is dictated by its precision. Given I σ , it is possible to find a good estimate for the number of answers to ψ in I σ (Demolombe, 1980), which is in turn a measure for the precision of ψ. For a fixed query algorithm, one can also estimate the cost cost(ψ) of computing an answer in I σ to a query ψ. In the following, we assume that the reward of a bound is a positive real number, and its cost a strictly positive real number. If a computed bound ψ is too complex, i.e., its BDD representation contains too many nodes or the ratio r(ψ) is above a certain threshold, ψ is not used.
If BDDs are used to represent the bounds assigned by C, line 8 can be implemented in linear time in the size of C. If we use Goubault's simplification algorithm for BDDs for implementing line 9, the worst case complexity of this step is non-elementary in the size of C ct (ϕ) ∨ ψ (Goubault, 1995). The estimators we used to implement line 10 take linear time in the size of ψ. It may seem that the complexity of the simplification method limits the practical applicability of Algorithm 6. However, since large BDDs usually do not pass the test in line 10, the simplification method is rarely applied on large BDDs. In the experiments of the next section, the running time of the refinement algorithm is negligible compared to the running time of the grounding algorithm.

Experiments
We implemented Algorithm 1 and Algorithm 6, using BDDs to represent bounds. The resulting grounder is called GidL. In this section, we present experiments, obtained with GidL, that show the impact of using bounds on grounding size and time.
As input for GidL, we used 37 benchmark problems, mainly taken from Asparagus (see footnote 6). The details about the experiments are available at http://dtai.cs.kuleuven.be/krr/software.html. We used four different versions of GidL:
GidL nb : Assigns ⟨ϕ, ¬ϕ⟩ as bound to every atomic subformula ϕ over the input vocabulary, and ⟨⊥, ⊥⟩ to every other subformula. As such, it creates the reduced grounding of the input theory.
GidL bu : Assigns ⟨ϕ, ¬ϕ⟩ as bound to every atomic subformula ϕ over the input vocabulary and then applies bottom-up refinements to obtain a bottom-up c-map.
GidL mn : Limits the refinement algorithm to 4 × (number of subformulas in T ) one-step refinements and allows a maximum of 4 internal nodes in each BDD used to represent the bounds. According to previous experiments (Wittocx et al., 2008b), this is the best setting when limiting the number of nodes.
GidL r : Limits the refinement algorithm to 4 × (number of subformulas in T ) one-step refinements. It limits the complexity of the derived bounds by estimating the number of answers and the cost, as described in the previous section.
In Table 3, the influence of bounds on the grounding size is shown. The second and third column show the ratio of the grounding size obtained with GidL mn and GidL r compared to Gr red (T ). For GidL nb and GidL bu , this ratio is always equal to 1. When interpreting Table 3, it is important to note that small reductions in grounding size are not important. The reason is that all reductions that can be obtained by the refinement algorithm are also obtained by applying unit propagation on the grounding (see Section 7 for a discussion). Since there exist very efficient implementations of unit propagation, it is not beneficial to let the refinement algorithm find small reductions at a relatively high cost. We see that both GidL mn and GidL r reduce the grounding size by more than 50% in around 30% of the benchmarks. In 7, respectively 6, of the benchmarks there is a spectacular reduction of more than 95%.
More important than reductions in size are reductions in grounding time. Table 4 shows the running times of the different versions of GidL, and (between brackets) the ratio of the running time to the running time of GidL nb. The running time of the refinement algorithm is included (it never took more than 0.02 seconds). A time-out (###) of 600 seconds was used.
On many benchmarks, the reduction in grounding time with respect to GidL nb is due to the reduction in grounding size. Yet there are also several benchmarks where time decreases a lot, while there is almost no reduction in size. This is mostly due to the creation of a bottom-up c-map, as can be seen from the running times of GidL bu. Applying bottom-up refinements leads to the assignment of non-trivial bounds to non-atomic subformulas. This allows for earlier pruning by a top-down style grounder, and hence faster grounding.
6. http://asp.haiti.cs.uni-potsdam.de/
From Table 4, we can see that GidL mn performs quite well. On half of the benchmarks, it is more than 44% faster than GidL nb. It is also more than 20% faster than GidL bu on half of the benchmarks. There are some outliers however. On benchmarks 6 and 11, it is far slower than GidL bu , while not producing a significantly smaller grounding. This indicates the use of a complex bound with relatively small reward. Compared to GidL mn , GidL r is faster and more robust, indicating that using estimators for the reward and cost of bounds pays off in most cases. In only two of the benchmarks, our naive estimator makes a wrong guess. In benchmark 1, a bound with high cost and no reward is allowed; in benchmark 7, a bound with low cost and high reward is not allowed by GidL r. It is part of future work to implement improved estimators.
We conclude from our experiments that grounding with bounds is applicable in practice. It often leads to smaller grounding sizes on standard benchmark problems, and if the bounds are carefully restricted, it yields a significant speed up. Since the time to compute bounds is small compared to the overall grounding time, computing them is essentially for free.
In general, a smaller grounding does not necessarily lead to faster propositional model generation. For example, grounding size (and time) increases when symmetry breaking formulas are added, but these formulas may drastically improve the overall solving time (Torlak & Jackson, 2007). Another example is clause-learning SAT solvers: the clauses learnt by these solvers are redundant, but may improve the solving time by orders of magnitude. The question arises whether our method of grounding with bounds may lead to slower overall model generation time compared to grounding without bounds. This is not the case. The experiments above show that in general, grounding with bounds is faster than grounding without bounds. Since grounding with bounds also produces smaller groundings, the subsequent initialization phase of the SAT solver is executed faster. If T 1 and T 2 are two groundings obtained by grounding the same input theory and structure with, respectively without bounds, it can be shown (see footnote 7) that the typical simplification steps applied in this initialization phase transform T 1 and T 2 into exactly the same simplified theory T 3. Thus, after initialization, the SAT solver is applied on exactly the same theory, whether or not the grounder used bounds. It follows that in general, the overall model generation time does not increase when bounds are applied while grounding.

Related Work
In the previous sections we described a method to obtain fast and compact grounding. Several such methods have been described in the literature. Some of them are, like ours, preprocessing techniques that rewrite the input theory. Other techniques involve reasoning on the propositional level. In this section we provide an overview. We indicate which ones can be applied to improve GidL. We also give an overview of existing grounders.

Methods to Optimize Grounding
Derivation of Bounds To our knowledge, the methods proposed in the literature to derive bounds are less general than the one we presented in this paper. This is illustrated by Table 5, where we show for several grounders the impact of manually adding redundant information. For all the grounders in this table except GidL, manually adding redundancy may have a serious impact. For some grounders, the need to add redundancy can sometimes be avoided by writing the input theory in a specific format. For example, the grounder gringo (Gebser et al., 2007) uses a syntactic check to derive bounds: it derives that predicate q of the input vocabulary is a bound for predicate p if p is defined by a choice rule of the form, e.g., {p(X)} :- q(X). However, if this rule is replaced by {p(X)} :- dom(X), where dom denotes the domain, and the constraint :- p(X), not q(X), dom(X) is added, q is still a bound for p, but this is not detected by gringo, as can be seen in Table 5.
7. The exact formulation and the proof of the property are beyond the scope of this paper.
The grounder of the dlv system (Perri et al., 2007) may derive bounds by reasoning on the propositional level. As we explain below, the order in which rules and constraints are grounded is of crucial importance for such a method to pay off. Since dlv grounds rules before constraints, using a constraint to state that q is a bound for p does not improve grounding time.
Propagation on the Propositional Level One of the techniques to produce smaller groundings consists of applying a constraint propagation method on the ground theory T g and replacing by ⊤, respectively ⊥, every ground literal that is derived to be true, respectively false. The resulting theory is then simplified. This technique is applied by the grounder psgrnd (East et al., 2006), which uses unit propagation (Davis & Putnam, 1960) and complete one-atom lookahead (Li & Anbulagan, 1997) as propagation methods. The latter is performed once the grounding is finished, the former is triggered each time a unit clause is added to the grounding. If an inconsistency is detected by unit propagation, the grounding process is terminated immediately. Observe that this technique yields small groundings but does not improve grounding speed, except for the (rare) case where the propagation method detects an inconsistency during grounding. Indeed, it does not avoid computing all ground instances of the formulas in the input theory.
If a propositional constraint propagation method is applied while the grounding is being constructed, the derived information could be used to refine bounds. For instance, if unit propagation derives that the domain atom P(d_1, ..., d_n) is true, then x_1 = d_1 ∧ ... ∧ x_n = d_n is a ct-bound for P(x_1, ..., x_n). These bounds could be used to speed up the construction of the rest of the grounding. For this method to be effective, however, some careful fine-tuning of the order in which sentences are grounded is required. It may even be necessary to alternately compute partial groundings of different sentences. To the best of our knowledge, this process has not been worked out or implemented with unit propagation or one-atom lookahead as the underlying propagation method. On the other hand, most ASP grounders apply it for the following limited propagation method: if all rules defining a predicate P are grounded, it is concluded that a domain atom P(d) is certainly true if it occurs in a ground rule of the form P(d) ← ⊤, and certainly false if it does not occur in the head of any ground rule. In this case, a good grounding order can be derived from the dependency graph of the input theory (e.g., Cadoli & Schaerf, 2005; Perri et al., 2007). In GidL, this strategy is implemented for grounding definitions.
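The idea of turning propagated literals into bounds can be illustrated with a small sketch. It is not the implementation of psgrnd or GidL: the clause encoding (a literal is a pair of a ground atom and a sign, an atom is a pair of a predicate name and an argument tuple) and the function names `unit_propagate` and `bounds_from_assignment` are assumptions made for this example only. Bounds are represented simply as sets of tuples that are certainly true (ct) or certainly false (cf).

```python
from collections import defaultdict


def unit_propagate(clauses):
    """Run unit propagation on ground clauses.

    Each clause is an iterable of literals (atom, sign); an atom is
    (predicate_name, argument_tuple). Returns None on a conflict,
    otherwise (assignment, simplified_clauses).
    """
    assignment = {}
    pending = [frozenset(c) for c in clauses]
    changed = True
    while changed:
        changed = False
        remaining = []
        for clause in pending:
            # A clause containing a literal that is already true is dropped.
            if any(assignment.get(atom) == sign for atom, sign in clause):
                continue
            # Literals over assigned atoms are false here, so only the
            # literals over unassigned atoms survive.
            open_lits = [(a, s) for a, s in clause if a not in assignment]
            if not open_lits:
                return None  # every literal is false: inconsistency
            if len(open_lits) == 1:
                atom, sign = open_lits[0]
                assignment[atom] = sign  # unit clause: literal is forced
                changed = True
            else:
                remaining.append(frozenset(open_lits))
        pending = remaining
    return assignment, pending


def bounds_from_assignment(assignment):
    """Group the derived atoms per predicate into ct- and cf-tuple sets."""
    ct, cf = defaultdict(set), defaultdict(set)
    for (pred, args), value in assignment.items():
        (ct if value else cf)[pred].add(args)
    return dict(ct), dict(cf)


# Hypothetical ground theory:
#   P(a).   ¬P(a) ∨ Q(a,b).   ¬Q(a,b) ∨ R(b) ∨ R(c).   ¬P(c) ∨ ¬Q(a,b).
example = [
    [(("P", ("a",)), True)],
    [(("P", ("a",)), False), (("Q", ("a", "b")), True)],
    [(("Q", ("a", "b")), False), (("R", ("b",)), True), (("R", ("c",)), True)],
    [(("P", ("c",)), False), (("Q", ("a", "b")), False)],
]
```

In this example, propagation derives P(a) and Q(a,b) as certainly true and P(c) as certainly false, so a grounder that interleaved propagation with grounding could skip instances of later sentences that depend on these atoms. Whether that pays off depends on the grounding order, as discussed above.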
Sharing. A second technique is called sharing and consists of detecting subformulas in the ground theory T_g that occur more than once. If such a subformula ϕ is detected, all its occurrences in T_g are replaced by a new atom P, and the sentence P ≡ ϕ is added. If ϕ is a large formula and occurs often in T_g, this may result in a significant reduction in grounding size. Sharing also improves propagation in SAT solvers. Shlyakhter, Sridharan, Seater, and Jackson (2003) present an algorithm to detect identical subformulas on the first-order level, and Torlak and Jackson (2007) present one for the propositional level. In GidL, we implemented a simple sharing technique using dynamic programming. We adapted function groundConj so that instead of returning a conjunction C, it creates a new atom P, adds the sentence P ≡ C to the grounding, and returns P. If groundConj is applied multiple times to the same input ϕ, the same atom P is returned each time, but P ≡ C is added only once. Function groundDisj is adapted in a similar fashion.
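The memoization behind this kind of sharing can be sketched as follows. This is a simplified illustration and not GidL's code: the class `SharingGrounder`, its methods and the string encoding of ground atoms are hypothetical, and the cache is keyed on the set of already-ground subformulas rather than on the first-order input ϕ.

```python
class SharingGrounder:
    """Toy grounder fragment that names each distinct ground subformula once."""

    def __init__(self):
        self._cache = {}        # (connective, operands) -> defining atom
        self.definitions = []   # sentences "atom ≡ connective(operands)"

    def _share(self, connective, operands):
        key = (connective, frozenset(operands))
        if key not in self._cache:
            atom = f"_s{len(self._cache)}"  # fresh atom name
            self._cache[key] = atom
            # The defining equivalence is added only on first use.
            self.definitions.append((atom, connective, tuple(sorted(operands))))
        return self._cache[key]

    def ground_conj(self, conjuncts):
        """Return an atom standing for the conjunction of the given atoms."""
        return self._share("and", conjuncts)

    def ground_disj(self, disjuncts):
        """Return an atom standing for the disjunction of the given atoms."""
        return self._share("or", disjuncts)
```

Calling `ground_conj(["P(a)", "Q(a)"])` twice returns the same atom and records a single defining sentence, which is the effect described above. The returned atoms can be nested, so shared subformulas inside larger formulas are also named once.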
Clause splitting. Clause splitting is a well-known rewriting technique applied in MACE style model generation (McCune, 2003). It consists of splitting a first-order clause

∀x∀y∀z (ϕ_1[x, z_1] ∨ ϕ_2[y, z_2])    (20)

where x ∉ z_2, y ∉ z_1 and z = z_1 ∪ z_2, into two new clauses

∀x∀z (ϕ_1[x, z_1] ∨ S(z_1 ∩ z_2))    (21)
∀y∀z (¬S(z_1 ∩ z_2) ∨ ϕ_2[y, z_2])    (22)

Here, S is a new predicate symbol. The full grounding of (20) is of size O(|D|^3), while the full grounding of (21) and (22) has only size O(|D|^2). If sharing is implemented by adapting the functions groundConj and groundDisj as explained above, the effect of clause splitting can be obtained by moving quantifiers according to the equivalences (4), (5), (8) and (9) of Section 2.2. For instance, we can apply equivalences (4) and (8) to replace (20) by ∀x∀z (ϕ_1 ∨ (∀y ϕ_2)). Grounding the latter while applying sharing has the same effect as clause splitting. Similarly, the grounding size of ∃x∃y∃z (ϕ_1[x, z_1] ∧ ϕ_2[y, z_2]) can be reduced by replacing this formula by ∃x∃z (ϕ_1 ∧ (∃y ϕ_2)).
The simple heuristic to guide clause splitting described by Claessen and Sörensson (2003) can be applied directly to choose which quantifiers to move inside. We conclude that clause splitting could easily be incorporated in GidL.
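The size difference between a clause and its split version can be checked on a toy instance. The sketch below assumes the simplest case, in which x, y and z are single variables, ϕ_1 mentions x and z, ϕ_2 mentions y and z, and the fresh predicate S ranges over the shared variable z. The function names and the string encoding of ground literals are chosen for this illustration only.

```python
from itertools import product


def ground_unsplit(domain):
    """Full grounding of  ∀x∀y∀z (phi1(x, z) ∨ phi2(y, z))."""
    return [
        (f"phi1({x},{z})", f"phi2({y},{z})")
        for x, y, z in product(domain, repeat=3)
    ]


def ground_split(domain):
    """Full grounding of the two clauses obtained by splitting on z:
    ∀x∀z (phi1(x, z) ∨ S(z))  and  ∀y∀z (¬S(z) ∨ phi2(y, z)).
    """
    first = [(f"phi1({x},{z})", f"S({z})") for x, z in product(domain, repeat=2)]
    second = [(f"-S({z})", f"phi2({y},{z})") for y, z in product(domain, repeat=2)]
    return first + second


def sizes(n):
    """Number of ground clauses before and after splitting for |D| = n."""
    domain = range(n)
    return len(ground_unsplit(domain)), len(ground_split(domain))
```

For a domain of n elements the unsplit clause yields n^3 ground clauses and the split version 2n^2, which matches the cubic versus quadratic bounds stated above.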
Database Techniques. Several techniques for optimizing querying in databases can be used to optimize grounding. Examples are join-ordering strategies, backjumping and indexing techniques.
One of the most basic techniques to improve grounding speed consists of reordering (long) conjunctions or disjunctions of literals. Which order is best depends on the grounding algorithm. Different strategies are described by, e.g., Leone, Perri, and Scarcello (2001), Syrjänen (1998, 2009) and in the database literature (Garcia-Molina, Ullman, & Widom, 2000). There is no problem implementing a similar technique in GidL. Also, reordering the nodes in the BDD representation of the bounds could optimize querying. It is part of future work to investigate such reordering strategies for BDDs.
One of the important methods in the dlv grounder is the use of a backjumping technique (Perri et al., 2007) to efficiently find all instances of a conjunction ϕ_1 ∧ ... ∧ ϕ_n that are possibly true, given (an overestimation of) the possibly true instances of each of the conjuncts ϕ_i. In GidL, this backjumping technique is applied to implement line 12 of function groundDisj. Indeed, if ϕ is the formula ϕ_1 ∧ ... ∧ ϕ_n, then line 12 amounts to finding all possible instances of ϕ, while the cf-bounds for ϕ_1, ..., ϕ_n provide an overestimation of the possibly true instances of these conjuncts. Similarly, the backjumping technique is applied to improve line 12 of groundConj, where all possibly false instances of a disjunction are calculated. Catalano, Leone, and Perri (2008) present an adaptation of indexing strategies for grounding.
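The underlying task, enumerating the possibly true instances of a conjunction from overestimations of its conjuncts, can be sketched with plain chronological backtracking. This is deliberately simpler than the backjumping technique of dlv and is not how GidL implements it. The function name `possible_instances` and the representation of overestimations as sets of tuples per predicate are assumptions of this example.

```python
def possible_instances(conjuncts, overestimates):
    """Enumerate variable bindings under which every conjunct is possibly true.

    conjuncts:     list of (predicate_name, tuple_of_variable_names)
    overestimates: dict mapping predicate_name to a set of argument tuples
                   that are possibly true for that predicate
    """
    results = []

    def extend(index, binding):
        if index == len(conjuncts):
            results.append(dict(binding))
            return
        pred, variables = conjuncts[index]
        for tup in overestimates[pred]:
            candidate = dict(binding)
            # setdefault binds a fresh variable or returns its earlier value,
            # so the test fails exactly when the tuple clashes with the binding.
            if all(candidate.setdefault(v, d) == d for v, d in zip(variables, tup)):
                extend(index + 1, candidate)

    extend(0, {})
    return results


# Hypothetical conjunction Edge(x, y) ∧ Edge(y, z).
path_query = [("Edge", ("x", "y")), ("Edge", ("y", "z"))]
edge_overestimate = {"Edge": {(1, 2), (2, 3)}}
```

With the overestimation above, the only binding returned is x = 1, y = 2, z = 3. Backjumping improves on this scheme by returning directly to the conjunct responsible for a failure instead of the most recent one, and the order of the conjuncts strongly affects the amount of search.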
Partition-Based Reasoning. Ramachandran and Amir (2005) describe a sophisticated grounding technique that can reduce the grounding size of FO theories, depending on the availability of some graphical structure in these theories. This technique is not directly applicable in our case, since it produces groundings that are not necessarily I_σ-equivalent to the input theory. The only guarantee is that the ground theory is satisfiable iff the input problem is satisfiable.

Grounders
A non-native approach to ground an MX(FO(ID)) problem consists of first translating it to an equivalent normal logic program under the well-founded semantics. This translation is described by Mariën et al. (2004). Next, a (slightly adapted) grounder for ASP is used to ground the logic program. This is the approach taken by MXidL (Mariën, Wittocx, & Denecker, 2006). The first native grounding algorithm for MX(FO) and MX(FO(ID)) was described by Patterson, Liu, Ternovska, and Gupta (2007). It is based on relational algebra and takes a "bottom-up approach" (see Section 3.2.1). To construct a grounding of a sentence ϕ, it first creates all possible groundings of the atomic subformulas. Then it combines these groundings using relational algebra operations, working its way up the syntax tree. Finally, a grounding for ϕ is obtained. Mitchell et al. (2006) report on an implementation, called mxg, of the algorithm.
kodkod (Torlak & Jackson, 2007) is an MX grounder for a syntactic variant of FO. Like mxg, it works in a bottom-up way. It represents intermediate groundings by (sparse) matrices. One of the features of kodkod is that it allows a user to give part of a solution to an MX problem as a three-valued structure. Specifically, the user can force that some atoms P(d), where P is an expansion predicate, are certainly true (or certainly false). kodkod then takes advantage of this information to produce smaller groundings. GidL also allows for a three-valued structure as input. When applying the refinement algorithm, the set of tuples d for which the user indicates that P should be true is then used as the initial ct-bound for P instead of ⊥. The same holds for the cf-bound. This leads to more efficient and compact groundings.
mace (McCune, 2003) and paradox (Claessen & Sörensson, 2003) are finite model generators for FO. They work by choosing a domain and grounding the input theory to SAT. If the resulting grounding is unsatisfiable, the domain size is increased and the process is repeated. The grounding algorithm in mace and paradox basically constructs the full grounding and simplifies it afterwards. Small groundings are obtained by first rewriting the input theory using, e.g., clause splitting. Methods that build the grounding incrementally are also applied in these systems to avoid recomputing every grounding from scratch.
East et al. (2006) developed the grounder psgrnd for MX(PS^pb). PS^pb is a fragment of FO(ID), extended with pseudo-boolean constraints. As explained above, psgrnd performs reasoning on the ground theory to reduce memory usage and grounding size. The experiments performed by East et al. (2006) show that carefully designed data structures are of key importance to build an efficient grounder.
ASP grounders take as input a normal logic program and transform it into an equivalent ground normal logic program. As such, these grounders do not deal with (deeply) nested formulas. Currently, there are three ASP grounders: lparse (Syrjänen, 2000; Syrjänen, 2009), gringo (Gebser et al., 2007) and the grounding component of dlv (Perri et al., 2007). All of them use techniques from database theory to perform grounding efficiently.
Finally, we mention the grounder spec2SAT (Cadoli & Schaerf, 2005). Its input theories are in the np-spec language, a language with Datalog-like syntax and semantics based on model minimality. The grounding algorithm implemented in spec2SAT is basically a simplified version of the grounding algorithm of dlv.
It would be interesting to compare the efficiency of the above-mentioned grounders experimentally. However, it is currently not possible to conduct such an experiment in a scientifically fair way. There are several reasons for this. First, all grounders have a different input language, making it impossible to run them on the same input. Second, there are several output languages for grounders, and a richer output language allows for more compact and faster grounding. For instance, for some problems, lparse's output size is necessarily cubic in the input domain size, while GidL's output format allows for quadratic size. Third, even if the input and output languages of all grounders were the same, an expert could easily manipulate experiments by carefully choosing a modelling style. For example, if bounds are not manually added to the input theories, GidL has an advantage. If bodies of rules are not ordered, dlv is more likely to produce good results. Finally, because of the large amount of data processed by grounders, carefully designed data structures and an optimized implementation of the core grounding algorithm are very important to achieve fast grounding (East et al., 2006). However, several of the above-mentioned grounders are not yet optimized in that sense. As such, it is difficult to derive conclusions about grounding algorithms by experimentally comparing the efficiency of current implementations of these algorithms.

Conclusions
We presented a method to compute, for a given theory, upper and lower bounds for all subformulas of that theory. We showed how these bounds can be used for efficiently creating small groundings in the context of Model Expansion for FO and FO(ID). Our method frees a user from manually discovering bounds and adding them to a theory.
We presented a top-down style grounding algorithm that incorporates bounds. We discussed implementation issues and showed by experiments that our method works in practice: on many benchmark problems, it leads to significant reductions in grounding size and time.
Future work includes the extension of our algorithm to compute bounds for richer logics, e.g., extensions of FO with aggregates and arithmetic. On the implementation side, we plan to use more sophisticated estimators to evaluate whether a computed bound is beneficial for grounding.

Theorem 16. If C is a c-map for T over σ and C the theory defined in Definition 9, then C T ∪ C is equivalent to T.
Proof. Let M be a model of T. Then M |= C, and because of Lemma 13, M |= C T ∪ C. On the other hand, if M |= C T ∪ C, then by Lemma 13, M |= T.
Corollary 17. If C is a c-map for T over σ, then T and C T ∪ C are I_σ-equivalent for any σ-structure I_σ.
follows that for any subformula P(y) occurring in T, neither d ∈ {y | C_ct(P(y))}^{I_σ} nor d ∈ {y | C_cf(P(y))}^{I_σ}. Therefore P(d) does not occur in Gr red (C A ).
then T |= ∀x ϕ r ct ⊃ ϕ and T |= ∀x ϕ r cf ⊃ ¬ϕ.
Definition 32. A c-map C for T is called a bottom-up c-map if for every non-atomic subformula ϕ of T, C_ct(ϕ) is the bottom-up ct-refinement bound for ϕ with respect to C, and C_cf(ϕ) is the bottom-up cf-refinement bound for ϕ with respect to C.
The next proposition follows directly from Lemma 31.
Proposition 33. A bottom-up c-map C is atom-based.
Figure 2: A bottom-up c-map

∆_1 = {
∀x∀y (TC(x, y) ← R(x, y)),
∀x∀y (TC(x, y) ← ∃z (TC(x, z) ∧ TC(z, y)))
}
Example 13. To cast the problem of finding a Hamiltonian path in a given graph as an MX(FO(ID)) problem, let σ = ⟨{Edge/2}, ∅⟩ and Σ = ⟨σ_P ∪ {Ham/2, Reached/1}, {Start/0}⟩. Predicate Ham represents the edges that form the path and Reached the vertices that are in the path. The constant Start represents the first vertex of the path. Let T be the theory
Then model expansion for input theory T and input vocabulary σ expresses the Hamiltonian path problem: in every model M |=_{I_σ} T, the collection of edges (v_1, v_2) ∈ Ham^M forms a Hamiltonian path in the graph represented by Edge^{I_σ}.
[x] is querying: finding tuples d of domain constants such that I_σ |= ϕ[x/d]. Finding a tuple d such that I_σ ⊭ ϕ[x/d] corresponds to querying ¬ϕ. We now show that querying a bound ϕ[x] can be done directly on the BDD representation by a simple backtracking algorithm.
where d_1, ..., d_n, d are domain constants. A theory is in GNF if all its sentences are in GNF. A GNF theory is essentially propositional: by replacing in a GNF theory T every atom P(d) by P d

Table 4: Impact of bounds on grounding time

Table 5: Grounding times (in seconds) for the Hamiltonian circuit problem with an input graph of 200 nodes and 1800 edges. Encoding constr uses a constraint to state that each edge in the cycle should be an edge of the graph. Encoding redun adds redundancy to include this bound in all rules and constraints. Encoding defin contains no redundancy, but limits the possible edges in the cycle to the edges in the graph while defining the search space for the cycle.