Spatial Aggregation: Theory and Applications

Visual thinking plays an important role in scientific reasoning. Based on the research in automating diverse reasoning tasks about dynamical systems, nonlinear controllers, kinematic mechanisms, and fluid motion, we have identified a style of visual thinking, imagistic reasoning. Imagistic reasoning organizes computations around image-like, analogue representations so that perceptual and symbolic operations can be brought to bear to infer structure and behavior. Programs incorporating imagistic reasoning have been shown to perform at an expert level in domains that defy current analytic or numerical methods. We have developed a computational paradigm, spatial aggregation, to unify the description of a class of imagistic problem solvers. A program written in this paradigm has the following properties. It takes a continuous field and optional objective functions as input, and produces high-level descriptions of structure, behavior, or control actions. It computes a multi-layer of intermediate representations, called spatial aggregates, by forming equivalence classes and adjacency relations. It employs a small set of generic operators such as aggregation, classification, and localization to perform bidirectional mapping between the information-rich field and successively more abstract spatial aggregates. It uses a data structure, the neighborhood graph, as a common interface to modularize computations. To illustrate our theory, we describe the computational structure of three implemented problem solvers -- KAM, MAPS, and HIPAIR --- in terms of the spatial aggregation generic operators by mixing and matching a library of commonly used routines.


Introduction
It is commonly believed that there are two styles of scienti c thinking: analytical, a logical chain of symbolic reasoning from premises to conclusions, and visual, the holding of imagistic, analogue representations of a problem in one's mind so that perceptual and symbolic operations can be brought to bear to make inferences.Neither style is to be preferred a priori over the other.However, for problems whose complexity precludes a direct analytical approach, a certain amount of qualitative and visual imagination is needed to provide the necessary \feel" or \understanding" of the physical phenomena.Once the picture is clear, the analytical mathematics can take over and lead more e ciently to logical conclusions.This \feel and physical understanding" is often considered to be informal, imprecise, and apparently unteachable, but necessary for scientists and engineers.
We believe part of this ability to visualize and imagine must consist of skills to generate images, discover structures and relations in the images, transform the structures, and predict how the structures respond to internal dynamics or external forcing.
A eld is a mapping from one continuum (say R m ) to another (say R n ).More concretely, one can visualize a m-dimensional space with a n-vector attached to each point in the space.Fields are commonplace in science and engineering applications.They are used to describe how physical quantities vary over space and time.Temperature in a room is a threedimensional scalar eld.Weather data can be described as a 4D-spacetime eld with a 6vector attached to each point: velocity (three components) of air ow, temperature (scalar), pressure (scalar), and density (scalar).Other examples of elds include the brightness intensity array in vision, the con guration space in mechanism analysis, and the phase space (vector eld) of dynamical systems.
In actual computer representations, we often approximate a eld with a grid.The grid may be uniform or non-uniform.The eld can be reconstructed from numerical simulation or measurements.A eld does not contain any symbolic abstractions; it is completely numerical.Fields are composable.One can extend the dimension of the underlying space and/or the number of components in the vector attached to each point of the space.
As a representation for physical systems, a eld has two distinguishing characteristics.First, it is information-rich in the sense of the Shannon-Weaver measurement of information.An instantaneous eld of a 128 3 -grid ow simulation may contain on the order of 10 8 bits of information.Second, it is pictorial in the sense that structures and relations are only implicitly represented in the eld.
As a consequence of both the information-richness and the pictorial quality, we argue that in reasoning about elds the central computational problem is the e cient transformation of a pointwise eld description of a physical system into economical symbolic abstractions well suited for explaining the structure and behavior of the system. 2 Figure 1 illustrates how the eld ontology relates to the other commonly used ontologies in Qualitative Physics: device (DeKleer & Brown, 1984), process (Forbus, 1984), and constraint (Kuipers, 1986).To be useful, the symbolic descriptions must impose a conceptual structure on the system so that the complexity of the system can be understood in terms of well-de ned parts and subparts and interactions among them.The relevant parts and interactions are often abstract global properties of the eld.An abstract property is a property whose support is large and nonlocal, whereas the support of a property is de ned as the subset of a eld on which the property depends.On the other hand, for computational complexity reasons we prefer to build the recognition procedures from basic routines that are local and independent of task-level information as much as possible.These considerations lead us to adopt an architecture where the pointwise description and the nal symbolic descriptions are mediated by layers of equivalence classes of objects with explicit adjacency relations.We call such a layer of objects a spatial aggregate.
Where do spatial aggregates come from?In a real eld, there tend to be continuities of properties (such as intensity or temperature or pressure) so that the eld can be divided into equivalence classes, i.e., open regions where a particular property varies in an approximately uniform way.With continuities we can achieve an economy of description by focusing on the open regions and their boundaries instead of the pointwise eld.Higher-order continuities, i.e., continuities of properties de ned on the open regions, can similarly be used to build more abstract spatial aggregates.
The formation of equivalence classes presupposes the existence of continuity.This brings us to a methodological point.It is important to clearly identify the source of continuities in the eld or equivalently in the physical system the eld represents.The discovery of valid and general continuities in the physical system is as much a scienti c contribution as the subsequent computational use of them to form an articulated conceptual model to explain structure and behavior.
Our motivation for this paper comes from the desire to understand the computational structures shared by a class of automatic problem solvers that integrate visual, symbolic, and numerical methods.We would like to make this computational structure explicit so that comparisons and generalizations can be made.Our goal is to develop a way of organizing 2. Inferring structural descriptions from a eld can be an ill-posed problem (e.g., recovering 3D shapes from 2D images).To avoid these di culties, we will assume the structure-recovery problem to be well-posed so that our main concerns are computational e ciency and appropriate abstractions.programs around image-like analogue representations, and an appropriate language to make programs written in this style clear.The next section develops the theory of spatial aggregation in detail.Section 3 describes a language to support programs that are organized around neighborhood graphs.Section 4 illustrates the usefulness of the language by describing succinctly the computation structure of three implemented programs { kam, maps, and hipair.We choose these programs as illustrations largely because of our familiarity with them.Section 5 shows how to program in the spatial aggregation language, using an example from image analysis.We plan to investigate the applicability of our framework to several other programs, such as those constructed by Kuipers and Levitt (1988), Forbus et al. (1991), Gelsey (1995), and Junker and Braunschweug (1995).

Spatial Aggregation Theory
Given a eld, a spectrum of reasoning tasks can be de ned.The following list is roughly in the order of increasing complexity: Infer structural descriptions.Find out objects, if any, that exist in the eld.
What are their shapes, sizes, and locations?How are they distributed?How are they created?How do they evolve as some parameter (say time) is varied?
Classify.Assign semantic labels to objects and con gurations.Infer correlations.Determine how the geometry and distribution of one type of objects correlate with those of another type?Check consistency.Given two objects or con gurations, test if they are equivalent or if they are pairwise consistent.
Infer incremental behavior.Given an instantaneous con guration, predict its possible short-term behaviors.
Infer behavioral descriptions.Explain and summarize the evolution of objects by a set of domain-speci c interaction rules.

Requirements of imagistic reasoning
Partly motivated by Ullman's theory of visual analysis (Ullman, 1984), we nd desirable the following general requirements on imagistic reasoning: Abstractness.The problem solver should be able to nd objects de ned by abstract global properties.
Open-endedness.The problem solver architecture should be applicable to a variety of domains ( uid motion, seismic data, weather data, phase space, or con guration space).This requirement implies that the basic recognition routines must be modular and composable.Task-speci c knowledge a ects the choice and ordering of these routines.
E ciency.The \building blocks" of the recognition machinery must be local and non-goal-speci c. \Non-goal-speci c" means the operations of the building blocks do not depend on the interpretation of the objects they manipulate.This requirement implies that the basic routines should have local supports and in principle can run in parallel.
Soundness.The structural and behavioral descriptions must be consistent with known physical and mathematical principles.
Succinctness.The structural and behavioral descriptions should contain the qualitatively important distinctions relevant to the high-level tasks at hand.

Theory
Our theory of imagistic reasoning postulates the existence of multi-layers of spatial aggregates.Figure 2 shows the layers of spatial aggregates and computations organized around them.A primitive aggregate is de ned as an equivalence class of subsets of the pointwise eld representation.An aggregate is composed of equivalence classes of primitive aggregates.The eld is assumed to have a task-dependent metric.The metric induces a topology on the space and hence it is meaningful to talk about adjacency.The data structure for a spatial aggregate is a neighborhood graph whose nodes represent objects and edges represent adjacency relations among the objects.The input eld is sampled to form the lowest layer of abstraction; the eld can also be a ected by control actions from the higher-level abstraction layers.
Just as the stream construct in the scheme programming language provides a common interface for organizing signal processing computations, the neighborhood graph is our conceptual glue for piecing together operations that manipulate elds.We like to visualize nodes of neighborhood graph as open sets (in topology) in some appropriate space.Two nodes are adjacent if their respective open sets are contiguous. 3 The topological notion of adjacency is amazingly useful in reasoning about physical systems.In grouping objects into equivalence classes, a cluster tends to give rise to a connected component of the neighborhood graph.In reasoning about kinematics, the neighborhood graph provides the essential connectivity information among free space regions.In nding \interesting" structures, the pairwise consistency of the adjacent nodes localizes search regions.In isolating bifurcation patterns, the mismatch of adjacent objects provides a hint for further analysis.In constraint propagation and path search, the adjacency structure imposes locality to increase computational e ciency.Prevalence and simplicity { these two aspects of the neighborhood graph make it a powerful data structure for unifying many spatial computations.
Our theory revolves around the computation of the neighborhood graph and the nature of the processes that construct, lter, transform, and compare neighborhood graphs.We isolate a set of generic operators aggregate, classify, re-describe, and search which correspond to the important conceptual pieces common to a class of imagistic problem solvers such as kam (Yip, 1991), maps (Zhao, 1994), and hipair (Joskowicz & Sacks, 1991).
The next section discusses these operators in detail.Section 4 illustrates the use of these operators in a rational reconstruction of three implemented computer programs.

The Language of Spatial Aggregation
We present a language for describing computational processes organized around spatial aggregates.The language provides a small set of operators to construct and manipulate neighborhood graphs.The operators make the conceptual structure of several implemented programs clear.There are multi-layers of spatial abstraction.An abstraction level is dened by the neighborhood graph, a data structure representing spatial aggregates and adjacency relations.The input eld is fed to the lowest abstraction layer.Note the identical computational structure on each layer.The aggregate operator computes adjacency relations based on a task-speci c metric.The neighborhood graph is the common interface for map and lter routines.The remaining operations correspond to the generic analysis tasks.A repertoire of task-independent geometric manipulation routines (which are not shown) are accessible by the generic operators.

Task-level operators
The task-level generic operators consist of aggregate, classify, re-describe, localize, search, incremental-analyze, together with the predicates pairwise-consistent? and consistent?.The neighborhood graph is the \conceptual glue": it allows the computation of hierarchical structural descriptions to be organized in a uniform manner.The following box summarizes what the language provides and what a user needs to supply in order to write programs in spatial aggregation.
A user must specify the neighborhood relation, eld metric, and equivalence relation for these operators.
Examples of N-graph include 4-adjacency arrays, minimal spanning tree, and Voronoi diagram.

aggregate(objects combiner)
The aggregate operator assembles a collection of objects into a spatial structure using the combiner procedure and explicates the spatial relations among the objects in terms of the neighborhood graph. 4The operator returns a neighborhood graph (N-graph).The N-graph can be lazily built.For example, to recognize a trajectory in a phase space, the aggregate operator might be given a set of discrete points and a combiner procedure (such as minimal spanning tree) to establish adjacency relations.The combiner procedure might use a metric or topological properties of the underlying space.

classify(N-graph cluster-proc class-rules)
The classify operator forms equivalence classes according to an equivalence relation (using the cluster-proc), and assigns a semantic label to each equivalence class | a subgraph of the input N-graph | according to the classi cation rules.For example, the orbit clustering procedure groups orbits into ow pipes. 5The classi cation rules are a set of production rules.The operator returns a labeled N-graph.The catalog of the classi cation labels is domain-speci c.These classi cation labels serve as indices for storage and retrieval of shared class properties and methods for instantiating them.

re-describe(N-graph desc-type)
The re-describe operator changes the representation of a primitive object.Like a lambda abstraction in scheme, this operator allows a compound object (say a subset of a N-graph) to be treated as a primitive.Given a classi ed object, the description-type procedure instantiates additional properties speci c to that class of objects.For example, if a point set is classi ed as a space curve, it becomes sensible to compute additional geometric properties like length, curvature, and torsion.4. localize(N-graph select-proc enumerate-proc) The localize operator systematically enumerates members of an equivalence class (nodes of N-graph) and selects those according to the select procedure.This operator \opens up" an abstraction to allow individual members of the equivalence class to be singled out.

search(N-graph initial-states goal-p combiner)
The search operator returns paths starting from the initial-states and satisfying the goal-p predicate.The combiner procedure controls the order in which the graph is traversed.

incremental-analyze(N-graph state-desc delta)
Given a N-graph and a description of states and constituent laws, the incrementalanalyze operator computes the in nitesimal change to the qualitative state due to a small perturbation.The perturbation delta might be in the temporal, state, or parameter space.There are predicates pairwise-consistent? and consistent?: pairwise-consistent?(obj1 obj2 consistency-rules) The pairwise-consistent? predicate decides if two objects are consistent according to the consistency-rules.The objects can be primitive objects such as nodes of an N-graph or N-graphs themselves.
consistent?(obj consistency-rules) Consistent? tests if an object is well-formed according to the consistency-rules. 5.A ow pipe is a class of orbits that can be continuously deformed into each other.It is an example of the homotopy equivalence class.

Generic data structure and routines
The neighborhood graph is constructed by N-graph-constructor(objects neighbor-p) The N-graph-constructor takes a set of primitive objects and a neighborhood predicate as arguments, and returns a neighborhood graph.An example of such a neighborhood graph is the Voronoi diagram.The predicate neighbor-p tests if two nodes are neighbors.The set of task-independent routines operate on the objects in the neighborhood graphs and support the task-level operations.map(N-graph proc) The map routine transforms a neighborhood graph using a prespeci ed procedure.

filter(N-graph mask)
A lter selects a subset of the neighborhood graph for further processing.
In addition to the generic operators, the language provides routines to perform common geometric manipulation.The following routines are especially useful: 1. intrinsic-geometry(obj properties) computes intrinsic geometric properties of objects (e.g., area, curvature, surface normal).2. contain?(obj1 obj2) checks if obj2 is inside obj1.3. intersect(obj1 obj2) computes intersection of two objects.4. @(object) is the boundary operator that returns the boundary of an object.The dimension of boundary is co-dimension 1. 5. (object) is the co-boundary operator that returns a new object whose boundary is the object.The dimension of the new object is one higher than that of the object.6. convolve(object mask) performs pointwise convolution with the given mask.

Examples of Spatial Aggregation
In this section, we describe the architecture of three implemented systems kam (Yip, 1991), maps (Zhao, 1994), and hipair (Joskowicz & Sacks, 1991) in terms of the spatial aggregation framework.Although these programs are designed for di erent tasks, their computations share a strikingly similar pattern: These programs construct spatial objects, and interpret them via multi-layers of abstraction by object aggregation, classi cation, and re-description.Composite objects at a lower level are labeled and manipulated as primitive units at the next higher level.
Despite the fact that we are the authors of two of these programs, the structural similarities among these programs are not apparent to us until we carefully reconstructed these programs by de ning the appropriate neighborhood graphs and generic operators.Analyzing these programs in a common framework will help us to understand not only what the programs do, but also greatly enhance our ability to construct future programs by a few spatial aggregation operators.

KAM
The task for kam is to explore the dynamics of Hamiltonian systems and produce high-level summaries of their qualitative behaviors.
Given the state equations of a Hamiltonian system, kam derives a symbolic description of its qualitative behavior | in terms of orbit types, 6 orbit bundles, phase portraits, and bifurcation patterns | from a collection of point sets representing orbits (or trajectories) in the phase space (see Figure 3).The point sets can be obtained from numerical simulation or measurements.To provide a useful interpretation of the point set, kam has to decide (1) where to look for interesting orbits, and (2) how to group these orbits into larger structures.Kam proceeds via a sequence of intermediate representations that allow the gradual recovery of orbit structures and eventually the more global dynamical properties of the system.Kam is able to view an object at multiple levels of abstraction.For example, an orbit can be viewed as points in the phase space or a curve or part of an orbit bundle.
The computations in kam are organized into four layers (as shown in Figure 4): (1) orbit, (2) orbit bundle, (3) phase portrait, and (4) bifurcation pattern.We will walk through the rst level in su cient detail to illustrate how the computation is synthesized from the spatial aggregation operators and neighborhood graph.Details of the remaining levels are described by Yip (1991).
The input is a point set.The aggregate operator imposes an adjacency relation on the point set by constructing a minimal spanning tree (MST).Two points are adjacent or neighbors if they are connected by an edge in the MST.Although the MST is appropriate for orbit interpretation, other applications might require di erent adjacency relations (such as Voronoi diagrams or k-nearest neighbors).The output of the aggregate operator is a neighborhood graph that encodes the edges of the MST.
The consistent? predicate checks if there are any inconsistent edges, i.e., edges that are signi cantly longer than their nearby edges, in the neighborhood graph.Deleting the inconsistent edge will partition the graph into subgraphs each of which represents a cluster of the original point set.
Next, the classify operator assigns a label, an orbit type, to the neighborhood graph according to the shape of the MST and the number of clusters.If the assignment is unsuccessful, kam assumes the input point set does not contain enough points to reveal the structure of the orbit.Kam will request more points and repeat the aggregation step.
If the assignment is successful, the re-describe operator takes the labeled neighborhood graph and lls in information that is relevant to that particular orbit type.For example, if the orbit is a periodic orbit, the period of the orbit is determined.After lling in the details, the re-describe operator packages the orbit as a primitive object and passes it to 6.We introduce some useful terminology here.A dynamical system is a smooth vector eld.An orbit is an integral curve of the vector eld.An orbit bundle is a collection of adjacent orbits having the same qualitative behavior.A phase portrait is the collection of orbits that ll the phase space.A bifurcation pattern is a characteristic change in the structure of a phase portrait as some system parameters vary.
Yip & Zhao Right: Phase spaces for the buckling beam (upper) and locally controlled beam (lower).To stabilize the buckling beam far from the unbuckled state | the unstable equilibrium G, maps (1) nds a ow pipe, a group of qualitatively similar trajectories, that reaches G, (2) deforms the trajectory emanating from the initial state via a force control until the trajectory is close to G, and then (3) switches to a conventional linear controller to achieve the desired stabilization.Let region R in the lower phase plane be a linearly controllable region with control u 2 .Starting from an initial state S and initial control u 1 , the system evolves along a trajectory within the ow pipe until it is close to the projection of the region R.The force control u 1 is turned on to deform the trajectory so that the system moves into the region R where a linear controller drives the system to the desired unbuckled state G.
the next level of abstraction, the orbit bundle level, where the same process of aggregation, consistency checks, classi cation, and re-description is repeated.

MAPS
Maps' task is to analyze the qualitative phase-space structures of dissipative systems and use the analysis results to guide the synthesis of control laws.Like kam, maps extracts high-level dynamical information from the phase space structures.But maps goes beyond kam in two important aspects: (1) maps deals with threedimensional structures explicitly (whereas kam reasons with cross-sections of three-dimensional structures), and (2) maps uses the phase space structures to synthesize nonlinear control actions.
Maps synthesizes a global control path geometrically (see Figure 5).Given an initial state and a desired state for the system under control, maps searches for a path in the phase space that connects the initial and the desired state.If the goal is not directly reachable from the initial state, maps pieces together multiple path segments by varying the control actions.A brute-force search for individual control paths in a continuum is clearly infeasible.Maps partitions the continuous phase space into a manageable discrete set of objects | ow pipes | by de ning appropriate equivalence relations, and searches out the ow pipes for good control paths.
The computations in maps are organized into four layers (as shown in Figure 6): (1) stability region, (2) ow pipe, (3) phase portrait, and (4) ow pipe graph.The input are the xed points of the dynamical system 7 .Two xed points are adjacent if they are connected to the same saddle by trajectories.The adjacency relation is represented by a neighborhood graph.The trajectories passing through the saddles are classi ed into equivalence classes and assigned stability region boundary labels.The re-describe operator computes the regions delimited by the stability region boundaries and represents them by polyhedra.The stability regions are fed to the next layer.
In the second layer, a stability region is triangulated by the Delaunay method.The aggregate operator constructs a neighborhood graph of the triangulation using the adjacency relation de ned by the Voronoi diagram, the dual of the Delaunay triangulation.The triangulated sub-regions are classi ed into equivalence classes according to a topological criterion which states that two adjacent sub-regions are equivalent if the trajectories passing through them can be connected in a consistent manner.Equivalence classes of sub-regions are classi ed as ow pipes.Recall each ow pipe is a coarse representation of a set of trajectories having the same qualitative properties.The use of ow pipes simpli es considerably the control path planning problem.
The third layer aggregates the ow pipes to form a phase portrait.The fourth layer is where control decisions are made.Flow pipes from di erent phase portraits are aggregated to form a larger structure, the ow pipe graph, which is the fundamental data structure supporting path planning in the phase space.Two ow pipes are adjacent if the phase space regions covered by the ow pipes overlap.Intuitively, one can switch from one ow pipe to an adjacent one by setting appropriate control parameters that generate the phase portraits in question.Given an initial and desired state, the search operator searches the ow pipe graph for solution paths.
Information can also be passed down the abstraction layer.Once a connected sequence of ow paths is found to satisfy a control objective, individual trajectory segments within the ow pipe are found by the localize operator using a shooting method.

HIPAIR
Hipair performs kinematic analysis of xed-axes mechanisms built of rigid parts.Given a description of the shapes and motion types (such as translation and rotation) of the parts, hipair derives realizable con gurations of the mechanism.
Hipair derives realizable con gurations of a mechanism by constructing and manipulating the con guration space of the mechanism (see Figure 7).The con guration space is the space of positions and orientations of the parts that make up the mechanism.hipair partitions the con guration space into free space regions where parts do not overlap, and blocked space regions where they overlap.Only con gurations that correspond to the free space regions are realizable.The boundaries of the free space regions are determined by the 7. Fixed points, or equilibrium points, are critical points in the phase space where the velocity vector vanishes.Fixed points are classi ed into three types according to the behavior of the nearby trajectories.
A xed point is an attractor if the nearby trajectories all move towards it.It is a repellor if they all move away from it.It is a saddle if some move towards and some move away from it.follower.is the cam rotation.x is the follower displacement.The shaded regions are the blocked space, indicating that the parts overlap.The free space regions are the realizable con gurations of the cam-follower.The boundaries of the free space regions are determined by the contact relations between the cam ngers and the follower.
contact relations among the parts that touch each other.A region diagram is a graph whose nodes are free space regions and edges specify region adjacencies.The region diagram of the mechanism is composed of the regions diagrams of its pairwise interacting parts.For example, the region diagram of a mechanism with 10 parts is constructed from the region diagrams of 45 possibly interacting pairs.The computations in hipair are organized into three layers (as shown in Figure 8): (1) free space region, (2) subassembly region diagram, (3) mechanism region diagram.The input are the shapes of parts and their motion types.Hipair rst considers a pair of interacting parts.It looks up the equations of the contact curves, i.e., curves in the con guration space for the pair corresponding to the con gurations where the two parts touch, from a pre-compiled table of common contact curves.A contact curve is partitioned into segments by intersection points of the curve with either another contact curve or the boundaries of the con guration space.Two segments are adjacent if they share an endpoint.The aggregate operator assembles the segments and their adjacency relations into a neighborhood graph.The search operator traverses the neighborhood graph to nd all closed chains of segments, where a closed chain of segments is a sequence of segments that intersect itself.Each closed chain of segment encloses a free space region.The consistent? predicate discards closed chains that lie inside other closed chains.The classify operator assigns a label to each closed chain, and the re-describe operator computes the free space regions delimited by the closed chains.Each free space region is subdivided into convex regions.
The input to the second layer are free space regions.They are aggregated into a neighborhood graph.Two free space regions are adjacent (or neighbors) if they touch.Given an initial con guration S 0 of an interacting pair, the search operator nds the free space regions reachable from S 0 by a depth rst search.The neighborhood graph is re-described as a subassembly region diagram.
M 2 = fR 0 ; S 1 ; T 0 g M 3 = fR 1 ; S 1 ; T 0 g The consistent? predicate checks each of the candidate neighbors and discards the unrealizable ones.

An Illustration
In this section, we show what it is like to program in the spatial aggregation language.The example is a boundary tracer for line drawings. 8We pick this example because image analysis routines can be quite naturally written in the spatial aggregation style.
Boundary tracing is a basic operation in image analysis. 9The operation might be used to identify and group boundary segments from the same object.For example, consider a line drawing of overlapping 2D objects (see Figure 9).To group the boundary segments, one might rst decompose the gure into segments, and junctions.A tracing process then joins colinear segments.
We rst de ne a neighborhood relation between pixels by the 4-adjacency (namely, the neighbors of a pixel are the pixels in its immediate north, east, south, and west).Because there is often no e cient way to construct N-graphs directly from neighborhood relations, we de ne an explicit N-graph neighborhood constructor that nds all the 4-adjacency neighbors of a given pixel.
Next the aggregate operator assembles the pixels into an N-graph by the N-graph constructor.Pixels in the N-graph are considered similar if they are neighbors and neither is a junction, where a junction is de ned as a pixel whose value is one and which has more than two one-value neighbors.The classify operator groups the pixels into equivalence The foreground equivalence classes are then re-described as higher-level objects, boundary segments, which are in turn aggregated into a new N-graph using a di erent neighborhood relation.Speci cally two boundary segments are neighbors if their minimum separation distance is less than a speci ed separation.Next, adjacent boundary segments which are colinear are grouped into equivalence classes, called contours.A contour represents the complete boundary of an object.Figure 10(c) shows the result of grouping.
We might want to check for impossible contours.A contour is legal if it is closed and not self-intersecting.Such conditions are expressed in a standard pattern language.Pairwise consistency rules can likewise be de ned.
The program written in the spatial aggregation language is shown in Figure 12 and Figure 13. 10 10.In the actual implementation of the language described by Kellogg, Zhao, and Yip (1996), the syntax of the operators di ers slightly from those in Section 3.

Related Work
The literature in visual and spatial reasoning is enormous (e.g., Kosslyn, 1994;Glasgow, 1993).In this section, we discuss only the computationally oriented approaches.
The rst line of work investigates how diagram-like representations aid heuristic search.Gelernter (1963) used diagrams in his geometry theorem prover to prune goals that are obviously false.Nevins' geometry theorem prover constrained forward deduction to conclude facts about objects explicitly depicted in the diagrams (Nevins, 1975).Stallman and Sussman (1977) exploited the connectivity and locality of lumped-parameter model to guide forward reasoning and implement symbolic constraint propagation.In a similar spirit, Larkin and Simon (1987) showed that in elementary mechanics problem a diagrammatic representation can reduce search because the diagram provides convenient indices for clustering objects and relations.
The second line of work concerns analogue simulations in naive physics.Funt's whisper program is the rst AI program that uses primarily perceptual primitives to predict dynamical events in a simple blocks world (Funt, 1980).Arguing that the commonsense predictions of solid or uid behavior cannot possibly depend on the solution of complicated equations, Gardin and Meltzer (1989) proposed a \molecular" simulation of strings and uids.A body of uid, for example, is decomposed into macro-molecules interacting with each other according to a small set of local rules.Chandrasekaran and Narayanan (1990) proposed a direct analogue simulation of the motion of a sliding block on an inclined plane.Their objective is to develop a cognitive architecture for visual perception and mental imagery.
The direct representation of a scene they propose consists of a hierarchical, multi-resolution symbol structure encoding spatial relations among objects, and is linked to an analogical representation of the scene (image).The major challenge in analogue simulation is how to provide a reliable simulation without incorporating extensive physics and geometrical modeling.
The third line of work consists of spatial reasoning research in qualitative physics.Kuipers and Levitt (1988) described an approach to spatial reasoning in robot navigation and mapping of large-scale spaces.They proposed a four-level hierarchical representation incorporating topological and metric descriptions in terms of entities such as places, paths, distances, and angles.Forbus et al. (1991)   while the place vocabulary is a quantization of the space according to task-speci c criteria (see also footnote 1).Comparing the spatial aggregation framework and the MD/PV framework, we note two major di erences.First, whereas a metric diagram is a mixed symbolic/quantitative representation, a eld is purely numerical and does not encode any structures explicitly.Second, our theory postulates multi-layer spatial aggregates with identical computational structure at each layer.By focusing on the eld ontology, which can be thought of as a special class of metric diagrams, we are able to emphasize the importance of the structure-recovery problem, and the commonalities underlying several implemented programs.

Conclusion
We have developed the spatial aggregation paradigm as a realization of imagistic reasoning.The paradigm systematizes the important task of interpreting time-varying information-rich elds.The paradigm consists of three ideas: (1) a eld ontology, an image-like analogue representation, as input, (2) structural discovery { the e cient transformation from pointwise eld representation to economical symbolic descriptions { as the central computational problem, and (3) a multi-layer neighborhood graph as the common interface and a small set of generic operators { aggregate, classify, redescribe, and search { as building blocks for computational processes that derive symbolic abstractions from the analogue representation.The paradigm relies on the important observations that the physical constraints on a real eld (such as continuity and conservation) provide useful equivalence relations and economical descriptions, and a nonlocal property of a lower layer can often be redescribed as a local property of a higher layer.
The spatial aggregation paradigm supports the recovery of abstract properties via the multi-layer neighborhood graphs.It produces concise descriptions by manipulating equivalence classes of objects as primitives.It constructs modular programs from generic operators by mixing and matching a library of commonly used routines.It expresses task-speci c knowledge in terms of eld metric, adjacency relations, consistency predicates, classi cation rules, and redescription properties.
To illustrate our theory, we examine the computational structure of three implemented programs { kam, maps, and hipair { that integrate symbolic, numerical, and visual reasoning.We show a small set of generic operators that construct, transform, lter, classify, and search neighborhood graphs capture the commonalities of these programs.We develop a language, a way of organizing programs around neighborhood graphs, to make programs written in this style clear.
We are currently developing a toolkit to support problem solving using the generic operators of the spatial aggregation paradigm.Many research questions are still open.Can the operators be interfaced with computational geometry and with numerical analysis to build robust, e cient programs?What scienti c problems can be solved by spatial aggregation?
Imagistic reasoning is a powerful strategy for mapping between analog signals generated by physical systems and discrete, symbolic representations of the systems.Spatial aggregation is only one of its many realizations.We believe that reasoning methods that derive their power primarily from perceptual operations on analog representations and only secondarily from search and analytical methods might prove e ective in automating commonsense reasoning as well.

Figure 1 :
Figure 1: Field as an ontological abstraction for reasoning about physical systems.The diagram depicts the relationships among di erent ontologies used in Qualitative Physics.The central computational problem in eld reasoning is the recovery of economical structural descriptions for qualitative behavior description and explanation.A key step in the structural recovery is the formation of equivalence classes.Identifying general and valid continuities on which useful equivalence class relations are based is an important scienti c contribution.

Figure 2 :
Figure2: A schematic representation of the computational structure for analysis of a eld ontology.There are multi-layers of spatial abstraction.An abstraction level is dened by the neighborhood graph, a data structure representing spatial aggregates and adjacency relations.The input eld is fed to the lowest abstraction layer.Note the identical computational structure on each layer.The aggregate operator computes adjacency relations based on a task-speci c metric.The neighborhood graph is the common interface for map and lter routines.The remaining operations correspond to the generic analysis tasks.A repertoire of task-independent geometric manipulation routines (which are not shown) are accessible by the generic operators.

Figure 3 :Figure 4 :Figure 5 :
Figure 3: Top: (a) The phase portrait of a Hamiltonian system.The geometric structures in the phase portrait can vary drastically as the system parameter A changes.Like an expert dynamicist, kam explores the dynamics of a nonlinear Hamiltonian system by nding interesting structures in the phase space.It decides what initial conditions and parameter values to try.It interprets what it nds and uses the structures it draws for itself to guide further exploration.Bottom: (b-1) The minimal spanning tree representation of a point set.(b-2) Magnifying the boxed region | crosses ( ) are inconsistent edges.Kam imposes adjacency relations on a point set representing a trajectory in phase space.The structure of the minimal spanning tree reveals the type of the trajectory.

Figure 6 :Figure 7 :
Figure6: The computational structure of maps viewed as spatial aggregation operators acting on neighborhood graphs.It has four layers of abstraction: stability regions, ow pipes, phase portrait, and ow pipe graph.Note the structural similarities between kam and maps.Control synthesis is implemented by the search and localize operators acting on the neighborhood graph representing the ow pipe graph.
(a).Figure11illustrates how the output in Figure 10(b) and (

Figure 9 :Figure 10 :
Figure 9: A line drawing of two overlapping objects.

Figure 11 :
Figure 11: Boundary tracing operation: data ow in the spatial aggregation implementation.
developed the Metric Diagram/Place Vocabulary theory.The metric diagram contains both numerical and symbolic descriptions of a scene,