Specific-to-General Learning for Temporal Events with Application to Learning Event Definitions from Video

We develop, analyze, and evaluate a novel, supervised, specific-to-general learner for a simple temporal logic and use the resulting algorithm to learn visual event definitions from video sequences. First, we introduce a simple, propositional, temporal, event-description language called AMA that is sufficiently expressive to represent many events yet sufficiently restrictive to support learning. We then give algorithms, along with lower and upper complexity bounds, for the subsumption and generalization problems for AMA formulas. We present a positive-examples--only specific-to-general learning method based on these algorithms. We also present a polynomial-time--computable ``syntactic'' subsumption test that implies semantic subsumption without being equivalent to it. A generalization algorithm based on syntactic subsumption can be used in place of semantic generalization to improve the asymptotic complexity of the resulting learning algorithm. Finally, we apply this algorithm to the task of learning relational event definitions from video and show that it yields definitions that are competitive with hand-coded ones.


Introduction
Humans conceptualize the world in terms of objects and events. This is reflected in the fact that we talk about the world using nouns and verbs. We perceive events taking place between objects, we interact with the world by performing events on objects, and we reason about the effects that actual and hypothetical events performed by us and others have on objects. We also learn new object and event types from novel experience. In this paper, we present and evaluate novel implemented techniques that allow a computer to learn new event types from examples. We show results from an application of these techniques to learning new event types from automatically constructed relational, force-dynamic descriptions of video sequences.
We wish the acquired knowledge of event types to support multiple modalities. Humans can observe someone faxing a letter for the first time and quickly be able to recognize future occurrences of faxing, perform faxing, and reason about faxing. It thus appears likely that humans use and learn event representations that are sufficiently general to support fast and efficient use in multiple modalities. A long-term goal of our research is to allow similar cross-modal learning and use of event representations. We intend the same learned representations to be used for vision (as described in this paper), planning (something that we are beginning to investigate), and robotics (something left to the future).
A crucial requirement for event representations is that they capture the invariants of an event type. Humans classify both picking up a cup off a table and picking up a dumbbell off the floor as picking up. This suggests that human event representations are relational: we have an abstract relational notion of picking up that is parameterized by the participant objects rather than distinct propositional notions instantiated for specific objects. Humans also classify an event as picking up no matter whether the hand is moving slowly or quickly, horizontally or vertically, leftward or rightward, or along a straight path or a circuitous one. It appears that it is not the characteristics of participant-object motion that distinguish picking up from other event types. Rather, it is the fact that the object being picked up changes from being supported by resting on its initial location to being supported by being grasped by the agent. This suggests that the primitive relations used to build event representations are force dynamic (Talmy, 1988).
Another desirable property of event representations is that they be perspicuous. Humans can introspect and describe the defining characteristics of event types. Such introspection is what allows us to create dictionaries. To support such introspection, we prefer a representation language that allows such characteristics to be explicitly manifest in event definitions, not emergent consequences of distributed parameters as in neural networks or hidden Markov models.
We develop a learner for an event representation possessing these desired characteristics as follows. First, we present a simple, propositional, temporal logic called AMA that is a sublanguage of a variety of familiar temporal languages (e.g., linear temporal logic, or LTL (Bacchus & Kabanza, 2000), and temporal event logic (Siskind, 2001)). This logic is expressive enough to describe a variety of interesting temporal events, but restrictive enough to support an effective learner, as we demonstrate below. We proceed to develop a specific-to-general learner for the AMA logic by giving algorithms and complexity bounds for the subsumption and generalization problems involving AMA formulas. While we show that semantic subsumption is intractable, we provide a weaker syntactic notion of subsumption that implies semantic subsumption but can be checked in polynomial time. Our implemented learner is based upon this syntactic subsumption.
We next show how to adapt this (propositional) AMA learner to learn relational concepts. We evaluate the resulting relational learner in a complete system for learning force-dynamic event definitions from positive-only training examples given as real video sequences. This is not the first system to perform visual event recognition from video; we review prior work and compare it to the current work later in the paper. In fact, two such prior systems have been built. HOWARD (Siskind & Morris, 1996) learns to classify events from video using temporal, relational representations, but these representations are not force dynamic. LEONARD (Siskind, 2001) classifies events from video using temporal, relational, force-dynamic representations but does not learn these representations; it uses a library of hand-coded representations. This work adds a learning component to LEONARD, essentially duplicating the performance of the hand-coded definitions automatically.
While we have demonstrated the utility of our learner in the visual event learning domain, we note that there are many domains where interesting concepts take the form of structured temporal sequences of events. In machine planning, macro-actions represent useful temporal patterns of action. In computer security, typical application behavior, represented perhaps as temporal patterns of system calls, must be differentiated from compromised application behavior (and likewise authorized user behavior from intrusive behavior).
In what follows, Section 2 introduces our application domain of recognizing visual events. Section 3 describes the high-level construction of our learner. Section 4 introduces the AMA language, its syntax and semantics, and several concepts needed in our analysis of the language. Section 5 develops and analyzes algorithms for the subsumption and generalization problems in the language and introduces the more practical notion of "syntactic subsumption". Section 6 extends the basic propositional learner to handle relational data and negation, and to control exponential run-time growth. Section 7 presents our results on visual event learning, and Sections 8 and 9 compare to related work and conclude.

Recognizing Visual Events
LEONARD (Siskind, 2001) is a system for recognizing visual events from video camera input; an example of a simple visual event is "a hand picking up a block". This research was originally motivated by the problem of adding a learning component to LEONARD. Below we briefly describe the system and the framework for extending LEONARD to learn to recognize events.
LEONARD is a three-stage pipeline depicted in Figure 1. The raw input consists of a video-frame sequence depicting events. First, a segmentation-and-tracking component transforms this input into a polygon movie: a sequence of frames, each frame being a set of convex polygons placed around the tracked objects in the video. Figure 2a shows a partial video sequence of a PICKUP event overlaid with the corresponding polygon movie. Next, a model-reconstruction component transforms the polygon movie into a force-dynamic model. This model describes the changing support, contact, and attachment relations between the tracked objects over time. Figure 2b shows a visual depiction of the force-dynamic model corresponding to the PICKUP event. Finally, an event-recognition component armed with a library of event definitions determines which events occurred in the model and, accordingly, in the video. Figure 2c shows the text output and input of the event recognizer for the PICKUP event. The first line corresponds to the output, which indicates the interval(s) where a PICKUP occurred; the remaining lines are the text encoding of the event-recognizer input (the model-reconstruction output), indicating the time intervals in which various force-dynamic relations are true in the video. Figure 1 also depicts the event-learning component described in this paper: the input to the learning component consists of training models of a target event (e.g., movies of PICKUP events) and the output is an event definition (e.g., a temporal logic formula defining PICKUP).
The event-recognition component of LEONARD represents event types with event logic formulas like the following simplified example, representing x picking up y off of z:

PICKUP(x, y, z) := (SUPPORTS(z, y) ∧ CONTACTS(z, y)); (SUPPORTS(x, y) ∧ ATTACHED(x, y))

This formula asserts that an event of x picking up y off of z is defined as a sequence of two states, where z supports y by way of contact in the first state and x supports y by way of attachment in the second state. SUPPORTS, CONTACTS, and ATTACHED are primitive force-dynamic relations. This formula is a specific example of the more general class of AMA formulas that we use in our learning, presented later in Section 4.
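To make the shape of such a definition concrete, the two-state structure above can be written down as plain data. The following Python sketch is our own illustrative encoding, not LEONARD's internal representation: an MA timeline is a list of states, and each state is a set of ground force-dynamic atoms.

```python
# Illustrative encoding (not LEONARD's internals): an MA timeline is a list
# of states, each state a frozenset of ground force-dynamic atoms.
def pickup_timeline(x, y, z):
    """Two-state timeline for 'x picks up y off of z'."""
    before = frozenset({("SUPPORTS", z, y), ("CONTACTS", z, y)})
    after = frozenset({("SUPPORTS", x, y), ("ATTACHED", x, y)})
    return [before, after]

timeline = pickup_timeline("hand", "block", "table")
print(len(timeline))  # 2
```

Note that the timeline is parameterized by the participant objects, matching the relational character of the definition.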
Prior to the work reported in this paper, the definitions in LEONARD's event-recognition library were hand coded. Here, we add a learning component to LEONARD so that it can learn to recognize events. Figure 1 shows how the learning component fits into the overall system. The input to the learning component consists of force-dynamic models from the model-reconstruction stage, and its output consists of event definitions that are used by the event recognizer. We take a supervised learning approach in which the force-dynamic model-reconstruction process is applied to training videos of a target event type; the resulting force-dynamic models are then given to the learner, which induces a candidate definition of the event type. Note that our learning component does not require negative examples of the event type (i.e., movies depicting non-occurrences).

Bottom-up Learning from Positive Data
In this work we present a specific-to-general positive-only learner for temporal events: our learning algorithm is given only positive training examples (where the target event occurs) and is not given negative examples (where the target event does not occur). The positive-only setting is of interest because humans appear able to learn many event definitions given primarily or only positive examples. From a practical standpoint, a positive-only learner removes the often difficult task of collecting negative examples that are "representative" of what is not the event to be learned.
A typical learning domain specifies an example space (the objects we wish to classify) and a concept language (formulas, each of which represents the set of examples it covers). Generally we say a concept C1 is more general (less specific) than C2 if and only if C2 is a subset of C1; alternatively, a generality relation that may not be equivalent to the subset relation may be specified, often for computational reasons. Setting the goal of finding a concept consistent with a set of positive-only training data generally results in the trivial solution of returning the most general concept in the language. To avoid adding negative training data, it is common to specify the learning goal as finding the least-general concept that covers all of the data. With enough data and an appropriate concept language, the least-general concept often converges usefully. We take a standard specific-to-general machine-learning approach to finding the least-general concept covering a set of positive examples. Assume we have a concept language L and an example space S. The approach relies on the computation of two functions: the least-general covering formula (LGCF) of an example and the least-general generalization (LGG) of a set of formulas. An LGCF in L of an example in S is a formula in L that covers the example such that no other covering formula is strictly less general. Intuitively, the LGCF of an example, if unique, is the "most representative" formula in L of that example. An LGG of any subset of L is a formula more general than each formula in the subset and not strictly more general than any other such formula. Neither the LGG nor the LGCF is guaranteed to exist or be unique; these properties must be shown for any language of interest.
Given the existence and uniqueness (up to concept equivalence) of the LGCF and LGG, the specific-to-general approach proceeds as follows. First, use the LGCF to transform each positive training instance into a formula of L. Second, return the LGG of the resulting formulas. The returned formula represents the least-general concept in L that covers all the positive training examples. This learning approach has been pursued for a variety of concept languages, including clausal first-order logic (Plotkin, 1971), definite clauses (Muggleton & Feng, 1992), and description logic (Cohen & Hirsh, 1994). It is important to choose an appropriate concept language as a bias for this learning approach, or the concept returned may simply be (or resemble) one of two extremes: either the disjunction of the training data or the universal concept.
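The two-step recipe above is language-independent, so it can be sketched generically. The following Python fragment is a minimal illustration (not the paper's implementation), assuming user-supplied lgcf and lgg functions; the toy instantiation uses proposition sets as concepts, where the LGCF of an example is the example itself and the LGG of two concepts is their intersection.

```python
# Generic specific-to-general loop: cover each example, then fold the LGG
# over the resulting formulas.
from functools import reduce

def learn(examples, lgcf, lgg):
    """Return the least-general formula covering all positive examples."""
    formulas = [lgcf(e) for e in examples]  # step 1: LGCF per example
    return reduce(lgg, formulas)            # step 2: LGG of all formulas

# Toy instantiation: concepts are proposition sets; a concept covers an
# example iff the example contains every proposition in the concept.
result = learn([{"a", "b", "c"}, {"a", "b", "d"}],
               lgcf=lambda e: set(e),
               lgg=lambda f, g: f & g)
print(sorted(result))  # ['a', 'b']
```

With more training examples, the intersection (and, in richer languages, the LGG generally) can only stay the same or become more general, which is the convergence behavior described above.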
In this work, the concept language is the AMA temporal event logic presented below and the example space is the set of all models of that logic. Intuitively, a training example depicts a model where a target event occurs. (The models can be thought of as movies.) We will consider two notions of generalization for AMA concepts (semantic generalization and a weaker syntactic counterpart) and, under both notions, study the properties and computation of the LGCF and LGG.

Representing Events with AMA
We study a subset of an interval-based logic called event logic (Siskind, 2001) utilized by LEONARD for event recognition in video sequences. This logic is "interval-based" in that it explicitly represents each of the possible interval relationships given originally by Allen (1983) in his calculus of interval relations (e.g., "overlaps", "meets", "during"). Event logic formulas allow the definition of event types, which can specify static properties of intervals directly and dynamic properties by hierarchically relating sub-intervals using the Allen relations. In this paper the formal syntax and semantics of full event logic are needed only for Proposition 4 and are given in Appendix A.
Here we restrict our attention to a much simpler subset of event logic that we call AMA, defined below. We believe that our choice of event logic rather than first-order logic, as well as our restriction to the AMA fragment of event logic, provides a useful learning bias by ruling out a large number of practically useless concepts while maintaining substantial expressive power. The practical utility of this bias is demonstrated via our empirical results in the visual event recognition application. AMA can also be seen as a restriction of LTL to conjunction and "until", with similar motivations. Below we present the syntax and semantics of AMA along with some of the key technical properties of AMA that will be used throughout this paper.

AMA Syntax and Semantics
It is natural to describe temporal events by specifying a sequence of properties that must hold over consecutive time intervals; e.g., "a hand picking up a block" might become "the block is not supported by the hand and then the block is supported by the hand." We represent such sequences with MA timelines, which are sequences of conjunctive state restrictions. Intuitively, an MA timeline is given by a sequence of propositional conjunctions, separated by semicolons, and is taken to represent the set of events that temporally match the sequence of consecutive conjunctions. An AMA formula is then the conjunction of a number of MA timelines, representing events that can be simultaneously viewed as satisfying each of the conjoined timelines:

state ::= true | prop | prop ∧ state
MA ::= state | state; MA
AMA ::= MA | MA ∧ AMA

where prop is any primitive proposition (sometimes called a primitive event-type). We take this grammar to formally define the terms "MA timeline", "MA formula", "AMA formula", and "state". A k-MA formula is an MA formula with at most k states, and a k-AMA formula is an AMA formula all of whose MA timelines are k-MA timelines. We often treat states as proposition sets (with true the empty set) and AMA formulas as sets of MA timelines. We may also treat MA formulas as sets of states; it is important to note, however, that MA formulas may contain duplicate states, and the duplication can be significant. For this reason, when treating MA timelines as sets, we formally intend sets of state-index pairs (where the index gives a state's position in the formula). We do not indicate this explicitly to avoid encumbering our notation, but the implicit index must be remembered whenever handling duplicate states.
The semantics of AMA formulas is defined in terms of temporal models. A temporal model M = ⟨M, I⟩ over the set of propositions PROP is a pair of a mapping M from the natural numbers (representing time) to the truth assignments over PROP, and a closed natural-number interval I. We note that Siskind (2001) gives a continuous-time semantics for event logic where the models are defined in terms of real-valued time intervals. The temporal models defined here use discrete natural-number time indices; however, our results still apply under the continuous-time semantics (that semantics bounds the number of state changes in the continuous timeline to a countable number). It is important to note that the natural numbers in the domain of M represent time discretely, but there is no prescribed unit of continuous time represented by each natural number. Instead, each number represents an arbitrarily long period of continuous time during which nothing changed. Similarly, the "states" in our MA timelines represent arbitrarily long periods of time during which the conjunctive restriction given by the state holds. The satisfiability relation for AMA formulas is given as follows: ⟨M, I⟩ satisfies a state s iff s is true under the truth assignment M[t] for every t in I; ⟨M, I⟩ satisfies an MA timeline s1; s2; · · · ; sn iff there is some t in I such that ⟨M, [begin(I), t]⟩ satisfies s1 and either ⟨M, [t, end(I)]⟩ or ⟨M, [t + 1, end(I)]⟩ satisfies s2; · · · ; sn; and ⟨M, I⟩ satisfies an AMA formula iff it satisfies each of the conjoined MA timelines. The condition defining satisfaction for MA timelines may appear unintuitive at first, due to the fact that there are two ways that s2; · · · ; sn can be satisfied. The reason for this becomes clear by recalling that we are using the natural numbers to represent continuous time intervals. Intuitively, from a continuous-time perspective, an MA timeline is satisfied if there are consecutive continuous-time intervals satisfying the sequence of consecutive states of the MA timeline. The transition between consecutive states si and si+1 can occur either within an interval of constant truth assignment (that happens to satisfy both states) or exactly at the boundary of two time intervals of constant truth value. In the above definition, these cases correspond to satisfying the remainder of the timeline over [t, end(I)] and over [t + 1, end(I)], respectively. Later we show that the MA-projection of a model can be viewed as "representing" that model in a precise sense.
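The discrete-time satisfaction test for MA timelines can be read off directly from this definition. The following Python sketch is our illustrative re-implementation (not the system's code), assuming a model given as a list of frozensets, where model[t] holds the propositions true at time t; the two transition cases ([t, end] and [t+1, end]) appear explicitly.

```python
# A state (a frozenset of propositions) holds over [lo, hi] iff it is a
# subset of every truth assignment in that interval.
def sat_state(model, lo, hi, state):
    return all(state <= model[t] for t in range(lo, hi + 1))

def sat_ma(model, lo, hi, timeline):
    """Does <model, [lo, hi]> satisfy the MA timeline s1; ...; sn?"""
    s, rest = timeline[0], timeline[1:]
    if not rest:
        return sat_state(model, lo, hi, s)
    for t in range(lo, hi + 1):
        if sat_state(model, lo, t, s):
            # transition within an interval of constant truth (t shared) ...
            if sat_ma(model, t, hi, rest):
                return True
            # ... or exactly at the boundary of two intervals (t + 1)
            if t + 1 <= hi and sat_ma(model, t + 1, hi, rest):
                return True
    return False

model = [frozenset({"A"}), frozenset({"A", "B"}), frozenset({"B"})]
print(sat_ma(model, 0, 2, [frozenset({"A"}), frozenset({"B"})]))  # True
```

In the example, A; B is satisfied because A holds over [0, 1] and B over [1, 2]; the shared time point 1 is exactly the "transition within a constant interval" case.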
The following two examples illustrate some basic behaviors of AMA formulas.
Example 2 (Infinite Descending Chains) Given propositions A and B, the MA timeline Φ = (A ∧ B) is subsumed by each of the formulas A; B, A; B; A; B, A; B; A; B; A; B, . . . . This is intuitively clear when our semantics is viewed from a continuous-time perspective: any interval in which both A and B are true can be broken up into an arbitrary number of subintervals where both A and B hold. This example illustrates that there can be infinite descending chains of AMA formulas where the entire chain subsumes a given formula (but no member is equivalent to the given formula). In general, any AMA formula involving only the propositions A and B will subsume Φ.

Motivation for AMA
MA timelines are a very natural way to capture "stretchable" sequences of state constraints. But why consider the conjunction of such sequences, i.e., AMA? We have several reasons for this language enrichment. First of all, we show below that the AMA least-general generalization (LGG) is unique; this is not true for MA. Second, and more informally, we argue that parallel conjunctive constraints can be important to learning efficiency. In particular, the space of MA formulas of length k grows exponentially with k, making it difficult to induce long MA formulas. However, finding several shorter MA timelines that each characterize part of a long sequence of changes is exponentially easier. (At least, the space to search is exponentially smaller.) The AMA conjunction of these timelines places these shorter constraints simultaneously and often captures a great deal of the concept structure. For this reason, we analyze AMA as well as MA and, in our empirical work, we consider k-AMA.
The AMA language is propositional, but our intended applications are relational, or first-order, including visual event recognition. Later in this paper we show that the propositional AMA learning algorithms we develop can be effectively applied in relational domains. Our approach to first-order learning is distinctive in automatically constructing an object correspondence across examples (e.g., compare (Lavrac, Dzeroski, & Grobelnik, 1991; Roth & Yih, 2001)). Similarly, though AMA does not allow for negative state constraints, we show how to obtain the practical advantages of negation, which is crucial in visual event recognition.

Conversion to First-Order Clauses
We note that AMA formulas can be translated in various ways into first-order clauses. It is not straightforward, however, to then use existing clausal generalization techniques for learning. In particular, to capture the AMA semantics in clauses, it appears necessary to define subsumption and generalization relative to a background theory that restricts us to a "continuous-time" first-order model space. For example, consider the AMA formulas Φ1 = (A ∧ B) and Φ2 = A; B, where A and B are propositions; from Example 2 we know that Φ1 ≤ Φ2. Now, consider a straightforward clausal translation of these formulas, giving C1 = A(I) ∧ B(I) and C2 = A(I1) ∧ B(I2) ∧ MEETS(I1, I2) ∧ SPAN(I1, I2, I), where I, I1, and I2 are variables that represent time intervals, MEETS indicates that two time intervals meet each other, and SPAN indicates that the union of the first two time-interval arguments equals the third time-interval argument. The intention is for satisfying assignments to I in C1 and C2 to indicate intervals over which Φ1 and Φ2 are satisfied, respectively. It should be clear that, contrary to what we want, C1 is not subsumed by C2 (i.e., C1 does not entail C2), since it is easy to find "unintended" first-order models that satisfy C1 but not C2. Thus such a translation, and other similar translations, do not capture the continuous-time nature of the AMA semantics.
In order to capture the AMA semantics in a clausal setting, one might define a first-order theory that restricts us to "continuous-time" models, for example, allowing for the derivation "if a property holds over an interval, then that property also holds over all sub-intervals". Given such a theory Σ, we have that C1 entails C2 relative to Σ, as desired. However, it is well known that least-general generalizations relative to such background theories need not exist (Plotkin, 1971), so prior work on clausal generalization does not simply subsume our results for the AMA language.
We note that for a particular training set, it may be possible to compile a "continuous-time" background theory Σ into a finite but "adequate" set of ground facts. Relative to such ground theories, clausal LGGs are known to always exist and thus could be used for our application. However, the only such compiling approaches that look promising to us require exploiting an analysis similar to the one given in this paper, i.e., understanding the AMA generalization and subsumption problems separately from clausal generalization and exploiting that understanding in compiling the background theory. We have not pursued such compilations further.
Even if we were given such a compilation procedure, there are other problems with using existing clausal generalization techniques for learning AMA formulas. For the clausal translations of AMA we have found, the resulting generalizations typically fall outside of the (clausal translations of formulas in the) AMA language, so the language bias of AMA is lost. In preliminary empirical work in our video-event recognition domain using clausal inductive logic programming (ILP) systems, we found that the learner appeared to lack the necessary language bias to find effective event definitions. While we believe it would be possible to find ways to build this language bias into ILP systems, we chose instead to define and learn within the desired language bias directly, by defining the class of AMA formulas and studying the generalization operation on that class.

Basic Concepts and Properties of AMA
We use the following convention in naming our results: "propositions" and "theorems" are the key results of our work, with theorems being the results of the most technical difficulty, and "lemmas" are technical results needed for the later proofs of propositions or theorems. We number all the results in one sequence, regardless of type. Proofs of theorems and propositions are provided in the main text; omitted proofs of lemmas are provided in the appendix.
We give pseudo-code for our methods in a non-deterministic style. In a non-deterministic language, functions can return more than one value "non-deterministically", either because they contain non-deterministic choice points or because they call other non-deterministic functions. Since a non-deterministic function can return more than one possible value, depending on the choices made at the choice points encountered, specifying such a function is a natural way to specify a richly structured set (if the function has no arguments) or relation (if the function has arguments). To actually enumerate the values of the set (or the relation, once arguments are provided), one simply adds a standard backtracking search over the different possible computations corresponding to different choices at the choice points.
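One concrete way to realize this style (an assumption of this sketch, not something prescribed by the paper) is to write each choice point as a Python generator: iterating over the generator then performs exactly the backtracking enumeration described above. As an example, here is a generator version of a non-deterministic "choose a non-empty subset" primitive.

```python
# Each non-deterministic choice point becomes a generator; enumerating all
# computations is then ordinary iteration (the backtracking is implicit).
def a_non_empty_subset_of(items):
    """Non-deterministically choose a non-empty subset of items."""
    items = list(items)
    for mask in range(1, 2 ** len(items)):  # every non-empty bit pattern
        yield {x for k, x in enumerate(items) if mask >> k & 1}

print(sorted(map(sorted, a_non_empty_subset_of({"p", "q"}))))
# [['p'], ['p', 'q'], ['q']]
```

Composing such generators (e.g., with nested loops) enumerates all computations of a non-deterministic function, which is the search the paper alludes to.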

SUBSUMPTION AND GENERALIZATION FOR STATES
The most basic formulas we deal with are states (conjunctions of propositions); in our propositional setting, computing subsumption and generalization at the state level is straightforward. A state S1 subsumes a state S2 (written S2 ≤ S1) iff S1 is a subset of S2, viewing states as sets of propositions. From this we derive that the intersection of states is the least-general subsumer of those states and that the union of states is likewise the most general subsumee.
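Viewing states as proposition sets, these three operations are literally set operations. The following Python lines are an illustrative encoding (not the paper's implementation) of state subsumption, the least-general subsumer, and the most general subsumee.

```python
# States as frozensets of propositions.
def state_subsumes(s1, s2):
    """s2 <= s1: s1 is more general, i.e., s1 is a subset of s2."""
    return s1 <= s2

def state_lgg(s1, s2):
    """Least-general state subsuming both: the intersection."""
    return s1 & s2

def state_mgs(s1, s2):
    """Most general state subsumed by both: the union."""
    return s1 | s2

a = frozenset({"SUPPORTS", "CONTACTS"})
b = frozenset({"SUPPORTS", "ATTACHED"})
print(sorted(state_lgg(a, b)))  # ['SUPPORTS']
```

Note the direction of the subset test: a state with fewer conjuncts constrains less and is therefore more general.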

INTERDIGITATIONS
Given a set of MA timelines, we need to consider the different ways in which a model could simultaneously satisfy the timelines in the set. At the start of such a model (i.e., the first time point), the initial state of each timeline must be satisfied. At some time point in the model, one or more of the timelines can transition, so that the second state in those timelines must be satisfied in place of the initial state, while the initial state of the other timelines remains satisfied. After a sequence of such transitions in subsets of the timelines, the final state of each timeline holds. Each way of choosing the transition sequence constitutes a different "interdigitation" of the timelines.
Alternatively viewed, each model simultaneously satisfying the timelines induces a co-occurrence relation on tuples of timeline states, one from each timeline, identifying which tuples co-occur at some point in the model. We represent this concept formally as a set of tuples of co-occurring states, i.e., a co-occurrence relation. We sometimes think of this set of tuples as ordered by the sequence of transitions. Intuitively, the tuples in an interdigitation represent the maximal time intervals over which no MA timeline has a transition, giving the co-occurring states for each such time interval.
Definition 1 An interdigitation I of a set of MA timelines {Φ1, . . . , Φn} is a co-occurrence relation over Φ1 × · · · × Φn (viewing timelines as sets of states) that is piecewise total and simultaneously consistent with the state orderings of the Φi. We say that two states s ∈ Φi and s' ∈ Φj, for i ≠ j, co-occur in I iff some tuple of I contains both s and s'. We sometimes refer to I as a sequence of tuples, meaning the sequence lexicographically ordered by the Φi state orderings.
We note that there are exponentially many interdigitations of even two MA timelines (relative to the total number of states in the timelines). Example 3 below shows an interdigitation of two MA timelines, and pseudo-code for non-deterministically generating an arbitrary interdigitation for a set of MA timelines can be found in Figure 3. Given an interdigitation I of the timelines s1; s2; · · · ; sm and t1; t2; · · · ; tn (and possibly others), the following basic properties of interdigitations are easily verifiable: 1. co-occurrence respects the state orderings, i.e., if si and tj co-occur in I, then no later state si' (with i' > i) co-occurs in I with an earlier state tj' (with j' < j); and 2. I(s1, t1) and I(sm, tn), i.e., the initial states all co-occur, as do the final states.

Figure 3: Pseudo-code for an-interdigitation(), which non-deterministically computes an interdigitation for a set Φ1, . . . , Φn of MA timelines. The functions head(Φ) and rest(Φ) return the first state in the timeline Φ and Φ with the first state removed, respectively, and extend-tuple(x, I) extends a tuple I by adding a new first element x to form a longer tuple. The function a-non-empty-subset-of(S) non-deterministically returns an arbitrary non-empty subset of S.
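In place of the non-deterministic pseudo-code, interdigitations can be enumerated deterministically. The sketch below is our own illustration, restricted to two timelines for brevity: it represents an interdigitation as a monotone sequence of co-occurring state-index pairs, where each step advances a non-empty subset of the timelines.

```python
# Enumerate all interdigitations of two MA timelines with m and n states,
# as index-pair sequences from (0, 0) to (m-1, n-1).
def interdigitations(m, n):
    def extend(i, j):
        if i == m - 1 and j == n - 1:
            yield [(i, j)]
            return
        steps = []
        if i + 1 < m:
            steps.append((i + 1, j))      # only the first timeline transitions
        if j + 1 < n:
            steps.append((i, j + 1))      # only the second transitions
        if i + 1 < m and j + 1 < n:
            steps.append((i + 1, j + 1))  # both transition together
        for ni, nj in steps:
            for tail in extend(ni, nj):
                yield [(i, j)] + tail
    return extend(0, 0)

# Even two 2-state timelines already admit three interdigitations:
print(len(list(interdigitations(2, 2))))  # 3
```

The count of such monotone paths grows exponentially in the number of states, matching the exponential bound stated above.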
For the forward direction, assume that Φ1 ≤ Φ2, and let M be any model such that Φ1 = MAP(M). It is clear that such an M exists and satisfies Φ1. It follows that M satisfies Φ2, and Lemma 1 then implies that there is a witnessing interdigitation for MAP(M) ≤ Φ2 and thus for Φ1 ≤ Φ2. ∎

LEAST-GENERAL COVERING FORMULA
A logic can discriminate between two models if it contains a formula that is satisfied by one but not the other. It turns out that AMA formulas can discriminate two models exactly when the much richer internal positive event logic (IPEL) formulas can do so. Internal formulas are those that define event occurrence only in terms of properties within the defining interval (i.e., satisfaction by ⟨M, I⟩ depends only on the proposition truth values given by M inside the interval I); positive formulas are those that do not contain negation. Appendix A gives the full syntax and semantics of IPEL (which are used only to state and prove Lemma 3). The fact that AMA can discriminate models as well as IPEL indicates that our restriction to AMA formulas retains substantial expressive power, and it leads to the following result, which serves as the least-general covering formula (LGCF) component of our specific-to-general learning procedure. First, we introduce the concept of model embedding. We say that model M embeds model M' iff MAP(M) ≤ MAP(M'). Lemma 3 For any E in IPEL, if model M embeds any model that satisfies E, then M satisfies E.
Proposition 4 The MA-projection of a model is its LGCF for internal positive event logic (and hence for AMA), up to semantic equivalence.
Proof: Consider model M. We know that MAP(M) covers M, so it remains to show that MAP(M) is the least general formula to do so, up to semantic equivalence.
Let E be any IPEL formula that covers M. Let M' be any model that is covered by MAP(M); we want to show that E also covers M'. We know from Lemma 1 that there is a witnessing interdigitation for MAP(M') ≤ MAP(M); thus, by Proposition 2, MAP(M') ≤ MAP(M), showing that M' embeds M. Combining these facts with Lemma 3, it follows that E also covers M' and hence MAP(M) ≤ E. ∎ Proposition 4 tells us that for IPEL the LGCF of a model exists, is unique, and is an MA timeline. Given this property, when an AMA formula Ψ covers all the MA timelines covered by another AMA formula Ψ', we have Ψ' ≤ Ψ. Thus, for the remainder of this paper, when considering subsumption between formulas we can abstract away from temporal models and deal rather with MA timelines. Proposition 4 also tells us that we can compute the LGCF of a model by constructing the MA-projection of that model. Based on the definition of MA-projection, it is straightforward to derive an LGCF algorithm that runs in time polynomial in the size of the model. We note that the MA-projection may contain repeated states; in practice we remove repeated states, which does not change the meaning of the resulting formula (as demonstrated in Example 1).
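Since the MA-projection just reads off the model's successive truth assignments (collapsing adjacent repeats, which does not change the formula's meaning), the LGCF computation is a single linear pass. The following Python fragment is an illustrative sketch under that reading, with the model again encoded as a list of frozensets of true propositions.

```python
# LGCF via MA-projection: the timeline of the model's successive truth
# assignments, with adjacent duplicate states collapsed.
def ma_projection(model):
    """model: list of frozensets, one truth assignment per time point."""
    timeline = []
    for state in model:
        if not timeline or timeline[-1] != state:
            timeline.append(state)
    return timeline

model = [frozenset({"A"}), frozenset({"A"}), frozenset({"A", "B"})]
print(len(ma_projection(model)))  # 2: the repeated {A} state collapses
```

The pass is clearly polynomial (indeed linear) in the size of the model, consistent with the complexity claim above.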

COMBINING INTERDIGITATION WITH GENERALIZATION OR SPECIALIZATION
Interdigitations are useful in analyzing both conjunctions and disjunctions of MA timelines. When conjoining a set of timelines, any model of the conjunction induces an interdigitation of the timelines such that co-occurring states simultaneously hold in the model at some point (viewing states as sets, the union of the co-occurring states must hold). By constructing an interdigitation and taking the union of each tuple of co-occurring states to get a sequence of states, we get an MA timeline that forces the conjunction of the timelines to hold. We call such a sequence an "interdigitation specialization" of the timelines. Dually, an "interdigitation generalization" involving intersections of states gives an MA timeline that holds whenever the disjunction of a set of timelines holds.
Definition 2 An interdigitation generalization (specialization) of a set Σ of MA timelines is an MA timeline s_1; ...; s_m such that, for some interdigitation I of Σ with m tuples, s_j is the intersection (respectively, union) of the components of the j'th tuple of the sequence I. The set of interdigitation generalizations (respectively, specializations) of Σ is called IG(Σ) (respectively, IS(Σ)).
Each timeline in IG(Σ) (dually, IS(Σ)) subsumes (is subsumed by) each timeline in Σ; this is easily verified using Proposition 2. For our complexity analyses, we note that the number of states in any member of IG(Σ) or IS(Σ) is lower-bounded by the number of states in any one of the MA timelines in Σ and is upper-bounded by the total number of states in all the MA timelines in Σ. The number of interdigitations of Σ, and thus the number of members of IG(Σ) or IS(Σ), is exponential in that same total number of states. The algorithms we present later for computing LGGs require the computation of both IG(Σ) and IS(Σ). Figure 4 gives pseudo-code for the function an-IG-member, which non-deterministically computes an arbitrary member of IG(Σ); an-IS-member is identical except that intersection is replaced by union. Given a set of MA timelines Σ, we can compute IG(Σ) by executing all possible deterministic computation paths of the function call an-IG-member(Σ), i.e., by computing the set of results obtainable from the non-deterministic function for all possible decisions at its non-deterministic choice points.

an-IG-member(Φ_1, Φ_2, ..., Φ_n)
  // The Φ_i are MA timelines.
  // Outputs a member of IG({Φ_1, Φ_2, ..., Φ_n}).
  RETURN map(intersect-tuple, an-interdigitation(Φ_1, ..., Φ_n));

Figure 4: Pseudo-code for an-IG-member, which non-deterministically computes a member of IG(T), where T is a sequence of MA timelines. The function intersect-tuple(I) takes a tuple I of sets as its argument and returns their intersection. The higher-order function map(f, I) takes a function f and a tuple I as arguments and returns a tuple of the same length as I, obtained by applying f to each element of I and making a tuple of the results.
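A deterministic enumeration of all interdigitations can be sketched as follows, restricted to two timelines for clarity (the n-timeline case is analogous). Timelines are represented here as tuples of frozensets of propositions, an assumption we make throughout these sketches.

```python
def interdigitations(t1, t2):
    """Enumerate the interdigitations of two MA timelines (tuples of
    frozensets). Each interdigitation pairs the states of the two
    timelines along a monotone "staircase" that starts at the first
    states, ends at the last states, and advances one or both timelines
    at every step; it is yielded as a list of co-occurring state pairs."""
    def walk(i, j, pairs):
        pairs = pairs + [(t1[i], t2[j])]
        if i == len(t1) - 1 and j == len(t2) - 1:
            yield pairs
            return
        for di, dj in ((1, 0), (0, 1), (1, 1)):  # advance t1, t2, or both
            if i + di < len(t1) and j + dj < len(t2):
                yield from walk(i + di, j + dj, pairs)
    yield from walk(0, 0, [])

def IG(t1, t2):
    """Interdigitation generalizations: intersect co-occurring states."""
    return {tuple(s & t for s, t in p) for p in interdigitations(t1, t2)}

def IS(t1, t2):
    """Interdigitation specializations: union co-occurring states."""
    return {tuple(s | t for s, t in p) for p in interdigitations(t1, t2)}
```

For example, the timelines A; B and B; A have exactly three interdigitations, so IS yields the three timelines (A∧B); (A∧B), then (A∧B); A; (A∧B), and (A∧B); B; (A∧B).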
We now give a useful lemma and a proposition concerning the relationships between conjunctions and disjunctions of MA concepts (the former being AMA concepts).For convenience here, we use disjunction on MA concepts, producing formulas outside of AMA with the obvious interpretation.
Lemma 5 Given an MA formula Φ that subsumes each member of a set Σ of MA formulas, Φ also subsumes some member Φ' of IG(Σ). Dually, when Φ is subsumed by each member of Σ, Φ is also subsumed by some member Φ' of IS(Σ). In each case, the length of Φ' is bounded by the size of Σ.
Proposition 6 The following hold:

1. (and-to-or) The conjunction of a set Σ of MA timelines equals the disjunction of the timelines in IS(Σ).

2. (or-to-and) The disjunction of a set Σ of MA timelines is subsumed by the conjunction of the timelines in IG(Σ).
Proof: To prove part 2, recall that for any Φ ∈ Σ and any Φ' ∈ IG(Σ) we have Φ ≤ Φ'. From this it is immediate that (∨Σ) ≤ (∧IG(Σ)). Using a dual argument we can show that (∨IS(Σ)) ≤ (∧Σ). It remains to show that (∧Σ) ≤ (∨IS(Σ)), which is equivalent to showing that any timeline subsumed by (∧Σ) is also subsumed by (∨IS(Σ)) (by Proposition 4). Consider any MA timeline Φ such that Φ ≤ (∧Σ); this implies that each member of Σ subsumes Φ. Lemma 5 then implies that there is some Φ' ∈ IS(Σ) such that Φ ≤ Φ'. From this we get Φ ≤ (∨IS(Σ)), as desired. ∎

Using "and-to-or", we can now reduce AMA subsumption to MA subsumption, with an exponential increase in problem size.
Proposition 7 For AMA formulas Ψ_1 and Ψ_2, we have Ψ_1 ≤ Ψ_2 iff Φ_1 ≤ Φ_2 for all Φ_1 ∈ IS(Ψ_1) and all Φ_2 ∈ Ψ_2.

Proof: For the forward direction, each Φ_1 ∈ IS(Ψ_1) is subsumed by Ψ_1 (by "and-to-or"), hence by Ψ_2, and therefore by each timeline Φ_2 ∈ Ψ_2. For the backward direction, assume that Φ_1 ≤ Φ_2 for all Φ_1 ∈ IS(Ψ_1) and all Φ_2 ∈ Ψ_2. This tells us that each Φ_1 ∈ IS(Ψ_1) is subsumed by the conjunction Ψ_2, so that Ψ_1 = (∨IS(Ψ_1)) ≤ Ψ_2 by "and-to-or". ∎

Subsumption and Generalization
In this section we study subsumption and generalization of AMA formulas. First, we give a polynomial-time algorithm for deciding subsumption between MA formulas and then show that deciding subsumption for AMA formulas is coNP-complete. Second, we give algorithms and complexity bounds for the construction of least-general generalization (LGG) formulas based on our analysis of subsumption, including existence, uniqueness, lower and upper bounds, and an algorithm for the LGG of AMA formulas. Third, we introduce a polynomial-time-computable syntactic notion of subsumption and an algorithm that computes the corresponding syntactic LGG exponentially faster than our semantic LGG algorithm. Fourth, in Section 5.4, we give a detailed example showing the steps performed by our LGG algorithms to compute the semantic and syntactic LGGs of two AMA formulas.

Subsumption
All our methods rely critically on a novel algorithm for deciding the subsumption question Φ_1 ≤ Φ_2 between MA formulas Φ_1 and Φ_2 in polynomial time. We note that merely searching the possible interdigitations of Φ_1 and Φ_2 for a witnessing interdigitation provides an obvious decision procedure for the subsumption question; however, there are, in general, exponentially many such interdigitations. Instead, we reduce the MA subsumption problem to finding a path in a graph over the pairs of states in Φ_1 × Φ_2, a polynomial-time operation.
Pseudo-code for the resulting MA subsumption algorithm is shown in Figure 5. The main data structure used by the MA subsumption algorithm is the subsumption graph.

Definition 3 The subsumption graph of two MA timelines Φ_1 = s_1; ...; s_m and Φ_2 = t_1; ...; t_n, written SG(Φ_1, Φ_2), is a directed graph on the vertices v_{i,j} for 1 ≤ i ≤ m and 1 ≤ j ≤ n. A vertex v_{i,j} is present in the graph iff the state s_i is subsumed by the state t_j (that is, iff t_j ⊆ s_i, viewing states as sets of propositions), and there is an edge from v_{i,j} to v_{i',j'} iff both vertices are present, i' equals i or i+1, j' equals j or j+1, and (i, j) ≠ (i', j').
To achieve a polynomial-time bound, one can simply use any polynomial-time path-finding algorithm. In our case the special structure of the subsumption graph can be exploited to determine whether the desired path exists in O(mn) time, as the example method shown in the pseudo-code illustrates. The following theorem asserts the correctness of the algorithm, assuming a correct polynomial-time path-finding method is used.

Lemma 8 Given MA timelines Φ_1 = s_1; ...; s_m and Φ_2 = t_1; ...; t_n, there is a witnessing interdigitation for Φ_1 ≤ Φ_2 iff there is a path in the subsumption graph SG(Φ_1, Φ_2) from v_{1,1} to v_{m,n}.
Theorem 9 For MA formulas Φ_1 and Φ_2, the algorithm of Figure 5 decides Φ_1 ≤ Φ_2 in polynomial time.

Proof: The algorithm clearly runs in polynomial time. Lemma 8 tells us that line 2 of the algorithm returns TRUE iff there is a witnessing interdigitation. Combining this with Proposition 2 shows that the algorithm returns TRUE iff Φ_1 ≤ Φ_2. ∎

Given this polynomial-time algorithm for MA subsumption, Proposition 7 immediately suggests an exponential-time algorithm for deciding AMA subsumption: compute MA subsumption between the exponentially many IS timelines of one formula and the timelines of the other formula. Our next theorem suggests that we cannot do any better than this in the worst case; we argue that AMA subsumption is coNP-complete by reduction from boolean satisfiability. Readers uninterested in the technical details of this argument may skip directly to Section 5.2.
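The path search over the subsumption graph can be sketched as a simple reachability computation. This is a sketch under our assumed representation of timelines as tuples of frozensets, where a state s_i is subsumed by t_j exactly when t_j ⊆ s_i.

```python
def ma_subsumes(phi1, phi2):
    """Decide phi1 <= phi2 for MA timelines by path-finding over the
    subsumption graph. Vertex (i, j) is present when state t_j of phi2
    subsumes state s_i of phi1 (phi2[j] a subset of phi1[i]); edges
    advance one or both indices. phi1 <= phi2 iff (0, 0) reaches
    (m-1, n-1), checked here by dynamic programming over the grid."""
    m, n = len(phi1), len(phi2)
    present = [[phi2[j] <= phi1[i] for j in range(n)] for i in range(m)]
    reach = [[False] * n for _ in range(m)]
    reach[0][0] = present[0][0]
    for i in range(m):
        for j in range(n):
            if not reach[i][j] and present[i][j]:
                # reachable from any reachable predecessor
                reach[i][j] = any(reach[i - di][j - dj]
                                  for di, dj in ((1, 0), (0, 1), (1, 1))
                                  if i >= di and j >= dj)
    return reach[m - 1][n - 1]
```

For instance, (A∧B); (A∧B) is subsumed by B; A; B, while A; B is not, since the initial state A cannot be matched against B.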
To develop a correspondence between boolean satisfiability problems and AMA formulas, which lack negation, we give each boolean variable two AMA propositions, one for "true" and one for "false". In particular, given a boolean satisfiability problem over n variables p_1, ..., p_n, we take PROP_n to be the set containing the 2n AMA propositions True_i and False_i for each i between 1 and n. We can now represent a truth assignment A to the p_i variables (a truth assignment is a function mapping boolean variables to true or false) with the AMA state

s_A = {True_i | A(p_i) = true} ∪ {False_i | A(p_i) = false}.

As Proposition 7 suggests, checking AMA subsumption critically involves the exponentially many interdigitation specializations of the timelines of one of the AMA formulas. In our proof, we design an AMA formula whose interdigitation specializations can be seen to correspond to truth assignments to boolean variables, as shown in the following lemma.
Lemma 10 Given some n, let Ψ be the conjunction of the 2n timelines

⋀_{1 ≤ i ≤ n} (PROP_n; True_i; False_i; PROP_n) ∧ (PROP_n; False_i; True_i; PROP_n).

We have the following facts about truth assignments to the boolean variables p_1, ..., p_n:

1. For any truth assignment A, PROP_n; s_A; PROP_n is semantically equivalent to some member of IS(Ψ).

2. For each Φ ∈ IS(Ψ) there is a truth assignment A such that Φ ≤ PROP_n; s_A; PROP_n.
With this lemma in hand, we can now tackle the complexity of AMA subsumption.
Theorem 11 Deciding subsumption between AMA formulas is coNP-complete.

Proof: We first show that deciding the AMA subsumption of Ψ_1 by Ψ_2 is in coNP by providing a polynomial-length certificate for any "no" answer. This certificate for non-subsumption is an interdigitation of the timelines of Ψ_1 that yields a member of IS(Ψ_1) not subsumed by Ψ_2. Such a certificate can be checked in polynomial time: given the interdigitation, the corresponding member of IS(Ψ_1) can be computed in time polynomial in the size of Ψ_1, and we can then test whether the resulting timeline is subsumed by each timeline in Ψ_2 using the polynomial-time MA subsumption algorithm. Proposition 7 guarantees that Ψ_1 ≰ Ψ_2 iff there is a timeline in IS(Ψ_1) that is not subsumed by some timeline in Ψ_2, so that such a certificate exists exactly when the answer to a subsumption query is "no".
To show coNP-hardness, we reduce the problem of deciding the satisfiability of a 3-SAT formula S = C_1 ∧ ··· ∧ C_m to the problem of recognizing non-subsumption between AMA formulas. Here each clause C_i is (l_{i,1} ∨ l_{i,2} ∨ l_{i,3}), and each literal l is either a proposition p chosen from P = {p_1, ..., p_n} or its negation ¬p. The idea of the reduction is to construct an AMA formula Ψ for which we view the exponentially many members of IS(Ψ) as representing truth assignments. We then construct an MA timeline Φ that we view as representing S and show that S is satisfiable iff Ψ ≰ Φ. Let Ψ be as defined in Lemma 10. Let Φ be the formula s_1; ...; s_m, where

s_i = {False_j | l = p_j for some literal l in C_i} ∪ {True_j | l = ¬p_j for some literal l in C_i}.

Each s_i can be thought of as asserting "not C_i". We start by showing that if S is satisfiable then Ψ ≰ Φ.
Assume that S is satisfied by a truth assignment A. We know from Lemma 10 that there is a Φ' ∈ IS(Ψ) that is semantically equivalent to PROP_n; s_A; PROP_n. We show that PROP_n; s_A; PROP_n is not subsumed by Φ, and conclude Ψ ≰ Φ using Proposition 7, as desired. Suppose for contradiction that PROP_n; s_A; PROP_n is subsumed by Φ. Then the state s_A must be subsumed by some state s_i of Φ. Consider the corresponding clause C_i of S. Since A satisfies S, the clause C_i is satisfied, and at least one of its literals l must be true. Assume that l = p_j (a dual argument holds for l = ¬p_j). Then s_i contains False_j, while s_A contains True_j but not False_j; thus s_A is not subsumed by s_i (since s_i is not a subset of s_A), contradicting our choice of s_i.

To complete the proof, we now assume that S is unsatisfiable and show that Ψ ≤ Φ. Using Proposition 7, we consider an arbitrary Φ' ∈ IS(Ψ) and show that Φ' ≤ Φ. From Lemma 10 we know there is some truth assignment A such that Φ' ≤ PROP_n; s_A; PROP_n. Since S is unsatisfiable, we know that some clause C_i is not satisfied by A, and hence its negation is; this implies that each primitive proposition in s_i is in s_A. Let W be the interdigitation of PROP_n; s_A; PROP_n with Φ = s_1; ...; s_m in which the initial PROP_n co-occurs with s_1 through s_i, the state s_A co-occurs with s_i, and the final PROP_n co-occurs with s_i through s_m. In each tuple of co-occurring states, the state from Φ is a subset of the co-occurring state from PROP_n; s_A; PROP_n. Thus W is a witnessing interdigitation for PROP_n; s_A; PROP_n ≤ Φ, which then holds by Proposition 2. Combining this with Φ' ≤ PROP_n; s_A; PROP_n, we get Φ' ≤ Φ, as desired. ∎

Given this hardness result, we later define a weaker polynomial-time-computable subsumption notion for use in our learning algorithms.

Least-General Generalization.
The existence of an AMA LGG is nontrivial, as there can be infinite chains of increasingly specific formulas, all of which generalize given formulas. Example 2 demonstrated such chains for an MA subsumee, and it can be extended to AMA subsumees. For example, each member of the chain P; Q, then P; Q; P; Q, then P; Q; P; Q; P; Q, ... covers both Ψ_1 = (P∧Q); Q and Ψ_2 = P; (P∧Q). Despite such complications, the AMA LGG does exist.
Theorem 12 There is an LGG for any finite set Σ of AMA formulas that is subsumed by all other generalizations of Σ.
Proof: Let Γ be the set ⋃_{Ψ_0 ∈ Σ} IS(Ψ_0). Let Ψ be the conjunction of all the MA timelines that generalize Γ while having size no larger than that of Γ. Since there are only a finite number of primitive propositions, there are only a finite number of such timelines, so Ψ is well defined (there must be at least one such timeline: the timeline with one state containing all the propositions). We show that Ψ is a least-general generalization of Σ. First, note that each timeline in Ψ generalizes Γ and thus Σ (by Proposition 6), so Ψ must generalize Σ. Now, consider an arbitrary generalization Ψ' of Σ. Proposition 7 implies that Ψ' must generalize each formula in Γ. Lemma 5 then implies that each timeline of Ψ' must subsume a timeline Φ that is no larger than the size of Γ and that also subsumes the timelines of Γ. But then Φ must be a timeline of Ψ, by our choice of Ψ,
so that every timeline of Ψ' subsumes a timeline of Ψ. It follows that Ψ' subsumes Ψ, and that Ψ is an LGG of Σ subsumed by all other LGGs of Σ, as desired. ∎

Given that the AMA LGG exists and is unique, we now show how to compute it. Our first step is to strengthen "or-to-and" from Proposition 6 to get an LGG for the MA sublanguage.
Theorem 13 For a set Σ of MA formulas, the conjunction of all MA timelines in IG(Σ) is an AMA LGG of Σ.
Proof: Let Ψ be the specified conjunction. Since each timeline of IG(Σ) subsumes all timelines in Σ, Ψ subsumes each member of Σ. To show that Ψ is a least-general such formula, consider an AMA formula Ψ' that also subsumes all members of Σ. Since each timeline of Ψ' must subsume all members of Σ, Lemma 5 implies that each timeline of Ψ' subsumes a member of IG(Σ), and thus each timeline of Ψ' subsumes Ψ.
This implies Ψ ≤ Ψ'. ∎

We can now characterize the AMA LGG using IS and IG.
Theorem 14 The conjunction of the timelines in IG(⋃_{Ψ ∈ Σ} IS(Ψ)) is an AMA LGG of the set Σ of AMA formulas.
Proof: Any LGG of Σ must subsume every timeline in ⋃_{Ψ ∈ Σ} IS(Ψ), or it would fail to subsume one of the Ψ ∈ Σ. Using "and-to-or" we can represent each Ψ ∈ Σ as a disjunction of MA timelines, so the LGG must be a least-general formula that subsumes every timeline in ⋃{IS(Ψ) | Ψ ∈ Σ}, i.e., an AMA LGG of that set of MA timelines. Theorem 13 tells us that an LGG of these timelines is given by IG(⋃{IS(Ψ) | Ψ ∈ Σ}). ∎

Theorem 14 leads directly to an algorithm for computing the AMA LGG; Figure 6 gives pseudo-code for the computation. Lines 4-9 of the pseudo-code correspond to the computation of ⋃{IS(Ψ) | Ψ ∈ Σ}, where a timeline is not included in the set if it is subsumed by a timeline already in the set (which can be checked with the polynomial-time MA subsumption algorithm). This pruning, accomplished by the IF test in line 7, often drastically reduces the size of the timeline set for which we perform the subsequent IG computation; the final result is not affected by the pruning, since the subsequent IG computation is a generalization step.
The remainder of the pseudo-code corresponds to the computation of IG(⋃{IS(Ψ) | Ψ ∈ Σ}), where we do not include in the final result timelines that subsume some other timeline in the set. This pruning step (the IF test in line 12) is sound because, when one timeline subsumes another, the conjunction of those timelines is equivalent to the most specific one. Section 5.4.1 traces the computations of this algorithm for an example LGG calculation.
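The whole computation can be sketched for two AMA inputs as follows. As before, timelines are tuples of frozensets and AMA formulas are sets of timelines. One simplification we make: the n-ary IS and IG sets are computed by folding the two-timeline operation pairwise rather than by enumerating the paper's simultaneous n-way interdigitations; by Lemma 5 and "and-to-or" this still yields a least generalization, but it is not literally the IS/IG construction.

```python
from functools import reduce

def staircases(m, n):
    # Monotone paths from (0,0) to (m-1,n-1): one per interdigitation.
    def walk(i, j, p):
        p = p + [(i, j)]
        if (i, j) == (m - 1, n - 1):
            yield p
            return
        for di, dj in ((1, 0), (0, 1), (1, 1)):
            if i + di < m and j + dj < n:
                yield from walk(i + di, j + dj, p)
    yield from walk(0, 0, [])

def combine(t1, t2, op):
    # Two-timeline IS (op = frozenset.union) or IG (op = frozenset.intersection).
    return {tuple(op(t1[i], t2[j]) for i, j in p)
            for p in staircases(len(t1), len(t2))}

def ma_subsumes(p1, p2):
    # p1 <= p2 via reachability over pairs (i, j) with p2[j] a subset of p1[i].
    reach = set()
    for i in range(len(p1)):
        for j in range(len(p2)):
            if p2[j] <= p1[i] and ((i, j) == (0, 0) or any(
                    (i - a, j - b) in reach
                    for a, b in ((1, 0), (0, 1), (1, 1)))):
                reach.add((i, j))
    return (len(p1) - 1, len(p2) - 1) in reach

def fold(timelines, op):
    # n-ary IS/IG by pairwise folding (a simplification of the n-way definition).
    first, *rest = timelines
    return reduce(lambda acc, t: {u for a in acc for u in combine(a, t, op)},
                  rest, {first})

def semantic_lgg(psi1, psi2):
    # Figure 6 sketch: S = union of the inputs' IS sets, pruned to its most
    # general timelines; the LGG conjoins IG(S), pruned to its most specific
    # timelines. (Assumes S holds no two distinct but equivalent timelines.)
    s = fold(list(psi1), frozenset.union) | fold(list(psi2), frozenset.union)
    s = {t for t in s if not any(u != t and ma_subsumes(t, u) for u in s)}
    out = fold(list(s), frozenset.intersection)
    return {t for t in out if not any(u != t and ma_subsumes(u, t) for u in out)}
```

On the example worked in Section 5.4, this returns the single timeline B; A; B.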
Since the sizes of both IS(·) and IG(·) are exponential in the sizes of their inputs, the code in Figure 6 runs in doubly exponential time in the input size. We conjecture that we cannot do better than this, but we have not yet proven a doubly exponential lower bound for the AMA case. When the input formulas are MA timelines, the algorithm takes singly exponential time, since IS({Φ}) = {Φ} when Φ is in MA. We now prove an exponential lower bound for the case when the input formulas are in MA. Again, readers uninterested in the technical details of this proof can safely skip forward to Section 5.3.
For this argument, we take the available primitive propositions to be those in the set {p_{i,j} | 1 ≤ i ≤ n, 1 ≤ j ≤ n}, and consider the MA timelines

Φ_1 = s¹_1; ...; s¹_n and Φ_2 = s²_1; ...; s²_n, where s¹_i = {p_{i,j} | 1 ≤ j ≤ n} and s²_j = {p_{i,j} | 1 ≤ i ≤ n}.

We will show that any AMA LGG of Φ_1 and Φ_2 must contain an exponential number of timelines. In particular, we will show that any AMA LGG is equivalent to the conjunction of a subset of IG({Φ_1, Φ_2}), and that certain timelines may not be omitted from such a subset.
Lemma 15 Any AMA LGG Ψ of a set Σ of MA timelines is equivalent to a conjunction Ψ' of timelines from IG(Σ) with |Ψ'| ≤ |Ψ|.

Proof: Lemma 5 implies that any timeline Φ in Ψ must subsume some timeline Φ' ∈ IG(Σ). But then the conjunction Ψ' of those Φ' must be equivalent to Ψ, since it clearly covers Σ and is covered by the LGG Ψ. Since Ψ' was formed by taking one timeline from IG(Σ) for each timeline in Ψ, we have |Ψ'| ≤ |Ψ|. ∎
We can complete our argument, then, by showing that exponentially many timelines in IG({Φ_1, Φ_2}) cannot be omitted from such a conjunction while it remains an LGG.
Notice that for any i and j we have s¹_i ∩ s²_j = {p_{i,j}}. This implies that any state in a member of IG({Φ_1, Φ_2}) contains exactly one proposition, since each such state is formed by intersecting a state from Φ_1 with a state from Φ_2. Furthermore, the definition of interdigitation, applied here, implies that the first state of any member must be {p_{1,1}}, that the last must be {p_{n,n}}, and that the indices of consecutive states can each advance by at most one. Together these facts imply that any timeline in IG({Φ_1, Φ_2}) is a sequence of propositions starting with p_{1,1} and ending with p_{n,n} such that any consecutive propositions p_{i,j} and p_{i',j'} are different, with i' equal to i or i+1 and j' equal to j or j+1. We call a timeline in IG({Φ_1, Φ_2}) square if and only if each pair of consecutive propositions p_{i,j} and p_{i',j'} has either i' = i or j' = j. The following lemma implies that no square timeline can be omitted from the conjunction of the timelines in IG({Φ_1, Φ_2}) if it is to remain an LGG of Φ_1 and Φ_2.
Lemma 16 Let Φ_1 and Φ_2 be as given above and let Ψ = ∧IG({Φ_1, Φ_2}). For any Ψ' whose timelines are a subset of those in Ψ that omits some square timeline, Ψ' is strictly more general than Ψ.
The number of square timelines in IG({Φ_1, Φ_2}) is equal to (2n−2)! / ((n−1)! (n−1)!), and hence is exponential in the size of Φ_1 and Φ_2. We have now completed the proof of the following result.
Theorem 17 The smallest LGG of two MA formulas can be exponentially large.
Proof: By Lemma 15, any AMA LGG Ψ' of Φ_1 and Φ_2 is equivalent to a conjunction of at most |Ψ'| timelines chosen from IG({Φ_1, Φ_2}). However, by Lemma 16, any such conjunction must contain every square timeline, so it must have at least (2n−2)! / ((n−1)! (n−1)!) timelines, and then so must Ψ', which must then be exponentially large. ∎

Conjecture 18 The smallest LGG of two AMA formulas can be doubly exponentially large.
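The count of square timelines is a standard lattice-path count, which we can verify by brute-force enumeration for small n; the index bookkeeping below follows the construction above, representing each single-proposition state simply by its index pair.

```python
from math import comb

def square_timelines(n):
    """Enumerate the square members of IG({Phi1, Phi2}) as sequences of
    index pairs (i, j): monotone paths from (1, 1) to (n, n) that advance
    exactly one index per step (no diagonal moves)."""
    def walk(i, j, path):
        path = path + [(i, j)]
        if (i, j) == (n, n):
            yield path
            return
        if i < n:
            yield from walk(i + 1, j, path)  # advance in Phi1 only
        if j < n:
            yield from walk(i, j + 1, path)  # advance in Phi2 only
    yield from walk(1, 1, [])
```

Each such path takes 2n−2 unit steps, n−1 in each direction, so there are (2n−2)! / ((n−1)! (n−1)!) = C(2n−2, n−1) square timelines, matching the count used in the proof.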
We now show that our lower bound on AMA LGG complexity is not merely a consequence of the existence of large AMA LGGs. Even when there is a small LGG, it can be expensive to compute, due to the difficulty of testing AMA subsumption:

Theorem 19 Determining whether a formula Ψ is an AMA LGG of two given AMA formulas Ψ_1 and Ψ_2 is coNP-hard, and is in coNEXP, in the size of all three formulas together.
Proof: To show coNP-hardness we use a straightforward reduction from AMA subsumption. Given two AMA formulas Ψ_1 and Ψ_2, we decide Ψ_1 ≤ Ψ_2 by asking whether Ψ_2 is an AMA LGG of Ψ_1 and Ψ_2. Clearly Ψ_1 ≤ Ψ_2 iff Ψ_2 is an LGG of the two formulas.
To show the coNEXP upper bound, note that we can check in exponential time whether Ψ_1 ≤ Ψ and Ψ_2 ≤ Ψ, using Proposition 7 and the polynomial-time MA subsumption algorithm. It remains to show that we can check whether Ψ is not the "least" subsumer. Since Theorem 14 shows that the LGG of Ψ_1 and Ψ_2 is IG(IS(Ψ_1) ∪ IS(Ψ_2)), if Ψ is not the LGG then Ψ ≰ IG(IS(Ψ_1) ∪ IS(Ψ_2)). Thus, by Proposition 7, if Ψ is not a least subsumer, there must be timelines Φ_1 ∈ IS(Ψ) and Φ_2 ∈ IG(IS(Ψ_1) ∪ IS(Ψ_2)) such that Φ_1 ≰ Φ_2. We can then use exponentially long certificates for "no" answers: each certificate is a pair of an interdigitation I_1 of Ψ and an interdigitation I_2 of IS(Ψ_1) ∪ IS(Ψ_2), such that the corresponding members Φ_1 ∈ IS(Ψ) and Φ_2 ∈ IG(IS(Ψ_1) ∪ IS(Ψ_2)) have Φ_1 ≰ Φ_2. Given the pair of certificates I_1 and I_2, Φ_1 can be computed in polynomial time, Φ_2 can be computed in exponential time, and the subsumption between them can be checked in polynomial time (relative to their size, which can be exponential). If Ψ is the LGG then Ψ ≤ IG(IS(Ψ_1) ∪ IS(Ψ_2)), so that no such certificates will exist. ∎

Syntactic Subsumption and Syntactic Least-General Generalization.
Given the intractability results for semantic AMA subsumption, we now introduce a tractable generality notion, syntactic subsumption, and discuss the corresponding LGG problem. The use of syntactic forms of generality for efficiency is familiar in ILP (Muggleton & De Raedt, 1994), where, for example, θ-subsumption is often used in place of the entailment generality relation. Unlike semantic AMA subsumption, syntactic subsumption requires checking only polynomially many MA subsumptions, each taking polynomial time (via Theorem 9).

Definition 4 AMA formula Ψ_1 is syntactically subsumed by AMA formula Ψ_2 (written Ψ_1 ≤_syn Ψ_2) iff for each MA timeline Φ_2 ∈ Ψ_2, there is an MA timeline Φ_1 ∈ Ψ_1 such that Φ_1 ≤ Φ_2.
Proposition 20 AMA syntactic subsumption can be decided in polynomial time.
Syntactic subsumption trivially implies semantic subsumption; the converse, however, does not hold in general. Consider the AMA formulas (A; B) ∧ (B; A) and B; A; B, where A and B are primitive propositions. We have (A; B) ∧ (B; A) ≤ B; A; B; however, we have neither A; B ≤ B; A; B nor B; A ≤ B; A; B, so that B; A; B does not syntactically subsume (A; B) ∧ (B; A). Syntactic subsumption fails to recognize constraints that are only derived from the interaction of timelines within a formula.
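Definition 4 translates directly into code; the quadratically many MA tests reuse the polynomial-time subsumption check, sketched again here so the fragment is self-contained, with timelines represented as tuples of frozensets.

```python
def ma_subsumes(p1, p2):
    # p1 <= p2 via reachability over pairs (i, j) with p2[j] a subset of p1[i].
    reach = set()
    for i in range(len(p1)):
        for j in range(len(p2)):
            if p2[j] <= p1[i] and ((i, j) == (0, 0) or any(
                    (i - a, j - b) in reach
                    for a, b in ((1, 0), (0, 1), (1, 1)))):
                reach.add((i, j))
    return (len(p1) - 1, len(p2) - 1) in reach

def syntactically_subsumes(psi1, psi2):
    """Definition 4: psi1 <=_syn psi2 iff every timeline of psi2 lies above
    some timeline of psi1. AMA formulas are sets of MA timelines, so this
    needs only |psi1| * |psi2| polynomial-time MA subsumption tests."""
    return all(any(ma_subsumes(t1, t2) for t1 in psi1) for t2 in psi2)
```

On the example above, (A; B) ∧ (B; A) is semantically but not syntactically subsumed by B; A; B.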
Syntactic Least-General Generalization. The syntactic AMA LGG is the syntactically least-general AMA formula that syntactically subsumes the input AMA formulas. Here, "least" means that no formula properly syntactically subsumed by a syntactic LGG can syntactically subsume the input formulas. Based on the hardness gap between syntactic and semantic AMA subsumption, one might conjecture that a similar gap exists between the syntactic and semantic LGG problems. Proving that such a gap exists requires closing the gap between the lower and upper bounds on the size of the semantic AMA LGG in favor of the upper bound, as suggested by Conjecture 18. While we cannot yet show a hardness gap between the semantic and syntactic LGG problems, we do give a syntactic LGG algorithm that is exponentially more efficient than the best semantic LGG algorithm we have found (that of Theorem 14). First, we show that syntactic LGGs exist and are unique up to mutual syntactic subsumption (and hence up to semantic equivalence).
Theorem 21 There exists a syntactic LGG for any set Σ of AMA formulas that is syntactically subsumed by all syntactic generalizations of Σ.
Proof: Let Ψ be the conjunction of all the MA timelines that syntactically generalize Σ while having size no larger than that of Σ. As in the proof of Theorem 12, Ψ is well defined. We show that Ψ is a syntactic LGG for Σ. First, note that Ψ syntactically generalizes Σ, because each timeline of Ψ generalizes a timeline in every member of Σ, by the choice of Ψ. Now consider an arbitrary syntactic generalization Ψ' of Σ. By the definition of syntactic subsumption, each timeline Φ in Ψ' must subsume some timeline Φ_α in each member α of Σ. Lemma 5 then implies that there is a timeline Φ' of size no larger than that of Σ that subsumes all the Φ_α while being subsumed by Φ. By our choice of Ψ, the timeline Φ' must be a timeline of Ψ. It follows that Ψ' syntactically subsumes Ψ, and that Ψ is a syntactic LGG of Σ subsumed by all other syntactic generalizations of Σ. ∎
In general, we know that semantic and syntactic LGGs are different, though clearly the syntactic LGG is a semantic generalization and so must subsume the semantic LGG. For example, (A; B) ∧ (B; A) and B; A; B have a semantic LGG of B; A; B, as discussed above; but their syntactic LGG is (B; A; true) ∧ (true; A; B), which subsumes but is not subsumed by B; A; B. Even so, for MA formulas we have:

Proposition 22 For an MA formula Φ and an AMA formula Ψ, Φ ≤_syn Ψ is equivalent to Φ ≤ Ψ.
Proof: The forward direction is immediate, since we already know syntactic subsumption implies semantic subsumption. For the reverse direction, note that Φ ≤ Ψ implies that each timeline of Ψ subsumes Φ; since Φ is a single timeline, each timeline in Ψ thus subsumes "some timeline" of Φ, which is the definition of syntactic subsumption. ∎

Proposition 23 Any syntactic AMA LGG for an MA formula set Σ is also a semantic LGG for Σ.
Proof: Consider a syntactic LGG Ψ for Σ. Proposition 22 implies that Ψ is a semantic generalization of Σ. Consider any semantic LGG Ψ' of Σ. We show that Ψ ≤ Ψ' to conclude that Ψ is a semantic LGG for Σ. Proposition 22 implies that Ψ' syntactically subsumes Σ. It follows that Ψ' ∧ Ψ syntactically subsumes Σ. But Ψ' ∧ Ψ is syntactically subsumed by Ψ, which is a syntactic LGG of Σ; it follows that Ψ' ∧ Ψ syntactically subsumes Ψ, or else Ψ would not be a least syntactic generalization of Σ. But then Ψ ≤ (Ψ' ∧ Ψ), which implies Ψ ≤ Ψ', as desired. ∎

We note that the stronger result, stating that a formula Ψ is a syntactic LGG of a set Σ of MA formulas if and only if it is a semantic LGG of Σ, is not an immediate consequence of our results above. At first examination, the strengthening appears trivial, given the equivalence of Φ ≤ Ψ and Φ ≤_syn Ψ for MA formulas Φ.
However, being semantically least is not necessarily a stronger condition than being syntactically least: we have not ruled out the possibility that a semantically least generalization Ψ may syntactically subsume another generalization that is semantically (but not syntactically) equivalent. (This question is open, as we have not found an example of this phenomenon either.) Proposition 23 together with Theorem 21 has the nice consequence for our learning approach that the syntactic LGG of two AMA formulas is a semantic LGG of those formulas, as long as the original formulas are themselves syntactic LGGs of sets of MA timelines. Because our learning approach starts with training examples that are converted to MA timelines using the LGCF operation, the syntactic LGGs computed (whether combining all the training examples at once, or incrementally computing syntactic LGGs of parts of the training data) are always syntactic LGGs of sets of MA timelines and hence are also semantic LGGs, in spite of the fact that syntactic subsumption is weaker than semantic subsumption. We note, however, that the resulting semantic LGGs may be considerably larger than the smallest corresponding semantic LGG (which may not be a syntactic LGG at all).
Using Proposition 23, we now show that we cannot hope for a polynomial-time syntactic LGG algorithm.
Theorem 24 The smallest syntactic LGG of two MA formulas can be exponentially large.
Proof: Suppose there were always a syntactic LGG of two MA formulas that is not exponentially large. Since by Proposition 23 each such formula is also a semantic LGG, there would always be a semantic LGG of two MA formulas that is not exponentially large, contradicting Theorem 17. ∎

While this is discouraging, we have an algorithm for the syntactic LGG whose time complexity matches this lower bound, unlike the semantic LGG case, where the best algorithm we have is doubly exponential in the worst case. Theorem 14 yields an exponential-time method for computing the semantic LGG of a set of MA timelines Σ: since IS({Φ}) = {Φ} for a timeline Φ, we can simply conjoin all the timelines of IG(Σ). Given a set of AMA formulas, the syntactic LGG algorithm uses this method to compute the semantic LGGs of sets of timelines, one timeline chosen from each input formula, and conjoins all the results.

Theorem 25 Given AMA formulas Ψ_1, ..., Ψ_n, the conjunction of the timelines in ⋃ {IG({Φ_1, ..., Φ_n}) | Φ_1 ∈ Ψ_1, ..., Φ_n ∈ Ψ_n} is a syntactic LGG of Ψ_1, ..., Ψ_n.

Proof: Call the specified conjunction Ψ. Each timeline Φ of Ψ must subsume each Ψ_i, because Φ is an output of IG on a set containing a timeline of Ψ_i; thus Ψ syntactically subsumes each Ψ_i. To show that Ψ is a syntactically least such formula, consider a Ψ' that syntactically subsumes every Ψ_i. We show that Ψ ≤_syn Ψ' to conclude. Each timeline Φ' in Ψ' subsumes a timeline T_i ∈ Ψ_i, for each i, by the definition of syntactic subsumption. But then, by Lemma 5, Φ' must subsume a member of IG({T_1, ..., T_n}), and that member is a timeline of Ψ; so each timeline Φ' of Ψ' subsumes a timeline of Ψ. We conclude Ψ ≤_syn Ψ', as desired.

∎
This theorem yields an algorithm that computes a syntactic AMA LGG in exponential time; pseudo-code for this method is given in Figure 7. The exponential time bound follows from the fact that there are exponentially many ways to choose Φ_1, ..., Φ_m in line 5, and for each of these there are exponentially many members of the semantic LGG in line 6 (since the Φ_i are all MA timelines); the product of these two exponentials is still an exponential.
The formula returned by the algorithm shown is actually a subset of the syntactic LGG given by Theorem 25. This subset is syntactically (and hence semantically) equivalent to the formula specified by the theorem, but is possibly smaller, due to the pruning achieved by the IF statement in lines 7-9. A timeline is pruned from the set if it semantically subsumes any other timeline in the set (one timeline is kept from any semantically equivalent group of timelines, at random). This pruning of timelines is sound: a timeline is pruned from the output only if it subsumes some other formula in the output, and this fact allows an easy argument that the pruned formula is syntactically equivalent to (i.e., mutually syntactically subsumed by) the unpruned formula. Section 5.4.2 traces the computations of this algorithm for an example LGG calculation.
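A compact sketch of this computation follows, under the same representation assumptions as before: timelines are tuples of frozensets, formulas are sets of timelines, "true" is the empty state, and the n-ary IG is folded pairwise as a simplification of the n-way interdigitation definition.

```python
from functools import reduce
from itertools import product

def staircases(m, n):
    # Monotone paths from (0,0) to (m-1,n-1): one per interdigitation.
    def walk(i, j, p):
        p = p + [(i, j)]
        if (i, j) == (m - 1, n - 1):
            yield p
            return
        for di, dj in ((1, 0), (0, 1), (1, 1)):
            if i + di < m and j + dj < n:
                yield from walk(i + di, j + dj, p)
    yield from walk(0, 0, [])

def ig2(t1, t2):
    # IG of two timelines: intersect co-occurring states.
    return {tuple(t1[i] & t2[j] for i, j in p)
            for p in staircases(len(t1), len(t2))}

def ma_subsumes(p1, p2):
    # p1 <= p2 via reachability over pairs (i, j) with p2[j] a subset of p1[i].
    reach = set()
    for i in range(len(p1)):
        for j in range(len(p2)):
            if p2[j] <= p1[i] and ((i, j) == (0, 0) or any(
                    (i - a, j - b) in reach
                    for a, b in ((1, 0), (0, 1), (1, 1)))):
                reach.add((i, j))
    return (len(p1) - 1, len(p2) - 1) in reach

def syntactic_lgg(*psis):
    """Figure 7 sketch: for every tuple of timelines, one from each input
    formula, take the semantic LGG of the tuple (IG, folded pairwise);
    union the results, then prune any timeline that subsumes another kept
    timeline (assumes no two distinct equivalent timelines survive)."""
    out = set()
    for combo in product(*map(list, psis)):
        out |= reduce(lambda acc, t: {u for a in acc for u in ig2(a, t)},
                      combo[1:], {combo[0]})
    return {t for t in out if not any(u != t and ma_subsumes(u, t)
                                      for u in out)}
```

On the example of Section 5.4 this yields the two timelines true; A; B and B; A; true, i.e., the syntactic LGG (B; A; true) ∧ (true; A; B).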
The method does an exponential amount of work even when the result is small (typically because many timelines can be pruned from the output, as they subsume what remains). It is still an open question whether there is an output-efficient algorithm for computing the syntactic AMA LGG; this problem is in coNP, and we conjecture that it is coNP-complete. One route to settling this question is to determine the output complexity of semantic LGG for MA input formulas. We believe that problem is also coNP-complete, but we have not proven this; if that problem is in P, there is an output-efficient method for computing the syntactic AMA LGG based on Theorem 25.
A summary of the algorithmic complexity results from this section can be found in Table 3 in the conclusions section of this paper.

Figure 7: Pseudo-code that computes the syntactic AMA LGG of a set of AMA formulas.

Examples: Least-General Generalization Calculations
Below we work through the details of a semantic and a syntactic LGG calculation. We consider the AMA formulas Ψ = (A; B) ∧ (B; A) and Φ = B; A; B, for which the semantic LGG is B; A; B and the syntactic LGG is (B; A; true) ∧ (true; A; B).

SEMANTIC LGG EXAMPLE
The first step in calculating the semantic LGG, according to the algorithm given in Figure 6, is to compute the interdigitation-specializations of the input formulas (i.e., IS(Φ) and IS(Ψ)). Trivially, we have IS(Φ) = {Φ}. To calculate IS(Ψ), we must consider the possible interdigitations of Ψ, of which there are three. Each interdigitation leads to the corresponding member of IS(Ψ) by unioning (conjoining) the states in each tuple, so IS(Ψ) is

{ (A∧B); (A∧B),   (A∧B); A; (A∧B),   (A∧B); B; (A∧B) }.

Lines 5-9 of the semantic LGG algorithm compute the set S, which is equal to the union of the timelines in IS(Ψ) and IS(Φ), with all subsumed timelines removed. For our formulas, we see that each timeline in IS(Ψ) is subsumed by Φ; thus, we have S = {Φ}.
After computing S, the algorithm returns the conjunction of the timelines in IG(S), with redundant timelines removed (i.e., all subsuming timelines are removed). In our case, IG(S) = {Φ}, trivially, as there is only one timeline in S. Thus the algorithm correctly computes the semantic LGG of Ψ and Φ to be B; A; B.

SYNTACTIC LGG EXAMPLE
The syntactic LGG algorithm, shown in Figure 7, computes a series of semantic LGGs for sets of MA timelines, returning the conjunction of the results (after pruning). Line 5 of the algorithm cycles through timeline tuples from the cross-product of the input AMA formulas. In our case, the tuples in Φ × Ψ are T1 = ⟨A; B; C, A; B⟩ and T2 = ⟨A; B; C, B; C⟩; for each tuple, the algorithm computes the semantic LGG of the tuple's timelines.

The semantic LGG computation for each tuple uses the algorithm given in Figure 6, but the argument is always a set of MA timelines rather than of AMA formulas. For this reason, lines 4-9 are superfluous, as for an MA timeline Φ′, IS(Φ′) = {Φ′}. In the case of tuple T1, lines 4-9 of the algorithm just compute S = {A; B; C, A; B}. It remains to compute the interdigitation-generalizations of S (i.e., IG(S)), returning the conjunction of those timelines after pruning (lines 10-15 in Figure 6). Intersecting co-states along each interdigitation of S and pruning the redundant (subsuming) timelines leaves the single timeline A; B; true; the analogous computation for T2 yields true; B; C. Since A; B; true and true; B; C do not subsume one another, the set computed by lines 5-9 of the syntactic LGG algorithm is equal to {A; B; true, true; B; C}. Thus, the algorithm computes the syntactic LGG of Φ and Ψ to be (A; B; true) ∧ (true; B; C). Note that, in this case, the syntactic LGG is strictly more general than the semantic LGG.
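The syntactic side of the example can be checked the same way. The sketch below (again with our own helper names, not the paper's pseudo-code) computes IG by intersecting co-states along each interdigitation and prunes timelines that are strictly more general than another member:

```python
from itertools import product

def paths(n, m):
    """Monotone grid paths from (0, 0) to (n-1, m-1): the interdigitations."""
    if n == 1 and m == 1:
        return [[(0, 0)]]
    out = []
    for di, dj in ((1, 0), (0, 1), (1, 1)):
        if n - di >= 1 and m - dj >= 1:
            for p in paths(n - di, m - dj):
                out.append(p + [(n - 1, m - 1)])
    return out

def subsumes(general, specific):
    """MA subsumption via interdigitations (specific <= general)."""
    return any(all(specific[i] >= general[j] for i, j in p)
               for p in paths(len(specific), len(general)))

def interdigitation_generalizations(t1, t2):
    """IG: intersect the co-states along each interdigitation."""
    return {tuple(t1[i] & t2[j] for i, j in p)
            for p in paths(len(t1), len(t2))}

def prune(timelines):
    """Drop timelines strictly more general than another member: they are
    redundant in a conjunction."""
    return {t for t in timelines
            if not any(subsumes(t, u) and not subsumes(u, t)
                       for u in timelines)}

A, B, C, TRUE = frozenset("A"), frozenset("B"), frozenset("C"), frozenset()
phi, psi = [(A, B, C)], [(A, B), (B, C)]   # Phi = A;B;C, Psi = (A;B) ^ (B;C)

result = set()
for t_phi, t_psi in product(phi, psi):     # the tuples T1 and T2
    result |= prune(interdigitation_generalizations(t_phi, t_psi))
assert prune(result) == {(A, B, TRUE), (TRUE, B, C)}
```

The final assertion reproduces the syntactic LGG (A; B; true) ∧ (true; B; C) of the worked example.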

Practical Extensions
We have implemented a specific-to-general AMA learning algorithm based on the LGCF and syntactic LGG algorithms presented earlier. This implementation includes three practical extensions. The first extension aims at controlling the exponential complexity by limiting the length of the timelines we consider. The second extension deals with applying our propositional algorithm to relational data, as is necessary for the application domain of visual event recognition. Finally, we show how to gain the practical advantages of negation despite the fact that AMA does not include negation; this turns out to be crucial to achieving good performance in our experiments.

k-AMA Least-General Generalization
We have already indicated that our syntactic AMA LGG algorithm takes exponential time relative to the lengths of the timelines in the AMA input formulas. This motivates restricting the AMA language to k-AMA in practice, where formulas contain timelines with no more than k states. As k is increased, the algorithm is able to output increasingly specific formulas at the cost of an exponential increase in computation time.
In the visual-event-recognition experiments shown later, as we increased k, the resulting formulas became overly specific before a computational bottleneck was reached; i.e., for that application the best values of k were practically computable, and the ability to limit k provided a useful language bias. We use a k-cover operator in order to limit our syntactic LGG algorithm to k-AMA. A k-cover of an AMA formula is a syntactically least-general k-AMA formula that syntactically subsumes the input; it is easy to show that a k-cover for a formula can be formed by conjoining all k-MA timelines that syntactically subsume the formula (i.e., that subsume any timeline in the formula). Figure 8 gives pseudo-code for computing the k-cover of an AMA formula, and it can be shown that this algorithm correctly computes a k-cover for any input AMA formula. The algorithm calculates the set of least-general k-MA timelines that subsume each timeline in the input; the resulting k-MA timelines are conjoined, and "redundant" timelines are pruned using a subsumption test. We note that the k-cover of an AMA formula may itself be exponentially larger than that formula; however, in practice, we have found k-covers not to exhibit undue size growth.
Given the k-cover algorithm, we restrict our learner to k-AMA as follows: 1) compute the k-cover of each AMA input formula; 2) compute the syntactic AMA LGG of the resulting k-AMA formulas; 3) return the k-cover of the resulting AMA formula. The primary bottleneck of the original syntactic LGG algorithm is computing the exponentially large set of interdigitation-generalizations; the k-limited algorithm limits this complexity because it only computes interdigitation-generalizations involving k-MA timelines.
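One plausible rendering of the k-cover computation for a single timeline is sketched below. It is our own sketch, based on viewing MA-subsumption witnesses as monotone grid paths, not the paper's Figure 8 pseudo-code verbatim: each candidate intersects the timeline's states over a monotone grouping into k (boundary-sharing) blocks, and only the most specific candidates are kept.

```python
def paths(n, m):
    """Monotone grid paths from (0, 0) to (n-1, m-1)."""
    if n == 1 and m == 1:
        return [[(0, 0)]]
    out = []
    for di, dj in ((1, 0), (0, 1), (1, 1)):
        if n - di >= 1 and m - dj >= 1:
            for p in paths(n - di, m - dj):
                out.append(p + [(n - 1, m - 1)])
    return out

def subsumes(general, specific):
    """MA subsumption via interdigitations (specific <= general)."""
    return any(all(specific[i] >= general[j] for i, j in p)
               for p in paths(len(specific), len(general)))

def k_cover_timeline(t, k):
    """Least-general k-state timelines subsuming timeline t: for each
    monotone grouping of t's states into k blocks, intersect the states
    assigned to each block, then keep the most specific candidates."""
    cands = set()
    for p in paths(len(t), k):
        cands.add(tuple(
            frozenset.intersection(*[t[i] for i, j in p if j == col])
            for col in range(k)))
    return {c for c in cands
            if not any(subsumes(c, d) and not subsumes(d, c) for d in cands)}

A, B = frozenset("A"), frozenset("B")
# The 3-state timeline A;(A^B);B collapses to the 2-state timeline A;B.
assert k_cover_timeline((A, A | B, B), 2) == {(A, B)}
```

The k-cover of a full AMA formula would then conjoin these candidate timelines over all of its timelines and prune the redundant ones with a subsumption test, as the text describes.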

Relational Data
LEONARD produces relational models that involve objects and (force-dynamic) relations between those objects. Thus event definitions include variables to allow generalization over objects. For example, a definition for PICKUP(x, y, z) recognizes both PICKUP(hand, block, table) and PICKUP(man, dumbbell, floor).
Despite the fact that our k-AMA learning algorithm is propositional, we are still able to use it to learn relational definitions.
We take a straightforward object-correspondence approach to relational learning. We view the models output by LEONARD as containing relations applied to constants. Since we (currently) support only supervised learning, we have a set of distinct training examples for each event type. There is an implicit correspondence between the objects filling the same role across the different training models for a given type. For example, models showing PICKUP(hand, block, table) and PICKUP(man, dumbbell, floor) have the implicit correspondences hand ↔ man, block ↔ dumbbell, and table ↔ floor. We outline two relational learning methods that differ in how much object-correspondence information they require as part of the training data.

COMPLETE OBJECT CORRESPONDENCE
This first approach assumes that a complete object correspondence is given as input along with the training examples. Given such information, we can propositionalize the training models by replacing corresponding objects with unique constants. The propositionalized models are then given to our propositional k-AMA learning algorithm, which returns a propositional k-AMA formula. We then lift this propositional formula by replacing each constant with a distinct variable. Lavrac et al. (1991) have taken a similar approach.
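A minimal sketch of these two steps, under the assumption that models are lists of states and ground atoms are tuples (the names and the encoding here are ours, not LEONARD's actual output format):

```python
def propositionalize(model, corr):
    """Rename each object via the correspondence map `corr`
    (object -> shared canonical constant)."""
    return [{(pred,) + tuple(corr[a] for a in args)
             for (pred, *args) in state}
            for state in model]

def lift(timeline):
    """Replace each canonical constant with a distinct variable."""
    table = {}
    def var(c):
        return table.setdefault(c, "?x%d" % len(table))
    return [{(pred,) + tuple(var(a) for a in args)
             for (pred, *args) in state}
            for state in timeline]

pickup = [{("SUPPORTS", "table", "block"), ("CONTACTS", "table", "block")},
          {("SUPPORTS", "hand", "block"), ("ATTACHED", "hand", "block")}]
corr = {"hand": "c1", "block": "c2", "table": "c3"}

prop_model = propositionalize(pickup, corr)
assert ("SUPPORTS", "c1", "c2") in prop_model[1]

lifted = lift(prop_model)
vars_used = {a for state in lifted for atom in state for a in atom[1:]}
assert len(vars_used) == 3 and all(v.startswith("?x") for v in vars_used)
```

With a shared `corr` across training examples, the propositionalized models can be fed directly to the propositional learner, and the learned formula lifted afterwards.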

PARTIAL OBJECT CORRESPONDENCE
The above approach assumes complete object-correspondence information. While it is sometimes possible to provide all correspondences (for example, by color-coding objects that fill identical roles when recording training movies), such information is not always available. When only a partial object correspondence (or even none at all) is available, we can automatically complete the correspondence and apply the above technique.
For the moment, assume that we have an evaluation function that takes two relational models and a candidate object correspondence as input and yields an evaluation of correspondence quality. Given a set of training examples with missing object correspondences, we perform a greedy search for the best set of object-correspondence completions over the models. Our method works by storing a set P of propositionalized training examples (initially empty) and a set U of unpropositionalized training examples (initially the entire training set). For the first step, when P is empty, we evaluate all pairs of examples from U, under all possible correspondences, select the pair that yields the highest score, remove the examples involved in that pair from U, propositionalize them according to the best correspondence, and add them to P. For each subsequent step, we use the previously computed values of all pairs of examples, one from U and one from P, under all possible correspondences. We then select the example from U and the correspondence that yield the highest average score relative to all models in P; this example is removed from U, propositionalized according to the winning correspondence, and added to P. For a fixed number of objects, the effort expended here is polynomial in the size of the training set; however, if the number of objects that appear in a training example is allowed to grow, the number of correspondences that must be considered grows factorially in that number. For this reason, it is important that the events involved manipulate only a modest number of objects.
Our evaluation function is based on the intuition that object roles for visual events (as well as for events from other domains) can often be inferred by considering the changes between the initial and final moments of an event. Specifically, given two models and an object correspondence, we first propositionalize the models according to the correspondence. Next, we compute ADD and DELETE lists for each model. The ADD list is the set of propositions that are true at the final moment but not the initial moment; the DELETE list is the set of propositions that are true at the initial moment but not the final moment. (These ADD and DELETE lists are motivated by STRIPS action representations; Fikes & Nilsson, 1971.) Given such ADD and DELETE lists for models M1 and M2, the evaluation function returns the sum of the cardinalities of ADD1 ∩ ADD2 and DELETE1 ∩ DELETE2. This heuristic measures the similarity between the ADD and DELETE lists of the two models. The intuition behind this heuristic is similar to the intuition behind the STRIPS action-description language, i.e., that most of the differences between the initial and final moments of an event occurrence are related to the target event and that event effects can be described by ADD and DELETE lists. We have found that this evaluation function works well in the visual-event domain.
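Under these definitions, the scoring function is straightforward to sketch. The function names are ours, and a brute-force search over bijections stands in for the paper's greedy, cached search; as the text notes, the number of bijections grows factorially, so this is only viable for a handful of objects.

```python
from itertools import permutations

def add_delete_lists(model):
    """ADD: true at the final moment but not the initial; DELETE: the reverse."""
    first, last = model[0], model[-1]
    return last - first, first - last

def score(m1, m2, corr):
    """|ADD1 & ADD2| + |DELETE1 & DELETE2| after renaming m2's objects
    into m1's via the candidate correspondence."""
    renamed = [{(p,) + tuple(corr.get(a, a) for a in args)
                for (p, *args) in state} for state in m2]
    a1, d1 = add_delete_lists(m1)
    a2, d2 = add_delete_lists(renamed)
    return len(a1 & a2) + len(d1 & d2)

def best_correspondence(m1, objs1, m2, objs2):
    """Exhaustively score every bijection objs2 -> objs1."""
    return max((dict(zip(objs2, perm)) for perm in permutations(objs1)),
               key=lambda corr: score(m1, m2, corr))

m1 = [{("SUPPORTS", "table", "block"), ("CONTACTS", "table", "block")},
      {("SUPPORTS", "hand", "block"), ("ATTACHED", "hand", "block")}]
m2 = [{("SUPPORTS", "floor", "dumbbell"), ("CONTACTS", "floor", "dumbbell")},
      {("SUPPORTS", "man", "dumbbell"), ("ATTACHED", "man", "dumbbell")}]

best = best_correspondence(m1, ["hand", "block", "table"],
                           m2, ["man", "dumbbell", "floor"])
assert best == {"man": "hand", "dumbbell": "block", "floor": "table"}
```

On this toy pair of pick-up models, only the role-preserving bijection makes the renamed ADD and DELETE lists coincide, so it uniquely maximizes the score.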
Note that, when full object correspondences are given to the learner (rather than automatically extracted by it), the training examples are interpreted as specifying that the target event took place as well as which objects filled the various event roles (e.g., PICKUP(a, b, c)). In contrast, when no object correspondences are provided, the training examples are interpreted as specifying the occurrence of a target event but not which objects fill the roles (i.e., the training example is labeled by PICKUP rather than PICKUP(a, b, c)). Accordingly, the rules learned when no correspondences are provided only allow us to infer that a target event occurred, not which objects filled the event roles. For example, when object correspondences are manually provided, the learner might produce the rule

PICKUP(x, y, z) := (SUPPORTS(z, y) ∧ CONTACTS(z, y)); (SUPPORTS(x, y) ∧ ATTACHED(x, y)),

whereas a learner that automatically extracts the correspondences would instead produce the rule

PICKUP := (SUPPORTS(z, y) ∧ CONTACTS(z, y)); (SUPPORTS(x, y) ∧ ATTACHED(x, y)).

It is worth noting, however, that upon producing the second rule, the availability of a single training example with correspondence information allows the learner to determine the roles of the variables, upon which it can output the first rule. Thus, under the assumption that the learner can reliably extract object correspondences, we need not label all training examples with correspondence information in order to obtain definitions that explicitly recognize object roles.

Negative Information
The AMA language does not allow negated propositions. Negation, however, is sometimes necessary to adequately define an event type. It turns out that we can easily get the practical advantages of negation without incorporating negation into the AMA language. We do this by adding to our models new propositions that intuitively represent the negations of the original propositions. Assume the training examples contain the propositions p1, ..., pn. We introduce a new set ¬p1, ..., ¬pn of propositions and add these to the training models. It is a design choice how we assign truth values to these new propositions.
In our experiments, we compare two methods for assigning a truth value to ¬pi. The first method, called full negation, assigns true to ¬pi in a model iff pi is false in the model. The second method, called boundary negation, differs from full negation in that it only allows ¬pi to be true in the initial and final moments of a model (and then only if pi is false); ¬pi must be false at all other times. We have found that boundary negation provides a good trade-off between no negation, which often produces overly general results, and full negation, which often produces overly specific and much more complicated results. Both methods produce models in which pi and ¬pi are never simultaneously true. It follows that our learning methods will never produce formulas with states that contain both pi and ¬pi.
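A sketch of how boundary negation can be added to a training model, representing states as sets of proposition names and the fresh negative propositions as ("not", p) pairs (our encoding, not the paper's):

```python
def with_boundary_negation(model, props):
    """Add ("not", p) only at the initial and final moments, and only
    where p is false there (boundary negation)."""
    out = []
    for i, state in enumerate(model):
        state = set(state)
        if i == 0 or i == len(model) - 1:
            state |= {("not", p) for p in props if p not in state}
        out.append(state)
    return out

props = {"supports", "contacts", "attached"}
model = [{"supports"}, {"supports", "contacts"}, {"attached"}]
bn = with_boundary_negation(model, props)
assert ("not", "contacts") in bn[0] and ("not", "attached") in bn[0]
assert ("not", "supports") in bn[2]
assert not any(isinstance(x, tuple) for x in bn[1])  # no negations mid-model
```

Full negation would instead add ("not", p) at every moment where p is false; by construction, neither variant ever makes p and ("not", p) simultaneously true.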

Data Set
Our data set contains examples of seven event classes: pick up, put down, stack, unstack, move, assemble, and disassemble. Each of these involves a hand and two to three blocks. For a detailed description and sample video sequences of these event types, see Siskind (2001). Key frames from sample video sequences of these event classes are shown in Figure 9, with the results of segmentation, tracking, and model reconstruction overlayed on the video frames. We recorded 30 movies for each of the seven event classes, resulting in a total of 210 movies comprising 11,946 frames. We replaced one assemble movie with a duplicate copy of another because of segmentation and tracking errors. Some of the event classes are hierarchical in that occurrences of events in one class contain occurrences of events in one or more simpler classes. For example, a movie depicting a MOVE(a, b, c, d) event (i.e., a moves b from c to d) contains subintervals where PICKUP(a, b, c) and PUTDOWN(a, b, d) events occur. In our experiments, when learning the definition of an event class, only the movies for that event class are used in training; we do not train on movies for other event classes that may also depict an occurrence of the event class being learned as a sub-event. However, in evaluating the learned definitions, we wish to detect both the events that correspond to an entire movie and the sub-events that correspond to portions of that movie. For example, a definition that detects two spurious pick up events in a movie depicting MOVE(a, b, c, d), but misses the actual PICKUP(a, b, c), is charged two false positives as well as one false negative. We evaluate our definitions in terms of false-positive and false-negative rates, as described below.

Experimental Procedure
For each event type, we evaluate the k-AMA learning algorithm using a leave-one-movie-out cross-validation technique with training-set sampling. The parameters to our learning algorithm are k and the degree D of negative information used: the value of D is either P, for "positive propositions only", BN, for "boundary negation", or N, for "full negation". The parameters to our evaluation procedure include the target event type E and the training-set size N. Given this information, the evaluation proceeds as follows: for each movie M (the held-out movie) from the 210 movies, apply the k-AMA learning algorithm to a randomly drawn training sample of N movies from the 30 movies of event type E (or from the remaining 29 movies if M is one of those 30).
Use LEONARD to detect all occurrences of the learned event definition in M. Based on E and the event type of M, record the number of false positives and false negatives in M, as detected by LEONARD. Let FP and FN be the total numbers of false positives and false negatives, respectively, observed over all 210 held-out movies. Repeat the entire process of calculating FP and FN 10 times and record the averages, again denoted FP and FN.
Since some event types occur more frequently in our data than others (because simpler events occur as sub-events of more complex events but not vice versa), we do not report FP and FN directly. Instead, we normalize FP by dividing by the total number of times LEONARD detected the target event (correctly or incorrectly) within all 210 movies, and we normalize FN by dividing by the total number of correct occurrences of the target event within all 210 movies (i.e., the human assessment of the number of occurrences of the target event). The normalized value of FP estimates the probability that the target event did not occur given that it was predicted to occur, while the normalized value of FN estimates the probability that the event was not predicted to occur given that it did occur.
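The two normalizations amount to choosing different denominators; as a trivial sketch (the variable names are ours):

```python
def normalized_rates(fp, fn, total_detections, total_actual):
    """FP / detections estimates P(no event | event predicted);
    FN / actual occurrences estimates P(not predicted | event occurred)."""
    return fp / total_detections, fn / total_actual

# e.g., 3 false positives among 30 detections, 2 misses among 40 occurrences
assert normalized_rates(3, 2, 30, 40) == (0.1, 0.05)
```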

Results
To evaluate our k-AMA learning approach, we ran leave-one-movie-out experiments, as described above, for varying k, D, and N. The 210 example movies were recorded with color-coded objects to provide complete object-correspondence information. We compared our learned event definitions to the performance of two sets of hand-coded definitions. The first set, HD1, of hand-coded definitions appeared in Siskind (2001). In response to subsequent deeper understanding of the behavior of LEONARD's model-reconstruction methods, we manually revised these definitions to yield another set, HD2, of hand-coded definitions that gives significantly better FN performance at some cost in FP performance. Appendix C gives the event definitions in HD1 and HD2, along with a set of machine-generated definitions produced by the k-AMA learning algorithm given all training data, for k = 3 and D = BN.

OBJECT CORRESPONDENCE
To evaluate our algorithm for finding object correspondences, we ignored the correspondence information provided by color coding and applied the algorithm to all training models for each event type. The algorithm selected the correct correspondence for all 210 training models. Thus, for this data set, the learning results when no correspondence information is given are identical to those where the correspondences are manually provided, except that in the first case the rules will not specify particular object roles (as discussed in Section 6.2.2). Since our evaluation procedure uses role information, the rest of our experiments use the manual correspondence information provided by color coding, rather than computing it.

Table 1: FP and FN for learned definitions, varying both k and D, and for hand-coded definitions.

VARYING k
The first three rows of Table 1 show the FP and FN values for all event types for k ∈ {2, 3, 4}, N = 29 (the maximum), and D = BN. Similar trends were found for D = P and D = N. The general trend is that, as k increases, FP decreases or remains the same and FN increases or remains the same. Such a trend is a consequence of our k-cover approach: as k increases, the k-AMA language contains strictly more formulas, so, for k1 ≤ k2, the k2-cover of a formula is never more general than the k1-cover. This strongly suggests, but does not prove, that FP will be non-increasing with k and FN will be non-decreasing with k.
Our results show that 2-AMA is overly general for put down and assemble, i.e., it gives high FP. In contrast, 3-AMA achieves FP = 0 for each event type but pays a penalty in FN compared to 2-AMA. Since 3-AMA achieves FP = 0, there is likely no advantage in moving to k-AMA for k > 3; i.e., the expected result is for FN to become larger. This effect is demonstrated for 4-AMA in the table.

VARYING D
Rows four through six of Table 1 show FP and FN for all event types for D ∈ {P, BN, N}, N = 29, and k = 3. Similar trends were observed for other values of k. The general trend is that, as the degree of negative information increases, the learned event definitions become more specific; in other words, FP decreases and FN increases. This makes sense since, as more negative information is added to the training models, more specific structure can be found in the data and exploited by the k-AMA formulas. We can see that, with D = P, the definitions for pick up and put down are overly general, as they produce high FP. Alternatively, with D = N, the learned definitions are overly specific, giving FP = 0 at the cost of high FN. In these experiments, as well as others, we have found that D = BN yields the best of both worlds: FP = 0 for all event types and lower FN than achieved with D = N. Experiments not shown here have demonstrated that, without negation, for pick up and put down we can increase k arbitrarily, in an attempt to specialize the learned definitions, and never significantly reduce FP. This indicates that negative information plays a particularly important role in constructing definitions for these event types.

COMPARISON TO HAND-CODED DEFINITIONS
The bottom two rows of Table 1 show the results for HD1 and HD2. We have not yet attempted to automatically select the parameters for learning (i.e., k and D). Rather, here we focus on comparing the hand-coded definitions to the parameter set that we judged to perform best across all event types. We believe, however, that these parameters could be selected reliably using cross-validation techniques on a larger data set. In that case, the parameters would be selected on a per-event-type basis and would likely result in an even more favorable comparison to the hand-coded definitions.
The results show that the learned definitions significantly outperform HD1 on the current data set. The HD1 definitions were found to produce a large number of false negatives on the current data set, and manual revision of HD1 yielded HD2. Notice that, although HD2 produces significantly fewer false negatives for all event types, it produces more false positives for pick up and put down. This is because the hand-coded definitions utilize pick up and put down as macros for defining the other events.
The performance of the learned definitions is competitive with the performance of HD2. The main differences in performance are: (a) for pick up and put down, the learned and HD2 definitions achieve nearly the same FN, but the learned definitions achieve FP = 0 whereas HD2 has significant FP; (b) for unstack and disassemble, the learned definitions perform moderately worse than HD2 with respect to FN; and (c) the learned definitions perform significantly better than HD2 on assemble events.
We conjecture that further manual revision could improve HD2 to perform as well as (and perhaps better than) the learned definitions for every event class. Nonetheless, we view this experiment as promising, as it demonstrates that our learning technique is able to compete with, and sometimes outperform, significant hand-coding efforts by a domain expert.

VARYING N
It is of practical interest to know how training-set size affects our algorithm's performance. For this application, it is important that our method work well with fairly small data sets, as it can be tedious to collect event data. Table 2 shows the FN of our learning algorithm for each event type as N is reduced from its maximum of 29. For these experiments, we used k = 3 and D = BN. Note that FP = 0 for all event types and all N and hence is not shown. We expect FN to increase as N is decreased, since, with specific-to-general learning, more data yields more-general definitions. Generally, FN is flat for N ≥ 20, increases slowly for 10 ≤ N < 20, and increases abruptly for N < 10. We also see that, for several event types, FN decreases slowly as N is increased from 20 to 29. This indicates that a larger data set might yield improved results for those event types.

Related Work
Here we discuss two bodies of related work. First, we present previous work in visual event recognition and how it relates to our experiments here. Second, we discuss previous approaches to learning temporal patterns from positive data.

Table 2: FN for learned definitions with k = 3, D = BN, and various values of N.

Visual Event Recognition
Prior work has investigated various subsets of the pieces of learning and using temporal, relational, and force-dynamic representations for recognizing events in video, but none, to date, combines all of the pieces.
The following is a representative list and is not meant to be comprehensive. Borchardt (1985) presents temporal, relational, force-dynamic event definitions, but these definitions are neither learned nor applied to video. Regier (1992) presents techniques for learning temporal event definitions, but the learned definitions are neither relational, force dynamic, nor applied to video. Yamoto, Ohya, and Ishii (1992), Brand and Essa (1995), Brand, Oliver, and Pentland (1997), and Bobick and Ivanov (1998) present techniques for learning temporal event definitions from video, but the learned definitions are neither relational nor force dynamic. Pinhanez and Bobick (1995) and Brand (1997a) present temporal, relational event definitions that recognize events in video, but these definitions are neither learned nor force dynamic. Brand (1997b) and Mann and Jepson (1998) present techniques for analyzing force dynamics in video but neither formulate event definitions nor apply these techniques to recognizing events or learning event definitions.

Learning Temporal Patterns
We divide this body of work into three main categories: temporal data mining, inductive logic programming, and finite-state machine induction.
Temporal Data Mining. The sequence-mining literature contains many general-to-specific ("level-wise") algorithms for finding frequent sequences (Agrawal & Srikant, 1995; Mannila, Toivonen, & Verkamo, 1995; Kam & Fu, 2000; Cohen, 2001; Hoppner, 2001). Here we explore a specific-to-general approach. In this previous work, researchers have studied the problem of mining temporal patterns using languages that are interpreted as placing constraints on partially or totally ordered sets of time-points, e.g., sequential patterns (Agrawal & Srikant, 1995) and episodes (Mannila et al., 1995). These languages place constraints on time-points rather than on time-intervals, as in our work here. More recently, there has been work on mining temporal patterns using interval-based pattern languages (Kam & Fu, 2000; Cohen, 2001; Hoppner, 2001).
Though the languages and learning frameworks vary among these approaches, they share two central features that distinguish them from our approach. First, they typically have the goal of finding all "frequent" patterns (formulas) within a temporal data set; our approach is focused on finding patterns with a frequency of one (covering all positive examples). Our first learning application of visual event recognition has not yet required us to find patterns with frequency less than one; however, there are a number of ways in which we can extend our method in that direction when it becomes necessary (e.g., to deal with noisy training data). Second, these approaches all use standard general-to-specific "level-wise" search techniques, whereas we chose to take a specific-to-general approach. One direction for future work is to develop a general-to-specific "level-wise" algorithm for finding frequent MA formulas and to compare it with our specific-to-general approach. Another direction is to design a "level-wise" version of our specific-to-general algorithm, where, for example, the results obtained for the k-AMA LGG can be used to more efficiently calculate the (k+1)-AMA LGG. Whereas a "level-wise" approach is conceptually straightforward in a general-to-specific framework, it is not so clear in the specific-to-general case. We are not familiar with other temporal data-mining systems that take a specific-to-general approach.
First-Order Learning. In Section 4.3, we pointed out difficulties in using existing first-order clausal generalization techniques for learning AMA formulas. In spite of these difficulties, it is still possible to represent temporal events in first-order logic (either with or without capturing the AMA semantics precisely) and to apply general-purpose relational learning techniques, e.g., inductive logic programming (ILP) (Muggleton & De Raedt, 1994). Most ILP systems require both positive and negative training examples and hence are not suitable for our current positive-only framework; exceptions include Golem (Muggleton & Feng, 1992), Claudien (De Raedt & Dehaspe, 1997), and Progol (Muggleton, 1995), among others. While we have not performed a full evaluation of these systems, our early experiments in the visual-event-recognition domain confirmed our belief that Horn clauses, lacking special handling of time, give a poor inductive bias. In particular, many of the learned clauses find patterns that simply "do not make sense" from a temporal perspective and, in turn, generalize poorly. We believe a reasonable alternative to our approach may be to incorporate syntactic biases into ILP systems, as done, for example, by Cohen (1994), Dehaspe and De Raedt (1996), and Klingspor, Morik, and Rieger (1996). In this work, however, we chose to work directly in a temporal-logic representation.

Finite-State Machines
Finally, we note that there has been much theoretical and empirical research into learning finite-state machines (FSMs) (Angluin, 1987; Lang, Pearlmutter, & Price, 1998). We can view FSMs as describing properties of strings (symbol sequences). In our case, however, we are interested in describing sequences of propositional models rather than just sequences of symbols. This suggests learning a type of "factored" FSM whose arcs are labeled by sets of propositions rather than by single symbols. Factored FSMs may be a natural direction in which to extend the expressiveness of our current language (for example, by allowing repetition). We are not aware of work concerned with learning factored FSMs; however, it is likely that inspiration can be drawn from symbol-based FSM-learning algorithms.

Conclusion
We have presented a simple logic for representing temporal events called AMA and have shown theoretical and empirical results for learning AMA formulas. Empirically, we have given the first system for learning temporal, relational, force-dynamic event definitions from positive-only input, and we have applied that system to learn such definitions from real video input. The resulting performance matches that of event definitions that are hand-coded with substantial effort by human domain experts. On the theoretical side, Table 3 summarizes the upper and lower bounds we have shown for the subsumption and generalization problems associated with this logic. In each case, we have provided a provably correct algorithm matching the upper bound shown. The table also shows the worst-case size that the smallest LGG could possibly take relative to the input size, for both AMA and MA inputs. The key results in this table are the polynomial-time MA subsumption and AMA syntactic subsumption, the coNP lower bound for AMA subsumption, the exponential size of LGGs in the worst case, and the apparently lower complexity of syntactic AMA LGG versus semantic LGG. We described how to build a learner based on these results and applied it to the visual-event-learning domain. To date, however, the definitions we learn are neither cross-modal nor perspicuous. And while the performance of the learned definitions matches that of hand-coded ones, we wish to surpass hand coding. In the future, we intend to address cross-modality by applying our learning technique to the planning domain. We also believe that addressing perspicuity will lead to improved performance.
The Thirteen Allen Relations (adapted to our semantics).
IPEL is a fragment of full propositional event logic that can only describe positive internal events. We conjecture, but have not yet proven, that all positive internal events representable in the full event logic of Siskind (2001) can be represented by some IPEL formula. In the syntax of IPEL formulas, the φi are IPEL formulas, prop is a primitive proposition (sometimes called a primitive event type), R is a subset of the thirteen Allen interval relations (Allen, 1983), and R′ is a subset of the restricted set of Allen relations {s, f, d, =} (the semantics of each Allen relation is given below). The difference between IPEL syntax and that of full propositional event logic is that event logic allows for a negation operator and that, in full event logic, R′ can be any subset of all thirteen Allen relations.
The operators ∧ and ; used to define AMA formulas are merely abbreviations for the IPEL operators corresponding to the Allen relations = and m, respectively, so AMA is a subset of IPEL (though a distinguished subset, as indicated by Proposition 4).
Each of the thirteen Allen interval relations is a binary relation on the set of closed natural-number intervals.
• ⋄Rφ is satisfied by a model M[I] iff, for some r ∈ R, there is an interval I′ such that I′ r I and M[I′] satisfies φ.
• φ1 Rφ2 is satisfied by a model M[I] iff, for some r ∈ R, there exist intervals I1 and I2 such that I1 r I2, SPAN(I1, I2) = I, M[I1] satisfies φ1, and M[I2] satisfies φ2. Here prop is a primitive proposition, φ, φ1, and φ2 are IPEL formulas, R is a set of Allen relations, and SPAN(I1, I2) is the minimal interval that contains both I1 and I2. From this definition, it is easy to show, by induction on the number of operators and connectives in a formula, that all IPEL formulas define internal events. One can also verify that the definition of satisfiability given earlier for AMA formulas corresponds to the one given here.
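For concreteness, the thirteen relations and SPAN can be written out as predicates on closed intervals. The endpoint definitions below follow the standard ones (Allen, 1983), adapted so that, e.g., meets shares an endpoint, which is one plausible reading of the closed-interval semantics used here:

```python
# Closed natural-number intervals as (lo, hi) pairs with lo <= hi.

def span(i1, i2):
    """SPAN: the minimal interval containing both arguments."""
    return (min(i1[0], i2[0]), max(i1[1], i2[1]))

ALLEN = {
    "b": lambda a, b: a[1] < b[0],                    # before
    "m": lambda a, b: a[1] == b[0],                   # meets (shared endpoint)
    "o": lambda a, b: a[0] < b[0] < a[1] < b[1],      # overlaps
    "s": lambda a, b: a[0] == b[0] and a[1] < b[1],   # starts
    "d": lambda a, b: b[0] < a[0] and a[1] < b[1],    # during
    "f": lambda a, b: b[0] < a[0] and a[1] == b[1],   # finishes
    "=": lambda a, b: a == b,                         # equals
}
for name, rel in list(ALLEN.items()):                 # the six inverses
    if name != "=":
        ALLEN[name + "i"] = (lambda r: lambda a, b: r(b, a))(rel)

assert len(ALLEN) == 13
assert ALLEN["m"]((0, 2), (2, 5)) and ALLEN["mi"]((2, 5), (0, 2))
assert ALLEN["d"]((2, 3), (1, 5)) and ALLEN["s"]((1, 2), (1, 5))
assert span((0, 2), (4, 7)) == (0, 7)
assert sum(r((1, 3), (1, 3)) for r in ALLEN.values()) == 1  # only "="
```

The final assertion illustrates that, under these definitions, any pair of intervals stands in exactly one of the thirteen relations.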

V(I′) = I.
We sketch the proofs of these properties. (1) Use induction on the length of I′₁, with the definition of interdigitation. (2) Since V(I′₁) is an interval, MAP(M[V(I′₁)]) is well defined. MAP(M[V(I′₁)]) ⊑ MAP(M′[I′₁]) follows from the assumption that M embeds M′. (3) From Appendix A, we see that all Allen relations are defined in terms of the ≤ relation on the natural-number endpoints of the intervals. We can show that V preserves ≤ (but not <) on singleton sets (i.e., every member of V({x′}) is ≤ every member of V({y′}) when x′ ≤ y′) and that V commutes with set union. It follows that V preserves the Allen interval relations. (4) Use the fact that V preserves ≤ in the sense just argued, along with the fact that SPAN(I′₁, I′₂) depends only on the minimum and maximum numbers in I′₁ and I′₂. (5) Follows from the definition of interdigitation and the construction of V.
We now use induction on the number of operators and connectives in E to prove that, if ⟨M′, I′⟩ satisfies E, then so must ⟨M, I⟩. The base case is when E = prop, where prop is a primitive proposition, or E = true. Since ⟨M′, I′⟩ satisfies E, we know that prop is true in all M′[x′] for x′ ∈ I′. Since W witnesses Φ ⊑ Φ′, we know that, if prop is true in M′[x′], then prop is true in all M[x], where x ∈ V({x′}). Therefore, since V(I′) = I, prop is true for all M[x], where x ∈ I; hence ⟨M, I⟩ satisfies E.
For the inductive case, assume that the claim holds for IPEL formulas with fewer than N operators and connectives, and let E₁ and E₂ be two such formulas. When E = E₁ ∨ E₂, the claim trivially holds. When E = ◇_R E₁, R must be a subset of the set of relations {s, f, d, =}. Notice that E can be written as a disjunction of the formulas ◇_r E₁, one for each single Allen relation r in R. Thus it suffices to handle the case where R is a single Allen relation. Suppose E = ◇_s E₁. Since ⟨M′, I′⟩ satisfies E, there must be a sub-interval I′₁ with I′₁ s I′ such that ⟨M′, I′₁⟩ satisfies E₁. Letting I₁ = V(I′₁), we know from the properties of V that V(I′) = I and, hence, that I₁ s I. Furthermore, we know that ⟨M, I₁⟩ embeds ⟨M′, I′₁⟩ and, thus, by the inductive hypothesis, ⟨M, I₁⟩ satisfies E₁. Combining these facts, we get that E is satisfied by ⟨M, I⟩. Similar arguments hold for the remaining three Allen relations. Finally, consider the case when E = E₁ ∧_R E₂, where R can be any set of Allen relations. Again, it suffices to handle the case when R is a single Allen relation r. Since ⟨M′, I′⟩ satisfies E₁ ∧_r E₂, we know that there are sub-intervals I′₁ and I′₂ such that I′₁ r I′₂, SPAN(I′₁, I′₂) = I′, and ⟨M′, I′ᵢ⟩ satisfies Eᵢ. From these facts and the properties of V, it is easy to verify that ⟨M, I⟩ satisfies E. □

Lemma 5: Given an MA formula Φ that subsumes each member of a set Σ of MA formulas, Φ also subsumes some member Φ′ of IG(Σ). Dually, when Φ is subsumed by each member of Σ, we have that Φ is also subsumed by some member Φ′ of IS(Σ). In each case, the length of Φ′ can be bounded by the size of Σ.
Proof: We prove the result for IG(Σ); the proof for IS(Σ) follows similar lines. Let Σ = {Φ₁, …, Φₙ}, let Φ = s₁; ⋯; sₘ, and assume that, for each 1 ≤ i ≤ n, Φᵢ ⊑ Φ. From Proposition 2, for each i there is a witnessing interdigitation Wᵢ for Φᵢ ⊑ Φ. We will combine the Wᵢ into an interdigitation of Σ and show that the corresponding member of IG(Σ) is subsumed by Φ.
To construct an interdigitation of Σ, first notice that, for each sⱼ, each Wᵢ specifies a set of states (possibly a single state, but at least one) from Φᵢ that all co-occur with sⱼ. Furthermore, since Wᵢ is an interdigitation, it is easy to show that this set of states corresponds to a consecutive subsequence of states from Φᵢ; let Φᵢ,ⱼ be the MA timeline corresponding to this subsequence. Now let Σⱼ = {Φ₁,ⱼ, …, Φₙ,ⱼ}, and let αⱼ be any interdigitation of Σⱼ. We now take I to be the union of all the αⱼ, for 1 ≤ j ≤ m. We show that I is an interdigitation of Σ. Since each state s appearing in Σ must co-occur with at least one state sⱼ of Φ in at least one Wᵢ, s will be in at least one tuple of some αⱼ and, hence, in some tuple of I, so I is piecewise total. Now define the restriction of I to components i and i′, with i ≠ i′, to be the relation given by taking the set of all pairs formed by shortening tuples of I by omitting all components except the i'th and the i′'th. Likewise define the restriction of each αⱼ.
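As a concrete, simplified illustration of the constructions above: an interdigitation of two timelines can be enumerated as a monotone "staircase" pairing of their state indices, and each pairing yields one candidate member of IG by intersecting co-occurring states. The `interdigitations` and `ig` functions below are our own sketch, not the paper's implementation.

```python
# A minimal sketch (our reconstruction, not the paper's code): an
# interdigitation of two timelines is a monotone pairing of their
# states -- every state occurs in some pair (piecewise total) and the
# pairing never moves backward in either timeline (simultaneous
# consistency).  From each interdigitation, IG intersects co-occurring
# states (a generalization); IS would instead take unions.

def interdigitations(m, n):
    """Yield each interdigitation of an m-state and an n-state timeline
    as a list of (i, j) index pairs of co-occurring states."""
    def extend(path):
        i, j = path[-1]
        if (i, j) == (m - 1, n - 1):
            yield path
            return
        for di, dj in ((1, 0), (0, 1), (1, 1)):
            if i + di < m and j + dj < n:
                yield from extend(path + [(i + di, j + dj)])
    yield from extend([(0, 0)])

def ig(t1, t2):
    """Candidate generalizations: one timeline per interdigitation,
    intersecting co-occurring states (states are frozensets)."""
    return [[t1[i] & t2[j] for i, j in path]
            for path in interdigitations(len(t1), len(t2))]

t1 = [frozenset("ab"), frozenset("b")]
t2 = [frozenset("a"), frozenset("bc")]
for timeline in ig(t1, t2):
    print([sorted(s) for s in timeline])
```

For two 2-state timelines there are exactly three interdigitations, and each produces one candidate IG timeline; the paper's semantic LGG then selects among such candidates.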
Note that there is a correspondence between vertices and state tuples, with vertex v_{i,j} corresponding to ⟨sᵢ, tⱼ⟩.
For the forward direction, assume that W is a witnessing interdigitation for Φ₁ ⊑ Φ₂. We know that, if the states sᵢ and tⱼ co-occur in W, then sᵢ ⊇ tⱼ, since W witnesses Φ₁ ⊑ Φ₂. The vertices corresponding to the tuples of W will be called co-occurrence vertices; they satisfy the first condition for belonging to some edge (that sᵢ ⊇ tⱼ). It follows from the definition of interdigitation that v_{1,1} and v_{m,n} are both co-occurrence vertices. Consider a co-occurrence vertex v_{i,j} not equal to v_{m,n}, and the lexicographically least co-occurrence vertex v_{i′,j′} after v_{i,j} (ordering vertices by ordering the pair of subscripts). We show that i, j, i′, and j′ satisfy the requirements for an edge from v_{i,j} to v_{i′,j′}. If i′ > i + 1, then there can be no co-occurrence vertex v_{i+1,j″}, contradicting that W is piecewise total. If j′ > j + 1, then, since W is piecewise total, there must be a co-occurrence vertex v_{i″,j+1}: but if i″ < i or i″ > i′, this contradicts the simultaneous consistency of W, and otherwise this contradicts the lexicographically least choice of v_{i′,j′}. It follows that every co-occurrence vertex but v_{m,n} has an edge to another co-occurrence vertex closer in Manhattan distance to v_{m,n}, and thus that there is a path from v_{1,1} to v_{m,n}.
For the reverse direction, assume there is a path of vertices in SG(Φ₁, Φ₂) from v_{1,1} to v_{m,n}, given by v_{i₁,j₁}, v_{i₂,j₂}, …, v_{i_r,j_r}, with i₁ = j₁ = 1, i_r = m, and j_r = n. Let W be the set of state tuples corresponding to the vertices along this path. W must be simultaneously consistent with the Φ₁ and Φ₂ orderings, because our directed edges are all non-decreasing in those orderings. W must be piecewise total, because no edge can cross more than one state transition in either Φ₁ or Φ₂, by the edge-set definition. So W is an interdigitation. Finally, the definition of the edge set ensures that each tuple ⟨sᵢ, tⱼ⟩ in W has the property sᵢ ⊇ tⱼ, so that W is a witnessing interdigitation for Φ₁ ⊑ Φ₂, showing that Φ₁ ⊑ Φ₂, as desired. □

Lemma 10: Given some n, let Ψ be the conjunction of the 2n timelines

  (PROPₙ; Trueᵢ; Falseᵢ; PROPₙ) and (PROPₙ; Falseᵢ; Trueᵢ; PROPₙ), for 1 ≤ i ≤ n.

We have the following facts about truth assignments A to the Boolean variables p₁, …, pₙ:
1. For any truth assignment A, PROPₙ; s_A; PROPₙ is semantically equivalent to some member of IS(Ψ).
2. For each Φ ∈ IS(Ψ), there is a truth assignment A such that Φ ⊑ PROPₙ; s_A; PROPₙ.
Proof: To prove the first part of the lemma, we construct an interdigitation I of Ψ such that the corresponding member of IS(Ψ) is equivalent to PROPₙ; s_A; PROPₙ. Intuitively, we construct I by ensuring that some tuple of I consists only of states of the form Trueᵢ or Falseᵢ that agree with the truth assignment A; the union of all the states in this tuple, taken by IS(Ψ), will equal s_A. Let I = ⟨T₀, T₁, T₂, T₃, T₄⟩ be an interdigitation of Ψ with exactly five state tuples Tⱼ. We assign the states of each timeline of Ψ to the tuples as follows:
1. For any i such that 1 ≤ i ≤ n and A(pᵢ) is true,
   • for the timeline s₁; s₂; s₃; s₄ = PROPₙ; Trueᵢ; Falseᵢ; PROPₙ, assign each state sⱼ to tuple Tⱼ, and assign state s₁ to T₀ as well, and
   • for the timeline s′₁; s′₂; s′₃; s′₄ = PROPₙ; Falseᵢ; Trueᵢ; PROPₙ, assign each state s′ⱼ to tuple Tⱼ₋₁, and state s′₄ to tuple T₄ as well.
2. For any i such that 1 ≤ i ≤ n and A(pᵢ) is false, assign states to tuples as in item 1, while interchanging the roles of Trueᵢ and Falseᵢ.
It should be clear that I is piecewise total and simultaneously consistent with the state orderings in Ψ, and so is an interdigitation. The union of the states in each of T₀, T₁, T₃, and T₄ is equal to PROPₙ, since PROPₙ is included as a state in each of those tuples. Furthermore, we see that the union of the states in T₂ is equal to s_A. Thus, the member of IS(Ψ) corresponding to I is equal to PROPₙ; PROPₙ; s_A; PROPₙ; PROPₙ, which is semantically equivalent to PROPₙ; s_A; PROPₙ, as desired.
To prove the second part of the lemma, let Φ be any member of IS(Ψ). We first argue that every state in Φ must contain either Trueᵢ or Falseᵢ for each 1 ≤ i ≤ n. For any i, since Ψ contains PROPₙ; Trueᵢ; Falseᵢ; PROPₙ, every member of IS(Ψ) must be subsumed by PROPₙ; Trueᵢ; Falseᵢ; PROPₙ. So Φ is subsumed by PROPₙ; Trueᵢ; Falseᵢ; PROPₙ. But every state in PROPₙ; Trueᵢ; Falseᵢ; PROPₙ contains either Trueᵢ or Falseᵢ, implying that so does every state of Φ, as desired.
Next, we claim that, for each 1 ≤ i ≤ n, either Φ ⊨ Trueᵢ or Φ ⊨ Falseᵢ; i.e., either all states in Φ include Trueᵢ, or all states in Φ include Falseᵢ (and possibly both). To prove this claim, assume, for the sake of contradiction, that, for some i, Φ ⊭ Trueᵢ and Φ ⊭ Falseᵢ. Combining this assumption with our first claim, we see there must be states s and s′ in Φ such that s contains Trueᵢ but not Falseᵢ, and s′ contains Falseᵢ but not Trueᵢ, respectively. Consider the interdigitation I of Ψ that corresponds to Φ (as a member of IS(Ψ)). We know that s and s′ are each equal to the union of the states in tuples T and T′, respectively, of I. T and T′ must each include one state from each timeline s₁; s₂; s₃; s₄ = PROPₙ; Trueᵢ; Falseᵢ; PROPₙ and s′₁; s′₂; s′₃; s′₄ = PROPₙ; Falseᵢ; Trueᵢ; PROPₙ. Clearly, since s does not include Falseᵢ, T includes the states s₂ and s′₃; likewise, T′ includes the states s₃ and s′₂. It follows that I is not simultaneously consistent with the state orderings in s₁; s₂; s₃; s₄ and s′₁; s′₂; s′₃; s′₄, contradicting our choice of I as an interdigitation. This shows that either Φ ⊨ Trueᵢ or Φ ⊨ Falseᵢ.
Define the truth assignment A such that, for all 1 ≤ i ≤ n, A(pᵢ) holds if and only if Φ ⊨ Trueᵢ. Since, for each i, Φ ⊨ Trueᵢ or Φ ⊨ Falseᵢ, it follows that each state of Φ is subsumed by s_A. Furthermore, since Φ begins and ends with PROPₙ, it is easy to give an interdigitation of Φ and PROPₙ; s_A; PROPₙ that witnesses Φ ⊑ PROPₙ; s_A; PROPₙ. Thus, we have that Φ ⊑ PROPₙ; s_A; PROPₙ. □

Lemma 16: Let Φ₁ and Φ₂ be as given on page 17, in the proof of Theorem 17, and let Ψ = ⋀ IG({Φ₁, Φ₂}). For any Ψ′ whose timelines are a subset of those in Ψ that omits some square timeline, we have that Ψ′ is not equivalent to Ψ.
Proof: Since the timelines in Ψ′ are a subset of the timelines in Ψ, we know that Ψ ⊑ Ψ′. It remains to show that Ψ′ ⋢ Ψ. We show this by constructing a timeline that is covered by Ψ′ but not by Ψ.
For the sake of contradiction, assume that Φ ⊑ Φ̂; then there must be an interdigitation W witnessing Φ ⊑ Φ̂. We show by induction on j that, for j ≥ 2, W(sᵢ, ŝⱼ) implies i > j. For the base case, when j = 2, we know that s₂ ⊉ ŝ₂, since s₂ ≠ ŝ₂, and so W(s₂, ŝ₂) is false, since W witnesses subsumption. For the inductive case, assume the claim holds for all j′ < j, and suppose W(sᵢ, ŝⱼ). We know that sᵢ ⊇ ŝⱼ, and thus i ≠ j. Because W is piecewise total, we must have W(s_{i′}, ŝ_{j−1}) for some i′, and, by the induction hypothesis, we must have i′ > j − 1. Since W is simultaneously consistent with the s and ŝ state orderings, we have i ≥ i′, and it follows that i > j, as desired. Given this claim, we see that ŝ₂ₙ₊₂ cannot co-occur in W with any state in Φ, contradicting the fact that W is piecewise total. Thus we have that Φ ⋢ Φ̂.
Let Φ′ = s′₁; ⋯; s′ₘ be any timeline in Ψ other than Φ; we now construct an interdigitation that witnesses Φ ⊑ Φ′. Note that, while Φ is assumed to be square, Φ′ need not be. Let k be the smallest index where sₖ ≠ s′ₖ; since s₁ = s′₁ = p_{1,1}, and Φ ≠ Φ′, we know that such a k must exist and lies in the range 2 ≤ k ≤ m. We use the index k to guide our construction of an interdigitation. Let W be an interdigitation of Φ and Φ′ with exactly the following co-occurring states (i.e., state tuples): 1. For 1 ≤ i ≤ k − 1, sᵢ₊₁ co-occurs with s′ᵢ.
It is easy to check that W is both piecewise total and simultaneously consistent with the state orderings in Φ and Φ′, and so is an interdigitation. We now show that W witnesses Φ ⊑ Φ′ by showing that all states in Φ are subsumed by the states they co-occur with in W. For co-occurring states sᵢ₊₁ and s′ᵢ corresponding to the first item above, we have that s′ᵢ = sᵢ; this implies that s′ᵢ is contained in sᵢ₊₁, giving that sᵢ₊₁ is subsumed by s′ᵢ. Now consider co-occurring states sᵢ and s′ᵢ from the second item above. Since Φ is square, choose j and l so that sᵢ₋₁ = p_{j,l}; then sᵢ is either p_{j+1,l} or p_{j,l+1}. In addition, since s₁ = s′₁, we have that s′ᵢ is either p_{j+1,l}, p_{j,l+1}, or p_{j+1,l+1}, but that sᵢ ≠ s′ᵢ. In any of these cases, we find that no state in Φ′ after s′ᵢ can equal sᵢ; this follows by noting that the proposition indices never decrease across the timeline Φ′. We therefore have that, for these tuples, sᵢ is subsumed by s′ᵢ. Finally, for co-occurring states sᵢ and s′ₘ from item three above, we have that sᵢ is subsumed by s′ₘ, since s′ₘ = p_{n,n}, which is in all states of Φ. Thus, we have shown that, for all co-occurring states in W, the state from Φ is subsumed by the co-occurring state in Φ′.

Figure 1: The upper boxes represent the three primary components of LEONARD's pipeline. The lower box depicts the event-learning component described in this paper. The input to the learning component consists of training models of a target event (e.g., movies of PICKUP events), and the output is an event definition (e.g., a temporal-logic formula defining PICKUP).

Figure 2: LEONARD recognizes a PICKUP event. (a) Frames from the raw video input with the automatically generated polygon movie overlaid. (b) The same frames with a visual depiction of the automatically generated force-dynamic properties. (c) The text input and output of the event classifier corresponding to the depicted movie. The top line is the output; the remaining lines make up the input that encodes the changing force-dynamic properties over time.

Figure 5: Pseudo-code for the MA subsumption algorithm. SG(Φ₁, Φ₂) is the subsumption graph defined in the main text.

Figure 8: Pseudo-code for non-deterministically computing a k-cover of an AMA formula, along with a non-deterministic helper function for selecting a block partition of the states of a timeline.
Formally, the syntax of AMA formulas is given by

  S ::= prop | prop ∧ S        (states)
  MA ::= S | S; MA             (MA timelines)
  AMA ::= MA | MA ∧ AMA        (AMA formulas)

We say that Ψ₁ subsumes Ψ₂, written Ψ₂ ⊑ Ψ₁, when every model of Ψ₂ is also a model of Ψ₁, and we say Ψ₁ properly subsumes Ψ₂, written Ψ₂ ⊏ Ψ₁, when we also have Ψ₁ ⋢ Ψ₂. Finally, it will be useful to associate a distinguished MA timeline with a model. The MA-projection of a model M (written MAP(M)) is an MA timeline s₀; s₁; ⋯; sₖ, where state sᵢ gives the propositions that are true in M at the i'th point of its interval. Alternatively, we may state Ψ₂ ⊑ Ψ₁ by saying that Ψ₁ is more general (or less specific) than Ψ₂, or that Ψ₁ covers Ψ₂. Siskind (2001) provides a method to determine whether a given model satisfies a given AMA formula.
We first use interdigitations to syntactically characterize subsumption between MA timelines. An interdigitation I of two MA timelines Φ₁ and Φ₂ is a witness to Φ₁ ⊑ Φ₂ if, for every pair of co-occurring states s₁ ∈ Φ₁ and s₂ ∈ Φ₂, we have s₁ ⊇ s₂. The following lemma and proposition establish the equivalence. For any MA timeline Φ and any model M, if M satisfies Φ, then there is a witnessing interdigitation for MAP(M) ⊑ Φ. For MA timelines Φ₁ and Φ₂, Φ₁ ⊑ Φ₂ iff there is an interdigitation that witnesses Φ₁ ⊑ Φ₂. We show the backward direction by induction on the number of states n in timeline Φ₁. If n = 1, then the existence of a witnessing interdigitation for Φ₁ ⊑ Φ₂ implies that every state in Φ₂ is a subset of the single state in Φ₁, and thus that any model of Φ₁ is a model of Φ₂, so that Φ₁ ⊑ Φ₂. Now suppose, for induction, that the backward direction of the theorem holds whenever Φ₁ has n or fewer states. Given an arbitrary model M of an (n+1)-state Φ₁ and an interdigitation W that witnesses Φ₁ ⊑ Φ₂, we must show that M is also a model of Φ₂ to conclude Φ₁ ⊑ Φ₂, as desired.
Proposition 2 Proof (fragment): (a) Create an array Reachable(i, j) of Boolean values, all FALSE, for 0 ≤ i ≤ m and 0 ≤ j ≤ n. (b) FOR i = 1 to m, Reachable(i, 0) := TRUE; FOR j = 1 to n, Reachable(0, j) := TRUE; FOR i = 1 to m, FOR j = 1 to n, set Reachable(i, j) from the previously computed entries; IF Reachable(m, n) THEN RETURN TRUE.

The syntactic LGG algorithm first computes the semantic LGG of the timelines in T₁. Next, it computes the semantic LGG of the timelines in T₂, following the same steps as for T₁.

For each movie type in our data set, we have a set of intended events and subevents that should be detected. If a definition does not detect an intended event, we deem the error a false negative. If a definition detects an unintended event, we deem the error a false positive. For example, if a movie depicts a MOVE event, we wish to detect not only the MOVE event but also the PICKUP and PUTDOWN subevents.

Figure 9: Key frames from sample videos of the seven event types.
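The false-negative/false-positive bookkeeping just described can be sketched directly; the event names below are illustrative only, not drawn from the data set.

```python
# A minimal sketch of the scoring rule: each movie has a set of intended
# events and subevents; missed intended events count as false negatives,
# and detections outside the intended set count as false positives.

def score(intended, detected):
    false_negatives = len(intended - detected)
    false_positives = len(detected - intended)
    return false_negatives, false_positives

intended = {"MOVE", "PICKUP", "PUTDOWN"}   # a MOVE movie and its subevents
detected = {"MOVE", "PICKUP", "CARRY"}     # one miss, one spurious detection
print(score(intended, detected))  # (1, 1)
```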

Table 2: False negatives (FN) and false positives for the event definitions.

Table 3: Complexity Results Summary. The LGG complexities are relative to input plus output size. The size column reports the worst-case smallest correct output size. The "?" indicates a conjecture.
Table 4 gives the definitions of these relations, defining [m₁, m₂] r [n₁, n₂] for each Allen relation r. Satisfiability for IPEL formulas can now be defined as follows:
• true is satisfied by every model.
• prop is satisfied by model ⟨M, I⟩ iff M[x] assigns prop true for every x ∈ I.
• E₁ ∨ E₂ is satisfied by a model M iff M satisfies E₁ or M satisfies E₂.
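For concreteness, the Allen relations on closed intervals can be evaluated from endpoints alone. The sketch below uses the standard endpoint definitions of Allen (1983); the paper's Table 4 adapts these to its own semantics, so treat this as an illustrative reconstruction rather than that table.

```python
# Evaluating the thirteen Allen relations on closed intervals [m1, m2]
# and [n1, n2], using the standard endpoint definitions (Allen, 1983).

ALLEN = {
    "=": lambda m1, m2, n1, n2: m1 == n1 and m2 == n2,
    "b": lambda m1, m2, n1, n2: m2 < n1,             # before
    "m": lambda m1, m2, n1, n2: m2 == n1,            # meets
    "o": lambda m1, m2, n1, n2: m1 < n1 < m2 < n2,   # overlaps
    "s": lambda m1, m2, n1, n2: m1 == n1 and m2 < n2,   # starts
    "d": lambda m1, m2, n1, n2: n1 < m1 and m2 < n2,    # during
    "f": lambda m1, m2, n1, n2: n1 < m1 and m2 == n2,   # finishes
}
# Each remaining relation is the inverse of one above (swap intervals).
for name in list(ALLEN):
    if name != "=":
        ALLEN[name + "i"] = (lambda rel: lambda m1, m2, n1, n2:
                             rel(n1, n2, m1, m2))(ALLEN[name])

def holds(r, I, J):
    """Does interval I stand in Allen relation r to interval J?"""
    return ALLEN[r](I[0], I[1], J[0], J[1])

print(holds("m", (0, 2), (2, 5)))  # True: [0,2] meets [2,5]
print([r for r in sorted(ALLEN) if holds(r, (1, 3), (0, 5))])  # ['d']
```

Exactly one of the thirteen relations holds between any two distinct intervals, which is why the ◇_R and ∧_R operators can be reduced to disjunctions over single relations, as in the proofs above.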