Computing unsatisfiable cores for LTLf specifications

Linear-time temporal logic on finite traces (LTLf) is rapidly becoming a de-facto standard to produce specifications in many application domains (e.g., planning, business process management, run-time monitoring, reactive synthesis). Several studies approached the respective satisfiability problem. In this paper, we investigate the problem of extracting the unsatisfiable core in LTLf specifications. We provide four algorithms for extracting an unsatisfiable core leveraging the adaptation of state-of-the-art approaches to LTLf satisfiability checking. We implement the different approaches within the respective tools and carry out an experimental evaluation on a set of reference benchmarks, restricting to the unsatisfiable ones. The results show the feasibility, effectiveness, and complementarities of the different algorithms and tools.


Introduction
A growing body of literature evidences the adoption of linear-time temporal logic on finite traces (LTLf) (De Giacomo & Vardi, 2013) to produce system specifications (De Giacomo, De Masellis, & Montali, 2014). Its widespread use spans several application domains, including business process management (BPM) for declarative process modelling (De Giacomo, De Masellis, Grasso, et al., 2014; Montali et al., 2010) and mining (Cecconi et al., 2018; Di Ciccio & Montali, 2022; Räim et al., 2014), run-time monitoring and verification (Bauer et al., 2010; De Giacomo, De Masellis, Grasso, et al., 2014; De Giacomo et al., 2020), and AI planning (Calvanese et al., 2002; Camacho et al., 2018; Camacho & McIlraith, 2019; Sohrabi et al., 2011). When it comes to verification techniques and tool support for LTLf, several studies approach the LTLf satisfiability problem via reduction to LTL (Pnueli, 1977) satisfiability on infinite traces (De Giacomo, De Masellis, & Montali, 2014), or via specific propositional satisfiability approaches (Fionda & Greco, 2018; Li et al., 2020). However, no efforts have been devoted thus far to the identification of the formulas that lead to unsatisfiability in LTLf specifications, with the consequence that no support has been offered for modellers and system designers to single out the causes of possible inconsistencies.
In this paper, we tackle the challenge of extracting unsatisfiable cores (UCs) from LTLf specifications. Investigating this problem is interesting from both practical and theoretical viewpoints.
On the practical side, if unsatisfiability signals that a specification is defective, the identification of unsatisfiable cores gives users the opportunity to isolate the source of inconsistency and debug the relevant code fragment. Notice that determining a reason for unsatisfiability without automated support may be infeasible for a number of reasons that range from the sheer size of the formula to the lack of time and skills of the user (Schuppan, 2012, 2018).
On the theoretical side, dealing with the extraction of UCs in LTLf specifications is far from trivial. Indeed, there is no default strategy to move from the support provided for LTL to the one that has to be provided for LTLf. We can identify two clear alternative strategies to address this problem: the first one extends algorithms for the extraction of UCs in LTL to the case of LTLf (hereafter Strategy 1 or S1); the second one exploits algorithms that directly compute satisfiability in LTLf to provide support for the extraction of UCs (hereafter Strategy 2 or S2). These two different strategies are emphasised in the two grey streams in Fig. 1. When looking at the algorithms realising the two strategies, the starting point for S1 would be state-of-the-art (SOTA) algorithms for the extraction of UCs in LTL (to be extended for LTLf), while the starting point for S2 would be state-of-the-art algorithms for the computation of satisfiability in LTLf (to be extended to the computation of UCs). If we look at these two classes of state-of-the-art algorithms, we can notice a substantial imbalance. Several state-of-the-art algorithms exist for S1, in particular based on the reduction of SAT to model-checking and on theorem proving. On the contrary, the number of algorithms that could enable the implementation of S2 is still rather limited and reduces to a reference work based on an explicit search complemented with several propositional satisfiability checks. Since recent works show that often a single universal best algorithm does not exist, and systems exhibit behaviours that complement each other (Li et al., 2019, 2020), choosing a single strategy and a single algorithm from which to start is far from obvious.
In this work, we provide algorithms for the computation of UCs for LTLf by exploiting both strategies and, whenever possible, different reference algorithms within each strategy. For Strategy 1, we consider three LTL satisfiability checking algorithms as the starting points: A1, based on Binary Decision Diagrams (BDDs) as described in the work of Clarke et al. (1997); A2, based on propositional satisfiability and introduced by Biere et al. (2006); A3, a theorem proving algorithm based on temporal resolution first presented by Hustadt and Konev (2003) and Schuppan (2016). For Strategy 2, we resort to the reference work of Li et al. (2020), based on explicit search and propositional satisfiability (hereinafter, A4). Figure 1 lists these algorithms inside the "SOTA algorithm" box. We believe that leveraging reference state-of-the-art approaches provides a rich starting point for the investigation of the problem and the provision of effective tools for the extraction of UCs in LTLf specifications.
Our contributions thus consist of the following:
1. Four algorithms NA1, ..., NA4 that allow for the computation of an unsatisfiable core through the adaptation of algorithms A1, ..., A4, covering both Strategy 1 and Strategy 2 (Section 4). These algorithms are listed in the "New LTLf unsat core algorithms" box in Fig. 1. Note that the algorithms based on propositional satisfiability (that is, NA2 and NA4) aim at extracting a UC, which may not necessarily be the minimum one. Instead, NA1 and NA3 already allow for the extraction of a minimum unsatisfiable core.
2. An implementation of the proposed four algorithms NA1, ..., NA4 (Section 5.1). Three implementations extend existing tools for the corresponding original algorithms; instead, the implementation of NA3, based on temporal resolution, resorts to a preprocessing of the formula to reduce the input to the language restrictions of the original tool.
3. An experimental evaluation on a large set of reference benchmarks taken from Li et al. (2020), restricted to the unsatisfiable ones (Sections 5.2 and 5.3). The results show an overall better time efficiency of algorithm NA4, based on Strategy 2. However, the cardinality of the UC extracted by the fastest approach is the smallest one in only about half of the cases. The experimental findings show that the proposed approaches are complementary on different specifications: depending on the varying number of propositional variables, number of conjuncts and degree of nesting of the temporal operators in the benchmarks, it is not rare that some of the implemented techniques achieve a noticeable performance when the other ones terminate with no result, and vice versa. The complementary behaviour of the different algorithms provides further evidence of the challenge of providing algorithmic support for the extraction of UCs from LTLf specifications and of the adequacy of exploring different strategies and algorithmic solutions for this problem.
Since popular usages of LTLf leverage past temporal operators (see, e.g., the Declare language (van der Aalst et al., 2009)), we also provide a way to handle LTLf with past temporal operators (see Definition 2 and all the respective technical parts). This results in the same expressive power as that of the pure-future version, though allowing for exponentially more succinct specifications (Gabbay, 1987; Laroussinie et al., 2002) and more natural encodings of LTLf-based modelling languages that make use of these operators. To this aim, we leverage algorithms already supporting LTL with past temporal operators, or a reduction to LTLf with only future temporal operators to use existing approaches for LTLf satisfiability checking.
The remainder of the paper is structured as follows. Section 2 and Section 3 illustrate background concepts of relevance to our work and some enablers for the extension of SOTA approaches towards the extraction of LTLf unsatisfiable cores, respectively. Section 4 introduces the four algorithms NA1, ..., NA4, while Section 5 reports on the algorithms' implementation and experimental evaluation. Finally, related work, as well as conclusions and future work, are described in Section 6 and Section 7, respectively.

Background
We outline here the main concepts upon which the remainder of the paper is built.

LTLf Syntax and Semantics
Given a finite set of propositional variables AP, we provide the following definitions. A state s over propositional variables in AP is a complete assignment of a Boolean value ⊤ or ⊥ to the variables in AP. For a set P ⊆ AP, we denote with s|_P the projection (restriction) of the complete assignment s to the propositional variables in P.
Definition 1. We say that variable x ∈ AP holds in a state s iff x is assigned the truth value ⊤ in s, x = ⊤, and we denote this as s |=_p x (where the 'p' subscript indicates that s is a model of x in a propositional sense).
An LTLf formula φ is built over the propositional variables in AP by using the classical Boolean connectives "∧", "∨", and "¬", complemented with the future temporal operators "X" (next), "N" (weak next), "G" (always/globally), "F" (eventually/finally), "U" (until) and "R" (release), and with the past temporal operators "Y" (yesterday), "Z" (weak yesterday), "H" (historically), "O" (once), "S" (since), and "T" (trigger). The N operator is similar to X and solely differs in the way the final state is dealt with: in the last state, X φ is false, while N φ is true. Similarly, the Z operator is analogous to Y, and a difference occurs only in the initial state: in s_0, Y φ is false, whereas Z φ is true. (See Definition 2 for the semantics of all the LTLf operators.) The grammar for building LTLf formulas is:
φ ::= x | ¬φ_1 | φ_1 ∧ φ_2 | φ_1 ∨ φ_2   (Boolean connectives)
    | X φ_1 | N φ_1 | G φ_1 | F φ_1 | φ_1 U φ_2 | φ_1 R φ_2   (Future temporal operators)
    | Y φ_1 | Z φ_1 | H φ_1 | O φ_1 | φ_1 S φ_2 | φ_1 T φ_2   (Past temporal operators)
where x ∈ AP is a propositional variable, and φ_1 and φ_2 are LTLf formulas. The classical implication → and equivalence ↔ connectives can be obtained in the standard way in terms of the ∧, ∨, ¬ connectives. In the following, we use round parentheses as auxiliary symbols to clarify or alter the precedence of evaluation. Otherwise, we might omit them for the sake of readability.
Remark 2.1. The following equivalences hold:
In the remainder of this paper, we leverage the above equivalences whenever needed to simplify the presentation and the proofs.
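For reference, the usual dualities and abbreviations among the operators introduced above can be written as follows. This list is an assumption about which standard equivalences are relevant here (they hold under the finite-trace reading in which X and Y are false, and N and Z are true, when no next or previous state exists); it is not a verbatim restatement of the remark.

```latex
\begin{align*}
\mathsf{N}\,\varphi &\equiv \neg\,\mathsf{X}\,\neg\varphi &
\mathsf{Z}\,\varphi &\equiv \neg\,\mathsf{Y}\,\neg\varphi \\
\mathsf{F}\,\varphi &\equiv \top\,\mathsf{U}\,\varphi &
\mathsf{O}\,\varphi &\equiv \top\,\mathsf{S}\,\varphi \\
\mathsf{G}\,\varphi &\equiv \neg\,\mathsf{F}\,\neg\varphi &
\mathsf{H}\,\varphi &\equiv \neg\,\mathsf{O}\,\neg\varphi \\
\varphi_1\,\mathsf{R}\,\varphi_2 &\equiv \neg(\neg\varphi_1\,\mathsf{U}\,\neg\varphi_2) &
\varphi_1\,\mathsf{T}\,\varphi_2 &\equiv \neg(\neg\varphi_1\,\mathsf{S}\,\neg\varphi_2)
\end{align*}
```

With these, every formula can be rewritten using only X, U, Y and S besides the Boolean connectives, which is why proofs by cases may restrict attention to those operators.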
The language of an LTLf formula φ over AP is defined as L(φ) = {π | π, 0 |= φ}. Thus, the satisfiability problem for an LTLf formula φ can be reduced to checking that L(φ) ≠ ∅.
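Membership of a given finite trace in L(φ) can be checked by a direct recursive evaluation. The following is a minimal sketch, assuming the standard finite-trace semantics (X and Y are false when there is no next or previous state, N and Z are true); the tuple-based formula representation, the `holds` function and the `implies` helper are illustrative names, and R and T are omitted for brevity.

```python
from typing import FrozenSet, List, Tuple, Union

Formula = Union[str, Tuple]      # an atom name, or (operator, operand, ...)
Trace = List[FrozenSet[str]]     # state i = set of variables assigned true


def holds(phi: Formula, trace: Trace, i: int = 0) -> bool:
    """Return True iff trace, i |= phi on a non-empty finite trace."""
    last = len(trace) - 1
    if isinstance(phi, str):                      # propositional variable
        return phi in trace[i]
    op, *args = phi
    if op == "not":
        return not holds(args[0], trace, i)
    if op == "and":
        return all(holds(a, trace, i) for a in args)
    if op == "or":
        return any(holds(a, trace, i) for a in args)
    if op == "X":                                 # strong next: false in the last state
        return i < last and holds(args[0], trace, i + 1)
    if op == "N":                                 # weak next: true in the last state
        return i == last or holds(args[0], trace, i + 1)
    if op == "F":
        return any(holds(args[0], trace, j) for j in range(i, last + 1))
    if op == "G":
        return all(holds(args[0], trace, j) for j in range(i, last + 1))
    if op == "U":
        return any(
            holds(args[1], trace, j)
            and all(holds(args[0], trace, k) for k in range(i, j))
            for j in range(i, last + 1)
        )
    if op == "Y":                                 # yesterday: false in the initial state
        return i > 0 and holds(args[0], trace, i - 1)
    if op == "Z":                                 # weak yesterday: true in the initial state
        return i == 0 or holds(args[0], trace, i - 1)
    if op == "O":
        return any(holds(args[0], trace, j) for j in range(0, i + 1))
    if op == "H":
        return all(holds(args[0], trace, j) for j in range(0, i + 1))
    if op == "S":
        return any(
            holds(args[1], trace, k)
            and all(holds(args[0], trace, j) for j in range(k + 1, i + 1))
            for k in range(0, i + 1)
        )
    raise ValueError(f"unsupported operator: {op}")


def implies(a: Formula, b: Formula) -> Formula:
    """Implication expressed through the basic connectives."""
    return ("or", ("not", a), b)


single = [frozenset({"a"})]
print(holds(("X", "a"), single))   # False: no next state exists
print(holds(("N", "a"), single))   # True: weak next holds in the last state
```

Evaluating a formula on one trace is only a membership test; deciding whether L(φ) is empty requires reasoning over all traces, which is what the techniques discussed in the rest of this section address.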
Let us consider, e.g., the formula φ is not satisfied by either of the traces. Indeed, the last state of π_1 is such that s^1_3 |=_p a, but it is not followed by any state. As for π_2, s^2_2 |=_p a but in the next state s^2_3 ⊭_p b; also, s^1_3 |=_p a but that is the last state, so no next state exists.
Formulas that contain both past and future temporal operators are widely used, as they allow for the expression of requirements or behavioural rules in a more concise and natural way (Cecconi et al., 2018; Fuxman et al., 2004; van Lamsweerde & Letier, 2000). For instance, as also discussed by Cimatti et al. (2004), a requirement like "if a problem is diagnosed, then a failure must have previously occurred" can be naturally formalised as G(problem → O failure). This formalisation can be interpreted more intuitively than the pure-future counterpart ¬(¬failure U problem). Similarly, the requirement "grants are issued only upon requests" can be easily specified as G(grant → Y(¬grant S request)), which is more compact than the pure-future formulation:

Commonalities and Differences between LTL and LTLf
The syntax of LTLf formulas is almost identical to the original LTL one. Semantics differ, instead, due to the finite length of traces in LTLf, as a last state occurs only in a finite trace. Only in LTLf, then, N and X are satisfied under different conditions. Thus, while introducing the semantics for LTL we report only the semantics for the X operator. In the following, we will consider the semantics for LTL and highlight the differences with LTLf whenever necessary.
Definition 3 (LTL Satisfiability). Given an infinite trace π, the LTL formula φ is true in
Future temporal operators:
• π, i |=_LTL F φ iff for some j with i ≤ j it holds that π, j |=_LTL φ;
• π, i |=_LTL G φ iff for every j with i ≤ j it holds that π, j |=_LTL φ;
• π, i |=_LTL φ_1 U φ_2 iff for some j with i ≤ j it holds that π, j |=_LTL φ_2 and for every k with i ≤ k < j it holds that π, k |=_LTL φ_1;
• π, i |=_LTL φ_1 R φ_2 iff for every j with i ≤ j it holds that π, j |=_LTL φ_2, or for some j with i ≤ j it holds that π, j |=_LTL φ_1 and for every k with i ≤ k ≤ j it holds that π, k |=_LTL φ_2;
Past temporal operators:
• π, i |=_LTL O φ iff for some j with 0 ≤ j ≤ i it holds that π, j |=_LTL φ;
• π, i |=_LTL H φ iff for every j with 0 ≤ j ≤ i it holds that π, j |=_LTL φ;
• π, i |=_LTL φ_1 S φ_2 iff for some k with 0 ≤ k ≤ i it holds that π, k |=_LTL φ_2 and for every j with k < j ≤ i it holds that π, j |=_LTL φ_1;
We say that the infinite trace π is a model of φ (denoted with π |=_LTL φ) whenever π, 0 |=_LTL φ, and that the LTL property φ is satisfiable whenever there exists a π such that π, 0 |=_LTL φ.
When clear from the context, for an LTL property φ, we abuse notation and use π, i |= φ in place of π, i |=_LTL φ.
As noticed by De Giacomo, De Masellis, and Montali (2014), the evaluation of an LTL formula on an infinite trace may lead to an opposite outcome to the evaluation of an identical expression in LTLf on finite traces. For example, F a ∧ G(a → F b) ∧ G(b → F a) ∧ G ¬(a ∧ b) is satisfiable in LTL and unsatisfiable in LTLf. A satisfying infinite trace π in LTL is, for example, one in which a and b alternate forever. On a finite trace, instead, the first three conjuncts imply that eventually both a and b shall be true at the same time (at the last position where a holds, F b forces b to hold at that position or later, and F a then forces that later position to coincide with it), and this contradicts G ¬(a ∧ b), which requires that a and b are never true simultaneously.
Definition 4 (Unsatisfiable core). Let Γ = {φ_1, ..., φ_N} be an unsatisfiable LTLf specification. Φ ⊆ Γ is an unsatisfiable core of Γ iff Φ is unsatisfiable. A minimal unsatisfiable core Φ is such that Φ_i = Φ \ {φ_i} is satisfiable for every φ_i ∈ Φ. A minimum unsatisfiable core is a minimal unsatisfiable core with the smallest possible cardinality.
Consider, e.g., the specification Γ = {φ_1, ..., φ_6} of LTLf formulas where
Intuitively, the specification Γ is unsatisfiable because of circular dependencies that require a to be eventually followed by b, b by c and a, and c by a. Since φ_1 requires that at least one among a, b or c is eventually satisfied in the trace, only an infinite trace could satisfy Γ as a whole. Γ is a trivial unsatisfiable core, then. The specification {φ_1, φ_2, φ_3, φ_4} ⊆ Γ is a minimal unsatisfiable core (since the removal of any of φ_1, φ_2, φ_3, φ_4 breaks the circular dependency). The specification {φ_1, φ_2, φ_5} ⊆ Γ is not only minimal but also a minimum unsatisfiable core, as it bears the lowest cardinality.
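The three notions in Definition 4 can be made concrete with a small sketch that treats satisfiability as a black-box oracle. Everything below is illustrative: the function names are hypothetical, and the toy oracle decides propositional (not temporal) constraints by brute force, chosen only so that the example has a minimal core of size four and a minimum core of size three, mirroring the shape of the example above.

```python
from itertools import combinations, product
from typing import Callable, Dict, FrozenSet, List

Spec = FrozenSet[str]                 # a specification = set of formula names
SatOracle = Callable[[Spec], bool]    # hypothetical satisfiability oracle


def is_unsat_core(phi: Spec, gamma: Spec, is_sat: SatOracle) -> bool:
    """Phi is an unsatisfiable core of Gamma iff Phi is a subset and is unsatisfiable."""
    return phi <= gamma and not is_sat(phi)


def is_minimal_core(phi: Spec, gamma: Spec, is_sat: SatOracle) -> bool:
    """A core is minimal iff dropping any single formula makes it satisfiable."""
    return is_unsat_core(phi, gamma, is_sat) and all(
        is_sat(phi - {f}) for f in phi
    )


def minimum_cores(gamma: Spec, is_sat: SatOracle) -> List[Spec]:
    """Enumerate subsets by increasing size; the first unsatisfiable ones are minimum."""
    for size in range(len(gamma) + 1):
        found = [
            frozenset(c)
            for c in combinations(sorted(gamma), size)
            if not is_sat(frozenset(c))
        ]
        if found:
            return found
    return []


# Toy propositional constraints over x, y, z (stand-ins for temporal formulas).
TOY: Dict[str, Callable[[Dict[str, bool]], bool]] = {
    "f1": lambda v: v["x"],
    "f2": lambda v: (not v["x"]) or v["y"],
    "f3": lambda v: (not v["y"]) or v["z"],
    "f4": lambda v: not v["z"],
    "f5": lambda v: not v["y"],
}


def toy_is_sat(spec: Spec) -> bool:
    """Brute-force propositional satisfiability of the selected constraints."""
    for bits in product([False, True], repeat=3):
        v = dict(zip("xyz", bits))
        if all(TOY[name](v) for name in spec):
            return True
    return False


GAMMA = frozenset(TOY)
print(is_minimal_core(frozenset({"f1", "f2", "f3", "f4"}), GAMMA, toy_is_sat))
print(minimum_cores(GAMMA, toy_is_sat))
```

The exhaustive enumeration in `minimum_cores` is exponential in the size of Γ and is meant only to spell out the definition; it is not how the algorithms in this paper compute cores.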

Checking Satisfiability of an LTLf Formula
Checking the satisfiability of an LTLf formula φ can be reduced to checking language emptiness of a nondeterministic finite state automaton (De Giacomo, De Masellis, & Montali, 2014). Alternative approaches for LTLf formulas without past temporal operators (De Giacomo, De Masellis, & Montali, 2014; De Giacomo & Vardi, 2013; Fionda & Greco, 2018) address this problem by checking the satisfiability of an equi-satisfiable LTL formula over infinite traces (see Def. 3), leveraging existing well-established techniques (see, e.g., Biere et al., 2006; Clarke et al., 1997). These approaches proceed as follows: (i) they introduce a fresh propositional variable end ∉ AP used to denote that the trace has ended; (ii) they require that end eventually holds (i.e., F end); (iii) they require that once end becomes true, it stays true forever (i.e., G(end → X end)); (iv) they translate the LTLf formula φ into an LTL formula by means of a rewriting function f2l(φ) that is defined recursively on the structure of the LTLf formula φ as follows:
Theorem 1 (De Giacomo, De Masellis, and Montali 2014). Any LTLf formula φ without past temporal operators is satisfiable iff the LTL formula F end ∧ G(end → X end) ∧ f2l(φ) is satisfiable.
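Steps (i) to (iv) can be sketched as a recursive rewriting over a formula tree. The sketch below assumes the cases for X and U in the style of De Giacomo, De Masellis, and Montali (2014), namely f2l(X φ) = X(f2l(φ) ∧ ¬end) and f2l(φ_1 U φ_2) = f2l(φ_1) U (f2l(φ_2) ∧ ¬end); the cases for N, F, G and R are derived here through the usual dualities and are an assumption of this sketch, as are the tuple-based representation and the names `f2l` and `ltl_encoding`.

```python
from typing import Tuple, Union

Formula = Union[str, Tuple]   # an atom name, or (operator, operand, ...)
END = "end"                   # fresh variable; assumed not to occur in the input
NOT_END = ("not", END)


def f2l(phi: Formula) -> Formula:
    """Rewrite a pure-future LTLf formula into an LTL formula over AP plus END."""
    if isinstance(phi, str):
        if phi == END:
            raise ValueError("the variable 'end' must be fresh")
        return phi
    op, *args = phi
    sub = [f2l(a) for a in args]
    if op in ("not", "and", "or"):
        return (op, *sub)
    if op == "X":    # the next state must still belong to the finite trace
        return ("X", ("and", sub[0], NOT_END))
    if op == "N":    # dual of X: the trace may have ended
        return ("X", ("or", sub[0], END))
    if op == "F":    # the witness must occur before the end
        return ("F", ("and", sub[0], NOT_END))
    if op == "G":    # dual of F: states after the end are unconstrained
        return ("G", ("or", sub[0], END))
    if op == "U":
        return ("U", sub[0], ("and", sub[1], NOT_END))
    if op == "R":    # dual of U
        return ("R", sub[0], ("or", sub[1], END))
    raise ValueError(f"unsupported operator: {op}")


def ltl_encoding(phi: Formula) -> Formula:
    """Build F end & G(end -> X end) & f2l(phi), with -> expanded as (not a) or b."""
    stays_ended = ("G", ("or", NOT_END, ("X", END)))
    return ("and", ("F", END), stays_ended, f2l(phi))


print(f2l(("X", "a")))
print(ltl_encoding(("G", ("or", ("not", "a"), ("X", "b")))))
```

The resulting tree can then be pretty-printed in the input syntax of whichever LTL satisfiability checker is used.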
Finally, in SAT-based frameworks for LTLf satisfiability checking like the one proposed by Li et al. (2020), propositional SAT solving techniques are used to construct a transition system T_φ for a given LTLf formula φ, and LTLf satisfiability checking reduces to a path search problem over the constructed transition system.
Theorem 2 (Li et al. 2020). Let φ be an LTLf formula without past temporal operators. φ is satisfiable iff there is a final state in T_φ.
A final state for T_φ is any state satisfying the Boolean formula end ∧ (xnf(φ))^p, where (i) end is a new propositional variable such that end ∉ AP, used to identify the last state of satisfying traces (similarly to De Giacomo, De Masellis, and Montali 2014); (ii) xnf(φ) is the neXt Normal Form of φ, a formula equi-satisfiable to φ, built linearly from φ, such that there are no Until/Release sub-formulas in the propositional atoms of xnf(φ); and (iii) (xnf(φ))^p is a propositional formula over the propositional atoms of xnf(φ).
This approach uses a conflict-driven algorithm, leveraging propositional unsatisfiable cores, to perform the explicit path search. Next, we report some useful definitions, and we refer to Li et al. (2020) for the full details of this approach.
Definition 5 (Conflict Sequence, Li et al. 2020). Given an LTLf formula φ, a conflict sequence C for the transition system T_φ is a finite sequence of sets of states such that:
We call each C[i] a frame, and i is the frame level.
For a given conflict sequence C, the set ⋂_{0≤j<i} C[j] (for 0 ≤ i < |C|) represents a set of states that cannot reach a final state of T_φ in up to i steps.
Theorem 3 (Li et al. 2020). The LTLf formula φ is unsatisfiable iff there are a conflict sequence C and an
We refer the reader to Li et al. (2020) for further details about the construction of T_φ, for the SAT-based algorithm to check for the existence of a final state in T_φ, and for the correctness and termination of that algorithm.

Symbolic Approaches to Check Language Emptiness for LTL
A standard symbolic approach to check language emptiness for a given LTL formula φ was proposed by Clarke et al. (1997) in the context of model checking with fairness constraints. It proceeds as follows: (i) first, it builds a symbolic non-deterministic Büchi automaton for φ; (ii) then, it computes the set of fair states according to this automaton; (iii) finally, it intersects it with the set of initial states. The resulting set, denoted with ⟦φ⟧, is a propositional formula whose models represent all states that are the initial state of some infinite trace that accepts φ.
More precisely, let M_φ be a symbolic fair transition system over a set of Boolean variables AP_φ that encodes the formula φ, as discussed for instance in the work of Clarke et al. (1997). In this setting, AP_φ = AP ∪ AP_B(φ) contains all the propositional variables AP and the Boolean variables AP_B(φ) (such that AP_B(φ) ∩ AP = ∅) needed to encode a symbolic fair transition system representing the Büchi automaton for φ. We denote with ⟦φ⟧ the set of states of this symbolic fair transition system such that the following assumptions hold: (AssN1) all states in ⟦φ⟧ are the starting point of some trace accepting φ; (AssN2) all words accepted by φ are accepted by some trace starting from ⟦φ⟧.
Notice that this approach is suitable both for BDD-based and for SAT-based approaches to LTL satisfiability.

Temporal Resolution Approaches for LTL Satisfiability
LTL satisfiability can also be addressed with temporal resolution (Fisher, 1991; Fisher et al., 2001). Temporal resolution extends classical propositional resolution with specific inference rules for each temporal operator. Temporal resolution has been implemented in solvers like trp++ (Hustadt & Konev, 2003), showing effectiveness in analysing unsatisfiable LTL formulas (Schuppan & Darmawan, 2011). We refer the reader to the work of Fisher (1991), Fisher et al. (2001), and Hustadt and Konev (2003) for further details. We remark that Schuppan (2016) showed how the temporal resolution proof graph constructed to prove unsatisfiability of an LTL formula without past temporal operators could be used to compute a minimal unsatisfiable core for the respective LTL formula.

Enablers
This section presents three enablers that allow for the extension of existing algorithms in the scientific literature towards the extraction of LTLf unsatisfiable cores. We will resort to these enablers for the design and realisation of four new algorithms, as described in the next section. In particular, here we illustrate: (i) the extension of the translation function f2l(φ) presented in Section 2 to handle LTLf past temporal operators; (ii) a translation that allows for the transformation of any LTLf formula with past temporal operators into an equi-satisfiable one with only future temporal operators; (iii) the use of an activation variable associated with each LTLf formula in Γ to extract unsatisfiable cores from existing frameworks for LTL/LTLf satisfiability. The first result enables the use of any framework for LTL satisfiability checking that supports both past and future temporal operators. The second result enables the use of any framework for LTL/LTLf satisfiability checking that supports only future temporal operators. Finally, the third result enables the computation of unsatisfiable cores of Γ leveraging existing LTL/LTLf satisfiability frameworks. Next, we describe the three enablers in detail.

Extending f2l to Handle Past Temporal Operators
We remark that the semantics for past temporal operators over finite traces coincides with the respective semantics on infinite traces, as it refers to the prefix of the trace. Therefore, given an LTLf formula φ, we can extend the f2l(φ) encoding to handle LTLf past temporal operators as follows:
f2l(Y φ) = Y f2l(φ)
f2l(Z φ) = Z f2l(φ)
f2l(H φ) = H f2l(φ)
f2l(O φ) = O f2l(φ)
f2l(φ_1 S φ_2) = f2l(φ_1) S f2l(φ_2)
f2l(φ_1 T φ_2) = f2l(φ_1) T f2l(φ_2)
Basically, the encoding of a past operator is propagated recursively to the sub-formulas without modifications on the past operator itself. Together with Theorem 1, this extension allows us to prove the following corollary.
Corollary 1. Any LTLf formula φ is satisfiable iff the following LTL formula is satisfiable: F end ∧ G(end → X end) ∧ f2l(φ).
This corollary enables the use of any framework for LTL satisfiability checking that supports both past and future temporal operators.

Removing Past Temporal Operators
Given an LTLf formula φ with past operators, we can build an equi-satisfiable LTLf formula over only future operators using the function p2f(φ, ∅) = ⟨φ′, Υ⟩ that takes an LTLf formula φ with past operators, and builds a new LTLf formula φ′ and a set Υ of LTLf formulas without past operators as follows:
Intuitively, p2f(φ, Υ) recursively replaces each sub-formula φ_i of φ with a top-level past temporal operator with a new distinct fresh propositional variable v_{φ_i} ∉ AP, and accumulates in Υ formulas capturing the semantics of the substituted past temporal sub-formulas. We remark that the same approach can be equivalently applied to LTL formulas too. In light of this translation, the following theorem holds.
Proof. The proof is by cases on the structure of the formula. We consider only the Y and S past temporal operators, since in all other cases p2f either preserves the formula or rewrites it leveraging the equivalences in Remark 2.1.
(⟹) Let us assume that there exists a trace π such that π, i |= Y φ for some i ≥ 1 (i.e., such that π, i
(⟸) Let us assume that there exists a trace and there exists an i ≥ 1 such that π, i |= v_{Y φ}. This trace will be such that
• φ_1 S φ_2
(⟹) Let us assume that there exists a trace π such that π, i |= φ_1 S φ_2 for some i ≥ 0. This trace is such that there exists a k with 0 ≤ k ≤ i such that π, k |= φ_2 and for every j with k < j ≤ i it holds that π, j |= φ_1. We can build a new trace π′ extending the trace π to consider a new fresh variable v_{φ_1 S φ_2} ∉ AP. This new trace π′ is such that π′[0] |=_p ¬v_{φ_1 S φ_2}, and, for every ), and at time point i ≥ 0 it holds that π′, i |= φ_2 or π′, i |= φ_1 ∧ v_{φ_1 S φ_2} by construction. Thus, it also holds that
(⟸) Let us assume there is a trace π such that π |= ¬v and there exists a time point i such that π, i |= (φ_2 ∨ (φ_1 ∧ v_{φ_1 S φ_2})). This trace will be such that there exists a k with 0 ≤ k ≤ i such that π, k |= φ_2 and for every j with k < j ≤ i it holds that π, j |= φ_1, thus π, i |= φ_1 S φ_2.
This result enables the use of any framework for LTL/LTLf satisfiability checking that does not support past temporal operators.

Activation Variables
To compute the unsatisfiable core for a given specification Γ = {φ_1, ..., φ_N} of LTLf formulas over AP, we proceed as follows. For each LTLf formula φ_i ∈ Γ we introduce a distinct activation variable a_i, i.e., a fresh propositional variable a_i ∈ A, where A ∩ AP = ∅. Thereupon, we define the LTLf formula Ψ = ⋀_{i=1..N} (a_i → φ_i) over AP ∪ A. We make the following observation: the satisfiability of Γ is conditioned by the activation variables A, and we have the following theorems.
Theorem 5. Let Γ = {φ_1, ..., φ_N} be a set of LTLf formulas over AP, A = {a_1, ..., a_N} a set of propositional variables such that A ∩ AP = ∅, and Ψ = ⋀_{i=1..N} (a_i → φ_i) an LTLf formula defined over AP ∪ A. It holds that Γ is unsatisfiable if and only if Ψ ∧ ⋀_{a_i∈A} a_i is unsatisfiable.
Proof. (⟹) Let us assume that Γ is unsatisfiable. This entails that ⋀_{φ_i∈Γ} φ_i is unsatisfiable. Let us now consider Ψ ∧ ⋀_{a_i∈A} a_i, with Ψ and A ∋ a_i defined as above, and let us assume it is satisfiable. This means that there exists a trace π such that Ψ is satisfied in the initial state π[0] and every a_i ∈ A is set to true. As a consequence, a_i → φ_i is satisfied by π for every i = 1..N. Therefore, also every φ_i ∈ Γ (hence the conjunction of all φ_i formulas, and the set Γ) is satisfiable. However, this contradicts the initial hypothesis.
(⟸) Let us assume that Ψ ∧ ⋀_{a_i∈A} a_i is unsatisfiable. This entails that for all subsets A′ ⊆ A such that each a′_i ∈ A′ is true and all the other variables in A \ A′ are false, the conjunction
Let us consider Γ, and assume it is satisfiable. This means that there exists a trace π such that π, 0 |= ⋀_{φ_i∈Γ} φ_i. Therefore, for all φ_i ∈ Γ, it holds that π, 0 |= φ_i, thus contradicting the above statement that ⋀_{i: a′_i∈A′} φ_i is unsatisfiable, and hence the hypothesis that Ψ ∧ ⋀_{a_i∈A} a_i is unsatisfiable.
Proof. The proof is analogous to, and a direct consequence of, the proof of Theorem 5.
This theorem allows us to obtain the unsatisfiable cores (UCs) of Γ by looking at the activation variables that make Ψ unsatisfiable.
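The construction of Ψ is purely syntactic and can be sketched in a few lines. The following is a minimal sketch, assuming formulas are given as strings in some solver input syntax; the `&` and `->` symbols, the `act_` prefix and both function names are hypothetical choices made for illustration, not the syntax of any specific tool.

```python
from typing import Dict, List, Tuple


def activation_encoding(
    gamma: List[str], prefix: str = "act_"
) -> Tuple[str, Dict[str, str]]:
    """Return Psi as a string plus the map from activation variable to formula."""
    # Freshness check: the activation names must not clash with symbols in Gamma.
    if any(prefix in phi for phi in gamma):
        raise ValueError("activation prefix already occurs in the specification")
    activation = {f"{prefix}{i}": phi for i, phi in enumerate(gamma, start=1)}
    psi = " & ".join(f"({a} -> ({phi}))" for a, phi in activation.items())
    return psi, activation


def core_from_activations(failed: List[str], activation: Dict[str, str]) -> List[str]:
    """Map activation variables reported by a solver back to formulas of Gamma."""
    return [activation[a] for a in failed]


psi, act = activation_encoding(["F a", "G !a", "F b"])
print(psi)
print(core_from_activations(["act_1", "act_2"], act))
```

A solver that reports which activation variables are responsible for the unsatisfiability of Ψ thus directly identifies the corresponding formulas of Γ.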

Extracting Unsatisfiable Cores for LTLf
We present here how algorithms A1, ..., A4 (see below for their definition) can be leveraged to define four new algorithms for the extraction of unsatisfiable cores for a given set of LTLf formulas, following the two different strategies S1 and S2 highlighted in Fig. 1. Section 4.1 provides the results for Strategy S1 and contains three new algorithms that extend three state-of-the-art algorithms originally developed for LTL, either relying on LTL model checking or on temporal resolution; Section 4.2 instead provides the results for Strategy S2 and contains a new algorithm that extends a reference approach developed for LTLf in a native manner.

Strategy S1: LTLf Unsatisfiable Core Extraction via Reduction to LTL Unsatisfiable Core Extraction
This section provides details of how we extract LTLf unsatisfiable cores via reduction to LTL satisfiability checking over infinite traces, and via LTL temporal resolution. The first two algorithms we present leverage two different state-of-the-art techniques for LTL satisfiability checking: A1) an approach based on Binary Decision Diagrams (BDDs, Bryant 1992) as in the work of Clarke et al. (1997); and A2) a SAT-based approach as presented by Biere et al. (2006).
The third algorithm is based on temporal resolution for LTL (A3, Hustadt and Konev 2003; Schuppan 2016), extended to support past temporal operators. We leverage Theorem 6 to obtain the unsatisfiable cores (UCs) of Γ = {φ_1, ..., φ_N} by looking at the activation variables A = {a_1, ..., a_N}, with A ∩ AP = ∅, that make the formula ⋀_{i=1..N} (a_i → φ_i) ∧ ⋀_{a_i∈A} a_i unsatisfiable. In the following, we show how to obtain the UCs using different solving techniques.

Algorithm NA1: BDD-based LTLf Unsatisfiable Core Extraction
Given the set Γ = {φ_1, ..., φ_N} of LTLf formulas, we build the formula Ψ as discussed in Theorem 5: Ψ = ⋀_{i=1..N} (a_i → φ_i) for a set A = {a_1, ..., a_N} of fresh variables such that A ∩ AP = ∅. Then, we consider the following LTL formula built leveraging Corollary 1: Ψ′ = F end ∧ G(end → X end) ∧ f2l(Ψ). The set ⟦Ψ′⟧ resulting from applying language emptiness algorithms on Ψ′ (i.e., BddLtlSat(Ψ′) in Algorithm NA1) can be symbolically represented as a propositional formula whose models encode all states that are the initial state of some infinite trace satisfying Ψ′. Notice that the formula ⟦Ψ′⟧ contains the activation variables in A and the variables in AP alongside the variables in AP_B(Ψ′). The latter are needed to encode the symbolic Büchi automaton for Ψ′ (see Section 2.2.1 for details).
Theorem 7. Let Γ = {φ_1, ..., φ_N} be a set of LTLf formulas over AP, A = {a_1, ..., a_N} a set of propositional variables with A ∩ AP = ∅, Ψ an LTLf formula defined as Ψ = ⋀_{i=1..N} (a_i → φ_i), and Ψ′ an LTL formula defined as Ψ′ = F end ∧ G(end → X end) ∧ f2l(Ψ). There exists a state s ∈ ⟦Ψ′⟧ and a set C ⊆ A such that s |=_p ⋀_{a_i∈C} a_i if and only if ⋀_{i: a_i∈C} φ_i is satisfiable.
Proof. (⟹) Suppose there exists a state s in ⟦Ψ′⟧ such that s |=_p ⋀_{a_i∈C} a_i. By (AssN1), there exists a trace π starting from s satisfying Ψ′. Since s |=_p a_i for all a_i ∈ C, trace π also satisfies ⋀_{i: a_i∈C} φ_i.
(⟸) Suppose that ⋀_{i: a_i∈C} φ_i is satisfiable by some word w over the alphabet 2^AP. We extend w to w′ such that w′ satisfies ⋀_{a_i∈C} a_i. As a result, w′ satisfies Ψ′. By (AssN2), there exists a trace π starting from ⟦Ψ′⟧ that satisfies w′. Then, π[0] satisfies ⋀_{a_i∈C} a_i.
This theorem allows for the extraction from ⟦Ψ′⟧ of all possible subsets of the implicants φ_1, ..., φ_N that are consistent or inconsistent. In particular, given the set of states ⟦Ψ′⟧ corresponding to the assignments of variables in A, AP and AP_B(Ψ′), the set {s|_A ∈ 2^A | ⋀_{i: s∈⟦Ψ′⟧, s |=_p a_i} φ_i is unsatisfiable} can be obtained from ⟦Ψ′⟧ by quantifying existentially (projecting) the variables corresponding to AP and AP_B(Ψ′) and negating (complementing) the result:
UCS_Γ(A) = ¬∃(AP ∪ AP_B(Ψ′)). ⟦Ψ′⟧    (4)
UCS_Γ(A) is a propositional formula over variables in A where each satisfying assignment corresponds to an unsatisfiable core for Γ.
Algorithm NA1: BDD-based LTLf UC extraction with the approach of Clarke et al. (1997)
Input: ...N (a_i → φ_i), and Ψ′ an LTL formula defined as
Equation (4) can be easily implemented with BDDs through the respective existential quantification and negation BDD operations (Bryant, 1992; Cimatti et al., 2007).
Algorithm NA1 computes the set of all the unsatisfiable cores (UCS) for a set of LTLf formulas Γ by leveraging the BDD-based approach discussed by Clarke et al. (1997). It takes as input a set Γ = {φ_1, ..., φ_N} of LTLf formulas over AP. First, it builds a set A = {a_1, ..., a_N} of distinct activation variables such that A ∩ AP = ∅. Then it builds the LTLf formula Ψ = ⋀_{i=1..N} (a_i → φ_i), and converts it into the LTL formula Ψ′ ← F end ∧ G(end → X end) ∧ f2l(Ψ) leveraging Corollary 1. Finally, it employs the BddLtlSat algorithm (Clarke et al., 1997) to compute ⟦Ψ′⟧. Algorithm NA1 returns the empty set ∅ if the formula is satisfiable; otherwise it returns a UC ∈ UCS ⊆ 2^A such that for every s ∈ UCS the formula ⋀_{i: s |=_p a_i} φ_i is unsatisfiable.
For a more thorough discussion on how the check for language emptiness is performed, see Section 2.2.1 and the work of Clarke et al. (1997).
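The role of the activation variables can be illustrated on a toy scale. The following minimal Python sketch does not use BDDs and is not the NuSMV implementation: it replaces the symbolic projection-and-complement step with explicit enumeration of the assignments to $A$, and it decides satisfiability with a hypothetical brute-force oracle over traces of bounded length (which is adequate only for tiny examples such as this one). The names `traces`, `is_satisfiable`, `all_unsat_subsets` and `pick_one`, as well as the smallest-first selection policy, are assumptions made for illustration.

```python
from itertools import combinations, product

def traces(ap, max_len):
    """Yield every non-empty finite trace over 2^ap with at most max_len states."""
    states = [frozenset(s) for r in range(len(ap) + 1) for s in combinations(ap, r)]
    for n in range(1, max_len + 1):
        yield from product(states, repeat=n)

def is_satisfiable(constraints, ap, max_len=3):
    """Toy oracle: True if some bounded trace satisfies all constraints."""
    return any(all(c(t) for c in constraints) for t in traces(ap, max_len))

def all_unsat_subsets(gamma, ap):
    """Enumerate the assignments to the activation variables (as index sets)
    whose activated formulas are jointly unsatisfiable."""
    ucs = set()
    for r in range(1, len(gamma) + 1):
        for active in combinations(range(len(gamma)), r):
            if not is_satisfiable([gamma[i] for i in active], ap):
                ucs.add(frozenset(active))
    return ucs

def pick_one(ucs):
    """Pick one core deterministically (smallest first); empty set if none."""
    return min(ucs, key=lambda c: (len(c), sorted(c))) if ucs else frozenset()

# Formulas are Python predicates over a trace (a tuple of sets of propositions).
gamma = [
    lambda t: all("p" in s for s in t),      # G p
    lambda t: any("p" not in s for s in t),  # F !p
    lambda t: "p" in t[0],                   # p
]
ucs = all_unsat_subsets(gamma, ap=["p"])
core = pick_one(ucs)
```

Here `ucs` contains every inconsistent subset of the three formulas, not only the minimal ones, mirroring the fact that each satisfying assignment of $\mathit{UCS}_\Gamma(A)$ is an unsatisfiable core.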
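Theorem 8. Let $\Gamma = \{\varphi_1, \ldots, \varphi_N\}$ be a set of LTL$_f$ formulas over $\mathit{AP}$, $A = \{a_1, \ldots, a_N\}$ a set of propositional variables with $A \cap \mathit{AP} = \emptyset$, $\Psi$ an LTL$_f$ formula defined as $\Psi = \bigwedge_{i=1..N} (a_i \rightarrow \varphi_i)$, and $\Psi'$ an LTL formula defined as $\Psi' = \mathsf{F}\,\mathit{end} \wedge \mathsf{G}(\mathit{end} \rightarrow \mathsf{X}\,\mathit{end}) \wedge \mathit{f2l}(\Psi)$. Algorithm NA1 returns $\emptyset$ if the set of LTL$_f$ formulas $\Gamma$ is satisfiable. Otherwise, it computes a set $\mathit{UCS} \neq \emptyset$ such that, for every $\mathit{UC} \in \mathit{UCS}$, $\Phi_{\mathit{UC}} = \{\varphi_i \mid a_i \in \mathit{UC}\}$ is an unsatisfiable core for $\Gamma$, and then returns a $\mathit{UC} \in \mathit{UCS}$, which corresponds to an unsatisfiable core for $\Gamma$.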
Proof. The proof is a direct consequence of Theorem 7 and Corollary 2. Indeed, BddLtlSat($\Psi'$) computes $[\![\Psi']\!]$, i.e., the set of states that are a starting point of some trace satisfying $\Psi'$. If $\Gamma$ is satisfiable, then any of its subsets is satisfiable as well. Therefore, any possible assignment to $a_i$ will be such that $\Psi'$ is satisfiable. As a consequence, $\mathit{SAT}_\Gamma(A) \neq \emptyset$, and thus $\mathit{UCS}_\Gamma(A) = \emptyset$ as per Line 3 of Algorithm NA1. On the other hand, if $\Gamma$ is unsatisfiable, Equation (4) extracts the formula over variables $a_i$ such that every satisfying assignment for such formula corresponds to an unsatisfiable core for $\Psi'$. Notice that an unsatisfiable core for $\Psi'$ is, in turn, an unsatisfiable core for $\Gamma$. Therefore, the algorithm selects one of these assignments with the PickOne(UCS) operation on Line 4 of Algorithm NA1, thereby yielding an unsatisfiable core for $\Gamma$.

Algorithm NA2: SAT-based LTL f Unsatisfiable Core Extraction
Determining language emptiness of an LTL formula can also be performed leveraging any off-the-shelf technique for SAT-based bounded model checking (BMC) equipped with completeness check (Biere et al., 2006; Claessen & Sörensson, 2012). We observe that all these approaches can be easily extended to extract an unsatisfiable core from a conjunction of temporal constraints by leveraging the ability of propositional SAT solvers to check the satisfiability of a propositional formula $\psi$ under a set of assumptions specified in the form of literals $l_j \in L$: if $\psi' = \psi \wedge \bigwedge_{l_j \in L} l_j$ turns out to be unsatisfiable, then the SAT solver can return a subset $\mathit{UC} \subseteq L$ such that $\psi \wedge \bigwedge_{l_j \in \mathit{UC}} l_j$ is unsatisfiable. SAT-based bounded model checking (Biere et al., 2003) encodes a finite trace of length $k$ with a propositional formula over the set of variables representing the $\mathit{AP}$ at each time step from 0 to $k$. To check for completeness, these approaches typically encode the fact that the trace cannot be extended with states not yet visited (Biere et al., 2006). We remark that, in model checking, one considers both a transition system (i.e., a model) and a temporal logic formula. However, since we focus on satisfiability of LTL formulas only, we consider as a transition system the universal model (i.e., given a set of propositional variables $\mathit{AP}$, the initial set of states is $2^{\mathit{AP}}$, and the transition relation is $2^{\mathit{AP}} \times 2^{\mathit{AP}}$). Notice that this operation corresponds to encoding symbolically both the initial set of states and the transition relation with $\top$.
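Solving under assumptions can be sketched in a few lines of Python. The example below is only an illustration under simplifying assumptions: the hypothetical `satisfiable` helper enumerates all assignments instead of running a real SAT solver, and the core is obtained by removing assumptions one at a time, whereas actual solvers typically derive it from their final conflict. Clauses are lists of non-zero integers in the usual DIMACS style (a negative number denotes a negated variable).

```python
from itertools import product

def satisfiable(clauses, fixed):
    """Brute force: is there a model of `clauses` in which every literal of `fixed` holds?"""
    variables = sorted({abs(l) for c in clauses for l in c} | {abs(l) for l in fixed})
    for bits in product([False, True], repeat=len(variables)):
        model = dict(zip(variables, bits))

        def holds(lit):
            return model[abs(lit)] == (lit > 0)

        if all(holds(l) for l in fixed) and all(any(holds(l) for l in c) for c in clauses):
            return True
    return False

def solve_with_assumptions(clauses, assumptions):
    """Return ("SAT", []) or ("UNSAT", core) with core a subset of the assumptions."""
    if satisfiable(clauses, assumptions):
        return "SAT", []
    core = list(assumptions)
    for lit in list(assumptions):
        trial = [l for l in core if l != lit]
        if not satisfiable(clauses, trial):  # still inconsistent without `lit`
            core = trial
    return "UNSAT", core

# Variables 1..3 play the role of activation literals, variable 4 is x:
# (a1 -> x), (a2 -> !x), (a3 -> x)
clauses = [[-1, 4], [-2, -4], [-3, 4]]
res, core = solve_with_assumptions(clauses, [1, 2, 3])
```

In this run the solver answers UNSAT and the returned core keeps only the second and third assumption, which are already contradictory on their own.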
The approach proceeds as illustrated in Algorithm NA2. Similarly to Algorithm NA1, Algorithm NA2 takes as input a set $\Gamma = \{\varphi_1, \ldots, \varphi_N\}$ of LTL$_f$ formulas over $\mathit{AP}$. First, it builds a set $A = \{a_1, \ldots, a_N\}$ of distinct activation variables such that $A \cap \mathit{AP} = \emptyset$. Then, it builds the LTL$_f$ formula $\Psi = \bigwedge_{i=1..N} (a_i \rightarrow \varphi_i)$ and converts it into the LTL formula $\Psi' \leftarrow \mathsf{F}\,\mathit{end} \wedge \mathsf{G}(\mathit{end} \rightarrow \mathsf{X}\,\mathit{end}) \wedge \mathit{f2l}(\Psi)$ leveraging Corollary 1. It computes an unsatisfiable core for the set of LTL$_f$ formulas $\Gamma$ leveraging the bounded model checking encoding defined by Biere et al. (2006). To this end, it uses a completeness formula EncC($\phi$, $k$), which is unsatisfiable iff the LTL formula $\phi$ is unsatisfiable, and a witness formula EncP($\phi$, $k$), which is satisfiable iff $\phi$ is satisfiable by a trace of length $k$.⁵ For increasing values of $k$, we submit to the SAT solver a propositional encoding up to the considered $k$ of a trace satisfying the formula $\Psi'$ under the assumption that all the literals in $A$ are true in the initial time step 0. In Algorithm NA2, we denote these literals with $a_i^{[0]}$ and call SAT Assume(EncC($\Psi'$, $k$), $A^{[0]}$), which checks the satisfiability of EncC($\Psi'$, $k$) under the assumption that the literals in $A^{[0]}$ are true. When the call proves the formula unsatisfiable by returning UNSAT, it is straightforward to get one propositional unsatisfiable core from the SAT solver in terms of a subset of the variables in $A^{[0]}$, thus concluding the search. However, if the call returns SAT, we cannot conclude that the LTL formula is unsatisfiable.
5. We refer the reader to the work of Biere et al. (2006) for details on how the propositional formulas are constructed by EncC($\phi$, $k$) and EncP($\phi$, $k$).
[Algorithm NA2: SAT BMC-based LTL$_f$ UC extraction with the approach of Biere et al. (2006). Recoverable steps of the listing: $\mathit{res}, \mathit{UC} \leftarrow$ SAT Assume(EncC($\Psi'$, $k$), $A^{[0]}$); if $\mathit{res}$ = UNSAT then return $\mathit{UC}$; […]; if $\mathit{res}$ = SAT then return $\emptyset$; […].]
In such a case, we check whether a lasso-shaped trace of length $k$ exists that satisfies the propositional formula EncP($\Psi'$, $k$) $\wedge \bigwedge_{a_i \in A} a_i^{[0]}$ by submitting this formula to the SAT solver with a new call. If SAT is returned, then we can conclude that the LTL formula is satisfiable, and the algorithm returns $\emptyset$. Otherwise, we increase $k$ and iterate. This approach is guaranteed to eventually terminate (Biere et al., 2006).
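The control flow just described can be summarised with the following Python sketch. It is a skeleton under stated assumptions rather than the NuSMV-S code: `enc_c`, `enc_p` and `sat_assume` are hypothetical callables standing in for the encodings of Biere et al. (2006) and for an assumption-aware SAT solver, and `make_stub` fabricates toy oracles whose answers depend only on $k$, for demonstration purposes. The `max_k` bound and the resulting "UNKNOWN" outcome reflect a practical depth limit, not part of the algorithm itself.

```python
def bmc_core_loop(enc_c, enc_p, sat_assume, assumptions, max_k):
    """Return ("UNSAT", core), ("SAT", set()) or ("UNKNOWN", None)."""
    for k in range(max_k + 1):
        # Completeness check: UNSAT means the LTL formula is unsatisfiable.
        res, core = sat_assume(enc_c(k), assumptions)
        if res == "UNSAT":
            return "UNSAT", set(core)
        # Witness check: SAT means a lasso-shaped model of length k exists.
        res, _ = sat_assume(enc_p(k), assumptions)
        if res == "SAT":
            return "SAT", set()
    return "UNKNOWN", None  # depth bound reached without an answer

def make_stub(unsat_at=None, sat_at=None, core=()):
    """Toy stand-ins for the encoders and the solver (illustrative only)."""
    def enc_c(k):
        return ("C", k)

    def enc_p(k):
        return ("P", k)

    def sat_assume(formula, assumptions):
        kind, k = formula
        if kind == "C":
            if unsat_at is not None and k >= unsat_at:
                return "UNSAT", [a for a in assumptions if a in core]
            return "SAT", []
        if sat_at is not None and k >= sat_at:
            return "SAT", []
        return "UNSAT", list(assumptions)

    return enc_c, enc_p, sat_assume

A0 = ["a1", "a2", "a3"]
unsat_run = bmc_core_loop(*make_stub(unsat_at=2, core=("a1", "a3")), A0, max_k=50)
sat_run = bmc_core_loop(*make_stub(sat_at=1), A0, max_k=50)
unknown_run = bmc_core_loop(*make_stub(), A0, max_k=3)
```

The three runs exercise the three possible outcomes: a core returned by the completeness check, a witness found by the second call, and an exhausted depth bound.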
To sum up, Algorithm NA2 takes as input a set of LTL$_f$ formulas $\Gamma = \{\varphi_1, \ldots, \varphi_N\}$ and returns the empty set ($\emptyset$) if the specification is satisfiable; otherwise, it returns a subset $\mathit{UC} \subseteq A$ such that $\bigwedge_{i,\, a_i \in \mathit{UC}} \varphi_i$ is unsatisfiable. Equipped with these notions, we prove the following.
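Lemma 1. Let $\Gamma = \{\varphi_1, \ldots, \varphi_N\}$ be a set of LTL$_f$ formulas over $\mathit{AP}$, $A = \{a_1, \ldots, a_N\}$ a set of propositional variables with $A \cap \mathit{AP} = \emptyset$, $\Psi$ an LTL$_f$ formula defined as $\Psi = \bigwedge_{i=1..N} (a_i \rightarrow \varphi_i)$, and $\Psi'$ an LTL formula defined as $\Psi' = \mathsf{F}\,\mathit{end} \wedge \mathsf{G}(\mathit{end} \rightarrow \mathsf{X}\,\mathit{end}) \wedge \mathit{f2l}(\Psi)$. […]
Proof. ($\Rightarrow$) Let $\pi[0, k]$ be a finite trace corresponding to a satisfying assignment for EncP($\Psi'$, $k$) $\wedge \bigwedge_{a_i \in A} a_i^{[0]}$. From assumption (AssN1), there exists an $l$ (with $0$ […]) and the construction of $\Psi'$, we conclude that the projection of $\pi$ […] holds in the initial state, is a trace satisfying $\Psi'$. As a consequence, also EncP($\Psi'$, $k$) $\wedge \bigwedge_{a_i \in A} a_i^{[0]}$ is satisfiable.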
Proof. Let EncC($\Psi'$, $k$) be unsatisfiable under the assumption $\bigwedge_{a_i \in \mathit{UC}} a_i^{[0]}$. Let us assume there exists a witness $\pi$ of $\bigwedge_{i,\, a_i \in \mathit{UC}} \varphi_i$. We can extend the trace $\pi$ to a trace $\pi'$ over variables $\mathit{AP} \cup A$ such that $\bigwedge_{a_i \in \mathit{UC}} a_i^{[0]}$ holds; $\pi'$ is then a trace satisfying $\Psi'$, thus contradicting the assumption (AssN2).
Proof. The proof is a direct consequence of Lemma 1 and Lemma 2.
Algorithm NA2 uses the encoding of Biere et al. (2006) for both EncC($\Psi'$, $k$) and EncP($\Psi'$, $k$). We remark that the schema can also be adapted to leverage other algorithms such as the ones based on k-liveness (Claessen & Sörensson, 2012) or liveness to safety (Biere et al., 2002), both relying on the IC3 algorithm (Bradley, 2011). What intuitively changes is the propositional encoding of the LTL formula and the calls to the SAT solver to reflect the IC3 algorithm. However, this is left as future work.

Algorithm NA3: Temporal Resolution based LTL f Unsatisfiable Core Extraction
We can extract the unsatisfiable core of a set of LTL$_f$ formulas $\Gamma = \{\varphi_1, \ldots, \varphi_N\}$ via LTL temporal resolution (TR, Hustadt and Konev 2003) by leveraging the results previously discussed in this paper and existing LTL temporal resolution engines equipped for temporal unsatisfiable core extraction (Schuppan, 2016). Algorithm NA3 computes an unsatisfiable core for $\Gamma$ as follows. First, it creates a set of fresh propositional variables $A = \{a_1, \ldots, a_N\}$ such that $A \cap \mathit{AP} = \emptyset$. Second, it builds the formula $\Psi = \bigwedge_{i=1..N} (a_i \rightarrow \varphi_i)$. Third, it leverages Theorem 1 to convert the LTL$_f$ formula into an equi-satisfiable LTL formula $\Psi' = \mathsf{F}\,\mathit{end} \wedge \mathsf{G}(\mathit{end} \rightarrow \mathsf{X}\,\mathit{end}) \wedge \mathit{f2l}(\Psi)$. Fourth, it applies Theorem 4 to remove the past temporal operators in $\Psi'$ and enforces each activation variable $a_i \in A$ to hold, thus building the LTL formula $\psi \leftarrow \phi' \wedge \bigwedge_{\varphi \in \Upsilon} \varphi \wedge \bigwedge_{a_i \in A} a_i$. Finally, the resulting LTL formula $\psi$ is given as input to any LTL temporal resolution solver suitable to extract a temporal unsatisfiable core. In particular, we rely on the trp++ temporal resolution solver (Schuppan, 2016). If the solver responds UNSAT, we get an unsatisfiable core of the original set of LTL$_f$ formulas by looking at the activation variables in the extracted temporal unsatisfiable core $\mathit{UC}_\psi$.
[Algorithm NA3: TR LTL$_f$ UC extraction with the approach of Schuppan (2016).]
Algorithm NA3 takes as input the set $\Gamma = \{\varphi_1, \ldots, \varphi_N\}$ and returns the empty set ($\emptyset$) if $\Gamma$ is satisfiable. Otherwise, it returns a subset $\mathit{UC} \subseteq A$ such that $\bigwedge_{i,\, a_i \in \mathit{UC}} \varphi_i$ is unsatisfiable. It uses the trp++ algorithm (Schuppan, 2016) to compute the unsatisfiable core $\mathit{UC}_\psi$, and then it extracts from it only the formulas corresponding to $a_i \in A$ (denoted in the algorithm with $\mathit{UC}_\psi|_A$). The trp++($\psi$) algorithm introduced by Schuppan (2016) first converts the LTL formula $\psi$ into an equi-satisfiable set of Separated Normal Form (SNF, Fisher 1991) clauses $C$, and then it checks whether this set is satisfiable. In case of unsatisfiability, it computes an unsatisfiable core $C^{uc} \subseteq C$ and returns the set $\mathit{UC}_\psi$ obtained from $C^{uc}$ by applying a reconstruction with respect to the original set of LTL formulas. We refer the reader to the work of Schuppan (2016) for more details on the algorithm and the proof of correctness of the trp++ algorithm.
On the other hand, if $\Gamma$ is unsatisfiable, then also $\psi = \phi' \wedge \bigwedge_{\varphi \in \Upsilon} \varphi \wedge \bigwedge_{a_i \in A} a_i$ is unsatisfiable. Therefore, trp++($\psi$) returns UNSAT together with an unsatisfiable core for the formula $\psi$ (denoted $\mathit{UC}_\psi$ in the algorithm). We remark that, given the structure of $\psi$, each $a_i$ will be then converted into an SNF clause $c_{a_i} = a_i$, which will thus be part of the set of SNF clauses $C$ used internally by trp++. Since this formula is unsatisfiable, the trp++ algorithm will extract an unsatisfiable core $\mathit{UC}_\psi = C^{uc} \subseteq C$ such that $C^{uc}$ is unsatisfiable. Among other clauses, $C^{uc}$ will contain some $c_{a_i}$ for $a_i \in A$ that correspond to the respective formulas in $\Gamma$ (thanks also to Theorem 6). Therefore, the set $\mathit{UC}_\psi = C^{uc}$ restricted to the variables $a_i \in A$ only (denoted with $\mathit{UC}_\psi|_A$ in the algorithm) will represent an unsatisfiable core for $\Gamma$.
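The bookkeeping around the solver call, namely guarding each formula with a fresh activation variable and restricting the returned core to those variables, can be sketched as follows. This is a minimal Python illustration under loud assumptions: formulas are plain strings in an ad hoc syntax (not the trp++ input language), the `act_` prefix is assumed not to clash with any proposition in $\mathit{AP}$, and `solver_core` is a made-up set of clause names standing in for the output of a temporal resolution solver.

```python
def guard_with_activation(formulas, prefix="act_"):
    """Map fresh activation names to formulas and build Psi as a single string."""
    activation = {f"{prefix}{i}": phi for i, phi in enumerate(formulas, start=1)}
    psi = " & ".join(f"({a} -> ({phi}))" for a, phi in activation.items())
    return activation, psi

def restrict_to_activation(core, activation):
    """Keep only the activation names occurring in `core`; return the guarded formulas."""
    return [activation[a] for a in activation if a in core]

gamma = ["G p", "F !p", "G (q -> X r)"]
activation, psi = guard_with_activation(gamma)

# Hypothetical solver output: activation clauses mixed with auxiliary ones.
solver_core = {"act_1", "act_2", "aux_clause_17"}
uc = restrict_to_activation(solver_core, activation)
```

Only the entries of the core that name activation variables are mapped back to formulas of the input set; every auxiliary clause is discarded.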

Strategy 2: LTL f Unsatisfiable Core Extraction via Native SAT
This section provides details on how we adapted the native SAT-based LTL$_f$ satisfiability approach discussed by Li et al. (2020) to extract the unsatisfiable core. Since the original approach for LTL$_f$ satisfiability checking did not support past temporal operators (Li et al., 2020), we rely on Theorem 4 to get rid of the past temporal operators, thus obtaining an equi-satisfiable LTL$_f$ formula without them.
[Algorithm NA4: SAT LTL$_f$ UC extraction with the approach of Li et al. (2020).]
The algorithm introduced by Li et al. (2020) can be extended to extract the unsatisfiable core of a set $\Gamma = \{\varphi_1, \ldots, \varphi_N\}$ of LTL$_f$ formulas over $\mathit{AP}$ as follows (see Algorithm NA4).
The resulting LTL$_f$ formula (without past temporal operators) is then passed as input to the SATLTLF algorithm discussed in the work of Li et al. (2020). The SATLTLF algorithm in Algorithm NA4 is almost identical to the original one presented in (Li et al., 2020): the only modification consists in changing each internal call to SAT Assume so that it also assumes that the activation variables in $A$ are all true. We refer the reader to the work of Li et al. (2020) for a thorough description of the algorithm (which is out of the scope of this paper). Intuitively, given an LTL$_f$ formula $\phi$, the algorithm by Li et al. (2020) constructs a conflict sequence $C = C[0], \ldots, C[k]$ (i.e., a sequence of states that cannot reach a final state of the transition system $T_\phi$ constructed from the formula $\phi$ given as input). This sequence is extracted from the unsatisfiable cores resulting from different propositional unsatisfiable queries performed within the algorithm itself. As per Theorem 3, the input LTL$_f$ formula is unsatisfiable iff there exists a conflict sequence $C$ and an integer $i \geq 0$ such […]. When the SATLTLF algorithm returns UNSAT, we extract from the last element of the computed conflict sequence (i.e., $C[i + 1]$) the unsatisfiable core $\mathit{UC} \subseteq A$, and the set $\Gamma' = \{\varphi_i \mid a_i \in \mathit{UC}\}$ is an unsatisfiable core for $\Gamma$ leveraging Theorem 3.
Proof. If the set $\Gamma$ is satisfiable, then the formula $\psi = \phi' \wedge \bigwedge_{\varphi \in \Upsilon} \varphi$, where $\langle \phi', \Upsilon \rangle = \mathit{p2f}(\Psi, \emptyset)$ and $\Psi = \bigwedge_{i=1..N} (a_i \rightarrow \varphi_i)$, is also satisfiable since it resorts to transformations that preserve satisfiability (see Theorems 4, 1 and 6). Thus, as SATLTLF is correct and complete (Li et al., 2020), […] will represent the states from which it is not possible to reach a final state for $T_\psi$ (i.e., there is no trace starting from these states that satisfies $\psi$ or, put in other words, all the traces from these states do not satisfy $\psi$).
Thus, for all states $s \in C[i+1]$ there exists a set $C \subseteq A$ such that $s \models_p \bigwedge_{a_i \in C} a_i$, the formula $\bigwedge_{i,\, a_i \in C} \varphi_i$ is unsatisfiable, and the set $C$ corresponds to an unsatisfiable core extracted from $C[i + 1]$ by construction of the SATLTLF algorithm. We refer the reader to the work of Li et al. (2020) for further details in this regard.

Discussion
We make the following observations. All the described approaches extract one unsatisfiable core, though not necessarily a minimum/minimal one. Algorithm NA1 could also be easily extended to get the minimal UC from the UCS set of all possible unsatisfiable cores for the given formula. For the SAT-based approaches, a minimum/minimal unsatisfiable core could be extracted by leveraging the ability of the SAT solver to get a minimum/minimal propositional unsatisfiable core. Similarly, the temporal resolution solver could be instrumented to get a minimum/minimal core. In all cases, it might be possible to get a minimum/minimal one with specialised solvers and/or with additional search. However, this is left for future work.
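As an illustration of what such an additional search could look like, the sketch below applies the well-known deletion-based shrinking scheme to an already extracted core: each element is tentatively removed and dropped for good if the remainder is still unsatisfiable. This is not part of the four algorithms above. It yields a minimal core (no element can be removed), not necessarily a minimum one, and it assumes a hypothetical, monotone `is_unsat` oracle; the toy oracle here reasons about constraints on a single integer in a small range.

```python
def shrink_to_minimal(core, is_unsat):
    """Deletion-based shrinking: one oracle call per element of the input core."""
    core = list(core)
    for item in list(core):
        trial = [c for c in core if c != item]
        if is_unsat(trial):  # `item` is not needed for the inconsistency
            core = trial
    return core

# Toy oracle: named constraints over an integer x in 0..9 (illustrative only).
PREDICATES = {
    "x>5": lambda x: x > 5,
    "x<3": lambda x: x < 3,
    "x!=4": lambda x: x != 4,
    "x>=0": lambda x: x >= 0,
}

def is_unsat(names):
    return not any(all(PREDICATES[n](x) for n in names) for x in range(10))

minimal = shrink_to_minimal(["x>5", "x<3", "x!=4", "x>=0"], is_unsat)
```

The price of minimality is one extra satisfiability check per element of the initial core, which for temporal formulas can be far more expensive than in this toy setting.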

Experimental Evaluation
In this section, we provide details on the implementations of the proposed algorithms (Section 5.1), and then we describe the setup and the data sets used for the experimental evaluation (Section 5.2). We conclude with a report on the results alongside an examination thereof (Section 5.3).

Implementation of the Algorithms
Table 1 summarises our implementations of the four algorithms described in Section 4. We realise Algorithms NA1 and NA2 as extensions of the NuSMV model checker (Cimatti et al., 2002) exploiting the built-in support for past temporal operators, the f2l(φ) conversion, and Eq. (2). In particular, we enhanced (i) the BDD-based algorithm for LTL language emptiness (Clarke et al., 1997) and (ii) the SAT-based approaches (Biere et al., 2006). We shall henceforth refer to these tools as NuSMV-B and NuSMV-S, respectively. The source code for the extended version of NuSMV with these implementations is available at https://github.com/roveri-marco/ltlfuc.
We create a toolchain for Algorithm NA3. First, our variant of aaltaf generates a file that is suitable for the trp++ temporal resolution solver (Hustadt & Konev, 2003) using the f2l(φ) conversion as per Eq. (2), and p2f(φ, ∅). Then, the resulting file is submitted to trp++. Finally, the generated UC is post-processed to extract the auxiliary variables A.
For our experiments, we use the latest version of trp++.⁶ Finally, we implement Algorithm NA4 within an extended version of the aaltaf tool (Li et al., 2020), with a novel dedicated module supporting past temporal operators through p2f(φ, ∅). The source code for our extended version of aaltaf is available at https://github.com/roveri-marco/aaltaf-uc.

The Experimental Setup
For the experimental evaluation, we considered all the unsatisfiable problems reported in the work of Li et al. (2020), for a total of 1377 problems. To select the specifications of interest to our analysis from the original testbed, we included only those for which at least one solver declared that the set was unsatisfiable and no other tool contradicted the result, as per the experimental data reported by Li et al. (2020). To compute the Γ set, we considered all the top-level conjuncts of each formula in the benchmark set. For every benchmark, we used the variant of the formula in the aaltaf format as an input. For the tools other than aaltaf, we implemented dedicated modules within aaltaf that convert this input into the format accepted by each tool.
We carried out the experimental evaluation considering the four implementations provided by NuSMV-B, NuSMV-S, our variant of aaltaf, and the trp++ toolchain. We ran all experiments on an Ubuntu 18.04.5 LTS machine with an 8-core Intel® Xeon® at 2.2 GHz, equipped with 64 GB of RAM. We set a memory occupation limit of 4 GB and a CPU usage limit of 60 min.⁷ Additionally, we considered k = 50 as the maximum depth for NuSMV-S, and we ran NuSMV-B with the BDD dynamic variable reordering mode active (Felt et al., 1993) to dynamically reduce the size of the BDDs and thus save space over time.⁸ Whenever the wall-clock timing reported by the implemented technique fell under the lowest sensitivity of the tool, we replaced the timing with the minimum non-zero timing reported overall (i.e., $3.78 \times 10^{-4}$ s).
6. http://www.schuppan.de/viktor/trp++uc/.
Finally, we categorised the benchmarking specifications into 25 families, according to their characteristics and provenance. Table 2 shows the number of specifications per family, along with the minimum, maximum and average number of clauses within it. In particular, the LTLf-specific/benchmarks/LTLFRandomConjunction/V20 and LTLf-specific/benchmarks/LTLFRandomConjunction/C100 benchmarks are conjunctions of formulas, each selected randomly from standard patterns. They are characterised by a temporal depth (i.e., the maximum nesting of temporal operators) of up to 3, with 20 propositional variables. The number of conjunctions ranges from 20 to 100 for the former, and from 10 to 100 for the latter. The LTL-as-LTLf/rozier/counter/* benchmarks⁹ are characterised by the fact that they have temporal formulas of different temporal depths (from 2 to 20) with a small number of propositional variables. These formulas are a conjunction of subformulas characterised by a top-level G whose body contains a nested chain of 2 to 20 X's. The benchmarks in LTL-as-LTLf/schuppan/O1Formula contain a large number of propositional variables (from 1 to 1000) with temporal formulas of small depth (2 to 3) and different operators. The benchmarks in LTL-as-LTLf/schuppan/O2Formula are big conjunctions of formulas in the form $\mathsf{G}\,\mathsf{F}\,a_i \leftrightarrow a_j$ with $a_i \neq a_j$.

The Results
In the experimental evaluation, we consider the following evaluation metrics: (i) the result of the check (expecting all the tools to return unsatisfiability and extract an unsatisfiable core if no resource limit is reached); (ii) the search time to compute and return an unsatisfiable core; (iii) the size of the computed unsatisfiable core. We remark that none of the presented algorithms strives for finding a minimum unsatisfiable core. The approach based on trp++ may be used to that end, and NuSMV-B could in principle be easily adapted to select from the intermediate computed set UCS a minimum unsatisfiable core (as discussed previously). Nevertheless, we pick the first returned UC for all tools so as to have a fair comparison among them.
7. These settings are motivated by similar choices performed in the experimental evaluations carried out by Li et al. (2020).
8. These settings are motivated by similar choices performed in the experimental evaluations carried out by Cimatti et al. (2007) and Schuppan (2016).
9. For the sake of conciseness, we use the /* shorthand notation to compactly indicate all benchmark families under a given root. For example, LTL-as-LTLf/rozier/counter/* is a collective identifier for the LTL-as-LTLf/rozier/counter/counter, LTL-as-LTLf/rozier/counter/counterCarry, LTL-as-LTLf/rozier/counter/counterCarryLinear, and LTL-as-LTLf/rozier/counter/counterLinear benchmark families.
The first result is that, as expected, all the tools reported consistent output when terminating without reaching a resource limit (be it memory, time or search-space depth). In other words, for all the considered benchmarks it was never the case that an algorithm declared the specification as satisfiable. This outcome is in line with the original findings of Li et al. (2020). We also checked that every computed core was unsatisfiable by feeding it into the aaltaf algorithm. However, we remark that individual algorithms could extract different unsatisfiable cores among the diverse possible ones.
In the following, we further investigate the performance of the algorithms in terms of efficiency and efficacy. We gauge the former taking into account the time the tools take to find a UC. We measure the latter by the cardinality of the returned UC. In principle, indeed, a tool could just return the whole input set of constraints as unsatisfiable. Such an output would be very fast to compute, and the tool would be deemed as highly efficient. However, the outcome would be of scarce informativeness.
In the remainder of the section, we shall consider the sole cases in which the tools were able to return a UC within the given resource limits (thus excluding timeouts and unknown answers), unless explicitly stated otherwise. The presentation of the experimental results proceeds as follows. Section 5.3.1 shows how tools rank in terms of both criteria (i.e., execution time and UC cardinality). That section serves as a general introduction to the following, more detailed analysis. Section 5.3.2 focuses on how algorithms cumulatively perform in terms of execution time. Section 5.3.3 delves deeper into this topic by illustrating pairwise comparisons of the tools' efficiency. Then, Section 5.3.4 categorises the performance of tools based on the benchmark families, and Section 5.3.5 reports on our studies pertaining to the effect that the number of conjuncts in formulas has on execution time. Section 5.3.6 moves the focus away from time performance and shifts it onto the efficacy of tools, with a comparative evaluation of the cardinality of the UCs returned by the tools. Analogously to Section 5.3.4, the section also categorises the results with respect to the benchmark family of the input. Afterwards, Section 5.3.7 illustrates pairwise comparisons of the tools' efficacy, akin to our analysis reported in Section 5.3.3 for efficiency. To conclude, Section 5.3.8 offers a summary of the findings.

Time Performance and Cardinality of the UC
The Sankey chart in Fig. 2 compactly depicts two-staged rankings: speed of computation and cardinality of the computed UC. In the rankings here we allow for ties (for example, two tools can occupy the first position together). As it turns out, aaltaf is the fastest tool in the majority of tests. However, it returns the UCs that are smallest in size in only about half of the test cases. Although trp++ usually occupies the second position in the time ranking, the cardinality of the returned UC usually is the lowest whenever it finds one. Also, notice that trp++ returns the smallest UC in 680 cases, that is, basically as many times as aaltaf (with 681 cases). NuSMV-B and NuSMV-S manage to return an unsatisfiable core less often than the other two tools. However, NuSMV-B in particular yields comparably small unsatisfiable cores whenever it succeeds in finding one under the imposed time constraints. We recall that we also exclude from the plot the points representing returned unknown answers, which explains the low number of entries for NuSMV-S overall. Next, we discuss the factors that led to the different performance levels in a more in-depth comparative assessment, starting with time.

Execution Time
Figure 3 shows on the x-axis the number of problems solved by each algorithm (within the 60 min timeout), and on the y-axis the time taken to solve them cumulatively. Alongside the aforementioned tools, the figure illustrates the performance of the virtual best, that is, the minimum time required for each solved instance among the four implementations. As above, we exclude from the plot the points representing runs that do not return a UC. The overall minimum, maximum, average, and median timings to return a UC are 0.0004 s, 3478.96 s, 50.7472 s and 0.0812 s, respectively. The maximum, average, and median timings shrink to 2029.7347 s, 4.8534 s and 0.0274 s, respectively, for the virtual best. We observe that aaltaf outperforms the other implementations in the majority of cases, although the tail of the virtual-best curve on the right-hand side of both plots exhibits an influence from trp++ and NuSMV-B, thus witnessing that the proposed approaches are complementary.
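For clarity, the virtual best and the cumulative curve of such a plot can be computed as in the following Python sketch. The data layout (a dictionary from tool name to per-instance timings, with `None` marking a run that returned no UC), the function names, and the sample numbers are all assumptions made for illustration; they are not the measurements reported in this paper.

```python
def virtual_best(timings):
    """Per-instance minimum time over all tools that returned a UC."""
    instances = set().union(*(runs.keys() for runs in timings.values()))
    best = {}
    for inst in instances:
        solved = [runs[inst] for runs in timings.values() if runs.get(inst) is not None]
        if solved:
            best[inst] = min(solved)
    return best

def cumulative_curve(times):
    """Points (number of solved problems, cumulative time), fastest runs first."""
    total, curve = 0.0, []
    for n, t in enumerate(sorted(times), start=1):
        total += t
        curve.append((n, total))
    return curve

# Made-up timings in seconds; None means timeout or unknown answer.
timings = {
    "tool_a": {"p1": 0.5, "p2": None, "p3": 2.0},
    "tool_b": {"p1": 1.5, "p2": 4.0, "p3": None},
}
vb = virtual_best(timings)
curve = cumulative_curve(vb.values())
```

An instance contributes to the virtual best as soon as at least one tool solves it, which is why its curve can extend further to the right than that of any single tool.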

Pairwise Efficiency Comparison
Figure 4 illustrates pairwise comparisons of time efficiency of the considered tools. Here, we also include cases in which a certain answer is not returned. Our objective is to provide a visual clue of tests in which only one (or none) of the two tools in the plot was able to extract the UC. Figure 4(c), in particular, shows that aaltaf outperforms trp++ in terms of computation speed: most of the points, indeed, are located above the diagonal, thus indicating that aaltaf demands less time than trp++ to return the unsatisfiable cores. The plot also shows that trp++ exceeds the timeout in several cases (points on the red line marked with "3600 sec. timeout"). Furthermore, we remark that trp++ operates a pre-processing phase on the input specification prior to the actual identification of UCs. If it manages to reduce the given set of conjuncts to false at that stage, it stops the computation before returning any UCs and raises an alert. The points lying on the line marked with "Input formula simplified to False" indicate those cases (see Figs. 4(c), (e) and (f)). This simplification occurred 26 times in total.
NuSMV-S was able to conclude that the formula was unsatisfiable and return a UC in 44 cases, whereas it yielded an unknown answer (i.e., it reached k = 50 without being able to decide on unsatisfiability) in 1278 cases (see the line labelled with "Unknown" in Figs. 4(b), (d), and (f)). Finally, we report that aaltaf, trp++, NuSMV-B, and NuSMV-S reached the timeout in 84, 562, 990, and 55 cases, respectively.
[…] 2, we determine only one best performer among all tools, without ex-aequo leading positions. When solvers take the same time, we associate the best result to the tool that returned the UC that is smaller in cardinality. […] LTL-as-LTLf/schuppan/O2Formula family, and the LTL-as-LTLf/rozier/counter/* benchmarks, with which NuSMV-B is the best performer.

Best Time Performance per Number of Conjuncts
In order to further inspect the correlation between the time performance of the tools and the type of problems solved, we analysed the relationship between the number of conjuncts of the problems and the corresponding computation time. Figure 6 plots the number of conjuncts in Γ (i.e., its cardinality) against the computation time (in seconds) of each of the considered algorithms. Figure 7 isolates the points stemming from three families in particular: LTLf-specific/benchmarks/LTLFRandomConjunction/V20 (Fig. 7(a)), LTLf-specific/benchmarks/LTLFRandomConjunction/C100 (Fig. 7(b)), and LTL-as-LTLf/schuppan/O1Formula (Fig. 7(c)).
The plots show that a relationship exists between the number of LTL$_f$ clauses and the computation time for all the four tools: the required overall time increases when the number of clauses increases. However, the number of clauses is not the only factor affecting the computation time. For instance, for the LTLf-specific/benchmarks/LTLFRandomConjunction/C100 benchmark family (Fig. 7(b)), the computation time varies independently of the number of conjuncts, which ranges within a narrow interval (118 to 154 clauses, as per Table 2). Also, we can observe that neither NuSMV-S nor NuSMV-B could return a UC under the imposed experimental resource constraints with this benchmark family, while aaltaf appears to be faster than trp++, following the general trend. Especially in Fig. 7(c) (and, to a lesser extent, in Fig. 7(a)) we can observe the different rapidity with which curves increase with the number of conjuncts: the steepest slope is associated with NuSMV-B, followed by trp++ and NuSMV-S. NuSMV-S performs better than trp++ with smaller sets of conjuncts, though. The most gradual upward trend belongs to the curve of aaltaf.
Next, we shift our focus from efficiency to efficacy, i.e., from execution time to the cardinality of the returned UCs.

Extraction of the Smallest UC per Benchmark Family
Figure 8 depicts the result of our efficacy analysis in the different benchmark families. As shown in the pie chart of Fig. 8(a), aaltaf, trp++, NuSMV-B, and NuSMV-S extract UCs that are the smallest in size¹⁰ in 667, 556, 119 and 12 cases, respectively. As in Section 5.3.4, here we declare only one tool per test as the most effective. When multiple solvers return a UC of the same cardinality, we consider the one that took the lower computation time as prevailing. The overall minimum, maximum, average, and median cardinality of the smallest computed UCs were 1, 74, 6.513, and 4, respectively.
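The selection rule for the most effective tool amounts to a lexicographic comparison on the pair (cardinality of the UC, computation time). A minimal Python sketch follows; the function name, the data layout and the sample values are hypothetical, and `None` marks a tool that returned no UC. If two tools coincide on both values, this sketch simply keeps the first one encountered, a case the rule above does not cover.

```python
def most_effective(results):
    """results: {tool: (uc_cardinality, seconds) or None}. Return the winning tool."""
    answered = {tool: r for tool, r in results.items() if r is not None}
    if not answered:
        return None  # corresponds to the "None" category in the plots
    # Tuples compare lexicographically: cardinality first, then time.
    return min(answered, key=lambda tool: answered[tool])

results = {
    "tool_a": (5, 0.2),   # fast but larger core
    "tool_b": (3, 1.4),
    "tool_c": (3, 0.9),   # same cardinality as tool_b, lower time
    "tool_d": None,       # timeout or unknown answer
}
winner = most_effective(results)
```

Swapping the two components of the pair gives the analogous rule used for the best time performer, where ties on time are broken by the smaller cardinality.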
NuSMV-B computes the unsatisfiable core with the smallest size with the majority of test cases in the LTL-as-LTLf/rozier/counter/* benchmarks. Notice that it is also the one that performs best most often in terms of search time (see Fig. 5(b)). Furthermore, NuSMV-B is able to obtain the smallest UC with most of the benchmarks within the LTL-as-LTLf/schuppan/O2Formula family. On all other benchmarks, aaltaf outperforms the other algorithms, and with the LTLf-specific/* benchmarks, trp++ is the second best solver to find the smallest UCs after aaltaf.
These results suggest that NuSMV-B could be preferred on benchmarks with fewer propositional variables and larger temporal depth. However, the SAT-based approaches seem to work better on benchmarks with a higher number of propositional variables that are not always directly correlated with one another. In these cases, indeed, BDDs may suffer a blow-up in size due to the canonicity of the representation, and BDD dynamic variable reordering can help only to a limited extent (Felt et al., 1993). Notice that none of the solvers was capable of dealing with most of the big conjunctions of formulas in the LTL-as-LTLf/schuppan/O2Formula family (the corresponding tallest stacked bar in Fig. 8 is labelled with "None", indeed).
[…] plot. Overall, we can observe that the UCs extracted by trp++ and NuSMV-B (when these tools manage to return a certain answer) are often lower in size than the UCs returned by aaltaf.

Summary of the Findings
To conclude, we remark that these results (i) demonstrate an overall better performance of aaltaf in terms of time efficiency, (ii) show a tie between aaltaf and trp++ with respect to the cardinality of the extracted UCs, and (iii) emphasise a complementarity of the proposed approaches depending on the characteristics of the specifications at hand.
Tables 3 and 4 summarise the above findings. We can observe that none of the algorithms outperforms all the others on every benchmark. For example, NuSMV-S and NuSMV-B end up in a timeout or return an unknown answer in a considerable number of cases, and a number of problems are solved by only one of them. However, NuSMV-B is capable of handling the LTL-as-LTLf/rozier/counter/* benchmarks well. aaltaf does not always turn out to extract the smallest UC: in a number of cases, trp++, NuSMV-B and NuSMV-S extract UCs of a lower cardinality, excelling in particular in those cases in which aaltaf ends in a timeout. A deeper investigation of the characteristics that lead to such behaviours paves the path for future research endeavours.

Related Work
To the best of our knowledge, this is the first research endeavour aimed at extracting unsatisfiable cores for LTLf. In the following, we review the most relevant literature concerning LTL/LTLf satisfiability, and LTL SAT-based UC extraction.
The LTL satisfiability problem has been addressed through tableau-based methods (e.g., Janssen, 1999), temporal resolution (e.g., Fisher et al., 2001), and reduction to model checking (e.g., Cimatti et al., 2007; Rozier & Vardi, 2010, 2011). Rozier and Vardi (2010) reduce the LTL satisfiability problem to a model checking problem and compare different model checkers (explicit and symbolic), observing better performance and quality for symbolic approaches. A thorough comparison of the main tools dealing with the LTL satisfiability problem is reported in (Schuppan & Darmawan, 2011). The paper also considers tableau-based and temporal-resolution-based solvers, revealing a complementary behaviour between some of the considered solvers.
The problem of checking the satisfiability of LTLf properties has been the subject of several works (Fionda & Greco, 2018; Li et al., 2014, 2020). Li et al. (2014) leverage the finite semantics of traces for introducing a propositional SAT-based algorithm for the LTLf satisfiability problem together with some heuristics to guide the search. The approach is implemented in the aalta-finite tool, which is shown to outperform other existing approaches based on the reduction to the LTL satisfiability problem. An extension of that work is presented in (Li et al., 2020). The new approach leverages a transition system (TS) for the input LTLf formula, thereby reducing satisfiability checking to a SAT-based path-search problem over this TS. Implemented in aalta-finite, it is shown to provide the best results in checking unsatisfiable formulas and comparable results for satisfiable ones. Fionda and Greco (2018) investigate the complexity of some fragments of LTLf, and present a SAT-based algorithm that outperforms the aalta-finite version in (Li et al., 2014). Our algorithm NA4 is based upon the work of Li et al. (2020) as a state-of-the-art tool for checking the satisfiability of LTLf properties.

Table 4: Best results as per the cardinality of UCs and wall-clock timings obtained via NuSMV-S and NuSMV-B, alongside the benchmarks for which no tools returned a UC (marked with "None"). The top achievements with aaltaf and trp++ are in Table 3.
UC extraction for LTL has also been the subject of several studies (Awad et al., 2011; Goré et al., 2013; Narizzano et al., 2018; Schuppan, 2016). Goré et al. (2013) present a BDD-based approach that leverages a method to determine minimal UCs for SAT (Huang, 2005) to find minimal UCs in LTL. In the work of Awad et al. (2011), UCs are extracted by leveraging a tableau-based solver to obtain an initial subset of unsatisfiable LTL formulas and then applying a deletion-based minimisation to the subset. The approach, implemented in procmine, is part of a tool for the synthesis of business process templates. Schuppan (2016) proposes a technique to extract fine-grained UCs by constructing and optimising resolution graphs for temporal resolution. Finally, Narizzano et al. (2018) present a SAT-based encoding suitable for the unsatisfiable core extraction of LTL-based property specification patterns (Dwyer et al., 1999) extended with inequality statements on Boolean and numeric variables. Algorithm NA3, presented here, is built upon the work of Schuppan (2016) to compute UCs using temporal resolution.
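As an illustration of the deletion-based minimisation step mentioned above, the following is a minimal sketch and not the implementation of any of the cited tools. The function names are hypothetical, and the satisfiability check is abstracted as an oracle `is_unsat` passed by the caller. The toy oracle used here only detects a literal together with its negation; it is a stand-in for a real LTL or LTLf solver and is an assumption of this example.

```python
from typing import Callable, FrozenSet, List, Sequence


def deletion_based_minimise(
    conjuncts: Sequence[str],
    is_unsat: Callable[[FrozenSet[str]], bool],
) -> List[str]:
    """Shrink an unsatisfiable set of conjuncts by trying to drop one at a time.

    `is_unsat` is a hypothetical oracle: it must return True iff the
    conjunction of the given formulas is unsatisfiable.
    """
    core = list(conjuncts)
    if not is_unsat(frozenset(core)):
        raise ValueError("the input set of conjuncts is satisfiable")
    i = 0
    while i < len(core):
        candidate = core[:i] + core[i + 1:]
        if is_unsat(frozenset(candidate)):
            # Conjunct i is not needed for unsatisfiability: drop it for good.
            core = candidate
        else:
            # Conjunct i is necessary: keep it and move to the next one.
            i += 1
    return core


def toy_is_unsat(formulas: FrozenSet[str]) -> bool:
    """Toy oracle (assumption): unsatisfiable iff some f and "!f" both occur."""
    return any(("!" + f) in formulas for f in formulas)


if __name__ == "__main__":
    print(deletion_based_minimise(["a", "p", "b", "!p", "c"], toy_is_unsat))
```

With a monotone oracle (every superset of an unsatisfiable set is unsatisfiable), each conjunct that survives the loop is necessary, so the result is minimal with respect to set inclusion. The procedure calls the oracle once per conjunct plus one initial check.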
In the context of process mining, Corea and Delfmann (2019) and Di Ciccio et al. (2017) identify inconsistencies for specific LTLf-based constraints contained in the Declare modelling language (van der Aalst et al., 2009). These approaches rely on automata and language inclusion techniques to identify the inconsistencies, and are specific to the precise structure of Declare. Thus, they cannot be generalised to address generic LTLf-based specifications.
Finally, we remark that works on propositional UC extraction (e.g., Goldberg and Novikov 2003; Huang 2005; Marques-Silva and Janota 2014) could be used to improve the quality of the computed cores. We leave this investigation for future developments.

Conclusions and Future Work
In this paper, we have addressed the problem of LTLf unsatisfiable core extraction, presenting four algorithms based on different state-of-the-art techniques for LTL and LTLf satisfiability checking. We have implemented each of them based on existing tools, and we have carried out an experimental evaluation on a set of reference benchmarks for unsatisfiable temporal formulas. The results show a consistent output when terminating without reaching a resource limit and the feasibility of the proposed algorithms. The extensive evaluation evidences an overall better performance of aaltaf (and thus of algorithm NA4), both in terms of time efficiency and cardinality of the extracted UCs. Nonetheless, none of the algorithms outperforms all the others on every benchmark and, in a number of cases, trp++, NuSMV-B and NuSMV-S extract UCs of a lower cardinality, excelling in particular in those cases in which aaltaf ends in a timeout. Furthermore, the evaluation shows that in 28 cases, no tool was able to return an unsatisfiable core, and that the problems of the LTL-as-LTLf/schuppan/O2Formula benchmark family are the most challenging ones for all the implemented techniques. Indeed, 14 out of the 28 problems that were not solved by any tools belong to this benchmark family.
These results show the adequacy of exploring different strategies and algorithmic solutions for this problem and, at the same time, provide a first extensive baseline for future algorithms for the extraction of LTLf unsatisfiable cores.
For future work, we envisage the following research endeavours. Firstly, addressing the problem of extracting minimal UCs is in our plans. It is also our objective to extend the approach to other LTL/LTLf algorithms based on k-liveness (Claessen & Sörensson, 2012), liveness to safety (Biere et al., 2002), or tableau constructions (Geatti et al., 2021). Furthermore, we intend to extend the problems set with benchmarks from other domains (including AI planning, requirements engineering, and process management), also including

Remark 4.1.
Definition 4 can be seamlessly applied to LTL by considering Def. 3 in place of Def. 2 for satisfiability.

Figure 2: Sankey chart displaying the tool rankings based on the computation time and the cardinality of the returned UCs.
Figure 4 illustrates pairwise comparisons of time efficiency of the considered tools. Here, we also include cases in which a certain answer is not returned. Our objective is to provide a visual clue of tests in which only one (or none) of the two tools in the plot was able to extract the UC. Figure 4(a) compares aaltaf with NuSMV-B, Fig. 4(b) compares aaltaf with NuSMV-S, Fig. 4(c) compares aaltaf with trp++, Fig. 4(d) compares NuSMV-B with NuSMV-S, Fig. 4(e) compares NuSMV-B with trp++, and finally Fig. 4(f) compares NuSMV-S with trp++. Figure 4(c), in particular, shows that aaltaf outperforms trp++ in terms of computation speed: most of the points, indeed, are located above the diagonal, thus indicating that aaltaf demands less time than trp++ to return the unsatisfiable cores. The plot also shows that trp++ exceeds the timeout in several cases (points on the red line marked with "3600 sec. timeout"). Furthermore, we remark that trp++ operates a pre-processing phase on the input specification prior to the actual identification of UCs. If it manages to reduce the given set of conjuncts to false at that stage, it stops the computation before returning any UCs and raises an alert. The points lying on the line marked with "Input formula simplified to False" indicate those cases (see Figs. 4(c), (e) and (f)). This simplification occurred 26 times in total. NuSMV-S was able to conclude that the formula was unsatisfiable and return a UC in 44 cases, whereas it yielded an unknown answer (i.e., it reached k = 50 without being able to decide on unsatisfiability) in 1278 cases (see the line labelled with "Unknown" in Figs. 4(b), (d), and (f)). Finally, we report that aaltaf, trp++, NuSMV-B, and NuSMV-S reached the timeout in 84, 562, 990, and 55 cases, respectively.

Figure 5 focuses again on computation time, contrasting the overall best performance with the benchmark family the input data stem from. In particular, Fig. 5(a) shows a pie chart with an overview of the number of tests in which a tool was the fastest, and Fig. 5(b) depicts a stacked bar chart in which the results are grouped by benchmark family. Unlike in the Sankey diagram in Fig. 2, we determine only one best performer among all tools, without ex-aequo leading positions. When solvers take the same time, we assign the best result to the tool that returned the UC of smaller cardinality.
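The selection rule just described (lowest computation time, with ties broken in favour of the smaller UC) can be sketched as follows. This is only an illustrative reading of the rule, not the script used for the evaluation. The function name, the data layout, and the numbers in the usage example are made up for the example. The behaviour when both time and cardinality coincide is not stated in the text, so the sketch simply keeps the first tool in iteration order, which is an assumption.

```python
from typing import Dict, Optional, Tuple

# Per-tool outcome on one test: (time in seconds, UC cardinality),
# or None when the tool returned no UC (timeout, unknown answer, ...).
Outcome = Optional[Tuple[float, int]]


def fastest_tool(results: Dict[str, Outcome]) -> Optional[str]:
    """Return the single best performer on one test, or None if no tool answered."""
    answered = {tool: r for tool, r in results.items() if r is not None}
    if not answered:
        return None
    # Tuples compare lexicographically: time first, then UC cardinality.
    # On a full tie, min() keeps the first tool encountered (assumption).
    return min(answered, key=lambda tool: answered[tool])


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the experiments.
    sample = {
        "aaltaf": (1.5, 7),
        "trp++": (1.5, 4),
        "NuSMV-B": None,
        "NuSMV-S": (9.0, 2),
    }
    print(fastest_tool(sample))
```

Tests in which the function returns `None` would correspond to the cases in which no tool returned a UC.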

Figure 5: Number of tests in which the solver took the lowest computation time (a) for the entire set of benchmarks, and (b) per benchmark family.

Figure 6: Number of conjuncts in Γ and respective computation time (in seconds) for each algorithm.

Figure 8: Number of tests in which the solver returned an unsatisfiable core of the smallest size (a) for the entire set of benchmarks, and (b) per benchmark family.

Figure 9 plots the pairwise comparison between different tools on the subset of the cases where both approaches were able to compute the UC. For instance, Fig. 9(a) compares the cardinality of the UCs returned by aaltaf with the cardinality of the UCs returned by NuSMV-B. The plot shows that the UCs returned by NuSMV-B have a lower cardinality than the ones returned by aaltaf: most of the points are indeed located below the diagonal line. The opacity of the points represents the number of cases for which the two algorithms returned UCs with the specific cardinality corresponding to the point's coordinates in the plot.

Table 1: The tools implementing the LTLf unsatisfiable core extraction algorithms.

… it declares that Γ is satisfiable. In turn, Algorithm NA4 returns the empty set ∅ to indicate this. On the other hand, if Γ is unsatisfiable, then also the formula ψ = ϕ′ ∧ ⋀_{φ∈Υ} φ is unsatisfiable. As per Theorem 3, the SATLTLF algorithm will build a conflict sequence C, and i

Table 2: The benchmarks used in our experiments. For the sake of readability, we compactly denote with $A the LTL-as-LTLf root, and with $B the LTLf-specific root.

Table 3: Best results as per the cardinality of UCs and wall-clock timings obtained via aaltaf and trp++. The top achievements with NuSMV-S and NuSMV-B are in Table 4, alongside the benchmarks for which no tools returned a UC.