Practical and Parallelizable Algorithms for Non-Monotone Submodular Maximization with Size Constraint

We present combinatorial and parallelizable algorithms for maximization of a submodular function, not necessarily monotone, with respect to a size constraint. We improve the best approximation factor achieved by an algorithm that has optimal adaptivity and nearly optimal query complexity to $0.193 - \varepsilon$. The conference version of this work mistakenly employed a subroutine that does not work for non-monotone, submodular functions. In this version, we propose a fixed and improved subroutine to add a set with high average marginal gain, ThreshSeq, which returns a solution in $O( \log(n) )$ adaptive rounds with high probability. Moreover, we provide two approximation algorithms. The first has approximation ratio $1/6 - \varepsilon$, adaptivity $O( \log (n) )$, and query complexity $O( n \log (k) )$, while the second has approximation ratio $0.193 - \varepsilon$, adaptivity $O( \log^2 (n) )$, and query complexity $O(n \log (k))$. Our algorithms are empirically validated to use a low number of adaptive rounds and total queries while obtaining solutions with high objective value in comparison with state-of-the-art approximation algorithms, including continuous algorithms that use the multilinear extension.

As the amount of data in applications has exhibited exponential growth in recent years (e.g. the growth of social networks (Mislove, Koppula, Gummadi, Druschel, & Bhattacharjee, 2008) or genomic data (Libbrecht et al., 2018)), it is necessary to design algorithms for SMCC that can scale to these large datasets. One aspect of algorithmic efficiency is the query complexity, the total number of queries to the oracle for f. Since evaluation of f is often expensive, the queries to f often dominate the runtime of an algorithm. In addition to low query complexity, it is necessary to design algorithms that parallelize well to take advantage of modern computer architectures. To quantify the degree of parallelizability of an algorithm, the adaptivity or adaptive complexity of an algorithm is the minimum number of sequential rounds such that in each round the algorithm makes O(poly(n)) independent queries to the evaluation oracle. The lower the adaptive complexity of an algorithm, the more suited the algorithm is to parallelization, as within each adaptive round the queries to f are independent and may be easily parallelized.
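To make the notion of an adaptive round concrete, the sketch below (our illustration, not code from any of the cited works; the coverage function is a toy stand-in for the oracle f) issues a batch of mutually independent marginal-gain queries in parallel, which constitutes a single adaptive round:

```python
# One adaptive round: a batch of oracle queries that do not depend on each
# other's answers, so the whole batch can be evaluated in parallel.
from concurrent.futures import ThreadPoolExecutor

COVERS = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 4}}

def f(S):
    # toy submodular coverage oracle: size of the union of covered items
    covered = set()
    for i in S:
        covered |= COVERS[i]
    return len(covered)

def one_adaptive_round(candidates, S):
    # Every marginal-gain query here depends only on S, never on another
    # query's answer, so the batch forms a single adaptive round.
    with ThreadPoolExecutor() as ex:
        gains = list(ex.map(lambda x: f(S | {x}) - f(S), candidates))
    return dict(zip(candidates, gains))

gains = one_adaptive_round([0, 1, 2, 3], set())
```

An algorithm with adaptivity r chains r such rounds, each one allowed to depend on the answers of the previous rounds.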
The design of algorithms with nontrivial adaptivity for SMCC when f is monotone was initiated by Balkanski and Singer (2018), who also prove a lower bound of Ω(log(n)/log log(n)) adaptive rounds to achieve a constant approximation ratio. Recently, much work has focused on the design of adaptive algorithms for SMCC with (not necessarily monotone) submodular functions, as summarized in Table 1. However, although many algorithms with low adaptivity have been proposed, most of these algorithms exhibit at least a quadratic dependence of the query complexity on the size n of the ground set, for k = Ω(n). For many applications, instances have grown too large for quadratic query complexity to be practical. Therefore, it is necessary to design adaptive algorithms that also have nearly linear query complexity. An algorithm in prior literature that meets this requirement is the algorithm developed by Fahrbach, Mirrokni, and Zadimoghaddam (2019), which has O(n log(k)) query complexity and O(log(n)) adaptivity. However, the approximation ratio stated in Fahrbach et al. (2019) for this algorithm does not hold, as discussed in Section 1.1 and Appendix B. During our revision of this paper, Fahrbach, Mirrokni, and Zadimoghaddam (2023) fixed the issue, so the approximation ratio now holds.
The above algorithms both employ a lowly-adaptive subroutine to add multiple elements that satisfy a given marginal gain, on average. The conference version (Kuhnle, 2021) of this paper used the Threshold-Sampling subroutine of Fahrbach et al. (2019) for this purpose. However, its theoretical guarantee (Lemma 2.3 of Fahrbach et al. (2019)) does not hold for non-monotone functions, due to a bug that has since been fixed in Fahrbach et al. (2023). In Appendix B, we give a counterexample to the performance guarantee of Threshold-Sampling. In this version, we introduce a new threshold subroutine ThreshSeq, which not only fixes the problem that Threshold-Sampling faced, but achieves its guarantees with high probability as opposed to in expectation; the high probability guarantees simplify the analysis of our approximation algorithms that rely upon the ThreshSeq subroutine.

Table 1: Adaptive algorithms for SMCC where the objective f is not necessarily monotone. We consider three metrics here. "Approximation Ratio" reflects the accuracy of the algorithm, where a higher value signifies greater accuracy. "Adaptivity" measures the algorithm's parallelizability, with a lower value indicating higher parallelizability. "Queries" represents the total number of queries to the oracle for f, which dominates the algorithm's runtime; a lower query count implies faster algorithmic performance.
Our algorithm AST uses a double-threshold procedure to obtain its ratio of 1/6 − ε. Our second algorithm ATG is a low-adaptivity modification of the algorithm of Gupta, Roth, Schoenebeck, and Talwar (2010), for which we improve the ratio from 1/6 to 0.193 through a novel analysis. Both of our algorithms use the low-adaptivity threshold sampling procedure ThreshSeq and a subroutine for unconstrained maximization of a submodular function (Feige, Mirrokni, & Vondrák, 2011; Chen, Feldman, & Karbasi, 2019) as components. More details are given in the related work discussion below and in Section 4.
The new ThreshSeq does not rely on sampling to achieve concentration bounds, which significantly improves the practical efficiency of our algorithms over the conference version (Kuhnle, 2021). Empirically, we demonstrate that both of our algorithms achieve superior objective value to current state-of-the-art algorithms while using a small number of queries and adaptive rounds on two applications of SMCC.

Related Work
Threshold Procedures. A recurring subproblem of SMCC (and other submodular optimization problems) is to add to a candidate solution S those elements x of the ground set N that give a marginal gain of at least τ, for some constant threshold τ. To solve this subproblem, the algorithm Threshold-Sampling was proposed in Fahrbach et al. (2019) for monotone submodular functions, and it was applied as a subroutine for non-monotone SMCC in Fahrbach et al. (2019) and the conference version of this work (Kuhnle, 2021). However, its theoretical guarantee (Lemma 2.3 of Fahrbach et al. (2019)) does not hold for non-monotone functions. In this work, we propose the ThreshSeq algorithm (Section 2), which fixes the problems of Threshold-Sampling and runs in O(n) queries and O(log n) adaptive rounds. We solve these problems by introducing two sets found by the algorithm: an auxiliary set A, separate from the solution set A′ returned by ThreshSeq, so that the two properties of Threshold (Def. 2) are handled separately. The algorithm maintains that A′ ⊆ A; the larger set is used for filtering from the ground set, while the smaller set maintains the desired bound on the average marginal gain.
Algorithms with Low Adaptive Complexity. Since the study of parallelizable algorithms for submodular optimization was initiated by Balkanski and Singer (2018), a number of O(log n)-adaptive algorithms have been designed for SMCC. When f is monotone, adaptive algorithms that obtain the optimal ratio (Nemhauser & Wolsey, 1978) of 1 − 1/e − ε have been designed by Balkanski, Rubinstein, and Singer (2019a), Fahrbach et al. (2019), Ene and Nguyen (2019), and Chen, Dey, and Kuhnle (2021). Of these, the algorithm of Chen et al. (2021) also has state-of-the-art sublinear adaptivity and linear query complexity.
However, when the function f is not monotone, the best approximation ratio achievable with polynomial query complexity for SMCC is unknown, but falls within the range [0.385, 0.491] (Buchbinder & Feldman, 2019; Gharan & Vondrák, 2011). For SMCC, algorithms with nearly optimal adaptivity have been designed by Balkanski et al. (2018), Chekuri and Quanrud (2019), Ene, Nguyen, and Vladu (2019), Fahrbach et al. (2019), and Amanatidis et al. (2021); for the query complexity and approximation factors of these algorithms, see Table 1. Of these, the best approximation ratio of (1/e − ε) ≈ 0.368 is obtained by the algorithm of Ene and Nguyen (2020). However, this algorithm requires access to an oracle for the gradient of the continuous extension of a submodular set function, which requires Ω(nk² log²(n)) queries to sufficiently approximate. The practical performance of the algorithm of Ene and Nguyen (2020) is investigated in our empirical evaluation of Section 5. Other than the algorithms of Fahrbach et al. (2019) and Amanatidis et al. (2021), all parallelizable algorithms exhibit a runtime with at least quadratic dependence on n. In contrast, our algorithms have query complexity O(n log k) and adaptivity O(log n) or O(log² n).
The IteratedGreedy Algorithm. Although the standard greedy algorithm performs arbitrarily badly for SMCC, Gupta et al. (2010) showed that multiple repetitions of the greedy algorithm, combined with an approximation for the unconstrained maximization problem, yields an approximation for SMCC. Specifically, Gupta et al. (2010) provided the IteratedGreedy algorithm, which achieves an approximation ratio of 1/6 for SMCC when the 1/2-approximation of Buchbinder, Feldman, Naor, and Schwartz (2012) is used for the unconstrained maximization subproblems. Our algorithm AdaptiveThresholdGreedy uses ThreshSeq combined with the descending thresholds technique of Badanidiyuru and Vondrák (2014) to obtain an adaptive version of IteratedGreedy, as described in Section 4. Pseudocode for IteratedGreedy is given in Appendix E, where an improved ratio of ≈ 0.193 is proven for this algorithm; we also prove the ratio of nearly 0.193 for our adaptive algorithm ATG in Section 4.

Preliminaries
A submodular set function defined on all subsets of ground set N is denoted by f. The marginal gain of adding an element x to a set S is denoted by ∆(x | S) = f(S ∪ {x}) − f(S). Let OPT = max_{|S| ≤ k} f(S), the optimal value of the SMCC problem for ground set N and size constraint k. The restriction of f to all subsets of a set S ⊆ N is denoted by f ↾ S. Next, we describe two subproblems both of our algorithms need to solve: namely, unconstrained maximization subproblems and a threshold sampling subproblem. For both of these subproblems, procedures with low adaptivity are needed.
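As a concrete illustration of this notation, a toy coverage oracle (our own stand-in example, not from the paper) and the marginal gain ∆(x | S) can be written as:

```python
COVERS = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 4}}

def f(S):
    # toy submodular coverage oracle: size of the union of covered items
    covered = set()
    for i in S:
        covered |= COVERS[i]
    return len(covered)

def marginal_gain(x, S):
    # Delta(x | S) = f(S ∪ {x}) - f(S)
    return f(S | {x}) - f(S)
```

The diminishing-returns property of submodularity can be observed directly: the gain of element 3 shrinks from 2 to 1 to 0 as the base set grows from ∅ to {2} to {0, 2}.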
The Unconstrained Maximization Problem. The first subproblem is unconstrained maximization of a submodular function. When the function f is non-monotone, the problem of maximizing f without any constraints is NP-hard (Feige et al., 2011). Recently, Chen et al. (2019) developed an algorithm that achieves nearly the optimal ratio of 1/2 with constant adaptivity, as summarized in the following theorem.
To achieve the approximation factor listed for our algorithms in Table 1, the algorithm of Chen et al. (2019) is employed for unconstrained maximization subproblems.
The Threshold Problem. The second subproblem is the following:

Definition 2 (Threshold). Given a threshold τ ∈ ℝ and integer k, choose a set S with |S| ≤ k such that 1) f(S) ≥ (1 − ε)τ|S|; and 2) if |S| < k, then ∆(x | S) < τ for any x ∈ N.

Algorithms that can use a solution to this subproblem occur frequently, and so multiple algorithms for this subproblem have been formulated in the literature (Fahrbach et al., 2019; Balkanski, Rubinstein, & Singer, 2019b; Kazemi, Mitrovic, Zadimoghaddam, Lattanzi, & Karbasi, 2019; Amanatidis et al., 2021; Chen et al., 2021). We want a procedure that solves Threshold with low adaptivity, nearly linear query complexity, and guarantees that hold when f is not necessarily monotone.

Organization. In Section 2, we introduce our threshold sampling algorithm: ThreshSeq. Then, in Sections 3 and 4, we analyze our algorithms using the ThreshSeq and UnconstrainedMax procedures. Our empirical evaluation is reported in Section 5, with further discussion in Appendix G.1.

The ThreshSeq Algorithm
In this section, we introduce the linear-time and highly parallelizable threshold sampling algorithm ThreshSeq (Alg. 2). ThreshSeq takes as input an oracle f, constraint k, error rate ε, threshold τ, and a failure probability parameter δ which controls the success probability. The algorithm uses logarithmically many adaptive rounds and linearly many oracle queries with high probability. Rather than directly solving Threshold (Def. 2) with one solution set, it returns two related sets that handle the two properties separately.
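To make the two properties concrete, the following checker paraphrases them in code (this is our reading of Threshold (Def. 2) with a toy coverage oracle, not pseudocode from the paper):

```python
COVERS = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 4}}

def f(S):
    # toy submodular coverage oracle
    covered = set()
    for i in S:
        covered |= COVERS[i]
    return len(covered)

def satisfies_threshold(f, N, S, k, tau, eps):
    # Property (1): value at least (1 - eps) * tau per selected element.
    prop1 = f(S) >= (1 - eps) * tau * len(S)
    # Property (2): if S is not full, every remaining element has gain < tau.
    prop2 = len(S) >= k or all(f(S | {x}) - f(S) < tau for x in N - S)
    return prop1 and prop2
```

ThreshSeq returns a pair A′ ⊆ A where A′ is responsible for Property (1) and A for Property (2).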
Algorithm 1 A general framework of threshold sampling algorithms
1: procedure ThresholdFramework(f, N, k, τ)
2:   Input: evaluation oracle f, ground set N, constraint k, threshold τ
3:   V ← N, A ← ∅
4:   while |V| > 0 and |A| < k do
5:     V ← {x ∈ V : x makes sufficient gain and adding x to A is feasible}
6:     T ← a subset of V ▷ make a decision on selecting a good subset from V
7:     A ← A ∪ T
8:   return A

Algorithm Overview
The state-of-the-art threshold sampling algorithms, whether for monotone or non-monotone functions, share a common structure (Alg. 1) that works as follows: 1) The algorithm initializes a candidate set V with the whole ground set N and an empty solution set A (Line 3); 2) During each iteration, it filters out elements in the candidate set V that either make negligible contributions to A or violate the given constraint, and then selects a prefix of V to add to A (Lines 5-7); 3) The algorithm repeats the last step until the candidate set V is empty. The difference between those algorithms lies in how they select the prefix in Step (2) on Line 6. Threshold-Sampling in Fahrbach et al. (2019) applies a random sampling procedure for each prefix considered at that iteration. The threshold sampling algorithms in Balkanski et al. (2019b) and Amanatidis et al. (2021) explicitly check all the candidate elements for a given prefix. Later, Kazemi et al. (2019) and Chen et al. (2021) proposed threshold sampling algorithms that perform a uniformly random permutation of the elements and make the decision after querying once per prefix. This makes them much more practical and demonstrates that multiple query calls for a given prefix are redundant. Consequently, we are able to keep a solution with the same threshold and fewer query calls.
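The framework of Alg. 1 can be sketched as follows (an illustrative skeleton under our own naming; `select_prefix` abstracts the Step (2) decision that distinguishes the algorithms, and the coverage function is a toy stand-in oracle):

```python
import random

COVERS = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 4}}

def f(S):
    # toy submodular coverage oracle
    covered = set()
    for i in S:
        covered |= COVERS[i]
    return len(covered)

def threshold_framework(f, N, k, tau, select_prefix):
    V, A = set(N), set()
    while V and len(A) < k:
        # filter out elements whose gain w.r.t. A has dropped below tau
        V = {x for x in V if f(A | {x}) - f(A) >= tau}
        if not V:
            break
        order = random.sample(sorted(V), len(V))   # random permutation of V
        T = set(select_prefix(order, k - len(A)))  # Step (2): pick a prefix
        A |= T
        V -= T
    return A

random.seed(0)
# naive prefix rule: take as many elements as the budget allows
A = threshold_framework(f, set(COVERS), 2, 1, lambda order, budget: order[:budget])
```

The real algorithms differ precisely in how `select_prefix` is implemented and validated.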
To efficiently obtain large sequences of elements with gains above τ, we propose an approach inspired by the monotone threshold sampling algorithms of Kazemi et al. (2019) and Chen et al. (2021). As discussed above, these algorithms work by adaptively adding sequences of elements to a set A, where each sequence has been checked in parallel to have at most an ε fraction of its elements failing the marginal gain condition. A uniformly random permutation of the elements is considered, where an average marginal gain below τ is detected by a high proportion of failures in the sequence. This step leads to a constant fraction of elements being filtered out at the next iteration with high probability. When combined with an exponentially decreasing candidate set and a constant number of adaptive rounds per iteration, these algorithms achieve logarithmic adaptivity and linear query complexity.
The intuitive reason why this does not directly work for non-monotone functions (i.e., why A is not a solution to Threshold (Def. 2)) is the following: if one of the added elements fails the marginal gain condition, it may do so arbitrarily badly and have a large negative marginal gain. Moreover, one cannot simply exclude such elements from consideration, because they are needed to ensure that the filtering step at the next iteration discards a large enough fraction of elements. Deleting such elements would require recalculating the marginal gains with respect to the updated sets, which increases the number of adaptive rounds required in each iteration by a factor of O(k). Our solution is to keep these elements in the set A, which is used for filtering and is responsible for Property (2) of Threshold (Def. 2), but to include only the elements with nonnegative marginal gain in the candidate solution set A′, which is responsible for Property (1) of Threshold (Def. 2). The membership of A′ is known since the gain of every element was computed in parallel; this also gives the needed relationship on the average marginal gain of each element of A′. Due to submodularity, the objective value does not decrease when we exclude elements with negative marginal gains.
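The bifurcation into A and A′ can be sketched as follows (a simplified sequential illustration with a toy non-monotone oracle of our own; in ThreshSeq the gains of the prefix are computed in parallel):

```python
def add_prefix_bifurcated(f, A, A_prime, prefix):
    for x in prefix:
        gain = f(A | {x}) - f(A)
        A = A | {x}                  # A keeps every element: used for filtering
        if gain >= 0:
            A_prime = A_prime | {x}  # A' keeps only nonnegative-gain elements
    return A, A_prime

# toy non-monotone oracle: coverage minus a penalty for the "bad" element 9
COVERS = {0: {1, 2}, 1: {2, 3}, 9: set()}

def f(S):
    covered = set()
    for i in S:
        covered |= COVERS[i]
    return len(covered) - (2 if 9 in S else 0)

A, A_prime = add_prefix_bifurcated(f, set(), set(), [0, 9, 1])
```

Element 9 has negative gain, so it stays in the filtering set A but is excluded from the candidate solution A′, whose value is thereby no worse than that of A.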
Discussion of δ. Different from other threshold sampling algorithms, ThreshSeq incorporates an additional input parameter, δ. This parameter controls the number of iterations of the outer for loop, and thereby the number of adaptive rounds used by the algorithm. As the algorithm progresses and more elements are added to the solution set, the size of A increases while the size of V decreases. The algorithm stops successfully once |A| = k or |V| = 0; the more iterations allowed, the more likely it is to succeed. As stated in Property (1) of Theorem 3, ThreshSeq succeeds with probability greater than 1 − δ/n. For downstream approximation algorithms that use ThreshSeq as a subroutine, the more calls made to ThreshSeq, the lower the overall success probability becomes; choosing δ appropriately keeps this probability manageable.

Algorithm 2 (ThreshSeq(f, N, k, δ, ε, τ)): a parallelizable threshold algorithm for threshold τ. Input: evaluation oracle f : 2^N → ℝ⁺, constraint k, failure probability parameter δ, error ε, threshold τ. Line 11 computes the gains of the s candidate elements in parallel (parallel gain computation), and the algorithm returns failure if it does not terminate within its iteration budget.

Theoretical Guarantees
Theorem 3. Let (f, k) be an instance of SMCC. For any constant ε, the algorithm ThreshSeq outputs A′ ⊆ A ⊆ N such that the following properties hold: 1) The algorithm succeeds with probability at least 1 − δ/n. 2) There are O(n/ε) oracle queries in expectation and O(log(n/δ)/ε) adaptive rounds. 3) If |A| < k, then ∆(x | A) < τ for any x ∈ N. 4) f(A′) ≥ (1 − ε)τ|A′|.

The performance of ThreshSeq is derived mainly by answering two questions: 1) whether a constant fraction of elements can be filtered out at any iteration with high probability; 2) whether the two sets returned solve Threshold (Def. 2) indirectly. In Lemma 4 below, it is certified that the number of elements deleted at the next iteration monotonically increases from 0 to |V| as the size of the selected set increases. Then, by a probability lemma and concentration bounds (in Appendix A), Lemma 5 answers the first question.
Lemma 4 (informal). Given V after the random permutation on Line 8, the number of elements that would be filtered out at the next iteration monotonically increases from 0 to |V| as the size of the selected prefix increases. Furthermore, with enough iterations, the candidate set V becomes empty at some point with high probability. Also, since the size of the candidate set |V| decreases exponentially, the total number of queries is intuitively linear in expectation.
A downside of this bifurcated approach is that a downstream algorithm receives two sets A, A′ instead of one from ThreshSeq. The second property of Threshold (Def. 2) holds naturally for the set A. Lemma 6 below shows how set A′ relates to set A: by discarding the elements of A with negative gains, the gains of the remaining elements, which constitute A′, can only increase, so A′ satisfies the first property of Threshold (Def. 2).

Lemma 6. Say an element added to the solution set is good if its gain is greater than τ. Suppose that Algorithm 2 terminates successfully. The sets A and A′ returned by Algorithm 2 satisfy the following properties: 1) At least a (1 − ε)-fraction of the elements of A are good. 2) A good element of A is always a good element of A′. 3) Any element of A′ has nonnegative marginal gain when added.
The proofs of the lemmas above can be found in Appendix C. Now, we provide the proof concerning the performance of ThreshSeq.
Proof of Success Probability (Property 1). The algorithm succeeds if |V| = 0 or |A| = k at termination. If we can filter out a constant fraction of V or select a subset with k − |A| elements at any iteration with constant probability, then, with enough iterations, the algorithm successfully terminates with high probability.
From Lemma 4, there exists a point t = min{i : |S_i| ≥ ε|V|/2} such that the next iteration filters out more than an ε/2-fraction of elements if i* ≥ t. Intuitively, when i ≤ t, there is a constant probability that the fraction of trues in B[1 : i] exceeds 1 − ε. Based on Lemma 4, Lemma 5 gives the probability that either |A| = k or an ε/2-fraction of V is filtered out at the next iteration.
For the purposes of the analysis, consider a version of the algorithm that does not break on Line 7 when |V| = 0. In subsequent iterations following |V| = 0, it is always the case that s = 0 and T_{i*} = ∅. Lemma 5 still holds in this case. As a result, this version returns the same solution set as the original algorithm.
When the algorithm fails to terminate, at each iteration it always holds that i* < s, and there are no more than m = ⌈log_{1−ε/2}(1/n)⌉ iterations with i* ≥ t. Therefore, there are no more than m iterations with i* ≥ min{s, t}. Otherwise, with more than m iterations satisfying i* ≥ min{s, t}, either there is an iteration with s ≤ t, in which case the algorithm terminates with |A| = k, or there are more than m iterations with i* ≥ t, in which case the algorithm terminates with |V| = 0. Define a successful iteration as an iteration with i* ≥ min{s, t}; that is, the iteration either filters out an ε/2-fraction of V or the algorithm stops there. Let X be the number of successes in the ℓ iterations. Then X can be regarded as a sum of dependent Bernoulli trials, where the success probability is larger than 1/2 by Lemma 5. Let Y be a sum of independent Bernoulli trials, where the success probability is equal to 1/2. Then the probability of failure can be bounded as follows, where Inequality (a) follows from Lemma 13 and Inequality (b) follows from Lemma 12.
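The bound m = ⌈log_{1−ε/2}(1/n)⌉ can be sanity-checked numerically (illustrative values of our own, not from the paper's experiments):

```python
import math

def max_slow_iterations(n, eps):
    # m = ceil(log_{1 - eps/2}(1/n)): after m iterations that each filter out
    # an eps/2-fraction of V, at most a 1/n fraction of the ground set
    # survives, i.e. V must be empty.
    return math.ceil(math.log(1 / n) / math.log(1 - eps / 2))

m = max_slow_iterations(1000, 0.1)
```

For n = 1000 and ε = 0.1, this gives m = 135, and indeed (1 − ε/2)^m ≤ 1/n.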
Proof of Adaptivity and Query Complexity (Property 2). In Alg. 2, the oracle queries occur on Lines 5 and 13. Since the filtering and the inner for loop can each be done in parallel, there are a constant number of adaptive rounds per iteration. Therefore, the adaptivity is O(ℓ) = O(log(n/δ)/ε).
As for the query complexity, let V_j be the set V after filtering on Line 5 in iteration j. Let j_i be the i-th successful iteration, and Y_i = j_i − j_{i−1}. By Lemma 13 in Appendix A, it holds that E[Y_i] ≤ 2. For any iteration j with j_{i−1} + 1 ≤ j ≤ j_i, there are i − 1 successes before it; thus, |V_{j−1}| ≤ (1 − ε/2)^{i−1} n. At any iteration j, there are |V_{j−1}| + 1 oracle queries on Line 5, and the inner for loop makes no more than |V_j| + 1 oracle queries. Summing over all iterations, the expected number of total queries is O(n/ε), where ε ∈ (0, 1).
Proof of Marginal Gains (Properties 3 and 4). The algorithm terminates successfully if either |V| = 0 or |A| = k during its execution. As proved above, this happens with probability at least 1 − δ/n. In the proof below, we condition on the event that the algorithm terminates successfully and returns A, A′.
If the algorithm returns A such that |A| < k, then it must be the case that the algorithm terminates with |V| = 0. So, for any x ∈ N, there exists an iteration j(x) + 1 at which x is filtered out. Let A_{j(x)} be A after iteration j(x). Then, due to submodularity, ∆(x | A) ≤ ∆(x | A_{j(x)}) < τ. Lemma 6 applies to any case in which the algorithm terminates successfully. As a reminder, an added element is considered good if its gain is greater than τ with respect to the solution prior to its inclusion. As per Line 17, A′ includes all such good elements that are in A.
Based on Property 1 of Lemma 6, it is guaranteed that at least (1 − ε)|A| of the elements of A are good. Furthermore, due to the diminishing-returns property of submodular functions, removing elements from a sequence does not decrease the marginal gains of the remaining elements. For any x ∈ A, let A(x) be the subsequence of A added before x, and define A′(x) analogously; then A′(x) ⊆ A(x), so ∆(x | A′(x)) ≥ ∆(x | A(x)). By Properties 2 and 3 of Lemma 6, every good element of A is a good element of A′, and every element of A′ has nonnegative gain when added; summing the gains yields f(A′) ≥ (1 − ε)τ|A′|.

The AdaptiveSimpleThreshold Algorithm
In this section, we present the simple algorithm AdaptiveSimpleThreshold (AST, Alg. 3) and show that it obtains an approximation ratio of 1/6 − ε with nearly optimal query and adaptive complexity. This algorithm relies on running ThreshSeq with a suitably chosen threshold value. A procedure for unconstrained maximization is also required.
Overview of Algorithm. Algorithm AST works as follows. First, the for loop guesses a value of τ close to OPT/((4 + α)k), where 1/α is the ratio of the algorithm used for the unconstrained maximization subproblem. Next, ThreshSeq is called with parameter τ to yield sets A and A′, followed by a second call to ThreshSeq with f restricted to N \ A to yield sets B and B′. Next, an unconstrained maximization is performed with f restricted to A to yield set A″. Finally, the best of the three candidate sets A′, B′, A″ is returned.
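The structure of AST can be sketched as follows (an illustrative skeleton with stand-in subroutines of our own: a sequential greedy in place of ThreshSeq and a best-singleton search in place of UnconstrainedMax, so the guarantees of the real algorithm do not apply to this sketch):

```python
COVERS = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 4}}

def f(S):
    # toy submodular coverage oracle
    covered = set()
    for i in S:
        covered |= COVERS[i]
    return len(covered)

def greedy_thresh_seq(f, N, k, tau):
    # stand-in for ThreshSeq: sequentially take elements with gain >= tau
    A = set()
    for x in sorted(N):
        if len(A) < k and f(A | {x}) - f(A) >= tau:
            A.add(x)
    return A, set(A)  # (A, A'): identical here since this f is monotone

def best_singleton(f, ground):
    # stand-in for UnconstrainedMax
    best = set()
    for x in ground:
        if f({x}) > f(best):
            best = {x}
    return best

def ast(f, N, k, tau_guesses, thresh_seq, unconstrained_max):
    best = set()
    for tau in tau_guesses:
        A, A_prime = thresh_seq(f, N, k, tau)
        B, B_prime = thresh_seq(f, N - A, k, tau)  # f restricted to N \ A
        A_pp = unconstrained_max(f, A)             # unconstrained max within A
        for C in (A_prime, B_prime, A_pp):
            if f(C) > f(best):
                best = C
    return best

C = ast(f, set(COVERS), 2, [2, 1], greedy_thresh_seq, best_singleton)
```

On this toy instance the returned set achieves the optimal value for k = 2.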
We prove the following theorem concerning the performance of AST.
Overview of Proof. The proof uses the following strategy: either ThreshSeq finds a set A′ or B′ with value approximately τk, which is sufficient to achieve the ratio, or we have two disjoint sets A, B of size less than k, such that for any x ∉ A ∪ B, ∆(x | A) < τ and ∆(x | B) < τ. In this case, for any set O, we have by submodularity and nonnegativity that f(O) ≤ f(O ∩ A) + f(O \ A). The first term is bounded by the unconstrained maximization, and the second term is bounded by an application of submodularity and the fact that the maximum marginal gain of adding an element into A or B is below τ. The choice of constant c balances the trade-off between the two cases of the proof.
Proof of Theorem 7. Let (f, k) be an instance of SMCC, and let ε > 0. Suppose algorithm AST uses a procedure for UnconstrainedMax with expected ratio 1/α. We will show that, conditioned on events that happen with probability at least 1 − 1/n, the set C returned by the algorithm satisfies f(C) ≥ (1/(4 + α) − ε) · OPT, where OPT is the optimal solution value on the instance (f, k).

Observe that the initial threshold is at least OPT/((4 + α)k), since M ≥ OPT/k. To better explain this, we attach Fig. 1 above. Because τ_i decreases by a factor of 1 − ε, there exists i_0 such that (1 − ε) · OPT/((4 + α)k) ≤ τ_{i_0} ≤ OPT/((4 + α)k). For the rest of the proof, we assume that the properties of Theorem 3 hold for the calls to ThreshSeq with threshold τ_{i_0}, which happens with probability at least 1 − 1/n by the union bound.
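The existence of a good guess τ_{i_0} on a descending geometric grid can be checked directly (a sketch under our own naming, independent of the specific constants in the proof):

```python
def exists_close_guess(start, eps, target):
    # Descending geometric grid tau_i = start * (1 - eps)^i. If start >= target
    # then some grid point lies in [(1 - eps) * target, target]: the first
    # grid point at or below target is at most a (1 - eps) factor below it.
    tau = start
    while tau >= (1 - eps) * target:
        if tau <= target:
            return True
        tau *= (1 - eps)
    return False
```

Whatever the unknown target value is, as long as it is below the starting point, one grid point lands within a (1 − ε) factor of it.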
Case |A| = k or |B| = k. We suppose that |A| = k; the proof for the case |B| = k is directly analogous. By Theorem 3 and the value of τ_{i_0}, it holds that f(A′) ≥ (1 − ε)τ_{i_0}k, and hence, by submodularity, the ratio holds in this case.

Case |A| < k and |B| < k. From (1), (2), submodularity, nonnegativity, Theorem 3, and the fact that A ∩ B = ∅, OPT is bounded by a combination of f(A′), f(B′), and f(O ∩ A). Since UnconstrainedMax is an α-approximation, we have f(O ∩ A) ≤ α · E[f(A″)]. From Inequalities (3), (4), and submodularity, the claimed bound on f(C) follows.

Adaptivity and Query Complexities. The adaptivity of AST is twice the adaptivity of ThreshSeq, plus the adaptivity of UnconstrainedMax, plus a constant. Further, the total query complexity is log_{1−ε}(1/(ck)) times the sum of twice the query complexity of ThreshSeq and the query complexity of UnconstrainedMax.

The AdaptiveThresholdGreedy Algorithm
In this section, we present the algorithm AdaptiveThresholdGreedy (ATG, Alg. 4), which achieves ratio ≈ 0.193 − ε in nearly optimal query and adaptive complexity. The price of improving the ratio of the preceding section is an extra log(k) factor in the adaptivity.
Overview of Algorithm. Our algorithm (pseudocode in Alg. 4) works as follows. Each for loop corresponds to a low-adaptivity greedy procedure using ThreshSeq with descending thresholds. Thus, the algorithm is structured as two iterated calls to a greedy algorithm, where the second greedy call is restricted to select elements outside the auxiliary set A returned by the first. Finally, an unconstrained maximization procedure is used within the first greedily-selected auxiliary set A, and the best of three candidate sets is returned. In the pseudocode for ATG, Alg. 4, ThreshSeq is called with functions of the form f_S, defined to be the submodular function f_S(X) = f(S ∪ X) − f(S). At a high level, our approach is the following: the IteratedGreedy framework of Gupta et al. (2010) runs two standard greedy algorithms followed by an unconstrained maximization, which yields an algorithm with O(nk) query complexity and O(k) adaptivity. We adopt this framework but replace the standard greedy algorithm with a novel greedy approach with low adaptivity and query complexity. To design this novel greedy approach, we modify the descending thresholds algorithm of Badanidiyuru and Vondrák (2014), which has query complexity O(n log k) but very high adaptivity of Ω(n log k). We use ThreshSeq to lower the adaptivity of the descending thresholds greedy algorithm (see Appendix D for pseudocode and a detailed discussion).
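The low-adaptivity descending-thresholds greedy can be sketched as follows (an illustrative skeleton of our own: `greedy_thresh_seq` is a sequential stand-in for ThreshSeq, and gains are taken relative to the current solution via a function of the form f_A):

```python
COVERS = {0: {1, 2}, 1: {2, 3}, 2: {4}, 3: {1, 4}}

def f(S):
    # toy submodular coverage oracle
    covered = set()
    for i in S:
        covered |= COVERS[i]
    return len(covered)

def greedy_thresh_seq(g, N, k, tau):
    # stand-in for ThreshSeq on oracle g: take elements with gain >= tau
    S = set()
    for x in sorted(N):
        if len(S) < k and g(S | {x}) - g(S) >= tau:
            S.add(x)
    return S, set(S)

def descending_thresholds_greedy(f, N, k, eps, thresh_seq):
    M = max(f({x}) for x in N)
    A, A_prime, tau = set(), set(), M
    while tau >= eps * M / k and len(A) < k:
        A0 = set(A)
        g = lambda S, A0=A0: f(S | A0) - f(A0)  # f_A: gains relative to A
        S, S_prime = thresh_seq(g, N - A, k - len(A), tau)
        A |= S
        A_prime |= S_prime
        tau *= (1 - eps)  # descend the threshold geometrically
    return A, A_prime

A, A_prime = descending_thresholds_greedy(f, set(COVERS), 2, 0.5, greedy_thresh_seq)
```

ATG runs two such greedy passes (the second over N \ A) plus one unconstrained maximization within A, and returns the best of the three candidates.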
For the resulting algorithm ATG, we prove a ratio of 0.193 − ε (Theorem 8), which improves the 1/6 ratio proven for IteratedGreedy in Gupta et al. (2010). Also, by adopting the ThreshSeq proposed in this paper, the analysis of the approximation ratio is simplified. Because the contribution of each element added to the solution set A′ is determined, at least (1 − ε)|A| elements in the solution set A′ have marginal gains exceeding the threshold τ, while the rest have nonnegative marginal gains. Therefore, it is no longer necessary to analyze the marginal gain in expectation; an exact lower bound is given by the analysis of the two greedy procedures.
A simpler form of our arguments shows that the improved ratio also holds for the original IteratedGreedy of Gupta et al. (2010); this analysis is given in Appendix E. We prove the following theorem concerning the performance of ATG.
Theorem 8. Suppose there exists a (1/α)-approximation for UnconstrainedMax with adaptivity Φ and query complexity Ξ, and let ε > 0. Then the algorithm AdaptiveThresholdGreedy for SMCC has expected approximation ratio ≈ 0.193 − ε, with the adaptivity and query complexity listed in Table 1.

Proof of Theorem 8. In this proof, we assume that the guarantees of Theorem 3 hold for each call to ThreshSeq made by ATG; this occurs with probability at least 1 − 1/n by the union bound and the choice of δ.
Overview of Proof. For the proof, a substantial amount of machinery is necessary to lower bound the marginal gain. We first introduce the necessary notation; then, in Lemmas 9 and 10, we formulate the necessary lower bounds on the marginal gains for the first and second greedy procedures. For each respective greedy procedure, this is accomplished by considering the good elements in the selected set returned by ThreshSeq, or the dummy element if the size of the selected set is limited. This allows us to formulate a recurrence on the sum of the marginal gains (Lemma 11). Finally, the recurrence allows us to proceed similarly to our proof in Appendix E, after a careful analysis of the error introduced (Lemma 18 in Appendix F).
Notations. Following the notation in the pseudocode of Alg. 4, A and A′ are returned by the first greedy procedure, while B and B′ are returned by the second one. Let A_i be the set A after iteration i, a′_j be the j-th element of A′, and i(j) be the iteration that returns a′_j. If j > |A′|, let a′_j be a dummy element, and i(j) = ℓ + 1. Furthermore, define A′_j = {a′_1, . . . , a′_j}. We define B_{i(j)} and B′_j analogously.
The proof of the above lemma (Lemma 9) can be found in Appendix F. Following the notation and the proof of Lemma 9, we obtain an analogous result for the gain of B′, as follows.
Lemma 10. For 1 ≤ j ≤ k, an analogous lower bound holds for the gain of B′_j.

The next lemma, proved in Appendix F, establishes the main recurrence.

Lemma 11. Let j(u) denote the u-th index j which satisfies Lemma 9 or Lemma 10.
Lemma 11 yields a recurrence of the form (b − u_{i+1}) ≤ a(b − u_i) with u_0 = 0, which has the solution u_i ≥ b(1 − a^i). Consequently, from the choice of C on Line 16, we have 2f(C) ≥ f(A′) + f(B′), and so (5) gives a lower bound on f(C). Since a (1/α)-approximation is used for UnconstrainedMax, we have f(O ∩ A) ≤ α · E[f(A″)]. For any set O, f(O) ≤ f(O ∩ A) + f(O \ A) by submodularity and nonnegativity. Therefore, the result follows from Inequalities (6) and (7).
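For completeness, the stated solution of the recurrence follows by unrolling it:

```latex
b - u_{i+1} \le a\,(b - u_i), \qquad u_0 = 0
\;\Longrightarrow\; b - u_i \le a^i\,(b - u_0) = a^i b
\;\Longrightarrow\; u_i \ge b\,(1 - a^i).
```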
Therefore, we have from Lemma 18 in Appendix F,
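As a quick numeric sanity check on the closed form used above (this check is illustrative and not part of the paper's analysis), iterating the recurrence at equality reproduces u_i = b(1 − a^i):

```python
# Iterate u_{i+1} = b - a * (b - u_i) with u_0 = 0 (the worst case of the
# recurrence (b - u_{i+1}) <= a (b - u_i)) and compare against the closed
# form u_i = b * (1 - a**i).
def iterate_recurrence(a: float, b: float, steps: int) -> float:
    u = 0.0
    for _ in range(steps):
        u = b - a * (b - u)
    return u

a, b, steps = 0.7, 5.0, 12
closed_form = b * (1 - a ** steps)
assert abs(iterate_recurrence(a, b, steps) - closed_form) < 1e-9
```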

Empirical Evaluation
In this section, we evaluate our algorithms in comparison with the state-of-the-art parallelizable algorithms. Our results are summarized as follows.
• Our algorithm ATG obtains the best objective value of any of the parallelizable algorithms, with an improvement of up to 19% over the next best algorithm, our AST. Both Fahrbach et al. (2019) and Ene and Nguyen (2020) exhibit a large loss of objective value at both small and large k values.
• Our algorithm AST, ParCardinal(v1), and AdaptiveNonmonotoneMax all use a very small number of adaptive rounds. Both ATG and the algorithm of Ene and Nguyen (2020) use roughly an order of magnitude more adaptive rounds.
• The algorithm of Ene and Nguyen (2020) is the most query efficient if access is provided to an exact oracle for the multilinear extension of a submodular function and its gradient. However, if these oracles must be approximated with the set function, their algorithm becomes very inefficient and does not scale beyond small instances (n ≤ 100).
• Our algorithms used fewer queries to the submodular set function than the linear-time algorithm FastRandomGreedy of Buchbinder et al. (2015). Both versions of ParCardinal are the most query inefficient.
• Comparing AST with four threshold sampling algorithms, the ThreshSeq proposed in this paper is the most query and round efficient, without loss of objective value.
If Threshold-Sampling is run as its theoretical guarantees require, with a large number of samples in ReducedMean, we empirically establish that the query complexity of algorithms using Threshold-Sampling can be three to four orders of magnitude worse than the other algorithms over the SMCC instances in our benchmark.

Algorithm Setup for AST and ATG
In the pseudocode for AST and ATG, M is used as an upper bound on OPT/k and is set to max_{x∈N} f(x). In the experiments, we used a sharper upper bound, the average of the top-k singleton values, which maintains the analysis of the approximation ratio. Additionally, the (1/2)-approximation UnconstrainedMax algorithm is substituted with a random set, which is a (1/4)-approximation by Feige et al. (2011). Consequently, the approximation ratios obtained by AST and ATG in the experiments are 1/8 − ε and 0.139 − ε, respectively.
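The sharper bound follows from submodularity: OPT is at most the sum of the k largest singleton values, so their average upper bounds OPT/k. A minimal sketch (the oracle `f`, taking a frozenset, is an assumption of this example, not the paper's interface):

```python
import heapq

def opt_over_k_upper_bound(f, ground_set, k):
    """Average of the top-k singleton values f({x}).

    By submodularity, OPT <= sum of the k largest singletons, so this
    average upper bounds OPT/k; it is never larger than the max-singleton
    bound max_x f({x}).  Sketch only; `f` is an assumed oracle.
    """
    top_k = heapq.nlargest(k, (f(frozenset({x})) for x in ground_set))
    return sum(top_k) / k
```

For a modular function with singleton values 3, 2, 1 and k = 2, this bound is 2.5, whereas the max-singleton bound would be 3.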

Comparison Algorithms and Other Settings
In addition to the algorithms discussed in the preceding paragraphs, we evaluate the following baselines: the IteratedGreedy algorithm of Gupta et al. (2010), and the linear-time (1/e − ε)-approximation algorithm FastRandomGreedy of Buchbinder et al. (2015). These algorithms are both O(k)-adaptive, where k is the cardinality constraint.
The algorithm of Ene and Nguyen (2020) requires access to an oracle for the multilinear extension and its gradient. In the case of maximum cut, the multilinear extension and its gradient can be computed in closed form in time linear in the size of the graph, as described in Appendix G. This fact enables us to evaluate the algorithm of Ene and Nguyen (2020) using direct oracle access to the multilinear extension and its gradient on the maximum cut application. However, no closed form exists for the multilinear extension of the revenue maximization objective. In this case, we found (see Appendix G.1) that sampling to approximate the multilinear extension is exorbitant in terms of runtime; hence, we were unable to evaluate Ene and Nguyen (2020) on revenue maximization.
For all algorithms, the accuracy parameter ε was set to 0.1, and the failure probability parameter δ was set to 0.1; 100 samples were used to evaluate expectations for Threshold-Sampling in AdaptiveNonmonotoneMax (thus, this algorithm was run as a heuristic with no performance guarantee). Further, in the algorithms AdaptiveThresholdGreedy, ParCardinal, and AdaptiveNonmonotoneMax, we ignored the smaller values of ε and δ prescribed within each algorithm and simply used the input values of ε and δ. For AdaptiveThresholdGreedy and ParCardinal, using the best solution value found so far as a lower bound on OPT, we employed an early termination condition that checks whether the threshold value τ < αOPT(1 − ε)/k, where α is the approximation ratio of each algorithm. This early termination condition is responsible for the high variance in total queries. We attempted to use the same sharper upper bound on OPT/k as our algorithms in AdaptiveNonmonotoneMax, but it resulted in significantly worse objective values, so we simply used the maximum singleton value as described in Fahrbach et al. (2019). ParCardinal is the specialization of an algorithm that handles knapsack constraints. By calling its threshold sampling subroutine a large number of times, ParCardinal is able to achieve a constant probability of certain events; hence, it is comparatively less efficient than algorithms designed for cardinality constraints. For our experiments, we ran ParCardinal(v1) and ParCardinal(v2) only on the BA and ca-GrQc datasets.
Results for randomized algorithms are averaged over 20 independent repetitions, and the mean is reported. The standard deviation is indicated by a shaded region in the plots. Any algorithm that requires a subroutine for UnconstrainedMax is implemented to use a random set, following the setup used for AST and ATG.
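The random-set subroutine is simple to state: include each element independently with probability 1/2, which is a (1/4)-approximation in expectation for unconstrained non-monotone submodular maximization (Feige et al., 2011). A minimal sketch (the signature and the oracle `f` are assumptions of this example):

```python
import random

def random_subset_unconstrained_max(f, ground_set, rng=random.Random(0)):
    """Return a uniformly random subset of the ground set and its value.

    Each element is included independently with probability 1/2; for a
    nonnegative submodular f this achieves a (1/4)-approximation in
    expectation (Feige et al., 2011).  Sketch only, not the paper's code.
    """
    S = frozenset(x for x in ground_set if rng.random() < 0.5)
    return S, f(S)
```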

Applications and Datasets
Maxcut. The cardinality-constrained maximum cut function is defined as follows. Given a graph G = (V, E) with nonnegative edge weight w_ij on each edge (i, j) ∈ E, for S ⊆ V, let f(S) = Σ_{(i,j)∈E : |{i,j}∩S|=1} w_ij, the total weight of edges with exactly one endpoint in S. In general, this is a non-monotone, submodular function. In our implementation, all edges have a weight of 1.
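A direct implementation of this objective is straightforward (the edge-dictionary representation here is illustrative, not the paper's):

```python
def cut_value(S, edges):
    """Weighted cut value of S: sum of w_ij over edges (i, j) with
    exactly one endpoint in S.  `edges` maps (i, j) -> w_ij."""
    S = set(S)
    return sum(w for (i, j), w in edges.items() if (i in S) != (j in S))
```

For example, on a unit-weight triangle, `cut_value({0}, ...)` is 2, while both the empty set and the full vertex set have cut value 0, illustrating non-monotonicity.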
Revmax. The revenue maximization objective is defined as follows. Let graph G = (V, E) represent a social network, with nonnegative edge weight w_ij on each edge (i, j) ∈ E. We use the concave graph model introduced by Hartline et al. (2008). In this model, each user i ∈ V is associated with a non-negative, concave function f_i : R+ → R+. The value v_i(S) = f_i(Σ_{j∈S} w_ij) encodes how likely user i is to buy a product if the set S has adopted it. The total revenue for seeding a set S is f(S) = Σ_{i∈V\S} v_i(S). This is a non-monotone, submodular function. In our implementation, each edge weight w_ij ∈ (0, 1) is chosen uniformly at random; further, f_i(·) = (·)^{α_i}, where α_i ∈ (0, 1) is chosen uniformly at random for each user i ∈ V.
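A sketch of this objective under the concave graph model (the adjacency-dictionary representation is an assumption of this example):

```python
def revenue(S, adj, alpha):
    """Concave-graph-model revenue of seed set S.

    Each non-seed user i contributes (sum_{j in S} w_ij) ** alpha_i,
    where `adj[i][j]` holds w_ij and `alpha[i]` lies in (0, 1).
    Seeded users contribute nothing, which makes f non-monotone."""
    S = set(S)
    total = 0.0
    for i in adj:
        if i in S:
            continue
        influence = sum(w for j, w in adj[i].items() if j in S)
        total += influence ** alpha[i]
    return total
```

Seeding every user yields zero revenue, so adding elements can decrease f.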

Main Results
In Fig. 2, we show representative results for cardinality-constrained maximum cut on web-Google (n = 875713) for both small and large k values. Results on other datasets and on revenue maximization are given in Figs. 3 and 4. In addition, results for Ene and Nguyen (2020) when the multilinear extension is approximated via sampling are given in Appendix G.1.
The algorithms are evaluated by the objective value of the solution, the total queries made to the oracle, and the number of adaptive rounds (lower is better). Objective value is normalized by that of IteratedGreedy.
In terms of objective value (Figs. 2(a) and 2(d)), our algorithm ATG maintained better than 0.99 of the IteratedGreedy value, while all other algorithms fell below 0.95 of the IteratedGreedy value on some instances. Our algorithm AST obtained similar objective value. It is interesting to observe that the two algorithms with the best approximation ratio of 1/e, Ene and Nguyen (2020) and FastRandomGreedy, returned the worst objective values for larger k (Fig. 2(d)). For total queries (Fig. 2(e)), the most efficient is Ene and Nguyen (2020), although it does not query the set function directly, but rather the multilinear extension and its gradient. The most efficient of the combinatorial algorithms was AST, followed by ATG. Finally, with respect to the number of adaptive rounds (Fig. 2(f)), the best was AdaptiveNonmonotoneMax, closely followed by AST; the next lowest was ATG, followed by Ene and Nguyen (2020).
The results in Figs. 3 and 4 are qualitatively similar. Regarding the ParCardinal algorithms, the results in Fig. 4 demonstrate that ParCardinal(v2) is highly parallelizable. However, despite achieving a 0.172 approximation ratio, the objective values of ParCardinal(v1) and ParCardinal(v2) fell below 0.85 of the IteratedGreedy value. Because ParCardinal makes a large constant number of repeated calls to ThreshSeq, these two algorithms are the most query inefficient, roughly two to three orders of magnitude worse than our algorithms.
5.5 Comparison of Different Threshold Sampling Procedures

As for adaptive rounds, ThreshSeq, Threshold-Sampling, and TS-AMA-v1 all run in O(log(n)) rounds, while TS-AMA-v2 runs in O(log²(n)) rounds. By the results in Figs. 5(b) and 5(e), our ThreshSeq is the most highly parallelizable algorithm, followed by TS-AMA-v1. TS-AMA-v2 is significantly worse, as predicted by the theory. The better performance of our algorithm in practical settings can perhaps be attributed to the following factors. Theoretically, all algorithms except for TS-AMA-v2 have the same order of adaptivity. However, the adaptivity of each algorithm is associated with different constants, which in turn depend on the design of the algorithm. Our algorithm maintains two sets A′ ⊆ A during its execution. At the beginning of each iteration, the filtration step is with respect to the set A, which may contain elements with negative marginal gains; at the end of the algorithm, elements with negative marginal gains are excluded from A to get A′, which is used to bound the average marginal gain of the solution. From an experimental point of view, this implementation allows us to filter out more elements after one round while maintaining the same average marginal gain. With respect to query calls, while our ThreshSeq queries only once for each prefix, Threshold-Sampling queries 16⌈log(2/δ)/ε²⌉ times, and both TS-AMA-v1 and TS-AMA-v2 query |V| times. According to Figs. 5(c) and 5(f), our ThreshSeq is the most query efficient of all, and its total queries increase only modestly as k increases. With binary search, TS-AMA-v2 is the second best, with O(n log²(n)) query complexity. As for Threshold-Sampling, with input values n = 968, k = 10, and ε = δ = 0.1, it makes about 2 × 10^5 queries for each prefix, which is significantly large.
Regarding the different ATG variants, they all return competitive solutions compared with IteratedGreedy; see Figs. 5(g) and 5(j). Since each iteration of ATG calls a threshold sampling subroutine which is based on the solution of previous iterations and a slowly decreasing threshold τ, after the first filtration of the subroutine, the size of the candidate set is limited. Thus, there is no significant difference between the ATG variants concerning rounds and queries. However, there are two exceptions. First, since TS-AMA-v2 is the only one with O(log²(n)) adaptive rounds, it still runs with more rounds; see Figs. 5(h) and 5(k). Also, the number of queries of ATG with Threshold-Sampling is significantly larger, for the same reason discussed before.
Among all of them, the ThreshSeq proposed in this paper is not only the best in theory, but also performs well in experiments compared with the pre-existing threshold sampling algorithms.

Discussion and Future Directions
In this paper, we propose a new threshold sampling algorithm, ThreshSeq, which solves Threshold on non-monotone instances with high probability, with optimal adaptivity and nearly optimal query complexity. Different from other state-of-the-art thresholding algorithms, ThreshSeq is based on maintaining two sets that separately solve Threshold. We then propose two approximation algorithms, AdaptiveSimpleThreshold and AdaptiveThresholdGreedy, that are inspired by IteratedGreedy.
Compared to state-of-the-art algorithms, our ThreshSeq exhibits the highest query efficiency with relatively few adaptive rounds; ATG produces results that are almost identical to IteratedGreedy in terms of objective value and, relatively speaking, is the most query efficient combinatorial algorithm; AST is the second most parallelizable algorithm among all algorithms and delivers reasonably good objective values. Despite these good results, it should be noted that our approximation algorithms rely on the UnconstrainedMax subroutine, which requires access to the multilinear extension and may be impractical in certain settings. Thus, in the experiments, we substituted it with a random subset, which provides an expected (1/4)-approximation. This substitution may, however, decrease the objective value in the experiments.
Further investigation is needed, and there is still significant room for improvement. For instance, in non-monotone submodular maximization problems, using the maximum singleton value to guess OPT is a common practice that involves O(log(n)) guesses. If the number of guesses could be reduced to a constant, the query complexity would improve significantly, by a factor of O(log(n)). Additionally, the current best theoretical approximation ratio is 0.385, while the best we propose is a (0.193 − ε)-approximation algorithm (a (0.139 − ε)-approximation in our experiments with the random subset unconstrained maximization subroutine). Hence, an interesting question remains: can we parallelize other algorithms that provide a better approximation ratio?
In our paper, we focus on the number of queries and their parallelizability, assuming that the oracle computation time dominates the overall computation time. However, in practical scenarios, the function representation significantly influences algorithm performance (e.g. the multilinear extension and its gradient have closed forms for maximum cut). Thus, for any particular application, there is substantial room for improvement based on the specific representation of the submodular function.
In Fahrbach et al. (2019) and Kuhnle (2021), the above lemma is used with non-monotone submodular functions; however, when f is non-monotone, the lemma does not hold. Alg. 5 only checks (on Line 13) whether more than a constant fraction of elements have marginal gains larger than the threshold τ. If there exist elements with large-magnitude negative marginal gains, then the average marginal gain may fail to satisfy the lower bound in Lemma 16. As for the proof in Fahrbach et al. (2019), the following inequality (needed for the proof of Lemma 3.3 of Fahrbach et al. (2019)) does not hold, where |T| = t* and t ≥ t*/(1 + ε). Next, we give a counterexample for the two versions of Threshold-Sampling used in Fahrbach et al. (2019), where the only difference between them is that the if condition on Line 9 of Alg. 6 changes to |A| < 3k.
Counterexample 1. Define a set function f : 2^N → R+ as follows, where a ∈ N.
Thus, f is a non-negative, non-monotone submodular function.
Consider the first iteration of the outer for loop, where S = ∅ and A = N after Line 8. For any
So, for any value of ε, ReducedMean returns true when t > εn/2. The first round of Threshold-Sampling samples a set T_1 with t′_1 = |T_1| > εn/2. Then S is updated to S = T_1.
For the Threshold-Sampling in Fahrbach et al. (2019) with stop condition |A| < 3k, the algorithm stops here after the first iteration, no matter what is sampled. In this case, the expected marginal gain of the set returned by the algorithm is as follows.
Next, we consider the Threshold-Sampling with stop condition |A| = 0. After the first iteration discussed above, if a ∈ T_1, all the elements are filtered out in the second round; the algorithm stops here and returns S, say S_1. If a ∉ T_1, then T_1 and a are filtered out in the second round, which means A = N\(S ∪ {a}). And for any T ⊆ A and x ∈ A\T,
Therefore, E[I_t] = 1 for all t. After several iterations, S = N\{a} is returned, say S_2.
The expected objective value of the returned set is as follows.

Appendix C. Proofs for Section 2

Lemma 4. Given V after the random permutation on Line 8, let

Proof of Lemma 4. After the filtering on Line 5, any element
Thus, for any x ∈ S_{i−1}, it holds that x ∈ S_i, which means S_{i−1} ⊆ S_i.
Proof of Lemma 5. Call an element v_i ∈ V bad if ∆(v_i | A ∪ T_{i−1}) < τ, and good otherwise.
The random permutation of V can be regarded as |V| dependent Bernoulli trials.

IteratedGreedy works as follows. First, a standard greedy procedure is run, which produces a set A of size k. Next, a second greedy procedure is run to yield a set B; during this second procedure, the elements of A are ignored. A subroutine for UnconstrainedMax is used on f restricted to A, which yields a set A′. Finally, the set among {A, A′, B} that maximizes f is returned.
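The IteratedGreedy outline above can be sketched in a few lines (illustrative Python; the oracle `f` and the `unconstrained_max` subroutine are assumptions of this example, not the paper's implementation):

```python
def iterated_greedy(f, ground_set, k, unconstrained_max):
    """Sketch of the IteratedGreedy outline (Gupta et al., 2010).

    `unconstrained_max(f, U)` is any subroutine for unconstrained
    maximization of f restricted to U (e.g. a random subset)."""
    def greedy(candidates):
        # Standard greedy: repeatedly add the element of largest
        # positive marginal gain, up to k elements.
        S = frozenset()
        for _ in range(k):
            best = max(candidates - S,
                       key=lambda x: f(S | {x}) - f(S), default=None)
            if best is None or f(S | {best}) - f(S) <= 0:
                break
            S = S | {best}
        return S

    A = greedy(frozenset(ground_set))
    B = greedy(frozenset(ground_set) - A)  # second greedy ignores A
    A_prime = unconstrained_max(f, A)      # unconstrained max within A
    return max((A, A_prime, B), key=f)
```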
Then, by using this procedure as a subroutine, the algorithm IteratedGreedy has approximation ratio (e − 1)/(e(2 + α) − α) for SMCC.
Proof. For 1 ≤ i ≤ k, let a_i, b_i be as chosen during the run of IteratedGreedy. Define
where the first inequality follows from the greedy choices, the second follows from submodularity, and the third follows from submodularity and the fact that A_i ∩ B_i = ∅. Hence, from this recurrence and standard arguments,

Proof of Lemma 9. Since each element in A′ has nonnegative marginal gain, it always holds that f(A′_j) ≥ f(A′_{j−1}). From Lemma 6, at least a (1 − ε)-fraction of the elements of A are good. Therefore, there are at least (1 − ε)k indices j for which a′_j is a good element or a dummy element. Next, we consider the following three cases for a′_j.
Case i(j) = 1 and a′_j is good. By Theorem 3 and Lemma 6, it holds that
Case i(j) > 1 and a′_j is good. Since a′_j is returned at iteration i(j) and a′_j is good, it holds that: (1) f(A′_j) − f(A′_{j−1}) ≥ τ_{i(j)}; (2) at the previous iteration i(j) − 1, ThreshSeq returns S_{i(j)−1} such that |S_{i(j)−1}| < k − |A_{i(j)−2}|. By property (2) and Theorem 3, for any o ∈ O\A_{i(j)−1}, ∆(o | A_{i(j)−1}) < τ_{i(j)−1}. Then,
where Inequality 10 follows from the proof of Lemma 6, and Inequality 11 follows from A′_{i(j)−1} ⊆ A′_{j−1}.
Case i(j) = ℓ + 1 (i.e., a′_j is a dummy element). In this case, |A| < k when the first for loop ends. So, ThreshSeq in the last iteration returns S_ℓ such that |S_ℓ| < k − |A_{ℓ−1}|. From Theorem 3, it holds that ∆(o | A_ℓ) < τ_ℓ < M/(ck) for any o ∈ O\A_ℓ. Thus,
where Inequality (a) follows from A_ℓ = A and f(A′_j) = f(A′). The first inequality of Lemma 9 holds in these three cases for at least (1 − ε′)k values of j.
Proof of (14). Let λ = 1 − 1/e, κ = e^{−(1−ε′)²}. Inequality (14) is satisfied iff

Appendix G. Multilinear Extension and Implementation of Ene and Nguyen (2020)

In this section, we describe the multilinear extension and the implementation of Ene and Nguyen (2020). The multilinear extension F of a set function f is defined to be, for x ∈ [0, 1]^n: F(x) = Σ_{S⊆N} f(S) Π_{i∈S} x_i Π_{i∉S} (1 − x_i); equivalently, F(x) = E[f(S_x)], where S_x includes each element i independently with probability x_i.
The gradient was approximated numerically by central differences, unless using this approximation required evaluations outside the unit cube, in which case the forward or backward difference approximations were used. The parameter γ is set to 0.5.
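To make the cost of sampled oracles concrete, the definition F(x) = E[f(S_x)] suggests the following naive Monte Carlo estimator (a sketch for illustration, not the implementation used in the experiments): each evaluation of F costs `samples` queries to f, which is why sampling is so expensive.

```python
import random

def multilinear_estimate(f, x, elements, samples=1000, rng=random.Random(0)):
    """Monte Carlo estimate of F(x) = E[f(S_x)], where S_x includes
    element i independently with probability x[i].  Each call makes
    `samples` queries to the set function f."""
    total = 0.0
    for _ in range(samples):
        S = frozenset(i for i in elements if rng.random() < x[i])
        total += f(S)
    return total / samples
```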
Finally, for the maximum cut application, closed-form expressions exist for both the multilinear extension and its gradient. These are F(x) = Σ_{(u,v)∈E} w_uv [x_u(1 − x_v) + x_v(1 − x_u)], since edge (u, v) is cut with probability x_u(1 − x_v) + x_v(1 − x_u), and ∂F/∂x_u = Σ_{v : (u,v)∈E} w_uv (1 − 2x_v).
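These closed forms can be implemented directly and agree with the cut value at integral points (a sketch under the same illustrative edge-dictionary representation as before):

```python
def maxcut_multilinear(x, edges):
    """Closed-form multilinear extension of the cut function: edge (u, v)
    is cut with probability x_u (1 - x_v) + x_v (1 - x_u)."""
    return sum(w * (x[u] * (1 - x[v]) + x[v] * (1 - x[u]))
               for (u, v), w in edges.items())

def maxcut_gradient(x, edges, v):
    """Partial derivative dF/dx_v = sum over edges incident to v of
    w * (1 - 2 * x_neighbor)."""
    return sum(w * (1 - 2 * x[a if b == v else b])
               for (a, b), w in edges.items() if v in (a, b))
```

At a 0/1 point x, `maxcut_multilinear` reduces to the cut value of the indicated set, which is a useful correctness check.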
Implementation. The algorithm was implemented as specified in the pseudocode on page 19 of the arXiv version of Ene and Nguyen (2020). We followed the same parameter choices as in Ene and Nguyen (2020), except that we set ε = 0.1, as setting it to 0.05 did not improve the objective value significantly but caused a large increase in runtime and adaptive rounds. The value δ = ε³ was used after communication with the authors.

G.1 Additional Experiments
In this section, we further investigate the performance of Ene and Nguyen (2020) when closed-form evaluation of the multilinear extension and its gradient is impossible. Sampling to approximate the multilinear extension and its gradient is extremely inefficient, or yields poor solution quality with a small number of samples; for this reason, we exclude this algorithm from our revenue maximization experiments. To perform this evaluation, we compared versions of the algorithm of Ene and Nguyen (2020) that use varying numbers of samples to approximate the multilinear extension.
Results are shown in Fig. 6 on a very small random graph with n = 87 and k = 10. The figure shows the objective value and the total queries to the set function vs. the number of samples used to approximate the multilinear extension. There is a clear tradeoff between the solution quality and the number of queries required; at 10³ samples per evaluation, the algorithm matches the objective value of the version with the exact oracle; however, even at roughly 10^11 queries (corresponding to 10⁴ samples for each evaluation of the multilinear extension), the algorithm of Ene and Nguyen (2020) is unable to exceed 0.8 of the IteratedGreedy value.
On the other hand, if 10 or fewer samples are used to approximate the multilinear extension, the algorithm is unable to exceed 0.5 of the IteratedGreedy value and still requires on the order of 10⁷ queries.
The parallelizable algorithms compared are: AdaptiveNonmonotoneMax of Fahrbach et al. (2019), the algorithm of Ene and Nguyen (2020), and two versions of ParCardinal in Amanatidis et al. (2021) (ParCardinal(v1) is the version without binary search and ParCardinal(v2) is the version with binary search). Also, we compare four versions of our algorithms with different threshold procedures: Threshold-Sampling of Fahrbach et al. (2019), the two versions of the threshold sampling algorithm of Amanatidis et al. (2021), and the ThreshSeq proposed in this paper.

Figure 2: Comparison of objective value (normalized by the IteratedGreedy objective value), total queries, and adaptive rounds on web-Google for the maxcut application for both small and large k values. The large k values are given as a fraction of the number of nodes in the network. The algorithm of Ene and Nguyen (2020) is run with oracle access to the multilinear extension and its gradient; total queries reported for this algorithm are queries to these oracles, rather than the original set function. The legend in Fig. 2(b) applies to all other subfigures.

Figure 3: Results for revenue maximization on ca-Astro, for both small and large k values. Large k values are indicated as a fraction of the total number n of nodes. The legends in Figs. 2 and 4 apply.

Figure 4: Additional results for maximum cut on BA and ca-GrQc with ParCardinal algorithms.

Fig. 5 shows the results of AST and ATG with different threshold sampling procedures for cardinality-constrained maximum cut on two datasets, BA (n = 968) and ca-GrQc (n = 5242). All the algorithms are run according to their pseudocode without any modification. TS-AMA-v1 and TS-AMA-v2 denote the threshold sampling algorithms without and with binary search proposed in Amanatidis et al. (2021).

Figure 5: Results of AST and ATG with four threshold sampling procedures on two datasets. The algorithms are run strictly following the pseudocode. The legends in Figs. 5(a) and 5(g) apply to all other subfigures.

Figure 6: Comparison of our algorithms with Ene and Nguyen (2020) on a very small random graph (n = 87, k = 10). In all plots, the x-axis shows the number of samples used to approximate the multilinear extension.
Both the adaptivity and query complexity values presented in this table are asymptotic.
The above lemma (of Fahrbach et al. (2019)) does not hold when the objective function is non-monotone. Counterexamples and pseudocode for Threshold-Sampling are given in Appendix B. A recent work by Fahrbach et al. (2023) has modified the Threshold-Sampling algorithm and fixed the problem discussed above. Two alternative solutions to the non-monotone threshold problem were proposed in Amanatidis et al. (2021) for the case of non-monotone, submodular maximization subject to a knapsack constraint. Due to the complexity of the constraints, the thresholding procedures in Amanatidis et al. (2021) have a high time complexity and require O(n²) queries within one iteration, even when restricted to a size constraint. Although a variant with binary search is proposed to reduce the number of queries, the sequential binary search worsens the adaptivity of the algorithm.
Compared to our nearly linear-time algorithms, the first variant of ParCardinal requires total queries with more than quadratic dependence on n, and the second variant has a worse approximation ratio and a worse number of queries than our algorithm ATG with the same adaptivity.
Amanatidis et al. (2021) proposed a parallelizable algorithm, ParCardinal, for knapsack constraints, which is the first constant-factor approximation with optimal adaptive complexity. In that paper, ParCardinal is directly applied to cardinality constraints. It achieves a 0.172 − ε ratio with two different variants: one has O(log(n)) adaptive rounds and O(nk log(n) log(k)) queries; the other has O(log(n) log(k)) adaptive rounds and O(n log(n) log²(k)) queries.
Algorithm 5 The ReducedMean algorithm of Fahrbach et al. (2019)
1: Input: access to a Bernoulli distribution D, error ε, failure probability δ
2: Set number of samples m