1 Introduction

Coverage problems, with a multitude of variants, are fundamental in theoretical computer science, combinatorics, and operations research. These problems capture numerous resource-allocation applications, such as electricity division [2, 22], sensor allocation [20], program testing [18], and plant location [9].

Coverage problems entail identifying—for a given threshold \(T \in \mathbb {Z}_+\) and a set of elements [n]—a collection of subsets, \(F_1, F_2, \ldots , F_T \subseteq [n]\), that respect particular combinatorial constraints. Here, the problem objective is specified by considering, for each element \(i \in [n]\), the number of selected subsets, \(F_t\)-s, that contain i. For instance, in the classic maximum coverage problem [16], the subsets, \(F_1, \ldots , F_T\), are constrained to be from a given set family and the objective is to maximize the number of elements \(i \in [n]\) that are contained in at least one of the \(F_t\)-s, i.e., maximize \(| \cup _t F_t|\).
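As a small illustration of the quantities in this definition, the following Python sketch counts, for each element, how many selected subsets contain it, and evaluates the maximum coverage objective \(| \cup _t F_t|\). The function names and the toy input are assumptions made for this example only.

```python
from collections import Counter

def coverage_counts(n, subsets):
    """For each element i in 1..n, count the selected subsets that contain i."""
    counts = Counter()
    for subset in subsets:
        counts.update(subset)
    return [counts[i] for i in range(1, n + 1)]

def num_covered(n, subsets):
    """Maximum coverage objective: number of elements in at least one subset."""
    return sum(1 for c in coverage_counts(n, subsets) if c >= 1)

selected = [{1, 2}, {2, 3}, {2}]      # F_1, F_2, F_3 over the ground set [4]
print(coverage_counts(4, selected))   # [1, 3, 1, 0]
print(num_covered(4, selected))       # 3
```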

We study coverage problems where the ground set corresponds to a population of n agents and the cardinal valuation of each agent \(i \in [n]\) depends on the number of selected subsets that contain i, i.e., the valuation of i depends on the coverage that i receives across the \(F_t\)-s. Our overarching goal is to select subsets that, while satisfying combinatorial constraints, achieve fair and efficient coverage among the n agents.

Before detailing the model, we describe a stylized example that illustrates the applicability of the coverage framework. Consider an electricity grid operator tasked with apportioning electricity for T time periods among a set of n agents (consumers with varying electricity requirements). In a time period \(t \in [T]\), the total demand of the n agents can exceed the available supply and, hence, the grid operator must select a subset of agents, \(F_t \subseteq [n]\), whose electricity consumption can be fulfilled: agents in the subset \(F_t\) receive electricity during the tth time period and the remaining agents do not. An important desideratum in such load shedding scenarios is to achieve fairness along with economic efficiency; see the motivating work of Baghel et al. [2] for a thorough treatment of load shedding and its connections with the fair division literature. Indeed, the coverage framework provides an abstraction for this load shedding environment: for each \(t \in [T]\), the selected subset \(F_t\) must satisfy a knapsack constraint (Footnote 1) and the cardinal preference of each agent \(i \in [n]\) is captured by the number of subsets that contain i, i.e., the number of time periods in which i receives electricity.

Combinatorial Constraints. We study a coverage framework wherein, for each \(t \in [T]\), the tth selected subset, \(F_t \subseteq [n]\), must belong to a set family \(\mathcal {I}_t\), i.e., each \(\mathcal {I}_t \subseteq 2^{[n]}\) specifies the possible choices for the tth selection. Our results do not require the families \(\mathcal {I}_t\)-s to be given explicitly as input. Our results hold for any \(\mathcal {I}_t\)-s that admit a fully polynomial-time approximation scheme (FPTAS) for the weight maximization problem: given weights \(w_1, \ldots , w_n \in \mathbb {R}_+\), for the n agents, find \(\mathop {\mathrm {arg\,max}}\limits \nolimits _{X \in \mathcal {I}_t} \ \sum _{i \in X} w_i\).

For instance, if each \(\mathcal {I}_t\) contains the subsets that satisfy a knapsack constraint, then an FPTAS for weight maximization is known to exist [26]; in such a case weight maximization corresponds to the standard knapsack problem (Footnote 2). Furthermore, if the families \(\mathcal {I}_t\)-s are independent sets of matroids, then one can exactly solve the weight maximization problem in polynomial time [24]. It is relevant to note that matroids provide an expressive construct for numerous combinatorial constraints, e.g., cardinality and partition constraints. Hence, the coverage framework with matroids provides, by itself, an encompassing class of instances. Also, in instances wherein the sizes of the families \(\mathcal {I}_t\)-s are polynomially large, weight maximization can be efficiently solved by direct enumeration.
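Two of the cases above are simple enough to sketch directly. The Python snippet below is a minimal illustration, with hypothetical function names: it solves weight maximization exactly by enumerating an explicitly listed family, and by keeping the k heaviest agents under a cardinality constraint (a uniform matroid, for which this greedy choice is optimal when weights are nonnegative).

```python
def max_weight_by_enumeration(family, weights):
    """Exact weight maximization over an explicitly listed family of subsets."""
    return max(family, key=lambda subset: sum(weights[i] for i in subset))

def max_weight_cardinality(weights, k):
    """Exact for the cardinality constraint |X| <= k: keep the k heaviest agents.
    Assumes nonnegative weights."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return frozenset(ranked[:k])

weights = {1: 0.5, 2: 0.2, 3: 0.6}
family = [frozenset(), frozenset({1, 2}), frozenset({3})]
best = max_weight_by_enumeration(family, weights)   # frozenset({1, 2})
top_two = max_weight_cardinality(weights, 2)        # frozenset({1, 3})
```

Either routine could stand in for the weight maximization oracle in such instances, since an exact solution is in particular a \((1-\beta )\)-approximation for every \(\beta > 0\).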

In addition, our result applies to settings that entail two-sided matchings: say, for each \(t \in [T]\), we have a bipartite graph \(G=(L \cup R, E)\), with \(L \cup R = [n]\), and the goal is to select a matching, i.e., agents covered by the matching constitute the tth selected subset. We can express this matching setting in the current framework by including, in each \(\mathcal {I}_t\), every subset of agents (i.e., subset of vertices in G) that is covered by some matching in G. Notably, such a formulation models two-sided markets [14, 25] such as (i) ridesharing platforms, wherein the agent set consists of both the vehicle drivers and the passengers and (ii) recommendation engines, in which producers are recommended to consumers. Our result holds in such matching settings, since here weight maximization can be optimally solved in polynomial time via a maximum-weight matching algorithm (Footnote 3).

Agents’ Valuations. As mentioned previously, we address settings in which each agent’s valuation depends on the number of times it is covered among the selected subsets \(F_t\)-s. Specifically, for a solution \(\mathcal {F} = (F_1,\ldots , F_T) \in \mathcal {I}_1 \times \ldots \times \mathcal {I}_T\), agent i’s valuation is defined as \(v_i(\mathcal {F}) {:}{=}| \{t\in [T] : i \in F_t \} | + 1\). Note that the valuation of each agent is smoothed by adding 1. This smoothing enables us to achieve meaningful (multiplicative) approximation guarantees by shifting the valuations and, hence, the collective welfare away from zero. We also note that valuation smoothing has been considered in prior works in fair division; see, e.g., [10, 12], and [17].

Nash Social Welfare. With the overarching aim of achieving fairness along with economic efficiency in coverage instances, we address the problem of maximizing Nash social welfare (NSW). This welfare function is defined as the geometric mean of agents’ valuations and it achieves a balance between the extremes of social welfare (a well-studied objective for economic efficiency) and egalitarian welfare (a prominent fairness notion). NSW stands as a fundamental metric for quantifying the extent of fairness in numerous resource-allocation contexts; indeed, in recent years, NSW has been extensively studied in the fair division literature; see, e.g., [5, 7, 15, 19, 23] and many references therein.

Nash social welfare satisfies key fairness axioms, including scale freeness, symmetry, and the Pigou-Dalton transfer principle [21]. The Pigou-Dalton principle requires that the collective welfare should increase under a bounded transfer of value from a well-off agent i to a worse-off agent j. NSW satisfies this principle, since the geometric mean of a more balanced valuation profile (of the n agents) is higher than that of a skewed one. At the same time, if the increase in agent j’s value is significantly less than the drop experienced by i, then NSW does not increase. That is, NSW prefers solutions (Footnote 4) that have reduced inequality and, simultaneously, it accommodates economic efficiency.
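A two-agent calculation illustrates both effects; the valuation profiles below are chosen purely for illustration. Starting from the profile (1, 5), a lossless transfer of one unit gives (2, 4), whereas a lossy transfer in which the worse-off agent gains 1 while the well-off agent loses 3 gives (2, 2):

$$\begin{aligned} \sqrt{1 \cdot 5} \approx 2.24 \ < \ \sqrt{2 \cdot 4} \approx 2.83, \qquad \text {whereas} \qquad \sqrt{2 \cdot 2} = 2 \ < \ \sqrt{1 \cdot 5} \approx 2.24. \end{aligned}$$

The balanced transfer raises the geometric mean, while the lossy one lowers it.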

Furthermore, in various fair division contexts, prior works have shown that a solution that maximizes NSW satisfies additional fairness properties, e.g., [1, 7, 10, 13, 15]. Critically, the fact that Nash optimal solutions bear additional guarantees does not undermine the relevance of finding solutions with as high a Nash social welfare as possible. NSW cardinally ranks the solutions and, conforming to a welfarist perspective, one prefers solutions with higher NSW. Therefore, developing approximation guarantees for NSW maximization is a well-justified objective in and of itself.

1.1 Our Results and Techniques

We develop a constant-factor approximation algorithm for maximizing Nash social welfare in fair coverage instances. Given a set of n agents and threshold \(T \in \mathbb {Z}_+\), our algorithm (Algorithm 1) computes in polynomial time a solution \(\mathcal {F} = (F_1, \ldots , F_T) \in \mathcal {I}_1 \times \ldots \times \mathcal {I}_T\) whose Nash social welfare, \(\textrm{NSW}(\mathcal {F}) = \left( \prod _{i=1}^n v_i(\mathcal {F}) \right) ^{\frac{1}{n}}\), is at least \(\frac{1}{18 + o(1)}\) times the optimal (Theorem 1). As mentioned previously, the algorithm only requires blackbox access to an FPTAS for weight maximization over the set families \(\mathcal {I}_1, \ldots , \mathcal {I}_T \subseteq 2^{[n]}\).

The algorithm starts with an arbitrary solution and iteratively performs updates till it essentially reaches a local maximum of the log social welfare \(\varphi (\mathcal {F}) {:}{=}\sum _{i=1}^n \log \left( v_i(\mathcal {F}) \right) \). Here, for any solution \(\mathcal {F} = (F_1,\ldots ,F_T)\), a local update corresponds to replacing—for some \(\tau \in [T]\)—the subset \(F_\tau \) with some other subset \(A_\tau \in \mathcal {I}_\tau \). The algorithm performs the local updates by invoking, as a subroutine, the FPTAS for weight maximization.

It is relevant to note that while the algorithm is simple in design, its analysis entails novel insights. In particular, the domain of solutions, \(\mathcal {I}_1 \times \ldots \times \mathcal {I}_T\), is combinatorial and, hence, it is not obvious if a local maximum solution of \(\varphi \) upholds any global approximation guarantees for \(\varphi \), let alone for NSW. Furthermore, a multiplicative approximation bound for \(\varphi \) does not translate into a multiplicative guarantee for NSW: for any solution \(\mathcal {F}\), we have \(\frac{1}{n} \varphi \left( \mathcal {F} \right) = \log \left( \textrm{NSW}(\mathcal {F}) \right) \). Hence, even though a solution that (globally) maximizes \(\varphi \) also maximizes NSW, multiplicative approximation guarantees get exponentially worse when one moves from \(\varphi \) to NSW. This observation also implies that one cannot directly utilize the approximation guarantee known for the so-called concave coverage problem [3] to obtain a commensurate approximation ratio for NSW maximization.
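The exponential loss can be made concrete with a short worked calculation. The factor \(c > 1\) below is a hypothetical approximation ratio introduced only for illustration. Suppose a solution \(\mathcal {F}\) satisfies \(\varphi (\mathcal {F}) \ge \frac{1}{c} \varphi (\mathcal {F}^*)\), where \(\mathcal {F}^*\) maximizes \(\varphi \). Since \(\textrm{NSW}(\mathcal {F}) = \exp \left( \varphi (\mathcal {F})/n \right) \), this yields only

$$\begin{aligned} \textrm{NSW}(\mathcal {F}) = \exp \left( \frac{\varphi (\mathcal {F})}{n} \right) \ge \exp \left( \frac{\varphi (\mathcal {F}^*)}{c \, n} \right) = \textrm{NSW}(\mathcal {F}^*)^{1/c}. \end{aligned}$$

Hence this argument bounds the ratio \(\textrm{NSW}(\mathcal {F}^*)/\textrm{NSW}(\mathcal {F})\) only by \(\textrm{NSW}(\mathcal {F}^*)^{1 - 1/c}\). For example, with \(c = 2\) and an instance in which \(\textrm{NSW}(\mathcal {F}^*) = T+1\) (the largest value NSW can take), the resulting bound is \(\sqrt{T+1}\), which is not a constant.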

Interestingly, in lieu of developing local-to-global approximation guarantees, we rely on counting arguments to establish the approximation ratio. We prove that, at a local maximum solution \(\mathcal {F}\) (of the function \(\varphi \)) and for any integer \(\alpha \ge 4\), the number of \(\alpha \)-suboptimal agents is at most \(n/\alpha \); here, an agent i is said to be \(\alpha \)-suboptimal iff i’s current valuation \(v_i(\mathcal {F})\) is less than (about) a \({1}/{\alpha }\) fraction of its optimal valuation. We complete the analysis by proving that these Markov-like bounds ensure that the computed solution \(\mathcal {F}\) achieves an \(\left( 18 + o(1)\right) \)-approximation guarantee for NSW maximization.

In addition, we complement the algorithmic result by proving that, in fair coverage instances, NSW maximization is APX-hard (Theorem 2). This inapproximability result rules out a polynomial-time approximation scheme (PTAS) for NSW maximization in fair coverage instances.

1.2 Additional Related Work and Applications

The coverage framework generalizes the well-motivated setup of public decision making [8], albeit for agents that have binary additive valuations. The public decision making setup captures settings wherein decisions have to be made on T social issues, that can impact many of the n agents simultaneously. Specifically, each issue \(t \in [T]\) is associated with a set of alternatives \(A_t = \{a^1_t, a^2_t,\ldots , a_t^{\ell _t} \}\) and every agent \(i\in [n]\) has an additive valuation over the issues. That is, for any outcome \(\mathcal {A}=(a_1,a_2,\ldots ,a_T) \in A_1 \times A_2 \times \ldots \times A_T\), agent i’s utility is \(u_i(\mathcal {A}) = \sum _{t=1}^T u_i^t(a_t)\); here \(u^t_i(a_t) \in \mathbb {R}_+\) is the utility that i gains from the alternative \(a_t \in A_t\).

Indeed, for agents \(i \in [n]\) with binary additive valuations (i.e., \(u^t_i(a) \in \{0,1\}\) for all t and \(a \in A_t\)) the coverage framework generalizes public decision making: for every \(t \in [T]\), define the set family \(\mathcal {I}_t\) by including in it the set \(F_a {:}{=}\{ i \in [n] : u_i^t(a) = 1 \}\) for each \(a \in A_t\). In particular, \(\mathcal {I}_t\) contains a set \(F_a\), for each alternative \(a \in A_t\), where \(F_a\) is the set of agents that value alternative a. This reduction gives us set families of polynomial size (\(|\mathcal {I}_t| = |A_t|\)) and, hence, our results specialize to this case.
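The reduction above can be sketched in a few lines of Python. The input encoding (one dictionary per issue, mapping each alternative to a list of binary utilities) and the function name are assumptions made for this example.

```python
def families_from_binary_utilities(n, utilities):
    """utilities[t] maps each alternative of issue t to a list of n binary
    utilities (entry i - 1 is the utility of agent i).
    Returns one set family per issue, containing F_a for each alternative a."""
    families = []
    for issue in utilities:
        family = [frozenset(i for i in range(1, n + 1) if u[i - 1] == 1)
                  for u in issue.values()]
        families.append(family)
    return families

utilities = [
    {"a": [1, 1, 0], "b": [0, 0, 1]},   # issue 1 with alternatives a and b
    {"c": [1, 0, 1]},                   # issue 2 with a single alternative c
]
families = families_from_binary_utilities(3, utilities)
# families[0] holds {1, 2} and {3}; families[1] holds {1, 3}
```

Selecting alternative a for issue t then corresponds to selecting the subset \(F_a\) from \(\mathcal {I}_t\).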

In the public decision making context, Conitzer et al. [8] obtain fairness guarantees in terms of relaxations of proportionality. They also show that Nash optimal solutions bear particular fairness properties. Complementing these results and for agents with (smoothed) binary additive valuations, the current work obtains approximation guarantees for NSW in public decision making.

The coverage framework also encompasses the standard fair division setting that entails allocation of m indivisible goods among n agents that have binary additive valuations. Multiple prior works have studied NSW in this discrete fair division setting; see, e.g., [6, 15]. Here, each agent \(i \in [n]\) prefers a subset of the goods \(V_i \subseteq [m]\) and agent i’s valuation \(u_i(S) = |S \cap V_i|\), for any \(S \subseteq [m]\). One can express this setting as a coverage instance by considering \(T = m\) set families each comprised of singleton subsets. Specifically, for each good \(g \in [m]\), we have a set family \(\mathcal {I}_g\) that includes all singletons \(\{i\}\) with the property that \(g \in V_i\), i.e., subset \(\{i\}\) is included in \(\mathcal {I}_g\) iff agent i values good g. As in the public decision making setting, here we obtain a coverage instance with polynomially large \(\mathcal {I}_t\)-s.
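A minimal Python sketch of this construction follows; the function name and the dictionary encoding of the sets \(V_i\) are assumptions made for illustration.

```python
def families_from_goods(m, preferred):
    """preferred maps each agent i to the set V_i of goods that i values.
    Returns, for each good g in 1..m, the family of singletons {i} with g in V_i."""
    return {g: [frozenset({i}) for i in sorted(preferred) if g in preferred[i]]
            for g in range(1, m + 1)}

preferred = {1: {1, 2}, 2: {2}}            # V_1 = {1, 2}, V_2 = {2}
families = families_from_goods(2, preferred)
# good 1 can only go to agent 1; good 2 can go to agent 1 or agent 2
```

Picking the singleton \(\{i\}\) from \(\mathcal {I}_g\) corresponds to giving good g to agent i. In this sketch a good that no agent values produces an empty family, so such goods would have to be dropped (or given the empty set as their only choice) before running an algorithm on the instance.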

With Nash welfare as a notion of fairness, Fluschnik et al. [12] study fair selection of indivisible goods under a knapsack constraint (Footnote 5). By contrast, the current work addresses combinatorial constraints over subsets of agents.

2 Notation and Preliminaries

An instance of a fair coverage problem is specified as a tuple \(\langle [n], T, \{\mathcal {I}_t\}_{t=1}^{T}\rangle \), where \([n] = \{1,2,\ldots ,n\}\) denotes the set of agents and \(T \in \mathbb {Z}_+\) denotes the number of subsets (of the agents) to be selected. Here, for each \(t \in [T]\), the tth selected subset (say \(F_t \subseteq [n]\)) is constrained to be from the family \(\mathcal {I}_t\), i.e., each \(\mathcal {I}_t \subseteq 2^{[n]}\) specifies the possible choices for the tth selection. It is not necessary that the set families \(\mathcal {I}_t\)-s are given explicitly; our algorithmic result only requires a blackbox access to an FPTAS for weight maximization over \(\mathcal {I}_t\)-s.

For a fair coverage instance \(\langle [n], T, \{\mathcal {I}_t\}_{t=1}^{T}\rangle \), a solution \(\mathcal {F} = (F_1,F_2,\ldots , F_T)\) is a tuple with the property that \(F_t \in \mathcal {I}_t\) for all \(t \in [T]\). We address settings wherein the valuation of each agent depends on the number of times it is covered among the selected subsets. Specifically, for a solution \(\mathcal {F} = (F_1,F_2,\ldots , F_T)\), the coverage value \(v_i(\mathcal {F})\), of agent \(i \in [n]\), is defined as \(v_i(\mathcal {F}) {:}{=}| \{t\in [T] : i \in F_t \} | + 1\). Note the coverage value of each agent is smoothed by adding 1. This smoothing ensures that the Nash social welfare of any solution is nonzero. We, in fact, show that if each agent’s value is equated to exactly the number of times it is covered among the subsets, then one cannot achieve any multiplicative approximation guarantee for Nash social welfare maximization (refer to the full version [4]).

The Nash social welfare (NSW) of a solution \(\mathcal {F}\) is defined as the geometric mean of the agents’ coverage values, \(\textsc {NSW}\left( \mathcal {F} \right) {:}{=}\left( \prod \limits _{i=1}^n v_i(\mathcal {F}) \right) ^\frac{1}{n}\). We will write \(\mathcal {F}^* = (F_1^*, F_2^*, \ldots , F_T^*)\) to denote a solution that maximizes the Nash social welfare in a given fair coverage instance. Furthermore, a solution \(\widehat{\mathcal {F}}\) is said to achieve a \(\gamma \)-approximation guarantee for the Nash social welfare maximization problem iff \(\textsc {NSW}(\widehat{\mathcal {F}}) \ge \frac{1}{\gamma }\textsc {NSW}\left( \mathcal {F}^*\right) \). The current work develops a constant-factor approximation algorithm for NSW maximization in fair coverage instances.

As mentioned previously, the algorithm works with a blackbox access to an FPTAS for weight maximization over \(\mathcal {I}_t\)-s. Specifically, with parameter \(\beta {:}{=}\frac{1}{64n T^{2}}\), we will write ApxMaxWt to denote a subroutine (blackbox) that takes as input weights \(w_1, \ldots ,w_n \in \mathbb {R}_+\), along with an index \(t \in [T]\), and finds a \((1 - \beta )\)-approximation to \(\max \limits _{X\in \mathcal {I}_t} \ \sum \limits _{i\in X} w_i\). The assumption that weight maximization over \(\mathcal {I}_t\)-s admits an FPTAS implies that a \((1 - \beta )\)-approximation (with \(\beta = \frac{1}{64n T^{2}}\)) can be computed in polynomial time.

For any solution \(\mathcal {F} = (F_1,\ldots ,F_T)\), index \(t \in [T]\), and subset \(X \in \mathcal {I}_t\), write \((X, \mathcal {F}_{-t})\) to denote the solution obtained by replacing \(F_t\) with X, i.e., \((X, \mathcal {F}_{-t}) {:}{=}(F_1,\ldots , F_{t-1},X, F_{t+1},\ldots ,F_T)\). Finally, we will write \(\varphi (\mathcal {F})\) to denote the log social welfare of the agents under solution \(\mathcal {F}\), i.e., \(\varphi (\mathcal {F}) {:}{=}\sum \limits _{i=1}^n \log \left( v_i(\mathcal {F}) \right) \). Here, the logarithm is to the base e, i.e., we consider the natural logarithm of coverage values.
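The definitions of this section translate directly into code. The Python sketch below uses hypothetical helper names and represents a solution as a tuple of frozensets; it computes coverage values, NSW, the log social welfare \(\varphi \), and the replacement \((X, \mathcal {F}_{-t})\).

```python
import math

def coverage_values(n, solution):
    """v_i(F) = |{t : i in F_t}| + 1 for agents i = 1..n."""
    return [1 + sum(1 for F_t in solution if i in F_t) for i in range(1, n + 1)]

def nsw(n, solution):
    """Nash social welfare: geometric mean of the coverage values."""
    return math.prod(coverage_values(n, solution)) ** (1.0 / n)

def log_welfare(n, solution):
    """phi(F): sum of natural logarithms of the coverage values."""
    return sum(math.log(v) for v in coverage_values(n, solution))

def replace(solution, t, X):
    """(X, F_{-t}): replace the t-th subset (t is 1-indexed) with X."""
    return solution[:t - 1] + (X,) + solution[t:]

F = (frozenset({1, 2}), frozenset({2}))
print(coverage_values(3, F))   # [2, 3, 1]
```

On this toy solution the product of coverage values is 6, so NSW equals \(6^{1/3}\), and \(\frac{1}{n} \varphi (\mathcal {F})\) agrees with \(\log \left( \textsc {NSW}(\mathcal {F}) \right) \) up to floating-point rounding.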

3 Approximation Algorithm for Nash Social Welfare

This section develops an \((18 + o(1))\)-approximation algorithm for maximizing Nash social welfare in fair coverage instances. Given any instance \(\langle [n], T, \{\mathcal {I}_t\}_{t=1}^{T}\rangle \), our algorithm Alg (Algorithm 1) starts with an arbitrary solution \(\mathcal {F} = (F_1,\ldots , F_T) \in \mathcal {I}_1 \times \ldots \times \mathcal {I}_T\) and iteratively performs local updates as long as it experiences a sufficient (additive) increase in the log social welfare \(\varphi \). Here, for any solution \(\mathcal {F} = (F_1,\ldots ,F_T)\), a local update corresponds to replacing—for some \(\tau \in [T]\)—the subset \(F_\tau \) with some other subset \(A_\tau \in \mathcal {I}_\tau \). For updating a solution \(\mathcal {F}\) and with \(\varphi \) as a guiding objective, the algorithm addresses the problem of finding, for every \(t \in [T]\), a subset \(A_t \in \mathcal {I}_t\) that achieves \(\max \limits _{X \in \mathcal {I}_t} \ \varphi (X, \mathcal {F}_{-t}) - \varphi (\mathcal {F})\). Notably, we reduce this problem to that of weight maximization over \(\mathcal {I}_t\)-s, by setting appropriate weights \(w^t_i\), for each agent \(i \in [n]\) and each index \(t \in [T]\). In particular, for a current solution \(\mathcal {F}=(F_1, \ldots , F_T)\), the algorithm sets the weights as follows

$$\begin{aligned} w^t_i = {\left\{ \begin{array}{ll} \log \left( v_i (\mathcal {F}) \right) - \log \left( v_i (\mathcal {F}) -1 \right) & \text { if } i \in F_t, \\ \log \left( v_i (\mathcal {F}) + 1 \right) - \log \left( v_i(\mathcal {F}) \right) & \text { otherwise, i.e., if } i \in [n] \setminus F_t. \end{array}\right. } \end{aligned}$$

We note that for each agent \(i \in F_t\), the coverage value \(v_i(\mathcal {F}) \ge 2\); this follows from the inclusion of ‘\(+1\)’ in the definition of coverage value. Hence, the weights (specifically, the terms \(\log \left( v_i (\mathcal {F}) -1 \right) \) for \(i \in F_t\)) are well defined. This is a relevant implication of smoothing the coverage values.

Moreover, this weight assignment ensures that, for every subset \(X \subseteq [n]\), its weight \(\sum _{i \in X} w^t_i = \left( \varphi (X, \mathcal {F}_{-t}) - \varphi (\mathcal {F}) \right) + \sum _{j \in F_t} w^t_j\) (see Claim 3). Since the weight of the current subset \(F_t\) (i.e., \(\sum _{j \in F_t} w^t_j\)) is fixed, finding a subset \(X \in \mathcal {I}_t\) with maximum possible weight is equivalent to finding a subset that maximizes \(\varphi (X, \mathcal {F}_{-t}) - \varphi (\mathcal {F})\). In fact, we show that an FPTAS for this weight maximization suffices. As mentioned previously, we denote by \(\textsc {ApxMaxWt}(t, w^t_1, \ldots , w^t_n)\) a subroutine (blackbox) that takes as input weights \(w^t_1, \ldots , w^t_n \in \mathbb {R}_+\) and finds a \((1 - \beta )\)-approximation to \(\max \limits _{X\in \mathcal {I}_t} \ \sum \limits _{i\in X} w^t_i\); where the parameter \(\beta = \frac{1}{64n T^{2}}\).
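The identity relating subset weights to changes in \(\varphi \) can be checked numerically on a small instance. The Python sketch below is only a sanity check under assumed helper names, not a proof: it computes the weights \(w^t_i\) as defined above and compares both sides of the identity over every subset \(X \subseteq [n]\).

```python
import math
from itertools import combinations

def coverage_values(n, solution):
    return [1 + sum(i in F_t for F_t in solution) for i in range(1, n + 1)]

def log_welfare(n, solution):
    return sum(math.log(v) for v in coverage_values(n, solution))

def weights(n, solution, t):
    """Weights w^t_i for the t-th subset (t is 1-indexed)."""
    v = coverage_values(n, solution)
    F_t = solution[t - 1]
    w = {}
    for i in range(1, n + 1):
        if i in F_t:   # removing i from F_t would lower v_i by one
            w[i] = math.log(v[i - 1]) - math.log(v[i - 1] - 1)
        else:          # adding i would raise v_i by one
            w[i] = math.log(v[i - 1] + 1) - math.log(v[i - 1])
    return w

def max_identity_gap(n, solution, t):
    """Largest absolute difference between the two sides over all X in 2^[n]."""
    w = weights(n, solution, t)
    current_weight = sum(w[j] for j in solution[t - 1])
    gap = 0.0
    for size in range(n + 1):
        for X in combinations(range(1, n + 1), size):
            swapped = solution[:t - 1] + (frozenset(X),) + solution[t:]
            lhs = sum(w[i] for i in X)
            rhs = log_welfare(n, swapped) - log_welfare(n, solution) + current_weight
            gap = max(gap, abs(lhs - rhs))
    return gap

F = (frozenset({1, 2}), frozenset({2, 3}), frozenset({2}))
print(max_identity_gap(4, F, 1))   # only floating-point rounding error remains
```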

Hence, for updating the solution \(\mathcal {F}=(F_1, \ldots , F_T)\), the algorithm invokes \(\textsc {ApxMaxWt}\) to obtain candidate subsets \(A_1, A_2, \ldots , A_T\). If, for some index \(\tau \in [T]\), replacing \(F_\tau \) by \(A_\tau \) leads to a sufficient additive increase in \(\varphi \), then Alg updates the solution to \((A_\tau , \mathcal {F}_{-\tau })\). Specifically, the algorithm sets parameter \(\varepsilon {:}{=}\frac{1}{16 n T}\) and if \(\varphi \left( A_\tau , \mathcal {F}_{- \tau } \right) - \varphi (\mathcal {F}) \ge \frac{\varepsilon n}{8 T}\), then it updates the solution (see Lines 4 and 5 in Algorithm 1). Otherwise, if for all the candidate subsets the increase in \(\varphi \) is less than \(\frac{\varepsilon n}{8 T}\), the algorithm terminates.

Note that, for any solution \(\widehat{\mathcal {F}}\), the log social welfare \(\varphi (\widehat{\mathcal {F}})\) is at most \(n \log (T+1)\) (Footnote 6). This observation, and the fact that in every iteration of \(\textsc {Alg}\) the log social welfare of the maintained solution increases by at least \(\frac{\varepsilon n}{8 T}\), imply that the algorithm terminates in polynomial time (Lemma 3). Overall, the algorithm efficiently finds a local maximum of \(\varphi \).

[Algorithm 1 (Alg): pseudocode figure omitted in this text version]
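The local search described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not a transcription of Algorithm 1: the families are given explicitly, the oracle `exact_max_weight` is a hypothetical stand-in for ApxMaxWt that enumerates each family exactly, and the loop applies the first sufficiently improving candidate it finds instead of first collecting all candidates \(A_1, \ldots , A_T\).

```python
import math

def coverage_values(n, solution):
    return [1 + sum(i in F_t for F_t in solution) for i in range(1, n + 1)]

def log_welfare(n, solution):
    return sum(math.log(v) for v in coverage_values(n, solution))

def exact_max_weight(family, w):
    # Stand-in for ApxMaxWt (assumption): exact enumeration of an explicit family.
    return max(family, key=lambda X: sum(w[i] for i in X))

def local_search(n, families):
    T = len(families)
    eps = 1.0 / (16 * n * T)
    threshold = eps * n / (8 * T)                    # required additive gain in phi
    solution = [family[0] for family in families]    # arbitrary initial solution
    improved = True
    while improved:
        improved = False
        v = coverage_values(n, solution)
        base = log_welfare(n, solution)
        for t in range(T):
            # Weights w^t_i for the current solution and index t.
            w = {i: (math.log(v[i - 1]) - math.log(v[i - 1] - 1)) if i in solution[t]
                 else (math.log(v[i - 1] + 1) - math.log(v[i - 1]))
                 for i in range(1, n + 1)}
            candidate_set = exact_max_weight(families[t], w)
            candidate = solution[:t] + [candidate_set] + solution[t + 1:]
            if log_welfare(n, candidate) - base >= threshold:
                solution = candidate                 # local update, then recompute weights
                improved = True
                break
    return solution

families = [[frozenset({1}), frozenset({2, 3})], [frozenset({1}), frozenset({2})]]
result = local_search(3, families)
# result is [frozenset({2, 3}), frozenset({1})], where every agent has coverage value 2
```

On this toy instance the sketch starts from \((\{1\}, \{1\})\), replaces the first subset by \(\{2,3\}\), and then stops because no single replacement increases \(\varphi \) by the required amount.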

We establish the approximation ratio via counting arguments. In the analysis, for each maintained solution \(\mathcal {F}\), we consider the agents i whose current coverage value, \(v_i(\mathcal {F})\), is sufficiently smaller than their optimal coverage value, \(v_i(\mathcal {F}^*)\); recall that \(\mathcal {F}^*\) denotes a Nash optimal solution. In particular, for a solution \(\mathcal {F}\) and any integer \(\alpha \in \mathbb {Z}_+\), we will write \(S_\alpha ^\mathcal {F}\) to denote the subset of agents whose coverage value is smaller than their optimal coverage value by a factor of more than \(\alpha (2.25 + \varepsilon )\), where \(\varepsilon = \frac{1}{16 n T}\). Formally, for any \(\alpha \in \mathbb {Z}_+\), the set of \(\alpha \)-suboptimal agents is defined as (Footnote 7)

$$\begin{aligned} S^\mathcal {F}_\alpha&{:}{=}\left\{ i\in [n] : v_i (\mathcal {F}) < \frac{1}{\alpha (2.25 + \varepsilon )} v_i(\mathcal {F}^*) \right\} . \end{aligned} \tag{1}$$

First, we prove that, for any solution \(\mathcal {F}\) and any integer \(\alpha \ge 4\), if the number of \(\alpha \)-suboptimal agents is more than \(\frac{n}{\alpha }\), then there necessarily exists a local update that increases \(\varphi \) by a sufficient amount (Lemma 2). Contrapositively, we obtain that, for the solution finally obtained by Alg and for any \(\alpha \ge 4\), the number of \(\alpha \)-suboptimal agents is at most \(n/\alpha \). We complete the analysis by proving that this guarantee ensures that \(\textsc {Alg}\) achieves a constant-factor approximation ratio for NSW maximization; more formally, we will establish the following theorem (in Sect. 3.2).

Theorem 1 (Main Result)

Given any fair coverage instance \(\langle [n], T, \{\mathcal {I}_t\}_{t=1}^{T} \rangle \), with blackbox access to an FPTAS for weight maximization over \(\mathcal {I}_t\)-s, Alg (Algorithm 1) computes—in polynomial time—an \(\left( 18+ \frac{1}{2nT} \right) \)-approximate solution for the Nash social welfare maximization problem.

3.1 Algorithm’s Analysis

The following claim bounds the change in log social welfare \(\varphi \) when a solution is updated.

Claim 1

For a solution \(\mathcal {F} = (F_1,\ldots ,F_T)\), let value \(v_i {:}{=}v_i(\mathcal {F})\) for all agents \(i\in [n]\). Then, for any subset \(X \subseteq [n]\) and any index \(t \in [T]\), we have

$$\begin{aligned} \varphi (X,\mathcal {F}_{-t}) - \varphi (\mathcal {F})\ge \sum _{i\in X} \frac{1}{v_i+1} - \sum _{j\in F_t} \frac{1}{v_j-1}. \end{aligned}$$

The proof of Claim 1 is deferred to the full version of the paper. Note that here, for each agent \(j \in F_t\), the coverage value \(v_j(\mathcal {F}) \ge 2\) and, hence, the subtracted terms, \(\frac{1}{v_j-1}\), in the claim are well defined.
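For intuition, here is a short sketch of why the bound in Claim 1 holds. It is offered as one possible argument and is not necessarily the proof given in the full version. It relies on the standard inequalities \(\frac{x}{1+x} \le \log (1+x) \le x\) for \(x \ge 0\). Replacing \(F_t\) by X raises the coverage value of each agent in \(X \setminus F_t\) by one, lowers that of each agent in \(F_t \setminus X\) by one, and leaves all other coverage values unchanged. Therefore,

$$\begin{aligned} \varphi (X,\mathcal {F}_{-t}) - \varphi (\mathcal {F})&= \sum _{i\in X \setminus F_t} \log \left( 1 + \frac{1}{v_i} \right) - \sum _{j\in F_t \setminus X} \log \left( 1 + \frac{1}{v_j - 1} \right) \\&\ge \sum _{i\in X \setminus F_t} \frac{1}{v_i+1} - \sum _{j\in F_t \setminus X} \frac{1}{v_j-1} \\&\ge \sum _{i\in X} \frac{1}{v_i+1} - \sum _{j\in F_t} \frac{1}{v_j-1}. \end{aligned}$$

The last step holds because each agent \(k \in X \cap F_t\) contributes \(\frac{1}{v_k+1} - \frac{1}{v_k-1} \le 0\) to the final expression.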

Next, we bound the expected change in \(\varphi \) when—for any solution \(\mathcal {F}=(F_1, \ldots , F_T)\)—we replace \(F_t\) by \(F^*_t\), for a \(t \in [T]\) chosen uniformly at random.

Lemma 1

For any solution \(\mathcal {F} = (F_1,\ldots ,F_T)\) and a Nash optimal solution \(\mathcal {F^*} = (F^*_1, \ldots ,F^*_T)\), let values \(v_i{:}{=}v_i(\mathcal {F})\) and \( v^*_i{:}{=}v_i(\mathcal {F^*})\), for all agents \(i\in [n]\). Then, uniformly sampling index t from the set [T], we obtain

$$\begin{aligned} \mathbb {E}_{t \in _R [T]} \ \Big [ \varphi (F^*_t,\mathcal {F}_{-t})-\varphi (\mathcal {F}) \Big ] \ge \frac{1}{T}\sum \limits _{i=1}^n \left( \frac{v^*_i-1}{v_i+1} \right) \ - \ \frac{n}{T}. \end{aligned}$$

Proof

Invoking Claim 1, with \(X = F^*_t\) for each \(t \in [T]\), we obtain

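$$\begin{aligned} \varphi (F^*_t,\mathcal {F}_{-t})-\varphi (\mathcal {F}) \ge \sum _{i\in F^*_t} \frac{1}{v_i+1} - \sum _{j\in F_t} \frac{1}{v_j-1}. \end{aligned} \tag{2}$$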

Index t is selected uniformly at random from the set [T]. Also, by definition, \(v_i^*\) is equal to 1 plus the number of subsets that contain i in the Nash optimal solution \(\mathcal {F}^*=(F^*_1, \ldots , F^*_T)\). Hence, the probability \(\mathbb {P}\{i\in F^*_t\} = \frac{v^*_i - 1}{T}\), for all agents \(i \in [n]\). Similarly, for the solution \(\mathcal {F}=(F_1, \ldots , F_T)\), we have \(\mathbb {P}\{j\in F_t\} = \frac{v_j -1}{T}\), for all \(j \in [n]\). These equations and inequality (2) give us

$$\begin{aligned} \mathbb {E}_{t\in _R [T]} \Big [\varphi (F^*_t,\mathcal {F}_{-t})-\varphi (\mathcal {F}) \Big ]&\ge \sum _{i\in [n]} \frac{v^*_i-1}{T} \cdot \frac{1}{v_i+1} - \sum _{j\in [n]:v_j\ge 2} \frac{v_j-1}{T} \cdot \frac{1}{v_j-1} \\&\ge \frac{1}{T}\sum _{i\in [n]} \left( \frac{v^*_i-1}{v_i+1}\right) \ - \ \frac{n}{T}. \end{aligned}$$

The lemma stands proved.

Next, we show that if, under a solution \(\mathcal {F}\), the number of \(\alpha \)-suboptimal agents is large, then the log social welfare can be sufficiently increased by replacing \(F_\tau \) with \(F^*_\tau \), for some \(\tau \in [T]\). Recall that \(\mathcal {F}^* = (F^*_1, \ldots , F^*_T)\) denotes a Nash optimal allocation and \(S^{\mathcal {F}}_\alpha \) denotes the set of \(\alpha \)-suboptimal agents under solution \(\mathcal {F}\); see Eq. (1).

Lemma 2

For any solution \(\mathcal {F}=(F_1,\ldots ,F_T)\) and any \(\alpha \ge 4\), if the number of \(\alpha \)-suboptimal agents is more than \(\frac{n}{\alpha }\) (i.e., \(|S^\mathcal {F}_\alpha |>\frac{n}{\alpha }\)), then there exists an index \(\tau \in [T]\) such that

$$\begin{aligned} \varphi (F^*_\tau , \mathcal {F}_{-\tau })-\varphi (\mathcal {F})\ge \dfrac{\varepsilon n}{2T}. \end{aligned}$$

Proof

Consider any solution \(\mathcal {F}\) and integer \(\alpha \ge 4\) such that \(|S^\mathcal {F}_\alpha |>\frac{n}{\alpha }\). For each agent \(i \in [n]\), write \(v_i {:}{=}v_i(\mathcal {F})\) and \(v^*_i = v_i(\mathcal {F}^*)\). Now, Lemma 1 gives us

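$$\begin{aligned} \mathbb {E}_{t \in _R [T]} \ \Big [ \varphi (F^*_t,\mathcal {F}_{-t})-\varphi (\mathcal {F}) \Big ]&\ge \frac{1}{T}\sum \limits _{i=1}^n \left( \frac{v^*_i-1}{v_i+1} \right) \ - \ \frac{n}{T} \\&\ge \frac{1}{T}\sum \limits _{i \in S^\mathcal {F}_\alpha } \left( \frac{v^*_i-1}{v_i+1} \right) \ - \ \frac{n}{T} \\&\ge \frac{1}{T}\sum \limits _{i \in S^\mathcal {F}_\alpha } \left( \frac{\alpha (2.25 + \varepsilon ) v_i-1}{v_i+1} \right) \ - \ \frac{n}{T}. \end{aligned}$$

Here, the second inequality holds since \(v^*_i \ge 1\) for all agents \(i \in [n]\), and the last inequality follows from the definition of \(S^\mathcal {F}_\alpha \); see Eq. (1).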

Claim 2

For parameter \(\varepsilon \in (0,1)\) along with any integers \(\alpha \ge 4\) and \(v\ge 1\), we have

$$\begin{aligned} \frac{\alpha (2.25 + \varepsilon ) v -1}{v+1} \ge \left( 1+\frac{\varepsilon }{2}\right) \alpha . \end{aligned}$$
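One way to see why the inequality in Claim 2 holds is the following short calculation; it is a sketch and not necessarily the proof from the full version. Multiplying both sides by \(v+1 > 0\) and collecting terms, the claim is equivalent to the nonnegativity of

$$\begin{aligned} \alpha (2.25 + \varepsilon ) v - 1 - \left( 1+\frac{\varepsilon }{2}\right) \alpha (v+1)&= \alpha \left( 1.25 + \frac{\varepsilon }{2} \right) v - \alpha \left( 1+\frac{\varepsilon }{2}\right) - 1 \\&\ge \alpha \left( 1.25 + \frac{\varepsilon }{2} \right) - \alpha \left( 1+\frac{\varepsilon }{2}\right) - 1 = \frac{\alpha }{4} - 1 \ \ge \ 0. \end{aligned}$$

The first inequality uses \(v \ge 1\) and the second uses \(\alpha \ge 4\). The bound is tight at \(\alpha = 4\) and \(v = 1\), which indicates where the requirement \(\alpha \ge 4\) comes from.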

Claim 2 (proof appears in the full version [4]) shows that \(\frac{\alpha (2.25 + \varepsilon ) v -1}{v+1} \ge \left( 1+\frac{\varepsilon }{2}\right) \alpha \), for all integers \(\alpha \ge 4\) and \(v \ge 1\). Therefore, the above-mentioned inequality simplifies to

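$$\begin{aligned} \mathbb {E}_{t \in _R [T]} \ \Big [ \varphi (F^*_t,\mathcal {F}_{-t})-\varphi (\mathcal {F}) \Big ]&\ge \frac{1}{T} \left| S^\mathcal {F}_\alpha \right| \left( 1+\frac{\varepsilon }{2}\right) \alpha \ - \ \frac{n}{T} \\&> \frac{1}{T} \cdot \frac{n}{\alpha } \left( 1+\frac{\varepsilon }{2}\right) \alpha \ - \ \frac{n}{T} = \frac{\varepsilon n}{2T}. \end{aligned}$$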

Therefore, there exists a \(\tau \in [T]\) such that

$$\begin{aligned} \varphi (F^*_\tau ,\mathcal {F}_{-\tau })-\varphi (\mathcal {F}) \ge \dfrac{\varepsilon n}{2T}. \end{aligned}$$

This completes the proof of the lemma.

Using Lemma 2, we will establish in Corollary 1, below, that the algorithm continues to iterate as long as the number of \(\alpha \)-suboptimal agents is more than \(n/\alpha \). The proof of the corollary also utilizes the following claim.

Claim 3

Let \(\mathcal {F}=(F_1, \ldots , F_T)\) be any solution considered in Alg (Algorithm 1) and, for all indices \(t \in [T]\) and agents \(i \in [n]\), let \(w^t_i\)-s be the corresponding weights set in Lines 2 or 7. Then, the weight of any subset \(X \subseteq [n]\) satisfies

$$\begin{aligned} \sum _{i \in X} w^t_i = \left( \varphi (X, \mathcal {F}_{-t}) - \varphi (\mathcal {F}) \right) + \sum _{j \in F_t} w^t_j. \end{aligned}$$

The proof of this claim appears in the full version of the paper.

Corollary 1

For any solution \(\mathcal {F}=(F_1,\ldots ,F_T)\) considered in Alg (Algorithm 1) and any \(\alpha \ge 4\), if the number of \(\alpha \)-suboptimal agents is more than \(\frac{n}{\alpha }\) (i.e., \(|S^\mathcal {F}_\alpha |>\frac{n}{\alpha }\)), then the execution condition in the while-loop (Line 4) of Alg holds.

Proof

Consider any solution \(\mathcal {F}\) in Alg and integer \(\alpha \ge 4\) such that \(|S^\mathcal {F}_\alpha |>\frac{n}{\alpha }\). In such a case, we will show that there exists an index \(\tau \in [T]\) for which the subset \(A_\tau \) returned by the subroutine \(\textsc {ApxMaxWt}(\tau , w^\tau _1, \ldots , w^\tau _n)\) (in Line 8) satisfies \(\varphi (A_\tau ,\mathcal {F}_{-\tau })-\varphi (\mathcal {F}) \ge \frac{\varepsilon n}{8T}\). Hence, the while-loop continues to iterate.

The desired index is in fact the one identified in Lemma 2. In particular, Lemma 2 ensures that for an index \(\tau \in [T]\) we have

$$\begin{aligned} \varphi (F^*_\tau , \mathcal {F}_{-\tau })-\varphi (\mathcal {F})&\ge \frac{\varepsilon n}{2T}. \end{aligned} \tag{3}$$

Now, Claim 3 (with \(X = F^*_\tau \)) gives us

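$$\begin{aligned} \sum _{i \in F^*_\tau } w^\tau _i = \left( \varphi (F^*_\tau , \mathcal {F}_{-\tau }) - \varphi (\mathcal {F}) \right) + \sum _{j \in F_\tau } w^\tau _j \ \ge \ \frac{\varepsilon n}{2T} + \sum _{j \in F_\tau } w^\tau _j, \end{aligned}$$

where the inequality follows from (3).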

Therefore,

$$\begin{aligned} \max _{X \in \mathcal {I}_\tau } \left\{ \sum \limits _{i\in X} w^\tau _i \right\}&\ge \sum _{i\in F_\tau } w^\tau _i + \frac{\varepsilon n}{2T}. \end{aligned} \tag{4}$$

Recall that \(\textsc {ApxMaxWt}(\tau , w^\tau _1, \ldots , w^\tau _n)\) returns a set \(A_\tau \in \mathcal {I}_\tau \) with the property that

$$\begin{aligned} \sum \limits _{i\in A_\tau } w^\tau _i&\ge \left( 1- \beta \right) \left( \max _{X\in \mathcal {I}_\tau } \ \sum \limits _{i\in X} w^\tau _i\right) . \end{aligned} \tag{5}$$

Here, parameter \(\beta = \frac{1}{64n T^{2}}\). Since \(\varepsilon = \frac{1}{16 n T}\), we have \(\beta = \frac{\varepsilon }{4T}\). Inequalities (4) and (5) give us

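$$\begin{aligned} \sum \limits _{i\in A_\tau } w^\tau _i \ \ge \ \left( 1- \beta \right) \left( \sum _{i\in F_\tau } w^\tau _i + \frac{\varepsilon n}{2T} \right) \ \ge \ \sum _{i\in F_\tau } w^\tau _i + \frac{\varepsilon n}{8T}. \end{aligned}$$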

Applying Claim 3, with \(X = A_\tau \), we get \(\varphi (A_\tau ,\mathcal {F}_{-\tau })-\varphi (\mathcal {F})\ge \frac{\varepsilon n}{8T}\). Therefore, the execution condition in the while-loop of Alg holds. This establishes the corollary.
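The step that combines inequalities (4) and (5) can be unpacked as follows. This is a sketch of one possible calculation and not necessarily the one used in the original derivation. It assumes the bound \(W {:}{=}\sum _{i\in F_\tau } w^\tau _i \le n \log 2 \le n\), which holds because every agent \(i \in F_\tau \) has \(v_i(\mathcal {F}) \ge 2\) and, hence, \(w^\tau _i = \log \frac{v_i(\mathcal {F})}{v_i(\mathcal {F}) - 1} \le \log 2\).

$$\begin{aligned} \sum \limits _{i\in A_\tau } w^\tau _i&\ge \left( 1- \beta \right) \left( W + \frac{\varepsilon n}{2T} \right) \\&= W + \frac{\varepsilon n}{2T} - \beta W - \beta \, \frac{\varepsilon n}{2T} \\&\ge W + \frac{\varepsilon n}{2T} - \frac{\varepsilon n}{4T} - \frac{\varepsilon n}{8T} = W + \frac{\varepsilon n}{8T}. \end{aligned}$$

The last inequality uses \(\beta W \le \frac{\varepsilon n}{4T}\) (from \(\beta = \frac{\varepsilon }{4T}\) and \(W \le n\)) together with \(\beta \, \frac{\varepsilon n}{2T} = \frac{\varepsilon ^2 n}{8T^2} \le \frac{\varepsilon n}{8T}\) (from \(\varepsilon \le 1\) and \(T \ge 1\)).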

We conclude the section by showing that the algorithm runs in polynomial time.

Lemma 3 (Runtime Analysis)

Given any fair coverage instance \(\langle [n], T, \{\mathcal {I}_t\}_{t=1}^{T} \rangle \) with blackbox access to an FPTAS for weight maximization over \(\mathcal {I}_t\)-s, Alg (Algorithm 1) terminates in time that is polynomial in n and T.

Proof

For any solution \(\mathcal {F}\), the coverage values \(v_i(\mathcal {F}) \ge 1\), for agents \(i \in [n]\). Hence, for the initial solution (arbitrarily) selected by the algorithm, we have \(\varphi (\mathcal {F}) = \sum \limits _{i=1}^n \log (v_i(\mathcal {F})) \ge 0\). In addition, since the coverage values of the agents under any solution are at most \(T+1\), the log social welfare \(\varphi \) across all solutions is upper bounded by \(n \log (T+1)\). Furthermore, note that in every iteration of \(\textsc {Alg}\) the log social welfare of the maintained solution increases additively by at least \(\frac{\varepsilon n}{8 T}\). These observations imply that the algorithm terminates after \(O\left( n T^2 \log T\right) \) iterations; recall that \(\varepsilon = \frac{1}{16 n T}\). Since each iteration executes in polynomial time, the time complexity of the algorithm is polynomial in n and T. The lemma stands proved.
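The iteration bound can be spelled out as a short calculation. Since \(\varphi \) starts at a value of at least 0, never exceeds \(n \log (T+1)\), and grows by at least \(\frac{\varepsilon n}{8T}\) per iteration, the number of iterations is at most

$$\begin{aligned} \frac{n \log (T+1)}{\varepsilon n / (8T)} = \frac{8 T \log (T+1)}{\varepsilon } = 8T \cdot 16 n T \cdot \log (T+1) = 128 \, n T^2 \log (T+1), \end{aligned}$$

where the second-to-last equality substitutes \(\varepsilon = \frac{1}{16 n T}\). This matches the stated \(O\left( n T^2 \log T\right) \) bound for \(T \ge 2\).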

3.2 Proof of Theorem 1

This section establishes the approximation ratio of Alg. For the given fair coverage instance, let \(\mathcal {F}= (F_1,\ldots ,F_T)\) be the solution returned by Alg and \(\mathcal {F}^* = (F^*_1,\ldots ,F^*_T)\) be a Nash optimal allocation. Note that \(v_i(\mathcal {F})\ge 1\) and \(v_i(\mathcal {F}^*)\le T+1\), for all agents \(i\in [n]\). Hence, for each agent \(i \in [n]\), the following bound holds: \(v_i(\mathcal {F}) \ge \frac{1}{T+1}v_i(\mathcal {F}^*)\).

We partition the set of agents [n] considering the multiplicative gap between the coverage values under \(\mathcal {F}\) and \(\mathcal {F}^*\). Specifically, for each integer \(d\in \left\{ 2,3,\ldots , \lceil \log (T +1) \rceil \right\} \), define the set

$$\begin{aligned} X_{2^d}&{:}{=}\left\{ i\in [n] : \frac{1}{2^{d+1}}\frac{v_i(\mathcal {F}^*)}{(2.25+\varepsilon )} \le v_i(\mathcal {F}) < \frac{1}{2^{d}} \frac{v_i(\mathcal {F}^*)}{(2.25+\varepsilon )}\right\} . \end{aligned}$$

Furthermore, write \(X' {:}{=}[n] \setminus \left( \bigcup \limits _{d=2}^{\lceil \log {(T+1)}\rceil }X_{2^d} \right) \). Since all agents i satisfy \(v_i(\mathcal {F})\ge \frac{1}{T+1}v_i(\mathcal {F}^*)\), the subset \(X'\) only contains agents \(j \in [n]\) with the property that \(v_j(\mathcal {F}) \ge \frac{1}{4}\frac{v_j(\mathcal {F}^*)}{(2.25+\varepsilon )}\). Also, note that the subsets \(X_{2^d}\)-s and \(X'\) form a partition of the set of agents [n]; in particular, \(|X'|+\sum _{d\ge 2} |X_{2^d}| = n\).
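To make the bucketing concrete, the following sketch assigns agents to the sets \(X_{2^d}\) and \(X'\) from two lists of coverage values. The function name and the sample values are hypothetical, and the logarithm in \(\lceil \log (T+1) \rceil \) is assumed to be base 2.

```python
from fractions import Fraction

def partition_agents(v, v_star, T, eps):
    """Split agent indices into buckets X_{2^d} and the remainder X'.

    v[i] and v_star[i] are agent i's coverage values under F and F*.
    """
    c = Fraction(9, 4) + eps            # the constant 2.25 + eps
    d_max = T.bit_length()              # equals ceil(log2(T + 1)) for T >= 1
    buckets = {d: [] for d in range(2, d_max + 1)}
    rest = []                           # the set X'
    for i, (vi, vs) in enumerate(zip(v, v_star)):
        scaled = Fraction(vs) / c
        for d in buckets:
            if scaled / 2**(d + 1) <= vi < scaled / 2**d:
                buckets[d].append(i)
                break
        else:
            rest.append(i)              # no bucket matched
    return buckets, rest

# Hypothetical values with n = 3 agents and T = 31.
eps = Fraction(1, 16 * 3 * 31)
buckets, rest = partition_agents([1, 1, 5], [10, 20, 6], T=31, eps=eps)
print(buckets, rest)
```

Every agent lands in exactly one list, mirroring the identity \(|X'|+\sum _{d\ge 2} |X_{2^d}| = n\). The sample values are for illustration only and need not arise from an actual run of Alg.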

Recall that \(S^\mathcal {F}_\alpha \) denotes the set of \(\alpha \)-suboptimal agents (see Eq. (1)). Also, note that, with \(\alpha = 2^d\), we have \(X_\alpha \subseteq S^\mathcal {F}_\alpha \). Moreover, by the contrapositive of Corollary 1, for the solution \(\mathcal {F}= (F_1,\ldots ,F_T)\), returned by Alg, we have

$$\begin{aligned} \left| X_{2^d} \right|&\le \left| S^\mathcal {F}_{2^d} \right| \ \le \frac{n}{2^d} \qquad \ \text { for all } 2\le d\le \lceil \log {(T+1)} \rceil \end{aligned}$$
(6)

For any subset of agents \(Y \subseteq [n]\), write \(\rho (Y) {:}{=}\prod _{i\in Y} \frac{v_i(\mathcal {F})}{v_i(\mathcal {F}^*)}\), if subset \(Y \ne \emptyset \). Otherwise, if \(Y = \emptyset \), define \(\rho (Y) {:}{=}1\). To bound the approximation ratio of the algorithm, we consider

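$$\begin{aligned} \frac{\textsc {NSW}(\mathcal {F})}{\textsc {NSW}(\mathcal {F}^*)}&= \left( \rho (X') \prod \limits _{d=2}^{\lceil \log (T+1) \rceil } \rho (X_{2^d}) \right) ^{\frac{1}{n}} \\&\ge \left( \left( \frac{1}{9+4\varepsilon }\right) ^{|X'|} \prod \limits _{d=2}^{\lceil \log (T+1) \rceil } \left( \frac{1}{2^{d-1} (9+4\varepsilon )}\right) ^{|X_{2^d}|} \right) ^{\frac{1}{n}} \\&= \frac{1}{9+4\varepsilon } \prod \limits _{d=2}^{\lceil \log (T+1) \rceil } \left( \frac{1}{2^{d-1}}\right) ^{\frac{|X_{2^d}|}{n}} \\&\ge \frac{1}{9+4\varepsilon } \prod \limits _{d=2}^{\lceil \log (T+1) \rceil } \left( \frac{1}{2^{d-1}}\right) ^{\frac{1}{2^d}}. \end{aligned}$$

Here, the first inequality follows from the definitions of \(X'\) and \(X_{2^d}\), the subsequent equality uses \(|X'|+\sum _{d\ge 2} |X_{2^d}| = n\), and the last inequality follows from inequality (6) along with the fact that \(\frac{1}{2^{d-1}} \le 1\).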

Claim 4

For any integer \(\ell \ge 2\), we have \(\prod \limits _{d=2}^\ell \left( \frac{1}{2^{d-1}}\right) ^\frac{1}{2^d}\ge \frac{1}{2}\).

The proof of Claim 4 appears in the full version of the paper. Using the claim, the stated approximation ratio follows:

$$\begin{aligned} \frac{\textsc {NSW}(\mathcal {F})}{\textsc {NSW}{(\mathcal {F}^*)}} \ge \frac{1}{9+4\varepsilon }\left( \prod \limits _{d = 2}^{\lceil \log (T+1) \rceil }\left( \frac{1}{2^{d-1}} \right) ^\frac{1}{2^d}\right) \ge \frac{1}{9+4\varepsilon } \cdot \frac{1}{2} = \frac{1}{18 + 8 \varepsilon }. \end{aligned}$$
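Claim 4 can be sanity-checked with exact arithmetic. The sketch below is not the proof from the full version; it only evaluates the exponent \(\sum _{d=2}^{\ell } \frac{d-1}{2^d}\), since the product in the claim equals 2 raised to the negative of this sum. The function name is hypothetical.

```python
from fractions import Fraction

def claim4_exponent(ell: int) -> Fraction:
    """Return sum_{d=2}^{ell} (d-1)/2^d; the Claim 4 product is 2**(-exponent)."""
    return sum((Fraction(d - 1, 2**d) for d in range(2, ell + 1)), Fraction(0))

for ell in (2, 3, 10, 60):
    e = claim4_exponent(ell)
    # The partial sum has the closed form 1 - (ell + 1) / 2^ell.
    assert e == 1 - Fraction(ell + 1, 2**ell)
    # Exponent below 1 means the product exceeds 2**(-1) = 1/2.
    assert e < 1

# The final constant: (1 / (9 + 4*eps)) * (1/2) = 1 / (18 + 8*eps).
eps = Fraction(1, 16 * 5 * 4)   # hypothetical n = 5, T = 4
assert Fraction(1, 2) / (9 + 4 * eps) == 1 / (18 + 8 * eps)
```

Because the exponent stays strictly below 1 for every finite \(\ell \), the product never drops to \(\frac{1}{2}\), which agrees with the inequality in the claim.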

4 APX-Hardness of Fair Coverage

This section shows that NSW maximization in fair coverage instances is APX-hard. In particular, we prove that there exists an absolute constant \(\gamma >1\) such that it is NP-hard to approximate the problem within a factor of \(\gamma \). Hence, a constant-factor approximation is the best one can hope for in NSW maximization over fair coverage instances, unless \(\textrm{P} = \textrm{NP}\). The hardness result is obtained via an approximation-preserving reduction from the following gap version of the maximum coverage problem.

Maximum k-Coverage [11]: Given a universe of elements \(U = \{1,2,\ldots ,n\}\), a threshold \(k \in \mathbb {Z}_+\), and a set family \(\mathcal {S} = \big \{S_\ell \subseteq [n]\big \}_{\ell =1}^N\), it is NP-hard to distinguish between

  • YES Instances: There exists a collection of k subsets in \(\mathcal {S}\) that covers all the elements, i.e., the union of the k subsets is equal to [n].

  • NO Instances: Any collection of k subsets from \(\mathcal {S}\) covers at most \(\left( 1-\frac{1}{e}\right) n\) elements, i.e., the union of any k subsets from \(\mathcal {S}\) has cardinality at most \(\left( 1-\frac{1}{e}\right) n\).

This hardness result of Feige [11] holds even for instances that satisfy the following properties: (i) all the subsets in \(\mathcal {S}\) have the same size \(\tau \), i.e., \(|S_\ell | = \tau \) for all subsets \(S_\ell \in \mathcal {S}\), and (ii) the threshold \(k = n/\tau \). Properties (i) and (ii) will be utilized in our approximation-preserving reduction (Footnote 8).

The APX-hardness result is established next. Notably, this negative result is applicable even for fair coverage instances in which the set families \(\mathcal {I}_t\)-s are explicitly given as input.

Theorem 2

In fair coverage instances, it is NP-hard to approximate the maximum Nash social welfare within a factor of 1.092.

Proof

Given an instance of the maximum k-coverage problem with universe \(U=\{1,2,\ldots ,n\}\) and set family \(\mathcal {S} =\{S_1,S_2,\ldots ,S_N\}\) of \(\tau \)-sized subsets of [n], we construct a fair coverage instance with n agents and \(T = k\). Since threshold \(k = \frac{n}{\tau }\), we have \(T = \frac{n}{\tau }\). To complete the construction and obtain an instance \(\langle [n], T, \{\mathcal {I}_t\}_{t=1}^{T}\rangle \), we set the families \(\mathcal {I}_t = \mathcal {S}\), for all \(t \in [T]\).

First, we show that if the underlying maximum coverage instance is a YES instance, then the optimal NSW in the constructed fair coverage instance is at least 2. Note that in the YES case there exists a size-k collection \(\mathcal {S}'=\{S'_1, S'_2, \ldots , S'_k \}\subseteq \mathcal {S}\) that covers all of [n]. Also, by construction, \(T = k\) and \(\mathcal {I}_t = \mathcal {S}\) for all \(1 \le t \le k\). Hence, for each \(t \in [T]\), we have \(S'_t \in \mathcal {I}_t\). Therefore, the tuple \(\mathcal {F}' = (S'_1, S'_2, \ldots , S'_k)\) is a solution under which \(v_i(\mathcal {F}') \ge 2\), for all agents \(i \in [n]\) (Footnote 9). This bound on the coverage value of the agents implies that in the current case, the optimal Nash social welfare is at least 2.
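The construction and the YES-case bound can be illustrated on a toy instance. This is a minimal sketch with hypothetical function names. It assumes, as the surrounding text suggests, that an agent's coverage value is 1 plus the number of selected subsets containing the agent, and that NSW is the geometric mean of these values. The brute-force search is only viable for very small inputs.

```python
from itertools import product
from math import prod

def coverage_values(n, solution):
    """v_i = 1 + number of selected subsets containing agent i (assumed convention)."""
    return [1 + sum(i in F for F in solution) for i in range(1, n + 1)]

def nsw(n, solution):
    """Nash social welfare: geometric mean of the coverage values."""
    return prod(coverage_values(n, solution)) ** (1.0 / n)

def reduce_k_coverage(n, family, k):
    """Build the fair coverage instance: T = k and I_t = family for every t."""
    return n, k, [list(family) for _ in range(k)]

def optimal_nsw(n, T, families):
    """Brute force over all tuples (F_1, ..., F_T) with F_t drawn from I_t."""
    return max(nsw(n, sol) for sol in product(*families))

# Hypothetical YES instance: n = 4, tau = 2, k = n / tau = 2.
family = [frozenset({1, 2}), frozenset({3, 4}), frozenset({1, 3})]
n, T, families = reduce_k_coverage(4, family, 2)
print(optimal_nsw(n, T, families))
```

In this toy instance the subsets {1, 2} and {3, 4} cover every agent exactly once, so each coverage value equals 2 and the optimal NSW matches the lower bound of 2 from the YES case.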

Now, we show that in the NO case the optimal NSW is at most c, for an absolute constant \(c<2\). Here, consider any solution \(\mathcal {F}= (F_1,\ldots ,F_{T})\) in the constructed fair coverage instance. We have \(T = k = \frac{n}{\tau }\) and, by construction, \(F_t \in \mathcal {S}\). Furthermore, given that we are in the NO case, the collection of subsets \(\{F_1, F_2, \ldots , F_T\} \subseteq \mathcal {S}\) covers at most \((1-\frac{1}{e})n\) elements. Let L denote the set of agents not covered by the subsets \(F_t\)-s and write \(\ell {:}{=}|L| \ge \frac{n}{e}\). Since each agent \(i \in L\) is not covered under \(\mathcal {F}\), we have \(v_i(\mathcal {F})= 1\) for all \(i \in L\). Furthermore, note that the agents in the set \(L^c {:}{=}[n] \setminus L\) are covered by the \(T = k = \frac{n}{\tau }\) subsets \(F_1, \ldots , F_T\), and each of these subsets is of size \(\tau \). Therefore,

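$$\begin{aligned} \sum \limits _{j \in L^c} v_j(\mathcal {F}) = |L^c| + \sum \limits _{t=1}^{T} |F_t| = (n - \ell ) + \frac{n}{\tau } \cdot \tau = 2n - \ell . \end{aligned}$$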

Hence, the average coverage value among agents in \(L^c\) satisfies \(\frac{1}{|L^c|} \sum _{j \in L^c} v_j(\mathcal {F}) = \frac{2n - \ell }{n - \ell }\). This bound and the AM-GM inequality give us \(\prod \limits _{j \in L^c} v_j(\mathcal {F}) \le \left( \frac{2n - \ell }{n - \ell } \right) ^{|L^c|}\). Therefore, we can bound the Nash social welfare of \(\mathcal {F}\) as follows

$$\begin{aligned} \textrm{NSW}(\mathcal {F}) = \left( \prod \limits _{i \in L} v_i(\mathcal {F}) \ \prod \limits _{j \in L^c} v_j(\mathcal {F}) \right) ^{\frac{1}{n}} \le 1^{\frac{\ell }{n}} \ \left( \frac{2n-\ell }{n-\ell }\right) ^{\frac{n-\ell }{n}} = \left( \frac{2 -\ell /n}{1-\ell /n}\right) ^{\left( 1-\frac{\ell }{n}\right) } \end{aligned}$$
(7)

Note that the function \(f(x) {:}{=}\left( \frac{2-x}{1-x} \right) ^{(1-x)}\) is decreasing in the interval \(x \in \left[ \frac{1}{e}, 1 \right) \). Hence, using the fact that \(\ell \ge \frac{n}{e}\) and inequality (7), we get

$$\begin{aligned} \textrm{NSW}(\mathcal {F}) \le \left( \frac{2- 1/e}{1- 1/e}\right) ^{1-\frac{1}{e}} \le 1.83 \end{aligned}$$
(8)

Since, in the NO case, inequality (8) holds for all solutions \(\mathcal {F}\), we get that the optimal NSW is at most 1.83.

Overall, we get that in the YES case the optimal NSW is at least 2 and in the NO case it is at most 1.83. This multiplicative gap of \(\frac{2}{1.83} > 1.092\) implies that a 1.092-approximation algorithm for NSW maximization can be used to distinguish between the two cases. Since this differentiation is NP-hard, a 1.092-approximation is NP-hard as well. The theorem stands proved.
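The constants in the proof can be verified numerically. The sketch below evaluates the function f from the NO case at \(x = \frac{1}{e}\), checks the gap between the two cases, and tests monotonicity on a finite grid. The grid check is a sanity check under floating-point arithmetic, not a proof that f is decreasing.

```python
import math

def f(x: float) -> float:
    """f(x) = ((2 - x) / (1 - x)) ** (1 - x), the NO-case upper bound on NSW."""
    return ((2 - x) / (1 - x)) ** (1 - x)

no_case_bound = f(1 / math.e)
assert 1.82 < no_case_bound <= 1.83     # inequality (8)
assert 2 / 1.83 > 1.092                 # multiplicative gap between YES and NO cases

# Grid check that f decreases on [1/e, 1).
lo = 1 / math.e
xs = [lo + j * (1 - lo) / 1000 for j in range(1000)]
assert all(f(a) > f(b) for a, b in zip(xs, xs[1:]))
```

The value at \(\frac{1}{e}\) is close to 1.82, so rounding up to 1.83 leaves a small margin in the final inapproximability factor.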

5 Conclusion and Future Work

The current paper extends the scope of coverage problems from combinatorial optimization to fair division. In this setting, we develop algorithmic and hardness results for maximizing the Nash social welfare. The coverage framework considered in this work accommodates expressive combinatorial constraints and, hence, it models a range of applications. The framework also generalizes public decision making among agents that have binary additive valuations.

It would be interesting to extend the coverage framework to settings in which each agent i has value \(v^t_i\) for getting covered by the tth selected subset and her valuation is additive across the T selections. An online version of fair coverage is another interesting direction for future work.