Keywords

1 Introduction

Voting problems, which form a core topic in the field of artificial intelligence and computational social choice [8, 13], have great significance in serving as a tool to aggregate conflicting preferences [8, 13] and receive a considerable amount of attention [28]. The axiomatic properties as well as algorithmic and computational aspects of voting problems have been extensively studied [2, 9], where voters express their preferences over all candidates and the goal is to compute winners according to voting rules. Herein, the most widely used and straightforward voting rules are based on Approval Voting (AV), which is originally defined for dichotomous votes, where each vote assigns an approval to each of her favorite candidates and all other candidates receive disapproval. The winner set consists of those candidates who receive the most approvals.

AV has many desirable properties in single-winner case including simplicity, monotonicity and robustness against manipulation [5, 17]. But it becomes less favorable for the case of multiple winners and the most significant drawback is the lack of egalitarian [20]. Attempting to address fairness when using AV for multi-winner voting, some variants of AV have been introduced in the literature [20]. Among them are Proportional Approval Voting (PAV) [27], Satisfaction Approval Voting (SAV) [7], Approval Chamberlin-Courant Voting (CCAV) [12, 27], and Minimax approval voting (MAV) [6]. The AV, SAV, PAV, and CCAV use a score to represent each vote’s satisfaction with respect to the committee, and MAV use a score to represent each vote’s dissatisfaction with respect to the committee. The goal is to select a committee, which maximizes the sum of all votes’ satisfaction scores for AV, SAV, PAV, and CCAV, or minimizes the maximum of all votes’ dissatisfaction scores for MAV. Among the rules, only AV and SAV are polynomial-time solvable the others are NP-hard [2, 22, 25].

Most of previous researches of committee elections consider each voter as an individual, who is independent of other voters and there is no relation among the voters. However, we often have the scenario in real-world applications, where every voter belongs to a group due to some attributes, which can be formulated as committee election problems with vote attributes. Below we describe several scenarios.

Student Board Election. The first example is the student board election of a university. Here, students are voters and the schools or departments they belong to define an attribute of the voters.

International Sports Election. The second one is people from different countries try to elect a sort of place to hold an international sports event, where each voter has a natural attribute that the country he comes from.

Favorite Singers and Best Film. In some TV shows, like The Singer, audiences are asked to vote for TOP3 favorite singers and the audiences from different ages may like different kind of singers. Similarly, for the election of the best film of the year with votes being given by different film websites, voters from the same website may have the same taste.

Under these scenarios, voters are partitioned into different groups according to these attributes. The groups might admit significantly different sizes. In the first example, a medical school normally has thousands of students, while less than a hundred students are enrolled in a school of sport sciences. Obviously, the rules for committee elections without voter attributes are not appropriate for this application. For example, if PAV is applied, then the opinion of the sport sciences students might be completely ignored.

Some researches considered the case where the candidates are defined by some attributes and the goal is to select a committee that for each attribute offers a certain representation [4, 10, 11, 18, 21]. While, there are few researches consider the case of committee elections, where the voters are associated with some attributes. In addition, a lot of axioms studied these years only focus on the welfare of large groups, the welfare of a group with few voters always be neglected [1, 16]. For instance, Justified Representation (JR) is introduced to make sure that if a large enough group of voters exhibits agreement by supporting the same candidate, then at least one voter in this group has an approved candidate in the winning committee.

Therefore, new voting rules are needed for this setting, which takes the following consideration into account: (1) Even small groups have the right to be represented in the committee. (2) Candidates approved by a large group have no less chance to be members of the committee than that approved by a small group. We formally define the two axioms in Sect. 2, namely, Small group benefited representation (SGBR) and Large group benefited representation (LGBR). Based on the above two goals, we propose three models, namely, Group Representative-\(\sigma \)-\(\pi \) (GR-\(\sigma \)-\(\pi \)), Group Average-\(\pi \) (GA-\(\pi \)), and Group Egalitarian-\(\pi \) (GE-\(\pi \)).

GR-\(\sigma \)-\(\pi \). With \(\pi \in \{\text {AV, SAV, PAV, CCAV, MAV}\}\), this voting rule can be thought as having two rounds. To be more specific, we first use an internal election rule \(\sigma \) to select or construct a set of votes for each group based on the votes in the group, which will serve as that group’s representation, called the representative votes of the group. Then, based on the representative votes of all groups, we use a voting rule \(\pi \) to select a winning committee.

GA-\(\pi \). With \(\pi \in \{\text {SAV, PAV, CCAV, MAV}\}\), this rule uses the average satisfaction of the votes in a group as the satisfaction of that group. More precisely, use the average \(\pi \)-score of the votes in a group as the \(\pi \)-score of that group. Then, find a committee maximizes the minimum score of all groups for \(\pi \in \{\text {SAV, PAV, CCAV}\}\), or a committee minimizes the maximum score of all groups.

GE-\(\pi \). Despite having rules that satisfy at least one of the LGBR and SGBR as we will show in Sect. 3, GR-\(\sigma \)-\(\pi \) and GA-\(\pi \) both have shortcomings. In GR-\(\sigma \)-\(\pi \), the satisfaction of minorities in a group is ignored; more specifically, a group member whose approved candidates differ from the majority’s of the group may not be taken into account. In GA-\(\pi \) with \(\pi \in \{\text {PAV, CCAV}\}\), if a group contains some individuals whose choices for candidates differ from those of the majority, the group’s score is likely to be lower. Here, we design two voting rules to overcome these drawbacks, called Group Egalitarian-PAV (GE-PAV) and Group Egalitarian-CCAV (GE-CCAV), such that a minority’s opinion does not lead to a decrease in the group’s score, and furthermore, an increase in the group’s score if the candidates approved by minorities are members of the winning committee.

Related Work. The goal of selecting a subset of candidates with different attributes under fairness constraints has recently been the focus of a lot of research [4, 10, 11, 18, 21]. Fairness constraints are typically captured by absolute upper bounds and/or lower bounds on the number of selected candidates in specific attributes, or proportional representative of selected candidates in specific attribute. In contrast to our setup where the groups of voters are provided in the input, Talmon [26] and Faliszewski and Talmon [15] studied the question of how to partition the votes into disjoint groups. A vertex-labeled graph with each vertex representing a vote is provided as input; the task is to divide the graph into disjoint groups and assign a member of the committee to each group, so that each vote is represented by one of her preferred alternatives. The axioms that concern the fairness of groups of votes also been studied these years, namely, Justified Representation (JR) [1], Extended Justified Representation (EJR) [1] and Proportional Justified Representation (PJR) [16]. They concentrate on the scenario where a number of voters supporting the same candidates form a group, and at least a certain number of voters in this group has an approved candidate in the winning committee. While, the welfare of a group with few voters might be neglected. For instance, PAV satisfies all the axioms but fails to fulfil the SGBR, the small group benefited representation.

2 Preliminaries

In this section, we introduce the definitions and notations used in our models for the committee elections with grouped voters. A committee election with grouped voters can be denoted as \(E=(C,V,\zeta )\), where \(C=\{c_1,c_2,\dots ,c_m\}\) is the set of the candidates, \(V=\{v_1,v_2,\dots ,v_n\}\) is a list of voters represented by their votes, and \(\zeta =\{G_1,G_2,\dots ,G_\ell \}\) denotes the set of groups with \(G_i\bigcap G_j=\emptyset \) and \(\bigcup _{i=1}^\ell G_i = V\). In this paper, we interchangeably use the terms vote and voter. The number of the votes in a certain group \(G_i \in \zeta \) is denoted as \(|G_i|=|\{v \mid v \in G_i\}|\). We focus on approval votes, where an approval vote \(v_i \in V\) can be considered as a \(\{0,1\}\)-vector of length m. The x-th position of \(v_i\) is denoted as \(v_i[x]\) with \(v_i[x]\in \{0,1\}\), where \(v_i[x]=1\ (\text {or}\ 0)\) means that the candidate \(c_x\) is approved (or disapproved) by \(v_i\). Given a vote \(v \in V\) and a subset of candidates \(C^\prime \subseteq C\), we let \(v\cap C^\prime \) denote the set of candidates approved by v in \(C^\prime \). Let k be a non-negative integer. A k-committee is a k-size subset of candidates. A k-committee selection rule maps each election (CV) and every non-negative integer k with \(k \le | C |\) to a collection of k-committees of C with the winning k-committees of (CV) having the optimal scores under this rule.

2.1 Approval Voting Rules

We first introduce some important approval-based multi-winner voting rules, namely, AV, SAV, PAV, CCAV, and MAV. With respect to each rule, each k-subset of C receives a score and the winning k-committees are those with the desired score.

Under AV, the score of a candidate \(c \in C\), denoted as \(\text {AV}(c)\), is the number of votes approving c. Given a subset \(C^\prime \subseteq C\), AV\((C^\prime ) = \sum _{c \in C^\prime }\text {AV}(c)\). Under the SAV, PAV, CCAV, and MAV, given a subset \(C^\prime \subseteq C\) and a vote \(v \in V\), the scores with respect to \(C^\prime \) and v are set as follows. \(\text {SAV}(v,C^\prime ) = \frac{|v \cap C^\prime |}{|v|}\). \(\text {PAV}(v,W) = 1 + \frac{1}{2} + \cdots + \frac{1}{|v \cap C^\prime |}\). If \(v \cap C^\prime \ne \emptyset \), \(\text {CCAV}(v,C^\prime ) = 1\); otherwise, \(\text {CCAV}(v,C^\prime ) = 0\). \(\text {SAV}(v,W) = \mathcal {H}(v,C^\prime )\) with \(\mathcal {H}\) being the Hamming distance between v and \(C^\prime \), that is, \(\mathcal {H}(v,C^\prime ) = |C_v \setminus C^\prime | + |C^\prime \setminus C_v|\) with \(C_v\) being the candidates approved by v.

Given a vote set \(V^\prime \in V\) and a candidate set \(C^\prime \in C\), the score with respect to \(V^\prime \) and \(C^\prime \) is set as \(\pi (V^\prime ,C^\prime )= \sum _{v \in V^\prime }\pi (v,C^\prime )\) with \(\pi \) being SAV, PAV, and CCAV. For MAV, we have \(\text {MAV}(V^\prime ,C^\prime ) = \max _{v \in V^\prime }\text {MAV}(v,C^\prime )\).

The \(\pi \)-Winner Determination \((\pi \text {-WD})\) problem is defined as: Input: given an election \(E = (C, V)\), a committee size k and a rational number d. Question: is there a committee \(W \subseteq C\) with \(|W| = k\) satisfying that, \(\text {AV}(W)\ge d\) with \(\pi =\text {AV}\), or \(\pi (V,W) \ge d\) with \(\pi \in \) {SAV, PAV, CCAV}, or \(\pi (V,W) \le d\) with \(\pi =\text {MAV}\).

2.2 Axioms

We introduce two axioms for multi-winner approval voting with grouped votes, namely, large group benefited representation and small group benefited representation. The first axiom captures the intuition that a large group deserves no fewer representatives than a small group. The second axiom considers that even a small group still deserve at least one representative if the committee size is no less than the number of groups.

Definition 1

Let \(E = (C,V,\zeta )\) be an election, k be an integer, W be a committee with \(| W | = k\):

  1. 1.

    W provides large group benefited representation (LGBR) for E if there do not exist two groups \(G_i,G_j \in \zeta \) with \(|G_i| > |G_j|\), such that \(|\bigcup _{v \in G_i}{v} \cap W| < |\bigcup _{v^\prime \in G_j}{v^\prime } \cap W|\).

  2. 2.

    W provides small group benefited representation (SGBR) for E if for each group \(G_i \in \zeta \), we have \((\bigcup _{v \in G_i}{v}) \cap W \ne \emptyset \).

We say that an approval-based voting rule satisfies LGBR (SGBR) if for each election E and target committee size k it outputs a committee providing LGBR (SGBR).

2.3 The Models

Given an election \(E=(C,V,\zeta )\) and an integer k of the target committee size, we define the score of a group \(G_i \in \zeta \) with respect to a k-size subset of C, denoted as \(C^\prime \), as follows.

Group Representative-\(\sigma \)-\(\pi \) (GR-\(\sigma \)-\(\pi \)). We set GR-\(\sigma \)-\(\pi (G_i,C^\prime )=\pi (V^i,C^\prime )\) with \(V^i\) being the set of representative votes of \(G_i\).

Group Average-\(\pi \) (GA-\(\pi \)). We use the average \(\pi \)-score of the votes in a group as the \(\pi \)-score of that group, that is, GA-\(\pi (G_i,C^\prime ) = \frac{\sum _{v \in G_i}\pi (v,C^\prime )}{|G_i|}\).

Group Egalitarian-\(\pi \) (GE-\(\pi \)). \(\text {GE-PAV}(G_i,W)=1+\frac{1}{2}+\frac{1}{3}+\cdots +\frac{1}{\sum _{v \in G_i}|v \cap W|}\). In GE-CCAV, the input contains a set of integer \(\{t_1,\cdots ,t_{\ell }\}\). \(\text {GE-CCAV}(G_i,W)=1+\frac{1}{2}+\frac{1}{3}+\cdots +\frac{1}{|\{v| v \in G_i, \mid v \cap W| \ge t_i\}|}.\)

We now have all tools to define the problem of this paper, called Winner Determination for \(\tau \) (\(\tau \)-WD), where \(\tau \in \){GR-\(\sigma \)-\(\pi _1\), GE-\(\pi _2\)} with \(\pi _1 = \{\text {AV, SAV, PAV, CCAV, MAV}\}\) and \(\pi _2 = \{\text {PAV, CCAV}\}\).

Winner Determination for \(\tau \) (\(\tau \)-WD)

Input: An election \(E = (C, V, \zeta )\), a positive integer k, and a positive rational number d, and a set of integer \(\{t_1,\cdots ,t_{|\zeta |}\}\) for GE-CCAV.

Question: Is there a k-size subset \(W \subseteq C\) satisfying \(\max _{G \in \zeta }\tau (G, W) \le d\) for \(\pi \) being MAV, or \(\sum _{G \in \zeta }\tau (G, W) \ge d\) for \(\pi \) being others?

GA-\(\pi _3\)-WD with \(\pi _3 = \{\text {SAV, PAV, CCAV, MAV}\}\) can be defined similarly by replacing the question as: Is there a k-size subset \(W \subseteq C\) satisfying that (1) for \(\pi \) being MAV, \(\max _{G \in \zeta }\tau (G, W) \le d\), or (2) \(\min _{G \in \zeta }\tau (G, W) \ge d\) for \(\pi \) being others?

In this paper, we consider the following parameters: \(m =~|C|\), \(n = |V|\), \(\ell = |\zeta |\), k, the maximal size of groups \(\max \limits _{i} \left| G_i\right| \), and d (called the total satisfaction bound).

3 Large/Small Group Benefited Representation

GR-\(\sigma \)-\(\pi \). We first show that, no matter what \(\sigma \) is, it might be a bad idea to allow each group to have multiple representative votes.

Theorem 1

(*) (1) If there exists a group having more than one representative vote, then GR-\(\sigma \)-\(\pi \) does not satisfy SGBR even with \(\ell =2\), where \(\pi \in \) {AV, SAV, PAV, CCAV, MAV}. (2) If groups can have more than one representative vote, then GR-\(\sigma \)-\(\pi \) does not satisfy LGBR even with \(\ell =3\), where \(\pi \in \) {AV, SAV, PAV, CCAV, MAV}.

The theorem above implies that if we use \(\pi \) to handle the grouped voters case without making any changes, then \(\pi \) does not satisfy SGBR and LGBR with \(\pi \in \{\text {AV, SAV, PAV, CCAV, MAV}\}\). Here, we only need to let all votes in a group be the representative votes of the group, which can be seen as an internal election rule \(\sigma \).

Corollary 1

AV, SAV, PAV, CCAV, and MAV do not satisfy both SGBR and LGBR.

We then look into the possibility that each group only has one representative vote. In this situation, some voting rules can fulfil SGBR even while none of them satisfies LGBR.

Theorem 2

(*) If each group has exactly one representative vote, then (1) GR-\(\sigma \)-\(\pi \) does not satisfy SGBR even with \(\ell =3\), where \(\pi \in \) {AV, SAV}. (2) GR-\(\sigma \)-\(\pi \) satisfies SGBR with \(\pi \in \) {PAV, CCAV}. (3) if \(|v^{G_i}|=|v^{G_j}|\) with \(G_i,G_j \in \zeta \), GR-\(\sigma \)-MAV satisfies SGBR. (4) if \(|v^{G_i}|>|v^{G_j}|\) with \(|G_i| > |G_j|\) for all \(G_i,G_j \in \zeta \), GR-\(\sigma \)-MAV does not satisfy SGBR even with \(\ell =2\).

It is easy to show that GR-\(\sigma \)-\(\pi \) does not satisfy LGBR, since each group have exactly one representative vote. As a result, the voting rule cannot make use of the information about the sizes of the groups. Indeed, GR-\(\sigma \)-\(\pi \) fails to fulfill LGBR even if we allow \(|v^{G_i}| > |v^{G_j}|\) if \(|G_i|>|G_j|\) with \(\pi \in \) {AV, SAV, PAV, CCAV}., where \(G_i,G_j \in \zeta \), \(v^{G_i},v^{G_j}\) are the representative votes of \(G_i,G_j\), and \(|v^{G_i}|\)(\(|v^{G_j}|\)) is the number of candidates approved by \(v^{G_i}(v^{G_j})\).

Theorem 3

(*) If each group has exactly one representative vote, then (1) GR-\(\sigma \)-\(\pi \) does not satisfy LGBR even with \(\ell =3\), where \(\pi \in \) {AV, SAV, PAV, CCAV}. (2) if \(|v^{G_i}|=|v^{G_j}|\) with \(G_i,G_j \in \zeta \), GR-\(\sigma \)-MAV does not satisfy LGBR even with \(\ell =2\). (3) if \(|v^{G_i}|>|v^{G_j}|\) with \(|G_i| > |G_j|\) for all \(G_i,G_j \in \zeta \), GR-\(\sigma \)-MAV satisfies LGBR.

It can be seen that even if there is only one representative vote per group, most of the voting rules studied in this subsection fail to satisfy both LGBR and SGBR. Finding a voting rule that fulfills both axioms in this setting is therefore important. To address this issue, we define a voting rule Group-based Generalized Approval Voting (GGAV) as follows, which can be seen as generalization of GAV [19]. For each group \(G_i \in \zeta \), there is a score-vector \(w_{i} = \{a_i^1, \cdots , a_i^m\}\). The score of a subset of C, denoted as W, with respect to \(G_i\) and \(\zeta \) are defined as \(\text {GGAV}(G_i,W)= \sum _{j=1}^{|v^{G_i} \cap W|} a_i^j\), and \(\text {GGAV}(\zeta , W) = \sum _{G_i \in \zeta }\text {GGAV}(G_i,W)\). By carefully designing each vector, we can make GGAV satisfy LGBR and SGBR. Given a set of groups \(\zeta = \{G_1, \cdots , G_{\ell }\}\) with \(|G_p| \ge |G_q|\) if \(p > q\), we set \(a_i^j\) with \(1 \le i \le \ell \) as follows. (1) \(a_i^k = \ell -i+1\); (2) \(a_i^j = 0, \text {for }k < j \le m\). (3) \(a_i^j = \sum _{j+1 < \beta \le k}{\sum _{1 \le \alpha \le \ell }{a_{\alpha }^ {\beta }}} + \sum _{i+1 < \gamma \le \ell }{a_{\gamma }^j}, \text {for } 1 \le j < k\). In other word, if we denote \(a_i^j\) as \(b_{(k-j)\times k + (n-i+1)}\), then \(b_p = \sum _{1\le q < p}{b_q}\) with \(p > \ell \). By doing this, We say the score-vectors are set to grouped setting. It is easy to see that, GR-\(\sigma \)-PAV is a special case of GR-\(\sigma \)-GGAV by setting \(w_i\) as \(w_i=\{1, \frac{1}{2}, \frac{1}{3}, \cdots , \frac{1}{m}\}\) for each group \(G_i \in \zeta \). Therefore, we do not study the parameterized complexity of GGAV in Sect. 4, since all the hardness results of GR-\(\sigma \)-PAV hold for GGAV.

Theorem 4

(*) If each group has exactly one representative vote, GR-\(\sigma \)-GGAV satisfies both LGBR and SGBR with score-vectors being set to grouped setting.

Theorem 5

(*) (1) GA-\(\pi \) satisfies SGBR with \(\pi \in \) {SAV, PAV, CCAV}. (2) GA-MAV does not satisfy SGBR. (3) GA-\(\pi \) does not satisfy LGBR even with \(\ell =2\), where \(\pi \in \) { SAV, PAV, CCAV, MAV}.

Unfortunately, even GE-PAV and GE-CCAV can overcome the shortcomings of GR-\(\sigma \)-\(\Pi \) and GA-\(\Pi \), neither of them satisfies the LGBR and SGBR.

Theorem 6

(*) GE-PAV and GE-CCAV do not satisfy the LGBR and SGBR.

4 Parameterized Complexity

In this section, we demonstrate the parameterized complexity results of GR-\(\sigma \)-\(\pi \)-WD, GA-\(\pi \)-WD, and GE-\(\pi \)-WD with parameter being \(n,m,k,\ell ,\max _i|G_i|\), and d. With \(\sigma \) being AV, SAV, or t-Count, it is obvious that GR-\(\sigma \)-AV-WD, GR-\(\sigma \)-SAV-WD, and GA-SAV-WD can be solved in polynomial time. All other problems are NP-hard since PAV-WD, CCAV-WD, and MAV-WD are NP-hard even in the non-grouped setting. Unsurprisingly, parameterized complexity of GR-\(\sigma \)-\(\pi \)-WD with \(\pi \in \{\text {PAV, CCAV, MAV}\}\) is quite similar. Specifically, for a certain parameter, if GR-\(\sigma \)-PAV is W-hard, then GR-\(\sigma \)-CCAV and GR-\(\sigma \)-MAV are also W-hard; likewise, if GR-\(\sigma \)-PAV is FPT, GR-\(\sigma \)-CCAV and GR-\(\sigma \)-MAV are also FPT. To GA-\(\pi \)-WD, the same thing took place. Therefore, instead of displaying the parameterized complexity of all models of GR-\(\sigma \)-\(\pi \)-WD and GA-\(\pi \)-WD, we choose GR-(t-Count)-PAV-WD and GA-MAV-WD to serve as exemplars of R-\(\sigma \)-\(\pi \)-WD and GA-\(\pi \)-WD and examine their parameterized complexity. In addition, we also investigate the parameterized complexity of GE-PAV-WD and GE-CCAV-WD. See Table 1 for the summary of the results.

Table 1. Summary of the results. The results with m as parameter are trivial (by trying all size-k subsets of candidates in \(O^{*}(2^m)\) time). Here, \(m =~|C|\), \(n = |V|\), \(\ell = |\zeta |\), k is the size of committee, \(\max \limits _{i} \left| G_i\right| \) is the maximal size of groups, and d is the total satisfaction bound.

Theorem 7

(*) (1) GE-CCAV-WD is FPT with respect to n.

(2) Even with only one group, GE-CCAV-WD is NP-hard and W[2]-hard with k as parameter.

(3) GE-CCAV-WD is NP-hard even when \(\max \limits _{i} \left| G_i\right| =1\).

(4) GE-CCAV-WD is W[1]-hard with d as parameter.

(5) GE-PAV-WD is FPT with respect to n.

(6) GE-PAV-WD and GR-(t-Count)-PAV-WD are W[1]-hard with k as parameter.

(7) GR-(t-Count)-PAV-WD is FPT with respect to \(\ell \) or n. (8) GA-MAV-WD is NP-hard even if \(\max \limits _{i} \left| G_i\right| =1\).

Theorem 8

GA-MAV-WD is FPT with n as parameter.

Proof

In the following, we use the tool of integer linear program (ILP) to prove the theorem.

We can think of all votes as an \(n\times m\) matrix M with binary values. From the column perspective, there are m columns which can be considered as a collection of n-dimensional vectors. We call two columns identical, if both columns contain the same value at each position. The set of pairwise identical columns is called a column type. Clearly, there are at most \(2^n\) different column types. Moreover, let T denote the set of different column types, and for each type \(t\in T\), let \(n_t\) denote the number of columns of type t in the input. Additionally, let \(\Sigma =\{0,1\}\). The ILP can be formulated as follows. It contains \(2\times {2^n}\) variables \(x{_t,_\varphi }\), where t denotes a column type and \(\varphi \in \Sigma \). The value of \(x{_t,_\varphi }\) denotes the number of columns of type t whose corresponding positions in the winning committee W are set to be \(\varphi \). We use \(\varphi _{t,i,j}\) to denote the value of the vote \(v_i^j\) at the positions corresponding to columns of type t. Considering that the goal of GA-MAV-WD is to minimize the maximum score among all groups, we aim to minimize

$$ \max _{1 \le i \le \ell }\frac{{\sum _{1 \le j \le \left| G_i\right| }\sum _{t \in T}\sum _{\varphi \in (\Sigma \backslash \{\varphi _{t,i,j}\})}x_{t,\varphi }}}{\left| G_i\right| } ,$$

where \(\ell \) denotes the number of groups and the number of votes in a certain group \(G_i\) is represented by \(\left| G_i \right| \). Notice that the above objective function can be replaced by the following constraint by introducing the maximum distance d:

$$ {\sum _{1 \le j \le \left| G_r\right| }\sum _{t \in T}\sum _{\varphi \in (\Sigma \backslash \{\varphi _{t,i,j}\})}x_{t,\varphi }}\le d \times \left| G_i\right| ,\ \forall 1\le i \le \ell $$

Doing so, we arrive at an ILP without objective function. In addition, we add the following constraints:

$$ \sum _{\varphi \in \Sigma }x_{t,\varphi }=n_t,\ \forall t \in T, $$

which means that each column is assigned a value of 0 or 1 in the corresponding position of W, determining whether the corresponding candidate is selected in W or not. All variables \(x_{t,\varphi }\) must be non-negative integers, that is, \(x_{t,\varphi }\in \{0,1,2,\)...\(,n_t\},\forall t \in T\) and \(\forall \varphi \in \{0,1\}\). We need an equation constraint to make sure that the winning committee contains exactly k candidates: \(\sum _{t\in T}x_{t,1}=k.\)

If there is a solution for the above ILP instance, then we can construct a size-k committee W by adding \(x_{t,1}\) many candidates whose corresponding column type is t to W. Thus, we can give an ILP formulation for GA-MAV-WD, where the number of variables depends solely on the parameter value n, the total number of the votes. It is easy to verify that the above ILP has a solution if and only if the GA-MAV-WD instance has a solution. The theorem follows from the result of Lenstra’s [23]. \(\square \)

Theorem 9

GA-MAV-WD is W[2]-hard with respect to parameters k and d.

Proof

We prove the theorem by reducing Dominating Set to GA-MAV-WD. The Dominating Set problem is defined as below. Input: A non-negative integer \(k'\) and an undirected graph \(G'=(V',E')\) with \(\left| V'\right| = n'\) and \(\left| E'\right| = m'\). Question: Is there a subset \(S \subseteq V'\) with \(k'\) vertices such that every vertex \(v\in V'\) is contained in S or has at least one neighbor in S?

Dominating Set is W\([\)2\(]\)-hard with respect to \(k'\) [14]. Without loss of generality, we assume each vertex in \(G'\) has a degree at least \(k'\), since we can add a \(k^\prime \)-clique to \(G'\) if there is a vertex \(v^\prime \in G^\prime \), whose degree is less than \(k^\prime \), then make \(v^\prime \) be adjacent to each vertex of the added \(k^\prime \)-clique. Given an instance \((G',k')\) of Dominating Set, we construct an instance \(F((C,V,\zeta ),k,d)\) as follows. We create a candidate for each vertex in the graph \(G'\). We call these candidates as “real candidates”, denoted as \(c_1,c_2,\dots ,c_m\) with \(m=n'\). In addition, we construct \(3n'\) “dummy candidate” denoted as: \(c'_{1},c'_{2},\dots ,c'_{3n'}\). There are \(4n'\) candidates in total. For each vertex \(v'_i\in V'\), we construct a group \(G_i\), where the votes one-to-one correspond to the edges incident to \(v_i\). In addition, we add a “special” vote to each group. Therefore, the total number \(\left| G_i\right| \) of votes in group \(G_i\) is \({ deg}(v'_i)+1\), where \({ deg}(v'_i)\) denotes the degree of vertex \(v'_i\), that is, the number of edges incident to \(v'_i\). Observe that each edge in \(G'\) corresponds to two votes, because it has two endpoints. The total number n of votes is \(2m'+n'\). For a vote \(v_j\) in group \(G_i\) constructed for the edge \(e_j=\{v_r',v_s'\}\), it solely approves the two real candidates \(c_r\) and \(c_s\), who correspond to the endpoints \(v_r'\) and \(v_s'\), but disapproves of all other candidates. We can consider this vote as a vector with two positions of value 1 and \(4n'-2\) positions of value 0. For the special vote in group \(G_i\), it only approves three dummy candidates, \(c'_{(i-1) \times 3+1},c'_{(i-1) \times 3+2}\) and \(c'_{(i-1) \times 3+3}\) but disapproves all other candidates. In fact, each dummy candidate is approved only once, because each special vote approves three distinct dummy candidates. By doing so, adding a dummy candidate to the committee is never better than adding a real candidate. We denote a committee as a \(\{0,1\}^{4n'}\) vector which has exactly \(k'\) 1’s in the real candidate part, so that the Hamming distance of each special vote to the winning committee is \(k'+3\). Then, let \(k=k'\) and \(d=k'+2\). We show the equivalence between the instances in the following. Here we omit the “\(\Rightarrow \)”.

\(\Leftarrow \)”: Assume that there is a solution of GA-MAV-WD, which means that there is a winning committee W of size \(k'\) satisfying GA-MAV\((G_i,W)\le d=k'+2\) with \(1\le i\le n'\). Let \(W=W_1 \bigcup W_2\), where \(W_1\) denotes the set of real candidates in W and \(W_2\) denotes the set of dummy candidates in W. We consider the following cases according to whether \(\left| W_2\right| =0\) or not.

Case 1\(:\) \(\left| W_2\right| =0\): Then, \(\left| W_1\right| =k'\). That is, all candidates in W are real candidates. Let S be the set of the vertices corresponding to the candidates in W. Since the score of each group is at most \(d=k'+2\) and each vote in the group approves either two real candidates or three dummy candidates, there is at least one vote in this group \(G_i\) whose Hamming distance to W is at most \(k'\), meaning that this vote approves at least one candidate in W and thus there is at least one neighbor of the corresponding vertex \(v'_i\) in S. This implies that the set S forms a dominating set.

Case 2\(:\) \(\left| W_2\right| \ne 0\): Let \(\left| W_2\right| =j\) and \(\left| W_1\right| = k'-j\). By the construction, each dummy candidate is approved exactly once by a special vote in a group. Thus, adding a dummy candidate to W can only decrease the score of one group. Then we can replace this dummy candidate by a real candidate, which is approved by other votes in this group but not in W. With the degree of each vertex in \(G'\) being at least \(k'\), we can conclude that such a real candidate exists. By doing so, we decrease \(\left| W_2\right| \) by one without increasing the score of any group. Repeating this replacing operation for all dummy candidates in \(W_2\), we arrive at another solution W, containing only real candidates, and Case 1 applies. In summary, a solution of GA-MAV-WD implies a dominating set in \(G'\). \(\square \)

5 Concluding Remarks

In this paper, we propose three models to deal with the case of approval-based committee elections with grouped voters. At the same time, we propose two axioms named Large group benefited representation and small group benefited representation, and investigate whether the proposed models satisfy the two axioms. We show that all models can hardly satisfy both axioms except the GGAV with the score-vectors being set to grouped setting. We show that all models are fixed-parameter tractable (FPT) when parameterized by the number n of votes, whereas they become fixed-parameter intractable when parameterized by the size k of the committee or d of the satisfaction bound.

We left four questions in Table 1 open, GE-PAV-WD, GR-(t-Count)-PAV-WD and GCMAV-WD with respect to d, GE-PAV-WD and GA-MAV-WD with the parameterization by \(\ell \). Thus, one future research goal is to resolve the parameterized complexity for them.

Inspired by the work of Baumeister and Dennisen [3], we propose the direction for future research to extend the voting models to other forms of votes, such as trichotomous votes, complete linear orders, and partial linear orders. Another task worthy of detailed study is the problem of coalitional manipulation in the case of committee elections with grouped voters [24].