1 Introduction

Since Gale and Shapley’s groundbreaking work (Gale and Shapley 1962), the use of stable matching mechanisms has proliferated across numerous domains. Applications range from matching children to schools to matching refugees to countries (Andersson 2019). A central goal of these mechanisms is to ensure that participants (or agents) have no incentive to try to manipulate the final outcome of the matching process by being strategic about the choices they make and actions they take. However, when deployed in practice, many of the assumptions behind the Gale-Shapley (deferred-acceptance) algorithm no longer hold. For example, agents may have partial preferences or ties (Drummond and Boutilier 2014; Rastegari et al 2013; Irving et al 2009), quotas imposed on matching outcomes (Goto et al 2016), distributional constraints (Kurata et al 2017), and computational constraints, for which compact representations of preferences are useful (e.g., Gelain et al 2009; Pini et al 2014).

One real-world domain where matching mechanisms are implemented is medical residencies (Roth 2002). In many countries, such as Canada and the United States, medical students are assigned to hospital residencies through matching mechanisms. For example, the National Residency Matching Program (NRMP), an American program for matching medical residents to hospitals, in 2023 offered 37,425 positions for first-year residents for 5487 hospital programs (National Resident Matching Program 2023). In this paper, we focus on a problem that arises in medical residency matching settings, where the number of options for residents is large. As noted above, in the NRMP residents need to choose from over 5000 positions, yet they apply to only eleven on average (Anderson et al 2000). Furthermore, residents are often faced with significant uncertainty regarding which hospital may be the best match for themselves. While there are publicly available rankings for hospitals, an individual’s preferences will also be influenced by specific, personal considerations (e.g., the personal chemistry with the people in the hospital). One way to address this uncertainty is to interview with a set of hospitals; this allows the resident to understand and refine their personal ranking between the possible hospitals. However, this requires the resident to choose the set of hospitals they will interview, based on the limited information they have.

The problem of selecting an appropriate interviewing set gives rise to new strategic concerns and widely used mechanisms—which are strategyproof when assuming every resident initially knows their full ranking of the hospitals—are no longer strategyproof (Haeringer and Klijn 2009; Calsamiglia et al 2010). This observation, that interview-set selection is a strategic decision, motivates our work. Inspired by the medical residencies matching problem, we analyze the Nash equilibrium strategies that arise when residents must select interview sets, knowing that a widely used matching mechanism, Resident-Proposing Deferred Acceptance (rp-da), will be used. In particular, we examine the possibility of assortative equilibria, in which residents are divided into groups, with each group interviewing in the same sets of hospitals.

We provide an example to help illustrate some of the strategic reasoning which arises when residents must select interview sets.

Example 1

Suppose we have 4 hospitals—\(h_{1},h_{2},h_{3},h_{4}\), and 4 residents—\(r_{1},r_{2},r_{3},r_{4}\). All hospitals know the residents’ quality (\(r_{1}\) being the best, followed by \(r_{2}\), then \(r_{3}\), and \(r_{4}\) is the worst), and every resident knows their position in the hospitals’ ranking. Suppose residents have two possible rankings of hospitals: with probability 0.5 a resident’s ranking of hospitals is \(h_{1}\succ h_{2}\succ h_{3}\succ h_{4}\), and with probability 0.5, it is \(h_{2}\succ h_{1}\succ h_{4}\succ h_{3}\). Assume that residents can only interview with at most 2 hospitals.

Residents \(r_{1}\) and \(r_{2}\) can choose to interview at \(h_{1}\) and \(h_{2}\), while residents \(r_{3}\) and \(r_{4}\) can choose to interview at hospitals \(h_{3}\) and \(h_{4}\). Such a choice is both assortative and stable—both \(r_{1}\) and \(r_{2}\) know they will never prefer \(h_{3}\) and \(h_{4}\) over the hospitals they interview with; and because of this, both \(r_{3}\) and \(r_{4}\) know hospitals \(h_{1}\) and \(h_{2}\) will surely be taken already by the time it is their turn to interview, so no point in interviewing there. In this case, there is no other equilibrium.

If the probability of any ordering is as likely as any other, then many other interviewing strategies are stable, including non-assortative ones. For example, \(r_{1}\) and \(r_{3}\) interviewing at \(h_{2}\) and \(h_{4}\) while \(r_{2}\) and \(r_{4}\) interview at \(h_{1}\) and \(h_{3}\).

1.1 Our contributions

Under the assumption that hospitals maintain a common master list over residents (e.g. GPAs, exam results, etc. (Hafalir 2008; Zhu 2014; Chen and Pereyra 2015; Ajayi 2011)), we explore the structure of Nash equilibrium when residents are required to select k hospitals with which to interview (and, thus, rank for the matching mechanism). We show the following:

  • A pure strategy Nash equilibrium exists for this game.

  • Using the Mallow’s model for sampling resident preferences, an assortative equilibrium exists for very small interviewing sets. This equilibrium is “natural” in that hospitals and residents are stratified: highly ranked residents interview at well-regarded hospitals; medium residents interview at medium hospitals; and low-ranked residents interview at low-ranked hospitals. Indeed, we initially believed this would be the natural equilibria in most cases (as Ajayi 2011 seemed to indicate).

  • However, in the Mallow’s model, if residents use larger interview sets, assortative strategies no longer form an equilibrium. That said, new equilibria appear, where residents select interview sets that contain both “reach” and “safety” alternatives.

We note that throughout the paper we use the terminology of hospitals and residents, but emphasize that this is merely for clarity and consistency. Our results hold for any setting in which one side cannot provide a full ranking of the other, and must decide how to focus its attention so as to learn more about particular alternatives. Such scenarios could include students interviewing at schools, universities, and recruiting faculty candidates, among others.

2 Related research

While there is a rich literature on matching markets and stability (e.g. Gusfield and Irving 1989), the importance and impact of interviews in these markets is not as well understood. In recent work (Echenique et al 2022), through analysis of the NRMP, observed that most doctors match with one of their most preferred internship programs, despite having very similar preferences. They argued that this apparent contradiction is an artifact of the interview process that precedes the match. These findings highlight the importance of better understanding market interactions, including interviews, that happen before (and after) the actual market. A similar lesson can be taken from Harless and Manjunath (2018) work that studied the impact of the allocation rule (e.g. matching mechanism) on the interviewing process, illustrating how these two steps influence each other. Indeed, several works on interviewing—using various models with varying similarity to ours—have examined the equilibria that arises when assuming the existence of interviews.

One thread of research has studied interviewing policies that aim to minimize the total number of interviews conducted while also ensuring stability in the final match. For example, Rastegari et al (2013) showed that while finding the minimal interviewing policy is NP-hard in general, there are special cases where a polynomial-time algorithm exists, while Drummond and Boutilier (2014) approached the problem using the framework of minimax regret and proposed heuristic approaches for interviewing policies. These papers assume that interview policies are implemented centrally, ignoring the situation where agents may choose with whom to interview, whereas our work explicitly studies strategic issues arising from situations where agents choose their interview strategies.

There is a body of literature that addresses strategic interviewing in matching markets, but many of the papers ask different questions or make different modelling assumptions than we do in our work. Manjunath and Morill (2023) studied the problem of “interview hoarding” where one side of the market (e.g. residents) can interview as many candidates as they wish, while the other side (e.g. hospitals) are limited in the number of interviews they may conduct. The authors show that this leads to problematic outcomes compared to settings where interviews are limited on both sides. In particular, no resident that would have been matched in the setting with limited interviews is better off in the unlimited setting and many residents are worse off. This is consistent with Kadam (2015) findings that relaxing residents’ interview constraints can adversely impact lower-ranked residents. These results support our modelling choice of restricting the size of agents’ interview sets. We note, however, that research has also shown that limiting the size of agents’ interview sets may have strategic implications. For example, in two related papers, Haeringer and Klijn (2009) and Calsamiglia et al (2010) show that limiting the number of interviews an agent may partake in can lead to less stability in the market and encourage agents to misreport their preferences, while He and Magnac (2017) showed, empirically, that imposing an interviewing cost may lead to decreased match quality. Because of these findings, we make no claim that we are using “optimal” interview set sizes, and we consider the question of the size of interview sets to be outside the scope of this paper.

Several other authors have also studied matching markets with limited/fixed interview sets (Immorlica and Mahdian 2005; Beyhaghi et al 2017; Beyhaghi and Tardos 2019). Unlike our work, these papers typically assume uncorrelated preferences (i.e. every hospital independently ranks residents idiosyncratically), allow for a fixed probability of selection (Immorlica and Mahdian 2005), or uniform preferences within subgroups of residents/hospitals (Beyhaghi et al 2017). We believe that these modelling assumptions are overly strong and empirical evidence indicates that preferences are correlated (e.g. Echenique et al 2022), which we try to capture in our preference models.

We are particularly interested in what we call “natural” interviewing equilibria. These equilibria are assortative in that top residents interview with (and are matched with) top hospitals while bottom-ranked residents interview with (and are matched with) bottom-ranked hospitals etc. Lee (2017) showed the existence of such equilibria in matching markets. Their results, however, relied on several strong assumptions including a large market assumption and strong restrictions on the preference models. We are interested in understanding whether it is possible to support assortative equilibria under broader assumptions. Other papers (e.g. Chade and Smith 2006; Chade et al 2014; Ali and Shorrer 2023), motivated by the college-selection problem, have also studied the structure of the resulting equilibria. While Chade and Smith (2006) showed that students would greedily select which colleges to interview with under the assumption that admissions prospects across colleges were stochastically independent, Ali and Shorrer (2023) argued that changes in the underlying model, namely allowing for correlations, results in equilibrium outcomes where students apply for both “reach” and “safety” colleges. While the problem we study is very different since we assume a centralized matching market while these papers study a decentralized process, we find these papers relevant and informative as they hint at the importance of correlated preferences for students/residents in whether assortative or “reach and safety” strategies form equilibria.

Finally, we mention the work of Lee and Schwarz (2017). They studied a multi-stage worker-firm game where one side of the market (e.g. firms) had to first, at some fixed cost, select workers with which to interview before proceeding to a centralized matching market running (firm-proposing) deferred acceptance. Their key finding was if there was no coordination, then all firms were best off each picking k random workers to interview. However, if firms could coordinate then it was best for them to each select k workers so that there was perfect overlap (forming a set of disconnected complete bipartite interviewing subgraphs), i.e., an assortative equilibirum. This finding, while very elegant, relies heavily on the assumption that all firms and workers are ex-ante homogeneous, with agents’ revealed preferences being idiosyncratic and independent. In particular, for the results to hold either agents have effectively no information about their preferences before they interview, or the market must be perfectly decomposable into homogeneous sub-markets that are known before the interviewing process starts In this paper, we study a similar, multi-stage game, but we relax these assumptions. Instead, we assume that agents’ preferences are correlated and make no assumptions about the decomposability of the market into sub-markets.Footnote 1 We are interested in understanding and characterizing the resulting equilibrium outcomes, providing insights into how sensitive so called “natural” equilibria are to the underlying preference structures of the agents.

3 An interviewing game with limited interviews

Using the resident-hospital matching problem as our basic framework, we assume there is a set, \(R=\{r_1,\ldots , r_n\}\) of residents and a set, \(H=\{h_1,\ldots , h_n\}\) of hospitals.Footnote 2 Every \(h_i\in H\) has (strict) preferences over R, and every \(r_i\in R\) has (strict) preferences over H. These are represented by \(H_\succ = \{\succ _{h_1}, \ldots , \succ _{h_n}\}\) and \(R_\succ =\{\succ _{r_1},\ldots , \succ _{r_n}\}\) respectively.

We are interested in one-to-one matchings; residents can only do their residency at a single hospital, and hospital programs can accept at most one resident. A matching is a 1–1 function \(\mu : R \cup H \rightarrow R \cup H\), such that \(\forall r \in R\), \(\mu (r) \in H\cup \{r\}\), and \(\forall h\in H\), \(\mu (h) \in R \cup \{h\}\). If \(\mu (r)=r\) or \(\mu (h)=h\) then we say that r or h is unmatched. We assume that residents prefer to be assigned to any hospital over not being matched, and hospitals prefer to have any resident over not filling the position. A matching \(\mu \) is stable if there does not exist some \((r,h) \in R\times H\), such that \(h \succ _r \mu (r)\) and \(r \succ _h \mu (h)\).

Critically, we assume the existence of a master list, \(\succ _{ML}\), over residents, which is shared by all \(h\in H\), such that \(\succ _{h}=\succ _{ML}\). This assumption implies that all hospitals share identical preferences over residents. This captures scenarios such as when grades or GPAs are used to rank residents, or when residents are required to write standardized exams (Irving et al 2008; Chen and Pereyra 2015; Zhu 2014)). Without loss of generality, \(r_i\succ _{ML}r_{i+1}\), \(\forall i< n\). We further assume that every \(r\in R\) is aware of their ranking on this master list.

Residents, on the other hand, have idiosyncratic preferences over hospitals. This may be based on, for example, location, potential colleagues, career opportunities for partners, etc. In particular, we assume there is some underlying, commonly known, preference distribution, \( D \), from which each \(r\in R\) draws \(\succ _r\) independently. If resident r draws preference ranking \(\eta \) from \( D \), then \(h_i \succ _{\eta } h_j\) means that \(h_i\) is preferred to \(h_j\) by r under \(\eta \).

Critical to our model is the assumption that residents do not initially know their true preferences, but refine their information by conducting interviews with hospitals. If a resident was able to interview every hospital in H then they would know their true preferences. However, this is infeasible and instead, each resident has an interviewing budget of \(k<n\) hospitals. Let \(I(r)\subseteq H\), \(|I(r)|\le k\) be resident r’s interview set, consisting of the set of hospitals r has selected to interview. Once the interviews are completed, r knows its preferences over \(h\in I(r)\), though does not necessarily have any additional information over \(h'\in H\setminus I(r)\).

Once all residents have interviewed with their selected hospitals, they enter into a matching process, using the preference information obtained through their interviews. In this paper, we use the standard resident-proposing deferred acceptance (rp-da). The resulting matching, \(\mu \), is guaranteed to be stable, resident-optimal, and hospital-pessimal (Gale and Shapley 1962). This stable matching is also guaranteed to be unique, as stable matching problems with master lists have unique stable solutions (Irving et al 2008). Thus our results directly hold for any mechanism that returns a stable matching, including hospital-proposing deferred acceptance and the greedy linear-time algorithm (Irving et al 2008).

We summarize our model and assumptions for the interviewing game in which residents engage. We call this game the Interviewing Game with Limited Quotas, or ILQ.

  • Each \(r\in R\) and \(h\in H\) is informed of the master list \(\succ _{ML}\).

  • Each resident \(r \in R\) simultaneously selects an interviewing set \(I(r)\subset H\), \(|I(r)|\le k\).

  • Each resident \(r\in R\) interviews with hospitals in I(r) and learn their own preference, \(\succ _{r|I(r)}\) over members of I(r).

  • A central matching system runs resident-proposing deferred acceptance (rp-da) using \(\succ _{ML}\) as the preference for all \(h\in H\), and \(\succ _{r|I(r)}\) for all \(r\in R\). Any hospital \(h\not \in I(r)\) is reported to be unacceptable by r.

3.1 Utility functions for the interviewing game

We require a clear specification of the residents’ utility functions, as this supports the choices they make when deciding which hospitals to interview. Thus, in this subsection, we describe how we derive principled utility functions for the residents, based on their preferences and the expected match.

We first assume that residents share some common scoring function, \(v:H\times H_\succ \mapsto \mathbb {R}\) such that for any ranking over hospitals, \(\eta \in H_\succ \), \(h_i\succ _\eta h_j\) if and only if \(v(h_i,\eta )>v(h_j, \eta )\). The existence of such a scoring function is used in other literature (e.g. Coles and Shorrer 2014) and allows for flexibility in the modelling of the problem. For now, we merely assume the existence of such a scoring function and will explore different instantiations later.

Second, we make the critical observation that a resident, \(r\in R\), need only be concerned about other residents that are higher ranked in the master list, \(\succ _{ML}\). If a lower ranked resident, \(r'\), is matching to some \(h\in I(r)\), then it must be the case the r is matched to some \(h'\) such that \(h'\succ _{r} h\) since otherwise the matching would be unstable.

This greatly simplifies the formulation of the utility function for a resident as we need only consider the interview sets of higher-ranked residents, its own choice with whom to interview, and the probability with which it has a particular preference ranking over hospitals.

We introduce notation to help support the development of the utility function for a resident. Consider some resident, \(r_j\), and interview sets, \(I(r_1),\ldots , I(r_{j-1})\), for all \(r_i\succ _{ML} r_j\). Furthermore, define \(m=\mu _{|r_1,\ldots , r_{j-1}}\) to be the partial matching that arises when rp-dc is run. The set of preferences that are consistent with this partial match is

$$\begin{aligned} T(r_j,m) =&\{\xi \in H_\succ | \exists I(r_{j}) {\text { s.t. }}\, \forall h' \in I(r_{j}) {\text { s.t. }}\, h' \succ _{\xi } m(r_j), \\&\exists r_a {\text { s.t. }}\, r_a \succ _{ML} r_j \wedge m(r_a) = h'\}. \end{aligned}$$

Observe that \(T(r,m)\not = \emptyset \) for all \(r\in R\), and that \(T(r_1,m)=H_\succ \).

Given some preference distribution D, the probability that some particular partial match, \(m'\), arises, given interview sets \(I(r_1),\ldots , I(r_{j-1})\) is simply the probability that the residents had preferences consistent with \(T(r_j,m'):\)

$$\begin{aligned} P(m'|(I(r_i))_{i=1}^{j-1} ) = \prod _{i=1}^{j-1} \sum _{\xi \in T(r_i,m)} P(\xi | D). \end{aligned}$$

Now resident \(r_j\) must determine the probability with which it will be matched to a particular hospital, h, since its utility is determined by how it perceives the program quality. We define

$$\begin{aligned} M^*(I(r_j), (I(r_i))_{i=1}^{j-1}, \eta , h) =&\{m | m(r_j) = h; \forall r_i\in \{r_1,\ldots ,r_{j-1} \}, m(r_i) \in I(r_i);\\&\forall x\! \in \!I(r_j), {\text { if }}\, x \succ _{\eta } h, \exists r_i {\text { s.t. }}\, x \!\in \!I(r_i) {\text { and }}\, m(r_i) \!=\! x\} \end{aligned}$$

to be the set of (partial) matches where \(r_j\) is matched to hospital h, given interview sets for residents \(r_1,\ldots , r_{j-1}\), interview set \(I(r_j)\) for resident \(r_j\) with preference ranking \(\eta \). Since the preference rankings of residents \(r_l\) such that \(r_j\succ _{ML} r_l\) do not change what hospital \(r_j\) is matched to, for any complete matching, \(\mu \), we have

$$\begin{aligned} P(\mu (r_j) = h | \eta , I(r_j), (I(r_i))_{i=1}^{j-1}) = \sum _{m\in M^*(I(r_j), (I(r_i))_{i=1}^{j-1}, \eta , h)} P(m'|(I(r_i))_{i=1}^{j-1} ). \end{aligned}$$

Bringing everything together, the utility function for resident \(r_j\), given its interview set, \(I(r_j)\) is

$$\begin{aligned} u_{r_j}(I(r_j)) = \sum _{h\in I(r_j)} \sum _{\eta \in H_\succ } v(h,\eta ) P(\eta |D) P(\mu (r_j) = h | \eta , I(r_j), (I(r_i))_{i=1}^{j-1}). \end{aligned}$$
(1)

This utility function has an intuitive interpretation: it weights the value of a hospital by how likely the resident will be matched to it, given the interview-set choices of “more desirable” residents.

4 Equilibria analysis: general results

We start our analysis by studying the most general form of the ILQgame possible. Recall that an ILQgame is defined as \(\Psi = \langle n, k, D, v\rangle \) where \(n=|R|=|H|\), k is the number of interviews any resident can conduct (also known as the quota), D is the underlying distribution from which residents’ preferences are being drawn, and v is the scoring function over hospitals that residents use. We start by placing no restrictions on the structure of the underlying preference rankings of the residents, nor do we place any constraints on their utility functions. Furthermore, to simplify notation we drop n from the ILQnotation unless it influences the results. We start by presenting an existence result, namely the existence of a pure strategy equilibrium for this game.Footnote 3 We follow this by outlining general conditions under which this equilibrium might take a particularly appealing form, namely assortative interviewing. We then instantiate the residents’ preference models using a common probabilistic model for preferences (the \(\phi \)-Mallows model) and explore how this class of preference rankings support assortative interviewing.

Theorem 1

Given any ILQgame \(\Psi = \langle k, D, v\rangle \) with \(k>0\), there exists a pure strategy equilibrium.

Proof

We wish to show that if every resident chooses their expected utility-maximizing interviewing set, this results in an equilibrium. Given any resident \(r_j\) who is jth in the hospitals’ rank- ordered list, \(r_j\)’s expected payoff function only depends on residents \(r_1,\ldots ,r_{j-1}\). As \(r_j\) knows that each other resident \(r_i\) is drawing from distribution \( D \) i.i.d., they can calculate \(r_1,\ldots ,r_{j-1}\)’s expected utility maximizing interview set, using Eq. 1. Their payoff function depends only on \( D \) and \(I(r_1),\ldots ,I(r_{j-1})\), all of which they now have. They then calculate the expected payoff for each \(n \atopwithdelims ()k\) potential interviewing sets, and interview with the one that maximizes their expected utility. Of course, when there are ties between the expected payoff of different strategies, multiple equilibria may arise. \(\square \)

Theorem 1 is an existence theorem. It does not provide any additional insight into the equilibrium behavior, nor does it provide any insight as to how this equilibria may be computed beyond a brute-force approach. This leads us to our next set of questions, namely under what conditions will a particular class of natural interviewing strategies form an equilibrium. We are particularly interested in assortative interviewing strategies.

Definition 1

Given ILQgame \(\Psi = \langle k, D, v\rangle \) with \(k>0\), an interviewing strategy profile is assortative if and only if for \(j = 0,1,2, \ldots , \frac{n}{k}-1\), each resident \(r \in \{r_{jk+1},\ldots ,r_{jk+k}\}\) chooses to interview with the set of k hospitals \(\{h_{jk+1},\ldots ,h_{jk+k}\}\).Footnote 4

We view assortative strategies as being “natural” in that hospitals and residents are stratified: highly-ranked residents on the master list interview at well-regarded hospitals; mid-ranked residents interview at what they expect to be mid-ranked hospitals; and low-ranked residents interview at low-ranked hospitals. We start by deriving conditions that ensure assortative interviewing. We show that there exist scenarios in which one need only focus on the behavior of a single agent, namely \(r_{k}\) where k is the interviewing budget. If assortative interviewing is a best response for resident \(r_k\) when all other residents \(i<k\) interview assortatively, then assortative interviewing is a best response for every resident \(r_i\) (\(i<k\)) when all other residents interview assortatively. In other words, determining if assortative interviewing is a best response for \(r_k\) is sufficient to show that assortative interviewing is a best response for the first k residents (and is thus an equilibrium for them in this game).

Theorem 2

Let \(\Psi =\langle k, D , v\rangle \) be an ILQgame with quota k, preference distribution \( D \), and resident scoring function, v. Assume residents \(r_1,\ldots , r_{k-1}\) all interview assortatively. Then, if resident \(r_k\)’s best response is to interview assortatively under this setting, it is a best response for any resident \(r_1,\ldots ,r_{k}\) to interview assortatively. Moreover, this forms a unique best-response for \(r_{1},\ldots ,r_{k}\).

Proof

We introduce an indicator function to simplify notation for when a hospital is a resident’s top available choice. For any hospital h and agent i, let \(b^{j}(h,\eta ) = 1\) if and only if h is available when \(r_{j}\) makes their choice (i.e., \(r_{1},\ldots ,r_{j-1}\) have not been allocated h), and is their most-desirable available alternative (i.e., \(h \succ _\eta h'\) for all other \(h'\) available); and 0 otherwise. Directly following from Eq. 1 the utility of resident \(r_j\) when interviewing with hospitals \(S\subset H\) is:

$$\begin{aligned} u_{r_j}(S)&= \sum _{h \in S} \sum _{\eta \in H_\succ } v(h,\eta )P(\eta , D )b^{j}(h,\eta ) \end{aligned}$$

Since for \(r_{1}\), it is always true that \(b^{1}(h,\eta ) = 1\) for any desired h (since \(r_{1}\) goes first, no \(h\in H\) has been allocated by another \(r\in R\)), suppose it will interview in a set of k hospitals \(\{h_{1},\ldots ,h_{k}\}\) (the numbering according to \(r_{1}\)’s choices as determined by the distribution D). We are concerned with the best response strategy of \(r_k\) which only depends on the strategies of \(r_i\) for \(i < k\). Suppose there is no assortative equilibrium, and let \(r_{i}, i < k\), be the resident with the lowest index for which it is better off interviewing in set \(S'\ne \{h_{1},\ldots ,h_{k}\}\). Then \(b^{i}(h,\eta )\ge b^{k}(h,\eta )\), with the inequality being strict for some \(h\in \{h_{1},\ldots ,h_{k}\}\). Note that for any \(h\notin \{h_{1},\ldots ,h_{k}\}\), \(b^{i}(h,\eta )=1\).

Hence, if \(u_{r_{i}}(\{h_{1},\ldots ,h_{k}\})<u_{r_{i}}(S')\), this means if all agents \(r_{1},\ldots ,r_{k-1}\) are being assortative (so \(b^{k}(h,\eta )=1=b^{i}(h,\eta )\) for \(h\in S'{\setminus } \{h_{1},\ldots ,h_{k}\}\)), \(u_{r_{k}}(\{h_{1},\ldots ,h_{k}\})<u_{r_{k}}(S')\). That is, if it is not beneficial for \(r_{i}\) to be assortative, it would not be beneficial for \(r_{k}\) to be assortative if \(r_{1},\ldots ,r_{k-1}\) are assortative.

Note that, as all these players have a strictly dominant strategy, this is a unique equilibrium for this game. \(\square \)

4.1 Interviewing equilibria when preferences are drawn from the Mallows model

While Theorems 1 and 2 hold for general preferences, we are interested in understanding the impact that the underlying preference model has on the strategic choices of the residents. To this end, we investigate the strategic behaviour that arises when residents’ preferences are drawn from the \(\phi \)-Mallows model (Mallows 1957), a probabilistic ranking model that is standardly used for modelling preferences and has been used in previous investigations of preference elicitation schemes for stable matching problems (Drummond and Boutilier 2013, 2014; Brilliantova and Hosseini 2022; Freeman et al 2021). Its particular relevance in our setting is that residents often have a vague ranking of hospitals, based on a common list (e.g., US News Ranking of hospitals)—the Mallows reference ranking—but in practice, their personal preferences may be a noisy variant of it.

4.1.1 The Mallows model

The \(\phi \)-Mallows model (or just Mallows model Mallows (1957)), \( D ^{\phi ,\sigma }\), is a distance-based probabilistic ranking model, characterized by a reference ranking \(\sigma \), and a dispersion parameter \(\phi \in (0,1]\). Given the parameters, \(\sigma \) and \(\phi \), the probability of any given ranking \(\eta \) is:

$$\begin{aligned} P(\eta | D ^{\phi ,\sigma })&= \frac{\phi ^{d(\eta ,\sigma )}}{Z} \end{aligned}$$

where \(d(\eta , \sigma )\) is the Kendall-\(\tau \) distance metric that counts the number of pairwise disagreements between the two rankings, \(\eta \) and \(\phi \), and Z is a normalization factor; \(Z = \sum _{\eta \in A_\succ } \phi ^{d(\eta ,\sigma )} = (1)(1+\phi )(1+\phi +\phi ^2)\ldots (1+\cdots +\phi ^{|A|-1})\) (Lu and Boutilier 2011). The parameter \(\phi \) controls the likelihood of drawing a ranking that is significantly different from the reference ranking, \(\sigma \). As \(\phi \rightarrow 0\), the probability of drawing the reference ranking approaches 1.0, while as \(\phi \rightarrow 1\), the Mallows distribution is equivalent to drawing a ranking from the uniform distribution.

One interpretation of the Mallows model has rankings being generated by inserting alternatives into a ranking, where the insertion point is a function of \(\phi \). Because of this, when comparing only a small subset of alternatives in the ranking, the probability that any two of the alternatives of interest are in a specific order may not depend on the total number of alternatives. Furthermore, it is possible to determine the probability that any given alternative will be inserted in a particular position in a ranking simply by computing the probability it will be inserted in that position after all other alternatives have been ranked. We will use these properties in our analysis and so state them here and include the proofs in Appendix A for completeness.

Lemma 1

Given some Mallows model \( D ^{\phi ,\sigma }\) with a fixed dispersion parameter \(\phi \) and reference ranking \(\sigma \) ordering n agents, in which \(a_i~\succ ~a_j\) (\(1\le i,j\le n\)), the probability that a ranking \(\eta \) is drawn from \( D ^{\phi ,\sigma }\) such that \(a_i \succ _{\eta } a_j\) is equal to drawing from some distribution \( D ^{\phi ,\sigma '}\) where \(\sigma \) is a suffix or prefix of \(\sigma '\) (that is, there is \(\sigma \), an ordering of n agents, and \(\sigma '\), an ordering of \(n'\) agents (\(n'>n\)), and \(\sigma '\) can be divided into \(\sigma \), an ordering of the first/last n agents, and an ordering of the last/first \(n'-n\) agents).

Corollary 1

Given any reference ranking \(\sigma \) and two adjacent alternatives in \(\sigma \): \(a_i,a_{i+1}\),

$$P(a_i \succ a_{i+1} | D^{\phi ,\sigma }) = \frac{1}{1+\phi }.$$

We extend Corollary 1 to include three consecutive items.

Corollary 2

Given any reference ranking \(\sigma \) and alternatives \(a_{i},a_{i+1},a_{i+2}\) and some \(\eta ~\in ~\{a_i,a_{i+1},a_{i+2}\}_\succ \), the probability that some ranking \(\beta \) is drawn from \( D ^{\phi ,\sigma }\) that is consistent with \(\eta \) is:

$$P(\beta | D ^{\phi ,\sigma }) = \frac{\phi ^{d(\eta ,a_i \succ a_{i+1} \succ a_{i+2})}}{(1+\phi )(1+\phi +\phi ^2)}$$

It is useful to know the probability that any one alternative will be in any particular position in a rank ordered list. We show that this is effectively equivalent to ordering all other alternatives, and then calculating the probability that we can put the alternative in question in its desired slot.

Lemma 2

The probability that \(a_{1}\) will be ranked in place j is \(\frac{\phi ^{j-1}}{1+\phi +\cdots +\phi ^{n-1}}\). Furthermore, the probability that \(a_{n}\) will be ranked in place j is \(\frac{\phi ^{n-j}}{1+\phi +\cdots +\phi ^{n-1}}\). Similarly, the probability \(a_{j}\) will be ranked in first place is \(\frac{\phi ^{j-1}}{1+\phi +\cdots +\phi ^{n-1}}\).

It is possible to bound the probability that any two alternatives will be “out of order” in any given ranking;

Lemma 3

Let \(\eta \in D ^{\phi ,\sigma }\) be such that \(a_{j}\succ _{\eta }a_{i}\) for some \(i<j\), then \(P(\eta )<\frac{\phi ^{j-i}}{Z}\).

Finally, we include an observation that follows from the definition of the Mallows’ model:

Observation 1

If \(|j-i|>|j-i'|\), probability \(a_{i}\) is in place j is smaller than probability \(a_{i'}\) is in place j. Similarly, probability \(a_{j}\) is in place i is smaller than probability \(a_{j}\) is in place \(i'\).

4.1.2 Equilibrium analysis

We now study the equilibria that arise in the interviewing game when residents’ preferences are drawn from some underlying \(\phi \)-Mallows model. This allows us to control and, thus, better understand, how diversity of residents’ preferences influences the structure of the underlying interviewing equilibrium. For ease of notation, let \(\Psi = \langle k,\phi ,v\rangle \) be an instance of an ILQ game with interview quota k, a Mallows model with dispersion parameter \(\phi \), and a scoring function v.

We start by considering the class of games where \(\phi =0.0\), namely \(\Psi =\langle k,0.0,v\rangle \). Recall that as \(\phi \rightarrow 0\), the probability of drawing the reference ranking \(\sigma \) goes to 1. This means that all residents have common preferences, namely the reference ranking which we define as \(\sigma \) such that \(h_i\succ h_{i+1}\) for all \(1\le i<n\). It is straightforward to see that any strategy profile such that each resident \(r_i\) interviews with hospital \(h_i\) forms an equilibrium. Thus, trivially, assortative interviewing is an equilibrium as well.

We now consider the general case, \(\Psi =\langle k,\phi ,v\rangle \), where no restrictions are placed on any of the three parameters. We observe that if resident \(r_k\) can not improve its expected utility by interviewing with hospital \(h_{k+1}\) instead of any of the hospitals in \(\{h_1,\ldots , h_k\}\), then in general the best thing resident \(r_k\) can do is set its interview set to be \(I(r_k)=\{h_{1},\ldots , h_k\}\). We formalize this in Lemma 4 and defer the proof to Appendix B. Note that this result greatly simplifies the equilibrium analysis going forward: we need only consider k possible interviewing sets, instead of \(\left( {\begin{array}{c}n\\ k\end{array}}\right) \) to determine if assortative interviewing is the best strategy for \(r_k\).

Lemma 4

Given ILQ game \(\Psi = \langle k,\phi ,v \rangle \), if resident \(r_k\)’s expected payoff from interviewing with hospitals \(\{h_1,\ldots ,h_k\}\) (when residents \(r_{1},\ldots ,r_{k-1}\) have interviewed with them as well) is higher than their expected payoff from interviewing with hospitals \(\{h_1,\ldots ,h_{k+1}\} {\setminus } \{h_j\}\) for all \(j \in \{h_1,\ldots ,h_k\}\), then resident \(r_k\)’s best response is to interview with \(\{h_1,\ldots ,h_k\}\) (i.e., assortatively).

We now provide a necessary and sufficient condition for assortative interviewing to hold for ILQ game \(\Psi = \langle k,\phi ,v \rangle \). Let \(P(h_i)\) denote the probability that hospital \(h_i\) is available for resident \(r_k\) (i.e., residents \(r_1,\ldots ,r_{k-1}\) are all matched to different alternatives).

Lemma 5

Given ILQ game \(\Psi = \langle k,\phi ,v\rangle \), if residents \(r_1,\ldots ,r_{k-1}\) all interview assortatively (i.e., with hospital set \(S = \{h_1,\ldots ,h_{k}\}\)), then assortative interviewing is a best response for resident \(r_k\) if and only if the following inequality is satisfied for all \(h_j \in \{h_1,\ldots ,h_k\}\) when \(S^\prime = S {\setminus } \{h_j\} \cup \{h_{k+1}\}\):

$$\begin{aligned} P(h_j )\mathbb {E}(v(h_j) | D ^{\phi ,\sigma })&\ge \quad P(h_j ) \mathbb {E}(v(h_{k+1}) | D ^{\phi ,\sigma })\\&+ \sum _{\eta \in H_\succ } P(\eta | D ^{\phi ,\sigma }) \cdot \big [\sum _{h_i \in S^\prime } P(h_i )\mathbb {1}_{h_{k+1}\succ _\eta h_i}v(h_{k+1},\eta ) \big ] \nonumber \end{aligned}$$

where

$$ \mathbb {1}_{h_i \succ _\eta h_j} = {\left\{ \begin{array}{ll} 1,& {if }\, h_i \succ _\eta h_j \\ 0, & {otherwise}\, \end{array}\right. } $$

We now present our key result for this section. We provide a necessary and sufficient condition for assortative interviewing to form an equilibrium for a given ILQ game, \(\Psi =\langle k,\phi , v\rangle \). Furthermore, this condition can be checked efficiently since it only involves checking k possible interview sets for a single resident, \(r_k\).

Theorem 3

Given ILQ game \(\Psi = \langle k,\phi ,v \rangle \), then satisfying the inequality found in Lemma 5 for all \(h_j \in \{h_1,\ldots ,h_k\}\) is both sufficient and necessary to show that all residents interviewing assortatively form an equilibrium for this game.

Proof

For the first k residents, this follows directly from combining Theorem 2 and Lemma 5. The theorem would be correct if we could apply this proposition and lemma iteratively, one group of k hospitals and residents at a time. Thanks to the Mallows distribution’s properties, we can: If \(r_{k}\)’s best response was assortative, we know that all the residents \(r_{1},\dots ,r_{k}\) interviewed assortatively, thus all hospitals \(h_{1},\ldots , h_{k}\) are taken. This means that the same equations that told us that \(r_{k}\)’s best response (to \(r_{1},\ldots ,r_{k-1}\)) was assortative tell us that \(r_{2k}\)’s best response (to \(r_{k+1},\ldots ,r_{2k-1}\)) is assortative: Since a switch between \(h_{1}\) and \(h_{2}\) has the same probability as switching between \(h_{k+1}\) and \(h_{k+2}\), if Theorem 2 and Lemma 5 can be applied once on hospitals and residents \(1,\dots , k\), they can be applied again for \(k+1,\ldots ,2k\), as all equations remain the same, due to the practical “disappearance” of the hospitals \(h_{1},\ldots ,h_{k}\) for agents \(r_{k+1},\ldots ,r_{2k}\) (thus their order can be ignored). Now that we have shown that the first two groups of k residents interview assortatively, we can use the same argument iteratively for the next k residents, and so on. \(\square \)

To conclude this section we note that while we focussed on assortative equilibria since they are elegant and simplifies the problem of determining equilibrium strategies for the residents, other equilibria may also exist. For example, consider the special case where \(\Psi =\langle k, 1.0, v \rangle \). When \(\phi =1.0\) the resulting Mallows distribution is uniform. As first noted by Lee and Schwarz (2017) under a different model, when residents and hospitals are divided into n/k subsets and matched inside these subsets, this also forms an equilibrium.

5 Assortativity, utility function structure, and quotas

We now focus our attention on understanding the interplay between the number of interviews residents may conduct and the structure of the underlying utility functions. We continue to be interested in characterizing the conditions in which “natural” or assortative interviewing equilibria exist.

To ground the work we continue to assume that residents’ preferences are drawn from some underlying ranking distribution generated by the \(\phi \)-Mallows model, and then we instantiate the residents’ utility functions in three different ways, drawing inspiration from both the social choice literature (Brandt et al 2016; Loewenstein et al 1989; Messick and Sentis 1985) and the matching literature (e.g. Coles and Shorrer 2014; Calsamiglia et al 2020). Let \(h_i\) be the \(i^{th}\) ranked hospital in a resident’s ranking \(\eta \).

Plurality-based::

A utility function is plurality-based if

$$\begin{aligned} v(h_i) = {\left\{ \begin{array}{ll} 1, & {\text {if }}\,i=1 \\ 0, & i>1 \end{array}\right. } \end{aligned}$$
Borda-based::

A utility function is Borda-based if for any \(h_i\),

$$\begin{aligned} v(h_i) = n-i+1 \end{aligned}$$

where n is the number of alternatives (hospitals) in the market. This is equivalent, in a sense, to the expected rank, though the values are inverted—the most preferred choice has a maximal Borda score, but the expected rank value is minimal (1).

Exponential::

A utility function is exponential if

$$\begin{aligned} v(h_i)=\left( \frac{\epsilon }{2}\right) ^{i-1}, {\text {for }}\, 0<\epsilon <1. \end{aligned}$$

These three functions capture a wide range of residents’ preferences. If best modelled using plurality-based utility functions, residents care only about being matched to their top choice. Borda-based, on the other hand, provides a linear utility function that decreases as a resident is matched to a less preferred hospital. The class of exponential utility functions forms a bridge between plurality and Borda.

Our first result identifies a condition under which residents with plurality-based utility functions will interview assortatively in equilibrium. The proofs are provided in Appendix D.

Lemma 6

Given ILQgame \(\Psi =\langle k, \phi , v\rangle \) where v is the plurality-based utility function, a necessary and sufficient condition for assortative interviewing to be an equilibrium is

$$\begin{aligned} P(h_j)\ge \phi ^{k-j+1} \end{aligned}$$

where \(P(h_j)\) is the probability that hospital \(h_j\) is available for resident \(r_k\).

There is a strong relationship between the conditions under which assortative interviewing is an equilibrium when residents have plurality-based utility functions and when they have exponential utility functions.

Lemma 7

Given ILQgame \(\Psi =\langle k, \phi , v\rangle \), if

$$\begin{aligned} P(h_j)\ge \phi ^{k-j+1} \end{aligned}$$

when v are plurality-based utility functions, then there exists exponential utility functions that also result in assortative interviewing being in equilibria.

One can immediately develop some intuition from these Lemmas by considering the extreme values for the \(\phi \)-parameter. For example, if \(\phi =1.0\), then the distribution from which residents’ preferences are drawn is uniform.Footnote 5 In this case, assortative interviewing will only be supported in equilibrium if resident \(r_k\) is certain to be matched with \(h_1\). This is clearly very strong and unlikely to hold in many real-world settings. On the other hand, if \(\phi \) is close to zero, meaning that residents’ true rankings over hospitals are likely to be similar to each other, then assortative interviewing is supported as long as there is some (possibly fairly small) chance that \(h_1\) will be available to be matched to \(r_k\). We will leverage Lemmas 6 and 7 in the rest of this section to gain a clearer picture of the characteristics of assortative equilibria.

5.1 Assortative equilibria when \(k=2\)

If residents are only allowed to interview with two hospitals, then assortative interviewing forms an equilibrium under certain conditions. In particular, the existence of assortative interviewing depends on both the structure of the residents’ utility functions and the dispersion parameter, \(\phi \), of the underlying Mallows model.

Theorem 4

Given ILQgame \(\Psi =\langle k, \phi , v\rangle \) with \(k=2\) and v being plurality-based utility functions, assortative interviewing forms an equilibrium when \(0 < \phi \le 0.6180\).

A direct consequence of Theorem 4 and Lemma 7 is that for exponential scoring functions, when \(0< \phi < 0.6180\), there exists an \(\varepsilon \) such that if residents’ scoring function is an exponential function dominated by \((\frac{\varepsilon }{2})^{(i-1)}\) with \(\varepsilon > 0\), assortative interviewing is an equilibrium for that \(\phi \).

We are also able to show a similar result when the utility functions of residents are Borda-based, though assortative interviewing is in equilibrium for a significantly smaller range of \(\phi \), meaning that the preferences of the residents are much less diverse. This illustrates the strong connection and interplay between the structure of the utility functions of the residents, the underlying preference distribution, and the number of interviews residents may participate in.

Theorem 5

Given ILQgame \(\Psi =\langle k, \phi , v\rangle \) with \(k=2\) and v being Borda-based utility functions, assortative interviewing forms an equilibrium when \(0 < \phi \le 0.265074\).

5.2 Assortative equilibria when \(k=3\)

Interestingly, when residents can interview with up to three hospitals, assortative interviewing continues to be an equilibrium for plurality-based and exponential utility functions but is no longer an equilibrium if residents have Borda-based utility functions. We begin with the negative result for Borda-based utility functions.

Theorem 6

Given ILQgame \(\Psi =\langle k, \phi , v\rangle \) with \(k=3\) and v being Borda-based utility functions, then assortative interviewing may not form an equilibrium or any \(0<\phi \le 1\).

Alternatively, for plurality and exponential-based utility functions, assortative interviewing still forms an equilibrium for certain ranges of \(\phi \) in the Mallows model. We observe, however, that the range of \(\phi \) is smaller than in the case where \(k=2\), indicating again the sensitivity of residents’ strategic decisions on all aspects of the ILQgame.

Theorem 7

Given ILQgame \(\Psi =\langle k, \phi , v\rangle \) with \(k=3\) and v being plurality-based utility functions, assortative interviewing forms an equilibrium when \(0 < \phi \le 0.4655\).

The existence of assortative interviewing, when v are exponential-based, is an immediate consequence of Theorem 7 and Lemma 7.

5.3 Assortative equilibria when \(k\ge 4\)

We finally consider the setting where residents can interview with more than three hospitals. Unfortunately, our results are negative; we show there are settings, characterized by k and \(\phi \), such that assortative interviewing is not an equilibrium, irrespective of the underlying utility function. We begin by showing that when there is a setting for which there is no assortative equilibria for plurality, then there is no scoring function with assortative equilibria. We use this result to show that, for sufficiently small dispersion parameter \(\phi \) and for \(k > 3\) interviews, assortative interviewing cannot be an equilibrium under any scoring function. We then provide a specific counterexample for all \(\phi \) when \(k = 4\) for plurality, implying there is no assortative equilibrium for any scoring function. This suggests that, for a wide category of resident valuation functions under a Mallows distribution, contrary to some real-world behaviour, assortative interviewing is not an equilibrium.

Theorem 8

Given ILQgame \(\Psi = \langle k,\phi ,v\rangle \) with \(k\ge 4\) and v being plurality-based, if hospital \(h_1\) causes the condition in Lemma 5 to be falsified (i.e., \(\{h_2,\ldots ,\) \( h_{k+1}\}\) has a better expected payoff than \(\{h_1,\ldots ,h_k\}\)), then for \(k\ge 4\) and \(\phi \), assortative interviewing is not an equilibrium for any valuation function.

Intuitively, there is a tradeoff between the likelihood that a hospital will be available for resident \(r_k\) by the time it is their turn to be matched and the expected value of that hospital. Both are strongly tied to the dispersion parameter \(\phi \) of the Mallows model: as the dispersion parameter approaches 1.0, the difference in the expected value of any given hospital goes to 0. As the dispersion parameter approaches 0.0, the expected value of any hospital \(h_i\) goes to the value of its slot in expectation, \(v(s_i)\). However, the likelihood it is taken by some higher ranked \(r_j\) (i.e., with \(j < i\)) also approaches 1. The following theorem addresses the latter case: for sufficiently small dispersion, even though the expected value of a hospital is high, the likelihood it will be available is so low that residents are disincentivized from choosing to interview with it.

Theorem 9

Given ILQgame \(\Psi = \langle k,\phi ,v\rangle \) with \(k>4\), there exists \(0<\varepsilon <1\) such that for any scoring function v no assortative interviewing forms an equilibrium for dispersion parameter \(0< \phi <\varepsilon \).

We now show that for \(k = 4\), assortative interviewing is not an equilibrium for any \(\phi < 1\) and any scoring rule. We then continue to show that for \(k > 4\) and \(\phi \) sufficiently small, assortative interviewing is not an equilibrium.

Theorem 10

Given ILQgame \(\Psi = \langle k,\phi ,v\rangle \) with \(k = 4\) and any scoring function v, assortative interviewing is not an equilibrium for any dispersion parameter \(0 \!<\! \phi \!< \!1\).

It seems unlikely that for \(k > 4\), assortative interviewing is an equilibrium. Intuitively, if it is an equilibrium it should be for low \(\phi \): this is when the expected value of hospital \(h_i\) is very close to \(v(s_i)\). However, this is also when residents \(r_1,\ldots ,r_{k-1}\) are all most likely to be matched with hospitals \(h_1,\ldots ,h_{k-1}\). We leave open the possibility that there may exist some \(\delta \) such that when \(0< \varepsilon< \phi < \delta \le 1\), assortative interviewing is an equilibrium for plurality.

6 Beyond assortative interviewing

The results in the previous sections are mixed. While we believe that assortative interviewing is an interesting phenomenon and is “natural” as it provides an intuitive strategy for residents, we have also shown that such equilibria are only guaranteed to exist when residents have a limited number of interviewing options.

This inspires us to do two things. First, we observe that our definition of assortative interviewing is strong. Thus, we explore the ramifications of weakening the definition. As we will show, interestingly, our weaker definition does not help and instead can add further complications to the problem. This motivates us to expand the class of what we consider “natural” outcomes and initiate an investigation into the class of reach and safety strategies.

6.1 The weakness of weak assortative strategies

Our definition of assortative interviewing was very strong. Definition 1 imposed two key restrictions. First, it assumed that each resident, with an interview budget of size k, interviews with k consecutive hospitals (according to the reference ranking). Second, the definition assumed that the top k residents interviewed with the top k hospitals, the following k residents (ranked from \(k+1\) to 2k) interviewed with the next k hospitals, etc. We consider two relaxations of this definition: pseudo-strong assortative and weak assortative interviewing.

Definition 2

Given ILQgame with quota \(k>0\), an interviewing strategy profile is pseudo-strong assortative if given \(j = 0,1,2, \ldots , \frac{n}{k}-1\), for each group of k hospitals such that \(H_j=\{h_{1+kj},h_{2+kj},\ldots , h_{k(j+1)}\}\), there exists a subset of residents, \(R_j\subset R\), \(|R_j|=k\) such that all residents in \(R_j\) interview with \(H_j\).Footnote 6

Note that this definition relaxes the assumption that consecutive residents interview with the same hospitals. For example, if there was an outcome such that resident \(r_{1}, r_{3}\) and \(r_{5}\) interview at \(h_{1},h_{2}\) and \(h_{3}\), while residents \(r_{2}, r_{4}\) and \(r_{6}\) interview at hospitals \(h_{4}, h_{5}\) and \(h_{6}\), this would be pseudo-strong but not strongly assortative.

Definition 3

Given ILQgame with quota \(k>0\), we say that an interviewing strategy profile is weakly assortative iff for all \(r_i\in R\), \(I(r_i)=\{h_j,h_{j+1},\ldots ,h_{j+k-1}\}\) for some j.

Weak assortative interviewing has that each resident selects k consecutive hospitals to interview and places no other restrictions on the residents’ strategies. That is, it is conceivable that almost every resident is targeting a different part of the hospital list.

Strong and pseudo-strong strategies result in a similar structure of resident and hospital interviews: hospitals are divided into consecutive sets (the top-k hospitals, the 2nd-k hospitals, etc.), and each resident interviews in one of those sets. Since a resident only interviews with one of those sets (e.g., a resident cannot interview in some of the top-k and some of the 2nd-k), such a structure may happen when the difference between the expected value of each hospital is so small that the mere interview of a resident in a hospital (which decreases the probability of another resident getting it) reduces the expected value in a way that another agent prefers to avoid it completely. This can happen when \(\phi \) is very close to 1, in which case all hospitals are almost equivalent to resident preferences. It can also occur when the valuation function is such that the value of each hospital is very close, regardless of their ranking.

Since residents do not interview with different sets of hospitals (top-k, 2nd-k, etc.), this also means that if \(r_{1}\) and \(r_{2}\) interview in the same hospitals in equilibrium, this indicates \(r_{2}\) sees some value in these hospitals, even if \(r_{1}\) interviews there, which means that \(r_{3}\) would consider these hospitals as well (assuming \(k>2\), of course), so while \(r_{3}\) may gradually change the interview set, they will not completely avoid \(r_{1}\) and \(r_{2}\)’s hospitals (resulting in an equilibrium that is not strong or pseudo-strong).

The above argument, however, does not answer what happens when we consider weakly assortative interviewing that is neither strong nor pseudo-strong. Surprisingly, weak assortative interviewing turns out to be more problematic than strong (or pseudo-strong) assortative interviewing.

Theorem 11

Given ILQgame \(\Psi =\langle k, \phi , v\rangle \) where all residents employ a weakly assortative interviewing strategy (that is, at least one resident is not strong or pseudo-strong). Then there is a non-zero probability that some resident will be unmatched.

Note that such a situation cannot happen with strongly (or pseudo-strong) assortative strategies—sets of k residents all interview at the same k hospitals, meaning they are guaranteed to be matched.

While Theorem 11 is an inherently negative result, it is not the only negative aspect that arises when one considers weakening strong assortativity.

Theorem 12

Given ILQgame \(\Phi =\langle k,\psi , v\rangle \) with \(k=2\), \(\phi <1\), and \(|R|=|H|=4\), then there does not exist a Nash equilibrium in which residents have weak-assortative strategies.

Fig. 1
figure 1

Interviewing sets when \(|R|=4\), \(k=2\), and we require weak assortative strategies.

The proof of Theorem 12 (found in the appendix) illustrates a cascading effect where residents (particularly the bottom-ranked resident) have incentives to break the sequential structure of hospitals when selecting their interview sets. This is illustrated in Fig. 1. This is further exacerbated as the number of residents (|R|) increases. With a larger k, a similar process would occur, and as also intuited in Theorem 11, creating more than k residents interviewing at the same hospital means the probability of a resident being left without a hospital grows. While we hypothesize that for \(k > 2\) and \(n > k\), there is no Nash equilibrium at all with only weakly assortative strategies (in cases without tie-breaking), it is clear that if there is a sufficiently negative cost to being left without a hospital (as there is in the real world), weakly assortative interviewing cannot happen, as weakly assortative strategies result in hospitals with over k interviewees (and thus residents without hospitals). These residents would instead seek to reduce this probability by interviewing at the hospitals with highest availability probability, which means they would not interview in a hospital with more than k interviewees.

6.2 Reach and safety strategies for a small interviewing quota

While the example shown in Fig. 1 lacked the assortativity structure we have been interested in, it still illustrated an interesting phenomenon, reach-and-safety strategic behaviour. We investigate such behaviour empirically and relate the emergence of such behaviour to the underlying preference models of the residents. We focus on small ILQgames as we concentrate on exact computation of the underlying equilibrium, but we hypothesize that our findings generalize to larger settings.

Consider the case for \(k=2\) interviews where (for the Borda scoring rule) we only guarantee assortative interviewing for some sufficiently small dispersion parameter \(\phi \). To gain better insight into the strategic behaviour of the residents as a function of \(\phi \), we calculated the exact values of \(\phi \) where the interviewing equilibria changes in small markets. In doing so, we see that the structure of the interviewing equilibria contain both “reach” and “safety” schools, where participants diversify their interviewing portfolio to get both the benefit of a desirable, unlikely option, and a likely, but less desirable option.

Fig. 2
figure 2

Interviewing sets of residents as a function of \(\phi \) when using the Borda scoring function, with 4 participants, and interview set size of 2.

Figure 2 depicts a market with 4 hospitals, 4 residents, and 2 interviews (\(n = 4\), \(k = 2\)). The figure shows what sets are being chosen by the different residents for any dispersion \(\phi \). As \(\phi \) increases, we explicitly see the trade-off between a safer choice, and a better expected-payoff value for individual alternatives. For small \(\phi \), as the theoretical results showed, assortative interviewing is optimal, and \(r_{2}\) chooses \(\{h_1,h_2\}\), while \(r_{3}\) and \(r_{4}\) choose \(\{h_{3},h_{4}\}\). Interestingly, for \(\phi \in [0.5,0.62]\), \(r_2\)’s best option is to split the difference, and interview with one hospital (\(h_3\)) they are guaranteed to get and one hospital (\(h_2\)) that will be available with sufficiently high probability, and has a higher expected value. This choice available to \(r_2\) further results in some of the “reach” vs. safe behaviour we see in college admissions markets; namely, \(r_3\)’s best response now is to interview with \(h_1,h_4\) (i.e., a “reach” choice, and a “safe” bet), while \(r_{4}\), being left without any truly “safe” option, aims slightly higher than its rank. As \(\phi \) grows and approaches 1, any ordering of hospitals is as likely as another, making \(r_{2}\)’s choice \(\{h_{3},h_{4}\}\), which are as likely as any to be highly ranked, and are available. The desire to avoid interviewing hospitals that are already chosen by many other residents also drives \(r_{3}\) and \(r_{4}\) to \(\{h_{2},h_{3}\}\) and \(\{h_{1},h_{4}\}\), respectively; that is, they both want to avoid competing with \(r_1\) and \(r_2\).

Fig. 3
figure 3

Interviewing sets of residents as a function of \(\phi \) when using the Borda scoring function, with 6 participants, and interview set size of 2.

Fig. 4
figure 4

Interviewing sets of residents as a function of \(\phi \) when using the Borda scoring function, with 6 participants, and interview set size of 3.

Fig. 5
figure 5

Interviewing sets of residents as a function of \(\phi \) when using the Borda scoring function, with 6 participants, and interview set size of 4.

We expand on these results and now consider the case of \(n=6\) residents with \(k=2, 3\) and 4 interviews per resident. Here we see in Figs. 3, 4, and 5, similar equilibrium strategies as for the \(n=4,k=2\) case. For \(k=2\), and \(\phi \le 0.4\), we again see that assortative interviewing is an equilibrium. When \(\phi = 0.5\), we observe that the second resident departs from strict assortative interviewing in favour of a weak version of assortative interviewing and this, in turn, affects the other players, as, for example, the third resident applies what is a safety move (hospital 4, which is theirs if they want it) with a reach move (hospital 1, the top choice). Of some interest, for \(\phi \ge 0.7\), all residents except \(r_{1}\) use a weak assortative strategy, that is, they interview in sets of hospitals which are adjacent in rank, rather than splitting their interviews between radically different ranked hospitals.

Turning to \(k = 3\) interviews per resident, we see as Theorem 6 claimed, that the third resident does not interview assortatively. While residents 3, 4, and 5 are mostly weakly assortative (except the third resident and \(\phi = 0.8\), where it tries a small reach choice), the sixth resident goes consistently for a reach and safety strategy, as it interviews in the top hospital as well. The resident’s behaviour only changes when \(\phi \) is large enough (\(\phi >0.6\)) when the chance of the true ranking being different from the ground truth is much higher. Of interest, when \(\phi = 0.9\), the second resident chooses hospitals 4, 5, 6 (even knowing that at least two of the hospitals in {1,2,3} will be available. But when \(\phi \) is sufficiently close to 1, the distribution is approaching the uniform distribution so that this resident might as well choose hospital 4, 5, 6 as they might very well be as desirable as 1, 2, 3 where the residents’ top choices might be taken.

Finally, for \(k = 4\), we see that resident 1 (as we know must happen) interviews assortatively for all settings of \(\phi \) while other residents are much more willing to experiment. Not included in Fig. 5 are further results, showing that even for some very small \(\phi \) (\(0.1>\phi \ge 10^{-20}\)), there are residents which are not even weakly assortative. We hypothesize that this “reach” and “safety” behaviour is present in markets with larger interviewing quotas.

7 Conclusions and future directions

We investigated equilibria in ILQgames, inspired by the matching of medical residents to hospital programs. A key feature of this game is that residents must interview with hospitals to discover their true preferences, but are limited in the number of interviews they may conduct. This introduces a new level of complication as residents need to carefully consider how to optimize their interviewing strategies given the interviews choices of other residents. While we showed the existence of a pure-strategy Nash equilibria for this game, we where particularly interested in understanding under what circumstances assortative interviewing forms an equilibrium, as such strategies are “easy” for residents to execute, result in stable outcomes where everyone is matched, and which earlier work had suggested might exist (Lee 2017).

We summarize our findings into a few key take-aways that may provide useful guidelines for market designers:

  • Assortative interviewing is supported in equilibrium, but its existence depends critically on how correlated residents’ preferences are, the limit on the number of interviews, and the structure on the underlying value functions for alternatives.

  • If the underlying value function is Borda-based, then assortative interviewing only forms an equilibrium when preferences of residents are closely correlated (as measured by the \(\phi \) parameter in the underlying \(\phi -\)Mallows model). If the underlying value function is plurality-based or exponential then assortative interviewing is more broadly supported.

  • Limiting the number of interviews residents can undertake is critical if assortative interviewing it to be supported. If residents are allowed to interview with 4 or more hospitals then assortativity might not be supported in equilibrium.

  • Relaxing the definition of assortativity (to weak-assortativity) does not help.

There are many research questions raised by our results, to which at least some of out technical results and techniques may also contribute. Most concretely, we hypothesize Theorem 9 could be replaced by extending Theorem 10 for all \(k\ge 4\). Second, while we believe that the space of scoring functions used in this paper was broad in its scope, we always assumed that residents’ underlying ranked preferences were drawn from a distribution generated by the \(\phi \)-Mallows model. While the \(\phi \)-Mallows model is standard in the literature, it is possible that other preference distributions (e.g., Plackett-Luce) may better support assortative interviewing. Second, the analysis relies on the assumption that one side of the market maintained a master list. While master-lists do occur in real-world matching markets, lifting this assumption will obviously generalize the setting, and may invalidate our results. More specifically, the removal of the master-list assumption would complicate the analysis significantly, increasing the complexity of the payoff function formulation.

Furthermore, we could consider modifying our definition of an interview set. Currently we assume that residents could interview up to k hospitals for free, but an alternative model to consider would be to allow each resident r to have a “budget” \(b_r\), and incur a cost, \(c_r(h)\), when interviewing hospital h, with the constraint that if S is the set of hospitals interviewed by resident r, then \(\sum _{h\in S} c_r(h)\le b_r\). Such a budget, even if the cost is equal for all hospitals, will change the equilibrium in a variety of ways, including by making it no longer always a dominating strategy to interview at all k hospitals, as sometime—particularly for very high/low ranked residents—interviewing at some hospitals might not offer enough expected utility. It may also give rise to a setting equivalent to the hospitals having a limited number of potential interview slots, which would make the hospitals strategic players as well, as they wish to find the candidates which are both highly ranked and that will also choose them.Footnote 7 A similar form of either assortative or “reach”/“safety” may happen, this time by the hospitals, though that is outside the scope of this paper.

A long-term research goal is to better understand the extent to which “natural equilibria” exist in matching games, and how such equilibria correspond with observed behaviour in actual markets. One such possibility is for interviewing to be assortative for “safety” programs while allowing for one or a few “reach” programs. (See for example the strategy of resident 6 for small values of the Mallows’ parameter in Fig. 4.) Furthermore, we are interested in techniques that could reduce the cognitive burden placed on participants in matching markets, while also reducing inefficiencies. For example, there may be ways to leverage research on preference elicitation for matching markets (e.g., Drummond and Boutilier 2014) with matching market design so as to guide participants to interview with the appropriate programs so as to improve the overall quality of the match.