Abstract
In this chapter, we consider the problem of estimating the latent influence of vertices of a network in which some edges are unobserved for known reasons. We present and employ a quantitative scoring method that incorporates differences in “potential influence” between vertices. As an example, we apply the method to rank Supreme Court majority opinions in terms of their “citability,” measured as the likelihood the opinion will be cited in future opinions. Our method incorporates the fact that future opinions cannot be cited in a present-day opinion. In addition, the method is consistent with the fact that a judicial opinion can cite multiple previous opinions.
This research was supported by NIH Grant # 1RC4LM010958-01.
Examples of network data in political science are ubiquitous, and include records of legislative co-sponsorship, alliances between countries, social relationships, and judicial citations.Footnote 1 Numerical estimates of the influence of each node (e.g. legislator, country, citizen, opinion), defined in terms of its propensity to form a relationship with another node, are often of interest to an analyst in each of these examples. In this chapter we present a new approach to solving a common problem in the social sciences—that of estimating the influence of vertices in a network. Our approach assumes that observed levels of influence relate to an underlying latent “quality” of the vertices.Footnote 2 Although common methods for measuring influence in networks assume that each vertex has the potential to influence every other vertex, many networks reflect temporal, spatial, or other practical constraints that make this assumption implausible. We present a scoring method that is appropriate for measuring influence in networks where (1) some vertices cannot form an edge with certain vertices for reasons that are unrelated to their underlying “quality” and (2) each vertex may be influenced by a different number of other vertices, so that some edges reveal different amounts of information about the latent “quality” of the influencing vertices.
As an example, we rate the “quality” of Supreme Court decisions, which we define as the likelihood that the decision will be cited in a future decision. These decisions are readily analyzed by our method due to their connectedness—the Supreme Court’s explicit usage of previous decisions as precedent for current and future decisions generates a network structure. The network data enable us to assess some instances when a given decision “succeeded” (i.e., was cited in a later opinion) or “failed” (i.e., was not cited in a later opinion). However, because later decisions cannot be cited by earlier opinions, the data do not allow us to observe whether a given opinion would have been cited by an earlier opinion. Our network structure is necessarily incomplete.
The method we describe and employ in this chapter is intended to deal explicitly with this problem of incompleteness. The method, developed and explored in more detail by Schnakenberg and Penn (2012), is founded on a simple (axiomatic) theoretical model that identifies each opinion’s latent quality in an (unobserved) world in which every object has the potential to succeed or fail. The theoretical model identifies the relative quality of the objects under consideration by presuming that the observed successes are generated in accordance with the independence of irrelevant alternatives (IIA) choice axiom as described by Luce (1958). In a nutshell, the power of this axiom for our purposes is the ability to generate scores for alternatives that are not directly compared in the data. Substantively, these scores locate all opinions on a common scale.
1 Inferring Quality from Network Data
We conceive of our data as a network in this chapter. Accordingly we first lay out some preliminaries and then discuss how one applies the method to general network data. We represent the observed network data by a graph denoted by G=(V,E), where V={1,2,…,n} is a set of n vertices and E is a set of directed edges, where for any v,w∈V, (v,w)∈E indicates that there is an edge from v to w.Footnote 3 We define a community to be a subset of vertices, C⊆V, with a community structure \(\mathcal{C}=(C_{1},\ldots, C_{n})\) being a set of subsets of V, and C i being the community of vertex i.
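For concreteness, the graph and community structure just defined can be represented directly. The following Python sketch uses hypothetical vertex labels (not drawn from the chapter's data):

```python
# Hypothetical network: an edge (v, w) in E means w influences v
# (in the judicial setting below, opinion v cites opinion w).
V = {1, 2, 3, 4}
E = {(3, 1), (3, 2), (4, 2)}

# communities[i] is C_i: the vertices with the potential to influence i.
communities = {
    3: {1, 2},
    4: {1, 2, 3},
}

# Every observed edge must be a potential interaction:
# (i, j) in E implies j in C_i.
for (i, j) in E:
    assert j in communities.get(i, set())
```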
Underlying our model is an assumption that each vertex j in a community C i has the potential to influence vertex i. To define this formally, let \(\tilde{E}\) be a set of potential interactions, with \(E\subseteq \tilde{E}\). If (i,j)∈E then we know that i and j interacted with j influencing i, and so it is known that they had the potential to interact: it is known that j∈C i . On the other hand, of course, \((i, k)\not\in E\) need not imply that i could not have been connected to k. Rather, it may be the case that i could have been connected to k, but the link was not created for some reason (possibly because k was not of high enough quality to influence i, possibly because k and i never had an opportunity to interact, or for some other independent factor(s)). Our community structure is designed to accommodate this fact, and in particular we assume that k∈C i implies that \((i, k)\in \tilde{E}\). Thus, k being in community C i implies that k had the potential to influence i (i.e., i had the opportunity to link to k), regardless of whether k succeeded (i.e., regardless of whether an edge between i and k is observed).
The second assumption of our model is that each vertex can be placed on a common scale representing the vertex’s quality. We assume that vertices with higher latent qualities are more likely to have had successful (i.e., influential) interactions with the vertices that they had the potential to interact with. Thus, the higher the latent quality of vertex i, the more likely it is that, for any given vertex j∈V, \((j, i)\in \tilde{E}\) implies that (j,i)∈E.
Our goal is to estimate each vertex’s “latent quality” score subject to a network G and an observed or estimated community structure, \(\mathcal{C}\). We conceive of our network and community structure as generating a collection of “contests” in which some vertices were influential, some had the potential to be influential but were not, and others had no potential to influence. These contests are represented by the set \(\mathcal{S}=\{s\in V: (s, v)\in E\mbox{ for some }v\in V\}\). Thus, every vertex that was influenced represents the outcome of a contest.
Let x=(x 1,…,x n )∈R n represent each vertex’s latent quality. Then for each \(i\in \mathcal{S}\) we let the expected influence of vertex k in contest i (i.e., the probability of i connecting to k), which we denote by E(i,k), equal 0 if \((i, k)\not\in \tilde{E}\). In this case \(k\not\in C_{i}\), and thus k had no potential to influence i (i.e., there is no chance that i will connect to k). Otherwise,

\[ E(i,k) = \frac{x_{k}}{\sum_{j\in C_{i}} x_{j}}. \]
In words, the expected share of influence of k in a contest in which k has the potential to influence i is k’s share of latent influence relative to the total latent influence of the vertices that can potentially influence i.
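A minimal Python sketch of this definition, using hypothetical quality values:

```python
def expected_influence(i, k, x, communities):
    """E(i, k): k's expected share of influence in contest i.

    Returns 0 when k is outside C_i; otherwise returns k's latent
    quality as a share of the total latent quality of the vertices
    in C_i (the vertices that could have influenced i).
    """
    C_i = communities[i]
    if k not in C_i:
        return 0.0
    return x[k] / sum(x[j] for j in C_i)

# Hypothetical example: vertices 1 and 2 can influence contest 3.
x = {1: 2.0, 2: 1.0, 3: 1.0}
share = expected_influence(3, 1, x, {3: {1, 2}})  # 2 / (2 + 1)
```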
Similarly, we can calculate the share of actual influence of k in i, or A(i,k), by looking at the total set of vertices that actually influenced i in the network described by G. This set is W i ={w:(i,w)∈E}⊆C i , and (without any additional information such as edge weights), k’s share is \(\frac{1}{{|W_{i}|}}\) if k∈W i and 0 otherwise. We can now utilize our network and community structure to estimate x subject to an unbiasedness constraint that is conditional on the community structure. The constraint is that

\[ \sum_{i\in \mathcal{S}} A(i,k) = \sum_{i\in \mathcal{S}} E(i,k) \quad \text{for every } k\in V, \]
or that each vertex’s total actual score equals its total expected score. Satisfaction of this constraint implies, given a correct community structure, that no vertex is estimated to be more or less influential than it actually was. Schnakenberg and Penn (2012) prove that, subject to a minimal connectedness condition, there exists a vector \(x^{*}=(x^{*}_{1},\ldots, x^{*}_{n})\) that solves the above system of equations and that is unique up to scalar multiplication.Footnote 4 Viewed substantively, this vector represents the relative qualities/influences of the different nodes. In particular, as x ∗ is uniquely identified up to scalar multiplication, the ratio of any two nodes’ qualities,

\[ \rho^{i}_{j} = \frac{x^{*}_{i}}{x^{*}_{j}}, \]
is uniquely identified. This ratio \(\rho^{i}_{j}\) represents the hypothetical relative frequency of selection/influence by node i versus that by node j in a future contest in which both nodes i and j compete (i.e., for any future node that both i and j have the ability to exert influence on).
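One way to compute such a vector numerically (not necessarily the procedure of Schnakenberg and Penn (2012)) is a fixed-point iteration in the spirit of the updates used to fit Luce-type choice models, normalizing at each step since the scores are identified only up to a positive scalar. The data below are hypothetical toy values:

```python
def solve_scores(contests, communities, tol=1e-10, max_iter=100000):
    """Solve sum_i A(i,k) == sum_i E(i,k) for every vertex k.

    contests maps each contest i to W_i, the set of vertices that
    actually influenced i; communities maps i to C_i.  Scores are
    normalized to average 1, since the solution is unique only up to
    a positive scalar.
    """
    vertices = set().union(*communities.values())
    # Total actual influence of each vertex: 1/|W_i| per contest won.
    actual = {k: sum(1.0 / len(W) for W in contests.values() if k in W)
              for k in vertices}
    x = {k: 1.0 for k in vertices}
    for _ in range(max_iter):
        new_x = {}
        for k in vertices:
            # Sum, over contests k could have won, of 1 / (total quality in C_i).
            denom = sum(1.0 / sum(x[j] for j in communities[i])
                        for i in contests if k in communities[i])
            new_x[k] = actual[k] / denom if denom > 0 else 0.0
        mean = sum(new_x.values()) / len(new_x)
        new_x = {k: v / mean for k, v in new_x.items()}
        if max(abs(new_x[k] - x[k]) for k in vertices) < tol:
            x = new_x
            break
        x = new_x
    return x

# Hypothetical data: contest 3 cited vertex 1; contest 4 cited 2 and 3.
scores = solve_scores({3: {1}, 4: {2, 3}}, {3: {1, 2}, 4: {1, 2, 3}})
# Converges to roughly {1: 1.0, 2: 0.5, 3: 1.5}.
```

Because E(i,k) depends only on ratios of the x values, the per-step normalization does not disturb the unbiasedness constraint at the fixed point.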
2 Measuring the Quality of Precedent
The use of judicial precedent by Supreme Court Justices—and, in particular, a focus on citations as an indication of this usage—has attracted sustained attention from legal and political science scholars for over 60 years.Footnote 5 Unsurprisingly, given the breadth of the topic, scholars have adopted various approaches to the study of precedent, but most have focused on the determinants of citation: in a nutshell, what factor or factors of an opinion augur revisitation of the opinion in future opinions?
Because our model imputes unobserved relationships between objects, it is particularly well-suited to analyzing networks in which certain links are impossible to observe. These types of networks could, for example, arise in situations in which vertices are indexed by time and a later vertex is incapable of influencing a vertex that preceded it.
We utilize a data set consisting of the collection of citations by United States Supreme Court majority opinions to Supreme Court majority opinions from 1791 to 2002. Thus, viewed in the theoretical framework presented above in Sect. 1, the vertices of our network are Supreme Court majority opinions, and if majority opinion i cites majority opinion j, we include the edge (i,j)∈E.
Before moving on, it is important to note what we are explicitly abstracting from in our operationalization of the judicial citation/precedent network. Most importantly, we omit consideration of all opinions other than the majority opinion. Both dissenting and concurring opinions are relevant both for understanding the bargaining processes at work in constructing the majority opinion and for inferring the role and quality of precedent (e.g., Carrubba et al. (2011)).Footnote 6 In addition, our approach ignores the citing opinion’s treatment of the cited opinion (e.g., favorable, critical, or distinguishing).Footnote 7, Footnote 8 We leave each of these for future work.
Differentiating Cases: Community Structure
As discussed earlier, the method we employ allows us to compare/score objects that have not been directly compared. Accordingly, it offers an analyst the freedom to “break up” the data in the sense of estimating (or, perhaps, observing) communities of objects that are less likely to be directly compared with one another. For the purposes of this chapter, we take into account only the temporal bias discussed earlier—later opinions cannot be cited by earlier opinions—and presume that each opinion is eligible (i.e., “in competition”) for citation by every subsequently rendered opinion.Footnote 9
Thus we construct the community C i for a given opinion i as follows. Letting Year(i) be the year in which opinion i was heard, we assume that for any pair of vertices (i.e., majority opinions) i, j,

\[ j\in C_{i} \iff \mathrm{Year}(j) < \mathrm{Year}(i). \]
In words, an opinion can be influenced only by those opinions that strictly predate it.
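Under this rule the community structure is mechanical to construct; a sketch with a hypothetical `year` mapping:

```python
def build_communities(year):
    """C_i contains exactly the opinions strictly predating opinion i.

    year maps each opinion to its decision year; as in the chapter,
    same-year opinions are treated as unable to cite one another.
    """
    return {i: {j for j in year if year[j] < year[i]} for i in year}

# Hypothetical opinions and decision years.
C = build_communities({"A": 1950, "B": 1960, "C": 1960, "D": 1973})
# C["D"] == {"A", "B", "C"}; C["B"] == {"A"}; C["A"] == set()
```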
Data
We apply our method to Fowler and Jeon’s Supreme Court majority opinion citation data (Fowler et al. (2007), Fowler and Jeon (2008)). There are a number of ways one might approach this data when considering the question of the quality or influence of each opinion. The most straightforward approach would rank all of the opinions that have been cited at least once (any opinion that is not cited by any other opinion in the database cannot be ranked). In this approach, every opinion is a contest, and each opinion that is cited at least once is a contestant.
Practical constraints prohibit us from ranking all of the opinions. Fortunately, our approach implies that we can examine any subset of the data and recover relative rankings that are (in theory) identical to the rankings that would be estimated from the entire data set. Accordingly, we restrict our attention to the 100 most frequently cited opinions between 1946 and 2002. In graph-theoretic terms, we examine the smallest subgraph containing all edges beginning or ending (or both) with an opinion whose in-degree (number of times cited) ranks among the top 100 among the opinions rendered between 1946 and 2002. This graph contains many more than 100 opinions (3674, to be exact). After these opinions and their incident edges are selected, they are used in our community construction and scoring procedure, which we now describe.
Using the years of the opinions to create the communities as described earlier, we then solve for the influence scores of the opinions (i.e., contestants) as follows. First, we take the contestants in turn and, for each majority opinion (i.e., contest) that was subsequent to the contestant and cited at least one member of the contestant’s community, we count the contestant as having been a participant (i.e., available for citation) in that majority opinion/contest. If the contestant was cited in (i.e., won) that contest, the contestant is awarded 1/|W| points, where W is the set of opinions (contestants) cited in that majority opinion (contest). Otherwise, the contestant is awarded 0 points in that contest. With this vector of scores for each contestant in each contest, it is then possible to directly apply the method developed by Schnakenberg and Penn (2012) to generate the latent influence scores of each majority opinion, \(\hat{x}=(\hat{x}_{1},\ldots,\hat{x}_{n})\).
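The contest bookkeeping described above can be sketched as follows; `cites` maps each majority opinion to the set of opinions it cited, and all names are hypothetical:

```python
def contest_points(cites, communities):
    """Points earned by each contestant (cited opinion) in each contest.

    A contestant k participates in contest i when k is in C_i and i
    cites at least one member of C_i; being cited (winning) is worth
    1/|W_i| points, where W_i is the set of opinions i cited.
    """
    points = {}
    for i, W in cites.items():
        C_i = communities.get(i, set())
        if not (W & C_i):
            continue  # i cited no community member: not a contest here
        for k in C_i:
            points.setdefault(k, {})[i] = 1.0 / len(W) if k in W else 0.0
    return points

# Hypothetical data: opinion "C" cites "A"; opinion "D" cites "B" and "C".
pts = contest_points({"C": {"A"}, "D": {"B", "C"}},
                     {"C": {"A", "B"}, "D": {"A", "B", "C"}})
# pts["A"] == {"C": 1.0, "D": 0.0}; pts["B"] == {"C": 0.0, "D": 0.5}
```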
These latent influence scores represent, in essence, the appeal of each majority opinion as a potential citation in any subsequent majority opinion. What this appeal represents in substantive terms is not unambiguous, of course. It might proxy for the degree to which the opinion is easily understood, the degree to which its conclusions are broadly applicable,Footnote 10 or perhaps the likelihood that the policy implications of the opinion support policies that are supported by a majority of justices in a typical opinion. Obviously, further study is necessary before offering a conclusion on the micro-level foundations of these scores. Such research will require inclusion of observed and estimated covariates distinguishing the various opinions and majority opinions.
3 Results
We now present the results of three related analyses. We first present our results for the 100 most-cited opinions rendered between 1946 and 2002.Footnote 11 Following that, we present the results for the 100 most-cited opinions since 1800.Footnote 12 Finally, we consider the 204 most-cited opinions since 1800 with an eye toward comparing the ranking of the 100 most-cited opinions since 1946 with the ranking of those cases when all opinions that have been cited at least as many times as these 100 are considered.
3.1 Top 100 Opinions Since 1946
Table 2 presents the opinions with the top 36 estimated latent quality scores for this period. This is the set of opinions for which the estimated quality score is greater than 1, which is by construction the average estimated quality score for the 100 cases.
This ranking is interesting in a number of ways. The top two majority opinions score significantly higher than all of the others.Footnote 13 The top-scoring opinion, Chevron, is a well-known case in administrative law with broad implications for the judicial review of bureaucratic decision-making. The second-ranked opinion, Gregg, clarified the constitutionality of the death penalty in the United States. Of course, the third highest scoring opinion is the famous Miranda decision in which the Court clarified the procedural rights of detained individuals.
Space prevents us from a full-throated treatment of the scores, but a few simple correlations are of interest. Table 1 presents three Pearson correlation coefficients relating the opinions’ scores with, respectively, the age of the opinion, the number of subsequent opinions citing the opinion, and the number of subsequent opinions citing the opinion divided by the age of the opinion.
The negative correlation between the age of an opinion and its score is broadly in line with previous work on the depreciation of the precedential value (or, at least, usage) of judicial opinions.Footnote 14 It is important to note, however, that this effect is potentially at odds with the IIA axiom on which the scoring algorithm is based. We partially return to this question below when we expand the sample of opinions.
That the correlation between the opinions’ scores and the number of times each opinion has been cited by a subsequent Supreme Court majority opinion is positive is not surprising: the score of an opinion is obviously positively responsive to the number of times that an opinion has been cited, ceteris paribus. Accordingly, the interesting aspect of the correlation is not that it is positive but, rather, that it is not closer to 1. Indeed, inspection of Table 2 indicates, a fortiori, that the rankings of the opinions with respect to the number of citations they have received and with respect to their scores are not identical. Put another way: the scores are measuring something different than the opinions’ citation counts or, as it is commonly known in network analysis, the degree centralities of the opinions in the citation network.
Finally, the correlation between the score and the average number of times per year the opinion has been cited since it was handed down is strongly positive. This highlights the fact that the scores control for the fact that an opinion cannot cite an opinion that is rendered subsequently. Again, though, it is important to note that the ranking of the opinions generated by our scores differs from that generated by the number of citations per year. It is useful to consider the origins of this difference. Specifically, the distinction arises because of the fact that the IIA axiom on which the method is based implies that an opinion’s “reward” (or score) for being cited by a subsequent opinion is inversely proportional to the number of other opinions cited by that opinion. At the extreme, for example, a hypothetical opinion that cited every previous opinion would compress the scores of the opinions in the sense that the scores of all opinions that initially had lower than average scores would increase as a result of the citation by the hypothetical opinion, whereas the scores of all of those opinions with above average scores prior to the hypothetical opinion would decrease.Footnote 15
3.2 Top 100 Opinions Since 1800
We now present our results for the top 100 most-cited opinions rendered between 1800 and 2002. Table 3 presents the opinions with the top 38 estimated latent quality scores for this period. As with the previous analysis for the period between 1946 and 2002, this is the set of opinions for which the estimated quality score is greater than 1.
Comparing these scores with those in Table 2, it is perhaps surprising how similar the two sets of scores are. In particular, the top three majority opinions are identical and have very similar scores in the two analyses. Things get interesting at the fourth highest-scoring position. First, the majority opinion ranked fourth-highest in the 1946–2002 analysis reported in Table 2, Cannon v. University of Chicago, is not among the top 100 most-cited majority opinions since 1800.Footnote 16 The fourth highest-scoring opinion among the 100 most-cited majority opinions since 1800 is Miller v. California, in which the Court affirmed and clarified the power of state and local governments to place limits on obscenity. This opinion is, of course, among the top 100 most-cited opinions rendered since 1946, yet ranks only 19th in the scores reported in Table 2. This point highlights a feature of the scores in both tables: after the top 3 or 4, there is a relatively large “plateau” of scores.
Beyond visual inspection, it is useful to reconsider the correlations analogous to those reported in Table 1. These are displayed in Table 4 and closely conform to the conclusions drawn in the discussion of the correlations reported in Table 1: older opinions tend to have lower scores, and scores are positively associated with both number of subsequent citations as well as the average annual rate of subsequent citation.
3.3 Probing IIA: Top 204 Opinions Since 1800
We calculated the scores for the top 204 most-cited majority opinions since 1800. This is the smallest set of most-cited opinions for the entire time period that contains the top 100 most-cited opinions rendered since 1946. Each opinion rendered after 1946 is accompanied by two scores and two ranks: the “Post ’46” values are identical to those reported in Table 2, and the “Full” values, presented in Table 6, correspond to the rank of that opinion’s score from the analysis of the 204 most-cited opinions since 1800, taken relative to the analogous scores for the opinions rendered after 1946. The IIA axiom underpinning the scoring method implies that the relative ranking of the opinions should be invariant to the inclusion of additional opinions, as in the scoring of the 204 most-cited opinions. Inspection indicates a strong similarity between the two rankings. Most telling are the two correlations reported in Table 5: between the (relative) ranks of the 100 post-1946 opinions in the two samples, and between the scores of these cases in the two samples.
Each of these correlations indicates very strong agreement between the (relative) ranks and scores, respectively, for the top 100 most-cited opinions since 1946. This agreement provides support for the IIA assumption that identifies the method.
4 Conclusion
In this chapter we score all Supreme Court majority opinions since 1800 on the basis of their “quality” (measured as influence or citability), using network citation data. In placing all such opinions on a common scale we are faced with the problem that majority opinions cite heterogeneous numbers of other opinions and that an opinion cannot be cited by a different opinion that predates it—our network is necessarily incomplete. To deal with the incomplete nature of our data we utilize an axiomatic scoring method that is designed to compare objects that have never been directly compared in the data.
The scores calculated by this method are analogous to measures of network influence—specifically, the score is a vertex-level metric. As such, it fundamentally differs from other centrality measures for partially connected networks, such as eigenvector centrality and degree centrality. One difference is that our measure does not utilize the score of s in computing the contribution of link (s,v) to v’s score (as in eigenvector centrality); instead our score utilizes the scores of the other vertices w that could have potentially influenced s, or \(\{w: (s, w)\in \tilde{E}\}\). In generating estimates of the x i from observed network and community data we impute “influence relationships” between vertices that did not have the potential to interact. This leads to the following interpretation of our scores: if there were a hypothetical vertex with a community equal to the set of all possible vertices, then our scores would represent the expected influence of each vertex on that hypothetical vertex.
The analysis presented in this chapter is preliminary, with an obvious shortcoming being the fact that we assume that the community of a case i, or collection of cases that could potentially influence i, consists of all of the cases that predate it. In future work we intend to allow community structure to be determined not only by the year in which a case was considered but also by the topic of the case. Additionally, we hope to apply our scoring method to other types of incomplete network data as we believe it provides a useful new measure of node centrality that generalizes the concept of in-degree centrality.
Notes
- 1.
- 2.
The word “quality” is simply a placeholder, though one that is roughly descriptive (at least in common parlance) of the characteristic that our method is estimating. While one might be precise and use a term such as “citability,” we note the traditional issues of scope and space constraints and, setting this larger issue to the side, default to the use of a real word to refer to the latent construct our method is attempting to detect and estimate.
- 3.
In general network settings, we interpret a connection from v to w as implying that w “influences” or “is greater than” v. What is key for our purposes is that the notion of influence be conceptually tied to the notion of quality, as we have discussed earlier.
- 4.
For reasons of space, we refer the interested reader to Schnakenberg and Penn (2012) for more details on the method.
- 5.
- 6.
In addition, there are many interesting theoretical and empirical questions regarding how one should conceive of the relationships among opinions (e.g., Bommarito et al. (2009)) that the data we employ here do not allow us to explore more fully.
- 7.
- 8.
We are not aware of any recent work that has differentiated citations by the number of times the citation occurs in the citing opinion.
- 9.
Note that, for simplicity, we approximate this “later than” relation in the sense that we presume (unrealistically) that, in any year, the Court cannot cite one opinion that is decided in that year in another opinion that is decided in that same year. Given the number of years that we consider, this approximation affects a very small proportion of the number of potential citations we consider.
- 10.
Note that this is true despite the presumption that an opinion might have been feasible only in a subset of observed and subsequent majority opinions.
- 11.
This time period includes all cases in the Fowler and Jeon data for which Spaeth’s rich descriptive data (Spaeth 2012) are also available.
- 12.
This time period includes all cases in the Fowler and Jeon data.
- 13.
Note that the estimated scores for the top 100 opinions sum to 100, so these two opinions account for over 1/8th of the sum of the estimated scores. In other words, any opinion that cites exactly one of these 100 cases is predicted to cite either Chevron or Gregg almost 13 % of the time.
- 14.
See, for example, Black and Spriggs II (2010).
- 15.
Recall that the scores are identified only up to multiplication by a positive scalar, implying that they are inherently relative scores.
- 16.
In that case, the majority opinion affirmed an individual’s right to sue recipients of federal financial support for gender discrimination under Title IX, which calls for gender equity in higher education.
References
Black RC, Spriggs JF II (2010) The depreciation of US Supreme Court precedent. Working paper, Washington University in Saint Louis
Bommarito MJ II, Katz D, Zelner J (2009) Law as a seamless web? Comparison of various network representations of the United States Supreme Court corpus (1791–2005). In: Proceedings of the 12th international conference on artificial intelligence and law, ICAIL’09. ACM, New York, pp 234–235
Carrubba C, Friedman B, Martin AD, Vanberg G (2011) Who controls the content of Supreme Court opinions? Am J Polit Sci 56(2):400–412
Clark TS, Lauderdale B (2010) Locating Supreme Court opinions in doctrine space. Am J Polit Sci 54(4):871–890
Fowler JH, Heaney MT, Nickerson DW, Padgett JF, Sinclair B (2011) Causality in political networks. Am Polit Res 39(2):437–480
Fowler JH, Jeon S (2008) The authority of Supreme Court precedent. Soc Netw 30(1):16–30
Fowler JH, Johnson TR, Spriggs JF II, Jeon S, Wahlbeck PJ (2007) Network analysis and the law: measuring the legal importance of Supreme Court precedents. Polit Anal 15(3):324–346
Gerhardt MJ (2008) The power of precedent. Oxford University Press, New York
Hansford TG, Spriggs JF II (2006) The politics of precedent on the US Supreme Court. Princeton University Press, Princeton
Landes WM, Posner RA (1976) Legal precedent: a theoretical and empirical analysis. J Law Econ 19(2):249–307
Lazer D (2011) Networks in political science: back to the future. PS Polit Sci Polit 44(1):61
Luce RD (1958) Individual choice behavior. John Wiley, New York
Merryman JH (1954) The authority of authority: what the California Supreme Court cited in 1950. Stanford Law Rev 6(4):613–673
Schnakenberg K, Penn EM (2012) Scoring from contests. Working paper, Washington University in Saint Louis
Spaeth HJ (2012) The United States Supreme Court database. Center for Empirical Research in the Law, Washington University in Saint Louis
Spriggs J II, Hansford T, Stenger A (2011) The information dynamics of vertical stare decisis. Working paper, Washington University in Saint Louis
Ward MD, Stovel K, Sacks A (2011) Network analysis and political science. Annu Rev Pol Sci 14:245–264
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Patty, J.W., Penn, E.M., Schnakenberg, K.E. (2013). Measuring the Latent Quality of Precedent: Scoring Vertices in a Network. In: Schofield, N., Caballero, G., Kselman, D. (eds) Advances in Political Economy. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35239-3_12
Print ISBN: 978-3-642-35238-6
Online ISBN: 978-3-642-35239-3