Introduction

Intra- and interassemblage comparisons of taxon presence and abundance are standard undertakings in zooarchaeology. Researchers regularly assess temporal and/or spatial variation in these attributes as a starting point to address questions of diet and prey selection, anthropogenic environmental impacts, technological change, taphonomy, and domestication, among many others (e.g., Bar-Oz and Dayan 2003; Carder and Crock 2012; Conolly et al. 2011; Grayson 1991; Grayson and Delpech 2002; Newsom and Wing 2004; Nims and Butler 2019; Whitaker and Byrd 2012; Wing and Wing 2001). More recently, meta-analyses have used large-scale comparisons of sites across extensive space to investigate regional patterning in relation to these questions (e.g., Conolly et al. 2011; Jones 2018; McKechnie and Moss 2016; McKechnie et al. 2014; Nims and Butler 2019; Orton et al. 2014). Comparisons of zooarchaeological assemblages, whether between sites or different contexts within a site, employ various measures to evaluate and quantify taxonomic variation and establish the cause of observed changes in assemblage taxa and abundances. Among the most common are comparisons of measures of richness, diversity, and evenness, but other methods include ubiquity, nestedness, and principal components analyses. These approaches differ in their data requirements, ease of application, and appropriateness to the questions at hand.

Less frequently used are measures of similarity—similarity indices—derived from ecology (Lyman 2008:185; but see Belmaker 2017; Carder and Crock 2012; Giovas 2018a; Jones 2016; Lyman 2014; Reitz and Wing 2008; Walsh 2015). A similarity index, or similarity coefficient, is a quantitative measure of how similar the constituents of two samples are in their composition (the inverse measure is dissimilarity) (Jones 2016:87; Jost et al. 2011; Magurran 1988:94-95). As Lyman (2008:185) discusses in relation to similarity, two zooarchaeological assemblages may share the same taxonomic richness (NTAXA) but exhibit variation in the specific taxa represented. Moreover, taxa in these assemblages may vary in their abundance in meaningful ways, even when the same taxa are present. Several dozen indices for measuring similarity exist in the ecological literature, varying in their calculation, the sample properties emphasized, and their suitability in a given context (Alroy 2015a; Chao et al. 2005; Jost et al. 2011; Magurran 2004; Todeschini et al. 2012; Wolda 1981). These indices can be divided into two basic types: those based on incidence, i.e., the presence or absence of a taxon, also known as binary similarity coefficients; and indices which incorporate taxonomic abundance in their calculation (Krebs 2014; Magurran 1988, 2004).

Given their potential usefulness and general ease of implementation, it is surprising that similarity indices are not more widely applied in zooarchaeological comparisons. Here, I review several of the more commonly used incidence-based and abundance-based indices, comparing their relative strengths and weaknesses using hypothetical assemblages. A simple method using paired similarity indices to evaluate correspondence in the taxa present and taxon-specific abundances in zooarchaeological assemblages is proposed. This approach is demonstrated through two case studies from the Caribbean archaeological sites of Sabazan (Carriacou), Sandy Ground (Anguilla), and Crève Cœur (Martinique) applying the corrected Forbes and Morisita-Horn indices and a proposed classification scheme for interpreting results. I conclude by discussing the relative advantages and limitations of the approach.

Incidence-Based Similarity Indices

Similarity is a multi-faceted concept but in its simplest, incidence-based form may be conceived as the degree to which taxa identified in one assemblage also occur in a second assemblage. Accordingly, similarity is the amount of taxonomic overlap between two assemblages where the total number of distinct (mutually exclusive) taxa for the combined assemblages represents the maximum number of taxa that can be potentially shared. Here, I use “assemblage” to refer to an archaeological sample from a larger, uncensused archaeological population of animals exploited by past humans for which the population parameters are unknown. “Taxa” and “taxon” refer to categories which have been assigned a formal Linnaean classification and are potentially quantifiable by number of identified specimens (NISP), minimum number of individuals (MNI), biomass, or another quantitative unit. The concept of compositional correspondence between assemblages is illustrated in Table 1. Sixty taxa (NTAXA = 60) are recorded in each of two hypothetical assemblages, HA1 and HA2, across variable scenarios of taxonomic overlap. In scenarios 1–13, the number of taxa co-occurring in the two assemblages increases from 0 to 60, while the total number of distinct taxa from the combined assemblages decreases from 120 to 60. NTAXA for each assemblage has been held constant to illustrate different proportions of taxonomic correspondence between assemblages of equivalent richness. Scenarios 14 and 15 illustrate the same proportions of shared to unshared taxa as scenarios 3 and 7, but with differing assemblage evenness and absolute sample richness.

Table 1 The hypothetical assemblages with 15 scenarios of taxonomic overlap and associated Simpson, corrected Forbes, and Morisita-Horn similarity statistics (S). Scenarios in bold are extrapolated into datasets with taxonomic relative abundance in Fig. 1 and Online Resource 1

There are more than 50 incidence-based indices (Todeschini et al. 2012) that could potentially be applied to evaluate correspondence between the sets of assemblages in Table 1, but two of the most long-standing and commonly employed are the Jaccard (1900) index and Sørensen (1948) index, the latter also known as the Dice (1945) coefficient. The Jaccard index represents the proportion of shared taxa to the total number of distinct taxa in two combined assemblages, while the Sørensen index represents the number of shared taxa in relation to the mean number of species in a single assemblage (Jost et al. 2011). Neither index considers joint absences of unseen taxa, that is, those taxa present in the archaeological population but not captured in sampling (Anderson et al. 2011). Both indices are scaled from 0 to 1, where 1 indicates identical taxonomic composition. While the Sørensen index is often considered among the more robust of the incidence-based similarity measures (Alroy 2015b; Barwell et al. 2015; Smith 1986 in Magurran 1988:95-96; Magurran 2004), it and the Jaccard index are unduly sensitive to sample size effects, tending to underestimate similarity (Alroy 2015b; Chao et al. 2005; Patterson et al. 2014). In cases where sampling is uneven between assemblages or there is significant inclusion of rare taxa, both indices underestimate similarity (Alroy 2015a; Chao et al. 2005). This is because less common taxa are unlikely to be represented in both assemblages when the sample size of one or both is low. Since assemblages are an incomplete sample of an unknown population, when assemblage richness is unequal, rarer taxa present in the larger assemblage may be missed by the smaller, limiting the potential number of shared taxa used to estimate similarity. For this reason, the Jaccard and Sørensen indices are less effective when assemblages are well nested.
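As a minimal sketch of the two definitions above, the following Python computes both indices from sets of taxon names. The function names and the example taxon lists are illustrative assumptions, not data from any assemblage discussed here, and the functions assume that neither assemblage is empty.

```python
def jaccard(taxa_1, taxa_2):
    """Shared taxa divided by all distinct taxa in the combined assemblages."""
    taxa_1, taxa_2 = set(taxa_1), set(taxa_2)
    return len(taxa_1 & taxa_2) / len(taxa_1 | taxa_2)


def sorensen(taxa_1, taxa_2):
    """Shared taxa divided by the mean richness of the two assemblages."""
    taxa_1, taxa_2 = set(taxa_1), set(taxa_2)
    mean_richness = (len(taxa_1) + len(taxa_2)) / 2
    return len(taxa_1 & taxa_2) / mean_richness


# Hypothetical family-level taxon lists (illustrative only).
ha1 = {"Scaridae", "Serranidae", "Carangidae", "Scombridae"}
ha2 = {"Scaridae", "Serranidae", "Lutjanidae", "Acanthuridae"}

print(round(jaccard(ha1, ha2), 3))   # 2 shared / 6 distinct = 0.333
print(round(sorensen(ha1, ha2), 3))  # 2 shared / mean richness of 4 = 0.5
```

Because both functions see only the taxa that were actually recorded, any rare taxon missed in the smaller sample lowers the result, which is the negative bias described above.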

To counter this effect, Simpson (1943) developed a similarity measure that assesses the number of shared taxa against the richness of the smaller assemblage only (not to be confused with the Simpson diversity index). The Simpson index is calculated as:

$$ {S}_{\mathrm{Sim}}=\frac{a}{\min \left\{\left(a+b\right),\left(a+c\right)\right\}} $$

where:

a: the number of taxa occurring in both assemblages

b: the number of taxa occurring only in Assemblage 1

c: the number of taxa occurring only in Assemblage 2
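The Simpson calculation can be sketched in a few lines of Python. The function names and the example counts are assumptions made for illustration; the Sørensen index is written here in its usual a, b, c form, 2a / (2a + b + c), so the two can be compared directly.

```python
def simpson_similarity(a, b, c):
    """Simpson (1943) similarity: shared taxa over the richness of the smaller assemblage."""
    return a / min(a + b, a + c)


def sorensen_abc(a, b, c):
    """Sorensen index expressed with the same a, b, c counts."""
    return 2 * a / (2 * a + b + c)


# Equal richness (b == c): the two indices agree.
print(simpson_similarity(30, 30, 30), sorensen_abc(30, 30, 30))  # 0.5 0.5

# Unequal richness (hypothetical counts): Simpson is scaled to the smaller assemblage only.
print(round(simpson_similarity(10, 5, 50), 3))  # 10 / 15 = 0.667
print(round(sorensen_abc(10, 5, 50), 3))        # 20 / 75 = 0.267
```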

When b = c, the Simpson index delivers the same results as the Sørensen. In recent simulation studies, Alroy (2015a, b) showed that the Simpson index is outperformed by a modified version of the Forbes (1907) index. Alroy’s (2015a, b) corrected metric excludes the joint-absences that are part of Forbes’ original calculation, following the logic that the number of joint-absences is seldom known in ecological or paleontological contexts, a situation which also characterizes archaeology. In fact, substantial debate exists in the ecological literature over whether joint-absences constitute a biologically meaningful value (Anderson et al. 2011; Janson and Vegelius 1981; Krebs 2014). In addition to this adjustment, Alroy (2015a) applied three constants to the Forbes index to temper its bias under sample size effects. Following the same notation in Simpson’s equation above, the corrected Forbes index, denoted F′, is expressed:

$$ {S}_{F\prime }=\frac{a\left(n+\sqrt{n}\right)}{\left(a+b\right)\left(a+c\right)+a\sqrt{n}+\frac{1}{2}\left(b\times c\right)} $$

where n = (a + b + c)
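A direct transcription of this expression into Python is given below as a sketch. The function name is an assumption, and the example counts (a = b = c = 30) represent two assemblages of 60 taxa in which half the taxa are shared.

```python
from math import sqrt


def corrected_forbes(a, b, c):
    """Corrected Forbes index F' in the form given above, with n = a + b + c."""
    n = a + b + c
    numerator = a * (n + sqrt(n))
    denominator = (a + b) * (a + c) + a * sqrt(n) + 0.5 * b * c
    return numerator / denominator


print(round(corrected_forbes(30, 30, 30), 3))  # half of 60 taxa shared: 0.689
print(corrected_forbes(0, 60, 60))             # no shared taxa: 0.0
print(round(corrected_forbes(60, 0, 0), 3))    # all taxa shared: 1.0
```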

Alroy (2015a) reports that F′ is not correlated with the number of taxa in the smaller assemblage. Simpson and F′ statistics are provided for the 15 scenarios of taxonomic correspondence in Table 1. As with the Jaccard and Sørensen indices, the Simpson and F′ indices are scaled from 0 to 1. Scenario 7 depicts a situation of 50% similarity where half the taxa in each assemblage are shared, or a = b = c. Here, the Simpson index returns a value of 0.5, while the corrected Forbes index is 0.689. When sample size is large, the F′ correction settles the statistic down to 2/3 (0.667) (Alroy 2015a). Scenario 15 in Table 1 presents such a case with, admittedly, unrealistic species richness to illustrate this effect. In other words, the corrected Forbes index is sensitive to the absolute number of taxa, even when the proportion of shared to unique taxa remains unchanged between two sets of taxa. Treating the case of 50% similarity as the point between “less similar” and “more similar” assemblages, the Simpson index provides an intuitively appealing metric of 0.5 to mark this threshold. However, use of F′ and a threshold of 0.667 is probably more appropriate in most archaeological contexts because sampling of the record is generally incomplete. Alroy (2015a) found that the corrected Forbes index did underestimate similarity when sampling is limited relative to the size of the taxonomic pool but still performed better than the Sørensen and Simpson indices under these conditions.
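The large-sample value of 2/3 can be checked from the expression for F′ itself. The short derivation below is a sketch that assumes the 50% case a = b = c = k (so n = 3k); the symbol k is introduced here for illustration only. Since (a + b)(a + c) = an + bc, the expression can be rewritten as:

$$ {S}_{F\prime }=\frac{a\left(n+\sqrt{n}\right)}{a\left(n+\sqrt{n}\right)+\frac{3}{2}\left(b\times c\right)} $$

Substituting a = b = c = k and n = 3k:

$$ {S}_{F\prime }=\frac{k\left(3k+\sqrt{3k}\right)}{k\left(3k+\sqrt{3k}\right)+\frac{3}{2}{k}^2}=\frac{3k+\sqrt{3k}}{\frac{9}{2}k+\sqrt{3k}}\to \frac{2}{3}\kern0.5em \mathrm{as}\kern0.5em k\to \infty $$

For k = 30 this gives (90 + 9.487)/(135 + 9.487) ≈ 0.689, the value reported for scenario 7.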

Both the corrected Forbes and Simpson indices equal 1 when all taxa are shared in common (scenario 13), but also yield 1 when the smaller assemblage consists entirely of taxa also found in the larger (scenario 14). Thus when b = 0 or c = 0, both indices signal complete similarity, reflecting the nestedness of the assemblages. By contrast, in the same scenario, the Jaccard and Sørensen statistics are 0.091 and 0.167, respectively, reflecting the negative bias discussed above. There are archaeological circumstances where the sensitivity of the Simpson and F′ indices to nestedness might not be desirable, for instance, when the limited richness of one assemblage reflects something meaningful about past human behavior that a researcher wishes to measure (see the related discussion by Jost et al. 2011 involving the same issue with Lennon et al.’s (2001) conditional Sørensen index). A good example of such circumstances can be found in zooarchaeological studies of status-mediated consumption. Here, we might expect the resources consumed by one social group to (by intent) be a subset of those consumed by another, with the taxonomic differences between assemblages reflecting privileged, wider access to resources (e.g., Kirch and O'Day 2003). In such cases, if sampling is robust, the Sørensen index might perform better precisely because it is sensitive to assemblage unevenness. I suggest, however, the Simpson and corrected Forbes indices are generally more appropriate for most contexts, and of the two, the latter is preferred based on simulations and empirical studies (Alroy 2015a).
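The contrasting behavior of the four indices under nestedness can be illustrated with a small Python sketch. The counts below (a = 6, b = 0, c = 60) are hypothetical; they were chosen only to give a fully nested pair with a 1:10 ratio of shared to unshared taxa and are not the Table 1 values. The function name is likewise an assumption.

```python
from math import sqrt


def incidence_indices(a, b, c):
    """Return Jaccard, Sorensen, Simpson, and corrected Forbes values for counts a, b, c."""
    n = a + b + c
    return {
        "jaccard": a / n,
        "sorensen": 2 * a / (2 * a + b + c),
        "simpson": a / min(a + b, a + c),
        "corrected_forbes": a * (n + sqrt(n))
        / ((a + b) * (a + c) + a * sqrt(n) + 0.5 * b * c),
    }


# Hypothetical nested pair: every taxon in the smaller assemblage also occurs in the larger.
nested = incidence_indices(a=6, b=0, c=60)
for name, value in nested.items():
    print(f"{name}: {value:.3f}")
# jaccard: 0.091, sorensen: 0.167, simpson: 1.000, corrected_forbes: 1.000
```

With b = 0 the product b × c vanishes, so the numerator and denominator of F′ coincide and the index returns 1 regardless of how many additional taxa the larger assemblage holds.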

Abundance-Based Similarity Indices

Even in cases of absolute taxonomic correspondence—where all taxa found in the first assemblage are found in the second and vice versa (i.e., scenario 13 in Table 1)—the relative abundance of taxa can differ between assemblages. Similarity may thus also describe the degree to which corresponding taxa occur at comparable levels of proportional abundance, that is, whether the same set of prey are exploited at similar intensities. As with incidence indices, numerous abundance indices exist to quantify this aspect of similarity (Anderson et al. 2011; Jost et al. 2011). Two of the most widely used are the Bray-Curtis (Bray and Curtis 1957) index, also known as the Sørensen quantitative index, and the Morisita-Horn (Horn 1966; Morisita 1962) index. While the former is popular and has been advocated in a number of publications based on its practical properties (e.g., Clarke and Warwick 2001; Faith et al. 1987), the Bray-Curtis index has more recently been critiqued for several conceptual and applied shortfalls, including vulnerability to sample size effects and its poor performance except when sampling fractions are equal (Chao et al. 2006; Jost et al. 2011; Krebs 2014; Wolda 1981). Samples taken from disparate geographic locations or time periods are vulnerable to this effect (Anderson et al. 2011), as are samples with few identified specimens since NTAXA is positively correlated with sample size (Grayson 1984; Lyman 2008).

The Morisita-Horn index, in contrast, has been found generally to perform well; it is less sensitive than other quantitative indices to differences in taxonomic richness and sample size (Barwell et al. 2015; Chao et al. 2006; Wolda 1981), although it is not without limitations, as discussed below. The index is calculated (Magurran 1988:95-96, 2004:174-175) as:

$$ {S}_{M-H}=\frac{2\sum \left(a{n}_i\times b{n}_i\right)}{\left( da+ db\right) aN\times bN} $$

where

$$ da=\frac{\sum a{n}_i^2}{a{N}^2} $$

and

$$ db=\frac{\sum b{n}_i^2}{b{N}^2} $$

aN: the number of individuals in Assemblage A

bN: the number of individuals in Assemblage B

an_i: the number of individuals in the ith taxon in Assemblage A

bn_i: the number of individuals in the ith taxon in Assemblage B
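The calculation defined above can be sketched in Python as follows. The function name, the use of dictionaries keyed by taxon, and the example counts are assumptions for illustration; a taxon absent from one assemblage is treated as having a count of zero there.

```python
def morisita_horn(counts_a, counts_b):
    """Morisita-Horn similarity from two dicts mapping taxon -> count (e.g., NISP)."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    taxa = set(counts_a) | set(counts_b)
    # Sum of cross-products; taxa missing from one assemblage contribute zero.
    cross = sum(counts_a.get(t, 0) * counts_b.get(t, 0) for t in taxa)
    d_a = sum(n ** 2 for n in counts_a.values()) / total_a ** 2
    d_b = sum(n ** 2 for n in counts_b.values()) / total_b ** 2
    return 2 * cross / ((d_a + d_b) * total_a * total_b)


# Hypothetical counts: one shared taxon and one unique taxon per assemblage, equally abundant.
ha1 = {"taxon_1": 100, "taxon_2": 100}
ha2 = {"taxon_1": 100, "taxon_3": 100}

print(morisita_horn(ha1, ha2))  # 0.5
print(morisita_horn(ha1, ha1))  # 1.0
```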

The index returns a value of 0 for completely dissimilar assemblages and 1 for assemblages of absolute similarity. Figure 1 provides the Morisita-Horn statistics for scenarios 3, 4, 7, 11, and 12 from Table 1, calculated with varying taxonomic relative abundances based on NISP, although MNI could also be employed (NISP data for taxa appear in Online Resource 1). In scenarios 3 and 12, SM-H is low, 0.019 and 0.310 respectively, due to differences in the taxa present as well as significant disparities in the abundance of taxa that are common to both the HA1 and HA2 assemblages. In scenario 12, SM-H is low despite high taxonomic overlap (SSim = 0.917, SF′ = 0.991). By comparison, scenarios 4 and 11 exhibit high SM-H, 0.909 and 0.999 respectively, although the number of shared taxa in scenario 11 is more than three times that of scenario 4. In the case of 50% compositional similarity found in scenario 7, NISP is identical for each shared taxon in HA1 and HA2, with abundance equally distributed between shared and unique taxa (Fig. 1, Online Resource 1). Here, a Morisita-Horn statistic of 0.5 marks the threshold between similarity and dissimilarity.

Fig. 1 Five scenarios of (dis)similarity from hypothetical assemblages 1 and 2 in Table 1 illustrating varying taxonomic correspondence and proportional abundance between assemblages, with Simpson, corrected Forbes, and Morisita-Horn (M-H) index statistics provided. NISP for each hypothetical assemblage is 15,000 for all scenarios; see Online Resource 1 for NISP data

A well-recognized issue with the Morisita-Horn index is that it places greater emphasis on the abundances of the most prevalent taxa and is less sensitive to the presence of rare taxa (Krebs 2014; Magurran 2004). Arguably, this is a desirable property for zooarchaeological reconstructions of resource exploitation and diet because it reduces the “noise” of rare taxa that are likely incidental to the hunting and foodways under study. Additionally, it makes the index resistant to under-sampling of uncommon taxa (Jost et al. 2011). Logarithmic or square root transformations can be used to reduce the influence of dominant taxa, however, if warranted by the research question (Clarke and Warwick 2001; Wolda 1981). Alternatively, other indices may be appropriate. Chao et al. (2005, 2006) critique the Morisita-Horn index for underestimating similarity when sample size is small and for its sensitivity to the most abundant taxa even where sample size is adequate. They derive adjusted, abundance-based Jaccard and Sørensen indices and offer these in place of the Morisita-Horn and similar indices, showing how the former outperform the latter in simulation and practical analyses (Chao et al. 2005, 2006). The adjusted Jaccard and Sørensen indices, however, consider the effect of unseen shared taxa, which as noted above, may not always be appropriate. Anderson et al. (2011) suggest such joint-absences may be suitable to incorporate into metrics where the research question is concerned with phenomena that cause changes in total abundances instead of proportional abundances. In light of this consideration and the fact that it reduces the influence of incidental taxa, the Morisita-Horn index is employed in the practical examples below. However, the choice of similarity index, whether incidence- or quantitative-based, is one that should be made actively by the researcher with attention to the relative advantages and disadvantages as these relate to the research question under consideration.
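The effect of a square root transformation on the Morisita-Horn statistic can be sketched as follows. The counts are hypothetical and the helper names are assumptions; the transformation is applied to the raw counts before the index is computed, which is one plausible reading of the approach cited above rather than a prescribed procedure.

```python
from math import sqrt


def morisita_horn(counts_a, counts_b):
    """Morisita-Horn similarity from two dicts mapping taxon -> abundance."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    taxa = set(counts_a) | set(counts_b)
    cross = sum(counts_a.get(t, 0) * counts_b.get(t, 0) for t in taxa)
    d_a = sum(n ** 2 for n in counts_a.values()) / total_a ** 2
    d_b = sum(n ** 2 for n in counts_b.values()) / total_b ** 2
    return 2 * cross / ((d_a + d_b) * total_a * total_b)


def sqrt_transform(counts):
    """Square-root transform each abundance to damp the dominant taxa."""
    return {taxon: sqrt(n) for taxon, n in counts.items()}


# Hypothetical assemblages sharing all taxa but dominated by different ones.
ha1 = {"taxon_1": 900, "taxon_2": 100, "taxon_3": 100}
ha2 = {"taxon_1": 100, "taxon_2": 100, "taxon_3": 900}

raw = morisita_horn(ha1, ha2)
transformed = morisita_horn(sqrt_transform(ha1), sqrt_transform(ha2))
print(round(raw, 3), round(transformed, 3))  # 0.229 0.636
```

In this example the transformed statistic is higher because the mismatch in the dominant taxon carries less weight. Whether that is desirable depends on the research question, as noted above.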

Application of Paired Incidence- and Abundance-Based Indices

Since taxonomic abundance can be incorporated into similarity quantifications using abundance-based indices, why use incidence-based indices at all? In ecology and other settings, incidence-based indices are appropriate when abundance data are unavailable or unreliable. While these shortfalls also occur in archaeology, the act of assigning a zooarchaeological specimen to a taxon is generally accompanied by quantification of NISP and/or MNI, and sometimes other abundance measures, such as biomass. This practice, then, would seemingly make incidence-based similarity metrics moot for archaeology. The two types of similarity metrics measure different properties, however. I suggest that the knowledge each provides can contribute added insight when the two are used together.

In ecology, similarity indices are often applied in evaluations of β diversity, generally defined as the change in taxa (turnover) among biotic communities along a habitat gradient at scales between the local area and broad region (Barwell et al. 2015; Whittaker 1972), with abundance indices more sensitive to the desired target properties for this assessment (Anderson et al. 2011; Barwell et al. 2015; Jost et al. 2011; Magurran 1988). (Zoo)archaeological applications, however, may extend to other questions where both incidence and quantitative measures of similarity may be of interest. For example, high taxonomic overlap, irrespective of abundance and particularly if reliable quantification is problematic, may signal the presence of similar faunal communities indicative of a comparable environment (Faith and Lyman 2019). It is also worth considering why and when we should expect the exploitation intensity of a given taxon in two assemblages to mirror each other. On one hand, there are cultural and theoretical (e.g., foraging theory) reasons for anticipating quantitative similarity between zooarchaeological assemblages, particularly if these come from the same local area. On the other hand, even in such cases, dissimilarity can arise due to differences in technology, resource access, site function, and specific socioeconomic constraints or opportunities. In other words, (dis)similarity can be driven by human behavior. Incidence- and abundance-based indices allow us to capture different aspects of this variation, tease them apart, and better understand them. I therefore propose combined application of the two index types as a simple means for evaluating the two fundamental dimensions of assemblage similarity and offer a classification framework for describing the interaction between these two indices. In the practical case studies presented in the following section, I use the corrected Forbes index to measure taxonomic correspondence because of its resistance to sample size effects and strong performance compared to other incidence-based indices (Alroy 2015a, b). This is combined with the Morisita-Horn index, selected for its general robustness and to avoid undue influence of rare taxa on comparisons of resource exploitation. Index selection also considers that for consistency both indices should either include or exclude joint-absences. This paired-index approach is in keeping with Anderson et al.’s (2011) recommended application of multiple similarity measures, each directed toward specific hypotheses, to investigate different properties of interest.

When used in concert, the corrected Forbes index indicates the degree of taxonomic correspondence between two assemblages, while the Morisita-Horn index reflects the extent to which taxa are exploited in similar proportional abundances, that is, at similar levels of exploitation intensity. Together they quantify the “breadth” and “depth” of faunal resource overlap. Figure 2 illustrates the intersection of these two measures and defines four potential conditions that may characterize a set of samples: qualitative similarity, where SF′ is high and SM-H is low, indicating exploitation of many of the same taxa but at different proportional abundances; quantitative similarity, where SF′ is low and SM-H is high, indicating the few matched taxa are also those most heavily exploited in both samples; substantive similarity, where both SF′ and SM-H are high, indicating an emphasis on the same taxa at comparable levels of exploitation; and dissimilarity, where both SF′ and SM-H are low, indicating the samples differ qualitatively and quantitatively from each other, that is, more unalike than alike. The thresholds for classifying SF′ and SM-H measurements as “similar” versus “dissimilar” are 0.667 and 0.500, respectively, following from the discussion above, but these could be shifted based on alternative criteria. The four classifications are a way to interpret and conceptualize index results and can help identify appropriate questions and tests that might be conducted to more fully understand the cause(s) of patterning. Chi-square tests and residual analysis, for example, could be applied to understand which specific taxa drive dissimilarity between samples due to significantly different abundances (Cannon 2001; Giovas 2013; Grayson and Delpech 2002; Lyman 2008).
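The four-way classification can be expressed as a small Python function. This is a sketch: the function and parameter names are assumptions, and treating a value exactly equal to a threshold as "high" is a choice made here for illustration, since the scheme does not specify how ties are handled.

```python
def classify_similarity(s_forbes, s_mh, forbes_threshold=0.667, mh_threshold=0.5):
    """Assign a pair of index values to one of the four similarity classes."""
    high_forbes = s_forbes >= forbes_threshold  # assumption: ties count as high
    high_mh = s_mh >= mh_threshold
    if high_forbes and high_mh:
        return "substantive similarity"
    if high_forbes:
        return "qualitative similarity"
    if high_mh:
        return "quantitative similarity"
    return "dissimilarity"


# Scenario 12 values reported in the text: high overlap, divergent abundances.
print(classify_similarity(0.991, 0.310))  # qualitative similarity

# Alternative thresholds can be supplied if other criteria are preferred.
print(classify_similarity(0.70, 0.55, forbes_threshold=0.75))  # quantitative similarity
```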

Fig. 2 Intersection of the corrected Forbes and Morisita-Horn indices classifying four scenarios of potential similarity and dissimilarity between faunal samples

The paired-index method proposed here offers several significant analytical and interpretive advantages: (1) it provides a means to precisely quantify the similarity between assemblages according to a meaningful, delimited scale; (2) it offers a straightforward approach for understanding how the dimensions of assemblage similarity interact based on a proposed classification scheme; and (3) it is simple to apply and does not require advanced software applications or technical skills (for example, the statistics in this paper were computed using a basic Microsoft Excel function). I suggest the methodological and interpretive accessibility of this method makes it a valuable first-order tool for comparing zooarchaeological assemblages.

An important consideration which must be addressed when applying similarity indices is that taxa shared between samples must first be rendered at equivalent levels of classification. Within a sample, this necessitates that richness (NTAXA) be based on non-overlapping taxonomic designations. Consider a scenario in which the HA1 assemblage includes specimens identified to white-tailed deer, Odocoileus virginianus, and the deer family, Cervidae, while HA2 contains specimens identified to Cervidae, but not to lower taxonomic designations within the family. Since it is possible that the Cervidae category includes unrecognized or non-diagnostic specimens of O. virginianus, the two deer taxa in HA1 are not mutually exclusive (i.e., independent) of each other, and cannot be counted as two distinct taxa against HA2’s one deer taxon. Thus, in order to calculate similarity and facilitate comparison between the assemblages at the appropriate taxonomic level, it is necessary to aggregate the O. virginianus and Cervidae specimens into a single taxon, Cervidae. While this procedure may reduce taxonomic resolution, it is the same standard analytic protocol required for calculating richness (NTAXA) and richness-derived measures, such as diversity and evenness, in order to avoid “overcounting” taxa, that is, counting the same taxon more than once (Giovas 2018b; Grayson 1991). For comparisons involving two assemblages only, the extent of aggregation may be limited, yielding minimal loss of taxonomic resolution. However, for cases of multiple pairwise comparisons (e.g., Walsh 2015) involving taxonomic groups with significantly varying levels of specificity (e.g., a mix of overlapping species, genus, and family designations within a given family), aggregating taxa into the same mutually exclusive designations across all samples could reduce analytic resolution considerably. Alternative analytic approaches are recommended in such cases.
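The aggregation step in the deer example can be sketched in Python. The NISP values, the additional taxon (Leporidae), and the names `aggregate_taxa` and `parent_of` are all hypothetical and serve only to show the bookkeeping.

```python
def aggregate_taxa(counts, parent_of):
    """Collapse overlapping taxa into their shared higher-level category."""
    aggregated = {}
    for taxon, n in counts.items():
        target = parent_of.get(taxon, taxon)  # taxa without a mapping are kept as-is
        aggregated[target] = aggregated.get(target, 0) + n
    return aggregated


# Hypothetical NISP values for the deer example.
ha1 = {"Odocoileus virginianus": 40, "Cervidae": 25, "Leporidae": 10}
ha2 = {"Cervidae": 30, "Leporidae": 12}
parent_of = {"Odocoileus virginianus": "Cervidae"}

ha1_aggregated = aggregate_taxa(ha1, parent_of)
ha2_aggregated = aggregate_taxa(ha2, parent_of)
print(ha1_aggregated)  # {'Cervidae': 65, 'Leporidae': 10}
print(ha2_aggregated)  # {'Cervidae': 30, 'Leporidae': 12}
```

After aggregation both assemblages use the same mutually exclusive categories, so the counts a, b, and c can be derived without overcounting the deer taxa.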

Paired-Index Case Studies

Two case studies from two pre-European contact sites and one post-contact site in the Lesser Antilles archipelago of the Caribbean are presented below: Sabazan, on the island of Carriacou (Grenada); Sandy Ground on Anguilla; and Crève Cœur on Martinique. Corrected Forbes and Morisita-Horn similarity assessments employ vertebrate assemblages quantified by NISP. Original data and associated analytic methods are published in Giovas (2013, 2016, 2018b) for Sabazan; Carder and Crock (2012) and Carder et al. (2007) for Sandy Ground; and Wallman (2018) and Wallman and Grouard (2017) for Crève Cœur. To allow for comparison, overlapping taxonomic designations and associated NISP counts were aggregated into discrete taxonomic categories at the next highest taxonomic level, up to and including family level categories (Table 2). Taxa recorded at levels above the family (e.g., Rodentia, Lacertilia, unidentified Aves, and medium mammal) were excluded from analysis. Species and genera with a confer (cf.) designation were assigned to the taxon in question when the latter was recorded elsewhere in the assemblage, but if not, the cf. designation was left standing. Because of the very high diversity among Caribbean marine fish (Grenada = 495 species, Anguilla = 432 species, Martinique = 445 species; Froese and Pauly 2019) and considerable variation in the degree of taxonomic information provided by different skeletal elements, it was necessary to aggregate all fish taxa into family-level categories. Reliance on family-level taxa is typical for comparative zooarchaeological studies involving tropical marine fish (Giovas et al. 2017; Newsom and Wing 2004). While aggregating specimens in this way lowers resolution and depresses NTAXA somewhat, it preserves larger sample sizes. The alternative for the case studies would have been to maintain genus- and species-level designations at the cost of excluding mutually inclusive family-level designations. Doing so would have reduced assemblage NISP by more than 50% in some instances and introduced far more bias since reductions would have been distributed disproportionately across taxa. All statistics were calculated in Microsoft Excel.
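For readers who prefer a scripted workflow to a spreadsheet, the paired calculation can be sketched in Python once taxa have been aggregated to mutually exclusive categories. The function name, the site labels, and the NISP values below are hypothetical and are not the Table 2 data; the published statistics were calculated in Excel.

```python
from math import sqrt


def paired_indices(nisp_1, nisp_2):
    """Return (corrected Forbes, Morisita-Horn) for two taxon -> NISP dicts."""
    taxa_1 = {t for t, n in nisp_1.items() if n > 0}
    taxa_2 = {t for t, n in nisp_2.items() if n > 0}
    a = len(taxa_1 & taxa_2)  # taxa in both assemblages
    b = len(taxa_1 - taxa_2)  # taxa only in assemblage 1
    c = len(taxa_2 - taxa_1)  # taxa only in assemblage 2
    n = a + b + c
    forbes = a * (n + sqrt(n)) / ((a + b) * (a + c) + a * sqrt(n) + 0.5 * b * c)

    total_1 = sum(nisp_1.values())
    total_2 = sum(nisp_2.values())
    cross = sum(nisp_1[t] * nisp_2[t] for t in taxa_1 & taxa_2)
    d_1 = sum(v ** 2 for v in nisp_1.values()) / total_1 ** 2
    d_2 = sum(v ** 2 for v in nisp_2.values()) / total_2 ** 2
    morisita_horn = 2 * cross / ((d_1 + d_2) * total_1 * total_2)
    return forbes, morisita_horn


# Hypothetical family-level NISP tables (illustrative only).
site_x = {"Scombridae": 200, "Scaridae": 300, "Serranidae": 100, "Carangidae": 50}
site_y = {"Scombridae": 250, "Scaridae": 150, "Serranidae": 120, "Cheloniidae": 80}

s_forbes, s_mh = paired_indices(site_x, site_y)
print(f"corrected Forbes = {s_forbes:.3f}, Morisita-Horn = {s_mh:.3f}")
```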

Table 2 Vertebrate zooarchaeological assemblages from Sabazan, Carriacou; Sandy Ground, Anguilla; and Crève Cœur, Martinique; based on data from Carder and Crock (2012), Carder et al. (2007), Giovas (2013, 2016, 2018b), Wallman (2018), and Wallman and Grouard (2017)

Case Study 1: Comparing Sustainable Resource Use at Sabazan, Carriacou and Sandy Ground, Anguilla

Sabazan is an Indigenous village site located on Carriacou’s Atlantic coast within the Grenada Bank (also known as the Grenadine Bank), a shallow, productive marine bank that extends ca. 4000 km² from Grenada through the Grenadines microarchipelago in the southernmost Lesser Antilles. Radiocarbon dates indicate continuous site occupation by forager-farmers AD 400–1400, consistent with the presence of Terminal Saladoid and Troumassoid ceramics (Fitzpatrick and Giovas 2011; Giovas 2018b). Zooarchaeological remains come from sediments excavated from stratified midden deposits dated to AD 600–1400 and screened through nested 6.4-mm (1/4-inch) and 1.6-mm (1/16-inch) sieves. Prior analysis (Giovas 2013, 2018b) revealed heavy reliance on marine resources with lesser exploitation of terrestrial vertebrates, mostly comprising introduced South American mammals such as the opossum (Didelphis sp.) and agouti (Dasyprocta sp.).

Sandy Ground is a large village site situated on the western Atlantic coast of Anguilla, northernmost of the Lesser Antillean islands. It exhibits continuous occupation from AD 650 to at least AD 1200/1500, making it roughly coeval with Sabazan. Culturally, the site belongs to the northern expression of the same Troumassoid culture area to which Sabazan belongs. Like Carriacou, Anguilla lies on a marine bank that would have provided Sandy Ground’s inhabitants with ready access to abundant marine resources (Carder et al. 2007). The Anguilla Bank is a ca. 3400 km² shallow submerged area that is among the most productive reef systems in the Caribbean region (Carder et al. 2007; Hoggarth 2001). Zooarchaeological remains come from stratified deposits screened through 6.4-mm (1/4-inch) and 3.2-mm (1/8-inch) sieves (Carder et al. 2007).

Sabazan and Sandy Ground are distinctive for being among only a handful of pre-contact sites with demonstrated evidence for long-term (ca. 800 years), sustainable fishing in the Lesser Antilles (see also Grouard 2001), standing in contrast to wider evidence for anthropogenic impacts on pre-contact Caribbean fisheries (Grouard et al. 2019; Newsom and Wing 2004; Wing and Wing 2001). Compelling correspondences between the two sites raise the question of whether the findings for sustainability at each might lie in similarities of prey choice and intensity of exploitation. Both sites occur on small islands (Carriacou, 34 km²; Anguilla, 91 km²) with exceptional limitations in freshwater and terrestrial vertebrate resources that are seemingly offset by access to nearshore fringing and patch reefs within productive fishing banks. Located at opposing ends of the Lesser Antilles, Sabazan and Sandy Ground also exhibit high exploitation (≥ 20% NISP) of members of the tuna and mackerel family (Scombridae), an interesting parallel given that this pattern is atypical for the region (see discussion in Giovas 2018b; Carder et al. 2007; Carder and Crock 2012; see also Bochaton et al. 2021). Notwithstanding, the two sites differ in their immediate environments (Giovas 2013; Carder et al. 2007). Sabazan is situated between steep volcanic hills that descend into a sheltered bay with a mixed rocky and sandy beach (Giovas 2013), while Sandy Ground lies on a strip of barrier beach proximate to a salt pond, swamps, and mangroves; the underlying geology is carbonate rock (Carder et al. 2007).

In this case study, the corrected Forbes and Morisita-Horn indices are applied to the Sabazan and Sandy Ground assemblages to explore the question: are zooarchaeological assemblages from Caribbean sites with evidence for sustainable vertebrate resource use similar in the types and abundances of taxa exploited? In a prior study, Carder and Crock (2012) compared the assemblage similarity of five Anguillan pre-contact sites with evidence for sustainable fishing using the Renkonen (1938) percentage similarity index in multiple pairwise comparisons. They found that similarity was higher between sites with similar environmental conditions. Given that Sabazan and Sandy Ground share a notable emphasis on scombrids but differ in their surrounding habitats, it is worth asking whether certain combinations of taxonomic composition and/or exploitation intensity are more likely to be sustainable in pre-contact Caribbean contexts. I explore this question using paired indices to illustrate this approach. However, truly comprehensive testing of this hypothesis requires region-wide investigation of multiple sites with evidence for both sustainable and unsustainable foraging, and is thus beyond the scope of this paper.

Corrected Forbes and Morisita-Horn statistics are provided in Table 3. Both are high, indicating substantive similarity between the assemblages. Although other potential causes need to be ruled out, this finding would be expected if sustainability was conditioned by exploiting a particular set of taxa at certain intensities. This result challenges the presumption that shared environmental characteristics necessarily underlie high assemblage similarity, instead supporting the notion that certain taxonomic compositions may have been more resilient to human predation due to species’ ecology and life history characteristics. SM-H (0.724) is lower than SF′ (0.907), however, indicating that although almost all taxa at Sandy Ground also occur at Sabazan, proportional abundances between shared taxa do vary and the assemblages are not identical. Differences of 5% NISP or more in shared taxa can be seen in sea turtles (Cheloniidae), parrotfish (Scaridae), surgeonfish (Acanthuridae), jacks and scads (Carangidae), and groupers and sea basses (Serranidae) (Table 2).

Table 3 Zooarchaeological assemblage data, corrected Forbes and Morisita-Horn similarity statistics, and similarity classifications for the Sabazan, Sandy Ground, and Crève Cœur sites
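As a minimal sketch of how the abundance-based statistic might be computed, the following assumes the standard proportional form of the Morisita-Horn index, 2Σ(p_i q_i) / (Σp_i² + Σq_i²), where p_i and q_i are the proportional abundances of taxon i in each assemblage. The function name, the dictionary layout, and the NISP counts are hypothetical and are not the values behind Table 3.

```python
from typing import Mapping


def morisita_horn(x: Mapping[str, float], y: Mapping[str, float]) -> float:
    """Morisita-Horn similarity between two taxon -> abundance tables (e.g., NISP)."""
    total_x = sum(x.values())
    total_y = sum(y.values())
    if total_x <= 0 or total_y <= 0:
        raise ValueError("both assemblages need a positive total abundance")

    taxa = set(x) | set(y)
    # Convert counts to proportions; taxa absent from a sample contribute zero.
    p = {t: x.get(t, 0) / total_x for t in taxa}
    q = {t: y.get(t, 0) / total_y for t in taxa}

    cross = sum(p[t] * q[t] for t in taxa)
    sum_sq = sum(v * v for v in p.values()) + sum(v * v for v in q.values())
    return 2 * cross / sum_sq


# Invented NISP counts by family, for illustration only.
site_a = {"Scombridae": 40, "Scaridae": 30, "Carangidae": 20, "Serranidae": 10}
site_b = {"Scombridae": 30, "Scaridae": 10, "Carangidae": 40, "Acanthuridae": 20}

print(round(morisita_horn(site_a, site_b), 3))  # 0.767
```

Because the cross-products of proportions drive the numerator, taxa that are abundant in both samples dominate the result, while taxa present in only one sample contribute nothing to it.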

This case study illustrates some of the possible concerns discussed above. For instance, SF′ may be inflated somewhat by the aggregation procedure since this might mask sub-family diversity, depressing NTAXA, especially for fish. Within the scombrids, for example, reported nominal taxa at Sandy Ground are Scomberomorus spp. and Euthynnus spp. (Carder et al. 2007), while at Sabazan these are Auxis sp., Thunnus sp., Katsuwonus pelamis, and Euthynnus alletteratus (Giovas 2013); this represents two and four distinct taxa, respectively, compared to the single taxon that is recorded when these taxa are aggregated at the family level. For both assemblages, however, more than 50% of scombrid specimens cannot be assigned to a species or genus, warranting the conservative aggregation approach used here, both to preserve sample size and to acknowledge that remains of undetected shared taxa may be included among specimens recorded only to family.
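The effect of aggregation on the incidence counts that feed a binary index can be sketched as follows. The example uses only the scombrid taxa named above, reduced to genus, and treats Euthynnus spp. and Euthynnus alletteratus as the same genus-level taxon; that simplification, along with the function and variable names, is an assumption made for illustration.

```python
def incidence_counts(taxa_1, taxa_2):
    """Return (a, b, c): shared taxa, taxa only in sample 1, taxa only in sample 2."""
    s1, s2 = set(taxa_1), set(taxa_2)
    return len(s1 & s2), len(s1 - s2), len(s2 - s1)


# Scombrid taxa reported for each site, reduced to genus (an assumption).
sandy_ground = ["Scomberomorus", "Euthynnus"]
sabazan = ["Auxis", "Thunnus", "Katsuwonus", "Euthynnus"]

# Genus -> family lookup; every genus listed here belongs to Scombridae.
family_of = {genus: "Scombridae" for genus in sandy_ground + sabazan}

genus_level = incidence_counts(sandy_ground, sabazan)
family_level = incidence_counts(
    (family_of[g] for g in sandy_ground),
    (family_of[g] for g in sabazan),
)

print(genus_level)   # (1, 1, 3): one shared genus, four unshared
print(family_level)  # (1, 0, 0): a single shared family, nothing unshared
```

At genus level, four of the five scombrid taxa are unshared; at family level, the same material registers as one fully shared taxon, which is the direction of inflation described above.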

Another consideration is that differences in screen size between the two sites may have influenced specimen recovery and resulting richness and relative abundances (Giovas 2018a; Gordon 1993; Nagaoka 1994), thereby influencing similarity metrics. In practice, this may be less of an issue since the majority of diagnostic skeletal material captured in Sabazan’s 1.6-mm screen represents what would have been captured in a 3.2-mm screen had this been used. Ultimately, concerns over aggregation and recovery methods are not unique to similarity indices, but instead affect quantitative zooarchaeological approaches in general, and as such may be addressed through a range of established techniques. The important point to note is that formally quantifying both types of similarity reveals an Indigenous emphasis on roughly the same set of taxa at close to the same levels of exploitation, which may be significant for understanding what made resource exploitation at these two sites sustainable over ca. 800 years. Further investigation would be beneficial to better understand the relationship between taxonomic composition, environmental characteristics, and sustainable exploitation in the region.

Case Study 2: Comparing Pre- and Post-European Contact Assemblages from Sabazan, Carriacou, and Crève Cœur, Martinique

The second case study compares Sabazan’s vertebrate assemblage to that of the Crève Cœur site, located on the southern peninsula of the mountainous volcanic island of Martinique (1128 km2) about midway along the Lesser Antillean chain. Habitation Crève Cœur is an 18th–19th century French colonial sugar plantation spanning the period before and following emancipation (1848) (Wallman 2018; Wallman and Grouard 2017). At its height, 200 or more enslaved peoples were held at this estate (Wallman 2018). Faunal remains come primarily from residential areas in the slave village and post-emancipation sharecropping quarters, along with a smaller component from the planter’s house, and reflect the “free Saturday” system in which enslaved peoples were obliged to make up shortfalls in estate provisioning through fishing, livestock rearing, and growing crops (Wallman 2018). This economic system continued into the sharecropping period following emancipation (Wallman 2018). The Crève Cœur zooarchaeological assemblage is thus a product of the dramatic socio-cultural, economic, and ecological transformation of the Lesser Antilles in the wake of European settlement. Zooarchaeological remains were recovered using 3.2-mm (1/8-inch) screens and flotation of 5-L and 10-L sediment samples collected on a per case basis (Wallman and Grouard 2017). Here, I employ paired corrected Forbes and Morisita-Horn indices to address the question: how (dis)similar is colonial era faunal resource exploitation, as represented by Crève Cœur, to pre-contact Indigenous patterns seen at Sabazan?

While SF′ (0.805) is high, SM-H (0.492) falls below the 50% similarity threshold, indicating the two assemblages are qualitatively similar—the Sabazan assemblage nests relatively well within Crève Cœur’s—but they differ in how abundances are distributed across taxa (Table 3). This is partially expected. The Caribbean colonial era saw the widespread introduction of European livestock and other Eurasian fauna (Borroto-Páez and Woods 2012), which appear in the Crève Cœur assemblage at low (< 10%) abundances and are, obviously, not found in earlier Indigenous sites (Table 2). However, because colonial economic systems compelled enslaved peoples to exploit the surrounding natural environment for subsistence, especially the sea, this manifests as a high number of shared taxa (24 of 40 distinct taxa), driving incidence-based similarity upward.

To better understand these patterns, I partitioned the Sabazan and Crève Cœur vertebrate assemblages into marine and terrestrial resource components, assigning sea turtles to terrestrial habitats as these were most likely acquired on land during nesting season when females come to shore to lay eggs. The resulting similarity statistics illuminate the socioeconomic processes of Martinique’s colonial period and provide robust quantifications for these (Table 3). The terrestrial assemblages are dissimilar (SF′ = 0.598, SM-H = 0.050), not only in the taxa present but especially in how abundances are distributed across taxa, due to the reliance on introduced mammals. In contrast, the marine assemblages exhibit substantive similarity. Here, high SF′ (0.894) and moderately high SM-H (0.605) statistics reflect exploitation of a majority of the same fish families at similar, although not identical, levels of intensity. These results indicate that, at least in this case, marine resources continued to be important to islander lifeways following European contact despite the introduction of domestic animals and wider-scale economic transformation. Although this exercise is exploratory and further investigation employing additional pre- and post-contact assemblages from Martinique and the wider Lesser Antilles is needed, it demonstrates how the paired-index approach can provide concrete metrics that quantify and characterize the processes of cultural transformation as seen in the zooarchaeological record.
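The step from a pair of index values to a similarity category can be sketched as a simple two-threshold rule. The abundance threshold of 0.5 follows the 50% threshold mentioned above. The incidence threshold of 0.6 is a placeholder chosen only because it reproduces the classifications reported in the text, and the label for the low-incidence, high-abundance cell is inferred from the category names; the classification matrix itself should be consulted for the actual criteria. The function name is hypothetical.

```python
def classify_similarity(incidence, abundance,
                        incidence_threshold, abundance_threshold=0.5):
    """Map a pair of similarity statistics onto one of four categories."""
    high_incidence = incidence >= incidence_threshold
    high_abundance = abundance >= abundance_threshold
    if high_incidence and high_abundance:
        return "substantive similarity"
    if high_incidence:
        return "qualitative similarity"
    if high_abundance:
        return "quantitative similarity"
    return "dissimilarity"


# (corrected Forbes, Morisita-Horn) values as reported in the case studies.
reported = {
    "Sabazan vs Sandy Ground": (0.907, 0.724),
    "Sabazan vs Crève Cœur": (0.805, 0.492),
    "Sabazan vs Crève Cœur, terrestrial": (0.598, 0.050),
    "Sabazan vs Crève Cœur, marine": (0.894, 0.605),
}

for pair, (s_f, s_mh) in reported.items():
    # 0.6 is an assumed placeholder, not a threshold stated in the text.
    print(pair, "->", classify_similarity(s_f, s_mh, incidence_threshold=0.6))
```

Keeping both thresholds as parameters reflects the point that the cut-offs can be adjusted to suit the indices and data at hand.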

Discussion and Conclusion

The simple method for comparing zooarchaeological assemblages presented here combines established, incidence- and abundance-based indices to quantify the different dimensions of assemblage similarity and reveal how these intersect. Employing the classification matrix provided above, quantitative results can be translated into intuitively meaningful categories describing states of dissimilarity and qualitative, quantitative, and substantive similarity. Below, I discuss the primary advantages as well as some of the limitations of this paired-index method and consider some of its other possible applications.

A major benefit of the paired-index approach lies in the quantitative precision it provides to comparisons of taxonomic similarity while also providing a framework and terminology to conceptualize similarity beyond a linear continuum of low versus high. This can be seen in the first case study comparing the geographically and environmentally disparate sites of Sandy Ground and Sabazan. While the Morisita-Horn index on its own would have shown the zooarchaeological assemblages were similar in the exploitation intensity of the more common taxa, combining this with the results of the corrected Forbes index adds information. Assemblage correspondence is not just high, but substantively high, registering as elevated in two dimensions of similarity according to quantifiable criteria. In short, this approach explicitly recognizes that (dis)similarity between faunal samples may take different forms and allows analysts to measure and characterize those forms for a more nuanced understanding of assemblage correspondence.

All the indices considered above are simple to calculate and interpret, making them accessible to researchers with varying levels of statistical and technical software skills. Importantly, the method considers the specific taxa present, unlike evenness and diversity measures (e.g., Shannon-Wiener index or Simpson’s diversity index), which are often employed in assemblage comparisons but are not sensitive to taxonomic correspondence and thus cannot reveal (in)consistencies in the specific animals exploited. Two assemblages may exhibit similar levels of diversity and evenness, for instance, but contain entirely different fauna.

Having said this, the paired-index method is not proposed as a replacement for measures such as Shannon-Wiener diversity, nestedness analysis, analysis of variance, Chi-square, or other common statistical techniques. These statistics have different properties and target different, although potentially related, questions and are effective in their own right when appropriate to the analytic circumstances. Instead, the application of paired similarity indices should be regarded as a complementary tool in an arsenal of methods researchers have at their disposal. Seen in this light, the approach offers a valuable new technique that expands the types of research questions archaeologists can ask, as well as a means to refine lines of inquiry and identify the appropriate analytic methods to pursue them. For instance, the results showing disparities between marine and terrestrial resource use in the Crève Cœur-Sabazan case study suggest that enslaved people’s access to fishing gear versus firearms for use in hunting may have influenced assemblage composition and might be a productive area for future investigation.

Worth noting is that the paired-index method is flexible and can be employed to assess the similarity of samples other than those composed of biological taxa. In zooarchaeology, it could be applied to classes comprising skeletal parts or elements to compare patterns of attrition and taphonomy between sites. Habitat or resource patches indicated by the species in assemblages could also be assessed with similarity indices to investigate how foraging effort is distributed over the landscape (e.g., Giovas 2013). Building on prior studies using single similarity indices (e.g., Barceló et al. 2019; Carr and Case 2005; Rick 1996), the method can also be employed within wider archaeology to compare assemblages of artifacts, cultural traits, or other quantifiable, discrete types. It should also be noted that the case studies above involve archaeological sites, but sample comparisons could just as easily be made between screen-size fractions, residential features, social groups, etc. The method thus has a wide variety of potential applications. If the situation demands, it also allows for other incidence and/or abundance indices to be substituted for the corrected Forbes and Morisita-Horn metrics, and the thresholds for (dis)similarity classification adjusted as appropriate.

The flexibility of the paired-index method highlights an important limitation, however. It cannot be used passively. The appropriate indices for combined application must be actively chosen based on the nature of the research questions, the distribution of the data, and the representativeness of sampling. Among incidence-based indices, for example, the Jaccard index may be generally unsuitable since it is significantly biased by sample size effects (Chao et al. 2005), which are a common occurrence in zooarchaeology. For assemblages of uneven richness, the Sørensen index tends to underestimate similarity, making the Simpson or corrected Forbes index more appropriate unless the research question requires the metric to discount nestedness. Where sampling is robust, both the Simpson and corrected Forbes indices perform well, but where this is not the case, the latter is more suitable (Alroy 2015a, 2015b).
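The behavior of these incidence-based indices under uneven richness can be illustrated with a small sketch using their standard textbook forms, where a is the number of shared taxa and b and c are the numbers of taxa unique to each sample. The function names and the counts are hypothetical; the example describes a fully nested case in which every taxon in the smaller sample also occurs in the larger one.

```python
def _check(a, b, c):
    # Both samples must contain at least one taxon.
    if min(a, b, c) < 0 or a + b == 0 or a + c == 0:
        raise ValueError("counts must be non-negative and both samples non-empty")


def jaccard(a, b, c):
    _check(a, b, c)
    return a / (a + b + c)


def sorensen(a, b, c):
    _check(a, b, c)
    return 2 * a / (2 * a + b + c)


def simpson(a, b, c):
    _check(a, b, c)
    # Scales by the richness of the poorer sample, so nestedness is not penalized.
    return a / (a + min(b, c))


# Hypothetical nested case: 10 taxa in sample 1, all also found among
# the 40 taxa of sample 2.
a, b, c = 10, 0, 30
print(f"Jaccard  {jaccard(a, b, c):.2f}")   # 0.25
print(f"Sørensen {sorensen(a, b, c):.2f}")  # 0.40
print(f"Simpson  {simpson(a, b, c):.2f}")   # 1.00
```

Whether the perfect score from the Simpson index or the lower Sørensen value is the more useful description depends on whether nestedness should count as similarity for the question being asked.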

Among abundance indices, the Bray-Curtis and Morisita-Horn have enjoyed heavy use (Magurran 2004) because both are generally robust, although each has limitations. The Bray-Curtis should be avoided when sampling is uneven, limited, or widely variant in location or time period, while the Morisita-Horn should be avoided where the influence of less common taxa on resulting similarity is important to capture (Anderson et al. 2011; Chao et al. 2006; Jost et al. 2011; Krebs 2014; Magurran 2004; Wolda 1981). Data transformations may be required to reduce the influence of dominant taxa, in which case the Morisita-Horn or Renkonen percent similarity indices are recommended (Wolda 1981). If the existence of joint absences is a component of the research question, the adjusted Jaccard and adjusted Sørensen indices perform better than the Morisita-Horn index (Chao et al. 2005, 2006), although they should be appropriately paired with an incidence-based index which also considers joint absences, such as the Forbes index.
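One way to see the sensitivity of the Bray-Curtis index to uneven sampling is to compare two samples with identical proportional composition but a tenfold difference in size. The sketch below assumes Bray-Curtis similarity computed on raw counts, 2Σmin(x_i, y_i) / (Σx_i + Σy_i), and the standard Morisita-Horn form; the function names and counts are hypothetical.

```python
def bray_curtis_similarity(x, y):
    """Bray-Curtis similarity (1 minus dissimilarity) on raw counts."""
    shared = sum(min(xi, yi) for xi, yi in zip(x, y))
    return 2 * shared / (sum(x) + sum(y))


def morisita_horn(x, y):
    """Morisita-Horn similarity on counts aligned by taxon."""
    total_x, total_y = sum(x), sum(y)
    cross = sum(xi * yi for xi, yi in zip(x, y)) / (total_x * total_y)
    sum_sq = (sum(xi * xi for xi in x) / total_x ** 2
              + sum(yi * yi for yi in y) / total_y ** 2)
    return 2 * cross / sum_sq


# Same three taxa in the same proportions (50/30/20%), sampled at
# very different intensities.
small_sample = [50, 30, 20]
large_sample = [500, 300, 200]

print(round(bray_curtis_similarity(small_sample, large_sample), 3))  # 0.182
print(round(morisita_horn(small_sample, large_sample), 3))           # 1.0
```

Under these assumptions, the raw-count Bray-Curtis value reflects the difference in sample size rather than composition, whereas the Morisita-Horn value depends only on proportions.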

Over a decade ago Lyman (2008:185) noted that similarity indices have seen limited use in zooarchaeology. Little has changed since then. However, their ease of application and the insight they generate are strong arguments for their increased uptake. Here, I have sought to demonstrate that pairing similarity indices—in this case, the corrected Forbes and Morisita-Horn indices—provides a simple and robust method for quantifying similarity and describing how taxon presence/absence and abundance intersect between assemblages. The results are simple to compute and straightforward to conceptualize. These are compelling reasons to add the paired-index approach to the analytic toolkit and will hopefully expand the research questions that archaeologists can explore.