Introduction

Archaeologists have long had an interest in being able to identify prehistoric populations (Foley 1987; Lyman et al. 1997; Hermon and Niccolucci 2017; Garvey 2018; Groucutt [Chap. 1] 2020), traditionally using distinctive sets of artifacts—stone tools, pottery, clothing, housing, rock art, fish weirs, and the like—as proxies for the actual groups responsible for making, using, and losing or abandoning the items (McNabb 2020; Reynolds 2020; Shipton 2020; Shott 2020). By the mid-twentieth century, these artifact sets had been used to subdivide much of the North American archaeological record into myriad cultural units such as stages, phases, aspects, foci, traditions, and horizons (e.g. Phillips and Willey 1953; Willey and Phillips 1958; Lehmer 1971). The units contained cores, or sets of artifacts that did not overlap with other sets in either time or space. Extending out from the cultural cores were still other sets that were shared by multiple units. The shared traits were viewed as stemming from common ancestry between populations, from enculturation, and/or from diffusion.

Our goal in this chapter is to offer several alternatives to the standard way of identifying archaeological populations. As examples, we focus on studies that incorporate models of cultural transmission grounded in evolutionary theory and modern analytical methods in order to identify populations and understand their patterns of interaction during the late Pleistocene colonization of North America. The precise timing of the colonization is debatable (see below), but what is not at issue is the point of origin of the colonizing populations. Overwhelming archaeological and archaeogenetic evidence (Waters and Stafford 2007; Goebel et al. 2008; Kemp and Schurr 2010; O’Rourke and Raff 2010; Raff et al. 2010; Morrow 2014; Raghavan et al. 2014, 2015; Rasmussen et al. 2014; Raff and Bolnick 2014, 2015; Hoffecker et al. 2016; Llamas et al. 2016; Blong 2018; Moreno-Mayar et al. 2018; Posth et al. 2018; Davis et al. 2019) indicates that humans moved eastward across the Bering Land Bridge, or Beringia, during the Late Glacial Maximum, perhaps as a result of a shift to warmer/wetter conditions in Beringia between 14,700 and 13,500 years ago, which was associated with the early Bølling/Allerød interstadial (Wooler et al. 2018). Migrant groups then made their way either south along or near the coastline (Fladmark 1979; Erlandson et al. 2007; Gilbert et al. 2008; Braje et al. 2017, 2019) and/or through a corridor that ran between the Cordilleran and Laurentide ice sheets that covered the northern half of the continent (Ives et al. 2014; Freeman 2016; Pederson et al. 2016; Potter et al. 2017, 2018). In our view, both scenarios remain equally viable (Potter et al. 2018; O’Brien 2019a).

With respect to timing, colonizing populations could have entered North America before 15,000 years ago (see below), but the earliest widespread human occupation of the continent dates to around 13,400 years ago (Potter et al. 2018), the visible manifestation of which is a tool kit referred to as the “Clovis techno-complex” (Bradley et al. 2010). That techno-complex is marked by a number of distinctive tool types, including bone and ivory rods (O’Brien, Lyman, et al. 2016; Sutton 2018), large prismatic stone blades (Bradley et al. 2010), and bifacially chipped and fluted stone weapon tips, referred to as “Clovis points” (Wormington 1957; Bradley 1993; Morrow 1995; Bradley et al. 2010; Sholts et al. 2012) (Fig. 13.1). The points exhibit parallel to slightly convex sides, concave bases, and flake-removal scars on one or both faces that extend from the base to about a third of the way to the tip. This flake removal, called “fluting,” created a thinner base that acted as a “shock absorber,” increasing point robustness and the ability to withstand physical stress through stress redistribution and damage relocation (Thomas et al. 2017; Story et al. 2018). Clovis points were hafted to spears that were thrust and/or thrown (Hutchings 2015) and, at least occasionally, functioned as butchering tools (Lyman et al. 1998; Smallwood 2013; Smallwood and Jennings 2016).

Fig. 13.1
Clovis points from various North American sites. Top row (left to right): Townsend Co., Kentucky; unknown county, North Carolina; Williamson Co., Tennessee; Lewis Co., Kentucky (courtesy D. Meltzer); Essex Co., Massachusetts (courtesy J. Boudreau). Bottom row (left to right): Barnstable Co., Massachusetts (courtesy E. L. Bell); Essex Co., Massachusetts (courtesy J. Boudreau); Humphreys Co., Tennessee; Green Co., Kentucky; Columbia Co., Arkansas. All images from Whitt (2010) unless otherwise noted; composite by Matt Boulanger

As widespread as components of the Clovis tool kit are, they apparently were not the first technological items to appear in North America. Several well-dated sites in Texas, Florida, and Oregon have produced stone-tool assemblages (Waters et al. 2011; Halligan et al. 2016; Williams et al. 2018; Davis et al. 2019) that indicate that one or more techno-complexes were clearly already present by the time Clovis points were first made (see Haynes [2015] for in-depth discussion of other candidate sites). The beginning dates of those techno-complexes are difficult to assess, but they appear to pre-date Clovis by one or two millennia and perhaps more. Other technologies in the West may have been contemporaries of Clovis (Beck and Jones 2010; Smith et al. 2019), but to us the jury is still out.

Clovis points appear to have originated in the American Southwest (Morrow and Morrow 1999; Hamilton and Buchanan 2007; Meltzer 2009; Beck and Jones 2010; Waters et al. 2011) and spread north and east, including up into the Canadian ice-free corridor (Smith and Goebel 2018). In eastern North America, with a few exceptions the earliest dates from archaeological sites that have produced large numbers of fluted points consistently fall later in time than the earliest fluted points in the West (Haynes et al. 1984; Levine 1990; Curran 1996; Bradley et al. 2008; Robinson et al. 2009; Miller and Gingerich 2013a, b). To simplify a rather complex chronology, we can assign a range of 13,400–12,800 years ago for Clovis in the western half of the continent and 12,800–12,500 years ago in the East, although more restrictive date and spatial ranges have been proposed (e.g. Waters and Stafford 2007, 2014).

The difference in chronological ranges between the East and the West has been explained as the result of Clovis points originating in the West and then spreading eastward as the result of either the movement of populations or down-the-line transmission among established populations (Hamilton and Buchanan 2009; Lothrop et al. 2011, 2016; Smith et al. 2015). It seems highly unlikely, however, that the small sample of radiocarbon dates for the Clovis period has captured the earliest or latest use of Clovis points (Waguespack 2007; O’Brien, Boulanger, Buchanan et al. 2014; Prasciunas and Surovell 2015) in either half of the continent, so we use the ranges above as estimates.

The stone points represent the primary sources of information about the dynamics of Clovis populations, having yielded insights into migration routes, mobility and economics, weapon systems, hunting and domestic activities, and the learning and transmission of technological knowledge (Anderson and Gillam 2000; Cannon and Meltzer 2008; Meltzer 2009; Smallwood 2012; Jennings 2013; Eren et al. 2015). It is the last topic—learning and transmission—that is of particular interest here. Cultural transmission encompasses the mechanisms that humans, as well as other primates, use to acquire, modify, and retransmit cultural information in particular instances (Eerkens et al. 2014), whether it be rules concerning the eligibility of potential marriage partners, instructions for how to produce fishing nets, or the proper method of flaking Clovis points. We can refer to the units of transmission as cultural traits. After being transmitted, cultural traits serve as units of replication in that they can be modified as part of an individual’s cultural repertoire through processes such as recombination, loss, or alteration within an individual’s mind. As with genes, cultural traits are subject to recombination, copying error, and the like and thus can be the foundation for the production of new traits (O’Brien et al. 2010). Using cultural traits as general proxies for human behavior might, at first glance, seem straightforward enough, but as we will see, the issue is much more complicated than it appears.

Cultural Units, Transmission, and the Problem of Analogy

Even before Darwin (1859) wrote On the Origin of Species, many naturalists made a distinction between what later would be called analogous traits and homologous traits. Analogous traits—analogs, for short—are those that two or more organisms possess that, although they might serve similar purposes, did not evolve because of any common ancestry. Birds and bats both have wings, and those traits share properties in common, yet we classify birds and bats in two widely separate taxonomic groups because birds and bats are only distantly related. This is because these two large groups diverged from a common vertebrate ancestor long before either one of them developed wings. Therefore, wings are of no utility in reconstructing lineages because they evolved independently in the two lineages after they diverged. Conversely, homologous traits—homologs, for short—are useful for tracking continuity resulting from inheritance because they are holdovers from the time when two lineages were historically a single lineage. As another example, all mammals have a vertebral column, as do animals placed in other categories. The presence of vertebrae is one criterion that we use to place organisms in the subphylum Vertebrata. The vertebral column is a homologous trait shared by mammals, birds, reptiles, and some fishes, and it suggests that at some remote time in the past, organisms in these groups shared a common ancestor.

American archaeologists working in the first half of the twentieth century appreciated not only that there was a distinction between homologs and analogs but that it applied as much to culture as it did to biology. Writing in the 1930s, Kroeber (1931: 152–153) had this to say on the subject:

There are cases in which it is not a simple matter to decide whether the totality of traits points to a true relationship or to secondary convergence. … Yet few biologists would doubt that sufficiently intensive analysis of structure will ultimately solve such problems of descent. … There seems no reason why on the whole the same cautious optimism should not prevail in the field of culture; why homologies should not be positively distinguishable from analogies when analysis of the whole of the phenomena in question has become truly intensive. That such analysis has often been lacking but judgments have nevertheless been rendered, does not invalidate the positive reliability of the method.

Note that although Kroeber was clear that there are two forms of similarity, one analogous and the other homologous, he was not clear as to how one might distinguish between them. He pointed out that identifying “similarities [that] are specific and structural and not merely superficial … has long been the accepted method in evolutionary and systematic biology” (Kroeber 1931: 151), but he offered no real opinion on how to separate what is “specific and structural” from what is “merely superficial” beyond undertaking a “sufficiently intensive analysis of structure.” He was correct: An intensive analysis of structure, especially a detailed comparative analysis, is critical to being able to make the distinction, but again, he did not offer any thoughts on how to do that. Thus, Kroeber, and he was by no means alone, landed on the default option: Formal similarities between sets of artifacts must signal some kind of relationship, either an ancestor–descendant relationship or one derived through ethnologically documented mechanisms such as diffusion and enculturation (Lyman et al. 1997). Gordon Willey (1953: 363) didn’t waffle on the matter, declaring axiomatically that “typological similarity is an indicator of cultural relatedness (and this is surely axiomatic to archeology), [and thus] such relatedness carries with it implications of a common or similar history” (emphasis added). This axiom, however, falls prey to a caution raised by paleontologist Simpson (1961), using monozygotic twins as an example: They are twins not because they are similar; rather, they are similar because they are twins and thus share a common history. There is a big difference between the two (O’Brien and Lyman 2000).

The default option—formal similarity signals relationship—continued to dominate archaeology, and the number of articles and monographs emphasizing diffusion and migration as explanatory devices continued to increase throughout the twentieth century. As Rowe (1966: 334) noted, however, most accounts were nothing more than poorly concocted just-so stories: “We are now being subjected in archaeological meetings to ever more strident claims that Mesoamerican culture was derived from China or southeast Asia, early Ecuadorian culture from Japan, Woodland culture from Siberia, Peruvian culture from Mesoamerica, and so forth. In the science-fiction world of the diffusionists, a dozen similarities of detail prove cultural contact, and time, distance, and the difficulties of navigation are assumed to be irrelevant.”

One of the studies to which Rowe clearly was referring grew out of the work of Ecuadorian archaeologist Emilio Estrada and two American colleagues, Betty Meggers and Clifford Evans, who saw definite evidence of transoceanic contact between Japan and coastal Ecuador around 5,000 years ago (Evans et al. 1959; Estrada et al. 1962; Meggers et al. 1965). Their claim was based on similarities between some of the pottery they were excavating in Ecuador and pottery they had seen in collections from southern Japan. How did the pottery in Ecuador get there? Estrada and colleagues proposed that Japanese fishermen were blown off course and that Pacific currents carried them to the Ecuadorian coast. It was there that they taught local populations the art of pottery making. It makes an interesting story, but again, similarity does not imply homology.

Style and Function: Not a Simple Dichotomy

Beginning in the 1970s, Robert Dunnell addressed the issue of convergence and divergence with his “fundamental dichotomy” between style, which he equated with homology, and function, which he equated with analogy (Dunnell 1978, 1980; Shennan 2020). In his scheme, stylistic traits, by definition, are those that are not under selection, whereas functional traits are those that are under selection. In archaeology, many examples of this dichotomy come to mind, such as that between a functional canoe paddle versus the stylistic design painted on it or perhaps between more-creative “private” rock paintings in limited-access caves versus tightly regimented and highly visible rock art on a more public landscape (Bradley and Valcarce 1998; Simek et al. 2013).

Although these synchronic distinctions can be used to create hypotheses, Dunnell’s (1980) point was to introduce a diachronic distinction between style and function, which could be identified by documenting change through time in the frequencies of artifacts or other proxies for behavior (see Shennan [2020] for an excellent discussion of the use of “style” in archaeology). The frequencies of stylistic traits, those not under natural selection, are expected to change in stochastic fashion, analogous to neutral traits in biology. This creates continuous, unimodal frequency distributions, as things come into fashion, reach their zenith, and then decline, finally disappearing. Conversely, functional traits can display one of several distributions. They might display a sharp rise in popularity followed by a steep decline (O’Brien and Holland 1990) as they are quickly replaced by other functional traits; they might display unimodal frequency distributions similar to those of stylistic traits; or they might display discontinuous, multimodal frequency distributions as a result of convergence or fluctuation in the selective environment.
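The expectation that stylistic traits change stochastically, much like neutral traits in biology, can be illustrated with a toy random-copying simulation. This is only a minimal sketch and not a model taken from the studies cited: the function name, population size, number of generations, and innovation rate are all hypothetical choices made for illustration.

```python
import random
from collections import Counter

def simulate_random_copying(pop_size, generations, innovation_rate, seed=0):
    """Unbiased (neutral) copying of a stylistic trait.

    Each generation, every individual either copies the variant of a randomly
    chosen member of the previous generation or, with probability
    innovation_rate, invents a brand-new variant. No variant is favored,
    so frequencies change only by chance.
    """
    rng = random.Random(seed)
    population = [0] * pop_size   # everyone starts with variant 0
    next_variant = 1              # label for the next invented variant
    history = []                  # one {variant: frequency} dict per generation
    for _ in range(generations):
        new_population = []
        for _ in range(pop_size):
            if rng.random() < innovation_rate:
                new_population.append(next_variant)
                next_variant += 1
            else:
                new_population.append(rng.choice(population))
        population = new_population
        counts = Counter(population)
        history.append({v: c / pop_size for v, c in counts.items()})
    return history

# Hypothetical parameter values, chosen only for illustration.
history = simulate_random_copying(pop_size=100, generations=200, innovation_rate=0.01)
# Trajectory of the founding variant: its frequency in each generation.
founder_trajectory = [gen.get(0, 0.0) for gen in history]
```

Plotting the trajectory of any one variant from `history` shows how a trait can rise and fall in popularity without selection. Single runs are noisy, so any one trajectory should be read as an illustration of drift rather than as a prediction.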

For some reason, however, some archaeologists began arguing that only stylistic traits, not functional traits, could be used to measure interaction, transmission, and inheritance within and between populations. It was supposed that functional traits were useful only for identifying the presence of selection and measuring its effects. This was incorrect, and the confusion led archaeology down a long, convoluted rabbit hole (O’Brien and Leonard 2001). Put correctly, analogous traits can always be assumed to be functional, but the reverse is not always true: Functional traits can be either homologous or analogous. In other words, functional traits, which by definition affect the fitness of the bearer, can show up in two different lineages as a result of either common ancestry or convergence (see Groucutt [Chap. 4] 2020). Let’s take a look at an example in which functional traits were mistakenly treated as unequivocal homologs. We use this particular example because it has a direct connection to our discussion of the early colonization of North America.

In 2012, Dennis Stanford and Bruce Bradley published the book Across Atlantic Ice: The Origin of America’s Clovis Culture (Stanford and Bradley 2012), the latest version of their proposal that North America was first colonized by groups from southern France and/or the Iberian Peninsula that used watercraft to make their way across the North Atlantic and into North America during the Last Glacial Maximum, some 20,000–24,000 years ago. This 6,000-km journey was facilitated, in their view, by a continuous ice shelf that provided the emigrants with fresh water and a stable food supply. In its initial formulation, the hypothesis was based primarily on similarities between stone tools associated with the Solutrean culture of Western Europe, which dates 23,500–18,000 years ago (Straus 2005), and those associated with the North American Clovis culture, which, as we noted earlier, dates 13,400–12,500 years ago.

Flaws in the “Solutrean hypothesis” were quickly pointed out. The multiple-thousand-year gap between Solutrean and Clovis made an ancestor–descendant relationship highly improbable, meaning that similarities in tool design were instead the result of convergence: unrelated populations of prehistoric flintknappers finding similar solutions to similar adaptive problems (Straus 2000; Will and Mackay 2020). To deal with the large chronological gap, Stanford and Bradley shifted their focus from similarities between the Solutrean and Clovis to supposed similarities among Solutrean, Clovis, and pre-Clovis tool types and production techniques (Stanford and Bradley 2002; Bradley and Stanford 2004). This was an unfortunate modification to their proposal because the pre-Clovis dates used by Stanford and Bradley—all of which are from highly questionable contexts—actually predate the Solutrean (O’Brien, Boulanger, Collard, et al. 2014). This would suggest that the traits appeared first in North America and then were carried to Europe. This, of course, is implausible.

That Stanford and Bradley fell prey to the “similarity equals relatedness” principle is not, as we’ve seen, an isolated incident, and we would be the first to admit that distinguishing between homologous and analogous traits is difficult. As we will show, however, it is not impossible. As an introduction to that issue, we are reminded of a quote from Clarke (1968: 211). What he said was not so different from what other archaeologists had said—Kroeber for example—but it contained an important contrast between two terms, phyletic and phenetic:

One of the fundamental problems that the archaeologist repeatedly encounters is the assessment of whether a set of archaeological entities are connected by a direct cultural relationship linking their generators or whether any affinity between the set is based on more general grounds. This problem usually takes the form of an estimation of the degree of affinity or similarity between the entities and then an argument as to whether these may represent a genetic and phyletic lineage or merely a phenetic and non-descent connected affinity.

Both terms, “phyletic” and “phenetic,” are grounded in the concept of “similarity,” but whereas the former signifies a descent-related affinity—one person or population being related to another one (or more)—the latter has nothing to do with descent. We now have at our disposal a battery of methods and techniques that offer objective grounds for making the distinction. One of them, cladistics, was introduced into biology in the mid-twentieth century (Hennig 1950, 1966) and, in various forms, has become the standard approach in the discipline. It also has seen widespread usage in archaeology, including in research focusing on the Clovis colonization of North America.

Phylogeny and Cladistics

From an archaeological standpoint, if the issue at hand is identifying populations and understanding how they are related—if indeed they are—then the bottom line is, use traits, often referred to as characters, that will potentially emit strong phylogenetic signals. Phylogenetic—Clarke’s “phyletic”—refers to relatedness between or among phenomena, whether they be sets of organisms—including human populations—or sets of stone tools. Conversely, “phenetic” ordering is based solely on similarity. There are several methods of investigating phylogeny, but here we focus on only one, cladistics, which defines phylogenetic relationships in terms of relative recency of common ancestry: Two groups—we’ll refer to them as taxa—are deemed to be more closely related to one another than either is to a third taxon if they share a common ancestor that is not also shared by the third taxon. The evidence for exclusive common ancestry is found in evolutionarily novel, or derived, character states. Note that our taxa could be sets of anything that is capable of evolving, including sets of stone tools, manuscripts, and groups of people.

Having said that, we point out that inanimate objects obviously do not breed and reproduce. Treating that fact as disqualifying reflects a narrow view of the Darwinian process, one that doomed early efforts to view the archaeological record in evolutionary terms (Lyman et al. 1997). It overlooked the fact that humans do breed and reproduce and that things such as stone tools are part of human phenotypes in the same way that teeth and bones are or that beaks and feathers are for birds. In essence, stone tools are proxies for the human behaviors that create them. All that evolution requires is that three conditions be met: (1) variation is present, (2) the variation is inherited, and (3) there is a sorting mechanism that creates differential persistence of variants over generations.

As an example, Fig. 13.2 is a phylogenetic tree that shows relationships among four taxa. It tells us that based on a certain character distribution—more on that below—taxa C and D are more similar to one another than either is to any other taxon. It also says that taxa B, C, and D are more similar to one another than any of the three is to Taxon A. We know that taxa A–D evolved from ancestral taxa, although at this point we know little or nothing about those ancestors except that with respect to certain characteristics, taxa C and D look more like their immediate common ancestor (x) than they do the one (y) that unites them with Taxon B. Likewise, taxa B, C, and D look more like their common ancestor (y) than they do the one (z) that unites them with Taxon A.

Fig. 13.2
Relationship of four taxa (A–D) and three ancestors (x–z). Based on a certain character-state distribution (not shown), taxa C and D are more similar to one another than either is to any other taxon. Also, taxa B, C, and D are more similar to each other than any of the three is to Taxon A. Related groups and their ancestors form ever-more-inclusive taxa, or clades: C + D + x is one clade; B + C + x + D + y is a second; and A + B + x + C + D + y + z is a third

In cladistics, convention is to place nodes at the points where branches meet and to refer to the nodes as ancestors that produced the terminal taxa (those at the branch tips). In our tree, taxa C + D, together with their hypothetical common ancestor (node x), form what is termed a monophyletic group, or clade. Taxa D + C + B and node x, together with their common ancestor (node y), form another, more inclusive clade, and taxa D + C + B + A (and nodes x and y), together with their common ancestor (node z), form yet another, and the most inclusive, clade. A common misconception is that the interior nodes—“ancestors”—are somehow “real.” They are not—hence our use of the term “hypothetical” above.
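The nested clades of Fig. 13.2, and the cladistic test of relative relatedness, can be sketched in a few lines of code. The tree below is transcribed from the description in the text (nodes x, y, and z are the hypothetical ancestors); the function names are our own illustrative choices.

```python
# Tree of Fig. 13.2 as described in the text: each hypothetical ancestor
# (interior node) maps to its two immediate descendants.
TREE = {"z": ["A", "y"], "y": ["B", "x"], "x": ["C", "D"]}

def descendants(node):
    """All nodes below `node`, excluding the node itself."""
    result = set()
    for child in TREE.get(node, []):
        result.add(child)
        result |= descendants(child)
    return result

def clade(node):
    """A clade: a (hypothetical) ancestor plus everything descended from it."""
    return {node} | descendants(node)

def terminal_taxa(node):
    """Only the branch-tip taxa within the clade rooted at `node`."""
    return {n for n in clade(node) if n not in TREE}

def ancestors_of(taxon):
    """Ancestors of `taxon`, from nearest to most remote."""
    parent = {child: p for p, children in TREE.items() for child in children}
    path = []
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path

def most_recent_common_ancestor(a, b):
    b_ancestors = set(ancestors_of(b))
    return next(n for n in ancestors_of(a) if n in b_ancestors)

def more_closely_related(a, b, c):
    """True if a and b share an ancestor that is not also an ancestor of c."""
    return most_recent_common_ancestor(a, b) not in ancestors_of(c)

print(sorted(clade("x")))                   # ['C', 'D', 'x']
print(sorted(terminal_taxa("y")))           # ['B', 'C', 'D']
print(more_closely_related("C", "D", "B"))  # True
print(more_closely_related("A", "B", "C"))  # False
```

The last line returns False because the only ancestor that A and B share, node z, is also an ancestor of C.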

Another series of trees is shown in Fig. 13.3, this time with emphasis on the kinds of characters and character states that one encounters in archaeological phylogenetic studies. The trees show the evolution of a projectile-point lineage that begins with Ancestor A. For simplicity, we are tracking only a single character, fluting, which, again, is the removal of one or more longitudinal flakes from the base of a projectile point in order to thin it. Clovis points, as we noted, are fluted. Here, there are only two character states, fluted and unfluted. Over time, Ancestor A, which is unfluted, gives rise to two lines, one of which, like its ancestor, is unfluted and the other of which is fluted (Fig. 13.3A). Thus the character state “fluted” in Taxon 2 is derived from the ancestral character state, “unfluted.” In Fig. 13.3B, Ancestor B (old Taxon 2) gives rise to two new taxa, 3 and 4, each of which carries the derived character state, “fluted.” At this point “fluted” becomes a shared derived character state, defined as one that is shared only by sister taxa and their immediate common ancestor. Character states in sister taxa that have been inherited from an ancestor more distant than the common ancestor are shared ancestral character states. In Fig. 13.3C, in which two descendent taxa have been added, fluting is now a shared ancestral character state relative to taxa 5 and 6 because it is shared by three taxa and two ancestors. But relative to taxa 3, 5, and 6, fluting is a derived character state because it is shared by three taxa and their immediate common ancestor, B. Thus depending on where in a lineage one begins, a trait can be derived or ancestral.
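The point that a character state can be derived or ancestral depending on where one stands in the lineage can be made concrete with a small sketch. Several things here are assumptions rather than details given in the text: the topology is our reading of the description of Fig. 13.3C, the label "C" for the ancestor of taxa 5 and 6 is hypothetical, and the rule used (compare the immediate common ancestor with the ancestor just before it) is a deliberate simplification of the full definition.

```python
# Assumed topology for Fig. 13.3C: Ancestor A gives rise to Taxon 1 and
# Ancestor B (old Taxon 2); B gives rise to Taxon 3 and a later ancestor;
# that later ancestor gives rise to Taxa 5 and 6. Its label "C" is hypothetical.
PARENT = {"1": "A", "B": "A", "3": "B", "C": "B", "5": "C", "6": "C"}

# Character state for fluting at every node (True = fluted).
FLUTED = {"A": False, "1": False, "B": True, "3": True,
          "C": True, "5": True, "6": True}

def ancestors_of(node):
    """Ancestors of `node`, from nearest to most remote."""
    path = []
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def immediate_common_ancestor(taxa):
    """Nearest ancestor shared by every taxon in `taxa`."""
    paths = [ancestors_of(t) for t in taxa]
    for candidate in paths[0]:
        if all(candidate in p for p in paths[1:]):
            return candidate
    raise ValueError("taxa share no ancestor in this tree")

def classify_shared_state(taxa, has_state):
    """Simplified rule: the state is 'shared derived' if the focal taxa and
    their immediate common ancestor have it but the next ancestor back does
    not; it is 'shared ancestral' if that earlier ancestor has it too."""
    ancestor = immediate_common_ancestor(taxa)
    if not (has_state[ancestor] and all(has_state[t] for t in taxa)):
        return "not shared"
    earlier = PARENT.get(ancestor)
    if earlier is not None and has_state[earlier]:
        return "shared ancestral"
    return "shared derived"

print(classify_shared_state(["5", "6"], FLUTED))       # shared ancestral
print(classify_shared_state(["3", "5", "6"], FLUTED))  # shared derived
```

The same character state, fluting, receives two different labels only because the focal group changes, which is the situation described for taxa 5 and 6 versus taxa 3, 5, and 6.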

Fig. 13.3
Phylogenetic trees showing the evolution of projectile-point taxa (after O’Brien et al. 2001). In (A), fluting appears during the evolution of Taxon 2 out of its ancestral group. Its appearance in Taxon 2 is as a derived character state. In (B), Taxon 2 has produced two taxa, 3 and 4, both of which contain fluted specimens. The appearance of fluting in those sister taxa and their common ancestor makes it a shared derived character state. In (C), one of the taxa that appeared in the previous generation gives rise to two new taxa, 5 and 6, both of which contain fluted specimens. If we focus attention only on those two new taxa, fluting is now a shared ancestral character state because it is shared by more taxa than just sister taxa 5 and 6 and their immediate common ancestor. But if we include Taxon 3 in our focus, fluting is a shared derived character state because, following the definition, it occurs only in sister taxa 3, 5 and 6 and their immediate common ancestor

Figure 13.3 does not show a third kind of character, but it is one that occurs on virtually all phylogenetic trees and, if not recognized, creates false positives in terms of similarity resulting from common ancestry. These are analogs, which in cladistics are referred to as homoplasies—similarities resulting from processes other than descent from a common ancestor, such as convergence, parallelism, and horizontal transmission between lineages (Sanderson and Hufford 1996; Groucutt [Chap. 1] 2020). Suppose in Fig. 13.3C that the tree is a true depiction of projectile-point evolution. Suppose further that taxa 1 and 6 share a character—say, beveling—that taxa 3 and 5 do not exhibit. We would refer to beveling as a homoplasious character—one that arose independently in those two taxa.
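One common way to flag a homoplasious character on a tree that is taken as given is to count the minimum number of state changes the tree requires and compare it with the minimum any tree would require. The sketch below uses Fitch parsimony for that count, which is our choice for illustration and is not necessarily the procedure used in the studies cited. The topology is the same assumed reading of Fig. 13.3C (the node label "C" is hypothetical), and the beveling states follow the supposition in the text.

```python
# Assumed topology for Fig. 13.3C; the interior-node label "C" is hypothetical.
TREE = {"A": ["1", "B"], "B": ["3", "C"], "C": ["5", "6"]}

def fitch_steps(node, states):
    """Fitch parsimony on a binary tree.

    Returns (set of candidate states at `node`, minimum number of state
    changes needed in the subtree below `node`).
    """
    if node not in TREE:                      # terminal taxon
        return {states[node]}, 0
    left, right = TREE[node]
    left_set, left_steps = fitch_steps(left, states)
    right_set, right_steps = fitch_steps(right, states)
    steps = left_steps + right_steps
    common = left_set & right_set
    if common:                                # children agree: no new change
        return common, steps
    return left_set | right_set, steps + 1    # children disagree: one change

def extra_steps(states, root="A"):
    """Changes required on this tree beyond the minimum possible for the
    character (number of observed states minus one). Zero means the
    character fits the tree without homoplasy."""
    _, steps = fitch_steps(root, states)
    minimum = len(set(states.values())) - 1
    return steps - minimum

# True = state present. Beveling occurs in taxa 1 and 6 only (per the text).
beveled = {"1": True, "3": False, "5": False, "6": True}
fluted = {"1": False, "3": True, "5": True, "6": True}

print(extra_steps(fluted))   # 0 -> consistent with a single origin
print(extra_steps(beveled))  # 1 -> one extra change: homoplasy
```

Parsimony alone says only that beveling needs two changes on this tree. Whether those were two independent origins or one origin followed by a loss is a separate question.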

Several studies have examined how various Clovis-period and slightly later point types from across North America are related phylogenetically (O’Brien and Lyman 2000, 2003; O’Brien et al. 2001, 2002, 2012, 2013, 2015; Darwent and O’Brien 2006; Buchanan and Collard 2007, 2008; O’Brien, Boulanger, Buchanan, et al. 2014, 2016; Smith and Goebel 2018). Instead of using traditional projectile-point types, several studies used a standardized set of projectile-point classes (taxa) that were defined on the basis of eight characters, including base shape, the shape of the blade, the length/width ratio, and how deeply indented the base was (O’Brien et al. 2001). These characters are shown in the box in the upper left of Fig. 13.4, represented by Roman numerals (I–VIII). Each character has a number of states, and it is the intersection of the states of each character that creates a class (see O’Brien et al. [2001] for the states of each character). The choice of which characters to use was based on expectations as to which parts of a point would change most over time as a result of cultural transmission and thus create a strong phylogenetic signal. Archaeologists, like biologists, lean heavily on experience in selecting characters, and experience has shown that the hafting element—the proximal end of a projectile point (the part that comes into contact with a spear or dart shaft)—is a likely region in which to find characters useful in phylogenetic analyses. Forty of the 41 classes (taxa) used in the latest analysis (O’Brien, Boulanger, Buchanan, et al. 2014, 2016) are shown at the branch tips, and the class that was used to root the tree—the one predicted to be ancestral to all the other classes (O’Brien et al. 2002)—is at the far left (KDR [12212223]).1

Fig. 13.4
Phylogenetic tree showing 41 classes (after O’Brien, Boulanger, Buchanan, et al. 2014). Roman numerals denote characters, and subscript numbers denote character states. Open boxes indicate phylogenetically informative changes; shaded boxes indicate parallel or convergent changes (homoplasy); and half-shaded boxes indicate characters that reverted to an ancestral state. Six of the clades are labeled A–F. The tree is a 50% majority-rule consensus tree based on 100 replicates

The phylogenetic tree shown in Fig. 13.4 is a 50% majority-rule consensus tree, meaning that out of all trees generated during analysis, at least 50% of them had the projectile-point classes in the positions shown. The tree exhibits numerous clades, which, again, are defined as units that consist of two or more related taxa and their common ancestor. Six of the clades in Fig. 13.4 are labeled A–F. Of perhaps more immediate importance are the 48 squares shown on the tree, each of which conforms to one of the three kinds of characters shown at the top of the box in Fig. 13.4. Each square is labeled with a Roman numeral, which corresponds to the characters in the Fig. 13.4 box. The presence of a square indicates that the character has changed states from one generation to the next; the subscript Arabic numeral indicates the evolved character state. For example, the first characters to change were location of maximum blade width (Roman numeral I) and constriction ratio (Roman numeral IV). The former changed from state 1 to state 2, and the latter changed from state 1 to state 3. These changes created an ancestor that then produced Class Kg (22231223) and an offspring class that, with an additional state change, became Class Kj (22232323). White squares on the tree indicate phylogenetically informative changes (shifts that result from descent with modification), as opposed to changes that result from either adaptive convergence (black squares) or reversals to ancestral character states (half-shaded squares).
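The state changes described above can be traced directly on the class codes. The sketch below assumes that each of the eight digits in a class code gives the state of characters I through VIII in order. That reading is consistent with the changes reported in the text but is our inference. The intermediate ancestral code computed here is likewise inferred from the stated changes and is not a class reported in the source, and the function names are hypothetical.

```python
CHARACTERS = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII"]

def parse_class(code):
    """Split an eight-digit class code into {character: state}.
    Assumes one digit per character, in the order I..VIII."""
    if len(code) != len(CHARACTERS) or not code.isdigit():
        raise ValueError(f"expected {len(CHARACTERS)} digits, got {code!r}")
    return dict(zip(CHARACTERS, (int(digit) for digit in code)))

def apply_changes(code, changes):
    """Return the class code that results from the given state changes."""
    states = parse_class(code)
    states.update(changes)
    return "".join(str(states[ch]) for ch in CHARACTERS)

def state_changes(ancestor_code, descendant_code):
    """Characters whose state differs, as {character: (old, new)}."""
    old = parse_class(ancestor_code)
    new = parse_class(descendant_code)
    return {ch: (old[ch], new[ch]) for ch in CHARACTERS if old[ch] != new[ch]}

ROOT = "12212223"   # KDR, the class used to root the tree
KG = "22231223"     # Class Kg
KJ = "22232323"     # Class Kj

# Changes reported in the text: character I goes 1 -> 2, character IV goes 1 -> 3.
inferred_ancestor = apply_changes(ROOT, {"I": 2, "IV": 3})
print(inferred_ancestor)                     # 22232223
print(state_changes(inferred_ancestor, KG))  # {'V': (2, 1)}
print(state_changes(inferred_ancestor, KJ))  # {'VI': (2, 3)}
```

Under this reading, Kg and Kj each differ from the inferred ancestor by a single state change, in characters V and VI respectively.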

Phylogenetic analysis is important because it allows us to track heritable continuity—what produced what—as opposed to simply historical continuity—what followed what with no reliable knowledge as to whether an ancestor–descendant relationship existed. If we are interested strictly in phylogeny, then our focus is on the white squares in Fig. 13.4 because they are the only ones that resulted from descent with modification. But at a more general scale, all of the morphological changes shown in Fig. 13.4 matter because they give us insights into how Clovis flintknappers were making decisions about how to manufacture their points. Unless character states were independently invented, the process that led to the traits showing up in the positions they do is cultural transmission. Now, what about the learning processes embedded in the transmission? Do different kinds of learning create different patterns of variation at various levels, and can we use the patterns to talk about populations as they move across the landscape? As we will see, the answer to both questions is “yes.”

Learning: The Basis of Cultural Transmission

Franz Boas (1904: 522) pointed out at the beginning of the twentieth century that “the theory of transmission has induced investigators to trace the distribution and history of [cultural traits] with care so as to ascertain empirically whether they are spontaneous creations or whether they are borrowed and adapted.” Boas (1911: 809) later noted that “we must investigate the innumerable cases of transmission that happen under our very eyes and try to understand how transmission is brought about and what are the conditions that favor the grouping of certain new elements of an older culture.” These are excellent points, but again, there was a lack of rigor in producing testable models. It wasn’t until the 1970s that Boas’s insights led to such models, starting with mathematical work that incorporated cultural information into evolutionary models of differential transmission of genes (e.g. Cavalli-Sforza and Feldman 1973, 1981; Feldman and Cavalli-Sforza 1976), followed by work that brought to the forefront various kinds of learning (e.g. Boyd and Richerson 1985; see Shennan 2020).

For our purposes here, we can subdivide learning into social learning and individual learning (Cavalli-Sforza and Feldman 1981; Boyd and Richerson 1985; Laland 2004; Mesoudi 2011a; Kendal et al. 2018), keeping in mind that humans are neither purely social learners nor purely individual learners. Rather, certain conditions, perceived or real, dictate which one is used in any particular situation. In fact, there are good reasons to suspect that many species, especially humans, may have experienced selection for reliable social learning, with enhanced individual learning being a by-product (Laland 2017). At the group level, social learning is advantageous for most agents, but that benefit relies on the remaining proportion of individual learners and what they know about the environment. Without any individual learners to constantly sample the environment—to produce information useful to the group—social learners cannot track environmental change. Without a source of variation, agents simply copy themselves into stasis—potentially a recipe for disaster in the face of a changing environment. For this to work, however, there has to be an adaptive value for individual learning to occur in the first place. This is achieved by social learning making individual learning less costly (Boyd and Richerson 1995).

Social learning is a powerful adaptive strategy that allows others to risk failure so we don’t have to (Henrich 2001; Laland 2004; Aoki and Feldman 2014)—that is, it lets others filter behaviors and pass along those that have the highest payoff (Rendell et al. 2011). This translates into social learning being less costly in terms of energy and/or time (Morgan et al. 2011). Social learning is how individuals acquire their language, morals, technology, how to behave socially, what foods to eat, and most of their ideas from people with whom they come into constant contact. Over generations, the effect is cumulative, as individuals continue to “learn things from others, improve those things, transmit them to the next generation, where they are improved again, and so on” (Boyd and Richerson 2005: 4). As Mesoudi and Thornton (2018: 6) put it, cumulative cultural evolution is “the introduction of behavioural novelty or modification, the transmission of behaviour via social learning, the improvement in genetic and/or cultural fitness or fitness proxies as a result of the learned behaviour and the repeated transmission and improvement of the behaviour over time.” This has been referred to as the “ratchet effect” (Tomasello et al. 1993; Tomasello 1999; Tennie et al. 2009). Any number of species exhibit social learning (Hoppitt and Laland 2013; Mesoudi and Thornton 2018), but humans, and a limited number of other species, exhibit an amped-up form of social learning, which we can refer to as “cultural learning” (Dean et al. 2014). Humans excel at cultural learning, which is what makes human minds, not to mention human lives, so different from those of other animals (Heyes 2015).

Learning is the process that ensures what we earlier referred to as heritable continuity—one thing resembling another as a result of transmission (Lyman and O’Brien 1998). Over time, continuity creates what archaeologists refer to as traditions, defined as “(primarily) temporal continuity represented by persistent configurations in single technologies or other systems of related forms” (Willey and Phillips 1958: 37). From an evolutionary perspective more explicitly, a cultural tradition “is a socially transmitted form unit (or a series of systematically related form units) which persists in time” (Thompson 1956: 38)—a definition that reflects transmission, persistence by means of replication, and heritable continuity (Lyman et al. 1997; O’Brien et al. 2010).

Copying

Social learning usually involves copying others, which itself is a set of competing strategies. You might, for example, copy someone based on that individual’s skill level—perhaps a person who appears to be better at something than you are or someone who appears to be successful—whereas someone else might base his or her decisions on social criteria—copy the majority, copy kin or friends, or copy older individuals (Kendal et al. 2018). The various factors that can affect one’s choice of whom or what to copy are often referred to as “social learning strategies” (Laland 2004) or “transmission biases” (Boyd and Richerson 1985)—unique evolutionary forces for the selective retention of cultural variants. The term “biased learning” is commonly used as a synonym for certain social-learning strategies. Given the difference between the effects of copying based on selection for knowledge or a skill level as opposed to copying based on random social interaction, “bias” is used in a statistical sense to indicate some deviation from random, or “unbiased,” copying. It is not used in any normative sense, such as “gender bias” or “racial bias.”
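The statistical sense of "bias" can be made concrete with a short sketch. It uses a commonly cited textbook recursion for conformist transmission, in the spirit of Boyd and Richerson (1985), in which a variant at frequency p changes by D·p·(1 − p)·(2p − 1) each generation. The parameter name `conformity` (for D), the starting frequencies, and the number of generations are our own illustrative assumptions. With D = 0 the recursion reduces to unbiased copying, and the expected frequency does not change.

```python
def next_frequency(p, conformity=0.0):
    """Expected frequency of a variant after one generation of copying.

    conformity = 0 is unbiased (random) copying: no expected change.
    conformity > 0 disproportionately favors whichever variant is in the majority.
    """
    return p + conformity * p * (1 - p) * (2 * p - 1)


def run(p0, conformity, generations):
    """Iterate the recursion for a number of generations."""
    p = p0
    for _ in range(generations):
        p = next_frequency(p, conformity)
    return p


# Illustrative values only.
unbiased = run(0.6, conformity=0.0, generations=50)
conformist_majority = run(0.6, conformity=0.3, generations=50)
conformist_minority = run(0.4, conformity=0.3, generations=50)

print(unbiased, conformist_majority, conformist_minority)
```

Under unbiased copying the variant stays at its starting frequency, whereas under conformist bias a variant that starts in the majority approaches fixation and one that starts in the minority is driven toward loss.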

With respect to model-based transmission—you are picking someone to copy—we might make the underlying assumption that individuals can find a master teacher from whom to learn. Likewise, it might be assumed that individuals can sense how popular a behavior is in the population. These assumptions might be unrealistic for large populations, perhaps where individuals have only local, imperfect knowledge of what models, and hence what behaviors, are optimal (Bentley and O’Brien 2011; Bentley et al. 2014). Thus, we would expect that if individuals are selective and accurate in finding the most skilled model for copying, then the pace of cultural evolution depends strongly on population size, from the Upper Paleolithic Revolution of 40,000 BP (Powell et al. 2009) to the information cascade that confronts us today (Bentley and O’Brien 2017). If, however, learning is relatively unselective, then the pace depends only weakly on population size, if at all, and perhaps more on the level of environmental risk (Collard et al. 2013).

Of course, even with large populations the individual minds involved must communicate in the first place in order to create this “collective-mind” effect. Unconnected individuals are irrelevant to learning and the collective storage/retrieval of information (Bentley and O’Brien 2011). This has been documented time and again, most dramatically in a computer-mediated tournament of learning algorithms held at St Andrews University in 2009 (Rendell et al. 2010). Before the tournament, many expected the winning strategy to be some combination of majority individual learning supplemented by some social learning. In fact, the most successful strategies relied almost exclusively on social learning, even when the environment was changing rapidly. The winning strategy copied frequently and was biased toward copying the most recent successful behavior it observed—an excellent strategy in the face of rapidly changing environments (but see Heyes 2016). Of course, even here there had to be some individuals—a minority—who were creating and updating information for others to copy.

With respect to copying, our view mirrors that of Rendell et al. (2011): Copying confers an adaptive plasticity on populations, which allows them to draw on deep knowledge bases in order to respond to changing environments rapidly. High-fidelity copying leads to an exponential increase in the retention of cultural knowledge. The key term here is “high fidelity” (Boyd and Richerson 1995). What if acquisition costs affect the ability to copy faithfully (Mesoudi 2011b)—a point that applies to all modes of social learning but appears to be especially important for model-based learning? There also is another issue involved with the fidelity of copying, and it involves the difference between imitation, copying the form of an action, versus emulation, copying the result of an action sequence. This distinction sounds clear enough, but it can be difficult to demonstrate empirically. As an example, let’s look at the bearded capuchins that live in the savannah of Brazil. One of the monkeys’ economic pursuits involves cracking tough palm nuts using large stones as hammers and stone or log surfaces as anvils. This is no simple task, in that it involves proper stance, proper placement of a nut on an anvil, and a proper striking angle so that the nut doesn’t skip away. Adults crack the nuts routinely throughout the year, but juveniles rarely manage to crack a whole nut, even though from a young age and for several years they devote considerable time and effort to watching their elders and practicing pounding actions with bits of nut and small stones.

Can young monkeys learn to crack nuts, or at least improve their technique, from directly copying some aspect of the behavior of others? Some researchers (e.g. Fragaszy et al. 2013) think the answer is no. Beating on a nut because another monkey is pounding on one might increase the copier’s skill, but simply pounding a stone on a nut is not sufficient to crack it. Even after a young monkey reliably produces all the relevant actions, and in the correct sequence, it takes another year or more before it succeeds in cracking a whole nut. Does this mean, though, that all nonhuman primates are only good emulators but not imitators? No. Whiten et al. (2009), for example, report results from an experimental study in which a young chimpanzee watching another chimpanzee cracking nuts made repeated and moderately synchronous matching actions, but involving no nut or hammer.

With respect to the manufacture of a Clovis point, there is, as we will see later, a clear distinction between imitation—understanding the actions necessary to produce a point—and emulation—trying to produce a point without understanding the necessary actions (and their correct sequence). Stone-reduction sequences are complex procedures that require a significant amount of investment in terms of time and energy to learn effectively (Geribàs et al. 2010; Stout 2011). Clovis-point production is no exception (Bradley et al. 2010). Fluting can be a challenging technology to master, occurring after a point is already thinned to approximately 7.5 mm (Thomas et al. 2017). That doesn’t give the knapper much margin for error.

A Map of Decision Making

Learning, of whatever kind, results in decision making, whether it’s how to make a Clovis point, where to find the next meal, or whom to marry. Decisions are affected by two inputs: the kind of learning involved and the costs and benefits related to the knowledge acquired. Figure 13.5 shows a “map” of decision making that is defined by kind of learning along the horizontal axis and by costs and benefits along the vertical axis. Along the left edge, agents are purely individual learners—they use no information from others in making decisions. Along the right edge, agents are purely social learners—their decisions are based solely on copying, instruction, or other similar social processes. In between the extremes is a balance between the two—a flexible measure of the agents represented. The midpoint could represent, for example, a population of half social learners and half individual learners, or each individual giving a 50% weight to his or her own experience and an equal weight to that of others.

Fig. 13.5

A four-quadrant map for understanding different domains of human decision making, based on whether a decision is made individually or socially (horizontal axis) and the transparency of options and payoffs that inform a decision (vertical axis) (after Bentley et al. 2014)

We can compare the kinds of learning against the costs and benefits of acquiring that knowledge. The farther up one goes on the map, the more attuned an agent’s decisions will be to the potential costs and payoffs of various decisions. A projectile-point manufacturer, for example, might quickly learn that a certain shape of a base makes a point susceptible to catastrophic failure and thus would likely change the design. Such a decision might be made individually, which places you in the upper left quadrant, or there might be socially identified authoritative experts whom you copy, which places you in the upper right quadrant. As an agent moves down the map, the relation between an action and its impact on performance becomes less clear. At the extreme bottom edge are cases that correspond to total indifference, where choice is based either on randomly guessing among all possible choices (lower left) or copying from a randomly chosen individual (lower right). This area of the cost/benefit spectrum represents cases in which agents perhaps are overwhelmed by decision fatigue—for example, when the number of choices becomes prohibitively large to be processed effectively.

Based on what we see in small nonwestern groups today (e.g. Henrich and Broesch 2011; Muthukrishna and Henrich 2016), we would assume that similarly among Clovis groups, social learning was transparent, as members would have learned adaptive knowledge—tool making, hunting practices, medicinal-plant use, and the like—from respected experts in the group. If learning is nontransparent, then misinformation can invade the social-learning process, such as a misguided panic among a herd of social animals (Couzin et al. 2005). For humans, imagine a case where social influence is strong but transparency is low. This highly social, nontransparent situation might characterize disasters that occur through misguided conformity, such as people remaining in a burning building because they don’t yet see anyone else exiting or cult suicides, where everyone drinks the cyanide-laced Flavor Aid, and so on. Although the spread of misinformation is well-documented in modern media-saturated society (Aral et al. 2009; Garcia-Herranz et al. 2014; Vosoughi et al. 2018), we can assume it was much less common in the traditional subsistence societies of prehistory, except perhaps in cases of gossip or deception (e.g. Chagnon 2000), where expertise might not have been transparent to all members of a network.

Fitness Landscapes

We can overlay the map of decision making with peaks and valleys, as shown in Fig. 13.6, to create a fitness landscape. The geneticist Wright (1932, 1988) introduced the metaphor of a fitness landscape to describe the possible mutational trajectories that lineages take (evolve) from genotypes that lie in regions of low fitness to regions of higher fitness (Kvitek and Sherlock 2011). We can borrow this metaphorical landscape and turn it into a kind of design space, or, in biological terms, a morphospace (McGhee 2018). We can also adapt its features so that the highest peak on the landscape corresponds to the optimal design of something—a projectile point, for example—and lower peaks correspond to designs that, although not optimal, are good enough for the intended function at particular points in time. The landscape also contains valleys, which correspond to designs that yield negative fitness. An example of the latter would be a stone spear tip that is so thin that it consistently snaps on the slightest impact—not the best weapon to have when facing a charging animal (O’Brien, Boulanger, Buchanan, et al. 2016).

Fig. 13.6

The four-quadrant map shown in Fig. 13.5 with a fitness landscape superimposed; the view is from the lower left of the figure (from O’Brien, Boulanger, Buchanan, et al. 2016). The presence or absence of clouds corresponds to the transparency of potential costs and payoffs of a decision. Agents, shown as dots, attempt to find the optimum peak, either on their own or with help from other agents. Figure by Matt Boulanger

Note that the bottom half of the fitness landscape contains clouds, which begin to obscure the tops of some of the fitness peaks. Imagine that stone projectile points are variable in design such that some perform better than others for the purpose of, say, hunting mammoth. As the relationship between that variability and the performance for hunting mammoth becomes less clear, it equally becomes less clear as to what changes might be made to increase the performance of a point. Thus, an individual learner is likely to produce variation in design that drifts from one form to the other, but if an agent learns socially, he or she can use the actions of other agents as a guide, although they may be in no better shape to make informed decisions. As the connection between the variation produced and the outcome becomes clearer, agents can make more-informed choices, either singly or collectively (O’Brien, Boulanger, Buchanan, et al. 2016). Again, the key to fitness lies in the effect social learning has on individual learning. Copying can be adaptive if it makes individual learning less costly or more accurate. This means that agents use individual learning when it is cheap and reliable and switch to social learning when individual learning is expensive or inaccurate (Boyd and Richerson 1995; Castro and Toro 2004; Kendal et al. 2018).

Fitness landscapes can be simple or complex, depending on the transparency of costs and benefits. A “Mount Fuji” landscape, for example, has a clear solution: The optimum peak is so visible that all you need to do is align your strategy toward the mountain and start climbing. You can get to the top on your own by walking, or you can copy others who are also taking the hike. On more rugged landscapes, however, the highest peak may be over the next ridge, so to prevent getting stuck on a small nearby hilltop, you need to copy others more and more frequently (O’Brien et al. 2019). Most of those others will also be copying others, who will be copying others, and you hope that somewhere there is someone who actually sees the highest peak. This is why copying only works if at least some people, even if only a minority, are actually looking at the world around them rather than at other people. In other words, we need at least a few producers to supply information to all the scroungers (Mesoudi 2008).
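The contrast between climbing alone and copying others on a rugged landscape can be sketched in a few lines. The one-dimensional landscape, the starting positions, and the two learning rules here are hypothetical simplifications of our own and not a model taken from the studies cited. Individual learning is represented as a greedy climb to the nearest peak, and social learning as adopting the best design found by anyone in the group.

```python
# Hypothetical one-dimensional design space: index = design, value = payoff.
# There is a local peak at index 2 (payoff 5) and a global peak at index 8 (payoff 9).
landscape = [1, 3, 5, 4, 2, 1, 2, 6, 9, 7]


def climb(start, heights):
    """Individual learning: step to the better neighboring design until none is better."""
    position = start
    while True:
        neighbors = [i for i in (position - 1, position + 1) if 0 <= i < len(heights)]
        best = max(neighbors, key=lambda i: heights[i])
        if heights[best] <= heights[position]:
            return position
        position = best


def copy_best(positions, heights):
    """Payoff-based social learning: every agent adopts the best design observed."""
    best = max(positions, key=lambda i: heights[i])
    return [best for _ in positions]


# Four agents; only the one starting at index 6 can reach the global peak alone.
alone = [climb(s, landscape) for s in [0, 1, 3, 6]]
together = copy_best(alone, landscape)

# Without that agent, copying only spreads the local peak.
stuck = copy_best([climb(s, landscape) for s in [0, 1, 3]], landscape)

print(alone, together, stuck)
```

In this toy case copying moves the whole group to the highest peak only because one agent found it independently; remove that agent, and everyone ends up on the small nearby hilltop.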

We need to make clear that not all decisions affect fitness, meaning that not all decisions are a matter of life and death. You might, for example, want to buy a mobile phone, but you have no idea of what color to get. Does your fitness rely on which one you pick? Probably not. You could simply look around, point to someone else’s phone, and say, “I’ll have what she’s having,” to quote a well-known saying (Bentley et al. 2011). However, just because choices seemingly do not have payoff differences with respect to fitness doesn’t mean they are always unrelated to fitness (O’Brien et al. 2019). Take, for example, carpet designs (Tehrani and Collard 2002), pottery decorations (Neiman 1995; Shennan and Wilkinson 2001), or synonymous words (Bentley 2008; Bentley et al. 2012). It is difficult to think of designs affecting one’s fitness. Whether a potter incises triangles or circles into a still-wet ceramic vessel does nothing to affect the ability of the pot to hold and steam food, and the same applies to the designs woven into carpets or whether we say “cop” or “policeman.” In the language Dunnell (1978, 1980) used, we would say that designs and synonymous words are stylistic, meaning they have neutral selective value.

Suppose, however, that designs are tightly restricted in terms of social norms, so that you have only a limited number of designs from which to choose. With respect to options 1, 2, or 3, your choice is selectively neutral, but if you pick from outside that range, you could face criticism or even ostracism. All of a sudden, what seemed to be a matter of style becomes a matter of function. Stylistic cultural elements have a payoff based on the particular distribution of choices among other agents, which may favor conformity, anti-conformity, frequency dependence, and so on, none of which depends inherently on the choice itself but rather on its frequency among other agents and their social-learning networks.

Clovis Populations and Patterns of Learning

In a growing and fast-moving population subject to the widespread environmental changes of, for example, late Pleistocene North America, it is understandable why biased-learning strategies, including prestige bias, would have played a key role in fluted-point technologies (Sholts et al. 2012; O’Brien, Boulanger, Buchanan, et al. 2016; O’Brien and Buchanan 2017): When faced with possible weapon failure, especially on an unfamiliar landscape, your safest bet might be to adopt the best model from whom to learn and not change. Under circumstances where ecological conditions change, say, on a generational scale, the mean trait value is often optimal, leading to frequency-dependent bias, or conformism (Henrich and Boyd 1998). In western North America, where Clovis technology apparently began (Beck and Jones 2010; Hamilton and Buchanan 2007; Meltzer 2009; Morrow and Morrow 1999; Waters et al. 2011), point production appears to have been fairly specialized in terms of form, perhaps a result of the focus on fewer prey species in a more stable environment compared to the East (Buchanan et al. 2011). This is consistent with a stronger degree of social learning (biased transmission) in the West relative to the East, as western groups produced fewer point forms overall, and a few particular forms were produced more frequently (Buchanan et al. 2017).

As Clovis groups began moving into eastern North America, they would have encountered environments that were more heterogeneous than those in the West (Thompson et al. 1993), incorporating a greater number of floral and faunal habitat types and greater variability in resource patches (Eren et al. 2015; O’Brien 2019a). A concomitant change in subsistence strategy could have come with a cost to forager time budgets (Buchanan et al. 2017), meaning that populations would have had to invest more time in accumulating knowledge about unfamiliar landscapes in order to understand where productive resource patches were located and in traveling between a greater number of smaller patches. Time available for detailed teaching and learning projectile-point production in the East could have been comparatively diminished, leading to a flourish of individual trial-and-error learning and experimentation, which resulted in higher rates of interregional variation (O’Brien and Buchanan 2017). If this can be demonstrated archaeologically, it says a lot about population dynamics. As we will see below, it can be documented.

Models of cultural learning indicate that a mix of social and individual learning is adaptive in environments “that change too rapidly for innate, genetic responses to evolve, yet not so rapid that previous generations’ solutions to problems are out-of-date” (Mesoudi 2014: 66). Increasing chronological resolution of the last several thousand years of the Pleistocene has shown that the transition to the Holocene ca. 11,700 cal BP was anything but gradual and uniform, especially in the East (Denton et al. 2010), suggesting this would have been a time when individual learning, at the aggregated group level, might have conferred an advantage, especially if coupled with conformist bias (Hamilton and Buchanan 2009). In other words, information producers took over a larger proportion of the learning process. This appears to account for the significantly greater diversity in Clovis points from the East than in those from the West (Buchanan et al. 2017).

Here, we are using diversity to refer strictly to differences in point shape. Two scenarios have been proposed for the diversity. In one, Clovis groups adapted their hunting gear to the characteristics of prey and local habitat, which resulted in regionally distinctive point shapes (Buchanan et al. 2014; Bement and Carter 2015). In the other scenario, there are no significant regional differences in shape, and any variation is attributable to stochastic mechanisms such as copy error, or drift (Morrow and Morrow 1999; Buchanan and Hamilton 2009). The two scenarios, however, are not mutually exclusive (O’Brien, Boulanger, Buchanan, et al. 2014; Eren et al. 2015). Colonizing populations do not necessarily stay in constant contact with one another, especially as geographic distance between them increases, and thus over time point shapes can begin to drift. Similarly, as they move apart, populations may begin to adapt point shape to regional environmental conditions that differ from those encountered by other groups. In other words, populations begin to explore different local fitness peaks (O’Brien, Boulanger, Buchanan, et al. 2016).

Diversity, however, can also refer to aspects of Clovis points other than shape, including the manner in which they were flaked (O’Brien 2019b). Several recent studies of flaking have shed considerable light on Clovis learning. One study used laser scanning and Fourier analysis to examine flake-scar patterns on a sample of Clovis points from sites across North America (Sholts et al. 2012). This analysis suggested that flaking patterns were similar across the continent, with no evidence of diversification, regional adaptation, or independent innovation. The authors proposed that the lack of diversification was tied to the importance of outcrops of desirable tool stone, where “Clovis knappers from different groups likely encountered each other … [which] would have allowed knappers to observe the tools and techniques used by other artisans, thereby facilitating the sharing of technological information” (Sholts et al. 2012: 3025; see Maher and Macdonald 2020). This sharing created the uniformity in production seen in their sample—a classic case of conformist bias (Sholts et al. 2012), which is a strong form of stabilizing selection.

One significant aspect of Sholts et al.’s study was their inclusion of 11 replicate Clovis points made by a modern flintknapper who is well known in the knapping world for his ability to make “superb Clovis points” that are “as thin as anyone could make them” (Whittaker 2004). He copied points from the Drake Clovis cache in Colorado and not only passed them off to highly knowledgeable collectors as authentic but, at least for a while, fooled any number of professional archaeologists highly familiar with Clovis points. How was he able to get away with it? For one thing, he was a master flintknapper and was able to reverse engineer certain aspects of the Drake points (Preston 1999) and then copy them. Until the study by Sholts and colleagues, it was widely believed that the replicas were all but perfectly executed and that his mistakes, which eventually revealed the points’ inauthenticity, were his choice of Brazilian quartz as the raw material for some of the replicas (archaeologists assumed the stone was simply from an unknown western North American source) and his use of red clay to buffer the effects of a rock tumbler that knocked off the sharp flake-scar ridges, which would have been sure signs of modern replication.

Analysis by Sholts and colleagues showed, though, that there was another dead giveaway: As skilled a knapper as he was, he could not consistently copy a Clovis knapper’s pattern of flake removal. In other words, the modern flintknapper—again, a person widely recognized as one of the best there is—could sometimes replicate the flake-removal pattern of a Clovis knapper, but he was inconsistent in his ability to do so. As the flintknapper later told a journalist (Preston 1999), “I just stopped and looked at [a] piece and said, ‘That really looks like a Drake-style Clovis if I stop right there.’ Until then, I had always kept going, cleaning up the edges, making the point smoother, getting the symmetry dead on, and really dressing the thing up. What I’d been losing was its immediacy, its simplicity.” The real reason, of course, for his failure to consistently match the flaking pattern was that he was born 13,000 years too late to have worked side by side with a Clovis craftsman. He was a master emulator but only a so-so imitator (O’Brien and Buchanan 2017).

Eren and colleagues (2015) subsequently used a sample of 115 Clovis points from three chert outcrops in the Upper Midwest as an additional test of the findings by Sholts et al. (2012) that there was no evidence for diversification, regional adaptation, or independent innovation in flaking pattern. Bradley et al. (2010: 177, 106) had proposed that “Clovis flaked stone technology exhibits a bold, confident, almost flamboyant strategy” that “focuses on the removal of large well-formed flakes.” Eren and colleagues formulated a straightforward, quantitative measure of “boldness”: the number of flake scars on a face divided by the square area of a fluted point. The smaller the value, the bolder a point’s flaking pattern. They also used geometric morphometrics to assess variation in shape, but as opposed to the sample used by Sholts and colleagues, which came from scattered regions of North America, the sample used by Eren and colleagues came from a more restricted, environmentally homogeneous region in order to maximize the probability that any patterned variation in point shape should be attributable not to differential adaptation by Clovis groups but rather to decreased social interaction among them. Statistical analysis of flake-scar patterning confirmed that the production technique was the same across the sample—matching the findings of Sholts et al. (2012)—but geometric morphometric analysis also showed distinct differences in point shape associated with the stone outcrop from which particular Clovis points originated.
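The boldness measure is a simple ratio, and a minimal sketch makes the direction of the scale explicit. The function name, the use of square millimeters, and the two sets of measurements are hypothetical and are not values reported by Eren and colleagues.

```python
def boldness(flake_scar_count, face_area_mm2):
    """Flake scars per unit area of a point face; smaller values mean bolder flaking."""
    if face_area_mm2 <= 0:
        raise ValueError("face area must be positive")
    return flake_scar_count / face_area_mm2


# Hypothetical measurements for two points with the same face area.
points = {
    "point_a": boldness(18, 2400.0),  # fewer, larger flake scars
    "point_b": boldness(30, 2400.0),  # more, smaller flake scars
}

# The point with the smallest ratio has the boldest flaking pattern.
bolder = min(points, key=points.get)
print(points, bolder)
```

Because the scar count is divided by area, two points of different size can be compared directly, which matters when a sample includes both large and heavily resharpened specimens.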

The dichotomous, intraregional results from the Upper Midwest strongly suggest that Clovis foragers engaged in two tiers of social learning (Eren et al. 2015; O’Brien, Boulanger, Buchanan, et al. 2016; O’Brien and Buchanan 2017; O’Brien 2019b). The ancestral tier, which is an example of deep homology, relates to point production and can be tied to conformist transmission of ancestral tool-making processes across the larger North American Clovis population (Sholts et al. 2012), where dispersing Clovis groups were still socially connected across large regions of the continent and directly exchanging technological knowledge, resulting in a low interregional variance in how points were being flaked. The derived tier is tied to point shape, which shows more interregional variance (Eren et al. 2015; Buchanan et al. 2016) as a result of individual populations spending more time at different stone outcrops. In that tier, the apparent pattern of increased experimentation in shape is what we would expect from guided variation, which is unbiased transmission plus environmental (individual) learning (Boyd and Richerson 1985). In other words, in the absence of strong selection, a population will move toward whichever trait is favored by individual-learning biases (Mesoudi 2011a; O’Brien, Boulanger, Buchanan, et al. 2014; Gingerich et al. 2014; O’Brien and Buchanan 2017). Again, this occurs even when the strength of guided variation is weak (Mesoudi 2011a). It should come as no surprise that shape and flake-removal patterns would be driven by different learning and transmission processes (O’Brien and Buchanan 2017). Flaking patterns are a form of “structural integrity,” in which key components are more conservative and therefore less likely to change relative to other components—a phenomenon that occurs in other aspects of culture as well (Mesoudi and Whiten 2008).
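The claim that guided variation moves a population toward the individually favored trait even when it is weak can be illustrated with a deliberately simple sketch. We assume, for illustration only, that each generation inherits the population mean through unbiased transmission and then shifts a fixed fraction of the way toward the value favored by individual learning. The trait scale, the parameter names, and the numbers are hypothetical.

```python
def guided_variation(mean_trait, favored_value, learning_strength, generations):
    """Track the population mean under unbiased transmission plus individual learning.

    learning_strength is the fraction of the gap to the favored value that
    individual learning closes in each generation (0 = none, 1 = complete).
    """
    trajectory = [mean_trait]
    for _ in range(generations):
        inherited = mean_trait  # unbiased transmission leaves the expected mean unchanged
        mean_trait = inherited + learning_strength * (favored_value - inherited)
        trajectory.append(mean_trait)
    return trajectory


# Hypothetical shape index: the population starts at 10; individual learning favors 20.
weak = guided_variation(10.0, 20.0, learning_strength=0.05, generations=200)
strong = guided_variation(10.0, 20.0, learning_strength=0.5, generations=200)

print(round(weak[10], 2), round(strong[10], 2))
print(round(weak[-1], 2), round(strong[-1], 2))
```

Both runs end at essentially the same value; the strength of individual learning changes how quickly the population gets there, not where it ends up.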

Over time, the continent-wide method of point manufacture began to shift. In a follow-up study to the one by Sholts et al. (2012), Gingerich et al. (2014) examined flake-removal patterns on specimens of Early Paleoindian eastern fluted-point types that immediately postdate the height of classic Clovis-point manufacture and found more variation and bifacial flake-scar asymmetry than what Sholts et al. (2012) had found among Clovis points. Gingerich et al. (2014: 117) proposed that the differences could represent “a time-transgressive shift, where Clovis interaction and the direct transmission of knowledge responsible for consistent reduction techniques is breaking down, causing biface symmetry to become more variable with greater flake scar variation.” In other words, once individual Clovis populations began settling down, and thus encountering other populations on a more limited basis, even the conservative aspects of point manufacture began to dissolve (Sholts et al. 2012; Smallwood 2012; Eren et al. 2015). The resulting regionalization in the East produced a series of morphologically distinct unfluted and fluted forms, reflecting a “relaxation in the pressure to maintain contact with distant kin, a reduction in the spatial scale and openness of social systems, and a steady settling-in and filling of the landscape” (Meltzer 2009: 286).

Conclusion

We would be the last to claim that the theoretical models and analytical methods discussed here can be easily applied to the study of population dynamics generally (Shennan 2020). The dispersal of Clovis groups across North America represents an exceptional case because it occurred within such a short time span and across an area that had at best small resident populations that had not been there very long. Also, Clovis hunters used a stone weapon tip that, despite regional and temporal differences in shape, is a highly visible time marker. The result is that we have temporal resolution rarely seen in archaeology. Compare the resolution available for the spread of the Clovis techno-complex, ca. 13,400–12,500 years ago, to what archaeologists working in the Old World deal with, where resolution can range into the tens of thousands of years, if not more.

The spread of Clovis involved various kinds of learning. Early on, individual populations apparently maintained close social ties as they spread across the landscape, with the result being a pattern of flake removal on Clovis points that was reinforced across generations. Sholts et al. (2012) propose that this reinforcement came about as a result of groups meeting up at chert outcrops, which served as hubs of regional activity (Bradley et al. 2010; Sholts et al. 2012; Smallwood 2012; Waters et al. 2011). For a thinly scattered, mobile population such as Clovis or its immediate descendants, outcrops would have acted as ideal meeting spots because, once found, they would have served as predictable places on an emerging mental landscape map (Gardner 1977; Goodyear 1979; Miller 2016; Miller et al. 2018). Outcrops were places where Clovis groups could not only resupply but also exchange information and the like. This resulted in a low interregional variance in flaking patterns.

Over time, groups began to spend more time at specific chert outcrops (Eren et al. 2015), and although points were flaked similarly across regions, blade shape began to change. This interregional variance could have resulted from drift as well as from adaptation to different environments. At the level of the group, this increased experimentation in shape is what we would expect from individual learning (Boyd and Richerson 1985). Again, in the absence of strong selection, a population will move toward whichever trait is favored by individual-learning biases (Mesoudi 2011a). By the end of the Clovis period in the East, around 12,500 years ago, even the flaking pattern had become diversified (Gingerich et al. 2014), which strongly supports the notion that at the macroscale, social learning had been more or less eclipsed by guided variation.

Future work will be directed toward phylogenetic and morphometric analyses of post-Clovis point assemblages to assess what they might tell us about population dynamics in the resource-rich river valleys of eastern North America. We know, for example, that the Upper Southeast (the modern states of Missouri, Arkansas, Kentucky, Tennessee, North Carolina, and Virginia) contains more post-Clovis projectile-point shapes than any other region in the East (Eren et al. 2016), which is consistent with proposals that the river valleys of the Ohio, Tennessee, and Cumberland were arteries for colonizing populations moving east (Anderson 1990, 1996; Smallwood 2012; Broster et al. 2013). If those diverse type forms are proxies for populations, then they should be useful for tracking various groups that budded off and started moving to the Northeast and Southeast, encountering what perhaps were new fitness landscapes, complete with never-before-scaled fitness peaks and requiring a new mix of individual- and social-learning strategies. This is an exciting prospect for those of us interested in identifying prehistoric populations using items found in the archaeological record.

Note

  1. The program used to create the tree was PAUP* (version 4) (Swofford 1998).