Introduction

Paleolithic archaeology has benefitted from many new methods and approaches, some of which are illustrated by the papers in this special issue, that are providing valuable new insight into human biocultural evolution, ecology, and society. However, the central theme of this issue of the Journal of Paleolithic Archaeology is to explore how the cultural taxonomies that are pervasive in Paleolithic archaeology can be modified, improved, standardized, and made more useful. It developed from presentations and discussions at the CLIOARCH conference at Aarhus University in November 2019, in which participants sought to address the analytical and interpretive challenges posed by the numerous classifications of artifacts and assemblages, accumulated over a century and a half, that are problematic in various ways when applied to the Upper Paleolithic archaeological record (Reynolds & Riede, 2019; Riede et al., 2020).

Of course, there is a substantial literature on the epistemology and utility (or lack thereof) of cultural taxonomies and the classifications on which they are based (e.g., Binford & Sabloff, 1982; Clark, 1993, 1994; Clark & Lindly, 2015; Clark & Riel-Salvatore, 2006; Dibble et al., 2017; Holdaway & Douglass, 2011), with recent critiques by Shea (2014, 2019) describing them as NASTIES (NAmed Stone Tool IndustrIES). Shea points out that, despite impressive methodological advances over the past 25 years, there are widening disconnects between the goals of twenty-first century Paleolithic archaeology and the culture-historical approaches within which these techno-typological categories were long embedded. However, these critiques seem to have had relatively little impact on actual archaeological practice, and even the most sophisticated studies often remain tethered to Hamburgians, Pavlovians, Azilians, Epigravettians, and the like (cf. Bosselin & Djindjian, 1999; Straus & Clark, 2000). While these categories have been variously referred to as cultures, industries, and technocomplexes, they are widely interpreted (implicitly and occasionally explicitly) in the literature as proxies for those elusive bodies of shared, socially transmitted knowledge and practices that anthropologists refer to as “cultures”; hence the explicit focus of this special issue on “cultural taxonomies.”

While other contributors to this volume assess how classification can be improved and made more rigorous, our goal is to contribute to the aims of this issue by examining Upper Paleolithic artifact classification as a proxy for cultures. We want to be clear at the outset that we have no doubt that the Upper Paleolithic (and earlier) denizens of Europe had “culture” and organized themselves into identity-conscious social groups mediated by cultural knowledge. Given recent studies of animal behavior that make a credible case for traditions of socially transmitted behaviors in chimpanzees and other animals (Boesch et al., 2020; Jelbert et al., 2018; Schofield et al., 2018), hominins of Pleistocene Europe certainly possessed culture. Likewise, there is clear geographic and temporal patterning in the Upper Paleolithic archaeological record of Europe, and the technology represented in this record was certainly a result of shared knowledge and practices. The issue is what the classifications generated by archaeologists to organize the spatial and temporal variability in Upper Paleolithic material culture mean, and whether and to what extent they are representative of the cultural knowledge and social groups in which these ancient people lived.

Paleolithic cultural taxonomies are hierarchically organized constructs based on classifications of similarities and differences among archaeological assemblages. Assemblage groupings are based on similarities and differences in the frequencies and presence of artifact classes or types. Individual artifacts are classified into types on the basis of their observed morphology. Hence, to address relationships between current classifications of prehistoric material culture and cultures, we review the Upper Paleolithic archaeological record at similar levels of analysis: artifact typology, implications of artifact typology for classifications of assemblages, and the relevance of assemblage classes to culture. Within each of these levels, understanding how categories were created and assigned meaning helps to contextualize how they continue to be maintained and applied today. We close with some observations on the objectives of Paleolithic research.

Artifact Classification and Typology

Underlying many of the current cultural classifications of Paleolithic Europe are typologies, or classification systems, for artifacts recovered from archaeological sites. Fortuitous finds and cases of exceptional preservation demonstrate that Upper Paleolithic people, like historic hunter-gatherers, made and used a wide array of technologies as well as objects of personal adornment. However, most such items are poorly preserved and rarely encountered; the vast majority of Paleolithic sites produce only stone artifacts. Even artifacts made from durable organic material (bone, antler, teeth, shell) are usually so rare that Paleolithic cultural taxonomies ultimately are based almost entirely on lithics—with a few exceptions where bone artifacts contribute to but do not define cultural taxa (e.g., Aurignacian split-based bone points or Magdalenian and Azilian antler harpoons). Lithics do display considerable morphological variability, with apparent patterning in that variability across space and through time, and lithic assemblages can be assessed according to the degree of similarity or difference among them. The issue is whether there is a discernible relationship between the variability in lithic morphology that archaeologists record and classify, and patterns of cultural knowledge among ancient societies of Upper Paleolithic hunter-gatherers. To address this issue, it is useful to examine the ways archaeologists have used typologies to measure lithic variability and the ways in which morphological variability is generated in lithic technology.

Most lithic typologies found across Europe today use versions of classifications developed by French prehistorians prior to World War I (e.g., Gabriel and Adrien de Mortillet, Edouard Lartet, Edouard Piette, Georges d’Ault du Mesnil, Henri Breuil) and between the wars (e.g., Francis Turville-Petre, Denis Peyrony, René Neuville, Dorothy Garrod, Diana Kirkbride). The evolution of these classifications continued after World War II, still dominated by French prehistorians (e.g., François Bordes, Denise de Sonneville-Bordes, and to a lesser extent Georges Laplace). Between the early 1950s and the mid-1970s, Bordes’ typologies became the “industry standard” for Europe, standardizing artifact-type names and listing them in a fixed sequence so that assemblages could be characterized quantitatively with cumulative graphs. However, many of the type names originated in the nineteenth century with French prehistorians, who perceived them as stone versions of contemporary metal tools and assigned them names like racloirs (side scrapers), grattoirs (end scrapers), burins (gravers), and couteaux (knives). After the first use-wear studies appeared in the 1980s, most prehistorians came to acknowledge that these names had little functional significance but were simply historical terms for morphological classes. Nevertheless, usage in the literature suggests that many still implicitly associate grattoirs with scraping hides, burins with engraving, and couteaux with cutting, even though microwear studies indicate that most lithic artifacts were used for multiple and diverse purposes (e.g., Barton et al., 1996; Frison, 1968; Hardy et al., 2008).

Also retained was a pervasive assumption (sometimes explicit) that each of these types was intentionally shaped by ancient artisans through retouch into the forms found in the archaeological context. Lithic morphology came to be seen through the lens of a modern industrial paradigm where tools of metal and other materials are designed and mass produced in factories by specialist workers. “[T]hese artifacts would be viewed as distinct tools, analogous to those in a modern toolbox, whose forms were planned to correspond with their intended uses and with stylistic considerations determined by cultural traditions” (Barton, 1991). This paradigm underlies the rationale for dividing lithic morphology into distinct classes which encode information about culture and/or behavior. It also has implications for the way in which lithic assemblages are classified, and what those assemblage classes are perceived to mean for Paleolithic societies. From this perspective, lithic types are modal patterns of variability, discovered by archaeologists, that originated as designs in the minds of ancient artisans who shaped lithic artifacts to conform to these “mental templates.” Similarities in artifact forms represented by these types are thus considered to be a proxy for shared cultural knowledge about stone tool designs. More recent emphasis on technological production sequences or chaînes opératoires likewise assumes preconceived design, whose final form is the one found in archaeological assemblages (Bleed, 2001; Shott, 2003; Tostevin, 2011).

Upper Pleistocene foragers certainly transmitted cumulative cultural knowledge through social learning and were able to preconceive designs that were subsequently executed in material culture. But it is considerably less certain that patterns of lithic morphology are clear-cut or relatively direct products of such processes. While this industrial paradigm is internally consistent, and consistent with the way we experience technology in the modern world, there are reasons to maintain a healthy skepticism about its relevance for the lithic technology of Paleolithic hunter-gatherers.

First, lithic technology is very old, extending back to the earliest recognizable human ancestors as much as 3.3 million years ago (Braun et al., 2019; Harmand et al., 2015). While the cognitive and physical capacity for flint knapping (e.g., hand–eye coordination, ability to control force and striking angle, etc.) may have a genetic basis, the practice of making stone artifacts was almost certainly transmitted by extra-somatic means, just as it is today (Shennan, 2020; Tostevin, 2012). However, the low diversity and very slow rate of innovation that characterize lithic technology of the Lower and Middle Pleistocene suggest that cultural knowledge and social learning took different forms than they do in humans today. That is, the links between lithic morphology and cultural knowledge were probably maintained and expressed in ways quite different from the relationships between culture, social learning, and the forms of objects like atlatl weights, arrow decoration, and pottery designs. Additionally, fracture mechanics generate significant equifinality in the (few) processes by which humans can chip stone, leading to a great deal of technologically driven convergence in resulting forms that can override any hypothetical “cultural” component (Will & Mackay, 2020). To further complicate things, many people today sculpt, carve, and otherwise decorate wooden objects, and make ceramics, while the regular practice of lithic technology has been extinct in much of the world for millennia and worldwide since the nineteenth century, precluding direct observation of such linkages.

Second, there is a compelling alternative to the execution of preconceived design to explain variability in lithic morphology. Based on ethnography and detailed morphological studies, this alternative holds that most lithic artifacts were generic tools with multiple potential functions at the time of initial manufacture. Their final form (i.e., the form found by archaeologists), however, is a result of their life histories as they were used for a task or a series of different tasks and modified to varying degrees through retouch to maintain their utility until finally discarded as no longer useful. Hence, they often bear little resemblance to the morphology they had when initially manufactured, even if an ancient artisan had a preconceived goal in mind (Barton, 1991; Dibble, 1987, 1995; Dibble et al., 2017; Frison, 1968; Sackett, 1988). In other words, lithic assemblages in archaeological contexts (unlike stone “tool kits” carried by living foragers) are dominated by undesired morphologies, not desired ones. From this perspective, lithic technology can be viewed as a decision tree (Bleed, 2001) in which decisions to retouch or discard a flake or blade are opportunistic responses to the tasks that need to be accomplished and the immediate or anticipated availability of raw material (Barton, 1991; Barton & Riel-Salvatore, 2016; Dibble, 1995; Hiscock, 2007; Riel-Salvatore & Barton, 2004).

Some retouched stone tools, like Solutrean foliate bifaces, certainly appear to have been preplanned and little altered subsequently (Dibble et al., 2017). Similarly, stone projectile tips and microliths designed for insertion into pre-made armatures of bone, wood, or other material of compound weapons could not have been manufactured in purely opportunistic decision trees or have forms that resulted only from life histories of use. Similarities in such artifacts among different assemblages could then signal socially transmitted cultural knowledge about morphologies best suited to tipping particular armatures. However, there is evidence that even the stone components of compound weapons could be morphologically dynamic between initial manufacture and discard (Flenniken & Raymond, 1986; Goodyear, 1974; Neeley & Barton, 1994; Shott & Ballenger, 2007; Wilke & Flenniken, 1991), and there has been little systematic research to differentiate predesigned morphologies from those that result from life histories in archaeological assemblages. Beyond the tips of compound weapons, however, considerable evidence has accumulated from ethnography, quantitative analyses of lithic morphology, and use-wear studies for the life history paradigm to account for the great majority of lithic forms (e.g., Barton et al., 1996; Clarkson, 2005; Dibble, 1995; Frison, 1968; Gould et al., 1971; Hardy et al., 2008; Hiscock, 2004; Holdaway & Douglass, 2011; Kuhn, 1991; Neeley, 1989; Tomášková, 2005; Weedman, 2002).

The life-history paradigm has significant implications for the potential cultural significance of assemblage classifications based on lithic typologies. If most lithic variability is primarily a product of opportunistic decision trees and final artifact forms represent a threshold of uselessness, then most similarities among assemblages will be primarily a function of contextual phenomena: access to and characteristics of raw material (quality, package size, etc.), length of site occupation, tasks performed and their frequency, and how these phenomena are embedded in a settlement/subsistence system (Barton & Riel-Salvatore, 2014; Clark, 2002; Dibble et al., 2017; Freeman, 1994). It follows, then, that similarities in lithic morphology driven by such contexts, with which all foragers had to contend, would not be reliable proxies for social interaction or common cultural descent.

Finally, even for lithic artifacts where an industrial paradigm might be relevant such that their spatial–temporal distributions could serve as a proxy for social interaction, there are issues in using them to trace multigenerational traditions of cultural descent. For biological organisms, the best traits for identifying ancestor/descendant relationships are those unaffected (or minimally affected) by selection, i.e., homologous traits used to identify synapomorphies. Likewise, selectively neutral variations in material culture should be the better proxies for tracing cultural descent (e.g., O’Brien & Holland, 1990; Sackett, 1973; Shennan, 2020). Such neutral variation in artifact morphologies has been termed style, emblemic style, or isocrestic variability in long-running debates over the definition of style, and relationships between style and function in past decades (e.g., Dunnell, 1978; Sackett, 1990; Wiessner, 1985). However, cultural transmission is more complicated than genetic inheritance because cultural information can be transmitted diagonally and horizontally between different social groups in addition to vertically from one generation to the next, and strong selective pressures can promote rapid horizontal transmission of adaptive technologies (i.e., innovations) that reflect social interaction but not necessarily cultural phylogeny.

Lithic artifacts were fundamental human technologies and an integral component of the human niche from our earliest ancestors until, in some cases, as recently as the nineteenth century. They were the primary technology for modifying other materials, including producing other items of technology. The fundamental role of stone artifacts in the human socio-technological niche means that most aspects of lithic technology were probably under strong selective pressure. For example, beginning around 2000 years ago, small projectile points replaced large ones across all of North America. These were almost certainly the tips of arrows, replacing the tips of atlatl darts. While this could be attributed to the spread of the “bow and arrow people” across the continent, it is much more likely (and widely accepted by archaeologists) that this dynamic was the result of selection pressures favoring bow-based weapons delivery systems over those of atlatls, possibly associated with shifts in hunting practices and prey, and the adoption of agriculture as a primary means of subsistence (e.g., Bettinger & Eerkens, 1999; Shott, 1996a).

In summary, we are not arguing that it is impossible to find features of lithic artifacts that could serve as proxies for cultural descent. However, the empirical support and explanatory power of the life history paradigm to account for morphological variability in chipped stone artifacts, the fact that most lithic artifacts were deposited when they no longer had any utility, and the probability that much of lithic technology has long been under strong selection pressure make identifying lithic morphologies that could serve as reliable markers of cultural descent difficult (Barton, 1997; Clark, 2002, 2011; Clark & Riel-Salvatore, 2006). Nevertheless, a systematic program to identify dimensions of lithic morphology that can be convincingly attributed to selectively neutral features that persist in the discarded lithic artifacts that dominate the Pleistocene archaeological record would be a valuable contribution to our ability to identify patterns of prehistoric social interactions and knowledge transmission (see Tostevin, 2019 for an example).

From Typology to Taxonomy

The phrase cultural taxonomy implies a system of classification for prehistoric cultures. However, in practice for Paleolithic Europe, this refers to the classification of archaeological (mainly lithic) assemblages thought to serve as proxies for unobservable prehistoric cultures. Note that this is not a taxonomic organization of artifact classes but rather aggregates of artifact types used to generate assemblage types, taken to serve as proxies for prehistoric societies, then organized into cultures that are, in turn, often presented as having phylogenetic significance.

Still used today, phylogenetic approaches borrowed from paleontology were fundamental to the development of Paleolithic cultural taxonomies in the late nineteenth and early to mid-twentieth centuries. An example is Breuil’s “parallel phyla”: time-successive sequences of different industries, one “evolving” into another (e.g., the Aurignacian and Perigordian in Europe, the Ahmarian and Levantine Aurignacian in the Levant; see Binford & Sabloff, 1982 for a critique). Initially, they involved identifying one or several (usually lithic) artifact types to serve as indicators of an assemblage type (a “type fossil” or fossile directeur). As noted above, these marker types were assumed to represent selectively neutral traits like their paleontological analogues (although their neutrality was never tested), appropriated by prehistorians as ambiguously defined artifact styles. This approach continues to be applied to some taxonomic units of the Upper Paleolithic of Europe. For example, Dufour bladelets identify Aurignacian assemblages, and microgravettes mark the Gravettian (Clark, 2006; Zilhão, 2001; Zilhão & d’Errico, 1999).

François Bordes and Denise de Sonneville-Bordes proposed in the 1950s to classify assemblages based on the proportions of a standardized sequence of retouched artifact types rather than fossiles directeurs (Bordes, 1961, 1972; de Sonneville-Bordes & Perrot, 1953). While conceptually similar to numerical taxonomy in biology, this was rather different in archaeological practice, using simple percentages, indexes of type groups, and visual aids like cumulative percentage graphs interpreted subjectively. When multivariate statistics were applied, most famously to Middle Paleolithic assemblages by the Binfords (Binford & Binford, 1966), they revealed assemblage groupings that did not match those recognized by Bordes and colleagues, initiating the long-running “Bordes-Binford debate” (Binford, 1973; Binford & Binford, 1969; Bordes, 1969, 1973, 1981; Buscot & de Sonneville-Bordes, 1970). The debate was never resolved and, as a consequence, there has been relatively little subsequent effort to apply the powerful multivariate methods of numerical taxonomy (e.g., cluster analysis, principal components analysis, etc.) to the analysis of variability in Paleolithic assemblages (for exceptions, see Riede et al., 2019; Weiss et al., 2017). Artifact-type proportions and aggregate indexes, subjectively interpreted, remain a primary means of classifying Lower and Middle Paleolithic assemblages today. Upper Paleolithic assemblage classification relies on a mixture of these methods, augmented by a limited number of type fossils (Clark & Riel-Salvatore, 2005). Simple quantification, a lack of neutrality tests, and subjective interpretation, combined with imprecise definition and inconsistent identification of type fossils and variable definitions of taxa at all levels, have led to increasing difficulties in using or assigning meaning to patterns in the archaeological traces of Paleolithic human behavior (Bisson, 2000; Reynolds & Riede, 2019; Shea, 2019). These problems and potential solutions are discussed comprehensively and cogently in other papers in this issue. We focus here on other fundamental conceptual issues linking artifact and assemblage-level classifications.
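For readers unfamiliar with the cumulative percentage graphs mentioned above, the following is a minimal sketch of how such a curve can be computed from type counts listed in a fixed sequence. The type list, the counts, and the helper names (`cumulative_percentages`, `max_curve_gap`) are all invented for illustration and are not taken from the Bordes or de Sonneville-Bordes type lists. The curve-gap measure is likewise only an assumed, simple stand-in for the visual comparison of curves, not a method attributed to any author cited here.

```python
from itertools import accumulate

# Hypothetical fixed type sequence; the order determines the curve's shape.
TYPE_SEQUENCE = ["end scraper", "burin", "backed bladelet", "side scraper"]


def cumulative_percentages(counts):
    """Return the cumulative percentage curve for one assemblage.

    `counts` maps type name -> number of retouched pieces. Types that are
    absent from the mapping are treated as having a count of zero.
    """
    ordered = [counts.get(type_name, 0) for type_name in TYPE_SEQUENCE]
    total = sum(ordered)
    if total == 0:
        raise ValueError("assemblage has no classified pieces")
    return [100.0 * running / total for running in accumulate(ordered)]


def max_curve_gap(curve_a, curve_b):
    """Largest vertical distance (in percentage points) between two curves."""
    return max(abs(a - b) for a, b in zip(curve_a, curve_b))


# Invented counts for illustration only; these are not real assemblages.
assemblage_a = {"end scraper": 40, "burin": 30, "backed bladelet": 20, "side scraper": 10}
assemblage_b = {"end scraper": 10, "burin": 20, "backed bladelet": 60, "side scraper": 10}

curve_a = cumulative_percentages(assemblage_a)
curve_b = cumulative_percentages(assemblage_b)
gap = max_curve_gap(curve_a, curve_b)
```

Because each curve depends on the order of the type list as well as on the counts, two analysts using different sequences would draw different curves from the same assemblage, which is one reason a standardized list mattered for this method.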

Since cultural taxonomies are generated from classifications of assemblages, in turn defined on the basis of classifications of artifacts, the different conceptual frameworks discussed previously for the measurement and meaning of variability in lithic artifacts will have significant consequences for identifying patterning and assigning meaning to assemblage classes. This is exemplified in the aforementioned Bordes-Binford debate, although it focused on Middle rather than Upper Paleolithic assemblages. The debate centered on the interpretation of assemblage types, based in turn on artifact-type proportions, and was never satisfactorily resolved for several reasons (Holdaway & Douglass, 2011). One reason is that both sides based their assemblage types and interpretations primarily on the artifacts with significant amounts of retouch (i.e., lithic “tools”) that make up a fraction (often a tiny fraction) of Paleolithic assemblages, even though ethnographic and microwear studies have repeatedly demonstrated that unretouched or minimally retouched flakes and blades were commonly used as tools and indeed were what prehistoric artisans probably had in mind most often when knapping toolstone (Hiscock, 2004; Holdaway & Douglass, 2011). Second, Binford, Bordes, and their supporters implicitly assumed that the forms of the retouched artifacts on which they based their respective assemblage classifications were results of preconceived designs desired by ancient stone workers, as discussed above. The debate ultimately came down to whether prehistoric craftsmen intentionally made more of one form of racloir than another because it was proper (i.e., cultural tradition) to do so (Bordes) or useful (i.e., functional) to do so (Binford). Because the toolmakers are long dead, discerning their intent is problematic. Finally, Bordes, Binford, and most other lithic typologists largely ignored the reality that most of what makes up the archaeological record (whether initially created from a preconceived design or not) gets there as exhausted, discarded, unwanted trash, the most common of what Schiffer (1976, 1987) describes as cultural formation processes.

In review, the Bordes-Binford debate is potentially resolvable with more sophisticated analytical procedures, but only from an industrial paradigm perspective that assumes:

  1. retouched lithics comprise the most behaviorally meaningful component of Paleolithic tool kits,
  2. most unretouched lithic material is only production waste,
  3. the forms of retouched pieces as archaeologists find them are the end result of a preconceived chaîne opératoire, and
  4. we can distinguish between lithic artifact morphologies that are under selective pressure and those that vary as neutral traits (e.g., Neiman, 1995).

Such a program might provide a means to identify and distinguish between patterns of similarity and difference among assemblages that are the result of cultural descent and those that are indicative of social interaction, including selection-driven forms, regardless of cultural phylogeny. However, we remain skeptical about the potential for resolving this debate in the light of compelling evidence that:

  1. unretouched artifacts were useful, used, and often the desired end product of lithic technology;
  2. most Paleolithic retouched artifacts were simply the most heavily reused and resharpened pieces that began their use lives as unretouched flakes or blades;
  3. many lithics were used for multiple tasks; and
  4. nearly all artifacts in nearly all Paleolithic assemblages were discarded trash.

These theoretical frameworks for lithic technology were pioneered more than 30 years ago in the very different contexts of the European Middle Paleolithic and in the terminal Pleistocene to mid-Holocene transition in North America (e.g., Bamforth, 1986; Barton, 1990a, b; Dibble, 1984, 1987, 1995; Kuhn, 1991, 1995; Rolland, 1981; Shott, 1989, 1996b). Nevertheless, a scan of recent Paleolithic literature suggests that while data collection, sampling designs, and analytical procedures have become more sophisticated, the actual practice of lithic analysis and, more importantly, the interpretation of its results have changed very little since the days of the Bordes-Binford debate. While the emphasis on technology embodied in the chaîne opératoire approach is a welcome advance (Boëda et al., 1990, 1995; Richter, 2001; Shott, 2003; Tostevin, 2011), the underlying assumptions, preconceptions, and biases about what causes the pattern to occur remain much the same.

From the perspective of the life history paradigm, patterns of assemblage-level similarity based on current lithic typologies can be explained more parsimoniously by processes other than tool-making “cultural traditions.” Since this conceptual framework implies that such patterns are ultimately the result of discard practices, it might seem of lesser archaeological interest compared with social interactions, specialized tool manufacture, and cultural phylogenies. However, the life history paradigm can indeed be informative about equally interesting aspects of Paleolithic life. If assemblage characteristics are in fact an emergent outcome of decision trees that enabled Paleolithic foragers to flexibly adapt lithic technology to local circumstances, then patterns in discard practices can serve as proxies for interesting and important dimensions of prehistoric life such as mobility strategies, settlement organization, food procurement, processing needs, and toolstone availability. Moreover, if lithic technology was indeed under strong selection pressure, we have a ready-made body of robust, extended evolutionary theory (e.g., human behavioral ecology, game theory, cultural transmission theory) to help account for such patterns. This approach has in fact proven productive in supporting a growing diversity of research topics such as forager mobility, land use patterns, resource use, social organization, niche construction, and even social interaction (Barton et al., 2011; Clark & Barton, 2017; Douglass et al., 2008; Kuhn, 1994; Marwick, 2008; McPherron, 2000; Riel-Salvatore, 2010; Riel-Salvatore & Barton, 2004).

We also emphasize that the life history paradigm does not preclude studies of prehistoric social interaction, although it does suggest that most heavily retouched stone artifacts are not useful proxies for these phenomena. For example, cross-assemblage similarities in details of core reduction might be a dimension of lithic technology to investigate for cultural descent, not least because the forms recovered archaeologically are more likely to represent the stone worker’s intent and different ways of creating functionally equivalent flakes or blades (i.e., neutral variants) (e.g., compare Gould et al., 1971; Hiscock, 2004). Chaîne opératoire methods are particularly well suited to such studies, although they will need to account for information transmission across low-density forager networks (Powell et al., 2009; Tostevin, 2011, 2019) and employ statistical methods to “unmix” the multigenerational palimpsests that comprise most Paleolithic assemblages (see below).

Assemblage Taxonomies and Paleolithic Culture

It has been widely recognized that, unlike sciences that communicate through the universal language of mathematics, archaeology suffers from a significant imprecision in basic analytical units and variables and a lack of consensus about the relationships among them. An important objective of the CLIOARCH conference was to try to rectify this deficiency. Toward that end, Riede and colleagues (Riede et al., 2020) identify a set of criteria for reorganizing current Paleolithic systematics to make taxonomic units more rigorous, reproducible, and useful. “Operational taxonomic units hinge on

  1. consistent criteria for their definition and delimitation,
  2. a clear taxonomic system into which such archeological entities are placed,
  3. agreement on the meaning of the relative ranks within such taxonomic system, and
  4. their prehistoric reality vis-à-vis anthropological, ethnic or linguistic notions of culture.”

Other authors in this volume offer additional proposals for developing more rational, coherent, and behaviorally meaningful taxonomies for Paleolithic Europe (see also Riede et al., 2019). We discuss here some underlying conceptual issues for such a program that warrant consideration.

We begin with a review of relevant concepts of culture and society. In the mid-twentieth century, cultural anthropologists Kroeber and Kluckhohn (1952) identified over 160 definitions of culture; the number of definitions has doubtless increased in the past 70 years. A workable definition relevant here that embodies many of these different perspectives might be:

a body of knowledge, shared through social learning and transmitted across generations, that provides ways of understanding social and environmental phenomena, and generates human behavior appropriate for those contexts and responsive to their changes, including the knowledge of how to make and use technologies.

An important point is that culture cannot be directly observed, even in living people; it must be inferred from observed behavior, conversations (including writing and other media), symbolic representations, and technologies. This is even more difficult for archaeologists, who cannot infer past culture from ethnographic observation of, or conversations with, long-dead informants (Perreault, 2019; Shennan, 2002). We are left only with what has been appropriately called material culture, the depauperate traces of once-vibrant communities within which technological knowledge was only a part, albeit an important one.

Cultures are usually juxtaposed with, and often confused with, societies, particularly in archaeology. Unlike cultures, societies can be directly observed, at least in the present. Again, there are many definitions, but a reasonable one for our purposes might be:

a group of individuals characterized by persistent social interaction who share a geographical area or territory, typically with the same patterns of relationships among individuals who share a language, a distinctive set of mores, customs and values that define a dominant set of expectations about behavior (i.e., culture). A given society can be described as the sum total of such relationships among its constituent members.

Society and culture are intimately interrelated, of course. Culture mediates, and is transmitted through, social learning and practice. But while societies can be directly observed in the present, they cannot be observed in the past, and we must again depend on the physical remnants of technology to serve as proxies for past societies. In fact, the same material proxies that can be claimed to stand for shared cultural knowledge are equally proxies for groups of interacting social actors. Since cultural transmission is a consequence of social interaction, it is difficult to disentangle these processes archaeologically, and attempting to do so may in fact be irrelevant in practice.

Similarity in artifacts or assemblages can result from within-group cultural transmission, between-group social interaction, or common situational contexts affecting lithic procurement, curation, and discard with which all foragers must contend. Selectively neutral morphologies that could serve as proxies for cultural transmission, if they can be convincingly identified, should theoretically have different space–time distributions than morphologies under strong selection pressure. However, distinguishing between these in practice in the discontinuous, temporally coarse-grained Paleolithic record is difficult, and there have been few systematic attempts to do so (see Barton, 1997; Bettinger et al., 1997; Neiman, 1995).
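The contrast between neutral variants and variants under selection can be illustrated with a toy copying model. The following is only a minimal sketch under stated assumptions, not a reconstruction of any model in the studies cited above: the population size, the two variant labels "A" and "B", the `bias` parameter, and the function names are all hypothetical choices made for illustration.

```python
import random


def next_generation(pop, rng, favored=None, bias=0.0):
    """Each learner copies the variant of one model from the previous generation.

    With bias == 0 every model is equally likely to be copied (neutral
    transmission). With bias > 0, models carrying the `favored` variant are
    proportionally more likely to be copied (a stand-in for selection).
    """
    weights = [1.0 + bias if variant == favored else 1.0 for variant in pop]
    return rng.choices(pop, weights=weights, k=len(pop))


def frequency_trajectory(n=100, generations=200, bias=0.0, seed=0):
    """Frequency of variant "A" in each generation, starting from 50 percent."""
    rng = random.Random(seed)
    pop = ["A"] * (n // 2) + ["B"] * (n - n // 2)
    trajectory = []
    for _ in range(generations):
        pop = next_generation(pop, rng, favored="A", bias=bias)
        trajectory.append(pop.count("A") / n)
    return trajectory


if __name__ == "__main__":
    neutral = frequency_trajectory(bias=0.0)
    selected = frequency_trajectory(bias=0.5)
    # Print the frequency of "A" every 50 generations for both cases.
    print("neutral: ", neutral[::50])
    print("selected:", selected[::50])
```

In this sketch the neutral variant's frequency wanders by drift alone, whereas the favored variant spreads rapidly and predictably. The archaeological difficulty noted above is that a time-averaged, coarsely dated record rarely preserves trajectories at anything like this resolution.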

Another confounding issue has to do with presuppositions about contemporaneity and time–space resolution. Paleolithic artifact assemblages are widely treated analytically as if they constitute the debris from one or a few episodes of short-term occupation of a locality by a single hunter-gatherer band. Often delimited by space (e.g., in a cave or rock-shelter), these concentrations of debris are not infrequently described as “living floors” (Bailey, 2007; Isaac, 1977; Leakey, 1971). This is almost never the case for the caves and rock-shelters that produce the majority of European Upper Paleolithic assemblages and is even uncommon for open-air sites (Barton & Clark, 1993; Dibble et al., 1997, 2017; Perreault, 2019; Wandsnider, 1992). Of course, “Paleolithic Pompeiis” do exist, but they are rare (e.g., Cahen & Keeley, 1980; Clark et al., 1987; Marks, 1983). Ethnographic studies of recent foragers show that they leave few preservable traces during any single episode of use or occupation of a place (Binford, 1977, 1980). Prior to multi-year sedentism, it is only when a place has been reoccupied many times by mobile hunter-gatherers that the densities of discarded and preserved material culture rise to the level of archaeological visibility. Caves and rock-shelters are particularly attractive for Paleolithic archaeologists because they constrain the spatial and temporal dispersal of trash from such multiple occupations and further densify it with periodic slow deposition rates, making it archaeologically visible (Barton & Clark, 1993; Straus, 1990). In other words, the vast majority of assemblages are time-averaged palimpsests of many, often ephemeral occupations (Clark et al., 2019; Domínguez-Rodrigo, 2009; Perreault, 2019; Wandsnider, 1992). While a single social group may return regularly to a favored locale, radiometric dates and microstratigraphic analyses show that the accumulation of an assemblage from a single level, layer, or stratum typically spans generations or centuries, no matter how meticulously excavated (Jelinek, 2013).
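The statistical effect of time-averaging can be shown with a toy simulation. This is a hedged illustration only, not the formation model of any study cited here: the number of artifacts per visit, the range of per-visit retouch probabilities, and all function names are hypothetical assumptions chosen to keep the example small.

```python
import random
import statistics


def simulate_occupation(rng, artifacts_per_visit=20):
    """Return (retouched, total) counts discarded in one short occupation.

    Assumption for illustration: each visit has its own probability that a
    discarded piece is retouched, drawn uniformly between 0.1 and 0.5.
    """
    p_retouched = rng.uniform(0.1, 0.5)
    retouched = sum(rng.random() < p_retouched for _ in range(artifacts_per_visit))
    return retouched, artifacts_per_visit


def pooled_retouch_fraction(rng, n_occupations):
    """Pool several occupations into one excavated 'assemblage'."""
    retouched = 0
    total = 0
    for _ in range(n_occupations):
        r, t = simulate_occupation(rng)
        retouched += r
        total += t
    return retouched / total


def spread_across_assemblages(n_occupations, n_assemblages=200, seed=1):
    """Standard deviation of the retouch fraction across simulated assemblages."""
    rng = random.Random(seed)
    fractions = [
        pooled_retouch_fraction(rng, n_occupations) for _ in range(n_assemblages)
    ]
    return statistics.stdev(fractions)


if __name__ == "__main__":
    # Compare assemblages built from 1, 10, and 100 pooled occupations.
    for k in (1, 10, 100):
        print(k, round(spread_across_assemblages(k), 3))
```

As more occupations are pooled, the simulated assemblages converge on the long-term average and the signature of any single visit is progressively erased. An excavated level that spans generations is therefore better read as a summary of long-term conditions than as a snapshot of one group's behavior.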

The cave and shelter sites that dominate the European Paleolithic record are better thought of as “artifact traps,” analogous to natural faunal traps in karstic terrain (e.g., Martin & Gilbert, 1978). As such, they capture and render archaeologically visible a sample of the material debris discarded by mobile hunter-gatherer groups passing across the landscape. It is highly unlikely that only the discards of a single foraging society would be captured in this sample, and we would be well advised to assume the contrary in the absence of convincing evidence. Hence, we should question how well any assemblage can serve as a proxy for an ancient society of interacting individuals or any anthropological taxon characterized by a shared body of cultural knowledge (Perreault, 2019). Indeed, computational modeling of lithic assemblage formation processes demonstrates that assemblage-level patterning is most apparent when assemblages are time-averaged mixtures of many occupations, and that such patterning has nothing to do with the transmission of long-term cultural knowledge or the kinds of societies within which it is embedded (Barton & Riel-Salvatore, 2014).

This is not to claim that tracing Paleolithic cultural phylogenies is a priori impossible. However, to do so will require well-theorized models of cultural transmission and social learning (e.g., Boyd & Richerson, 1982, 1988; Grove, 2016; Perreault & Brantingham, 2011; Tostevin, 2019), detailed analyses of artifact morphologies and their causes, and high-resolution data on their space–time distributions (Perreault, 2019; Riede et al., 2019). The data currently available from the many thousands of Paleolithic sites collected over the past two centuries will never serve this goal (though they remain potentially useful for other objectives), given the ways in which artifact and assemblage classes were created, interpreted, and applied (Barton & Neeley, 1996; Barton et al., 2018; Binford & Sabloff, 1982; Clark, 1993; Clark & Barton, 2017; Dibble et al., 2017; Holdaway & Douglass, 2011).

Finally, even if a well-designed program to identify lithic markers of Paleolithic social identity could be carried out, we question the value of using this information to classify assemblages into a cultural taxonomy analogous to a phylogeny of biological species. The study and explanation of social and cultural evolution, while not the only goal of archaeology, is certainly an important one. Nevertheless, classification of assemblages into ancient cultures, however done, makes it difficult to realize such a goal and presents a similar issue for evolutionary biology (Lyman & O’Brien, 1997; O’Brien & Holland, 1990). When assemblages are classified into variety-minimizing essentialist cultural types, information is lost and variation within them becomes “statistical noise.” While we can measure the differences between the types, through time or across space, essentialist classes make it more difficult to explain why those differences exist and how one type evolves into another.

Given the practical and conceptual difficulties in creating robust and convincing cultural taxonomies for the Paleolithic, and the fact that doing so would make it more difficult to measure and explain social and cultural change, the question arises as to why we should want to do so. John Shea argues that Paleolithic archaeology would be better off without lithic industry types, whether organized into cultural taxonomies or not (Shea, 2014, 2019). Perhaps he is right, but it is also worth noting that these “NASTIES,” for all their defects, nevertheless have a vague and fuzzy utility as a lingua franca for the discipline. We further acknowledge that classification itself is useful and even essential across all scientific research as a tool to measure continuous and complex variation. It is also an inherent aspect of human cognition (and that of other animals), an adaptive means to coarse-grain a complex world in order to better understand, respond to, and predict it.

Humans also instinctively classify each other, using language, skin color, hairstyle, dress, and geographical origins as shorthands for compartmentalizing people they do not know personally, perhaps with an evolutionary basis in promoting hypersocial cooperation that is unique to our species (Bowles, 2009; Bowles & Gintis, 2004, 2013; Hrdy, 2007, 2016). These innate classifications of our fellow human beings are grounded in essentialist views of human diversity rather than treated as arbitrary yardsticks designed to measure largely continuous variation. Like the original Linnaean taxonomic system, they are often organized into hierarchies of superiority, leading to bigotry, xenophobia, racism, misogyny, genocide, and a host of other social pathologies. Scientists reject such essentialist taxonomies of humanity today, of course, arguing that we can better understand humans and our biosocial evolution by studying multiple dimensions of variation and diversity rather than by employing the cultural classifications that were pervasive until the mid-twentieth century. But when we look to prehistory, we find a tendency to apply similarly essentialist classifications to past people on the basis of artifact typologies, superficial skeletal characteristics, or paleogenetic sequencing that bear little or no demonstrable relationship to human cognitive capacities or social behavior. Archaeologists would abhor using such methods to judge the abilities of modern people but show no reluctance to apply them to ancient ones.

By putting variable assemblages into essentialist boxes, we forgo the ability to study the social and evolutionary processes that generate the patterns we observe in the archaeological record. We also mislead other sciences and the general public about the nature of archaeological research. Modern archaeology seeks to understand process and change, but others see us as seeking to discover past “cultures” and sorting them into schemata of “primitive” and “advanced” or “better” and “worse.” Most archaeologists acknowledge that assemblage classes are not cultures in the anthropological (Watson, 1995) or sociological (Parsons, 1937) senses of the term; we even call them “archaeological cultures,” but this fine distinction is lost on all but archaeologists. And it is clear from the literature that we also end up misleading ourselves about our most important disciplinary objectives by focusing on the discovery of who and when, rather than the science of how and why.

Conclusions

While we are skeptical about the practicality or value of creating better biology-like taxonomies of Paleolithic cultures from classifications of assemblages of lithic artifacts, we strongly affirm the importance of understanding the dynamics of ancient societies and cultures. If we agree with Shea that Paleolithic archaeology may be better off without “NASTIES,” we do so because we believe that human culture indeed should be a leading topic of research for Paleolithic archaeology. The evolution of cumulative culture is a fundamental aspect of how we became human (Boyd & Richerson, 2009; Boyd et al., 2011), as are the changing dynamics of in-group and between-group social interaction, the evolution and diversification of technology, the intensification of niche construction in human ecology, the evolution of non-kin cooperation, and other fascinating characteristics that have made us a unique species (Boyd, 2017; Foley, 1987; Hill et al., 2009). To carry out innovative science on these topics, we can deploy an extensive suite of multivariate analytical methods like machine learning to identify patterns of variability (Burke et al., 2018; Klassen et al., 2018; Weiss et al., 2017), network science to study social interactions (Hill et al., 2015; Mills et al., 2013), geospatial technologies and new dating methods to reveal space–time distributions (Fernández-López de Pablo et al., 2019; Shennan et al., 2013), and computational modeling to operationalize and test hypotheses about the dynamics of Paleolithic systems (Barton et al., 2011; Powell et al., 2009; Premo & Hublin, 2009). These can be united under a robust theory of extended evolution that encompasses human behavioral ecology and cultural transmission.

We also echo and amplify recent efforts to make archaeological data transparent and accessible to all (Marwick et al., 2017; Reynolds & Riede, 2019). We study our joint human heritage; the data and knowledge we generate should not be the personal property of an individual or institution. These data need to be archived into standardized, digitally readable formats, with sufficient metadata to make them useable. While it would be ideal if all such data used common ontologies (i.e., terms and their referents), accessibility and metadata can go a long way toward making legacy collections useful for the large-scale comparative analyses essential for understanding cultural transmission, social interaction, human ecology, adaptation, niche construction, technological change, and biocultural evolution (Barton et al., 2018; Burke et al., 2018; Clark et al., 2019; Dibble, 1995; Marwick, 2017; McManamon et al., 2017; Neeley & Barton, 1994; Reynolds & Riede, 2019; Stephens et al., 2019).

Harkening back to the intellectually exciting workshop that generated the papers in this volume, Paleolithic Europe was indeed inhabited by a multitude of “fantastic cultures.” We look forward to a new generation of archaeologists applying innovative methods, robust theory, and responsible science to create a much-improved understanding of the human past.