Craft specialization and standardization are topics of continuing interest in archaeology. Since the work of Gordon Childe (1930, 1936, 1958; see also Trigger 1986), craft specialization has been recognized as an extremely important reflection of, and motor for, wider social and political change. A widely accepted assumption is that, wherever we can document full-time specialization in one technological sector, equivalent levels of specialization are likely to exist in other crafts and activity spheres: in the simplest form of this statement, Childe noted that full-time itinerant metallurgists required others to produce their food. Even though Childe’s perspectives have been greatly revised and refined in recent years, it remains true that his focus on the organization of production is not only insightful from a technological perspective but also informative about broader social structures (e.g., Rice 1981; Brumfiel and Earle 1987; Clark and Parry 1990; Stark 1995; Wailes 1996). Another common assumption is that craft specialization and standardization cannot exist without one another, although ethnographic and archaeological studies have shown that this is not necessarily the case (e.g., Hagstrum 1985; Arnold 1991; Blackman et al. 1993; Roux 2003; Costin and Hagstrum 1995; Rice 1989, 1991; Stark 1991; Kvamme et al. 1996; Longacre 1999; Underhill 2003; Humphris et al. 2009). Twenty years on, Costin’s (1991) review paper on the definition, identification, and explanation of craft organization remains a foundational contribution to this field. Since the publication of this paper, numerous researchers have been prompted to assess the context, concentration, scale, and intensity of different crafts as they are documented in the archaeological record and to pursue such research in an explicitly comparative way.

The study of crafts, however, need not be constrained by a narrow obsession with the degree to which a given example is specialized and/or standardized. Even in situations where these parameters can be established fairly easily, other important questions may remain, such as the extent of cross-craft interaction (e.g., Shimada 2007), modes of transmission of technological knowledge (e.g., Eerkens and Lipo 2005), or further issues related to labor organization and task allocation within large-scale productive enterprises (see below), to name but a few. When many objects of the same kind are documented archaeologically, potentially at the same site, questions arise as to whether they are the products of one or more workshops and as to how these might have related to each other spatially and economically (cf. the distinction between “site specialization” and “producer specialization” in Muller 1984). Even when it is possible to establish a single manufacturing center that is producing on a large scale—or indeed a single, large commission such as the bricks for a church—pertinent archaeological questions remain as to how labor was organized internally.

Researchers of the history of technological change in mainland China have proposed different models to categorize the way that production was organized, particularly with regard to the manufacture of Shang bronze vessels and other Bronze Age artifacts. One of the most influential models is that of Franklin (1983a, b, 1992), who distinguishes between holistic and prescriptive production systems. A holistic process is envisaged as a single, linear progression toward the manufactured object, where the same craftsperson or production unit is in charge of all the manufacturing procedures. In contrast, for a prescriptive process, production is divided into segments and each production stage is carried out by highly specialized individuals who are not necessarily acquainted with the entire production process. For example, based on excavated workshop evidence, Sun (2007) has recently argued for a holistic approach to the production of stone jue earrings in the Western Zhou period, given that debris from all production stages was mixed and no specialized activity areas could be identified. Li (2007) has refined Franklin’s model, adding further diversity to it and, crucially, defining clearer archaeological criteria to discriminate between different systems in bronze production workshops. His work also demonstrates the usefulness of adapting some well-defined concepts used in the car manufacturing industry, such as “flow line production” or “cellular production,” as hypotheses to be tested against archaeological data (see below). In the present paper, we will use the modern terms “flow line” and “cellular” production while bearing in mind that as descriptive terms, they could respectively equate to Franklin’s “prescriptive” and “holistic” models. 
Those modern terms are much more common in the literature, and their explicit use will facilitate critical comparisons between past and present manifestations while not engaging a priori with Franklin’s further interpretation of the social and political implications of either system.

Our ongoing collaborative project studying aspects of the Terracotta Army is specifically concerned with the above topics. The site presents great research opportunities for the study of how production is organized in complex societies and imperial systems, not least because it offers a very large, highly intentional, well-contextualized, narrowly dated, and, in archaeological terms, largely closed dataset. The main drawbacks with such a context are the fact that it provides no direct evidence for production areas in the form of manufacturing debris and that it demands a set of carefully designed sampling strategies. The special nature of this archaeological site has required us to tailor a methodological approach specific to our case study—one that involves adaptive sampling, extensive metric studies, non-destructive chemical analyses, and spatial statistics. However, such a joined-up approach has been successful in allowing us to better characterize the organization of labor behind this unique archaeological deposit. Here, we present our method as it developed, outlining its results, implications, and limitations. Notwithstanding the peculiar nature of the site, we propose that such a methodological approach has broad relevance to the study of craft activities in other contexts.

An innovative dimension of the study considered below will be the way we seek to use elemental analyses of metals to offer broader information about the organization of production at a single site. Chemical analyses are often used to determine the geological and, where possible, the archaeological provenance of artifacts, thereby contributing to the study of production and consumption on a broader scale. Particularly in ceramic studies, some researchers have tried to correlate chemical standardization in paste composition with production intensity and scale, but the approach is problematic unless applied at a very small geographic scale, with good control of archaeological contexts and with a very clear understanding of underlying geological variability (Arnold 2000; Arnold et al. 2000). Standardization in the composition of metallurgical slag has also been employed as a proxy to examine variability in technological practices (Humphris et al. 2009; Pryce et al. 2010). When differentiating between the kinds of “intentional” and “mechanical” attributes that may be used to assess standardization, Costin (2001, p. 302; see also Costin and Hagstrum 1995) classifies chemical composition as an intentional feature in the sense that, notwithstanding environmental and functional constraints, craftspeople are often able to choose their raw materials and the relative proportions in which they are combined. Mechanical attributes are, in contrast, those that are more affected by unconscious habits such as technical routines. As an example of intentional standardization in the composition of metal artifacts, one can cite the study of implements from the Royal Tombs of Sicán in Peru, where there is a close correspondence between alloy compositions and artifact typologies that can be explained with reference to functional and aesthetic concerns (Gordus and Shimada 1995).
Here, however, we employ aspects of the chemical data as a guide to understand mechanical attributes that were not intentionally modified by the artisans. We shall propose that these reflect fundamental aspects of the organization of labor. We will also be elaborating on the concept of “the batch” proposed by Freestone et al. (2009a, b, 2010) as an analytical unit that offers great potential for the reconstruction of subtle aspects of technological organization in archaeological contexts.

The Terracotta Army: Warriors, Weapons, and Arrows

The Terracotta Army of Xi’an is arguably the most famous component of the much larger mausoleum built for Qin Shihuangdi, the First Emperor of China (259–210 BC). The construction of the mausoleum is generally thought to have been commissioned as soon as the young Shihuangdi ascended to the throne as the king of Qin in 246 BC, although it is clear that work intensified after the imperial unification in 221 BC and was largely completed by the time of the emperor’s death in 210 BC (Rawson 2007, p. 131). In less than 40 years, therefore, a colossal funerary space was brought into being, covering some 56 km² and encasing a funerary pyramid, various pits with life-sized servants, acrobats and musicians, water channels with delicate bronze birds, bronze carriages fitted with gold and silver implements and lavishly decorated with polychrome pigments, as well as, surely, many more finds as yet undiscovered (Fig. 1). The terracotta warriors are some of the most famous features in this funerary assemblage. They are distributed in three pits at the eastern end of the complex and are thought to be there to protect the emperor in his afterlife. The largest of these pits, and the focus of our study, is Pit 1. The excavation of the front, easternmost section of this pit has so far recovered over a thousand ceramic warriors in battle formation, with eight chariots pulled by horses interspersed. Based on their spatial density and the size of the pit, it is estimated that around 7,000 warriors may be present in total.

Fig. 1
Site plan of the First Emperor’s Mausoleum showing the location of the emperor’s tomb towards the center, the Terracotta Army to the east, and other elements of the complex

The main interest of our project is in the logistics of technology, standardization, and labor organization behind the construction of the mausoleum in general and the Terracotta Army in particular. Considering the scale of the site, it is hardly an overstatement to claim that thousands of workers were engaged in this monumental enterprise. Writing in the first century BC, historian Sima Qian mentions 700,000 workers (Yang and Yang 1979). The construction of the main burial chamber, for example, involved digging down to a depth of 30–40 m, providing further access ramps, diverting water courses, and arranging a huge number of burial goods before covering all of this with a pyramid of over 80 m in height (Yuan 1990; Rawson 2007, p. 132). This is in addition to the manufacture and transport required for the many implements placed in the tomb and also does not consider the fact that the emperor’s tomb is but one part of a much larger mausoleum (for an overview, see Ledderose 2000, pp. 52–57; Wu 2007). Creating Pit 1 alone required the removal of over 70,000 m³ of earth (Nickel 2007, p. 161).

Following Costin’s (1991) model, it would seem relatively safe to propose that the workforce constructing the Terracotta Army took the form of a “retainer workshop,” i.e., a “large-scale operation with full-time artisans working for an elite patron or government institution within a segregated, highly specialized setting or facility” (Costin 1991, p. 9; see also Clark 1995). However, as noted above, further relevant questions may be addressed. Based on the inscriptions carved on ceramic fragments that carry the names and places of origin of the conscripts involved in the project, we already know that the workforce was drawn from all over the empire and that it included criminals recruited as forced labor (Rawson 2007, pp. 132–33; see also Barbieri-Low 2007, pp. 202–256 on conscript labor). It is even possible that these were killed after completion of the work since many are buried in a cemetery near the emperor’s burial chamber. With regard to the overall organization of the work, while it is obvious that some form of blueprint must have existed, many of the pits are not arranged in an orderly or symmetrical way, suggesting that the various buildings may have been constructed in succession, as part of several stages (Rawson 2007, p. 132). Building materials such as roof tiles bear seals noting the workshops that made them, and these indicate that several production sites or units were involved (Rawson 2007, p. 133), even if their chronological or spatial relationships are not as yet clear.

With regard to the Terracotta Army in particular, previous work has suggested that various foremen were in charge of the production of individual figures. The seals or signatures of at least 87 foremen have been identified on the warriors’ backs, indicating a form of personal accountability for the quality of each individual warrior, although each of these foremen probably supervised a larger number of subordinates (Ledderose 2000, pp. 50–73; Nickel 2007, p. 179). At the same time, technical observations on the figures have revealed the use of prefabricated modules such as legs, torsos, hands, or heads, made from a relatively small number of moulds, that would be assembled together before adding individualizing features and firing them in large kilns. For example, it appears that only eight head moulds were employed, even if facial features such as eyebrows, beards or hairstyles were finished individually to conceal the evidence of mass production (Museum of Qin Shihuang’s Terracotta Army and Shaanxi Institute of Archaeology 1988, pp. 144–150; Ledderose 2000, pp. 68–70; Nickel 2007, p. 170). This could imply that, even if a number of separate production units were ultimately responsible for each finished warrior, parts may have been supplied by a more centralized production chain (but see discussion below).

So far, our research has concentrated on the bronze weapons carried by the warriors, in advance of any consideration of other elements such as the warriors themselves. However, as discussed below, we believe that the study of the weapons may serve as an indirect proxy to understand the way production was organized for the whole army and, potentially, the wider mausoleum complex. The weapon assemblage includes hundreds of crossbow triggers, swords, lances, spears, halberds, hooks, ceremonial weapons, and the ferrules that were fixed at the end of wooden hafts, in addition to over 40,000 arrowheads. Lances and halberds bear long sentence inscriptions chiselled on their surfaces, while the swords, triggers, hooks, and ferrules were only partially marked with numbers, a note of the workshop, and/or other symbols. Shorter inscriptions including numbers and symbols probably denote some form of quality control—for example, matching symbols often appear in the various parts of a given trigger, and they were clearly added after the filing that ensured an accurate fit. The long inscriptions indicate the regnal year when the weapons were produced, the name of the person in charge of production, the official or workshop, and the name of the specific worker who did the work (Museum of Qin Shihuang’s Terracotta Army and Shaanxi Institute of Archaeology 1988; Yuan 1984; Li et al. 2011; see Li 2012 for information updated from recent finds). It is quite possible that further inscriptions were painted or written on the weapons’ surfaces, but these have not survived.

This paper will largely concentrate on the arrowheads as they constitute the largest typological group and one where chemical analyses were particularly crucial for identifying structure in the dataset. These arrows were meant to be used with crossbows and, hence, could be more narrowly described as bolts or quarrels, but we retain the more generic term here in step with past usage. The arrowheads, as preserved, are composed of two main parts that were cast separately and subsequently joined together: the arrowhead itself and a tang, which together weigh approximately 15 g. The head is a solid, triangular pyramidal tip, averaging 2.7 cm in length and 1 cm in width. It has a cylindrical socket on the back, where the tang was inserted. The tangs are straight rods of metal, a few millimetres in diameter and showing variable lengths, typically ranging between 7 and 15 cm (see below). Some tangs display clear cut marks at the distal end, indicating that they would have been cut from longer rods before being attached to the heads; others appear to have been cast individually. Even though it has only been possible to analyze one of the joints invasively, there is no indication of casting-on or soldering between tangs and heads. Rather, it appears that the tangs were simply inserted mechanically into the heads’ sockets (cf. Yuan et al. 1981; Museum of Qin Shihuang’s Terracotta Army and Shaanxi Institute of Archaeology 1998). In some cases, thin strips of metal are visible wrapping the tang at the point where it enters the socket. These features, which we term “necks,” are thought to facilitate a tighter grip between the two parts. Based on better preserved examples, it appears that the tang was wrapped in a linen cord before being inserted in a longer bamboo or wooden shaft, resulting in an overall arrow length of about 60 cm. Feathers would then be attached to the distal end of the arrows (Fig. 2). 
One of the bronze chariots recovered from the mausoleum includes miniature replicas that illustrate what a whole arrow might have looked like (Museum of Qin Shihuang’s Terracotta Army and Shaanxi Institute of Archaeology 1998).

Fig. 2
a Schematic drawing of a bronze arrowhead showing the head proper, the tang, and the “neck” in between. b Detail of a few arrowheads showing different details such as the neck, linen string, and a bamboo shaft around the tang

The vast majority of the arrowheads were recovered in clusters or “bundles,” thought to have been preserved in this fashion after the decay of the woven hemp quiver that contained them. More precisely, 37,348 arrows come from 680 findspots in the easternmost five trenches of Pit 1, the area of complete excavation on which we have so far focused. These arrows were mainly distributed in the vanguard of the Terracotta Army as well as the side corridors that contained the left and right flanks (Museum of Qin Shihuang’s Terracotta Army and Shaanxi Institute of Archaeology 1988; Li 2012). This distribution closely matches that of the crossbow triggers, reflecting the organization of the battle formation (see Yates 2007; Lewis 2007, pp. 30–50 on military organization; Fig. 3). The rest of the arrows come from ongoing excavations elsewhere in Pit 1 (about 5,300 arrows); trial trenches in Pit 2 (about 1,200); and further archaeological survey, excavation, or chance finds at or around the tomb complex. A frequency distribution (Fig. 4) shows a distinct mode for bundles containing 100 arrows, indicating that this was the standard contents of a quiver. There is another mode around 1, probably explained by the presence of loose arrows that could not be easily ascribed to a specific bundle by modern excavators, although it is also possible that the warriors were carrying a single arrow in their hand or already loaded. The presence of some bundles smaller and larger than 100 is probably due to the loss of arrows and mix-up of bundles as a result of post-depositional processes and during subsequent excavation. Given the extremely high density of arrowheads on the site, it is not surprising that archaeologists could not conclusively associate all the arrows with specific bundles.
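The binning behind a frequency distribution of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bundle sizes below are invented for demonstration and are not the excavation data, the function names `bundle_size_histogram` and `modal_bins` are hypothetical, and the bin convention (1–10, 11–20, and so on for a width of 10) is our assumption rather than a documented choice.

```python
from collections import Counter

def bundle_size_histogram(sizes, bin_width=1):
    """Count bundles per size bin; each bin is labelled by its lower edge."""
    if bin_width < 1:
        raise ValueError("bin_width must be a positive integer")
    # Sizes start at 1, so bins run 1-10, 11-20, ... when bin_width is 10.
    return Counter(((s - 1) // bin_width) * bin_width + 1 for s in sizes)

def modal_bins(histogram, top=2):
    """Return the lower edges of the `top` most frequent bins."""
    return [edge for edge, _ in histogram.most_common(top)]

# Hypothetical bundle sizes, invented for illustration only.
example_sizes = [100, 100, 100, 100, 99, 101, 98, 1, 1, 1, 2, 57, 143]

fine = bundle_size_histogram(example_sizes, bin_width=1)
coarse = bundle_size_histogram(example_sizes, bin_width=10)
print(modal_bins(fine))  # the two most frequent exact sizes
print(coarse[91])        # number of bundles holding 91-100 arrows
```

With a bin width of 1 the two modes (full quivers and loose arrows) stand out directly, whereas the coarser bins give the overall shape of the distribution.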

Fig. 3
Plan of the excavated area of Pit 1 showing the distribution of different warrior types, crossbow triggers, and arrowheads (the latter, with one dot per findspot, regardless of bundle size). The arrow bundles subjected to chemical analyses by pXRF are marked with a different icon

Fig. 4
Frequency distribution histograms showing the number of arrows per bundle: all the bundles, with bin = 10 (a); detail of the 1–10 segment, with bin = 1 (b); detail of the 70–130 segment, with bin = 1 (c)

Making a Warrior: Hypotheses About Production Organization

Equipping a crossbowman (all of the relevant terracotta warriors appear to be male) with his gear would require bronze arrowheads, tangs and triggers, bamboo and feathers to complete the arrows, wood for the crossbow frame, linen string, leather, and hemp for the quiver. This is in addition to the sophisticated ceramic engineering required for the making of the warriors themselves, as well as the lacquer and the numerous natural and artificial pigments that would be added to complete these colorful figures (see contributions in Blänsdorf et al. 1999; Wu et al. 2001). A variety of materials and expertise would therefore have to be combined harmoniously for the production of each warrior before it was placed in the pit or during a final finishing episode within this space.

Considering the large array of materials and skills required, as well as the large output and the limited time available, our initial hypothesis was that the manufacture of the arrows and other mass-produced items in the mausoleum would have been organized according to the logic of a “flow line” production system (Dioguardi 2009, pp. 51–59; Groover 2010, pp. 18–19). We expected that different specialized workshops or production units would be manufacturing, respectively, arrowheads, tangs, shafts, and so on, more or less continuously, before the different parts reached an assembly unit where they would be fitted together. Possibly after some time in storage, they would then be bundled in groups of 100 before being placed in quivers (presumably produced in a different production unit). The finished quivers would then join the multicomponent crossbows (again, presumably produced along similar lines) and the warriors as they were placed in the pit. In fact, during our study of grinding and polishing marks on the bronze weapons, we found such remarkable similarities that we were forced to consider the possibility that all the weapons had been sharpened in the same specialized workshop (Li et al. 2011, p. 500). Implicitly, we were accepting a presumed model of predetermined, highly specialized production lines or units mass-producing individual components that would be assembled at a later stage. The high degree of formal standardization documented within the different weapon types (and discussed below) appeared to support the idea of a production line where all the technical procedures and engineering parameters were repeated, in each case, by a relatively small number of specialists.

Upon closer scrutiny, however, and briefly to anticipate some of our conclusions, the flow line hypothesis can be falsified and the possibility of a “cellular production” model (Dohse et al. 1985; Productivity Development Team 1999; Dioguardi 2009, pp. 51–69) gains strength. We shall outline the analytical methods and results in the section that follows before returning to consider this alternative way of organizing production in the final section.

Chemical Analysis: In Search of Structure

We gave analytical priority to the 278 bundles containing 90 arrows or more. All of these were examined macroscopically and photographed. The initial impression from handling the bundles was that, notwithstanding many exceptions, there was a certain degree of internal coherence for most of them, most noticeably in the length, thickness, hardness, and straightness of the tangs; the arrangement of the linen around them; the presence or absence of clear casting seams; and even the state of preservation (Fig. 5). Thus, six arrows per bundle were randomly selected for detailed measurement and photographs (Li 2012). Eighteen bundles were also selected from across the excavated area for chemical analysis. The selection of bundles was designed to include samples from across the entire excavated extent of Pit 1 while also including some bundles that had been found in close proximity to each other (Fig. 3). From each of these selected sample bundles, 5, 10, or 20 arrows were randomly chosen and their heads and tangs analyzed separately.
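The first step of this sampling design, drawing six arrows at random from every bundle of 90 or more, can be sketched as follows. This is a minimal illustration and not the procedure actually used in the field: the bundle and arrow identifiers are invented, the function name `select_arrows` is hypothetical, and the fixed random seed is an assumption added so that the draw can be repeated.

```python
import random

def select_arrows(bundles, min_size=90, per_bundle=6, seed=0):
    """Pick `per_bundle` random arrows from every bundle of at least `min_size`.

    `bundles` maps a bundle identifier to its list of arrow identifiers.
    """
    rng = random.Random(seed)  # fixed seed so the same draw can be reproduced
    selection = {}
    for bundle_id, arrows in bundles.items():
        if len(arrows) >= min_size:
            # Sampling without replacement: no arrow is measured twice.
            selection[bundle_id] = rng.sample(arrows, per_bundle)
    return selection

# Hypothetical inventory; identifiers are invented for illustration only.
inventory = {
    "B001": [f"B001-{i}" for i in range(1, 101)],  # 100 arrows
    "B002": [f"B002-{i}" for i in range(1, 46)],   # 45 arrows, below threshold
    "B003": [f"B003-{i}" for i in range(1, 94)],   # 93 arrows
}

chosen = select_arrows(inventory)
for bundle_id, arrows in chosen.items():
    print(bundle_id, arrows)
```

The same function could be reused for the chemical subsample by changing `per_bundle` to 5, 10, or 20, although the choice of which bundles to analyze was spatial rather than random.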

Fig. 5
Examples of arrowheads from three different bundles showing formal differences between bundles and the relative internal consistency in the appearance of the tangs from each bundle

All the chemical analyses discussed here were carried out using a portable X-ray fluorescence spectrometer (pXRF) from Innov-X Systems (now Olympus), model Alpha, equipped with a silver tube and a SiPIN detector with a resolution of approx. 180 eV full width at half-maximum for 5.9 keV X-rays (at 4,000 counts per second on a stainless steel AISI 316 sample) in an area of 6 mm². All analyses were conducted at 40 kV, 30.5 μA, using a 2 mm aluminium filter in the X-ray path for a 25 s live-time count. The vast majority of the items were analyzed three times and the averages calculated. However, after the main analytical effort, a number of arrow bundles were analyzed only once for screening purposes in order to expand the sample and check that the patterns observed generally held up.

Portable XRF offers the potential of analyzing large numbers of artifacts relatively quickly, inexpensively, and without having to move them to a laboratory. Unlike ceramics, glass, or other materials, all the major elements present in pre-modern copper alloys have relatively high atomic numbers; thus, in principle, they can be accurately quantified by pXRF even if the analyses are not carried out in vacuum. However, the surfaces of archaeological metal artifacts are often corroded, uneven, or contaminated by soil deposits. It is also possible that artifact surfaces were intentionally gilded or decorated in the past. All of this means that the composition as recorded on the surface may differ from that of the bulk of the object in a way that cannot be easily predicted or modeled. Regardless of the analytical method employed, unless the corroded surfaces are abraded prior to analyses, this sampling uncertainty means the results cannot be considered as fully quantitative measures of overall composition. The bronze weapons from the Terracotta Army are generally undecorated and very well preserved, but as demonstrated below, the fact that unprepared surfaces had to be analyzed means that our results are not totally free of the above problems. As such, the data were explored in search of general trends rather than focusing on the absolute weight percentages reported by the instrument. Despite this limitation, the patterns observed seem consistent enough to prompt some important conclusions.

An overview of the chemical data for the arrows shows a clear trend in that the heads proper tend to exhibit a higher tin content than the tangs (Fig. 6). With very few exceptions, this general pattern also holds at the level of the individual arrow, where the head analysis reported higher tin levels than the corresponding tang. This result implies a careful selection and optimization of alloys: high-tin bronzes are very hard and can be polished to a sharp finish, increasing the penetration power of the arrow, but at the expense of a higher brittleness. Conversely, the tangs were made of a lower tin bronze that would ensure a higher toughness, reducing the risk of fracture when inserted in the bamboo shaft and perhaps allowing for a certain degree of flexibility for its oscillation during the arrow’s flight.
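A paired comparison of this kind, head against tang for each individual arrow, can be expressed compactly. The sketch below is illustrative only: the tin values are invented and do not reproduce any measurement from the site, and the function names `fraction_head_richer` and `mean_paired_difference` are hypothetical.

```python
def fraction_head_richer(pairs):
    """Share of arrows whose head reading exceeds the tang reading.

    `pairs` is a list of (head_tin, tang_tin) surface readings in wt%.
    """
    if not pairs:
        raise ValueError("no arrows supplied")
    richer = sum(1 for head, tang in pairs if head > tang)
    return richer / len(pairs)

def mean_paired_difference(pairs):
    """Average head-minus-tang difference in tin, in wt%."""
    if not pairs:
        raise ValueError("no arrows supplied")
    return sum(head - tang for head, tang in pairs) / len(pairs)

# Hypothetical readings, invented for illustration only.
example_pairs = [(20.0, 4.0), (18.0, 3.0), (22.0, 5.0), (6.0, 7.0)]

print(fraction_head_richer(example_pairs))    # 0.75
print(mean_paired_difference(example_pairs))  # 11.75
```

Working with pairs rather than pooled histograms keeps each head matched to its own tang, which is what allows exceptions to be identified arrow by arrow.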

Fig. 6
Frequency distribution histogram comparing the tin levels in tangs and heads of all the arrows analyzed

The broad correspondence between arrow parts and tin levels is what Rice (1989, p. 110) would call “resource specialization,” and using Costin’s terminology (Costin and Hagstrum 1995; Costin 2001), it implies “intentional standardization.” In other words, it is likely that weapon makers consciously chose to add more tin to the melting crucibles when they were going to cast arrowheads to ensure optimum performance characteristics (sensu Schiffer and Skibo 1987). The compromises associated with this technological choice include a likely higher cost for the tin employed for the arrowheads as well as the extra organizational effort required to create specific alloys depending on the weapon or weapon part being cast. It should be noted here that copper and tin (and probably lead) are likely to have entered the workshops as nominally pure metals that the weapon makers would mix in the preferred proportions to form the alloys that we record here. Conversely, minor and trace elements (typically below reliable quantification limits for the pXRF and not reported here) may be related to the geochemistry of the ores exploited and the impurities present in them, as well as to the different smelting and refining procedures. These would not be noticeable to metal makers and users and are therefore informative of metal provenance rather than of conscious technological choices.

Of more interest for our purposes here, however, is the degree of internal chemical coherence of the major elements within each bundle, which—using the same terminology—we deem a mechanical rather than an intentional attribute. When the lead and tin contents of the arrowheads are plotted as a scatterplot, each bundle forms a relatively tight cluster that is slightly different from the next (Fig. 7). When many bundles are plotted together, this pattern is obscured by the compositional overlaps among bundles—after all, they all are tin bronzes with variable quantities of lead. Even so, it remains true that the arrowheads within a given bundle tend to show similar compositions. The same pattern applies to the tangs (Fig. 8). Metal impurities such as antimony and arsenic lend further support to this picture as their patterns of presence or absence at detectable levels are generally consistent within bundles (results not shown). In order to confirm this impression, and continuing to bear in mind the limitations of noninvasive procedures and pXRF sampling uncertainties, 20 arrows from a single bundle were analyzed, differentiating between better preserved arrows and those with more substantial patination. The best preserved examples show a much closer chemical clustering, whereas the more corroded ones scatter more widely, typically showing higher lead and tin levels (Fig. 9). This strongly suggests that the degree of chemical similarity between the arrows in a bundle is even higher than detectable by surface pXRF. Thus, although some outliers are noted in the chemical groups, this may be due to the sampling uncertainty resulting from the analysis of unprepared surfaces or mix-ups among bundles during archaeological recovery.
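One simple way to put a number on this internal coherence is to compare the scatter of readings around their own bundle centroid with the scatter around the centroid of all arrows together. The sketch below is a hedged illustration and not the statistical treatment used in this study: the readings are invented, the function name `within_and_total_spread` is hypothetical, and the use of raw squared distances in tin and lead weight percent assumes that the two elements vary on comparable scales.

```python
from statistics import mean

def within_and_total_spread(bundles):
    """Compare scatter inside bundles with scatter across all arrows.

    `bundles` maps a bundle identifier to a list of (tin, lead) readings.
    Returns (within, total): the mean squared distance of each reading from
    its own bundle centroid and from the overall centroid, respectively.
    """
    def centroid(points):
        return (mean(p[0] for p in points), mean(p[1] for p in points))

    def sq_dist(p, c):
        return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2

    all_points = [p for pts in bundles.values() for p in pts]
    overall = centroid(all_points)
    total = mean(sq_dist(p, overall) for p in all_points)

    within_terms = []
    for pts in bundles.values():
        c = centroid(pts)  # one centroid per bundle
        within_terms.extend(sq_dist(p, c) for p in pts)
    return mean(within_terms), total

# Hypothetical (tin, lead) readings in wt%, invented for illustration only.
example = {
    "A": [(10.0, 2.0), (12.0, 2.0)],
    "B": [(20.0, 6.0), (22.0, 6.0)],
}

within, total = within_and_total_spread(example)
print(within, total, within / total)
```

A ratio well below 1 indicates that arrows resemble their bundle mates more than they resemble the assemblage as a whole; values close to 1 would indicate no bundle structure.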

Fig. 7
Scatterplot of the lead and tin values of a sample of arrowheads, discriminated by bundle

Fig. 8
Scatterplot of the lead and tin values of a sample of tangs, discriminated by bundle

Fig. 9
Scatterplot of the lead and tin values of 20 arrowheads from the same bundle, discriminating between the better preserved and the corroded ones

We believe that the most likely explanation for this pattern is that every bundle of arrowheads represents an individual batch of metal coming from a single crucible load, while each set of tangs would constitute another batch. This aspect is quite revealing with respect to the organization of production as it suggests that the arrow bundles would leave the workshop as a finished item, with no room for the mix-up of different alloy batches before the arrow parts were assembled and grouped in bundles. It is quite unlikely that this is an intentional phenomenon related to weapon makers’ concerns with performance since all the arrows are tin bronzes of comparable properties and mixing chemical batches would not affect the outcome. Instead, we believe that this patterning is a side effect—and, as such, evidence—of the work being organized in semi-autonomous, multi-skilled cells of laborers that produced the different parts and the finished assembled products rather than being indicative of a single production line.

Batches and Cellular Production

Working with ceramics and glass, Ian Freestone and colleagues (2009a, b, 2010) have promoted the concept of “the batch” as an analytical unit of great potential for the study of artifact production and distribution. Identifying individual batches through chemical analyses is not easy as it generally requires high degrees of analytical precision (see also Bezúr 2003; Uribe and Martinón-Torres 2012). However, where possible, it allows higher-resolution inferences about the organization of production that take us beyond the superficial ascription of products to generic production regions. Batches stem from singular production events such as furnace or kiln loads, and they bring our analytical focus down to the more human scale of individual workshops and single acts of manufacture—they are therefore a more robust, materials science equivalent of the “analytical individual” often sought in stylistic studies of pottery or sculpture (cf. Morris 1993). Chemical batches can be traced to study marketing and consumption patterns (for example, groups of items produced, sold, or bought as a “set,” cf. Freestone et al. 2009a, or batches of tiles commissioned for the different rooms of a palace, cf. Freestone et al. 2009b); in this manner, they provide information about “microprovenience” rather than “macroprovenience” only (sensu Rice 1981, p. 219). It is worth stressing that the term “batch” is employed here in a narrow chemical sense (for example, all the items produced with a single crucible load), in contrast to the term “bundle” which is simply defined archaeologically as a group of approx. 100 finished arrows found together in the pit. More precisely, we argue that each arrow bundle reflects two discrete metal batches (one for the arrowheads and another one for the tangs) as well as, most likely, individual batches of 100 bamboo shafts and feathers.

The fact that chemical batches were preserved as coherent sets indicates that a relatively small, well-defined group of workers would have cast 100 arrowheads and as many tangs and immediately proceeded to finish and assemble them before casting the next two batches. Quite probably, other multi-skilled units would be producing similarly finished articles at the same time, as perhaps suggested by the inscriptions on warriors, tiles, and some other weapons, seemingly relating to more than one workshop or production unit in each case (see below). This is the labor organization model known as “cellular production” (Dohse et al. 1985; Ohno 1988; Productivity Development Team 1999; Dioguardi 2009, pp. 51–69).

If the alternative hypothesis had been true, with production proceeding via an assembly line, one could envisage one specialized workshop unit producing arrowheads continuously, while another produced tangs, another bamboo shafts, and so on. The different parts would then reach one another at different junctures of the assembly line. Under this logic, they would then be polished, assembled, and finished by different specialized units before they were eventually grouped in bundles of 100 arrows. In such a model, however, it is much more likely that different metal batches would be mixed up among the various bundles found in the pit as the different units would be working at different paces and relatively large stocks would be produced—and possibly stored—before sending the parts to the next stage of the flow line. In other words, the temporal or even spatial separation between these production stages would inevitably lead to the mix-up of chemical batches—a feature virtually absent in our dataset (Fig. 10).

Fig. 10

Schematic representation of alternative production models for the manufacture of arrow bundles and their predicted effects in the distribution of chemical batches. Each color represents a different batch. a A single flow line of production and assembly. b Cellular production of finished bundles in semi-autonomous units
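The contrast drawn in Fig. 10 can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not a reconstruction of the actual workshops: the number of crucible loads (`N_BATCHES`), the number of batches stockpiled before bundling (`stock_batches`), and all function names are hypothetical. It counts how many distinct metal batches end up in each bundle of 100 arrowheads under each model.

```python
import random

BUNDLE_SIZE = 100  # arrows per bundle, as described in the text
N_BATCHES = 20     # hypothetical number of crucible loads


def cellular_production(n_batches, bundle_size):
    """Each cell casts one batch and bundles it before casting the next."""
    return [[batch_id] * bundle_size for batch_id in range(n_batches)]


def flow_line_production(n_batches, bundle_size, stock_batches, rng):
    """Cast parts pile up in a shared stock that is drawn on at random."""
    bundles, stock = [], []
    for batch_id in range(n_batches):
        stock.extend([batch_id] * bundle_size)
        # Assumption: bundling starts only once several batches are in storage.
        if (batch_id + 1) % stock_batches == 0:
            rng.shuffle(stock)
            while len(stock) >= bundle_size:
                bundles.append([stock.pop() for _ in range(bundle_size)])
    return bundles


def mean_batches_per_bundle(bundles):
    """Average number of distinct batches represented in a bundle."""
    return sum(len(set(bundle)) for bundle in bundles) / len(bundles)


rng = random.Random(0)
cellular = cellular_production(N_BATCHES, BUNDLE_SIZE)
flow_line = flow_line_production(N_BATCHES, BUNDLE_SIZE, stock_batches=4, rng=rng)

print("cellular :", mean_batches_per_bundle(cellular))
print("flow line:", mean_batches_per_bundle(flow_line))
```

In this sketch the cellular model yields exactly one batch per bundle by construction, whereas any stockpiling step in the flow line mixes batches. The size of the effect depends entirely on the assumed stock size.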

We can take this argument further. Even though we only have the head and tangs preserved, their occurrence in compositionally homogeneous bundles implies that not only would the metal parts leave the workshop complete, polished, and assembled but also that, at this stage, they would be finished with shaft and feathers already attached, bundled, and possibly placed inside a quiver. It is plausible that the quiver would already be attached to a specific crossbow or even to a complete crossbowman manufactured in the same or a closely related production cell (but see further discussion below).

The ongoing study of other weapon types broadly supports this hypothesis too, although some unresolved questions remain. Given the relatively lower numbers of artifacts in other categories and the more limited analyses so far, it has not been possible to isolate patterns from background noise and thereby confidently identify chemical groups akin to those discussed above. For these other weapons, we have therefore relied mostly on typological and metric research, coupled with spatial analyses. For example, detailed measurements of the approx. 220 crossbow triggers have allowed the identification of a number of subtle but undeniably different subgroups that suggest the existence of different casting molds and/or production units. When we examine the distribution of trigger subgroups on the site plan, they also form more or less clear groups whose significance as meaningful spatial clusters can be confirmed statistically. Most likely, the clustered patterns of trigger subgroups in the pit reflect the existence of different workshops producing marginally different crossbow triggers and equipping certain zones of Pit 1 in one go with these weapon batches. It is also possible that the clusters indicate that the pit was divided into “activity areas” that were assigned to different groups of workers, allowing them (and conceivably the different workshops or storage units that supplied them) to operate more or less independently and in parallel while, of course, following some form of overall master plan (Bevan et al. in press; Li 2012). Thus, this pattern would also seem consistent with a cellular model.
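One generic way of checking whether subgroups are spatially clustered is a label permutation test. The sketch below is an illustration only and is not the procedure used in the studies cited above, whose methods are not described here. The coordinates, subgroup labels, and function names are all hypothetical. The test statistic is the share of items whose nearest neighbor belongs to the same subgroup, compared against the values obtained when labels are shuffled across the same find spots.

```python
import random
from math import dist


def nearest_neighbor_indices(points):
    """For each point, return the index of its nearest other point."""
    nearest = []
    for i, p in enumerate(points):
        others = (j for j in range(len(points)) if j != i)
        nearest.append(min(others, key=lambda j: dist(p, points[j])))
    return nearest


def same_group_share(labels, nearest):
    """Share of items whose nearest neighbor has the same subgroup label."""
    matches = sum(labels[i] == labels[j] for i, j in enumerate(nearest))
    return matches / len(labels)


def permutation_test(points, labels, n_permutations, rng):
    """Compare the observed share with shares under shuffled labels."""
    nearest = nearest_neighbor_indices(points)
    observed = same_group_share(labels, nearest)
    shuffled = list(labels)
    as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # break any link between subgroup and location
        if same_group_share(shuffled, nearest) >= observed:
            as_extreme += 1
    p_value = (as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


rng = random.Random(1)
# Hypothetical find spots: two trigger subgroups placed in separate zones.
points = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(30)]
points += [(rng.uniform(15, 25), rng.uniform(0, 10)) for _ in range(30)]
labels = ["subgroup_1"] * 30 + ["subgroup_2"] * 30

observed, p_value = permutation_test(points, labels, n_permutations=199, rng=rng)
print(f"same-subgroup neighbor share = {observed:.2f}, p = {p_value:.3f}")
```

A small p-value here means only that the labels are more spatially segregated than random shuffling would produce. It says nothing about why the segregation arose.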

There is, however, an aspect of the weapons’ production that appears more difficult to explain with reference to this model. As noted above, some of the long weapons such as halberds, dagger-axes, or lances bear inscriptions noting the “regnal year” when they were produced (Museum of Qin Shihuang’s Terracotta Army and Shaanxi Institute of Archaeology 1988; Yuan 1984; Li et al. 2011; Li 2012). If they had been manufactured by versatile cells working in parallel, we would expect the production dates for different weapon types to overlap. However, this seems not to be the case: inscribed halberds and dagger-axes are dated to 244–237 BC, whereas lances date to 232–228 BC. In the face of this situation, several potential explanations may be proposed. First, it is possible that these scarcer and more elaborate weapons, typically carried by higher-rank warriors, were produced and stored by more specialized artisans who did not operate under the general cellular model. Second, a related possibility would be that these weapons were not made specifically for the Terracotta Army but obtained from existing workshops’ stock meant for ordinary military purposes. Indeed, it has been suggested that the beginning of the construction of the Terracotta Army postdates the latest date recorded on a weapon (Nickel 2007, p. 179), which would support this proposal. A third option would be that these weapons were in fact produced in multi-skilled cells and that their lack of chronological overlap is not significant, being perhaps related to the actual progress of the works or to the vagaries of preservation and recovery. In this sense, it should be noted that the dated weapon groups discussed here consist of only 5 halberds, 1 dagger-axe, and 16 lances. For the group of halberds and dagger-axes, three officials and six different workers are recorded in those inscriptions; for the group of lances, two officials and three workers. As it is reasonable to suggest that all the halberds or lances needed for the Terracotta Army could have been made by a sole production unit in a single year, the impression we get in any case is not that of a single, specialized workshop (at least, not one producing only for the emperor’s funerary commission).

A further issue to be considered is the possibility that not only finished sets of weapons but also the ceramic warriors carrying them would be produced by the same cells. Such an option would multiply the range of supplies and technical skills required for each cell, but, as discussed in the next section, it could also facilitate the timely delivery of warriors to the pit as the works progressed. To some extent, however, the workshop marks on both the warriors and weapons argue against this hypothesis as they show no overlaps between the ceramic and metal. Thus, although we are currently working on the hypothesis that the production of warriors was organized in cells too, we assume for now that these would function independently of the metallurgical cells, albeit perhaps in close cooperation. At the same time, it is worth highlighting that the marks of some workshops or production units appear in both warriors and ceramic drain pipes. This strongly suggests that, as with the weapon makers, multi-skilled cells of ceramic workers could be deployed to different tasks as and when needed (Ledderose 2000, pp. 69–73).

Why Cellular Production?

The organization of production into cells might appear counterintuitive at first. Considering that the bulk of the Emperor’s Mausoleum and its contents was produced for a single commission, it would seem more efficient to arrange specialized production lines for each of the elements required, be they arrow tangs, warrior legs, linen cords, or ceramic bricks. This hypothetical system, potentially supported by the use of molds and prefabricated modules (Ledderose 2000, pp. 50–73), would certainly increase the bulk output and avoid the need to duplicate human and material resources in the different cells. By contrast, we are arguing here that each production unit would include its own furnaces, metals and molds for the various metal parts, polishing tools, textiles, feathers, bamboo, and the necessary skills to turn these into finished arrow bundles; we furthermore contend that several units with equivalent resources and expertise may have functioned in parallel and potentially, although not necessarily, at different geographical locations.

There are several reasons that may serve to justify a cellular logic for weapons production. To begin with, one needs to bear in mind the important fact that the mausoleum was the first of its kind. Many subsequent emperors would try to emulate Qin Shihuangdi, but at the time of this construction, there was no obvious model or previous experience to draw upon—not even an established tradition of figurative sculpture—other than the generic inspiration taken from Western mausolea (Ledderose 2000, pp. 65–68). As such, there was no way to accurately predict the exact numbers of items needed or the time it would take to produce them. It would thus seem more sensible to produce warriors and weapons on demand, as the work progressed, as this would avoid overstocking, allow prioritization of tasks, and accommodate changes in any potential master plan.

Another reason for the organization of production into cells may relate to the very arrangement of the warriors in battle formation inside the pit. As noted above, crossbowmen are predominantly located in the vanguard and along the flanks of the army, but in the vanguard in particular, they are frequently intermixed with other warrior types, sometimes behind chariots, etc. In fact, the warriors are so tightly packed in the pit that, once a row of finished warriors was in place, the resulting cohort was practically impenetrable. Each equipped warrior therefore had to be delivered to its place as a complete item and at the right time, or else the whole enterprise would be stalled. The advantages of having versatile groups of workers capable of producing equipped crossbowmen, but also sword-wielding officers, as and when needed, are obvious.

Overall, given the inevitably fluctuating nature of the funerary project as well as the interdependence of different crafts and subprojects, the organization of production in the form of semi-autonomous, multi-skilled, relatively self-sufficient cells would minimize the negative impact of potential breakdowns while maximizing adaptability.

The above justifications for the cellular production system are based on the assumption that this labor organization was largely directed by mortuary behavior, i.e., the specific commission of the mausoleum. Although this point remains to be tested in future work, we would also like to raise the possibility that a cellular arrangement operated not only in the manufacture of weapons but in the construction of the whole Terracotta Army, or even the entire mausoleum. Even so, in light of the evidence currently available, it is also possible that the cellular model applies to the weapons only and that weapon makers were operating as normal—i.e., always in autonomous cells—rather than adopting a peculiar production system for this funerary commission. In fact, the similarities between the inscriptions on weapons from the mausoleum and those found outside (Yuan 1984) do indeed suggest comparable organization of production. In any case, the cellular organization of labor would be useful for ordinary weapons’ manufacture (i.e., for the actual battlefield) since the multi-skilled cells could be more easily moved with a real army to repair and produce arrows or other weapons as needed.

A modern parallel from the car production industry may be illustrative here, especially since the definition of flow line production and cellular production was first articulated with reference to the automotive industry. The typical model of a moving assembly line in constant flow was famously formalized by Henry Ford in the early twentieth century. “Fordism” did not invent the assembly line, but it made it more efficient by breaking down tasks in such a way that they could be performed by low-skilled laborers aided by highly specialized machines set up in sequence. The obvious advantage of this system is that it ensured low production costs, high productivity, and consistent standards at a time of heavy demand and little competition. A basic underlying assumption was that, in due course, the market could take as many cars as they managed to produce (Dioguardi 2009, pp. 51–59; Groover 2010, pp. 18–19). Toyota, another car maker, has popularized cellular production, especially since the 1970s (Dohse et al. 1985; Ohno 1988; Productivity Development Team 1999; Dioguardi 2009, pp. 51–69). The so-called lean production or Toyotism has become synonymous with cellular organization in a context of growing market competition. The making of Toyota cars involves assembly lines too, but these are arranged in typically smaller work units whose members operate in closer proximity and produce cars only when demand is in place (a production strategy known as “just-in-time”). The costs of storage and inventory are thus reduced, as well as the risks of overstocking. Workers are on average more skilled, much more versatile, and ready to assume a variety of tasks as needed, not only related to the production of different items but also to carrying out inspection and repair functions. As such, they tend to be more involved with the whole production process. Instead of producing large stocks of cars or parts thereof, they keep informed of production schedules (particularly, the progress of other cells or parts of the cell) and modify their plans accordingly, reducing downtime and adapting to the need or demand. They operate under a “lean manufacturing” management philosophy, a customer-oriented approach that strives to minimize waste while preserving product value. For example, “workers waiting” or “producing more than required” are considered waste. Even though individual cells may manufacture different products, they work in close cooperation and synchronize their output, as was perhaps the case with weapon- and warrior-making cells.

Taiichi Ohno, the engineer responsible for the planning of Toyota’s lean production system, defines the production process as “thinking about the transfer of materials in reverse direction.” As he explains, “In automobile production, material is machined into a part, the part is then assembled with others into a unit part, and this flows towards the final assembly line. The material progresses from the earlier processes toward the later ones, forming the body of the car. Let’s look at this production flow in reverse: a later process goes to an earlier process to pick up only the right part in the quantity needed at the exact time needed. […] Then the method of transferring the materials is reversed. To supply parts used in assembly, a later process goes to an earlier process to withdraw only the number of parts needed […] to control the amount of production” (Ohno 1988, p. 5). Obviously, both models—the Terracotta Army and a Toyota factory—are much more complex than outlined here. Not all the parameters are applicable or even discernable in our archaeological case study, and discussion of their relative success in modern contexts is beyond the scope of this paper. However, in its basic principles and potential advantages, the organization of labor documented in the production of the Terracotta Army seems much closer to Toyotism than to Fordism.

An anonymous reviewer of this paper observed that the clustered pattern of weapon batches and bundles might also occur if several relatively independent groups of people were making gifts to the emperor as complete weapon “packages.” This interpretation is plausible and cannot be rejected outright at present. In practice, however, there would be little difference between what we might call “cells of craft specialists” retained by the emperor and equivalent cells retained by elite groups whose products were then donated as gifts—given that in any case they would have to operate under tight prerequisites to meet standardization requirements, timing for delivery, etc.

Quality Control and Standardization

Having established that the model of labor organization in the production of weapons for the Terracotta Army appears to have been cellular, it is worth looking in more detail at issues of quality control and standardization. How standardized were the products of different production cells, and how were those standards kept?

Standardization is a relative term, and one that is difficult to assess comparatively when working at different scales. The coefficient of variation—CV, typically expressed as (standard deviation/mean) × 100—is useful because it constitutes a measure of dispersion expressed as a dimensionless number. As variation is scaled to magnitude, the CV allows for comparisons of datasets with very different means (Stark 1995; Longacre 1999; Roux 2003; Underhill 2003; Eerkens 2000; Eerkens and Bettinger 2001; Eerkens and Lipo 2005). Thus, we can make meaningful comparisons of CVs for sword length and arrowhead width, for example. Eerkens and Lipo (2005) have used CVs to study artifact variability with a view to discerning unintentional copying errors from intentional variability in cultural transmission. Their null hypothesis, based on ethnographic observations and experiments, is that CVs of up to about 5 % may be explained as a result of human copying errors since that is approximately the limit of human ability to reproduce items of the same dimensions without external aids such as rulers (the minimum amount of variability obtained by humans for length measurements can be as low as a CV of 1.7 %, but small errors in motor skills and memory will introduce additional variability). The point of relevance for our work here is that the same threshold, known as the Weber fraction, may be used in the opposite way: even when rulers and molds are employed (as was obviously the case in the Terracotta Army weapons), CVs can give us an indication of the extent to which artifact standardization left room for improvement or of whether variability would be noticeable to observers.
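The calculation described above can be sketched in a few lines of Python. The measurements below are invented for illustration and are not data from the site; the function name and the constant `COPY_ERROR_THRESHOLD` are likewise hypothetical labels for the roughly 5 % limit mentioned in the text. The sample standard deviation is assumed.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV in percent: (sample standard deviation / mean) * 100."""
    return stdev(values) / mean(values) * 100


# Hypothetical measurements in cm, invented for illustration only.
sword_lengths = [90.5, 91.2, 89.8, 90.9, 91.5]
arrowhead_widths = [0.97, 0.99, 0.98, 1.00, 0.96]

COPY_ERROR_THRESHOLD = 5.0  # approximate limit cited in the text, in percent

for name, data in [("sword length", sword_lengths),
                   ("arrowhead width", arrowhead_widths)]:
    cv = coefficient_of_variation(data)
    below = cv < COPY_ERROR_THRESHOLD
    print(f"{name}: CV = {cv:.2f} % (below threshold: {below})")
```

Because the CV is dimensionless, the two attributes can be compared directly even though their means differ by two orders of magnitude.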

Li (2012) measured the width and length of over 1,600 arrowheads from Pit 1 (six arrows from each of the 278 bundles with 90 arrows or more) as well as the length of as many tangs. For the heads, the average length ± standard deviation is 2.71 ± 0.08 cm and the value for the width is 0.98 ± 0.01 cm. This corresponds to overall CVs of respectively 2.8 and 4.1 % for the whole sample. Within-bundle CVs tend to be similarly low (Table 1). These CV values are admittedly just high enough to be discernable by the human eye in a few cases (e.g., when putting arrows with extreme length values side by side), but still indicative of a very remarkable degree of standardization. We must remember that 40,000 arrows have been recovered, with many more thousands probably yet to be excavated, thus representing a very large number of production events (or “generations”) that could potentially have led to much higher cumulative errors (see the simulation models in Eerkens and Lipo 2005). As a matter of fact, the low CVs suggest that a relatively small number of artifact models, perhaps provided to each of the cells by some central authority, would have been used routinely to produce new arrow molds rather than every new mold being copied from random arrows from a previous batch. Of course, given that we do not know the number of arrows that were carved into each casting mold, nor the number of times a mold could be used before needing replacement, nor even the number of cells, it is impossible to model the data with any precision. However, in evolutionary terms, this pattern is strongly indicative of “biased transmission” with a high “strength of conformity” (Eerkens and Lipo 2005). The CV values for the many dimensions measured on other commonly occurring weapon types at the site, such as crossbow triggers or ferrules, are typically lower than 5 %—confirming the overall impression of an extremely high degree of standardization (Li 2012).
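The intuition behind the argument about cumulative errors can be illustrated with a toy simulation. This is a simplified sketch and not the model of Eerkens and Lipo (2005): the per-copy error rate, the number of generations, and the number of lineages are hypothetical values, and copying error is assumed to be relative and normally distributed. Only the starting length of 2.71 cm is taken from the text.

```python
import random
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent."""
    return stdev(values) / mean(values) * 100


def imperfect_copy(value, error_rate, rng):
    """Return a copy with a relative, normally distributed error."""
    return value * (1 + rng.gauss(0, error_rate))


def chain_copying(master, generations, lineages, error_rate, rng):
    """Each new mold is copied from the previous generation's product."""
    finals = []
    for _ in range(lineages):
        value = master
        for _ in range(generations):
            value = imperfect_copy(value, error_rate, rng)
        finals.append(value)
    return finals


def central_model_copying(master, lineages, error_rate, rng):
    """Each new mold is copied directly from the same master model."""
    return [imperfect_copy(master, error_rate, rng) for _ in range(lineages)]


rng = random.Random(42)
MASTER_LENGTH = 2.71  # cm, mean arrowhead length reported in the text
ERROR_RATE = 0.03     # hypothetical relative error per copying event

chain = chain_copying(MASTER_LENGTH, generations=50, lineages=500,
                      error_rate=ERROR_RATE, rng=rng)
central = central_model_copying(MASTER_LENGTH, lineages=500,
                                error_rate=ERROR_RATE, rng=rng)

print(f"chain copying : CV = {cv_percent(chain):.1f} %")
print(f"central model : CV = {cv_percent(central):.1f} %")
```

Under these assumptions, variability grows with the number of generations when molds are copied from earlier products, but stays close to the single-copy error when every mold derives from a shared model.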

Table 1 Summary statistics for the arrow measurements

The only significant exception to this pattern of high standardization is noted for the arrow tangs. Here, the frequency distribution shows two modes for the length, one averaging 7.51 ± 0.82 cm and another one 13.26 ± 1.7 cm (Fig. 11). It is possible that the two main tang lengths might be related to variable shaft lengths; indeed, the shorter tang bundles appear spatially clustered toward the center of the battle formation (Li 2012). Whatever the case, tang lengths have more perceptible CVs of respectively 10.9 and 13.0 % (Table 1). These values were calculated after excluding 21 bundles with a perceptible bimodal mixture of long and short tangs, allowing the possibility that all of these could be post-depositional mix-ups. The higher variability exhibited by the tangs is also noted, as mentioned above, in aspects such as their profile, straightness, or thinness (Fig. 5). This variability does not mean that their manufacture was careless, though. On the contrary, all of these tangs appear to have been filed down to smoothen their surfaces, and every single arrowhead was carefully ground and polished to ensure a sharp and shiny finish (Li et al. 2011). The less standardized appearance of the tangs was perhaps allowed because these would be inserted in the bamboo shafts and therefore invisible to observers, supervisors, or the warriors that used them. In keeping with a cellular production system, perhaps the main stage of external quality control took place only when the bundles were finished (with the tangs being therefore concealed).
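As a rough check, the CV definition can be applied to the rounded summary values quoted above for the two tang-length modes:

```latex
\mathrm{CV}_{\text{short}} = \frac{0.82}{7.51} \times 100 \approx 10.9\,\%,
\qquad
\mathrm{CV}_{\text{long}} = \frac{1.7}{13.26} \times 100 \approx 12.8\,\%
```

The first value matches the reported figure. The second falls slightly below the reported 13.0 %, which may simply reflect the standard deviation being quoted to one decimal place; this is an assumption rather than something stated in the source.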

Fig. 11

Frequency distribution histogram for the tang lengths of the arrowheads measured showing two main modes, albeit with relatively wide scatters

When suitable analytical data are available, calculating CVs for artifact compositions can be a useful proxy for standardization in the selection of raw materials. In the present study, however, given the limitations of surface pXRF analyses already discussed, the resulting values are bound to be unrealistically high and distorted by the variable surface patination. Figure 12 shows the calculated averages and CVs in tin values for the heads and tangs of all the bundles analyzed compared to the overall CV for the whole sample. The main observation that can be made here is the tendency for intra-bundle CVs to be lower than inter-bundle CVs, confirming our impression that each bundle is likely to constitute a single chemical batch. While CVs in tang compositions appear generally higher than those in heads, this cannot be taken as an indication of lower chemical standardization (obviously, the composition of the molten metal within a given crucible would be the same irrespective of the artifacts being cast). Rather, this is probably related to the generally higher tin contents in the heads, which led to the formation of more even and stable post-depositional patinas: as a matter of fact, there is a trend for bundles with lower tin to show higher compositional CVs, whether they are heads or tangs; conversely, no correlation was found between compositional and dimensional CVs.

Fig. 12

Bar chart showing the calculated average and CV for the tin contents of the heads and tangs in each of the bundles analyzed by pXRF, arranged in ascending order. The horizontal lines mark the overall CV for the whole sample including all the arrows analyzed from all bundles
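The comparison summarized in Fig. 12 can be sketched as follows. The tin readings, bundle names, and function name are hypothetical and chosen only to show the calculation: a CV is computed for each bundle and then compared with the CV of all readings pooled together.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation in percent."""
    return stdev(values) / mean(values) * 100


# Hypothetical tin readings (wt%) for arrowheads from three bundles.
tin_by_bundle = {
    "bundle_A": [18.1, 18.4, 17.9, 18.3],
    "bundle_B": [21.0, 20.6, 21.3, 20.9],
    "bundle_C": [15.2, 15.6, 15.0, 15.4],
}

within_bundle_cv = {name: cv_percent(vals) for name, vals in tin_by_bundle.items()}
all_readings = [v for vals in tin_by_bundle.values() for v in vals]
overall_cv = cv_percent(all_readings)

for name, cv in within_bundle_cv.items():
    print(f"{name}: within-bundle CV = {cv:.1f} %")
print(f"whole sample: CV = {overall_cv:.1f} %")
```

When each bundle corresponds to a single batch, the within-bundle values should sit well below the pooled value, as they do in this invented example. With real surface pXRF data, patination would be expected to inflate all of these figures.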

All in all, the fact that the Qin craftspeople involved in the production of weapons for the Terracotta Army managed to keep strict standards is apparent from the dataset. How they managed to do so is a different question, especially in view of the challenges posed by the likely existence of several cells producing in parallel and more or less autonomously. In our view, the answer to this question lies in the use of shared models, standards, and molds, but also in a pyramidal system of supervision. Some relevant information can be obtained from the inscriptions chiselled on some of the weapons and mentioned above (Museum of Qin Shihuang’s Terracotta Army and Shaanxi Institute of Archaeology 1988; Yuan 1984; Li et al. 2011; Li 2012): up to four hierarchical levels of supervision and accountability can be reconstructed from the long inscriptions, sometimes ranging from the Prime Minister to a larger number of individual workers. Thus, the combination of decentralized cellular production with a centralized supervision system in charge of models, molds, and quality control seems to be the main organizational strategy behind the weapons of the Terracotta Army.

Conclusion

Craft specialization and standardization continue to attract the interest of archaeologists as they inform us not only about technological aspects but also, by extension, about the broader socioeconomic contexts within which production systems operated. This study has explored the applicability of modern concepts such as “flow line production” and “cellular production” as null hypotheses for production models that may be tested against archaeological data. Using the bronze weapons from the Terracotta Army as a case in point, we demonstrated that chemical data can be integrated with typological, metric, and spatial information to address these questions. In particular, the recognition of chemical batches, likely deriving from individual crucible loads and preserved as chemically coherent sets of bundled arrows, was used as the basis to propose that weapon production was articulated in semi-autonomous cells comparable to, for example and without implying a wider equivalence, those in modern Toyota factories. The organization of production into multi-skilled cells would have allowed a much more versatile adaptation to the mausoleum’s master plan as the latter evolved, while a hierarchical structure of accountability and quality control would have ensured the high degree of standardization noted in our metric analyses. In this sense, this work has revealed that the centralized organizational structures that would make the Qin Empire historically famous were flexible enough to accommodate relatively small and versatile productive units to the benefit of efficiency and without compromising on standardization or quality.

While chemical analyses of archaeological artifacts are frequently used to address issues of material selection and provenance, these data are rarely used to reconstruct production organization and logistics. We thus seek to highlight a promising research strategy here, with “the batch” as a useful analytical category that, with the exception of the work by Freestone et al. and very few others, has generally not been recognized in previous research. Our focus is now shifting to the ceramic warriors themselves, with the intention of investigating whether the same or a different production system lies behind their manufacture and placement in the pit. We hope subsequently to expand our research to cover other areas of the mausoleum in order to understand the logistical organization that allowed the timely and efficient assembly of this colossal construction.

The large size, remarkable preservation, and narrow chronological range of the First Emperor’s tomb make it unusually well suited for this kind of approach. However, it should be possible to fruitfully apply similar approaches to other case studies. On a single-site basis, chemical and spatial data should be employed in tandem far more often than they have been as yet, for example, to study the bricks or metal reinforcements of a single commission such as a temple or of various constructions in a given settlement. Such studies could provide sharper information about whether the construction of a large building was structured in different sectors that were allocated to specific groups of workers or whether individual houses were built by their own dwellers in individual construction events. On a broader scale, the products of a known manufacturer found across a wider region—for example, ceramic vessels with the same manufacturer’s mark—could be analyzed with a view to determining how different batches were distributed and marketed. We anticipate that these strategies, aided by the availability of fast, reliable, and portable analytical equipment, will lead to further integration of instrumental analyses and broader archaeological research agendas.