Introduction

This paper brings together the results of a long-term comparative study investigating the energetics, efficiency, and identification of bipolar and freehand reduction in contexts of lithic miniaturization (Pargeter and de la Peña 2017; Pargeter and Eren 2017). The paper’s main aim is to draw the experimental project to a close by comparing the miniaturized flake assemblages made on milky quartz and flint and to explore potential differences between them. Our experiments followed the same protocols and were designed to achieve the same production goal: the manufacture of small elongated flakes. The results and discussion that follow have implications for archeologists interested in measures of blade/bladelet and small flake production and its relationship to aspects of raw material variation, technological variability, and the effects of raw material differences on choices of technology.

Bipolar reduction is Pleistocene archeology’s most enduring lithic strategy (Sánchez-Yustos et al. 2017), and knappers still used it in the nineteenth and twentieth centuries (Weedman 2006; Witthoft 1966). Bipolar reduction, in its basic definition, involves the use of a hammer and anvil for flake production and flake modification. This technique is relatively simple to learn and to transmit, and it allows toolmakers to stabilize smaller cores during flake production (Callahan 1987; Flenniken 1981). More complex variations of bipolar reduction involve anvil-assisted flaking in which freehand cores are modified after being placed on an anvil (Callahan 1987; Pargeter and Tweedie 2018) and bipolar cobble splitting (Duke and Pargeter 2015). Despite its widespread occurrence in the archeological record, bipolar reduction is still generally perceived as a second-rate lithic reduction strategy, even by one of us recently (e.g. Eren 2010).

Bipolar reduction’s study in archeology has an ignominious history. Hayden (1973: 216) once noted, “it was probably the lazy and inept stone knappers who originally adopted the bipolar method.” Odell (2000: 294) echoed Hayden’s sentiments when he noted that, in North America, “[Bipolar reduction] would not be worthy of mention” and that bipolar reduction is a strategy “nobody would normally engage in.” Some North American archeologists rejected the possibility of bipolar reduction as a legitimate reduction strategy, preferring to see it as a reflection of entropy and chaos in lithic reduction (Sollberger and Patterson 1976). Not all archeologists share Sollberger and Patterson’s perspectives (e.g. Binford and Quimby 1963; Goodyear 1993; Shott 1989a), but they have influenced archeological thought about bipolar reduction. Furthermore, archeologists working in North America and in Europe have dealt extensively with confusion between the use of wedges and chisels and the by-products of bipolar knapping (de la Peña 2011; Hayden 1980; Le Brun-Ricalens 1989, 2006; Shott 1989a, 1999). Africanist archeologists have historically been more inclined to see bipolar reduction as a legitimate core reduction strategy (e.g., Barham 1987) and to include it as a major component in many of the continent’s lithic techno-complexes and industries, particularly for those assigned to the Later Stone Age (de la Peña 2015a).

Despite bipolar reduction’s ignominious history in archeology, it is now a widely theorized and discussed set of lithic reduction strategies (Shott and Tostevin 2015). There are three main explanations for bipolar reduction’s increased importance. First, bipolar reduction occurs in nearly all prehistoric contexts in which humans made and modified stone artifacts. From the earliest flake production and block modification using hammers and anvils to Late Pleistocene bipolar bladelet production, toolmakers used bipolar technology for many different purposes (de la Peña 2015a; Shott and Tostevin 2015). Second, traditional lithic systematics, focused on reduction strategies that are more difficult to learn and transmit, tended to underplay the importance of simple technologies in human behavioral evolution. The growth in evolutionary perspectives (e.g. Buchanan et al. 2014; Eren et al. 2015; Kuhn and Miller 2015; Lycett and Cramon-Taubadel 2014; Muller and Clarkson 2016; O’Brien et al. 2015; Schillinger et al. 2015; Shott 1989b; Surovell 2012), which instead aim to understand strategic variation in human behavior and its material manifestations, has accommodated the co-occurrence and reoccurrence of simpler and more complex technologies. Third, bipolar reduction is relatively simple to learn and to execute. As Callahan (1987) phrased it, bipolar reduction is a simple but not a simplistic strategy; its use is usually related to expedient (time-saving) strategies and sometimes to more 'complex' technologies on other raw materials (such as wood or bone). It also appears as if bipolar reduction's simplicity in performance is inversely proportional to the ease of its identification in the archeological record.

Experimental and ethnoarchaeological research has helped archeologists to explore potential motivating factors for why and how humans used bipolar reduction. Experiments have shown how certain techniques of bipolar reduction (e.g. splitting quartz cobbles) need considerable skill (Duke and Pargeter 2015). Others have shown how expert toolmakers converted bipolar core mass into flakes and cutting edge more efficiently than novices (Morgan et al. 2015). Ethnoarchaeological research among the Konso hide workers in Ethiopia shows that they use bipolar techniques to produce and modify small (c. 10–30 mm long) stone hide scrapers and to conserve time and raw materials. They do so because the distances to rock sources are relatively far (up to 25 km away) (Arthur 2010). The Duna-speaking groups of the Papua New Guinea highlands (White 1968), on the other hand, use bipolar techniques to produce expedient (quick), standardized, and thin cutting edges. They also conserve time and energy for raw material procurement. Here, bipolar reduction is a situational response to low tool stone supply and time constraints. In brief, such studies show that the time available for tool production and the associated time for raw material procurement influence toolmakers’ choice to use bipolar reduction (also see Schillinger et al. 2014; Torrence 1983).

Archeological investigations add to the ethnographic observations to show how toolmakers wove bipolar technology into other more complex lithic reduction strategies. Stone toolmakers used bipolar technology to test raw materials before reduction, to manage core reduction, and to extend small cores’ use-lives (Flenniken 1981; Hiscock 2015; Pargeter 2016; Prous et al. 2010; Pargeter and Tweedie 2018). Bipolar reduction’s benefits accrue with relatively fewer costs regarding the skills necessary to learn the technique. Moreover, they reduce the time for raw material procurement and tool production.

Bipolar reduction is an especially profitable strategy for lithic miniaturization, that is, the systematic production of small stone tools from small cores (Pargeter 2016). Foremost, this approach provides maximum stability for small cores’ reduction (Hiscock 2015). It allows for the extraction of long and thin flakes that extend most of the way down the core face (Callahan 1987), thereby exploiting a core's maximum surface area. Bipolar reduction can facilitate flake removal from small, narrow, and crushed core platform edges, thereby opening new possibilities to reduce small cores with platform morphologies that would otherwise pose significant challenges.

Southern African Late Pleistocene LSA (LPLSA) (c. 44–12 kcal BP) miniaturized lithic assemblages show a unique combination of high bladelet production, low retouch frequencies, fine-grained raw material use, and high rates of bipolar reduction (Low and Mackay 2016; Pargeter 2016; Porraz et al. 2016). Initially, archeologists referred to these miniaturized bipolar cores as “wedges” or pièces esquillées, outils écaillés, and core-reduced pieces following western European analytical traditions (see Deacon 1984; Mitchell 1988). Subsequent experimental work and archeological analyses have demonstrated that bipolar cores were also a key component of the small flake and bladelet production strategies in several LPLSA lithic assemblages (Pargeter 2016; Deacon 1984; Mitchell 1988; Low and Mackay 2016; Porraz et al. 2016). However, questions remained as to the relationship between bipolar reduction and bladelet production and what decision-making criteria might compel toolmakers to use this strategy over other available options in southern Africa and elsewhere (e.g., Brantingham et al. 2004). Our experimental project was born from these archeological questions, and we designed the experiments to simulate reduction scenarios relevant to the southern African Late Pleistocene lithic record.

Experimental background

Archeologists have made extensive use of experimental data to identify and describe bipolar products in lithic collections (Callahan 1987; Curtoni 1996; de la Peña 2011; Díez-Martín et al. 2009a; Díez-Martín et al. 2011; Driscoll 2011; Eren et al. 2013; Flenniken 1981). Most studies focus on bipolar cores (Jeske and Lurie 1993) as these are relatively straightforward to identify and describe (Díez-Martín et al. 2011). Yet, because bipolar reduction tends to increase core discard thresholds, it tends to produce lower core to flake ratios. This leaves the numerically superior flake components of bipolar-dominated lithic assemblages unidentified and unquantified. Fewer studies have focused on these more ubiquitous bipolar flakes (Eren et al. 2013).

Barham’s (1987) quartz and chalcedony bipolar experiments were among the first to provide clear bipolar flake descriptions, which he described as “broad and irregular in plan view with crushed butts and sheared bulbs of percussion” (Fig. 1c, d, e, and h). Barham noted that different raw materials respond differently during bipolar reduction with friable quartz resulting in greater frequencies of crushed butts than chalcedony. Kuijt et al. (1995) used high-grade trachydacite to produce nine bipolar flake assemblages with a wrapped core bipolar method (cf. White 1968). They describe bipolar flake assemblages as having high degrees of flake fragmentation and high cortex retention (Ahler 1989). Other authors have pointed to the high incidence of flake fragmentation in quartz bipolar flake assemblages (Driscoll 2010; Mourre 2004). Díez Martin et al. (2011) compared the features of flakes produced through bipolar and freehand reduction on Naibor Soit quartz. Their experiments highlight metric differences between freehand and bipolar flakes with bipolar flakes being shorter, narrower, but overall thicker than freehand flakes. They also show greater wedging initiation frequencies with few clear bulbs and crushing on both the proximal and distal ends of bipolar flakes (cf. Bradbury and Carr 2012; Jeske and Lurie 1993). Contrary to popular belief (e.g. Kobayashi 1975), bipolar flakes do not always show double bulbs of percussion or crushed ends. de la Peña (2015b) focused on the qualitative recognition of bipolar knapping on quartz and flint cores and flakes. She recognized a wide variety of bipolar blanks including flakes, bladelets, and 'chunks.' Some of the by-products are what Cotterell and Kamminga (1987: 685) term "compression" flakes whereby flakes are 'squeezed' off of cores rather than 'peeled' as in freehand reduction. 
Some blanks develop compression flake characteristics on both extremities (no bulbs of force, crushing, and a flat ventral surface), others on one side only. Flint bipolar flakes show fissured, linear, and broken platforms and do not exhibit a distinguishable impact point; sometimes the ripples on the bulbar faces are very marked and close to one another. Flint bipolar flake profiles tend to be rectilinear and, as a specific feature of bipolar flint blanks, sometimes they show a “pronounced hinged bulb” (Fig. 1b). Quartz bipolar blanks often exhibit a marked blunted whitish edge; they also show broken, fissured, and linear platforms. Their profiles tend to be rectilinear with low frequencies of hinge and step terminations, features they share with flint bipolar flakes. One of de la Peña’s (2015b) main conclusions was that the distinction between freehand and bipolar knapping on quartz is not clear from a qualitative point of view. She argued that quantitative approaches, such as the ones proposed in Pargeter and de la Peña (2017) and in this paper, could help unpack some of these complexities.

Fig. 1 Flint and quartz flake attributes. a Hertzian cone, b hinged bulb, c sheared bulb, d sheared bulb (quartz), e crushed ventral surface (quartz), f Hertzian cone (quartz), g platforms with shape (flint), h crushed platforms (quartz)

Several experimental studies have begun to generate data on bipolar reduction’s efficiency by recording its relative cutting edge to flaked mass ratios. Díez Martin et al.’s (2011) experiments show no statistically significant differences between freehand and bipolar flakes’ cutting edge to mass values. The two flake populations showed values of 0.85 and 0.89 mm/g respectively. Morgan et al.’s (2015) experiments on small chert nodules show significantly higher mean cutting edge to mass values for freehand flakes (18.08 mm/g) compared to their bipolar flakes (7.65 mm/g). More recently, Muller and Clarkson (2016) created bipolar flake assemblages with lower mean cutting edge to mass values (10.36 mm/g) when compared to the other eight reduction strategies in their experiment (the others were all > 16 mm/g). Because bipolar reduction is relatively less efficient in some contexts, there must have been other reasons prehistoric humans chose to use it so often.

Gurtov and colleagues (2014) tested the widely held perception that humans’ use of bipolar reduction reflects limitations in the quality of specific raw materials (also see Bradbury and Carr 2012; Gurtov and Eren 2014). Their experiments compared flake shape and efficiency (number of flakes made per original core mass per unit time) on quartzite and basalt bipolar flakes with specific reference to the Earlier and Middle Stone Age lithic records at Olduvai Gorge, Tanzania (cf. Díez Martin et al. 2009b). They found no significant shape differences between the quartzite and basalt flake samples. They did, however, find significant differences in the number of flakes made per original core mass per unit time with each of the raw materials. They interpret this result to mean that the association between quartzite and bipolar reduction at Olduvai Gorge can be explained by constraints and advantages of both quartz and basalt alike. Duke and Pargeter’s (2015) bipolar cobble splitting experiments demonstrated how efficient this strategy is for opening quartz cobbles for further freehand and bipolar reduction. Bipolar reduction's time-budgeting qualities would have created strong incentives for toolmakers to choose this strategy over others.

One problem with the energetics and efficiency arguments outlined above is that they contrast bipolar reduction experiments designed to replicate widely differing reduction goals. Díez Martin et al. (2011) and Gurtov and Eren (2014) were concerned with replicating flake shapes and sizes relevant to the Earlier and Middle Stone Ages in eastern Africa. Morgan et al. (2015) and Muller and Clarkson (2016) were simply concerned with trying to extract small flakes from small bipolar cores, while Duke and Pargeter (2015) tested questions relevant to larger quartz cobble flaking. Experimental outcomes differ depending on whether toolmakers produce small elongated flakes, produce larger flakes, initiate flaking on cobbles, or simply bash a cobble to produce flakes (Hiscock 2015). Because bipolar reduction is so variable in its application, it is particularly susceptible to variations in the resulting flakes, and comparative analyses and experiments need to account for this factor.

An experiment examining bipolar and freehand reduction in the context of lithic miniaturization

Our work built on these previous experimental studies with added emphasis on the costs and benefits of bipolar versus freehand reduction on flint and quartz (Pargeter and de la Peña 2017; Pargeter and Eren 2017). To clarify issues of reduction outcomes, it focused on a single reduction goal: the production of small elongated flakes. To minimize ambiguity around the type of bipolar reduction, we used a pure axial technique in which toolmakers aimed their blows directly down into the core. This reduction goal references broad processes of lithic miniaturization and bladelet production characteristic of the southern African Late Pleistocene and other regions of the world (Brantingham et al. 2004; Deacon 1984; Low and Mackay 2016; Pargeter 2016; Porraz et al. 2016). The experiments addressed two key questions:

  1. In contexts of lithic miniaturization, what benefits in time, energy, and efficiency does axial bipolar reduction provide over freehand reduction?

  2. Can we reliably distinguish the products of axial bipolar reduction from those produced with freehand reduction?

We collected a total of 40 nodules/cobbles of quartz and flint for each of the experiments (see Pargeter and de la Peña 2017; Pargeter and Eren 2017 for further details). We randomly assigned 20 nodules to each of the reduction experiments (axial bipolar reduction and freehand reduction). Paloma de la Peña and Metin Eren, two expert quartz and flint knappers respectively, made the experimental assemblages, while Justin Pargeter recorded the energetics data and analyzed the assemblages. We acknowledge at the outset that the effects of two different knappers may confound the effects of raw material variability in comparisons of quartz and flint. However, it is common for archeologists to compare results across experiments made by different knappers. Moreover, our experimental controls are designed to minimize the effects of individual knappers’ idiosyncratic behavior on our results. The following section provides details on these experimental design criteria.

The experiments’ overall goal was to reduce the cores until accidents (step and hinge fractures, obtuse flake angles) were insurmountable or until bipolar cores were small enough to endanger the knapper’s fingers. For freehand reduction, the initial preparation of the striking platform was done by simply removing a flake to create an appropriate flake release angle, taking into account the initial shape of the nodule. During reduction, the knapper placed their hand over the flake release surface in order to avoid, as much as possible, the formation of hinge and step fractures on the resulting flakes. This protective measure also ensured the production of as many elongated blanks as possible in keeping with the goal of the experiments. In the bipolar reduction experiments, knappers held the core with the long axis perpendicular to the anvil’s surface and then struck downwards into the core from directly above. This strategy involved immobilizing and compressing the core between the anvil and the hammer. Cores were rotated when necessary to ensure the striking platform remained viable.

We used a large plastic canvas sheet to recover all chips, chunks, and flakes from each core. Using a random number generator, we then selected 20 flakes, regardless of size, from each of the core debris bags. The total sample from each raw material amounted to 800 flakes (20 flakes per core, 400 flakes per reduction strategy) (Table 1). We then conducted attribute analysis and calculated cutting-edge measurements on the 1600 randomly selected flakes. All the cores were analyzed, but this paper focuses specifically on the flake patterns.
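
The sampling step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `sample_flakes`, the flake labels, and the bag size are hypothetical, and the authors' own random number generator and R code (see their supplementary materials) may differ.

```python
import random


def sample_flakes(flake_ids, k=20, seed=None):
    """Draw k flakes without replacement from one core's debris bag.

    Hypothetical helper; size is ignored, mirroring the
    'regardless of size' rule in the text.
    """
    if len(flake_ids) < k:
        raise ValueError("bag holds fewer flakes than requested")
    rng = random.Random(seed)
    return rng.sample(flake_ids, k)


# Hypothetical debris bag for one core (labels are NOT the study's data).
bag = [f"core01_flake{i:03d}" for i in range(1, 58)]
picked = sample_flakes(bag, k=20, seed=1)

# Sample-size arithmetic stated in the text.
flakes_per_core = 20
cores_per_strategy = 20
flakes_per_strategy = flakes_per_core * cores_per_strategy   # 400
flakes_per_material = 2 * flakes_per_strategy                # 800
total_flakes = 2 * flakes_per_material                       # 1600
```

Fixing a seed, as in the usage lines, is one way to make such a draw repeatable; whether the original sampling was seeded is not stated.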

Table 1 Overview of experimental sample and sample sizes

Results overview and background to the raw material comparisons

Our experiments produced some surprising results. On both raw materials, bipolar reduction yielded a greater amount of cutting edge per gram of rock than freehand reduction. As expected, it also reduced a significantly larger core volume. It achieved both goals in less time and with less energy expenditure than freehand reduction. Bipolar reduction also yielded more bladelets than freehand reduction. These results suggest bipolar reduction is a more effective and efficient strategy for lithic miniaturization than freehand reduction. Our results support the hypothesis that we should expect to find increased bipolar reduction in contexts where there were strong incentives to economize raw material in the context of lithic miniaturization.

This project identified a series of attributes to distinguish bipolar and freehand cores and flakes statistically. Some technological attributes agree with previous experimental observations (e.g. bulb and platform crushing). Other novel attributes included sheared bulbs (Fig. 1c, d), splintered flake terminations, and several core shape attributes. These attributes are relatively straightforward to identify and measure, and archeologists can use them to standardize their language in future bipolar reduction studies. The experiments replicated most “small” core subtypes in southern African Late Pleistocene typologies (e.g., flat bladelet cores, rice-grain cores, core-reduced pieces) without purposefully doing so (Pargeter and de la Peña 2017; Pargeter and Eren 2017). Our results showed that these core types are the products of axial bipolar reduction, not separate and distinct or specialized reduction strategies (e.g. wrapped bipolar reduction) as typically argued (e.g. Deacon 1984).

The project used statistically significant technological attributes to generate a series of multi-variate generalized linear predictive models for distinguishing bipolar cores and flakes. Applied to the combined bipolar and freehand experimental assemblage, the predictive models worked well (> 90% prediction accuracy rates) with sample sizes of c. 400 flakes and > 20 cores (Pargeter and Eren 2017). One would not expect predictive power to decline among larger samples in archeological contexts unless the technological contexts differed from our experimental framework.

Following from this brief background to the experiment, we now compare the quartz and flint experimental results with a specific focus on patterns of variation in the milky quartz and flint flake assemblages. Our comparison addresses several key aspects of flake variability holding flake production goals constant. First, our comparison focuses on the flake cutting edge to mass values generated in the two experiments to assess the degree to which raw material differences affect these values. Second, we compare flake fragmentation rates between the two experiments to test the long-held assumption that quartz bipolar reduction will result in greater flake fragmentation than non-quartz bipolar reduction. Third, we compare several technological attributes on the flakes’ ventral surfaces, distal ends, and platforms to assess the degree to which raw material differences affect the structure of these attributes. Fourth, we use ANOVAs to provide insight into the respective roles of raw material variability and technological strategy in flake shape variations. Following recent calls for greater transparency and reproducibility in research (e.g. Marwick 2017), we include our R code and raw data as supplementary materials (see Supplementary documents 1 and 2). The code files and datasets can also be found through our Open Science Framework project at DOI https://doi.org/10.17605/OSF.IO/38TSN.

We chose not to start with identical saw-cut quartz and flint blocks, but to sacrifice absolute control of block size and shape for a more actualistic framework (i.e., higher external validity, Eren et al. 2016) in which the knapper’s selection was allowed to constrain the reduction process. Although the selected nodules varied in size and shape, the range of blocks picked for the two experiments is comparable (Fig. 2). A one-way ANOVA showed statistically significant differences between the flint and quartz nodule form (flatness/elongation) values [F(1, 78) = 5.08, p = 0.027, d = 0.25] (Table 2). The ANOVA test effect size (the proportion of the variation in a response variable that is associated with membership of the different groups) is low (< 0.3, see d value above), suggesting that raw material accounts for a low overall amount of the variance in nodule form across the flint and quartz nodules. The flint and quartz nodule populations have close enough shape equivalence for inter-assemblage comparisons.
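
For readers who want to see how a one-way ANOVA and a "proportion of variation" effect size are obtained, the following pure-Python sketch computes the F statistic, its degrees of freedom, and eta-squared. It is an illustration only: the numbers are toy values rather than the nodule measurements, eta-squared is assumed here because it matches the verbal description of the effect size, and we have not confirmed which statistic or R function the authors used.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups."""
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (x - sum(g) / len(g)) ** 2 for g in groups for x in g
    )

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared


# Toy example with hypothetical form values for two raw materials.
toy_f, toy_df1, toy_df2, toy_eta = one_way_anova([[1, 2, 3], [2, 3, 4]])

# Two groups of 40 nodules give the (1, 78) degrees of freedom quoted above.
flint = [0.40 + 0.01 * i for i in range(40)]    # hypothetical values
quartz = [0.45 + 0.01 * i for i in range(40)]   # hypothetical values
_, df1, df2, _ = one_way_anova([flint, quartz])
```

The second call is included only to show where the degrees of freedom in F(1, 78) come from: two groups and 80 nodules in total.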

Fig. 2 Nodule shape and raw material comparisons

Table 2 ANOVA analysis output for the model comparing nodule form and raw material

To further avoid the confounding effects of size differences in our experimental flakes, we compared the cutting edge (mm) produced in terms of each flake’s mass. We follow the protocols set forth in Mackay (2008) and Brown (2011) to calculate the flake cutting edge to mass ratios on our two samples. Brown’s (2011) methodology corrects for the problem of negative linear correlation between flake cutting edge and flake size (cf. Dibble and Rezek 2009) by dividing cutting edge values by the cube root of mass, which results in a linear relationship between the two variables. Flakes were compared in terms of shape instead of size variables to further reduce the effects of core size differences on these results. We used the R statistical program (R Core Team 2013) for all statistical analyses.
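
The size correction described above can be illustrated with a short Python sketch. It assumes only what the text states, namely that cutting edge (mm) is divided by the cube root of mass (g); the function names and the two example flakes are hypothetical, and the full protocols in Mackay (2008) and Brown (2011) include measurement details not reproduced here.

```python
def edge_per_gram(cutting_edge_mm, mass_g):
    """Uncorrected ratio: millimetres of cutting edge per gram."""
    return cutting_edge_mm / mass_g


def edge_per_cube_root_mass(cutting_edge_mm, mass_g):
    """Size-corrected ratio: cutting edge divided by the cube root of mass."""
    if mass_g <= 0:
        raise ValueError("mass must be positive")
    return cutting_edge_mm / mass_g ** (1.0 / 3.0)


# Two hypothetical flakes of identical shape; the second is twice as large
# in every linear dimension, so its edge doubles and its mass grows 8-fold.
small = {"edge_mm": 30.0, "mass_g": 1.0}
large = {"edge_mm": 60.0, "mass_g": 8.0}

raw_small = edge_per_gram(small["edge_mm"], small["mass_g"])
raw_large = edge_per_gram(large["edge_mm"], large["mass_g"])
corrected_small = edge_per_cube_root_mass(small["edge_mm"], small["mass_g"])
corrected_large = edge_per_cube_root_mass(large["edge_mm"], large["mass_g"])
```

In this toy case the uncorrected ratio penalizes the larger flake even though the two flakes have the same shape, while the cube-root version returns essentially the same value for both. That is the motivation for comparing flakes on the corrected measure.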

Results

Figure 3 depicts the flake cutting edge to mass values for the milky quartz and flint bipolar and freehand flake assemblages. These data exhibit marked differences with respect to their location, scale, and shape. Bipolar and freehand milky quartz flakes show lower cutting edge to mass values compared with the flint bipolar and freehand flakes. A two-way ANOVA comparing cutting edge to mass across raw material, technological strategy, and their interaction shows significant differences (Table 3). Raw material [F(1, 698) = 161.19, p < 0.0001, d = 0.48], technological strategy [F(1, 698) = 9.665, p = 0.002, d = 0.12], and their interaction [F(1, 698) = 15.88, p < 0.0001, d = 0.15] all influence cutting edge to mass production. This result suggests that raw material's effect on the flake cutting edge to mass ratio is contingent on the knapping strategy. In our case, freehand reduction produced greater flake cutting edge to mass ratios than bipolar reduction on flint flakes but not on quartz flakes.

Fig. 3 Flake cutting edge to mass compared across raw materials

Table 3 ANOVA analysis output for the model to determine the relative influences of technological strategy and raw material variability on cutting edge to mass production

Flake fragmentation

Overall, we find a difference between the flake fragmentation patterns in the bipolar and freehand flake samples (Fig. 4). A chi-square test shows significant differences between the two samples, with bipolar flakes showing greater complete flake and non-identifiable flake fragment counts (X2 [1, N = 1600] = 10.67, p = 0.001, phi = 0.01). The effect size statistic (phi = 0.01) suggests a low practical significance to this outcome. Figure 4 shows the complete flake and flake fragment counts. Complete flakes include flakes with proximal, mesial, and distal flake portions, while flake fragments include flake elements. When analyzing quartz assemblages, some archeologists refer to these fragmented elements as “block or shatter” (e.g. Bernstein et al. 1993). These data show several differences driven by raw material variations within the bipolar and freehand flake classes. Flint bipolar flakes show higher complete flake counts than quartz bipolar flakes. A chi-square test with a modified alpha level (i.e., Bonferroni correction) to adjust for multiple testing and reduce type-I error shows significant differences in the milky quartz and flint bipolar flake fragmentation categories (X2 [1, N = 800] = 56.09, p < 0.0001, phi = 0.18) with low practical significance (phi = 0.18). The flint freehand flakes show higher complete flake counts compared to the milky quartz freehand flakes (X2 [2, N = 800] = 73.31, p < 0.0001, phi = 0.3). The effect size statistic (phi = 0.3) suggests a medium practical significance for this outcome.

Fig. 4 Flake completeness counts (counts provided atop bars) by raw material and reduction strategy

Technological attributes

The bipolar and freehand flake samples show marked differences in the frequency of various technological attributes on the flake platforms, ventral surfaces, and distal ends. Figure 5 shows the count of crushed and shaped (i.e., platforms with an identifiable morphology) platforms relative to raw material and technological strategy (see Figs. 5 and 1e, h). The bipolar flakes show statistically significant differences (X2 [1, N = 566] = 154.67, p < 0.0001, phi = 0.52), with the quartz sample showing more crushed platforms and fewer platforms with an identifiable shape (quadrangular, rectilinear, and punctiform platforms). This effect size statistic is high (phi = 0.52). The quartz freehand flakes show non-significant differences (X2 [1, N = 527] = 1.223, p = 0.668, phi = 0.04), with the flint flakes showing higher shaped platform counts.
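
The relationship between a chi-square statistic and the phi effect size can be sketched as follows. This is a minimal illustration assuming the common definition phi = sqrt(X2 / N) for a two-by-two table and a Pearson chi-square without continuity correction; we have not confirmed that the authors' R code used exactly these choices, and the contingency counts below are hypothetical, not the study's data.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table of counts (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (table[i][j] - expected) ** 2 / expected
    return chi_square, n


def phi_from_chi_square(chi_square, n):
    """Phi effect size, assuming phi = sqrt(chi-square / N)."""
    return math.sqrt(chi_square / n)


# Hypothetical counts: rows = raw material, columns = crushed / shaped platform.
toy_chi, toy_n = chi_square_2x2([[10, 20], [20, 10]])
toy_phi = phi_from_chi_square(toy_chi, toy_n)

# Applying the same formula to the bipolar platform statistic quoted above.
phi_bipolar_platforms = phi_from_chi_square(154.67, 566)
```

Under this assumed formula, the bipolar platform comparison (X2 = 154.67, N = 566) gives a phi of about 0.52, in line with the reported value.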

Fig. 5 Flake platform morphology counts (counts provided atop bars) by raw material and reduction strategy

Figure 6 depicts the counts of ventral surface features on the bipolar and freehand flakes by raw material. The “other” category depicted in this figure includes identifiable bulbs of force and/or éraillure scars. The milky quartz bipolar flakes show significant differences in their higher ventral surface crushing and shearing (Fig. 1c–e) and lower frequencies of identifiable Hertzian cones (X2 [2, N = 566] = 224.18, p < 0.0001, phi = 0.62) (see Figs. 6 and 1). This effect size is large (phi = 0.62). The flint bipolar flakes, on the other hand, are more likely to show bulbs of force and éraillure scars. The milky quartz freehand flakes also show significant differences (X2 [2, N = 566] = 25.87, p < 0.0001, phi = 0.22) driven largely by their higher incidence of ventral surface crushing (Fig. 1e) and lower bulb counts. The practical significance of this result is low (phi = 0.22).

Fig. 6 Flake ventral surface feature counts (counts provided atop bars) by raw material and reduction strategy

The final technological attribute we examined was the presence or absence of rebound force scars on the distal ends of the flakes. Rebound force scars should occur more often on bipolar flakes as their distal ends come into contact with an anvil during the reduction process. The data depicted in Fig. 7 show higher overall rebound force scar counts in the bipolar flake sample, supporting this assumption. However, the counts do differ by raw material type. The flint bipolar flakes show significantly more elements with no rebound force scars compared to the milky quartz flakes (X2 [1, N = 588] = 51.82, p < 0.0001, phi = 0.29). This result has a medium to low practical significance (phi = 0.29). The milky quartz and flint freehand flakes both show similar and non-significant patterns of rebound force scarring (X2 [1, N = 588] = 4.06, p = 0.0436, phi = 0.08).

Fig. 7 Flake rebound force scar counts (counts provided atop bars) by raw material and reduction strategy

Flake shape, technological strategy, and raw material variability

An ANOVA was conducted to determine the relative influences of technological strategy and raw material variability in flake shape variation. We used flake form (elongation/flatness) as the outcome variable and technological strategy (bipolar or freehand) and raw material (quartz or flint) as the independent variables. The model includes an interaction term to describe the relationship between raw material type and technological strategy. We log-transformed the flake shape variable prior to the analysis to ensure it did not deviate from the ANOVA assumption of normality.
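
The model structure described above (log-transformed outcome, two factors, and their interaction) can be sketched in pure Python for the simplest case of equal cell sizes. This is a hedged illustration rather than the authors' analysis: the function `two_way_anova_balanced` and the data are hypothetical, the balanced-design formulas below do not apply directly to unequal cell sizes such as those in the study, and the actual R code is in the supplementary materials.

```python
import math


def two_way_anova_balanced(cells):
    """F statistics for a balanced two-way ANOVA with interaction.

    cells maps (raw_material, strategy) -> list of n observations,
    with the same n (at least 2) in every cell.
    """
    materials = sorted({m for m, _ in cells})
    strategies = sorted({s for _, s in cells})
    n = len(next(iter(cells.values())))
    if n < 2 or any(len(v) != n for v in cells.values()):
        raise ValueError("need equal cell sizes of at least 2")
    if len(cells) != len(materials) * len(strategies):
        raise ValueError("every material/strategy combination needs a cell")

    def mean(xs):
        return sum(xs) / len(xs)

    grand = mean([x for v in cells.values() for x in v])
    m_mean = {m: mean([x for s in strategies for x in cells[(m, s)]])
              for m in materials}
    s_mean = {s: mean([x for m in materials for x in cells[(m, s)]])
              for s in strategies}
    c_mean = {key: mean(v) for key, v in cells.items()}

    # Sums of squares for each main effect, the interaction, and the error.
    ss_m = len(strategies) * n * sum((m_mean[m] - grand) ** 2 for m in materials)
    ss_s = len(materials) * n * sum((s_mean[s] - grand) ** 2 for s in strategies)
    ss_int = n * sum(
        (c_mean[(m, s)] - m_mean[m] - s_mean[s] + grand) ** 2
        for m in materials for s in strategies
    )
    ss_err = sum((x - c_mean[key]) ** 2 for key, v in cells.items() for x in v)

    df_m = len(materials) - 1
    df_s = len(strategies) - 1
    df_err = len(cells) * (n - 1)
    ms_err = ss_err / df_err
    return {
        "raw_material": (ss_m / df_m) / ms_err,
        "strategy": (ss_s / df_s) / ms_err,
        "interaction": (ss_int / (df_m * df_s)) / ms_err,
        "df_error": df_err,
    }


# Hypothetical raw flake-form values, chosen so the logged data are additive.
raw_cells = {
    ("flint", "freehand"): [math.exp(v) for v in (-1, 0, 1)],
    ("flint", "bipolar"): [math.exp(v) for v in (0, 1, 2)],
    ("quartz", "freehand"): [math.exp(v) for v in (1, 2, 3)],
    ("quartz", "bipolar"): [math.exp(v) for v in (2, 3, 4)],
}

# Log-transform the outcome before fitting, as described in the text.
logged_cells = {key: [math.log(x) for x in v] for key, v in raw_cells.items()}
f_stats = two_way_anova_balanced(logged_cells)
```

In this toy data set the two factors act additively on the log scale, so the interaction F is effectively zero while both main effects are non-zero, which is the same qualitative pattern as a non-significant interaction alongside significant main effects.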

Table 4 presents the ANOVA model outputs. The two-way ANOVA yielded a main effect for raw material type, F(1, 1091) = 23.65, p < 0.0001, such that the average flake form was significantly different for quartz (median = 0.58; range: 0.06–3.62) compared to flint (mean = 0.44; range: 0.01–3.90). The main effect of flaking condition was also significant, F(1, 1091) = 14.09, p = 0.0008, with the average flake form significantly different for bipolar (median = 0.53; range: 0.01–3.9) compared to freehand flakes (mean = 0.44; range: 0.06–2.9) (Fig. 8). The effect sizes for flaking condition (0.1) and raw material (0.14) are, however, both small. The interaction effect between raw material and flaking strategy was not significant, F(1, 1091) = 1.75, p = 0.18, indicating that variations in flake form are adequately explained by the separate main effects of raw material type and flaking strategy.
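
The structure of such a model can be illustrated with a short sketch. The code below is a minimal, pure-Python two-way ANOVA with interaction for a balanced 2 × 2 design on log-transformed values. It is an illustration under stated assumptions, not the analysis reported here: the flake form values are invented, the function name `two_way_anova` is hypothetical, and the balanced-design formulas would not apply directly if the real sample is unbalanced, in which case statistical software typically uses adjusted sums of squares.

```python
from math import log
from statistics import mean


def two_way_anova(cells):
    """Two-way ANOVA with interaction for a balanced design.

    cells maps (material, strategy) -> list of outcome values.
    Returns sums of squares, degrees of freedom, and F ratios.
    """
    materials = sorted({m for m, _ in cells})
    strategies = sorted({s for _, s in cells})
    n = len(next(iter(cells.values())))
    if any(len(values) != n for values in cells.values()):
        raise ValueError("this sketch only handles balanced designs")

    grand = mean(x for values in cells.values() for x in values)
    m_mean = {m: mean(x for s in strategies for x in cells[(m, s)])
              for m in materials}
    s_mean = {s: mean(x for m in materials for x in cells[(m, s)])
              for s in strategies}
    c_mean = {key: mean(values) for key, values in cells.items()}

    ss = {
        "material": n * len(strategies)
        * sum((m_mean[m] - grand) ** 2 for m in materials),
        "strategy": n * len(materials)
        * sum((s_mean[s] - grand) ** 2 for s in strategies),
        "interaction": n * sum(
            (c_mean[(m, s)] - m_mean[m] - s_mean[s] + grand) ** 2
            for m in materials for s in strategies),
        "error": sum((x - c_mean[key]) ** 2
                     for key, values in cells.items() for x in values),
    }
    df = {
        "material": len(materials) - 1,
        "strategy": len(strategies) - 1,
        "interaction": (len(materials) - 1) * (len(strategies) - 1),
        "error": len(materials) * len(strategies) * (n - 1),
    }
    ms_error = ss["error"] / df["error"]
    f = {effect: (ss[effect] / df[effect]) / ms_error
         for effect in ("material", "strategy", "interaction")}
    return ss, df, f


# Invented flake form values (elongation/flatness); NOT the study data.
raw = {
    ("quartz", "bipolar"): [0.62, 0.55, 0.70, 0.48],
    ("quartz", "freehand"): [0.51, 0.44, 0.58, 0.47],
    ("flint", "bipolar"): [0.46, 0.52, 0.39, 0.50],
    ("flint", "freehand"): [0.37, 0.41, 0.45, 0.33],
}
# Log-transform the outcome before fitting, as described in the text.
cells = {key: [log(x) for x in values] for key, values in raw.items()}

ss, df, f = two_way_anova(cells)
for effect, value in f.items():
    print(f"{effect}: F({df[effect]}, {df['error']}) = {value:.2f}")
```

Each F ratio is the effect's mean square divided by the error mean square. With two levels per factor, each effect has one degree of freedom, which matches the F(1, ...) form of the results reported above.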

Table 4 ANOVA output for the model used to determine the relative influences of technological strategy and raw material variability on flake shape variation. Flake form (elongation/flatness) is the outcome variable; technological strategy (bipolar or freehand), raw material (quartz or flint), and their interaction are the independent variables
Fig. 8

Flake shape comparisons by raw material and technology

Discussion

Before discussing the implications of our present experiment, we must acknowledge some limitations of the data generated in this paper. At present, our sample of knappers (n = 2) is small and will not reflect the total amount of variance in the archeological record. Moreover, because our experiment focused on the difference between bipolar and freehand reduction, we did not explicitly test the effects of knapper variability on our results. We did, however, attempt to minimize these differences by using a standardized replication framework and explicit protocols. Our approach follows other experimental studies (e.g. Jennings et al. 2010; Shott 1994; Shott et al. 2000) that compare knapped debris and cores made by several knappers to gain insight into other variables, such as fracture mechanics, raw material differences, or technological organization. In spite of these limitations, we offer several novel insights into the role of simple, but not simplistic, technological strategies in prehistoric human decision-making. Pending further experiments and more detailed comparisons with archeological assemblages, our findings are best considered working hypotheses.

Our experimental results provide the opportunity to revisit several long-standing questions in research on bipolar technology. These include the extent to which flake efficiency measures (cutting edge to mass), technological stigmata, fragmentation rates, and flake shapes are affected by differences in flaking strategies and raw material variability. Our experimental protocols controlled for knapping goals, allowing us to assess these questions irrespective of knapping motives. Our quartz and flint experimental nodules showed similar patterns of variation, allowing us to contrast the flake products on these two rock types.

Our cutting edge to mass comparisons show that flint bipolar and freehand flakes have significantly higher cutting-edge production values than the quartz bipolar and freehand flakes. However, this outcome had greater practical significance (a higher effect size score) in the freehand flakes than in the bipolar flakes. This result suggests that when flaking efficiency is the primary production goal (which it need not always be), toolmakers are better off deploying a freehand strategy on flint than on quartz (e.g. Morgan et al. 2015). The result also suggests that the two raw materials’ cutting edge to mass outcomes are more likely to converge under bipolar reduction.

Our bipolar reduction efficiency results speak to the broader issue of flaking goals and flaking strategies. Our previous experimental work, as well as that of colleagues addressing similar questions (e.g. Flenniken 1981; Morgan et al. 2015), shows that bipolar reduction outcomes differ depending on whether toolmakers produce small elongated flakes or simply bash a cobble to produce flakes. We designed this project’s experiments to produce small, elongated flakes and to simulate reduction scenarios relevant to the Late Pleistocene in southern Africa. Given the well-known economic advantages of bladelet production (Eren et al. 2008), it is no surprise that when bipolar experiments are designed to produce larger, longer, wider flakes, they fall short of this project’s reduction efficiency and economy measures. This discrepancy may well explain the dissimilarities between our and other bipolar reduction experiments. It is also logical to hypothesize that these different bipolar modalities affect the qualitative and quantitative characteristics of the assemblages produced. The variation among the different bipolar experiments discussed here might be mediated by these different reduction goals. Further research comparing different types of bipolar reduction while controlling for raw material and flaking goals is warranted.

Comparisons of the flake fragmentation patterns show that the flint and quartz bipolar flakes have statistically distinguishable counts of complete flakes and flake fragments, but with low practical effect sizes. Overall, the bipolar flakes show higher complete flake counts than the freehand flakes. The freehand versus bipolar flake fragmentation data partially support two surprising, but commonly discussed, aspects of bipolar reduction. The first is that when bipolar reduction is used as a controlled reduction strategy, it leads to greater complete flake production (e.g., Flenniken 1981). Our data show that this is the case for quartz flakes, but not for flint flakes, which instead showed similar complete flake frequencies in the bipolar and freehand samples. The second is that axial bipolar reduction can produce relatively thin and intact flakes, as well as flakes of even thickness (Manninen 2016). These may have been two of the many factors motivating toolmakers to employ bipolar reduction as a strategy for lithic miniaturization. Raw material comparisons support the notion that flint and quartz bipolar and freehand reduction differ in their fragmentation patterns. However, this result has greater practical significance for our freehand sample, in which quartz showed higher flake fragment counts. Regardless, it is possible that archeologists underplay the importance of flake fragments in prehistoric toolkits. For example, Knutsson et al. (2015) conducted use-wear analysis on a large (n = 544) sample of Scandinavian Mesolithic and Neolithic shattered quartz bipolar flake fragments. Their data show that more than 100 of these flake fragments were used.

Our previous analyses (Pargeter and de la Peña 2017; Pargeter and Eren 2017) outlined in detail the differences in technological attribute states between our quartz and flint bipolar and freehand flakes. The results showed clear quantitative and qualitative differences between these two flaking strategies on the two raw materials. Following calls by de la Peña (2015b), we argue that quantitative comparisons of these attribute states show differences structured by raw material type. The bipolar and freehand flakes show different platform, ventral surface, and distal termination configurations depending on the raw material considered. The effect size values suggest relatively high practical significance for these differences. Our results show that the application of attribute-based approaches to identifying bipolar reduction needs to account for this degree of raw material variability.

Lastly, our linear model showed the complex and subtle contributions of technology and raw material variability to the form of bipolar and freehand flakes. These results speak to previous findings by Gurtov and Eren (2014) and Gurtov et al. (2015) suggesting that raw material variability plays a minor role in determining bipolar flake shape. That conclusion is, in turn, broadly consistent with a wide array of experimental and archeological findings suggesting that raw material is rarely the predominant source of variation in stone tool morphology (Bar-Yosef et al. 2012; see also Clarkson 2010; Costa 2010; Eren et al. 2011b, 2014 and references therein). Our results show that while raw material variability and technological strategy both contribute to variations in flake forms, they do so at a low level of practical significance (low overall effect size statistics). The data suggest that subtle variations in raw material morphology may govern flake form differences, but that larger factors are also responsible. Our alternative hypothesis is that factors other than raw material, namely flaking goals and considerations of production efficiency, are more important when considering variations in the form of bipolar and freehand flakes.

These results also provide interesting insights into variation attributed to individual knappers. One would intuitively expect large differences in the products from two knappers reducing two different raw materials. Yet this is not, on average, what our results show. We found that most of the significant differences between our flint and quartz flakes were of low practical value. Those that showed medium to high effect sizes (i.e., ventral surface features and platform types) are most likely linked to differences in the structural qualities of the two raw materials. Tallavaara et al. (2010) argue that increased toolmaker skill will decrease variation between individual knappers. While this observation is almost certainly task dependent, the fact that our two knappers are both experienced probably contributed to the lack of overall variance between the two experiments. Moreover, we hypothesize that bipolar reduction requires considerably less skill than freehand reduction for small bladelet production. Thus, while some of the significant differences we found between flint and quartz bipolar reduction might be attributed to differences in toolmaker skill (Bleed 2008; Eren et al. 2011a; Morgan et al. 2015; Stout 2002), experience (Eren et al. 2011b), or slight variations in technique or tools (e.g. Schillinger et al. 2015, 2017), all of this means that raw material’s contribution to bipolar flake shape is in all likelihood even smaller than the pittance it already seems to be supplying. That said, future work should certainly focus on controlling for the knapper by having one knapper work with multiple raw materials.

Conclusions

This project established the first controlled experimental baseline for lithic miniaturization production variability in southern Africa. Its experiments build on a foundation of clear protocols and production goals. These two elements were critical for the bipolar reduction experiments. The experimental outcomes would have differed had our goal been simply to bash rocks. Here we compared the efficiency and technological attributes of the experiment’s milky quartz and flint flakes. The results show few practically significant differences between the two flake samples. The strongest differences are in the flakes’ ventral surface and platform features. Otherwise, we concur with previous experimental studies showing that certain types of milky quartz behave in essentially the same way as other brittle materials such as flint. Although our experimental results were not expected to explain the use of bipolar reduction or lithic miniaturization everywhere, because their production goals were explicit, the results speak to wider cases where toolmakers used bipolar and freehand reduction to make bladelets on quartz and flint, such as in northern Asia, eastern Africa, and Europe.