Introduction

The interpretation of palynological investigations into the nature of anthropogenic land-use can be greatly influenced by the presence or absence of putative cereal pollen. This is particularly so for studies aimed at identifying and tracking the onset, development and character of farming within a region during prehistory. In such situations, supporting macrofossil and archaeological evidence is often incomplete (and can be entirely absent), thus placing great importance on the careful identification and interpretation of any grass pollen grains that are located. The separation of the pollen of cultivated cereal crops from that of naturally occurring wild grasses, although notoriously difficult, is thus of considerable importance to the interpretation of the Holocene pollen record.

It is unsurprising that the identification of pollen within the family Poaceae has received considerable attention from palynologists. In one of the earliest studies, Firbas (1937) determined that domesticated grass pollen is generally larger than that of wild relatives. Since then, comparative studies of the dimensions and features of modern pollen populations have been completed by a number of authors, who suggest that with care it may be possible to distinguish between many wild taxa and the cultivated genera Avena, Hordeum, Secale and Triticum (e.g. Grohne 1957; Beug 1961, 2004; Andersen 1979; Vorren 1986; Küster 1988; Faegri and Iversen 1989). All plant nomenclature follows BSBI (2004).

Andersen (1979) suggested that Poaceae pollen could be separated into four groups, primarily on the basis of annulus diameter and mean grain size, but also taking into account grain shape and surface sculpturing. Secale cereale can often be identified to species level on the basis of these characteristics (due to its prolate grain shape), but identification of other cultivated taxa is less certain as variable levels of overlap exist between the features that he studied. For example, Hordeum vulgare and Triticum monococcum pollen (contained within Andersen’s Hordeum group) cannot usually be separated from a number of wild species including the wetland genus Glyceria, and cultivated Avena and Triticum species cannot be distinguished from one another or the arable weed Avena fatua.

Other investigations have incorporated measurements of pore location and diameter, and the size and form of the annulus. Beug (1961, 2004) builds upon the earlier work of Grohne (1957) to propose that most wild grass pollen can be separated from cereal pollen via its smaller pore diameter, annulus width and annulus protrusion. The annulus of cereals also has a more distinct outer boundary. Further differentiation within this latter group can be obtained by measuring grain shape, pore location and most importantly the pattern of columellae observed under high magnification and phase contrast (see especially Beug 2004). Using these features many, but not all, wild grasses can be separated from domesticated species, and pollen from the genera Avena, Hordeum and Triticum can be distinguished. Küster (1988) incorporates the conclusions of Beug (1961) into his preliminary key. He suggests that pore diameter, the ratio of annulus diameter to pore diameter, and the thickness and appearance of the annulus can be used to separate Poaceae pollen of >40 μm grain size into five groups, four of which are applicable to most European studies. Cereal pollen has an annulus that protrudes significantly in optical section and has a sharp outer boundary, and a relatively large annulus diameter compared to pore size. Importantly, the approaches adopted by both Beug and Küster allow Glyceria pollen to be confidently identified in most situations (see also Vorren 1986).

Collation of the above studies suggests that Secale cereale is the most confidently identifiable European species, usually having a prolate grain shape (Andersen 1979) and an eccentrically positioned pore (Beug 1961), although caution is still required (O’Connell et al. 1999). Conversely, whilst pollen from the genera Avena, Hordeum and Triticum can often be separated (most securely via surface sculpture), these types are incorporated into groups that contain at least one undomesticated species. This is regardless of which of the above methodologies is employed (although note that extremely careful observations of well-preserved pollen may allow further refinement; see Beug 2004).

This presents obvious interpretational problems to palaeoecologists studying Holocene vegetation change and positive identification is further confounded by the difficulties encountered when determining the surface pattern of Poaceae grains recovered from palaeoecological samples. The differences between the three patterns of columellae used by Beug (2004) as an aid to differentiate between the key pollen types Avena-, Hordeum- and Triticum-type are relatively small, and in our experience can be extremely difficult to separate confidently in pollen grains that are not perfectly preserved, or where uptake of staining reagent has been poor.

In this paper we investigate the separation of sub-fossil grass pollen assemblages using a large Holocene dataset obtained from a series of profiles in lowland Yorkshire, England. As it was not felt possible to consistently identify fine surface detail for the majority of the Poaceae grains within the dataset (even under phase contrast), we use an investigative approach to consider the utility of morphological features that are more readily quantifiable. We classify all Poaceae grains within the dataset using the keys of Andersen (1979) and Küster (1988). The identifications suggested by these approaches are then compared, and we investigate whether combining elements of the two keys and employing multivariate statistical techniques can improve confidence in identification. Finally, the implications of the above findings are discussed in terms of the detection and interpretation of episodes of vegetational disturbance within lowland eastern Yorkshire.

Materials and methods

Source material and laboratory methods

The sub-fossil pollen dataset originates from the Holocene sections of sediment profiles from four sites located in central Holderness, eastern Yorkshire (The Bog at Roos – 0°4′11″W, 53°44′24″N; Cess Dell – 0°5′14″W, 53°49′16″N; Gilderson Marr – 0°1′42″W, 53°46′40″N; Sproatley Bog – 0°10′14″W, 53°47′30″N; Fig. 1). The sites are all located below ca. 15 m OD and are small (ca. 2–4 ha) infilled basins that probably originated as steep-sided kettleholes formed during ice downwasting towards the end of the last glaciation. Organic detrital muds and peats characterise the Holocene sedimentary records (Tweddle 2000). To facilitate interpretation of the data, relevant summary percentage pollen diagrams for the four sampling sites are provided in Figs. 2–5. The pollen diagrams are based on percentages of total land pollen (TLP), where TLP includes trees, shrubs, heaths and herbs, but excludes spores and obligate aquatic species. The exception to this is the diagram for Cess Dell (Fig. 3), where extremely high pollen frequencies for Alnus glutinosa necessitated the exclusion of the taxon from the calculation of TLP. Percentages in Fig. 3 are thus based upon TLP excluding Alnus. A minimum of 500 land pollen grains were counted from each Holocene sample, although an additional constraint of a minimum of 300 TLP excluding Alnus was employed for data from Cess Dell.
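The percentage calculation described above can be sketched as follows. This is a minimal illustration only: the function name `percent_tlp`, the taxon list and all counts are hypothetical, and the input is assumed to contain land pollen only (spores and obligate aquatics already removed).

```python
def percent_tlp(counts, taxon, excluded=()):
    """Percentage of `taxon` relative to the total land pollen (TLP) sum.

    counts   -- dict of taxon name -> grains counted (land pollen only)
    excluded -- taxa left out of the TLP sum (e.g. Alnus at Cess Dell)
    """
    tlp = sum(n for name, n in counts.items() if name not in excluded)
    if tlp == 0:
        raise ValueError("TLP sum is zero")
    return 100.0 * counts[taxon] / tlp


# Hypothetical counts for a single Alnus-dominated sample.
sample = {
    "Alnus glutinosa": 600,
    "Corylus avellana-type": 150,
    "Quercus": 100,
    "Poaceae": 50,
}

poaceae_tlp = percent_tlp(sample, "Poaceae")
poaceae_tlp_excl_alnus = percent_tlp(sample, "Poaceae", excluded=("Alnus glutinosa",))
print(round(poaceae_tlp, 1), round(poaceae_tlp_excl_alnus, 1))
```

In this invented sample the Poaceae percentage triples once Alnus is removed from the sum, which illustrates why a very abundant local taxon can suppress the apparent frequency of every other type.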

Fig. 1

Map showing the location of study sites: 1, Cess Dell; 2, Sproatley Bog; 3, The Bog at Roos; 4, Gilderson Marr. The inset map indicates the location of Holderness and adjacent areas of eastern Yorkshire in Britain

Fig. 2

Pollen diagram (% TLP) for Gilderson Marr showing Poaceae pollen of >37 μm grain size, major pollen types and a summary of the main plant groups. ‘+’ indicates <2% TLP

Fig. 3

Pollen diagram (% TLP excluding Alnus) for Cess Dell showing Poaceae pollen of >37 μm grain size, major pollen types and a summary of the main plant groups. ‘+’ indicates <2% TLP

Fig. 4

a Pollen diagram (% TLP) for The Bog at Roos (1–152 cm) showing Poaceae pollen of >37 μm grain size, major pollen types and a summary of the main plant groups. ‘+’ indicates <2% TLP. b Pollen diagram (% TLP) for The Bog at Roos (154–616 cm) showing Poaceae pollen of >37 μm grain size, major pollen types and a summary of the main plant groups. ‘+’ indicates <2% TLP

Fig. 5

a Pollen diagram (% TLP) for Sproatley Bog (48–164 cm) showing Poaceae pollen of >37 μm grain size, major pollen types and a summary of the main plant groups. ‘+’ indicates <2% TLP. b Pollen diagram (% TLP) for Sproatley Bog (168–496 cm) showing Poaceae pollen of >37 μm grain size, major pollen types and a summary of the main plant groups. ‘+’ indicates <2% TLP

Sample preparation methodology was chosen to provide optimal comparability with other studies, whilst minimising the risk of biasing results through the artificial alteration of pollen grain size (Faegri and Iversen 1989; Moore et al. 1991; Mäkelä 1996). Samples of wet sediment were prepared using standard techniques (Faegri and Iversen 1989), although a reduced acetolysis duration of 2 min was employed. Preparations were mounted in silicone oil of 12,500 cSt viscosity and all measurements were made within 2 weeks of sample preparation. Size measurements were carried out to the nearest 1 μm at ×1000 magnification under oil immersion. By carefully standardising the preparation technique, it was hoped to minimise any artificial changes in pollen grain size resulting from slight variations in preparation protocol.

A series of attributes were measured from all Poaceae grains that had a longest axis of at least 37 μm, and an annulus diameter of 8 μm or greater (Table 1). As previously stated, it was not felt possible to reliably identify fine surface detail for the majority of the Poaceae grains located. For this reason we have concentrated on the measurement of more readily quantifiable morphological features. The characteristics listed in Table 1 were chosen to provide a representative range of the types of feature measurable using standard laboratory techniques, and to satisfy the requirements of the keys of Andersen (1979) and Küster (1988). To facilitate future comparison with datasets compiled by different researchers, emphasis was placed on the use of objective measurements and their derivatives. An exception to this is annulus boundary type, which is a subjective classification but is integral to the key of Küster (1988). Andersen (1978, 1979) recommends that grain sizes are standardised by comparison with the diameter of Corylus grains located within the same samples; this is intended to negate the effects that differing preservation conditions can potentially have on grain size. Although desirable, the low levels of Corylus avellana-type pollen within the upper parts of the profiles from Sproatley Bog and The Bog at Roos (from which a large proportion of the dataset originates) prohibited such standardisation. Whilst it is theoretically possible therefore that some of the mean grain size variation evident may be an artefact of variation in preservation conditions, there is nothing to suggest that any of the other measurements (including annulus size; Andersen 1979) may have been affected in a similar way.

Table 1 Characteristics noted from Poaceae grains exceeding a longest axis length of 37 μm
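The derived measurements referred to above can be illustrated with a short sketch. The definitions used here are assumptions rather than quotations from Table 1: mean grain size is taken as the average of the longest axis and the axis at 90° to it, and pollen index as their ratio. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Grain:
    """Measurements for one Poaceae grain, all in micrometres (silicone oil)."""
    longest_axis: float
    perpendicular_axis: float   # axis at 90 degrees to the longest axis
    annulus_diameter: float
    pore_diameter: float

    @property
    def mean_size(self) -> float:
        # Assumed definition: average of the two grain axes.
        return (self.longest_axis + self.perpendicular_axis) / 2

    @property
    def pollen_index(self) -> float:
        # Assumed definition: longest axis divided by the perpendicular axis.
        return self.longest_axis / self.perpendicular_axis

    @property
    def annulus_pore_ratio(self) -> float:
        return self.annulus_diameter / self.pore_diameter


def is_measured(grain: Grain) -> bool:
    """Inclusion rule from the text: longest axis >= 37 and annulus >= 8."""
    return grain.longest_axis >= 37 and grain.annulus_diameter >= 8


# Hypothetical grains, for illustration only.
large = Grain(longest_axis=48, perpendicular_axis=40, annulus_diameter=10, pore_diameter=4)
small = Grain(longest_axis=36, perpendicular_axis=30, annulus_diameter=9, pore_diameter=3)
print(large.mean_size, large.pollen_index, large.annulus_pore_ratio, is_measured(large))
print(is_measured(small))
```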

Data analysis

All grass pollen grains within the dataset were classified using the keys of Andersen (1979) and Küster (1988). The former key was chosen as it incorporates measurements of annulus diameter, mean grain size and pollen index (see Tables 1 and 2 for details and explanation of terms). Whilst surface pattern was also included by Andersen as a supporting aid for his classification, it does not form an essential part of the key and its exclusion within this study should not invalidate our results. The key of Küster (1988) was employed because it represents a significantly different method for classifying Poaceae pollen and incorporates many of the conclusions of earlier workers (e.g. Grohne 1957; Beug 1961, 2004). Unlike the key of Andersen, this approach separates Poaceae pollen of >40 μm grain size (in glycerol; see below) into a number of groups solely on the basis of annulus and pore characteristics (Table 3) and thus serves as an interesting contrast.

Table 2 The pollen key of Andersen (1979). Cultivated species are shown in bold and names as originally cited are presented in parentheses
Table 3 The pollen key of Küster (1988). Cultivated taxa are shown in bold

Pollen grains mounted in glycerol are known to be larger than comparable grains mounted in silicone oil (Faegri and Iversen 1989; Andersen 1979; Moore et al. 1991). As the key of Küster refers to the dimensions of pollen grains mounted in glycerol (Küster pers. comm.), it was necessary to apply a conversion factor before identifications could be directly compared with the data of Andersen. Faegri and Iversen (1989) report that mean grain sizes increase by factors of between ×1.1 and ×1.3 when mounted in glycerol (or glycerine jelly) in comparison with silicone oil, and that pore sizes increase by between ×1.1 and ×1.5. Although we accept that this is an estimate, we reduced the dimensions cited in Küster’s key by the mid-points of the above ranges in order to allow meaningful comparison with the data obtained in our study (see Table 3). Hence, the grain size limit of 40 μm was reduced by a factor of 1.2 to 33 μm and the pore size limit of 4 μm was reduced by a factor of 1.3 to 3 μm. The latter reduction factor compares favourably with Andersen (1979), who found that the annulus diameter (and by extrapolation pore diameter) of Poaceae pollen grains decreased by a factor of 1.28 when mounted in silicone oil as compared to glycerol. In order to assess the degree of overlap between the two keys, the resulting classifications were then compared and the possibility of combining elements of both approaches considered.
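The arithmetic behind the adjusted thresholds can be checked with a short sketch. The factor ranges and the 40 μm and 4 μm limits are those quoted above; the function and constant names are hypothetical.

```python
# Size increase in glycerol relative to silicone oil (Faegri and Iversen 1989),
# as quoted in the text.
GRAIN_FACTOR_RANGE = (1.1, 1.3)
PORE_FACTOR_RANGE = (1.1, 1.5)


def midpoint(factor_range):
    low, high = factor_range
    return (low + high) / 2


def to_silicone_oil(size_in_glycerol, factor_range):
    """Scale a glycerol-based dimension down by the mid-point of a factor range."""
    return size_in_glycerol / midpoint(factor_range)


grain_limit = to_silicone_oil(40, GRAIN_FACTOR_RANGE)   # 40 / 1.2
pore_limit = to_silicone_oil(4, PORE_FACTOR_RANGE)      # 4 / 1.3

print(f"grain size limit: {grain_limit:.1f} um, pore size limit: {pore_limit:.1f} um")
```

Because measurements were made to the nearest 1 μm, the converted limits are rounded to whole micrometres, giving the 33 μm and 3 μm values used here.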

Küster’s choice of mounting medium and size ranges could result in the classification of many wild grasses as cereal-types, but this effect is mitigated here because we are considering only grains that are above 37 μm in diameter (as measured in silicone oil). The widespread use of glycerol/glycerine, and the fact that a quoted conversion factor as low as 1.1 can be found in the literature, persuaded us further that evaluation of the Küster as well as the Andersen key was a valid exercise.

The large dataset size (n=536) and number of variables measured (n=9) provided an opportunity to investigate whether multivariate statistical techniques can further aid the clarity and confidence with which Poaceae pollen can be separated. Two techniques were considered. Principal component analysis (PCA) was used to investigate the degree of grouping and separability of the various pollen types identified. In order to assess how variation in the attributes affected separation of the unconstrained data, eigenanalyses of the correlation matrices of three variable combinations were performed. Data were not transformed prior to analysis. PCA-a incorporated the characteristics used by Andersen to define his pollen groups, although the length of the longest grain axis and the length of the axis at 90° to this were employed rather than mean grain size, as the former represents a more flexible combination of variables (i.e. attributes A, B, D and E in Table 1). PCA-b also included pore diameter and the ratio of annulus to pore diameter (attributes F and G in Table 1), two of the characteristics used in the key of Küster. PCA-c incorporated all of the variables measured in this study and as such combined the integral characteristics used in both keys (i.e. attributes A, B, D-I). Annulus protrusion and annulus boundary type (attributes H and I) were both included, but it should be noted that the validity of this is uncertain as they are both effectively ‘yes/no’ variables.
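A minimal sketch of a principal component analysis based on eigenanalysis of the correlation matrix is given below, using NumPy rather than MINITAB. The data are synthetic and the four columns are only hypothetical stand-ins for measured attributes; nothing here reproduces the study dataset or its results.

```python
import numpy as np


def pca_correlation(X):
    """PCA via eigenanalysis of the correlation matrix of untransformed data.

    Returns eigenvalues (descending), eigenvectors (columns), sample scores
    and the proportion of total variance explained by each component.
    """
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)            # variables are in columns
    eigvals, eigvecs = np.linalg.eigh(R)        # R is symmetric
    order = np.argsort(eigvals)[::-1]           # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Using the correlation matrix is equivalent to standardising each variable.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    scores = Z @ eigvecs
    explained = eigvals / eigvals.sum()
    return eigvals, eigvecs, scores, explained


# Synthetic measurements (micrometres), invented for illustration only.
rng = np.random.default_rng(0)
n = 200
longest = rng.normal(45, 5, n)
perpendicular = longest / rng.normal(1.15, 0.08, n)
annulus = 0.22 * longest + rng.normal(0, 0.8, n)
pollen_index = longest / perpendicular
X = np.column_stack([longest, perpendicular, annulus, pollen_index])

eigvals, eigvecs, scores, explained = pca_correlation(X)
print("proportion of variance per component:", np.round(explained, 3))
```

Plotting the first two columns of `scores`, with points labelled by key group or age group, gives the kind of ordination diagram discussed in the results.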

A second technique, linear discriminant analysis (DA), was used to assess the separability of the Andersen and Küster Poaceae groups, to predict to which groups the grains not fitting one or both keys were most likely to belong, and as an internal check on the groupings suggested by the PCA. The same attributes that were employed in PCA-b were used in this analysis. All multivariate analyses were completed using MINITAB Release 11.21. Throughout the text, quoted sample ages are presented as uncalibrated radiocarbon years B.P. rounded to the nearest ten ¹⁴C years.
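The general idea of using a linear discriminant analysis to allocate unclassified grains to existing groups can be sketched as follows. This is an illustrative NumPy implementation of the standard pooled-covariance linear discriminant rule, not a reproduction of the MINITAB procedure; the use of priors proportional to group size, the two variables and all data values are assumptions made for the example.

```python
import numpy as np


def fit_lda(X, y):
    """Estimate group means, priors and the inverse pooled covariance matrix."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    n, p = X.shape
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([(y == c).mean() for c in classes])   # assumed: proportional
    pooled = np.zeros((p, p))
    for c, m in zip(classes, means):
        d = X[y == c] - m
        pooled += d.T @ d
    pooled /= n - len(classes)                               # pooled within-group covariance
    return classes, means, priors, np.linalg.inv(pooled)


def predict_lda(model, X):
    """Assign each row of X to the group with the highest linear discriminant score."""
    classes, means, priors, inv = model
    X = np.asarray(X, dtype=float)
    scores = (X @ inv @ means.T
              - 0.5 * np.sum(means @ inv * means, axis=1)
              + np.log(priors))
    return classes[np.argmax(scores, axis=1)]


# Synthetic training grains: columns are mean grain size and annulus diameter (um).
rng = np.random.default_rng(1)
hg = rng.normal([37.0, 9.5], [1.5, 0.6], size=(60, 2))     # stand-in for a smaller group
atg = rng.normal([46.0, 12.5], [2.5, 0.9], size=(60, 2))   # stand-in for a larger group
X = np.vstack([hg, atg])
y = np.array(["HG"] * 60 + ["ATG"] * 60)

model = fit_lda(X, y)
unclassified = [[38.0, 11.0], [47.0, 12.0]]                 # hypothetical grains
print(predict_lda(model, unclassified))
```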

Results and discussion

The dataset

A total of 536 large Poaceae pollen grains were located from deposits dated to between ca. 10,100 and <50 B.P. For ease of interpretation the data have been split into four broad age groupings which form the basis of the following discussions. Two discrete groups of large Poaceae pollen were located within the early Holocene sections of all profiles covering that interval (Figs. 2–4b and 5b). Group 1 contains 45 grains in sediments dated to between ca. 10,100 and 9600 B.P., and group 2 contains 16 grains in sediments dated to between ca. 8050 and 6800 B.P. Given their ages, it is highly unlikely that pollen within these groups could have originated from locally cultivated taxa. Whilst it is theoretically possible that phases of extremely long-distance transport of cereal pollen from the Near East may have occurred during these periods (Edwards 1989), no other non-native (exotic) pollen grains were located. This, combined with the poor dispersal ability of large grass pollen (Vuorela 1973), makes such long-distance transport highly unlikely. Similarly it is unclear why any contamination of samples with more recent pollen (during coring or processing) would have occurred consistently in the parts of each profile covering these periods, but not elsewhere within the pre-Neolithic sections of the cores (cf. Edwards and McIntosh 1988). It is thus probable that pollen grains in groups 1 and 2 solely reflect input from naturally occurring wild grass populations.

Groups 3 and 4 do not form temporally discrete groupings, but have been defined for theoretical and practical reasons. Group 3 incorporates all large grass grains in deposits dated to between ca. 6100 and 2000 B.P. (n=86). The group thus spans the Neolithic, Bronze and Iron Ages, and includes grains aged up to 1000 ¹⁴C years older than the traditional elm decline (ca. 5100 B.P., Parker et al. 2002). Pollen aged between 6100 and 5100 B.P. was included because large grass grains have been found in sediments of this date by a number of authors (e.g. Edwards and Hirons 1984; O’Connell 1987; Edwards and McIntosh 1988), and are hypothesised to reflect early periods of woodland-based arable farming (Edwards 1993, 1998). It should be noted that this inference is contested (O’Connell 1987) and there is presently no independent macrofossil evidence to support this claim (Bonsall et al. 2002). Finally, group 4 (n=389) covers the time period from ca. 2000 to <50 B.P. (the Roman and Historical Periods; Figs. 4a and 5a), and is dominated by grains of Early Medieval age from the Sproatley Bog profile. Pollen in groups 3 and 4 could have originated from communities of both wild and cultivated grasses.

Pollen keys

Summary data showing the identifications suggested by the keys of Andersen (1979) and Küster (1988) for each age group are presented in Tables 4 and 5. During the earliest Holocene (between ca. 10,100 and 9600 B.P.; group 1), the vegetation of central Holderness was dominated by open Betula and Betula-Corylus avellana woodland (Figs. 2–4b and 5b). The low canopy density allowed a diverse herbaceous ground flora to flourish and damp ground taxa were frequent. Ecological conditions during this period clearly favoured the growth of grasses capable of producing large pollen (i.e. >37 μm mean diameter). Of the 45 large grass pollen grains located, 71.1% were identified as belonging to Andersen’s Hordeum group and 20.0% to his Avena-Triticum group (Table 4). A large number of wild grasses with ecologically feasible habitat preferences are included within the former group, including the wetland genus Glyceria and the open ground species Elytrigia repens. Avena fatua is the only non-cultivated species included within the Avena-Triticum group. It has been suggested that the taxon was introduced to Britain during the Iron Age as an arable weed (Dickson 1988; Dickson and Dickson 2000), or possibly as part of a mixed A. sativa/A. fatua crop (Godwin 1975). However, the present authors are unaware of any macrofossil evidence for the species from contexts that significantly pre-date the Iron Age. This suggests that other wild taxa are capable of producing pollen with the characteristics of Andersen’s Avena-Triticum group. Twenty-two percent of group 1 grains were identified as Küster’s Bromus hordeaceus-type, 15.6% as Cerealia-type and 11.1% as Glyceria-type (Table 5). This suggests that a relatively diverse range of wild grass species was contributing to the large grass pollen rain between ca. 10,100 and 9600 B.P.

Table 4 Summary table showing the percentages and absolute numbers (in parentheses) of pollen grains fitting the various categories of Andersen (1979) and unclassified for each sample age group
Table 5 Summary table showing the percentages and absolute numbers (in parentheses) of pollen grains fitting the various categories of Küster (1988) and unclassified for each sample age group

All but one of the 16 grains found in deposits dated to between ca. 8050 and 6800 B.P. (i.e. group 2) were located in pollen assemblages from within or immediately after the initial local expansions of Alnus glutinosa. In the three pollen profiles covering this period, mixed woodland formed the dominant vegetation type (Figs. 2–4b). Seven of the grains (43.8%) fitted the required characteristics of Andersen’s Hordeum group and 5 grains (31.3%) from the Gilderson Marr core were identified as belonging to the Avena-Triticum group. As discussed for group 1, the Hordeum group grains could feasibly have originated from wetland or disturbed ground taxa, particularly within the record from Gilderson Marr where there is coincident evidence for the expansion of damp ground communities and increased woodland disturbance (Fig. 2). Three of the Avena-Triticum group grains were very large, having mean grain sizes of 50–59 μm, and four had annulus diameters of between 14 and 17 μm (Table 6). The latter observation is of particular interest, as on the basis of Andersen’s data the grains should all belong to the genus Triticum. However, there is no evidence to suggest that Triticum species occurred naturally within Britain prior to the onset of farming (Godwin 1975; but see O’Connell et al. 1999 for possible Lateglacial instances in Ireland). Three of these four grains fitted Küster’s cereal-type and the fourth had characteristics that did not fit any of his groups and was thus unclassified. Large grass pollen grains inseparable from those of cereals and dated to between ca. 7500 and 6900 B.P. have also been found in profiles from Connemara, western Ireland (O’Connell 1987). As for the group 2 grains of this study, the Connemara grains were associated with the initial local expansion of Alnus glutinosa. Closer to Holderness, Innes (1990) has recorded Triticum-type grains in the pre-Neolithic sections of profiles from the North York Moors.
The Küster categories identified for the other group 2 grains were Bromus hordeaceus-type (25.0%) and Glyceria-type (25.0%), indicating a natural origin for the grains and supporting the suggestion that at least some of the pollen identified as Andersen’s Hordeum group may have originated from Glyceria.

Table 6 Avena-Triticum group grains located in deposits significantly pre-dating the currently accepted onset of arable farming within England. Asterisks denote grains identified as Triticum on the basis of the annulus diameters provided in Andersen (1979)

Of the 86 grains contained within group 3 (ca. 6100–2000 B.P.), 50.0% were identified as Hordeum group, 37.2% as Avena-Triticum group and 3.5% as Secale cereale (the latter all from The Bog at Roos). Due to a depositional hiatus, assigning an age to the Secale cereale grains is difficult, but an Iron Age date seems most likely. The dominant Küster types were Cerealia-type (39.5%) and Bromus hordeaceus-type (22.1%), although low frequencies of Glyceria-type (3.4%) and Small Poaceae-type (1.1%) were also located. Interpretation of the data is more difficult than for the preceding groups, as inputs from cereal crops cannot be ruled out and a wide range of vegetational environments existed at different times and in different pollen catchments (ranging from dense woodland to fen carr and extensively deforested landscapes; Figs. 2–4b). Group 4 (ca. 2000 B.P.-present) contained by far the largest number of grains (389), with 28.8% identified as Hordeum group, 30.3% as Avena-Triticum group and 32.9% as Secale cereale on Andersen’s criteria. The majority of the Secale cereale grains were located in the Early Medieval section of the Sproatley Bog profile and this almost certainly reflects cultivation of rye in fields surrounding the basin. As in the case of the group 3 grains, interpretation of the likely origins of the grains from the other two categories is more difficult. Cerealia-type accounts for 66.3% of the grains identified using Küster, with 11.1% Bromus hordeaceus-type and very low frequencies of Glyceria-type and Arrhenatherum-type (both 1.8%). The considerably higher incidence of Cerealia-type pollen within this group is of considerable interest, but as previously stated this cannot necessarily prove an input (or sole input) from arable crops.

A feature of both keys is that a relatively large percentage of the grains investigated do not fit the required characteristics, and are thus unclassified (Tables 4 and 5). This is clear evidence of the difficulties and uncertainties encountered when attempting to assign large grass pollen from palaeoecological samples to a type. On average, 8.8% of grains were found to have characteristics deviating from those required by Andersen’s key, and a considerably higher proportion of 24.1% did not fit Küster’s key. Despite the variation in habitats represented (and presumably species that were contributing to the pollen rain), no significant difference was found in the proportions of classified and unclassified grains between age groups in either key (χ², p>0.05 in all cases). The grains that do not fit Andersen’s key fall into two categories and provide evidence of grain and/or annulus size overlap between his groups. The majority of the grains (85.0%) had annulus diameters that placed them within the Avena-Triticum group (11–12 μm), but mean grain sizes consistent with the Hordeum group (34.5–39.5 μm). The remaining 15.0% had Avena-Triticum group grain sizes (>40 μm), but Hordeum group annulus diameters (9–10 μm). It is felt unlikely that the overlap is purely a result of alteration of mean grain size due to variations in preservation or processing conditions (see the results of the discriminant analyses below). Instead it confirms that a degree of overlap is an inherent drawback of Andersen’s classification. The 129 grains not fitting the required characteristics of Küster’s key also fall into two categories. Of these, 73% had grain sizes and annulus to pore diameter ratios consistent with the Arrhenatherum-, Bromus hordeaceus- and Cerealia-types, but incorrect annulus forms; 62.0% had an annulus that was non-protruding, but had a sharp outer boundary, and 10.9% had a protruding annulus with a diffuse outer boundary.
Although considered unlikely, given that outer boundary form is a subjective measurement it is possible that the characteristic may have been incorrectly determined. If so, this highlights the problem of incorporating subjective measurements into a pollen key. The second group of unclassified grains had Glyceria-type annulus to pore diameter ratios (i.e. <2), but protruding pores inconsistent with the group. Given that both measurements are objective it is unlikely that the data arise from analyst error.
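The type of χ² comparison of classified and unclassified proportions between age groups mentioned above can be sketched as follows. The counts used are not taken from Tables 4 and 5: they are approximate values back-calculated from the rounded percentages and group sizes quoted in the text for Andersen’s key (assuming that every grain not assigned to a named group was unclassified), and should be treated as illustrative only. The function name is hypothetical.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: age groups 1-4. Columns: classified, unclassified (Andersen's key).
# ASSUMPTION: counts reconstructed from rounded percentages, not from Table 4.
andersen = [
    [41, 4],     # group 1 (n=45)
    [12, 4],     # group 2 (n=16)
    [78, 8],     # group 3 (n=86)
    [358, 31],   # group 4 (n=389)
]

CRITICAL_005_DF3 = 7.815   # chi-square critical value for 3 d.f. at p = 0.05

stat, dof = chi_square(andersen)
significant = stat > CRITICAL_005_DF3
print(f"chi-square = {stat:.2f}, d.f. = {dof}, significant at 0.05: {significant}")
```

With these reconstructed counts the expected number of unclassified grains in group 2 is below two, so the χ² approximation is rough for that row and the sketch should be read as a demonstration of the calculation rather than a re-analysis.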

When grains identified as unclassified in one or both keys are ignored, the lists of potential species indicated by the two keys overlap for 95.0% of the identifications. This is encouraging and suggests that despite relying on different suites of attributes, the two keys have a strong element of comparability. It also hints that greater confidence in identification may be achieved by utilising both keys, instead of just relying on one. For grains fitting Andersen’s Hordeum group, the potential exists to further refine species identifications by combining the two approaches. The Hordeum group of Andersen contains nine wild taxa that cannot be reliably separated on the basis of his data from the domesticated species Hordeum vulgare and Triticum monococcum (Table 2), posing a number of interpretational problems. Species contained in this group fall into four distinct types in Küster’s study (Table 3). The potential thus exists to split the Hordeum group using the measurements of Küster, hence decreasing the number of possible species from which a grain could originate (Table 7). There are a number of obvious problems relating to this methodology, not least the observation that there appears to be a degree of overlap between types. Nevertheless, although this was a purely speculative approach, all but two of the group 1 and 2 grains identified in this study as belonging to Andersen’s Hordeum group were suggested to originate from the wild genera Elymus and Glyceria (and possibly also the introduced species Bromopsis inermis), using the combinations of features outlined in Table 7. Groups 3 and 4 contained a much higher percentage of individuals (43.2%) identified as either Hordeum vulgare/Triticum monococcum (potentially originating from cereal crops), or Hordeum murinum (a wild grass of disturbed ground).
The proportion of grains identified as Hordeum vulgare or Triticum monococcum within the Early Medieval section of the Sproatley Bog profile was particularly high and it is possible that much of the input reflects cultivated taxa. Given the high levels of pollen from Secale cereale and the arable weeds also present within this part of the profile (Fig. 5a), this suggestion seems reasonable. The problems outlined above mean that further work would obviously need to be undertaken before any firm conclusions regarding the value of such a combined approach could be made. However it is encouraging that the species group containing cultivated taxa was only identified consistently in deposits of Neolithic age or younger.

Table 7 Separation of taxa within Andersen’s (1979) Hordeum group via incorporation of characteristics from the key of Küster (1988). Cultivated taxa are shown in bold

Multivariate analyses

Principal component analysis

The results of the principal component analyses suggested that it was possible to resolve most of the variability within the dataset into 2–4 axes of variation depending upon which variables were included within the PCA (Table 8). A number of patterns emerge when sample points are labelled according to either the Andersen or Küster identifications, or to age groups. Only the most important PCA plots are reproduced in the text.

Table 8 Summary details of the Principal Component analyses completed: pollen attributes included and the proportion of total variance accounted for by the first four Principal Components

When the data are labelled by Andersen group, by far the clearest separation is seen in a plot of the sample data against the first two principal components of PCA-a (Fig. 6). The eigenvalues indicate that these two components (axes) account for 89.8% of the sample variation, and the clear grouping evident is essentially reflecting variation in overall grain size (a composite of both grain and annulus diameters; x-axis) and grain shape (pollen index; y-axis). Grains identified as Secale cereale are located towards the bottom of the plot and the less prolate Hordeum- and Avena-Triticum group grains towards the top. The larger Avena-Triticum group grains are plotted to the right of the smaller Hordeum group grains. Separation of the three pollen groups is generally good, but there is a degree of overlap between the Hordeum- and Avena-Triticum groups which supports previous observations (above). Many of the grains not fitting the criteria of Andersen are located within this overlap. Interestingly, a number of unclassified grains also appear throughout the cluster of Hordeum group grains, suggesting that they may originate from this group, a hypothesis that is supported by the results of the DA (see below). Although separation of Secale cereale grains from those belonging to the Avena-Triticum group is good, there is slightly more overlap with the Hordeum group grains.

Fig. 6 Plot of sample scores labelled by Andersen (1979) group for the first two principal components of PCA-a. Abbreviations: ATG, Avena-Triticum Group; ATG*, compressed Avena-Triticum Group pollen grain (pollen index >1.26, but centrally located pore; characteristics otherwise as for ATG); HG, Hordeum Group; HG*, compressed Hordeum Group pollen grain (pollen index >1.26, but centrally located pore; characteristics otherwise as for HG); SC, Secale cereale; Unclass, pollen grain not fitting the required characteristics of the key
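The statement that two components account for most of the sample variation can be illustrated with a minimal sketch. It assumes a PCA on the correlation matrix of three attributes (grain diameter, annulus diameter and pollen index); the measurements below are invented for illustration only and are not data from this study, and the function names are hypothetical. The proportion of variance explained by a component is its eigenvalue divided by the sum of all eigenvalues.

```python
import math
import statistics


def standardise(column):
    """Centre a variable on zero and scale it to unit standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)
    return [(v - mean) / sd for v in column]


def correlation_matrix(columns):
    """Correlation matrix of the variables (one list of values per variable)."""
    z = [standardise(c) for c in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a small symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) off-diagonal element
                theta = 0.5 * math.atan2(2 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Invented measurements for eight grains (illustration only)
grain_diameter = [38.0, 41.5, 44.0, 47.5, 50.0, 52.5, 46.0, 49.0]    # micrometres
annulus_diameter = [8.0, 8.6, 9.4, 10.2, 11.0, 11.6, 9.0, 9.6]      # micrometres
pollen_index = [1.10, 1.15, 1.12, 1.18, 1.14, 1.20, 1.45, 1.50]     # shape ratio

eigenvalues = jacobi_eigenvalues(
    correlation_matrix([grain_diameter, annulus_diameter, pollen_index])
)
explained = [ev / sum(eigenvalues) for ev in eigenvalues]
first_two = explained[0] + explained[1]
print(f"Proportion of variance on PC-1 and PC-2: {first_two:.3f}")
```

Because grain and annulus diameters are strongly correlated in such data, they load together on a single "size" axis, which is why a second axis dominated by pollen index is enough to capture most of the remaining variation.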

Unsurprisingly, when the data are labelled by Küster type, they separate most clearly when all characters necessary to his key are included within the analysis (i.e. PCA-c; Fig. 7). As for PCA-a, the first principal component for PCA-c is largely controlled by overall grain size, with large grains having high axis scores, whilst PC-2 appears to primarily reflect annulus and pore characters. Grains having diffuse-edged annuli and low annulus to pore diameter ratios lie towards the bottom of the plot, and those with sharp-edged annuli that are large relative to the pore are placed towards the top. The constituent pollen types show varying degrees of separation. Cerealia-type grains are clearly grouped in the central and upper region towards the right of the plot, and despite some overlap with Bromus hordeaceus-type, the majority of its members are clearly distinct. Arrhenatherum-type grains form a relatively tight group in the top left of the plot, although sample size is admittedly low (n=8). Separation of Bromus hordeaceus-type pollen is less clear, with the group showing significant overlap with both the Glyceria- and Cerealia-types. The tightness of the Cerealia-type group is encouraging and strongly suggests that unclassified grains within the central and right-hand parts of the Cerealia-type distribution also belong to this category. This is supported by the results of the DA (below). With the exceptions of those previously mentioned, the unclassified grains are concentrated in the region of overlap between the Bromus hordeaceus-, Glyceria- and Cerealia-type distributions and their probable identities cannot be confidently suggested on the basis of the PCA.

Fig. 7 Plot of sample scores labelled by Küster (1988) group for the first two principal components of PCA-c. Abbreviations: SPT, Small Poaceae-type; AT, Arrhenatherum-type; BHT, Bromus hordeaceus-type; GT, Glyceria-type; CT, Cerealia-type; Unclass, pollen grain not fitting the required characteristics of the key

There is also evidence for grouping of the data when the sample points are labelled by age group (PCA-a, Fig. 8). By far the largest spread is observed for grains from the youngest age group (group 4), which are scattered throughout the plot. The high degree of spread is likely to reflect a combination of the presence of grains identifiable as Secale cereale, the variety of pollen types identified within the group, and the overall higher sample size (cf. groups 1–3). Grains lying to the bottom of the plot (y-axis values of <-1) have high pollen indices and are likely to originate from Secale cereale. If these data points are ignored, then some separation of grains lying in the upper half of the plot is evident. With the exception of several group 2 grains from the site of Gilderson Marr, the larger grains (high x-axis values) are all from groups 3 and 4 (i.e. Neolithic or post-Neolithic in date). The group 1 and 2 grains are primarily clustered towards the top-left of the plot, and are small with low pollen indices. Evidence for an increase in overall grain size through the Holocene is not unequivocal, however. Average grain and annulus sizes per millennium show no consistent, or significant, increase through the Holocene and in the record from The Bog at Roos (the most temporally complete and least truncated of the four studied), a wide spread of mean pollen grain size is seen. It is evident, however, that the largest grains (>45.5 μm) in this profile are all from deposits younger than 5000 ¹⁴C yr old, whilst all those >50.5 μm in size date to the Historical Period.

Fig. 8 Plot of sample scores labelled by age group for the first two principal components of PCA-a

Discriminant analysis

DA was very successful in separating grains classifiable using Andersen’s key, correctly assigning 93.8% of the grains to the relevant group. Discrimination of the Hordeum group was particularly successful with 97.3% of the grains identified in the laboratory as belonging to this group correctly assigned. This suggests that the three groups are distinct on the basis of the attributes included within the analysis. The analysis of unclassified grains suggested that the majority could be confidently placed in the Hordeum group (39 out of 47, or 83.0% of individuals), with 7 of the 8 remaining grains being assigned to the Avena-Triticum group. The bulk of the unclassified grains assigned to the Hordeum group had annulus diameters characteristic of the Avena-Triticum group, suggesting significant overlap in annulus size between the two groups. Unlike grain size, annulus diameter does not appear to be modified by variations in preservation or processing conditions (Andersen 1979), suggesting that this overlap is not an artefact. Although there were some exceptions, the probabilities that grains belonged to their predicted groups were generally high at ca. 0.6–0.9, suggesting that it is possible to confidently predict the group that an unclassified grain should belong to in most cases. The analysis supports the PCA and laboratory observations that there is significant overlap between the Hordeum- and Avena-Triticum groups, but that Secale cereale grains can be separated reliably.
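How a discriminant analysis yields both a predicted group and a membership probability for an unclassified grain can be sketched as follows. This is an illustrative linear discriminant analysis with a pooled covariance matrix and two attributes only; the form of DA, the choice of attributes and all measurements are assumptions made for the example and do not reproduce the analysis or data of this study. The function names are hypothetical.

```python
import math
from collections import defaultdict


def fit_lda(samples):
    """Fit a two-variable linear DA from (label, (x1, x2)) training pairs."""
    by_class = defaultdict(list)
    for label, x in samples:
        by_class[label].append(x)
    n, k = len(samples), len(by_class)
    means = {c: tuple(sum(col) / len(xs) for col in zip(*xs))
             for c, xs in by_class.items()}
    # Pooled within-group covariance (2 x 2), shared by all groups in linear DA
    s11 = s12 = s22 = 0.0
    for c, xs in by_class.items():
        m1, m2 = means[c]
        for x1, x2 in xs:
            s11 += (x1 - m1) ** 2
            s12 += (x1 - m1) * (x2 - m2)
            s22 += (x2 - m2) ** 2
    s11, s12, s22 = (v / (n - k) for v in (s11, s12, s22))
    det = s11 * s22 - s12 ** 2
    inverse = ((s22 / det, -s12 / det), (-s12 / det, s11 / det))
    priors = {c: len(xs) / n for c, xs in by_class.items()}
    return means, priors, inverse


def posterior(x, model):
    """Posterior probability of each group for one grain x = (x1, x2)."""
    means, priors, inv = model
    scores = {}
    for c, m in means.items():
        # w is the inverse pooled covariance applied to the group mean
        w = (inv[0][0] * m[0] + inv[0][1] * m[1],
             inv[1][0] * m[0] + inv[1][1] * m[1])
        scores[c] = (x[0] * w[0] + x[1] * w[1]
                     - 0.5 * (m[0] * w[0] + m[1] * w[1])
                     + math.log(priors[c]))
    top = max(scores.values())  # subtract the maximum for numerical stability
    expd = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(expd.values())
    return {c: v / total for c, v in expd.items()}


# Invented training grains: (grain diameter in micrometres, pollen index)
training = (
    [("HG", g) for g in [(38, 1.10), (40, 1.14), (41, 1.08), (39, 1.16), (42, 1.12)]]
    + [("ATG", g) for g in [(47, 1.12), (49, 1.09), (50, 1.15), (48, 1.18), (51, 1.11)]]
    + [("SC", g) for g in [(46, 1.42), (48, 1.50), (45, 1.46), (49, 1.55), (47, 1.48)]]
)
model = fit_lda(training)

clear_case = posterior((40.0, 1.11), model)       # well inside the HG cluster
borderline_case = posterior((44.5, 1.12), model)  # between HG and ATG
print(max(clear_case, key=clear_case.get), clear_case)
print(max(borderline_case, key=borderline_case.get), borderline_case)
```

In this toy example a grain in the zone of overlap between the Hordeum and Avena-Triticum groups receives a split probability, whereas the prolate Secale cereale grains are far from both groups on the pollen index axis, which mirrors the pattern described above.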

DA was also largely successful in separating grains identifiable using Küster’s key, but with less confidence than for Andersen’s groups. The technique correctly assigned 73.6% of grains to the relevant group, with discrimination most successful for Arrhenatherum-type (87.5% of grains correctly assigned), and Glyceria-type (84.2%). The technique also assigned 80.3% of Bromus hordeaceus-type grains correctly, but mis-classified 29.0% of Cerealia-type pollen. The 129 unclassified grains were assigned to three pollen types. Sixty-three (48.8%) were classified as Bromus hordeaceus-type, 34 (26.4%) as Cerealia-type and 32 (24.8%) as Glyceria-type. The probabilities that these grains belonged to their predicted groups were generally very high, with 52 of 129 predictions having probabilities in excess of 0.8, and the majority of the remainder above 0.6. This refines the results of the PCA by predicting that unclassified grains located within the region of overlap between the Cerealia- and Bromus hordeaceus-type distributions (Fig. 7) are all likely to belong to the latter pollen type, whilst those located within the overlap between the Glyceria- and Bromus hordeaceus-types and the region above the Glyceria-type distribution, are most likely to originate from Glyceria-type. As shown above for Andersen’s key, this suggests that the technique may be used to predict the likely classification of pollen grains not fitting Küster’s key and it is suggested that DA provides a considerably more confident method for assigning unclassified grains than PCA.
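The allocation of the 129 unclassified grains reported above can be checked with simple arithmetic. The counts are those given in the text; the variable names are illustrative.

```python
# Counts of unclassified grains assigned by DA to each Küster pollen type
assigned = {
    "Bromus hordeaceus-type": 63,
    "Cerealia-type": 34,
    "Glyceria-type": 32,
}
total = sum(assigned.values())

# Percentage share of each pollen type, to one decimal place
shares = {name: round(100 * count / total, 1) for name, count in assigned.items()}

# Share of predictions with a membership probability above 0.8 (52 of 129)
high_confidence_share = round(100 * 52 / total, 1)

for name, share in shares.items():
    print(f"{name}: {share}%")
print(f"Predictions with probability > 0.8: {high_confidence_share}%")
```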

Implications of the results: detecting arable farming in the pollen record from Holderness

The results of this study show that greater confidence in identifying large Poaceae pollen grains from palaeoecological samples can be obtained if the approaches of Andersen (1979) and Küster (1988) are employed in parallel (and possibly combined). For example, a grain identified as belonging to Andersen’s Hordeum group may have originated from either a wild or cultivated grass. If the grain is also identified as belonging to Küster’s Glyceria-type, then a natural origin may seem more likely, whilst an identification of Cerealia-type may add weight to the suggestion that it reflects input from a cultivated species. This is a positive step, but the data highlight that overlap between the characteristics used to delineate the various groups of both keys is a significant problem. To an extent this overlap can be mitigated via the multivariate statistical techniques of PCA and particularly DA if the dataset is large enough. A not dissimilar situation was revealed when separation of grains within the Cannabaceae was attempted (Whittington and Gordon 1987).
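The parallel use of the two keys described above can be expressed as a simple lookup. The sketch below encodes only the combinations spelled out in the text; the function name and the wording of the returned readings are hypothetical, and every other combination is deliberately left as ambiguous. The output is a weight-of-evidence hint, not an identification.

```python
def combined_reading(andersen_group, kuster_type):
    """Tentative reading of one grain from its two key identifications.

    Encodes only the examples given in the text; anything else is ambiguous.
    """
    if andersen_group == "Secale cereale":
        # Rye can generally be identified to species level
        return "Secale cereale (cultivated)"
    if andersen_group == "Hordeum group" and kuster_type == "Glyceria-type":
        return "wild origin more likely"
    if andersen_group == "Hordeum group" and kuster_type == "Cerealia-type":
        return "adds weight to a cultivated origin"
    return "ambiguous: consult wider evidence"


print(combined_reading("Hordeum group", "Glyceria-type"))
print(combined_reading("Hordeum group", "Cerealia-type"))
print(combined_reading("Avena-Triticum group", "Bromus hordeaceus-type"))
```

Any such reading still has to be weighed against sample age and the wider palaeoecological record, since grains satisfying the cereal-containing categories of both keys can also derive from wild grasses.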

However, there are obvious limitations to this methodology, and while our data support previous assertions that Secale cereale pollen can generally be identified to species level, pollen from the other cultivated genera Avena, Hordeum and Triticum cannot be confidently separated from some wild grasses using the approaches that we employed. This is clearly exemplified by the five grains located in the Gilderson Marr profile from sediments dated to between 7160 and 6850 B.P. (Table 6). All five fitted the characteristics of Andersen’s Avena-Triticum group (four had annulus diameters indicative of Triticum) and three fitted Küster’s Cerealia-type, yet the sample ages strongly suggest that they originate from wild species. Had these pollen grains been located in younger (i.e. post-elm decline) deposits, a domesticated origin could perhaps have been suggested. The above results show that this is a potentially erroneous inference and it follows that the likely origins of all large Poaceae pollen grains must be rigorously assessed (not just those obtained from pre-Neolithic contexts), and the methods of identification clearly stated (cf. Dickson 1988). Whilst the combined approach is a step in the right direction, the results support the conclusions of Beug (1961, 2004) that greater confidence in identification can only be obtained via careful study of surface pattern. Where critical, image analysis techniques might assist such endeavours. Unfortunately, resolution of surface features was not consistently possible in this study, a situation that is likely to occur in many other palynological investigations. It should also be remembered that not all wild grasses can be separated from cereals even if pollen surface pattern is determined (Beug 2004).

In such cases the wider palaeoecological and archaeological records must be consulted in order to assess the likely origins of any large grass pollen (i.e. the weight of evidence; Edwards and Hirons 1984). This is far from ideal and will inevitably invoke a degree of personal interpretation. However, carefully qualified interpretations can still be of great benefit (O’Connell 1987; Edwards 1989). In Holderness, the absence of macrofossil evidence and a poor archaeological record (Van de Noort and Ellis 1995) mean that inferences concerning the role of arable agriculture in prehistory are based solely upon the pollen record. Consideration of the ecological preferences of the wild species that may be contributing to the fossil record helps to strengthen interpretation by reducing the number of potential wild taxa within the two keys. For example, the study sites would have been located well inland of the coastline for much of the Holocene (Jelgersma 1979; Coles 1998) and the coastal taxa Ammophila arenaria, Elytrigia juncea and Leymus arenarius are unlikely to have been represented, at least until the Roman Period. Nevertheless this still leaves a number of damp and disturbed ground species as possible contributors to the pollen rain.

In this study, all grains within pollen groups 1 (ca. 10,100–9600 B.P.) and 2 (ca. 8050–6800 B.P.) must logically reflect an input from wild grasses. Interpretation of data from pollen group 3 (ca. 6100–2000 B.P.; Figs 24b) is more difficult, as contributions from both wild and cultivated species are probable. Occasional large Poaceae grains were located sporadically throughout the sections of the profiles covering the later prehistoric periods. In the absence of any clear indicators of putative anthropogenic activity within the pollen record, or supporting archaeological evidence for a contemporary local human presence, it is difficult to infer with confidence whether such isolated incidences reflect inputs from wild or domesticated taxa. Whilst the occurrence of grains belonging to both the cereal-containing categories of Andersen (1979) and Küster (1988) is suggestive, the level of overlap between characteristics highlighted in this study means that conclusions must remain tentative. This frustrates attempts to detect the earliest occurrence of arable farming within the region with certainty. Furthermore, the poor dispersal ability of large Poaceae pollen (Vuorela 1973), a generally high canopy density at the time (Figs. 24b) and the presumed small scale of arable cultivation during the earliest Neolithic all combine to minimise the chances of discerning both a significant influx of large grass pollen into the sampling site and any associated vegetational disturbance.

The isolated occurrences of group 3 pollen were relatively infrequent though, and large Poaceae pollen was only found consistently in association with prolonged periods of woodland decline (Figs. 24b). Pollen source area theory predicts that only those reductions in woodland cover that have a large enough combined areal extent to constitute significant local landscape-scale events will be clearly visible within the pollen records obtained in this study (Jacobson and Bradshaw 1981; Delcourt and Delcourt 1988; Sugita et al. 1997, 1999). It thus follows that most of the group 3 Poaceae grains were found in samples dating to periods of significant, potentially anthropogenic, landscape-scale woodland disturbance. These events were typically of extended duration (between 250 and 800 ¹⁴C yr). As well as reduced influx from arboreal species, pollen from herbs frequently found in arable fields also increased in abundance at such times (in both percentage and absolute terms). Whilst this encourages the interpretation of the large grass pollen as resulting from arable activity, this need not be the case. Few taxa occur solely as weeds of arable crops (Behre 1981, 1986), and clearance may encourage disturbed ground (Edwards and Hirons 1984) and woodland-edge grassland communities, as well as promoting the long-distance transport of pollen from existing Poaceae populations (Edwards 1993). Increased surface water run-off as a result of deforestation may also favour the expansion of wetland grasses possessing large pollen grains.

Two possible interpretations exemplifying the above uncertainties are shown by the records from Cess Dell and Gilderson Marr. At the former site, there is evidence for a prolonged period of reduced Corylus avellana-type and Ulmus influx between ca. 6500 and 6160 B.P. (Fig. 3). A coincident increase in herbaceous diversity (and absolute influx) with a consistent presence of large grass pollen also occurs. It might be tempting to infer an anthropogenic cause for the changes, but closer inspection of the pollen record indicates an expansion of the wetland and wet meadow flora at this time. It is therefore possible to explain the changes via entirely natural processes, such as a rising water table within the valley floor close to the basin. All but six of the Poaceae grains located belonged to Andersen’s Hordeum-group and Küster’s Glyceria- and Bromus hordeaceus-types, further supporting a natural origin. Of course, a domesticated origin also remains possible but is not overwhelmingly supported by the rest of the palynological record. In view of this, we adopted a conservative approach to interpretation.

Conversely, a human cause seems more probable for a prolonged period of woodland disturbance and intense soil erosion within the Gilderson Marr profile between ca. 5430 and 4190 B.P. (Fig. 2). Percentage and absolute pollen data indicate a reduction in arboreal input, an increase in open and disturbed ground taxa, and the consistent presence of large grass pollen. Charcoal levels are particularly high and % loss-on-ignition declines, most likely as a result of erosion of clay from surrounding slopes into the basin (Tweddle 2000). The spatial scale, high charcoal levels and extended duration of the event (or events) make a wholly natural origin for the disturbance seem unlikely, and it is probable that at least part of the signal reflects the activities of Neolithic peoples on the slopes surrounding the site. Interestingly, a marked increase in large grass pollen influx between ca. 4960 and 4190 B.P. is accompanied by consistently reduced loss-on-ignition values (Fig. 2). This raises the possibility that the data may reflect a change in the location or intensity of arable cultivation on the slopes surrounding the basin. Indeed, the majority of the Poaceae grains from this period fit Andersen’s Avena-Triticum group and Küster’s Cerealia-type. Frustratingly, although the above scenario of cereal cultivation and concomitant soil destabilisation remains plausible, the inability of the techniques employed in this study to consistently separate wild from cultivated grasses means that any such suggestion must be carefully qualified.

In Holderness, documentary evidence and much improved archaeological and plant macrofossil records indicate that the landscape local to the study sites was extensively managed during the Historical Period (Faull and Stinson 1986; Van de Noort and Davies 1993; Van de Noort and Ellis 1995). This is clearly evident in the Early Medieval record from Sproatley Bog (Fig. 5a), where the low input from arboreal taxa indicates an extensively cleared landscape. Pollen identified as Secale cereale is abundant and herb pollen including the arable weed Centaurea cyanus is diverse. The occurrence of Linum usitatissimum seeds and high levels of Cannabis sativa pollen within the sediment profile suggest that flax and hemp may have been grown close to the site, or retted within its waters (Tweddle 2000). Whilst it is still not possible to identify non-Secale Poaceae grains with certainty, the combined ‘weight of evidence’ suggests that at least some inputs are likely to result from non-rye cereals growing close to the basin. In this instance, the inability to separate large grass pollen reliably provides only limited hindrance to an inference of intensive arable activity.