Introduction

This study uses linguistic data to reconstruct the prehistory of agriculture in Mesoamerica, a cultural and linguistic area of Mexico and northern Central America.Footnote 1 Evidence is assembled indicating when, where, and for whom 41 cultivated and protected plantsFootnote 2 native to the New World became significant to peoples of the region in prehistoric times. The study of prehistoric agriculture has traditionally been the purview of archeologists interested in paleoethnobotany. Nevertheless, this investigation intentionally avoids reference to archeological findings and other nonlinguistic results that may or may not complement those presented here. All conclusions presented in this study are based solely on linguistic data.

Specific goals of this study are (1) determination of the earliest date by which each of the 41 plants developed significance for people in Mesoamerica, (2) location of the general areas in the region where each plant initially became important to human groups, (3) determination of which of the 41 plants became important to which groups of prehistoric people, and (4) determination of when these plants became important. The comparative approach of historical linguistics is employed, with use of lexical reconstruction and glottochronology. The comparative method facilitates determination of which of the 41 plants were named by speakers of specific ancestral languages. Glottochronology determines approximately when ancestral languages were last spoken. This study employs a new glottochronological approach that yields entirely objectively derived dates for proto-languages.

Background

I have previously undertaken similar investigations that use the comparative method of historical linguistics to chart the prehistory of cultivated plants in the Americas. These include studies focused respectively on maize (Brown 2006a), common bean (Brown 2006b; see Chap. 10 this volume), and squash (Brown n.d.). Each of these investigations contains information on the prehistory of these three cultigens in the New World in general, including detailed results for Mesoamerica.

Historical Linguistics: Lexical Reconstruction and Glottochronology

The comparative approach facilitates reconstruction of vocabularies of languages of the remote past not preserved in written records. The basic method is to compare lexicons of modern genetically related languages to find words that are both phonologically and semantically similar. Such similar words are considered cognates if they can be shown to have developed from a single word in the vocabulary of the ancestral or proto-language from which the related languages have developed (descended). For example, Yucatec and Jacaltec are two genetically related languages of Mesoamerica, both descended from Proto-Mayan, their common ancestral language spoken at the latest around 2,400 years before present (BP). These two languages have phonologically similar words for chili pepper, respectively, ìik and . Because the sound segments of these two words regularly correspond, these terms attest to the occurrence of a word in Proto-Mayan for chili pepper from which terms for the plant in both Yucatec and Jacaltec developed. Comparing sounds of these two words and related similar words for the plant found in other Mayan languages, it is possible to reconstruct Proto-Mayan’s word for chili pepper, i.e., *iihk (Brown and Wichmann 2004:196).

Using this approach, I determine which of 41 cultivated and protected plants were named in 30 different proto-languages of Mesoamerica spoken in the prehistoric past (see Table 1). These proto-languages are all ancestral to modern native languages of the region, and some are also ancestral to other proto-languages included among the 30. For example, Proto-Mayan is the immediate ancestor of Proto-Greater Tzeltalan, which in turn is the immediate ancestor of Proto-Tzeltalan. Appendix 1 lists and organizes the 30 Mesoamerican proto-languages according to ancestor-descendant relationship. Modern languages affiliated with proto-languages are also presented and located on a topographic map.

Table 1 The 30 Mesoamerican proto-languages (A–d), with LD date (in years before present), and indication of which of the 41 cultivated/protected plants were named in each language (X indicates named plant; bold X indicates earliest instance of a named plant)

While it would be possible to do so, I have not reconstructed actual terms for the 41 plants in each of the 30 proto-languages as this is a large undertaking not pertinent to the goals of the present study. I have only determined the plants that appear to have been named in each ancestral language, given the available linguistic information. This task involved looking for cognate words for a specific plant occurring in genetically related languages. The existence of such cognates in the related languages, with appropriate distributions across the languages, constitutes evidence that their common ancestral language possessed a term for the plant.Footnote 3

In making decisions involving word cognation, I consult earlier studies in which vocabularies of Mesoamerican languages have been subjected to comparative analysis and reconstruction. These include Kaufman (1972, 2003), Brown and Wichmann (2004), Kaufman and Norman (1984), and Berlin et al. (1973) for Mayan languages; Wichmann (1995) for Mixe-Zoquean languages; Kaufman (1990), Rensch (1976, 1989), Gudschinsky (1959), and Longacre (1957) for Otomanguean languages; and Miller (1967) and Hill (2001, 2008) for Uto-Aztecan languages. In the vast majority of cases, decisions reported here concerning the occurrence of a plant name in a proto-language agree with conclusions of these studies.Footnote 4

Determination of a name for a specific plant in an ancestral language indicates that the plant was of considerable salience for its speakers. Berlin et al. (1973) in an important, but not widely cited study, assemble evidence that words for plants of high salience tend to be retained by offspring languages, whereas those for plants of low salience tend over time to be replaced. They present a very strong positive correlation between the lexical retention (stability) of plant names and the cultural significance of the plants they designate.

In the Berlin et al. (1973) study, plant names were collected from speakers of Tzeltal and Tzotzil, two closely related Mayan languages of Chiapas, Mexico, both of which are immediate daughter languages of Proto-Tzeltalan (see Appendix 1). Plant names in each of the two languages that refer to at least one identical species are compared for lexical similarity and possible cognation. Plants designated by these terms are grouped into four categories delimiting their cultural significance (from high to low): (1) cultivated plants, (2) protected plants, (3) wild-useful plants, and (4) wild-insignificant plants.

A total of 257 plant species have both Tzeltal and Tzotzil names (Berlin et al. 1973:161). Of these, 111 are designated by pairs of cognate terms attesting to a plant term’s pertinence to the Proto-Tzeltalan lexicon. Paired terms for 146 species are found not to be cognate. Fourteen pairs pertaining to cultivated plants are found cognate and two are noncognate; 29 pairs pertaining to protected plants are cognate and seven are noncognate; 52 pairs pertaining to wild-useful plants are cognate and 63 are noncognate; and 16 pairs pertaining to wild-insignificant plants are cognate and 74 noncognate.

The correlation between cognation and cultural significance is strong and statistically significant: gamma = 0.71 (on a scale from 0.00 to 1.00, where 0.00 indicates no association whatsoever and 1.00 is a perfect correlation), p < 0.001. In other words, plant names of Proto-Tzeltalan strongly tend to be retained by its offspring languages when the plants designated are high in cultural importance, and strongly tend to be replaced when designated plants are low in cultural importance.
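
The association can be recomputed from the cognate and noncognate counts reported above. What follows is a minimal sketch, assuming that the statistic is Goodman and Kruskal's gamma taken over the four significance categories in order from high to low; the function name and table layout are illustrative and do not come from Berlin et al. (1973).

```python
# Counts as reported in the text: (category, cognate pairs, noncognate pairs),
# ordered from high to low cultural significance.
COUNTS = [
    ("cultivated", 14, 2),
    ("protected", 29, 7),
    ("wild-useful", 52, 63),
    ("wild-insignificant", 16, 74),
]


def goodman_kruskal_gamma(rows):
    """Gamma for an ordered k x 2 table of (label, cognate, noncognate)."""
    concordant = 0
    discordant = 0
    for i, (_, cog_hi, non_hi) in enumerate(rows):
        for _, cog_lo, non_lo in rows[i + 1:]:
            # Concordant: the more significant plant has the cognate name pair.
            concordant += cog_hi * non_lo
            # Discordant: the less significant plant has the cognate name pair.
            discordant += non_hi * cog_lo
    return (concordant - discordant) / (concordant + discordant)


total_cognate = sum(cog for _, cog, _ in COUNTS)
total_noncognate = sum(non for _, _, non in COUNTS)
gamma = goodman_kruskal_gamma(COUNTS)
print(total_cognate, total_noncognate, round(gamma, 2))
```

The two totals printed first should match the 111 cognate and 146 noncognate pairs given above. Collapsing or reordering the categories would change the value of gamma, so the result depends on the stated assumption.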

The implication of the investigation by Berlin and his associates for the present study is that plants whose names are determined to have been present in the 30 Mesoamerican proto-languages were, in all likelihood, of substantial cultural significance for speakers of those languages. A managed plant showing “substantial cultural significance” is one whose name and use are well known to all adult members of a language community. On the other hand, the failure of a plant name to reconstruct for an ancestral language does not necessarily mean that the plant was not present in the habitat of speakers; it means only that, if it was present, it was not especially salient. For example, such a plant might be known only to a small subgroup of a language community’s membership, such as a few agro-specialists. Given these findings, the earliest dates for cultivated/protected plants documented for Mesoamerica by plant name reconstruction and glottochronology may not necessarily always correlate closely with those for the same plants attested through archeological investigation.

Table 1 identifies those plants of the set of 41 named in each of the 30 proto-languages and presents Levenshtein distance (LD) dates for each of the ancestral languages. An LD date is the latest date at which a proto-language was spoken. For example, the LD date for Proto-Tzeltalan is 795 BP. This date is the hypothetical point in time just before Proto-Tzeltalan split into its two daughter languages, Tzeltal and Tzotzil.

LD dates are a new development in glottochronology (cf. Serva and Petroni 2008). Glottochronology was devised by Morris Swadesh (1951) in the mid-twentieth century as a method for determining the number of centuries since genetically related languages diverged from their common ancestor. This involves comparing the core vocabulary of two languages to determine the degree to which words in those languages are similar. Typically, core vocabulary is a list of 100 or 200 referents including common things familiar to all humans such as seed, blood, and water, and ordinary activities such as eat, sleep, and hear. Less similarity between the words of two languages for these referents indicates greater chronological distance between the two languages, and more similarity indicates less chronological distance. Swadesh, working with the assumption that lexical replacement of core vocabulary occurs on average at a relatively constant rate over time, developed a formula for determining the minimum number of centuries since a language divergence occurred. Applying this formula to the number of similar words in the core vocabulary list found for two languages, one can compute the number of centuries that have passed since the two split from a common ancestor.Footnote 5

Although glottochronological dates are frequently cited in the literature dealing with language and culture prehistory, some linguists have been critical of the method from its inception. In recent years, however, increasing numbers of scholars have come to embrace glottochronology and its results (cf. Brown 2006a, b). One severe criticism of the method is that it is subjective, since it involves human decisions concerning which core-vocabulary words in two compared languages are or are not to be considered cognates. Different practitioners of glottochronology often use different criteria in deciding word cognation. In addition, different practitioners using the same criteria sometimes obtain different results, simply because human decision-making is rarely if ever totally objective.

LD dates are derived through an entirely objective procedure developed by the ASJP consortiumFootnote 6 and first described in Wichmann et al. (2008). ASJP has assembled a database at present consisting of core vocabulary lists for over 2,400 languages and dialects.Footnote 7 These lists consist of words for 40-item subsets of the list of 100 basic referents proposed by Swadesh (1955). The 40 items, selected by a procedure described in Holman et al. (2008), are the most stable referents among the original 100.Footnote 8 The lists were transcribed in a phonologically simplified orthography known as ASJPcode described in Brown et al. (2008). Levenshtein distances (LDs) were calculated for all possible pairs of the 2,400+ languagesFootnote 9 on the basis of the 40-item lists. An LD is defined as the minimum number of substitutions, insertions, or deletions needed to transform one word into another with which it is compared.
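
The definition of LD given above can be made concrete with a short sketch. It uses the standard dynamic-programming formulation of Levenshtein distance; the function name and the example words are illustrative (two arbitrary English strings, not ASJP data or ASJPcode transcriptions).

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of substitutions, insertions, or deletions turning a into b."""
    # previous[j] holds the distance between the current prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep on a match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

Here two substitutions and one insertion suffice, so the distance is 3.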

ASJP modifies LD in the following manner to account for confounding factors such as word length and chance phonological similarities derived from similar phoneme inventories: The raw LD is first divided by the length of the longer of the two compared words to obtain a normalized measure, LD1. This is then further divided by the average LD1 of all pairs of words not having the same meaning to obtain a further normalized measure, LD2. Finally, the figures are converted to percentages.
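
The two normalization steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the ASJP software: word lists are represented as dictionaries from meaning to a single non-empty word, the LD2 for a language pair is taken as the mean LD1 of same-meaning pairs divided by the mean LD1 of different-meaning pairs, and the word forms are invented placeholders rather than real data.

```python
from itertools import product


def levenshtein(a, b):
    """Plain Levenshtein distance (substitutions, insertions, deletions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1, current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def ld1(a, b):
    """LD divided by the length of the longer word (words must be non-empty)."""
    return levenshtein(a, b) / max(len(a), len(b))


def ld2_percent(list_a, list_b):
    """LD2 for two word lists (dicts: meaning -> word), as a percentage."""
    shared = sorted(set(list_a) & set(list_b))
    same = [ld1(list_a[m], list_b[m]) for m in shared]
    # Baseline: pairs of words that do NOT share a meaning.
    different = [ld1(list_a[m], list_b[n])
                 for m, n in product(shared, shared) if m != n]
    baseline = sum(different) / len(different)
    return 100 * (sum(same) / len(same)) / baseline


# Invented placeholder forms, for illustration only.
language_a = {"x": "aa", "y": "bb"}
language_b = {"x": "aa", "y": "bc"}
print(ld2_percent(language_a, language_b))
```

Dividing by the different-meaning baseline is what corrects for chance resemblance: two languages with small, similar sound inventories will show low LD1 values even for unrelated words, and the baseline absorbs that effect.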

The first step in calculating an LD date for a group of genetically related languages, such as Mayan, Indo-European, or Austronesian, is generation of percentages for all pairs of languages belonging to the group. Next, each family is partitioned into two objectively defined groups and the averages of the percentages for each pair of languages whose members belong to different groups are calculated. ASJP has determined a constant rate of lexical change based on the degree of similarity between languages measured by LD percentages: 73% of LD similarity is retained over a period of 1,000 years. Footnote 10 Using this constant in a formula into which the average LD percentage for a group is entered, an LD date for the proto-language ancestral to the language group is calculated. Generation of LD dates is achieved through machine automation. This process entails no human decision-making and is therefore entirely objective.
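
The text does not give the dating formula itself. As a rough illustration only, the sketch below assumes the classic Swadesh-style relation, in which the similarity between two languages that split t millennia ago equals the retention rate raised to the power 2t (each lineage changes independently). The retention constant of 0.73 is the one stated above; the functional form, the function name, and the treatment of similarity as a proportion between 0 and 1 are assumptions and may differ from the ASJP procedure.

```python
import math

RETENTION_PER_MILLENNIUM = 0.73  # constant stated in the text


def years_since_split(similarity, retention=RETENTION_PER_MILLENNIUM):
    """Years since divergence, ASSUMING similarity = retention ** (2 * t).

    similarity: average similarity between the two groups, as a proportion
    in (0, 1]. t is measured in millennia and converted to years.
    """
    if not 0 < similarity <= 1:
        raise ValueError("similarity must be a proportion in (0, 1]")
    return 1000 * math.log(similarity) / (2 * math.log(retention))


# Two lineages that each retain 73% after 1,000 years share 0.73 ** 2 of the
# original similarity, so the sketch should return about 1,000 years.
print(round(years_since_split(0.73 ** 2)))
```

Under this assumed form, lower similarity always maps to an older date, which is the qualitative behavior the text describes.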

LD dates are presented for the 30 Mesoamerican proto-languages in Table 1 (and are also given in Appendix 1). These constitute the first glottochronological dates for Mesoamerican languages that have been derived through an automated, totally objective approach.

Interpretation of LD Dates

Like all other glottochronological dates, LD dates constitute the latest dates at which ancestral languages were spoken. After a given LD date, the proto-language to which it pertains has ceased to exist since it has diverged into offspring languages. In other words, dialects of the ancestral language are no longer mutually intelligible and have thus become distinct languages. Theoretically, any proto-language could be substantially older than its divergence date. For example, according to an analysis by Wichmann et al. (2008), languages are spoken an average of 900 years before their divergence into daughter languages. Such considerations should be factored into interpretations of prehistoric events based on LD dates.

The Data: Languages and Plants

The data for this study come from both ethnobotanical and lexicographical sources. These sources yield names for 41 cultivated and protected plants in 68 contemporary languages and dialects (henceforth, languages) spoken by native people of Mesoamerica. The 68 languages, listed in alphabetical order, and their sources are given in Appendix 2. Classification of these languages and their location on a topographic map are presented in Appendix 1.

An exhaustive approach was used in selecting the sample of 68 languages. All Mesoamerican languages were included whose sources appear to be reasonably thorough with respect to recording names for the 41 managed plants. These include dedicated ethnobotanical studies for nine languages: Chinantec (Comaltepec), Huastec, Mixe (Totontepec), Mixtec (Alcozauca), Q’eqchi’, Tzeltal (Tenejapa), Tzotzil (Zinacantán), Yucatec, and Zapotec (Mitla). Sources for the remaining 59 languages are dictionaries and personal communications. Many of the dictionaries include special sections in which the plants named are identified to scientific species (or to genus if not to species). Such presentations are typically found in the many dictionaries consulted for this study prepared and published by the Summer Institute of Linguistics (SIL). A large number of the dictionaries used have become available only within the last 20 years or so (mostly those produced by SIL). Therefore, two decades ago an investigation on this scale would not have been possible.

Words for plants from sources for the 68 languages figure into comparative analysis undertaken to determine the presence of names for specific plants in proto-languages. However, this analysis is not restricted to data from the 68 languages. Lexical sources for Mesoamerican languages not included among the 68, while not usually complete with respect to botanical inclusions and identifications, nevertheless provide information of some analytic use. Such data mainly consist of words for plants found in the several comparative studies of Mesoamerican languages mentioned above that were consulted for reconstructing plant inventories for ancestral languages.

The 41 plants investigated consist of species indigenous to the New World that are widely cultivated or protected by contemporary Native Americans of Mesoamerica.Footnote 11 I have attempted to be exhaustive in selecting plants to be included in this study, including all regularly managed plants known to me from personal experience in Mesoamerica and from the ethnobotanical and ethnographic literature. Plants excluded from the study are, for the most part, those of minor importance, typically only of circumscribed local interest, for which native-language names are seldom listed in dictionaries.

Native terms for the 41 plants for the most part bear a one-to-one correspondence with scientific species. Sometimes, however, they do not. For example, a single word in some native languages may designate several species of squash (Cucurbita spp.) as does the English word squash. Because of this ambiguity, names for some plants identified for ancestral languages cannot be assumed to have denoted some single specific species. All such names, with one exception, are accurately identified at least to genus. Thus, for example, the discovery that an ancestral language had a word for squash means that the language had a word that designated at least one, but conceivably more undetermined species of Cucurbita.

Special note should be made concerning the plant designated in this study by tomate (see Appendix 3). In Mesoamerica, the Mexican Spanish word tomate, depending on region, may designate either Solanum lycopersicum or Physalis philadelphica, both plants belonging to the family Solanaceae.Footnote 12 Similarly, in the vast majority of Mesoamerican languages surveyed for this study, these two species are nomenclaturally related, either by both being denoted by the same term, or in some other more complex way.Footnote 13 For this reason, it is not possible to determine the exact referent of terms for tomate reconstructed for proto-languages. Thus, any term for tomate found pertinent for an ancestral language may have designated either Solanum lycopersicum or Physalis philadelphica or both.

Appendix 3 lists the 41 managed plants in alphabetical order by most well known common name (either from English or Spanish), with plant scientific identification, and, when found, other common names for the plants in Spanish (as spoken in Mesoamerica) and/or English. When plants can be scientifically identified to genus only, names are given for individual species of the genus that are ordinarily designated by native terms in the languages of Mesoamerica. Also given in Appendix 3 are the average elevation of plants in meters (when data are available), the earliest LD date found for each plant, and the proto-language with which the date is associated.

Discussion of Table 1

Table 1 presents the major empirical findings of this study. It is arranged as a matrix with common names for the 41 plants given on the vertical axis and proto-language names with LD dates on the horizontal axis. The 30 proto-languages are listed from left to right according to magnitude of their LD dates, from oldest to youngest. In each proto-language column, Xs are used to identify those plants whose names are present in proto-languages. Plant names are listed from top to bottom of the plant column according to the date associated with the earliest occurrence of their name in a proto-language (indicated by a bold font X), with the earliest dated plant at the top, and the latest at the bottom.

Table 1 shows that reconstructed names for maize, maguey, avocado, nopal, and squash are found for the earliest Mesoamerican proto-language, i.e., Proto-Otomanguean (7034 BP).Footnote 14 Maize and avocado can be identified to species (respectively, Zea mays and Persea americana). The other three can only be identified to genus (respectively, Agave spp., Opuntia spp., Cucurbita spp.). Table 1 also shows that by 1968 BP names for all but one of the 41 cultivated/protected plants have been reconstructed for Mesoamerican proto-languages. A name of one of the plants, annual sunflower, has not been reconstructed for any of the 30 ancestral languages, indicating that it is probably a very recent addition to the Mesoamerican assemblage of managed plants.

Ancestral-Language Homelands

Determining locations at which plants first became significant to prehistoric Mesoamericans requires identifying where proto-languages were spoken, i.e., identifying ancestral-language homelands. Comparing plant-name inventories reconstructed for proto-languages with plant-name assemblages of modern Mesoamerican languages is one approach to homeland location. Close similarity of a plant-name inventory of an ancestral language to that of some modern language indicates that the ancestral language was probably spoken in a habitat similar to that of speakers of the modern language. This approach contributes to the location of homelands of ancestral languages, and, hence, to determining where cultivated and protected plants known to their speakers were of significance. Of special importance in this analysis is the elevation of habitat above sea level.

The average altitude for managed plants is determined through use of online information supplied by the Missouri Botanical Garden (http://www.tropicos.org). This site catalogs a massive number of plant specimens from the neotropics and many other parts of the world, and is especially comprehensive for Mesoamerica and abutting areas. These data allow calculation of the average elevation at which specimens for individual species are found. Average elevations were determined for 32 of the 41 managed plants.Footnote 15 These are reported in Table 2 where for each plant is given its common name (either in English or Spanish) along with its scientific identification. In the table, plants are rank-ordered by average altitude (in meters) from lowest to highest.

Table 2 The 32 cultivated/protected plants for which average altitudes are available, identified by common and Latin names, and rank-ordered by average altitude (given in meters, from smallest to greatest)

Altitude of the location at which each of the 68 modern Mesoamerican languages is spoken is determined by consulting an online site named Global Gazetteer Version 2.1 (http://www.fallingrain.com/world/index.html). This source gives elevation above sea level (in feet and meters), geographic coordinates, estimated population, and map location for most towns and villages of the world. All lexical sources for the 68 languages (see Appendix 2) indicate the major town or towns in which speakers reside. These towns were searched in Global Gazetteer Version 2.1, yielding for each language an elevation in meters. In some instances, sources give more than one town in which a language is spoken. The altitude determined for the language and used here is the average elevation of all the identified towns for a language. The 68 Mesoamerican languages range in altitude from 0 m (Huave) to 2,596 m (Trique) (see Table 3) and have an average elevation of 1,212 m.

Table 3 The 68 contemporary Mesoamerican languages rank-ordered by MAPA (mean average plant altitude) from smallest to greatest, with associated altitude (in m) at which each of the languages is spoken, and with the number of the 41 cultivated/protected plants named in the languages. (See text for an explanation of color shading)

A mean average plant altitude (MAPA) is given in Table 3 for each of the 68 Mesoamerican languages. MAPAs are calculated by summing the average altitudes for each of the 32 plants named in a language (see Table 2), and dividing the result by the number of the 32 plants named in the language. For example, Popoluca (Sayula) names 25 of the 32 plants. The sum of the average altitudes (in meters) of these 25 plants is 16,170, which divided by 25 yields a MAPA for Popoluca (Sayula) of 646.8. Also given in Table 3 is the altitude for each of the 68 languages. In the table, languages are rank-ordered by MAPA size, from smallest to greatest. MAPAs range in size from 595.9 (Nahuatl [Pajapan]) to 1071.5 (Mazahua).
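
The MAPA calculation can be sketched in a few lines. The function name and the placeholder plant names and altitudes are hypothetical (Table 2 values are not reproduced here); only the Popoluca (Sayula) total of 16,170 m over 25 plants comes from the text.

```python
def mapa(named_plants, average_altitude):
    """Mean of the average altitudes (in m) of the plants named in a language.

    named_plants: names of managed plants attested in the language.
    average_altitude: dict mapping plant name -> average altitude in meters.
    Plants without an altitude entry are skipped, mirroring the fact that
    altitudes are available for only 32 of the 41 plants.
    """
    altitudes = [average_altitude[p] for p in named_plants if p in average_altitude]
    if not altitudes:
        raise ValueError("no named plant has a known average altitude")
    return sum(altitudes) / len(altitudes)


# Hypothetical placeholder data, for illustration only.
placeholder_altitudes = {"plant_a": 400.0, "plant_b": 900.0, "plant_c": 1100.0}
print(mapa(["plant_a", "plant_b", "plant_d"], placeholder_altitudes))  # 650.0

# Check of the worked example in the text: 16,170 m summed over 25 plants.
popoluca_sayula_mapa = 16170 / 25
print(round(popoluca_sayula_mapa, 1))  # 646.8
```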

The correlation coefficient for the association between MAPA and language altitude (see Table 3) is a robust 0.74. This statistic means that languages with larger MAPAs strongly tend to be spoken at greater altitudes than languages with smaller MAPAs. Thus, MAPAs predict with considerable accuracy the general elevation at which languages are spoken. With only one exception, Zoque (Rayón), languages with MAPAs ranging from 595.9 to 689.2 are spoken at or below 1,001 m (see yellow-shaded information in Table 3). MAPAs for these languages, then, indicate language location in “hot country” or lowland areas of Mesoamerica (lowlands are commonly regarded as extending from sea level to around 1,000 m). Conversely, all languages with MAPAs ranging from 727.8 to 1071.5 are spoken at or above 1,372 m (see blue-shaded information in Table 3), which indicates that these languages are located in “cold country” or highland areas of Mesoamerica (highlands are commonly regarded as extending upward from around 1,500 m). Languages with MAPAs showing an in-between range (691.0–727.2) are indeterminate with regard to general elevation (see information with no color shading in Table 3). Table 4 summarizes these observations.

Table 4 Association of MAPA and general elevation of languages

MAPAs calculated for proto-languages of Mesoamerica indicate the general elevations at which these prehistoric languages were spoken. For example, words for 15 of the 32 plants (see Table 2) have been reconstructed for Proto-Popolocan (see Table 1). The sum of the average altitude of these 15 plants is 13,286, which divided by 15 yields 885.7, Proto-Popolocan’s MAPA. This MAPA falls within the range of MAPAs for modern Mesoamerican languages (727.8–1071.5) that are spoken in highland areas (see Tables 3 and 4). Consequently, Proto-Popolocan’s MAPA indicates that the ancestral language had a highland homeland.
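
The step from a MAPA to a general elevation can be written as a simple threshold rule. The cut-offs below are the ends of the lowland and highland MAPA ranges reported for modern languages; treating every value between them as indeterminate, and extending the rule to values outside the observed range of 595.9 to 1071.5, are simplifying assumptions of this sketch.

```python
LOWLAND_MAX_MAPA = 689.2   # upper end of the lowland MAPA range in the text
HIGHLAND_MIN_MAPA = 727.8  # lower end of the highland MAPA range in the text


def general_elevation(mapa_value):
    """Classify a MAPA as 'lowland', 'highland', or 'indeterminate'."""
    if mapa_value <= LOWLAND_MAX_MAPA:
        return "lowland"
    if mapa_value >= HIGHLAND_MIN_MAPA:
        return "highland"
    return "indeterminate"


# Proto-Popolocan: 13,286 m summed over 15 plants, as given in the text.
proto_popolocan_mapa = 13286 / 15
print(round(proto_popolocan_mapa, 1), general_elevation(proto_popolocan_mapa))
# 885.7 highland
```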

A MAPA based on the average altitude of only a few plants is not sufficient to indicate the general elevation of a proto-language. Four of the 30 Mesoamerican proto-languages show terms for only one of the 32 plants for which average altitudes are calculated: Proto-Otomanguean (avocado), Proto-Amuzgo-Mixtecan (avocado), Proto-Uto-Aztecan (tobacco), and Proto-Southern Uto-Aztecan (tobacco). Another ancestral language, Proto-Totonacan-Mixe-Zoquean, shows terms for only three of the plants (manioc, quintonil, and sweet potato).Footnote 16 Thus, setting these proto-languages aside, MAPAs are determined for only 25 of the 30 proto-languages.

Table 5 lists the 25 Mesoamerican proto-languages for which MAPAs are determined, rank-ordered by MAPA size from highest to lowest. How these MAPAs translate into general elevation for individual ancestral languages is also indicated. Among these 25 proto-languages, 13 are determined to have been spoken in the highlands, 11 in the lowlands, and general elevation of one language, Proto-General Aztec, is indeterminate.

Table 5 Twenty-five Mesoamerican proto-languages for which MAPAs are determined, rank-ordered by MAPA (from greatest to smallest), with associated general elevation

General elevations for five of the 30 ancestral languages are indeterminate because of MAPA inadequacy. Nevertheless, the general elevation of one of these five, Proto-Otomanguean, seems relatively apparent. Proto-Otomanguean almost certainly had a highland elevation as all of its offspring proto-languages, for which general elevations are determined, with one exception, show highland homelands: Proto-Otopamean, Proto-Otomian, Proto-Mixtecan, Proto-Zapotecan, Proto-Popolocan, Proto-Mixtec, and Proto-Zapotec (only Proto-Chinantecan was a lowland language) (see Table 5). In presentations that follow, Proto-Otomanguean is included among Mesoamerican ancestral languages determined to have been spoken in highland habitats.

Basic Analysis

“Basic analysis” refers to observation and description of unambiguous patterns in data, with little or no attention paid to their broader implications.

Table 6 combines information from Tables 1 and 5. It lists the 30 proto-languages, rank-ordering them by LD date from oldest to youngest. General elevation (if not indeterminate) is given for each proto-language, as is the number of plants of the sample of 41 for which terms are found. (The actual plants designated in each proto-language can be retrieved from Table 1).

Table 6 The 30 Mesoamerican proto-languages, rank-ordered by LD date from oldest to youngest, given with general elevation, and number of plants of the sample of 41 for which terms are found

Table 6 indicates that managed plants, whose names are reconstructed, have their earliest association with proto-languages spoken in highland areas of Mesoamerica. Proto-Otomanguean, a highland language, shows the oldest LD date, 7034 BP, for a proto-language for which names for plants reconstruct. Reconstructed names for managed plants do not appear in ancestral languages of the lowlands until considerably later. Proto-Chinantecan shows the oldest LD date, 2455 BP, for a lowland language for which plant names reconstruct, followed closely by Proto-Mayan, another lowland language, with an LD date of 2400 BP. Proto-Chinantecan and Proto-Mayan could have first begun to add names for these plants to their lexicons hundreds of years before these dates. Whatever the actual dates at which encoding commenced, names for the plants apparently began to be added to lexicons of lowland languages of Mesoamerica several millennia after such additions were first made to vocabularies of highland languages.

In Table 7, the average altitudes of plants (Table 2) are cross-tabulated against earliest LD dates attested for them in proto-languages (Table 1). Values of these two variables are given for (1) plants with average altitudes above or below 800 m, and (2) plants whose earliest LD date attestations are earlier or later than 3200 BP.

Table 7 Cross-tabulation of average altitudes of plants against earliest LD dates attested for them in ancestral languages of Mesoamerica

Table 7 shows a statistically significant (p < 0.01), very strong positive correlation (0.83) between plant altitude and earliest LD date of attestation. Plants with average altitudes greater than 800 m tend to show earliest LD date attestations that are older than 3200 BP. Ten of the 32 plants have average altitudes greater than 800 m. Of these, 8 or 80.0% show LD dates earlier than 3200 BP. Conversely, plants with average altitudes less than 800 m tend to have earliest LD date attestations younger than 3200 BP. Twenty-two of the 32 plants show average altitudes of less than 800 m. Of these 22 plants, 16 (or 72.7%) show LD dates later than 3200 BP. These statistics indicate that managed plants adapted to higher elevations tended to develop substantial importance for speakers of Mesoamerican languages before plants adapted to lower elevations.
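
The figures in this paragraph can be checked from the four cell counts it implies. The sketch below assumes that the reported correlation is gamma, which for a 2 x 2 table reduces to Yule's Q; the two cell counts not stated directly (2 and 6) are obtained by subtraction from the row totals of 10 and 22.

```python
# Cell counts recovered from the text.
high_early = 8               # above 800 m, earliest LD date older than 3200 BP
high_late = 10 - high_early  # above 800 m, younger than 3200 BP
low_late = 16                # below 800 m, younger than 3200 BP
low_early = 22 - low_late    # below 800 m, older than 3200 BP


def yules_q(a, b, c, d):
    """Yule's Q for the 2 x 2 table [[a, b], [c, d]] (equals gamma for 2 x 2)."""
    return (a * d - b * c) / (a * d + b * c)


q = yules_q(high_early, high_late, low_early, low_late)
pct_high_early = 100 * high_early / (high_early + high_late)
pct_low_late = 100 * low_late / (low_early + low_late)
print(round(q, 2), round(pct_high_early, 1), round(pct_low_late, 1))
# 0.83 80.0 72.7
```

Under this assumption the counts reproduce the reported coefficient of 0.83 as well as the two percentages.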

Table 6 shows that names for only a very few of the 41 plants have been reconstructed for the oldest proto-languages. The oldest language with reconstructed plant names, Proto-Otomanguean (7034 BP), had only five; the next oldest, Proto-Amuzgo-Mixtecan (4868 BP), five; the next, Proto-Totonacan-Mixe-Zoquean (4387 BP), four; and the next, Proto-Uto-Aztecan (3712 BP), three. With the passage of time, more and more plants became lexically recognized. Table 1 shows that by 4387 BP the number of different plants for which names are reconstructed for Mesoamerican languages in general increased from the original five to nine; by 3612 BP the number grew to 16; and by 3208 BP, to 21. Thus, from 7000 BP to around 3200 BP, a little more than half of the 41 plants became salient enough so that their names can now be reconstructed for Mesoamerican proto-languages.

From around 3200 BP to 2400 BP, 19 more plants developed substantial significance (as attested by plant name reconstruction). Thus, by 2400 BP, plant names can be reconstructed for a total of 39 or 95% of the 41 managed plants (see Table 1). It appears, then, that something of an explosion in the number of managed plants that became especially salient to speakers of Mesoamerican languages commenced sometime between 4000 BP and 3000 BP. This explosion is graphically apparent in Fig. 1, which plots the number of cultivated/protected plants named in Mesoamerican languages in general against years BP.
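The change of pace can be made concrete with simple arithmetic on the cumulative totals quoted in the text (5 plants by 7034 BP, 21 by 3208 BP, 39 by about 2400 BP). This is a rough sketch only: the function name and the choice of these three milestones are illustrative assumptions, and LD dates are latest-possible dates, so the resulting rates are indicative rather than precise.

```python
# (LD date in years BP, cumulative number of plants with reconstructed names),
# taken from the totals quoted in the text; 2400 BP is an approximate endpoint.
milestones = [(7034, 5), (3208, 21), (2400, 39)]
TOTAL_PLANTS = 41


def plants_per_millennium(start, end):
    """Average number of newly named plants per 1000 years between two milestones."""
    (bp_start, count_start), (bp_end, count_end) = start, end
    return (count_end - count_start) / (bp_start - bp_end) * 1000


early_rate = plants_per_millennium(milestones[0], milestones[1])
late_rate = plants_per_millennium(milestones[1], milestones[2])

print(f"7034-3208 BP: {early_rate:.1f} plants per millennium")  # 4.2
print(f"3208-2400 BP: {late_rate:.1f} plants per millennium")   # 22.3

for bp, count in milestones:
    share = count / TOTAL_PLANTS * 100
    print(f"by {bp} BP: {count} of {TOTAL_PLANTS} plants ({share:.0f}%)")
```

On these figures the later period adds named plants at roughly five times the earlier average rate.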

Fig. 1 Association of the number of cultivated/protected plants named in Mesoamerican proto-languages in general with years before present

In summary, linguistic evidence indicates that sometime before 7000 BP some of the 41 managed plants began to develop considerable importance for speakers of languages in highland regions of Mesoamerica. At first, only a few such plants were involved: the oldest language for which terms for the plants reconstruct, Proto-Otomanguean, shows words for only five. From 7000 BP to 3200 BP, managed plants known to speakers of other languages of highland areas steadily increased in salience, but at a relatively slow pace, such that names can be reconstructed for only about half of the 41 plants in proto-languages spoken before 3200 BP (see Table 1). The next 800 or so years, from around 3200 BP to 2400 BP, witnessed several major developments: (1) the pace at which cultivated/protected plants increased in salience accelerated substantially, such that by the end of this relatively short period names for 95% of the 41 plants can be reconstructed for proto-languages of the region; (2) for the first time, cultivated/protected plants became especially important to speakers of lowland languages; and (3) most of the plants for which names are reconstructed for proto-languages of this period are those with average altitudes associated with lowland areas.

Expanded Interpretation

“Expanded interpretation” refers to analysis that takes into consideration broader implications of assembled linguistic data for the prehistory of agriculture in Mesoamerica than does the basic analysis. Conclusions reached in this section should be considered somewhat more tentative than those of the preceding discussion.

The linguistic evidence for the earliest plant management in Mesoamerica is the occurrence of terms for avocado, maguey, maize, nopal, and squash in the lexical inventory of Proto-Otomanguean.Footnote 17 This ancestral language was spoken at the latest some 7,000 years ago, probably somewhere in the highland area of southwestern Mexico where many Otomanguean languages are spoken today. Linguistic evidence does not preclude the possibility that these five plants were managed by peoples of Mesoamerica at a much earlier time.

For a period of about 3,800 years, from around 7000 BP to 3200 BP, highland groups slowly, but steadily, added managed plants to their inventories of important botanical resources. By around 3200 BP, 16 cultivated/protected plants in addition to the five noted above had become familiar and important to highland peoples of Mexico, who spoke Proto-Mixtecan, Proto-Zapotecan, and Proto-Otopamean. These were anona, black sapote, cacao, chayote, chili pepper, common bean, cotton, epazote, mamey, manioc, quintonil, sweet potato, tejocote, tobacco, tomate, and white sapote.Footnote 18

Circa 3200 BP,Footnote 19 and thereafter, major developments in Mesoamerican agriculture occurred. First, the pace at which managed plants became salient to people accelerated substantially. Between that date and 2400 BP, groups increased the Mesoamerican inventory of especially important managed plants to nearly twice the number known earlier to people of the region. The plants added were achiote, chicozapote, chipilin, common purslane, copal tree, coyol palm, cuajinicuil, guacimo, guava, hog plum, jonote, lima bean, nanche, pacaya palm, papaya, pineapple, pitahaya, ramon, and zapote amarillo. The vast majority of these plants grow most successfully and abundantly in the lowlands, where many were probably first cultivated or protected (as indicated by the average altitudes for these plants given in Table 2).

These botanical additions reflect a second major agricultural event that occurred after circa 3200 BP, the development of plant management as a primary means of food procurement for groups of lowland Mesoamerica. Proto-Chinantecan and Proto-Mayan provide the earliest evidence for this development (see Table 6). These two ancestral languages, spoken at the latest around 2400 BP, both had lowland homelands probably located somewhere in the Gulf/Caribbean coastal plain of Mesoamerica where some of their offspring languages are spoken today. Table 6 shows that 24 of the 41 plants were named in Proto-Chinantecan and 32 in Proto-Mayan.

As discussed above, Proto-Chinantecan’s parent language, Proto-Otomanguean, was almost certainly spoken in a highland habitat. This suggests that speakers of pre-Proto-Chinantecan probably moved from the highlands to the lowlands, bringing with them an agrarian technology originally honed in cold country. As no Mesoamerican proto-language has been identified as ancestral to Proto-Mayan, we have no indication of where the immediate ancestors of Proto-Mayan speakers might have been located. However, given linguistic indication of the beginnings of Mesoamerican agriculture in the highlands, pre-Proto-Mayan peoples, like speakers of pre-Proto-Chinantecan, might have moved from a highland area to the lowlands. It is also possible that the immediate ancestors of Proto-Mayan speakers were never situated in the highlands, and that Proto-Mayan speakers acquired at least some of their agrarian technology from highlanders through contact. This scenario would not preclude independent development of lowland agricultural resources by these people.

Whatever the details, by around 2400 BP speakers of both Proto-Chinantecan and Proto-Mayan practiced agriculture in lowland Mesoamerica, and probably had done so for hundreds of years preceding 2400 BP. The relatively large numbers of cultivated and protected plants named in these ancestral languages suggest that speakers of both languages lived in settled farming communities, perhaps the earliest such settlements found in the Gulf/Caribbean coastal plain of Mesoamerica.

Proto-Mayan’s inventory of named managed plants is considerably larger than that of Proto-Chinantecan, 32 compared to 24. The next largest inventory for a proto-language older than Proto-Chinantecan (2455 BP) and Proto-Mayan (2400 BP) is that of Proto-Popolocan (2659 BP), a highland language with names for 21 of the 41 plants (see Table 6). No older proto-language shows more than 15 named plants (see Table 6). In addition, only one of the 30 proto-languages, Proto-Tzeltalan (870 BP), shows a larger inventory than that of Proto-Mayan (34 vs. 32, see Table 6). The average number of plants named in the 68 contemporary Mesoamerican languages surveyed for this study is 26.6 (Table 3); thus, Proto-Mayan’s inventory is clearly large even by the modern Mesoamerican standard. The large size of Proto-Mayan’s assemblage of named managed plants strongly suggests that its speakers were fully engaged in a village-farming way of life, probably surpassing in size and sophistication that enjoyed by their lowland contemporaries, speakers of Proto-Chinantecan.Footnote 20

Eight other lowland proto-languages emerged after around 2400 BP. Four of these are descendants of Proto-Mayan: Proto-Yucatecan (972 BP) and Proto-Greater Tzeltalan (1565 BP), and the latter language’s two immediate daughter languages, Proto-Cholan (1223 BP) and Proto-Tzeltalan (795 BP). Speakers of these four languages were lowlanders (see Table 6), all of whose ancestors had probably been lowlanders since Proto-Mayan times. Modern descendants of speakers of these languages, with the notable exception of descendants of Proto-Tzeltalan speakers, still occupy lowland areas.

Four other languages of the eight are non-Mayan: Proto-Totonac (1081 BP) and Proto-Core Mixe-Zoquean (1492 BP), and the latter language’s two immediate daughter languages, Proto-Zoque (1081 BP) and Proto-Mixe (967 BP). Proto-Totonac is an immediate descendant of a highland language, Proto-Totonacan (1598 BP). Thus, speakers of pre-Proto-Totonac moved from a highland habitat to the lowlands. The three other lowland proto-languages are descended from prehistoric languages that are indeterminate with respect to general elevation of homeland. Consequently, while it is possible that ancestors of speakers of these lowland languages moved into hot country from highland locations, this migration cannot be confidently proposed at present.

Two dialects of Proto-Mayan developed into immediate daughter languages that remained in the lowlands, i.e., Proto-Greater Tzeltalan (1565 BP) and Proto-Yucatecan (972 BP). Speakers of other Proto-Mayan dialects left the lowlands and migrated to the highlands. These migrations could have occurred several hundred years before the breakup of Proto-Mayan (2400 BP). The first to move to the highlands were ancestors of speakers of Proto-Eastern Mayan (1614 BP), whose homeland almost certainly was in Highland Guatemala, where all of its modern offspring languages are now spoken. (Proto-Eastern Mayan developed into Proto-K’ichee’an [1342 BP] and Proto-Mamean [1450 BP], both of which had highland homelands [see Table 6].) This migration was followed by that of ancestors of speakers of Proto-Greater Q’anjob’alan (1406 BP), who moved from the lowlands into the highlands of Chiapas (Mexico) and abutting highland areas of Guatemala. Today all speakers of Greater Q’anjob’alan languages are confined to the highlands. Finally, in relatively recent times, ancestors of speakers of modern Tzeltal and Tzotzil moved into Highland Chiapas. (Proto-Tzeltalan [795 BP], the immediate parent language of Tzeltal and Tzotzil, was spoken in the lowlands [Table 6].)Footnote 21

By 1968 BP, names of 40 of the 41 cultivated or protected plants of the sample are reconstructed for at least one Mesoamerican ancestral language (see Table 1). One of the 41 plants, annual sunflower, is not named in any proto-language of the region, suggesting that it became part of the general Mesoamerican inventory of managed plants only in very recent times. The plant may have even been a post-contact introduction to the region.

Conclusion

This study uses linguistic evidence to investigate the development of agriculture in prehistoric Mesoamerica. This evidence indicates that avocado, maguey, maize, nopal, and squash were among the first plants to have been cultivated or protected by native Mesoamericans. This occurred, at the latest, by around 7000 BP somewhere in the highlands of Mesoamerica, probably in southwestern Mexico. From 7000 BP to 3200 BP, Mesoamericans of highland areas slowly but steadily added other managed plants to their inventories of important botanical resources. These included anona, black sapote, cacao, chayote, chili pepper, common bean, cotton, epazote, mamey, manioc, quintonil, tejocote, tobacco, tomate, sweet potato, and white sapote. Beginning circa 3200 BP, the pace at which additional cultivated/protected plants developed in importance accelerated substantially. During the relatively brief period of around 800 years from circa 3200 BP to 2400 BP, the number of managed plants acquiring considerable importance for Mesoamericans nearly doubled. Also at this time, people of lowland areas began to manage useful plants intensively. Plants gaining substantial importance at this time were, for the most part, those adapted to lowland habitats. These included achiote, chicozapote, chipilin, common purslane, copal tree, coyol palm, cuajinicuil, guacimo, guava, hog plum, jonote, lima bean, nanche, pacaya palm, papaya, pineapple, pitahaya, ramon, and zapote amarillo. By around 2000 BP, the modern Mesoamerican assemblage of important managed plants was all but fully established in the region, with the sole exception of annual sunflower.

The first Mesoamericans of the lowlands for whom linguistic evidence of plant cultivation and protection exists were speakers of Proto-Chinantecan and Proto-Mayan. These contemporaneous ancestral languages were spoken at the latest around 2400 BP, probably in areas of the Gulf/Caribbean coastal plain of Mesoamerica.Footnote 22 Names for large numbers of managed plants are reconstructed for these ancestral languages, indicating that their speakers lived in settled agricultural communities, perhaps the earliest such communities of the coastal plain. The especially large plant inventory reconstructed for Proto-Mayan suggests that its speakers had a full village-farming way of life.