1 Introduction

The American continents have a unique history in the context of human dispersal across the world. There is no evidence that any other hominids lived or evolved in the American continents prior to the arrival of the first Americans during the glacial age. It is thought that the American continents received several migration waves of Native American ancestors from the Eurasian continent; entry and settlement were not a single event (Reich et al. 2012). According to the most widely accepted theory, the first Americans entered the American continents at the end of the glacial age approximately 15,000 years ago (15 KYA), and then dispersed across both continents within 2000 years, which is an unusually short period of time for human migration during the prehistoric age. Because the migration was recent and rapid, there was not enough time for evolutionary alterations to arise and become established in the genomes of Native Americans within the American continents. As such, research is focused on: (1) the region of the Eurasian continent from which they originated; (2) the migration routes for dispersal across the American continents; and (3) how Native Americans lived and adapted to their new environment.

After settlement in the American continents, and through the Stone Age, Native Americans had produced some highly developed civilizations by the Age of Exploration, around the fifteenth to the seventeenth century. Facing expeditions from the European countries of that age, some of these civilizations lost their independence and identity. In addition, an enormous number of people were transported from Africa to the Americas through the slave trade. Globalization at the end of the twentieth century has integrated indigenous peoples and cultures into the global economy. Because of these complex histories, the American continents are racially and culturally more diverse than other areas of the world, and ethical and human rights problems related to unresolved racial issues still persist. In other words, it is difficult for Native Americans to maintain their original cultures and their pedigrees as indigenous populations.

The release of the human genome draft sequence in 2001 (Lander et al. 2001) and the progress of the International HapMap Project (The International HapMap Consortium 2005; The International HapMap Consortium 2007; The International HapMap 3 Consortium 2010), which has identified SNPs within the human genome, dramatically changed the approaches and strategies for human genome analysis. We are now conducting this research in the midst of the era of Post-Human Genome Sequencing. Before the completion of whole genome sequencing, it was difficult to identify genomic loci to be analyzed because we did not know precisely which loci retained polymorphisms, or how diverse the loci were among individuals, tribes, or races. Highly polymorphic regions of the mitochondrial genome, the HLA region, or the genomic regions coding the cytochrome gene family (CYPs) were traditionally used for comparative analyses. In the Post-Human Genome Sequencing era, we already have information about polymorphic loci and about the linkage pattern of each region of the human genome that has been passed down for generations. Because analysis platforms such as DNA microarrays, microsatellite probes, and even genomic sequences obtained by Next Generation Sequencers (NGSs) are supplied by companies and public databases, researchers can do whole genome analysis either by themselves or through outsourcing. With this increased access to substantial genomic resources, many scientists without a specialization in human genetics have entered this field of research. The influx of new perspectives and techniques has helped to advance the field, but there are limitations with respect to the interpretation of these data. Rigorous interpretation requires deep insight into the history of, and relationships among, the tribes in order to obtain an overview of genetic admixture at each site, which is crucial for deciphering the results of genome analyses. These limitations should be considered when reviewing the literature.
Moreover, because of the complex history, serious ethical issues related to unresolved racial issues still persist as described above. When we analyze genomes of Native Americans, we should consider these ethical issues carefully.

2 Peopling of the American Continents During the Prehistoric Age

The American continents are the farthest destination of the Great Journey of Homo sapiens since leaving Africa. All human skeletal remains found in the Americas are anatomically modern Homo sapiens, and there is no evidence that other hominids had lived here before the modern humans entered the continents. The entry of modern Homo sapiens onto the North American continent was estimated at 15 KYA, a timeframe too recent for evolutionary change to have occurred locally. Therefore the main research objectives are:

  • From what part of the Eurasian continent did Native Americans originate?

  • What routes did they travel to disperse over American continents?

  • How did they establish their lives and adapt to their new environment?

2.1 Environment

The first Americans entered Alaska during the last ice age. At that period, the Bering Strait between Alaska and Eastern Siberia was land-bridged because the sea level was far lower than at present. This land bridge is called Beringia and it joined the two continents between 65 and 36 KYA. The presence of Beringia was affected by sea level, and the Bering Strait opened again between 30 and 13 KYA. It is thought that Beringia was not covered with ice but was tundra, allowing for the presence of many animals that could be hunted, including mammoth, bison, horses, and camels, enabling humans to adapt and survive in this environment by hunting and walking across Beringia from the Eurasian continent to the North American continent (Reviewed in Human Evolutionary Genetics).

The other factor affecting human migration was the ice sheets covering the North American continent. About 40 KYA, the Cordilleran (western area) and Laurentide (eastern area) ice sheets covered most areas of North Canada. The height and areas occupied by the ice sheets, like the sea level, also changed during the last ice age. Animals, including humans, could not pass through the ice sheets when they grew. In the warmer period, the volume of the ice sheets decreased enough to open ice-free corridors along the Pacific coast and Plains east of the Canadian Rockies, and many species spread from Alaska to the south, or vice versa. Precise calculation of timing for the opening of the coastal and interior corridors is still difficult because various Cordilleran glaciers behaved differently. Figure 11.1a indicates the fluctuation of sea levels during the past 24 KY since the last glacial period. At the “last glacial maximum,” sea level was lower than present by approximately 120 m. Sea level did not rise constantly with the passage of time, but instead rose rapidly every 400–500 years during a “Meltwater Pulse.” Sea level rose between 16 and 25 m during each Meltwater Pulse. Meltwater Pulse 1A was one of the highest rates of post-glacial sea level rise. It is thought that the coastal corridor opened by at least 15 KYA, whereas the interior corridor opened between 14 and 13.5 KYA (Mandryk et al. 2001; Dyke 2004) (Fig. 11.1b).

Fig. 11.1

Fluctuation of sea levels, and formation of the Beringia land bridge and ice-free corridors. (a) Sea level change since the last glacial age. Image created by Robert A. Rohde/Global Warming Art (https://commons.wikimedia.org/wiki/File:Post-Glacial_Sea_Level.png). (b) Formation of Beringia and the ice-free corridors. White areas indicate glaciers. On the North American continent, the inland part between the two large glaciers and the Pacific coast were ice-free during warmer periods, as described in the text

2.2 First Humans of the American Continents

Before migration into Beringia, humans had to adapt and establish their lives in the severe environments of the Arctic area (Siberia in the Eurasian continent). By 32 KYA, people survived using some stone artifacts. The Yana Rhinoceros Horn site, which is located along the Yana River in the northwest of Beringia, provides some historical information of this time (Pitulko et al. 2004). The earliest well-verified archeological evidence was found in central Alaska at Swan Point, from eastern Beringia approximately 14 KYA (Holmes and Crass 2003). The features of their artifacts are similar to the ones found in the late Upper Paleolithic remains in central Siberia.

Representative examples of sites containing Paleo-Indian remains are shown in Fig. 11.2. Most of the remains assessed south of the Canadian ice sheets were Clovis, dated 13.2–13.1 KYA, indicating the presence of the Clovis people in the North American continent as mobile hunter-gatherers (Haynes 2002). The simultaneous existence of Clovis cultures across the North American continent indicates that they expanded around the continent quite rapidly, within a few hundred years. They hunted mammoth and mastodon regularly, and it is thought that they were associated with the extinction of these big animals in the North American continent. The Clovis culture, however, was soon replaced by other styles of artifacts. In Central and South America, in contrast, few Clovis remains have been found (Morrow 2006). Many researchers now accept that Clovis and other people(s) lived in the American continents by 13 KYA. Because of the abundance of their remains, the “Clovis-first” hypothesis has persisted, which suggests that the Clovis people were the first humans to colonize the American continents, and then migrated south through an ice-free corridor. Were the Clovis people really the first Americans? Although some candidate pre-Clovis remains have been found in both American continents, the reliability of most of them is low. Among them, the Monte Verde site in Chile, from 14.6 KYA, is widely accepted as a pre-Clovis site (Dillehay 1997) and supports a model of pre-Clovis migration along the Pacific coast (Reviewed by Dixon 2001). In addition, a variety of artifacts have been found in these areas, both on the Pacific coastal side and in the Amazonian area (Fig. 11.2). For example, the remains from the cave “Caverna da Pedra Pintada,” near Monte Alegre in Amazonian Brazil, contain rock paintings and biological remains estimated to date from 13.2 to 12.5 KYA (Reviewed in Human Evolutionary Genetics 2nd Edition). The styles of artifacts in Central and South America are distinct from Clovis, although they date to roughly the same time.

Fig. 11.2

Location of archeological remains mentioned in the text

Another question related to Paleo-Indian evolutionary history is whether they are ancestors of recent Native Americans or not. This will be discussed later in Sect. 11.3.

3 Genetic Analysis of Contemporary Native Americans

“The history of the earth is recorded in the layers of its crust; the history of all organisms is inscribed in the chromosomes” (Reviewed by Crow 1994). These are the words of the world-famous plant geneticist Dr. Hitoshi Kihara, who established the concept of the “genome” as the minimum set containing all the essential genes. In the previous section, we discussed the remains of civilizations buried in layers of earth. In this section, we will consider the history of humans inscribed in our genome.

Three different genomic resources are used to analyze the history of humans: mitochondrial DNA (mtDNA), Y-chromosome markers, and autosomal chromosomes. As paternal mitochondria are selectively eliminated after fertilization in most animals including humans (Reviewed by Song et al. 2014), the maternal lineage of a person can be tracked with mtDNA. The paternal lineage can be traced with the Y-chromosome because females do not have a Y-chromosome. Both mtDNA and the Y-chromosome are small in size; therefore, their polymorphic regions were identified long before the completion of human whole genome sequencing and have been used extensively for genetic analysis. Some regions on the autosomal chromosomes are also highly polymorphic, and were utilized in addition to mtDNA and the Y-chromosome for investigations of the human genome. Those regions include HLA, cytochrome P450 genes (CYP), and repeat numbers in microsatellites or minisatellites. After whole genome sequencing was complete, a number of additional polymorphic markers were identified and the accuracy of genome analysis improved. Since then, along with the International HapMap data for SNPs (The International HapMap 3 Consortium 2010), differences inscribed within the entire genome have become potential targets for genetic analysis.

All three genome resources indicate that contemporary Native Americans came from Asia (Merriwether 2006; Karafet et al. 2006). The genome analysis of Native Americans, however, faces some serious problems. The first is a regional issue. Native Americans living near big cities or urbanized areas, in constant contact with other cultures, tribes, and races, tend to admix and lose their genetic purity. Those who are pure Native American live in isolated regions that are hard to access, making it difficult to collect biomaterial as a source of DNA samples. Globalization in recent years has opened up isolated areas, resulting in the movement of young people to big cities; consequently, old villages are gradually disappearing. Admixture with people from the outside world is inevitable under globalization. The second problem is associated with historical issues specific to the American continents. In contrast to the racial clashes and ethnic feuds among people living in neighboring areas of the Eurasian continent, completely different groups of people came to the American continents from other continents, carrying far more advanced weapons, around the fifteenth to the seventeenth century. These movements caused the destruction of some Native tribes and promoted genetic admixture of the Native Americans with those people. When we analyze the genomes of Native Americans, we should take care to remove the samples that have undergone such “artificial” admixture and select the “pure genome” as much as possible. Based on this perspective, genome resources of Native Americans obtained from the USA are not commonly utilized for genome analysis.

3.1 Genome Analysis with mtDNA and the Y Chromosome

Recent analysis showed that mtDNA of all humans can be tracked back to only one sequence. This sequence is referred to as “mitochondrial Eve,” which is the maternal most recent common ancestor (MRCA) for all modern humans (van Oven and Kayser 2009; Behar et al. 2012; http://www.phylotree.org/tree/main.htm). Similar to mtDNA, the origin of the Y chromosome also can be traced back to a common ancestor (International Society of Genetic Genealogy, http://www.isogg.org/). During a migration that covered tens of thousands of years after leaving Africa, mutations have been accumulating within genomic DNA including mtDNA and the Y chromosome. The route of migration can be visualized as a phylogenetic tree of mtDNA and Y chromosome lineage as shown in Figs. 11.3a and 11.4a (http://www.phylotree.org/tree/index.htm, http://www.scs.illinois.edu/~mcdonald/WorldHaplogroupsMaps.pdf for mtDNAs; http://www.isogg.org/tree/ISOGG_YDNATreeTrunk.html, http://www.scs.illinois.edu/~mcdonald/WorldHaplogroupsMaps.pdf for Y chromosomes).

Fig. 11.3

Variation of mtDNA. (a) mtDNA lineage from MRCA. (b) World distribution of mtDNA haplogroup. These figures are reprinted with permission from the copyright holder. The original figures are in the “WorldHaplogroupsMaps” (http://www.scs.illinois.edu/~mcdonald/WorldHaplogroupsMaps.pdf)

Fig. 11.4

Variation of Y chromosome markers. (a) Y chromosome lineage from common ancestor. (b) World distribution of Y chromosome haplogroup. These figures are reprinted with permission from the copyright holder. The original figures are in the “WorldHaplogroupsMaps” (http://www.scs.illinois.edu/~mcdonald/WorldHaplogroupsMaps.pdf)

A group of people sharing the same series of mutations on their mitochondrial genome or Y chromosome is called a haplogroup. Recent research has revealed the precise world distribution of haplogroups of mtDNA and the Y chromosome (Figs. 11.3b, 11.4b). For mtDNA, haplogroup L is only observed in Africa, whereas the distribution pattern of haplogroups in the Eurasian continent is complex. Modern Native Americans retain simple distribution patterns compared to the Eurasian complexity; these include mtDNA haplogroups A, B, C, D, and X, all of which are found among indigenous peoples in southern Siberia, from the Altai to the Amur region (Derenko et al. 2001; Starikovskaya et al. 2005; Zegura et al. 2004a, b). Within these haplogroups, three subclades of the C1 sub-haplogroup are widely distributed across North, Central, and South America, whereas they are absent in Asia. This suggests that the subclades were established after settlement of the American continents (Tamm et al. 2007). The distribution of the Y chromosome shows patterns similar to mtDNA; haplogroups A, B, and E are mainly observed in Africa, and Europe shows a complex pattern of haplogroups. The proportion of Native Americans retaining haplogroups C and Q is large; the genetic variation of the Y chromosome is extremely small even when compared to the diversity in Asia. Outside the Americas, haplogroup Q is observed only in the northeast areas of the Eurasian continent. Moreover, as a result of admixture with European colonists and African slaves, persons retaining haplogroups R and E, which are dominant in Europeans and Africans, respectively, are commonly observed (Zegura et al. 2004a, b; Stefflove et al. 2009).
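The idea of a haplogroup distribution can be made concrete with a small sketch. The Python snippet below tallies the relative frequency of each haplogroup in a list of per-individual labels. The function name `haplogroup_frequencies` and the sample list are hypothetical and purely illustrative; the labels are not real survey data, and the cited studies may have tabulated their results differently.

```python
from collections import Counter


def haplogroup_frequencies(labels):
    """Return {haplogroup: relative frequency} for per-individual labels."""
    if not labels:
        raise ValueError("need at least one sample")
    counts = Counter(labels)
    total = len(labels)
    return {haplogroup: n / total for haplogroup, n in counts.items()}


# Hypothetical toy sample of mtDNA haplogroup calls (NOT real data).
toy_sample = ["A", "A", "B", "C", "C", "C", "D", "X"]
toy_freqs = haplogroup_frequencies(toy_sample)

for haplogroup in sorted(toy_freqs):
    print(f"{haplogroup}: {toy_freqs[haplogroup]:.3f}")
```

Comparing such frequency tables between populations is one simple way to see the contrast drawn in the text between the few haplogroups found in Native Americans and the more complex Eurasian pattern.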

Analysis of remains and skulls of Paleo-Indians helped develop a hypothesis that they came to the American continent(s) first, and were then replaced by ancestors of modern Native Americans (Swedlund and Anderson 1999; Owsley and Jantz 2001). It has long been considered that contemporary Native Americans were established from several waves of prehistoric migrations (Greenberg et al. 1986a, b). Genetic data, however, do not support these hypotheses. As all major haplogroups of mtDNA and the Y-chromosome in Native Americans originated from central Asia and all haplotypes share a coalescent date, modern Native Americans might have spread from a single ancestral gene pool (Merriwether 2006; Zegura et al. 2004a, b; Wang et al. 2007). Recent analysis of mtDNA further suggests that current Native Americans descend from a founding population of fewer than 5000 individuals (Kitchen et al. 2008).

3.2 Genome Analysis with Nuclear DNA

During the last 10 years, genome-wide analysis of nuclear DNA has become feasible owing to the release of the whole genome sequence of humans and the accumulation of SNP data through the HapMap Project. The first genome-wide analysis of Native Americans was reported by Wang et al. in 2007, using 678 microsatellite loci (Wang et al. 2007). Following the microsatellite analysis, a genome-wide analysis with approximately 365,000 SNPs was also reported (Reich et al. 2012).

The results of the genome-wide SNP analysis showed that all Native Americans and northeast Siberians (Chukchi, Naukan, and Koryak) diverged from the Asian population (Fig. 11.5). This model is consistent with mtDNA analysis (Kitchen et al. 2008). The results also revealed that the Arctic population first separated from an Asian population, and then northern North American, North American, Southern Mexican, Lower Central American, and finally South American populations branched off one by one. This suggests that prehistoric migration proceeded from north to south. Genetic variation data also support this migration pathway because the variation decreases with increasing distance from the Bering Strait. Contrary to the generally accepted hypothesis in which the ancestors of Native Americans passed through ice-free corridors toward the south, as mentioned in Sect. 11.2.2, the genome-wide SNP data showed that the correlation between genetic diversity and the distance from the Bering Strait is explained most parsimoniously if the ancestors migrated south along the coastline.
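The relationship between genetic diversity and distance from the Bering Strait can be illustrated with a minimal sketch. The Python code below computes a Pearson correlation coefficient between distance and a diversity measure such as heterozygosity. All numbers are made-up toy values chosen only to show the calculation, the names `pearson_r`, `distance_km`, and `heterozygosity` are assumptions, and this is not a reproduction of the analysis in the cited studies.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Hypothetical toy values (NOT real data): distance of each sampled
# population from the Bering Strait and its mean heterozygosity.
distance_km = [500, 3000, 6000, 9000, 13000]
heterozygosity = [0.70, 0.68, 0.66, 0.63, 0.59]

r = pearson_r(distance_km, heterozygosity)
print(f"r = {r:.3f}")  # strongly negative for these toy values
```

In the same spirit, one could compute the correlation twice, once with distances measured along a coastal route and once along an inland route, and compare the fit; how those route distances are measured would itself be a modelling assumption.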

Fig. 11.5

Phylogenetic tree to search ancestors of Native Americans. (a) The locations from where samples were collected. (b) Phylogenetic tree to search ancestors of Native Americans. These figures are reprinted with permission from the copyright holder. The original figures are in the article, Reich et al. Nature, 488, 370, 2012 (doi: https://doi.org/10.1038/nature11258)

Do genome-wide analyses provide information on the number of prehistoric immigration waves to the American continents? Contemporary Native Americans have a wide variety of languages, as shown in Fig. 11.6, including over 200 dialects. These languages are divided into three major independent language groups: Eskimo-Aleut (the Arctic area including Alaska, Canada, and Greenland), Na-Dene (mainly in Canada), and Amerind (mainly in South America and the USA). Based on this classification, many linguists consider that there were three waves of migration (Greenberg et al. 1986a, b). Analysis of mtDNA or the Y chromosome has not yet clarified whether there was a single migration wave or multiple waves. The genome-wide SNP data described above supported a model of at least three episodes of gene flow into the American continents from the Asian population; nearly 90% of the Amerind tribes analyzed originated only from “the First American,” whereas 50% of the ancestry of Eskimo-Aleut speakers was attributed to a second wave of gene flow from Asia. Interestingly, 10% of the genome of the Chipewyan, who live in Canada and speak a Na-Dene language, derives from a third wave. These admixture patterns are consistent with the geographic location and the distribution of the language groups. The higher admixture rate in the Arctic area may be the result of a backward migration from the North American continent to northeast Siberia.
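To give a feel for what an ancestry proportion such as "10% from a third wave" means numerically, the following Python sketch uses a simple two-source linear mixture model: at each locus, the allele frequency in the admixed population is assumed to be a weighted average of the frequencies in two source populations, and the weight is estimated by least squares. This is a textbook simplification offered as an assumption, not the method used in the cited genome-wide study; the function name and all allele frequencies are hypothetical.

```python
def estimate_admixture_fraction(mixed, source_a, source_b):
    """Least-squares estimate of alpha in mixed = alpha*A + (1 - alpha)*B.

    Each argument is a list of allele frequencies, one value per locus.
    The estimate is clipped to the valid range [0, 1].
    """
    if not (len(mixed) == len(source_a) == len(source_b)) or not mixed:
        raise ValueError("need three non-empty lists of equal length")
    numerator = 0.0
    denominator = 0.0
    for f, pa, pb in zip(mixed, source_a, source_b):
        diff = pa - pb
        numerator += (f - pb) * diff
        denominator += diff * diff
    if denominator == 0:
        raise ValueError("source populations are identical at all loci")
    alpha = numerator / denominator
    return min(1.0, max(0.0, alpha))


# Hypothetical allele frequencies at four loci (NOT real data).
first_wave = [0.10, 0.80, 0.35, 0.60]
later_wave = [0.50, 0.20, 0.75, 0.30]

# Simulate a population with 90% first-wave and 10% later-wave ancestry.
true_alpha = 0.9
admixed = [true_alpha * a + (1 - true_alpha) * b
           for a, b in zip(first_wave, later_wave)]

alpha_hat = estimate_admixture_fraction(admixed, first_wave, later_wave)
print(f"estimated first-wave ancestry: {alpha_hat:.2f}")
```

Real analyses work with hundreds of thousands of SNPs, sampling noise, and drift since admixture, so the estimate would carry uncertainty that this noise-free toy example does not show.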

Fig. 11.6

Language groups in the North and the South American continents. (a) Language groups in the North American continent. (b) The South American continent. These figures are reprinted with permission from the copyright holders. The original figures are in “Sekai minzoku jiten,” Edited by Tsuneo Ayabe, Kobundo, Tokyo, 2000; written in Japanese. A contains a slight modification to the original figure

Compared to the dynamic genetic admixture around the Arctic area and the northern part of the North American continent, gene flow among Central and South American tribes is very low. Three cases in these areas are of special note: Chibchan speakers, Meso-American populations, and gene flow from other continents. Chibchan is one of the language families in South America, and people who speak these languages live on both sides of the Isthmus of Panama. Their genomes retain both Northern and Southern genetic characteristics, suggesting admixture of both populations by a “back migration” after the establishment of subpopulations. Meso-American populations have an extremely low level of genetic drift, indicating a large effective population size since the settlement by ancestors of Native Americans. In contrast to the Meso-American populations, South American tribes in the coastal region retain genetic features of both Europeans and Africans owing to admixture with those people after the Age of Exploration began around the fifteenth century.

Progressive advances in NGSs in recent years enable us to sequence the genomes of ancient humans on a large scale. NGSs are a powerful tool for analyzing genetic alteration directly in chronological order. In fact, whole genome sequencing of a child (MA-1), who lived 24 KYA in Mal’ta, south-central Siberia, was recently completed (Raghavan et al. 2014). The sequence data showed that the MA-1 genome did not originate from eastern Asia but was closer to western Eurasians and Native Americans. These results strongly suggest that the origins of Native Americans lie not only in eastern Asia but also in south-central Siberia. Present calculations estimate that 14–38% of the genome of Native Americans is derived from south-central Siberians, and that the gene flow occurred after Native Americans branched from Asian populations. In addition to this discovery, other exciting results were obtained from another ancient human genome. A male infant (named Anzick-1), who was buried with Clovis tools at the Anzick site around 12,707–12,556 calendar years BP (Before Present), showed the same gene flow patterns as MA-1, indicating that the gene flow from south-central Siberia had occurred by 12.6 KYA (Rasmussen et al. 2014). Interestingly, Anzick-1 was genetically closer to Native Americans than to any other modern people. The results of lineage analysis among Northern Amerind, Southern Amerind, and Anzick-1 revealed that the divergence between Northern and Southern Amerind was much older than the divergence of Anzick-1. How much alteration have the genomes of Native Americans experienced since they branched from Asian populations? We do not have an exact answer to this question at present.

When we focus on adaptation events, there is no evidence of evolutionary change specific to Native Americans. It is likely that the time period since their ancestors migrated to the American continents is too short for any genetic alterations to have occurred and persisted in the population. The current progress of NGS analysis will unveil many events that we have not been able to address so far.

The field of Human Genetic Research is now entering a new phase: mining large datasets. Consequently, many scientists are entering into this research field with backgrounds far from Human Genetics, as mentioned in Sect. 11.1. However, deep historical insight in combination with a holistic vision of the relationships among tribes is essential to decipher the results of genome analyses. We should not forget this important point or the data would mislead us.

4 Genetic Complexities and Ethical Issues on Genome Analysis of Native Americans

Genome data are necessary for the analysis of population genetics. Generally, population geneticists utilize genome data deposited in public databases, or collect human biomedical materials by themselves on site. Collection of samples presents some challenges: it takes a long time, sometimes exposes the collector to danger, requires laborious negotiation to obtain informed consent, and requires special handling and equipment to maintain the samples in good condition for the extraction of genomic DNA. To avoid these problems and facilitate access to genome data, some public resource centers have in recent years released a wide variety of population samples for research, including genomic DNAs, cell lines, and other biomaterials. The Human Genome Diversity Project Cell Line Panel (HGDP-CEPH; http://www.cephb.fr/en/index.php), the Coriell Institute (https://catalog.coriell.org/1/NHGRI), and the RIKEN BioResource Center (RIKEN BRC; http://www.brc.riken.jp/lab/cell/english/), for instance, maintain vast collections from various populations. These resource centers receive resources from the researchers who collected them, preserve their quality, and provide them to other researchers working at academic research institutes.

Consideration of genetic background must be integrated into genome analyses; for example, we must consider whether the target individuals or tribes are unadmixed or admixed and, if they are admixed, when the admixture occurred. Genetic admixture is closely linked with history. Genetic complexity is higher in the American continents than in the Eurasian continent. Many people retain genetic background from both Native Americans and Europeans, and even from Africans. This is due to their long and complex history, and this genetic complexity makes genome analysis difficult and confusing.

Native Americans had developed several cultures in both the North and the South American continents thousands of years ago. Highly developed civilizations with large cities had been built, especially in Central and South America, by the sixteenth century, when Spanish and Portuguese explorers intruded into the American continents during the Age of Exploration. The Europeans then conquered the American world, imported African slaves as laborers, and established a social hierarchy of victor and vanquished. During this time, a large number of Native Americans died from diseases that were brought from the Old World and to which they had never been exposed. Because of this history, racial mingling has progressed gradually over many generations.

Within the history of the American continents, Native Americans have often been the targets of racial discrimination (Summarized by Burger 1990). We should be very careful that the results of genome analyses are not used to promote discrimination. Some organizations and resource centers take countermeasures against these ethical issues: HGDP-CEPH will not provide its resources to profit-oriented organizations and users, and the Coriell Institute requests that users disclose the purpose of their research so that it can be evaluated by an ethical committee outside the Coriell Institute. In the case of RIKEN BRC, the office requests evidence of approval from an ethical review committee at the user's institution. As individual researchers, we should disclose the purpose of each project to the donors who provide the biomaterials used for genome analysis, and then obtain their written consent. Informed consent is considered one of the methods available to respect the human rights of donors.