SUNDAY REVELATIONS: When did the Aryans come to India?

It is difficult to deduce the direction of migration either into India or out of India during the Bronze Age

On June 17, The Hindu published an article by Tony Joseph (“How genetics is settling the Aryan migration debate”) on current genetic research in India, which stated that “scientists are converging” on an Aryan migration to the Subcontinent around 2000-1500 BC. This conclusion was based mainly on results from paternally inherited markers (the Y chromosome), published on March 23, 2017 in the scientific journal BMC Evolutionary Biology by a team of 16 co-authors, including Martin P. Richards of the University of Huddersfield. The team compiled and analysed Y chromosome data mainly from targeted South Asian populations living in the U.K. and U.S. However, anyone who understands the complexity of the Indian population will appreciate that Indians living outside the Subcontinent do not reflect the full diversity of India, as the majority of them are from caste populations drawn from a limited subset of regions.


A recent paper by Dhriti Sengupta and colleagues (Genome Biology and Evolution, 2016; 8:3460-3470) showed that the South Asian populations included in the “1000 Genomes Project” under-represent the genomic diversity of the Subcontinent. Tribes are among the founding populations of India; any conclusion drawn without studying them will fail to capture the complete genetic information of the Subcontinent.

Marina Silva/Richards et al. argued that the maternal ancestry (mtDNA) of the Subcontinent is largely indigenous, whereas 17.5% of the paternal ancestry (Y chromosome) is associated with the haplogroup R1a, an indication of the arrival of Bronze Age Indo-European speakers. However, India is a nation of close to 4,700 ethnic populations, including socially stratified communities, many of which have maintained endogamy (marrying within the community) for thousands of years, and these have hardly been sampled in the Y chromosome analysis led by Silva et al. That analysis therefore does not provide an accurate characterisation of R1a frequencies in India (several tribal populations carry a substantial frequency of haplogroup R1a).

Equally important to understand is that the Y chromosome phylogeny has been subject to genetic drift (lineage loss). Given the high level of endogamy in the Subcontinent, there is thus a greater chance of missing the less frequent R1a branches if one concentrates only on specific populations. These are extremely important factors to consider before drawing any strong conclusions about Indian populations. The statement by Silva et al. that 17.5% of Indians carry the R1a haplogroup actually means that 17.5% of the samples they analysed (from people living in the U.K. and U.S.) carry R1a, not that 17.5% of Indians carry R1a!
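The difference between a frequency in a sample and a frequency in a population can be made concrete with a small calculation. The sketch below is purely illustrative: the three groups, their R1a frequencies, and their shares of the population and of the sample are all invented numbers, not data from Silva et al. or any other study. It only shows that when groups are represented in a sample out of proportion to their share of the population, the sample frequency need not match the population frequency.

```python
# Hypothetical illustration only: the group names, population shares, sample
# shares and R1a frequencies below are invented, not taken from any study.

def weighted_frequency(groups, share_key):
    """Average R1a frequency, weighting each group by the chosen share."""
    total = sum(g[share_key] for g in groups)
    return sum(g["r1a_freq"] * g[share_key] for g in groups) / total

groups = [
    {"name": "group A", "r1a_freq": 0.30, "population_share": 0.20, "sample_share": 0.70},
    {"name": "group B", "r1a_freq": 0.10, "population_share": 0.50, "sample_share": 0.25},
    {"name": "group C", "r1a_freq": 0.05, "population_share": 0.30, "sample_share": 0.05},
]

# Frequency a study would report if its sample had the mix in "sample_share".
sample_estimate = weighted_frequency(groups, "sample_share")

# Frequency in the whole population, using each group's true share.
population_value = weighted_frequency(groups, "population_share")

print(f"frequency in the sample:     {sample_estimate:.4f}")
print(f"frequency in the population: {population_value:.4f}")
```

With these made-up numbers the sample figure is higher than the population figure, but the gap could equally run the other way, depending on which groups are over- or under-represented in the sample.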

Genetic affinities

Indian genetic affinity with Europeans is not new information. In a study published in Nature (2009; 461:489-494), scientists from CSIR-Centre for Cellular and Molecular Biology (CCMB), Hyderabad, and Harvard Medical School (HMS), U.S., using more than 5,00,000 autosomal genetic markers, showed that the Ancestral North Indians (ANI) share genetic affinities with Europeans, Caucasians and West Asians. However, there is a huge difference between this study and the one published by Silva et al.: the study by CSIR-CCMB and HMS included samples representing all the social and linguistic groups of India. The same Nature paper also showed that when the Gujarati Indians in Houston (GIH) were analysed for genetic affinities with different ethnic populations of India, the GIH formed two clusters in Principal Component Analysis (PCA), one with Indian populations and the other an independent cluster.

Similarly, a recent study (Neurology Genetics, 2017; 3:3, e149) by Robert D.S. Pitceathly and colleagues from University College London and CSIR-CCMB analysed 74 patients with neuromuscular diseases (of mitochondrial origin) living in the U.K. and found a mutation in the RNASEH1 gene in three families of Indian origin. However, this mutation was absent in Indian patients with neuromuscular diseases (of mitochondrial origin). The mutation had earlier been reported in Europeans, suggesting that these three families might have mixed with local Europeans, which highlights the importance of the source of samples.

Another study, published in The American Journal of Human Genetics (2011; 89:731-744) by Mait Metspalu and colleagues, in which CSIR-CCMB was also involved, analysed 142 samples from 30 ethnic groups and mentioned that “Modeling of the observed haplotype diversities suggests that both Indian ancestry components (ANI and ASI) are older than the purported Indo-Aryan invasion 3,500 YBP (years before present)” and that, “consistent with the results of pairwise genetic distances among world regions, Indians share more ancestry signals with West than with East Eurasians”.

We agree that the major Indian R1a1 branch, i.e. L657, is not more than 5,000 years old. However, the phylogenetic structure of this branch cannot be considered a derivative of either European or Central Asian lineages. The split from the European branch dates to around 6,000 years ago, and thereafter the Asian branch (Z93) gave rise to the South Asian L657, which is a sister branch of lineages present in West Asia, Europe and Central Asia. This kind of expansion, universally associated with most of the Y chromosome lineages of the world, as shown in 2015 by Monika Karmin et al., was most likely due to a dramatic decline in genetic diversity in male lineages four to eight thousand years ago (Genome Research, 2015; 4:459-66). Moreover, there is evidence consistent with the early presence of several R1a branches in India (our unpublished data).

The Aryan invasion/migration has been a topic of intense discussion for a long time. However, one has to understand the complexity of the Indian populations and select samples carefully for analysis. Otherwise, the findings could be biased and confusing.

With the information currently available, it is difficult to deduce the direction of haplogroup R1a migration either into India or out of India, although the genetic data certainly show that there was migration between the regions. Currently, CSIR-CCMB and Harvard Medical School are investigating a larger number of samples, which will hopefully throw more light on this debate.
