What My DNA Says About India’s History


I recently received the results of an ancestry test from 23andMe, a company that offers autosomal DNA testing. I was not surprised to learn that my recent ancestry, from the last three centuries, is almost exclusively South Asian. Because 23andMe does not have a large database of South Asian individuals, my ancestry was classified as “Broadly South Asian” without further detail; the South Asian sample, unlike the samples for people of European or African descent, is not yet large enough to trace my family history within South Asia.

Some assumptions, however, can be made, and they shed a fascinating light on the complex history of South Asian genetic and migratory patterns. Much of this is contrary to the simplistic narrative peddled by some members of the Hindu right today, who want to prove that “today’s Hindus are directly descended from the land’s first inhabitants many thousands of years ago, and make the case that ancient Hindu scriptures are fact not myth.” Furthermore, many Indians want to disprove a theory known as the Aryan invasion theory, which claims most modern Indians are the descendants of Central Asian Sanskrit-speaking invaders who entered the subcontinent some 4,000 years ago. While the DNA evidence aligns with the idea that the ancestors of most Indians have been in the subcontinent for at least the past 10,000 years, linguistic and genetic evidence paint a much more complex history.

Because India has been invaded so many times during recorded history, it would be ridiculous to assume that there is not at least some minor genetic imprint from other parts of the world. My own genetic history shows this: despite being almost 99 percent South Asian over the past 300 years, I have both a full-blooded English and a full-blooded “Yakut” ancestor from the 1700s.

The Yakut are a Turkic people who live in northeastern Siberia. I suspect that, because the sample of individuals from other Turkic ethnic groups is small, most genetic material from Central Asian Turkic peoples is usually placed under categories like “Broadly East Asian,” “Yakut,” or “Broadly Middle Eastern.” It would be reasonable for me to have a Turkic ancestor from the 1750s. The Asaf Jahi Dynasty that ruled as Nizams of Hyderabad was of Turkic ancestry, originating from modern Uzbekistan, and during the Mughal and post-Mughal eras, many Central Asian Turkic soldiers entered India.

Family lore insisted that my great-great-great-great grandmother was a white English woman (my grandfather recalled his grandfather telling him that his grandmother was white, which is the approximate limit of accurate oral transmission). This lore has proven to be accurate. It is not especially surprising that I have a European ancestor from the tumultuous period during which the Mughal Empire collapsed into smaller successor states, the Maratha Empire rose, and British and French armies competed for access to the various princes of India. During this period, unlike during the 19th-century British Raj, Europeans were not a dominant ruling class, isolated from the native population by walled compounds. Thousands of soldiers of fortune, advisors, and seamen mingled with the indigenous populations, including in the area of coastal Andhra Pradesh where my family is from, which came under British rule relatively early, in 1765.

Going further back, it is likely that some of my ancestors originated in north India, being members of the upper Brahmin caste, which has been endogamous (marrying only within its group) since the Gupta Era (320-605 CE). According to DNA evidence, “the transition in India from free intermarriage to endogamy took place about 70 generations ago; that is, about 1600 years ago.” With the post-Gupta rise of South Indian kingdoms such as the Rashtrakutas, Cholas, and Chalukyas, many Brahmins were invited south to serve as scribes, ritual priests, and administrators. Presumably, some assimilation and intermarriage still took place, given that my maternal DNA suggests that I have a female ancestor “who likely lived among the inhabitants of present-day southern India shortly over 35,000 years ago.”

The aforementioned fact derives from my maternal and paternal haplogroup information. According to 23andMe,

Haplogroup is the term scientists use to describe a group of Y-chromosome (or mitochondrial) sequences that are more closely related to one another than to others. The term haplogroup is a combination of haplotype and group. In this context, haplotype refers either to the DNA sequence of one’s mitochondrial DNA, which is inherited from one’s mother, or to the DNA sequence of one’s Y chromosome, which is passed from fathers to their sons. Due to their unusual pattern of inheritance, the mitochondrial DNA and the Y chromosome contain information about your maternal and paternal lines, respectively. But together, they make up less than 1% of all your DNA, and only represent a small fraction of your ancestry.

In other words, all that I know from my haplogroup information is that I have at least one maternal ancestor who lived in southern India 35,000 years ago, and at least one male-line ancestor who lived on the Russian steppe around 25,000 years ago, himself descended from a man who lived in the Middle East before that; the lineage may have spread through the expansion of the Yamnaya culture around 5,300 years ago, an event associated with the spread of Indo-European languages.
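To see why two haplogroups say so little about the rest of a family tree, it helps to count positions in the tree. The sketch below is only an illustration under simplifying assumptions: it counts genealogical positions rather than shares of DNA, it ignores the fact that distant ancestors often occupy several positions at once, and it assumes a male tester who carries both a mitochondrial and a Y-chromosome lineage. The function names are hypothetical and not part of any 23andMe tool.

```python
def ancestor_slots(generations_back: int) -> int:
    """Positions in a family tree at a given depth: 2 parents, 4 grandparents, and so on."""
    if generations_back < 0:
        raise ValueError("generations_back must be non-negative")
    return 2 ** generations_back


def haplogroup_share(generations_back: int) -> float:
    """Fraction of positions at that depth lying on the strict maternal or strict paternal line.

    Assumption: exactly one position per generation is on the mother's-mother's line
    and exactly one is on the father's-father's line.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    return 2 / ancestor_slots(generations_back)


if __name__ == "__main__":
    for generations in (1, 2, 5, 10):
        slots = ancestor_slots(generations)
        share = haplogroup_share(generations)
        print(f"{generations} generations back: {slots} positions, "
              f"{share:.2%} on the two haplogroup lines")
```

Ten generations back there are 1,024 positions in the tree, and only two of them lie on the lines that the haplogroups trace.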

My genetic information conforms to the emerging model of Indian genetic history, which attests to a mixing between two disparate groups in the subcontinent thousands of years ago. Scientists have dubbed these two groups Ancestral North Indian (ANI) and Ancestral South Indian (ASI), though the terms do not actually reflect the modern distribution of these ancestral populations: everyone in South Asia is a mix of ANI and ASI, and there are no unmixed groups, regardless of language, region, religion, or caste. However, the proportion of ANI is greater in the northwestern part of the subcontinent, and lower in the south.

People with ANI ancestry are “related to West Eurasians (people of Central Asia, the Middle East, the Caucasus, and Europe),” although they mostly diverged from other West Eurasians some 12,500 years ago, and may have lived on the fringes of the Middle East before migrating into the subcontinent as the region’s first farmers around 10,000 to 8,000 years ago. There has been some further migration from this region: 17.5 percent of Indian male lineage belongs to haplogroup R1a, which diversified from Central Asia only 5,800 years ago.

On the other hand, the ancient ASI population group was indigenous to the subcontinent, or dwelt there for at least 30,000 years; ASI was as “distinct from ANI and East Asians as they are from each other,” and was probably descended from the original eastward migration from Africa 65,000 years ago that gave rise to the “negritos” or “australoids” of Southeast Asia and the Australian Aborigines (most of these groups were later replaced by farmers originating from the Near East and East Asia). Both these groups thoroughly mixed around 4,000-2,000 years ago, so that today in South Asia, there are no “pure” ancestral groups. In all South Asian ethnic, caste, and religious groups, “ANI ancestry ranges from 39–71%.”

Thus, it is true that the majority of Indians are primarily descended from people who have been in South Asia for at least the last 10,000 years, but much of their ancestry comes from an ancient migration from the Middle East; in fact, Indians as a whole genetically cluster with Middle Easterners, with the ANI component mostly swamping the ASI component except among isolated tribal groups (adivasis) and the natives of the Andaman Islands. This was probably because ANI groups grew rapidly through farming, while ASI groups remained hunter-gatherers for a longer time.

Indian genetic history resembles that of Mexico, in which two highly divergent groups mixed, forming a hybrid genetic population. Most of the Mexican population today is Mestizo (mixed), with a declining number of individuals remaining exclusively Native American or European, since the offspring of a couple in which one partner is even minimally Mestizo will also be Mestizo. Over time, Native American and European genes will spread to almost the entire Mexican population. Genetic studies of Mexican populations demonstrate that European admixture ranges from around 20 percent to 80 percent.

Furthermore, reflecting a common phenomenon throughout history, the mixing occurred when male individuals from the dominant group took female partners from the conquered or subordinated group. My own DNA suggests that my distant paternal ancestor was a relatively high-status farmer or herder from the Middle East or Central Asia, while my distant maternal ancestor was probably a woman from an indigenous tribal group. It is likely, given my ancestral caste and my facial and physical structure, that the ANI genetic material predominates over the ASI.

We can conclude, then, that there was no massive “Aryan invasion” from Central Asia, and likewise, most Indian Muslims are descended from the same Indian population of 10,000 years ago that also gave rise to Indian Hindus, but perhaps with slightly more Arab, Persian, and Turkic admixture. However, there have been linguistic shifts throughout India’s recorded history. Both the Indo-Aryan and Dravidian families of northern and southern India seem to originate from outside the subcontinent; in the case of Sanskrit, its ancestral language probably originated in the aforementioned Yamnaya culture, since its cognates with other Indo-European languages indicate a colder, steppe homeland. Indo-Aryan languages sit on the geographical fringes of the Indo-European language family and show several features borrowed from Dravidian and Munda languages, which indicates that the family originated outside South Asia. Dravidian may have originated on the Iranian plateau before traveling along the coast southeast into India. In both cases, the existing ANI and ASI populations in the subcontinent adopted the new languages.

These external language families spread throughout South Asia through the dominance of small groups of elites whose cultures were emulated by pre-existing populations. This phenomenon is known as elite dominance and does not imply a genetic shift. It has occurred constantly throughout history, with the spread of Latin, Arabic, Turkish, and other languages that were adopted by native populations; today, much of India’s elite learns English despite not having English ancestry. The technological, agricultural, and even ritual and organizational structures of small groups of elites can spread extremely fast, especially among small groups of people not organized into complex states (Indo-Aryan languages spread in the subcontinent after the Indus Valley Civilization collapsed for environmental reasons). It is important to remember that the adoption of external languages by some ancient Indians thousands of years ago does not mean that the subsequent achievements of Indian culture are not “indigenous,” as some Hindu nationalists fear.

For a more detailed discussion on the history of language shift in ancient India, please see my article at The Diplomat: “When History Gets Political: India’s Grand ‘Aryan’ Debate and the Indus Valley Civilization.”

My DNA, and the DNA of other Indians, has a lot to say about Indian history: a good database combined with scientific analysis can settle many debates in Indian historiography, and demonstrate that every side has made some correct claims. However, I suspect, unfortunately, that despite the overwhelming scientific consensus of linguists, archaeologists, geneticists, and historians, there will always be an element in India more invested in promoting pseudo-scientific ideas based on ideology rather than academic accuracy. For everyone else, there is a world of amazing knowledge on the cusp of being unlocked.

