Deciphering the Genetic Heritage of the People of India: Fireside Chat with Partha Majumder

Shalini Urs

Human Identity: The Politics and the Science

“East is East, and West is West, and never the twain shall meet.”
—Rudyard Kipling in The Ballad of East and West

We are all mongrels—the idea of a pure population does not exist
—Himla Soodyall, South African Geneticist

Even though the politics of identity appears to be center-stage in our discussions, biology is foregrounded in scientific research on human identity. Ethnicity has emerged in recent scientific work in novel ways and concerning a variety of disciplines: medicine, forensics, population genetics, and developments in popular genealogy.

We are expected (and frequently required) to use socially visible population labels, whether they are referred to as races, ethnicities, nationalities, or something else. These labels significantly affect science and society, especially at a time of heightened sectarian tensions worldwide, including in India. A nuanced and scientific understanding of identity and ethnicity matters. Genetic research is closely interlinked with group taxonomies and has profound and unexpected impacts on existing categories of belonging and difference.

Samuel Huntington’s Clash of Civilizations (1993) has been one of the most influential and most frequently quoted ideas in international politics in recent decades. In Identity and Violence: The Illusion of Destiny (2006), Amartya Sen rejects Huntington’s civilizational argument and contests the so-called “solitarist” approach to human identity. Sen cautions against the dangers of flattening human identity and argues for recognizing and appreciating the true diversity of identities that each person holds. Sen’s argument is made from a cultural and philosophical standpoint. Biogeographical ancestry is separate from cultural identity; however, molecular ancestry is equally myriad, complex, and multilayered.

Since the completion of the Human Genome Project (HGP), studies of genetic variation in human populations have intensified. HGP’s signature accomplishment—generating the first sequence of the human genome—provided fundamental information about the human blueprint, accelerated the study of human biology, and improved the practice of medicine. The advent of high-resolution genome-wide genotyping allows more empirical descriptions of individuals and people by the inference of genetic or “biogeographical” ancestry. Much of this body of work uses specific population identities to categorize groups, for example, Caucasian, Korean, Yoruban, and South Asian, in addition to the generic terminology “race” and “ethnicity” to refer to them (Ali-Khan et al., 2011). There has been a steady decline in the use of the terms “race,” “Caucasian,” and “Negro” and it is indicative of a transition away from the field’s history of explicitly biological race science. Instead, the continental labels “African,” “Asian,” and “European” have increased in use. The increasing use of “ancestry,” “ethnicity,” and continental labels to describe genetic variation continues to evolve (Byeon et al., 2021).

In the wake of these developments, Schramm et al. (2012), in Identity Politics and the New Genetics: Re/creating Categories of Difference and Belonging, explore the new social and conceptual spaces unfolding between genetic research and technologies on the one hand and the social and political construction of identities on the other. Across a range of settings, they consider how, in a genomic age, science and the politics of race, ethnicity, and nation facilitate (or at times contradict) each other. In doing so, the book suggests the limits of thinking in terms of either science influencing politics or politics influencing science and points instead to the coproduction of both.

Genetic Heritage: From Human Curiosity to Health Care

Genes are the legacy that we inherit from our parents and ancestors. People are curious to know about their ancestry and genetic heritage. Genetic heritage may be loosely defined as studying human heritage using genomics. Genetics and genomics are complementary but different disciplines. While genetics is the study of inheritance, genomics studies all the genes and gene products in an individual and how those genes interact with one another and the environment (Bloss et al., 2011).

The popularity of genetic and ancestry services like 23andMe in the US and DNA Forensics Laboratory in India attests to the fact that most people are curious to discover their genealogy, and it has given rise to a growing market for consumer genetics. This market was made possible partly by the decreasing costs of genome analysis. More than 26 million people had taken at-home ancestry tests by 2019; a recent survey shows that more than 21% of Americans reported having taken the test. People everywhere are interested in their individual and collective identities and ethnicities.

The frenetic advances in genetics and genomics have the potential to revolutionize the practice of medicine and bring about a paradigm shift in approaches to health care. Genomic medicine, an emerging branch of medicine, involves the use of genomic information about an individual as part of their clinical care for diagnostic or therapeutic decision-making and the health outcomes and policy implications of that clinical use. Genomic medicine is already making an impact in the fields of oncology, pharmacology, rare and undiagnosed diseases, and infectious disease. Since the investments in and successful completion of HGP, there has been much excitement about the potential for “precision medicine,” where genomics, epigenomics, environmental exposure, and other data would be used to guide individual diagnosis more accurately. Genomic medicine can be considered a subset of precision medicine.

The Genomic Medicine Working Group (GMWG) of the National Human Genome Research Institute (NHGRI) has compiled and published a list of notable advances and accomplishments in genomic medicine.

Evolution and Inheritance: Unravelling our origins and spread

Human Evolution: Trails from the Past by Camilo José Cela Conde and Francisco J. Ayala offers a comprehensive overview of hominid evolution, synthesizing data and approaches from fields as diverse as physical anthropology, evolutionary biology, molecular biology, genetics, archaeology, psychology, and philosophy. Evolutionary biologist Theodosius Dobzhansky (1900–1975) was a key author of the Synthetic Theory of Evolution, also known as the Modern Synthesis of Evolutionary Theory. Dobzhansky’s book Genetics and the Origin of Species (1937) is regarded as one of the most important and earliest works of “the modern synthesis.” The book popularized the work of population genetics and forged the appreciation of the genetic basis of evolution. The modern synthesis was the early 20th-century synthesis of Charles Darwin’s theory of evolution and Gregor Mendel’s ideas on heredity into a joint mathematical framework. Julian Huxley coined the term in his 1942 book, Evolution: The Modern Synthesis. Several major ideas about evolution came together in the population genetics of the early 20th century to form the modern synthesis, including genetic variation, natural selection, and particulate (Mendelian) inheritance.

A nation’s genetic heritage is derived from its people’s gene pool, a consequence of the mixing and reshuffling of genes over thousands of years.

Our genome carries the signature of our ancestors, and the forces of evolution have shaped the genetic structure of modern populations. Population geneticists define ‘evolution’ as any change in a population’s genetic composition over time. The four factors that can bring about such a change are natural selection, mutation, random genetic drift, and migration into or out of the population. Some fundamental mechanisms of evolutionary change, like natural selection and genetic drift, cannot operate without genetic variation. Gene flow — also called migration — is any movement of individuals and/or the genetic material they carry from one population to another. Gene flow can be an essential source of genetic variation.
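The four forces named above can be made concrete with a small simulation. What follows is a minimal sketch, not taken from any of the works cited here: it tracks the frequency of a single allele in a simplified Wright-Fisher-style model. The function names, the parameter names (`s`, `mu`, `m`, `p_migrant`, `n_copies`), and all numerical values are illustrative assumptions.

```python
import random


def next_generation(p, n_copies=None, s=0.0, mu=0.0, m=0.0, p_migrant=0.0, rng=random):
    """Return the frequency of allele A after one generation.

    p          current frequency of allele A (0..1)
    n_copies   gene copies sampled each generation; None means an
               infinitely large population, i.e. no genetic drift
    s          selection advantage of A (relative fitness 1 + s)
    mu         per-generation rate at which A mutates to another allele
    m          fraction of the population replaced by migrants
    p_migrant  frequency of A among the migrants
    """
    # Natural selection: carriers of A leave proportionally more copies.
    p = p * (1 + s) / (p * (1 + s) + (1 - p))
    # Mutation: a fraction mu of A copies changes into another allele.
    p = p * (1 - mu)
    # Migration (gene flow): migrants bring their own allele frequency.
    p = (1 - m) * p + m * p_migrant
    # Random genetic drift: a finite population is a random sample.
    if n_copies is None:
        return p
    carriers = sum(1 for _ in range(n_copies) if rng.random() < p)
    return carriers / n_copies


def simulate(p0, generations, seed=0, **forces):
    """Return the list of allele frequencies from generation 0 onward."""
    rng = random.Random(seed)  # fixed seed keeps runs reproducible
    history = [p0]
    for _ in range(generations):
        history.append(next_generation(history[-1], rng=rng, **forces))
    return history


if __name__ == "__main__":
    print("no forces: ", round(simulate(0.3, 100)[-1], 3))
    print("selection: ", round(simulate(0.01, 500, s=0.05)[-1], 3))
    print("migration: ", round(simulate(0.0, 200, m=0.1, p_migrant=1.0)[-1], 3))
    print("drift only:", simulate(0.5, 2000, n_copies=20)[-1])
```

In this toy model, the frequency stays put when no force acts, so each change in the output can be attributed to the force that was switched on. With drift alone in a very small population, the allele is eventually either lost or fixed, at which point no variation is left for drift or selection to act on.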

Footprints on the Sands of Time: Migrations, Globalization, and Diversity

Genes move as people move. Thus it is no wonder we are all mongrels, as we are the product of migrations, gene flow, and the genetic diversity of populations.

Luigi Luca Cavalli-Sforza (1922–2018), a giant in population genetics, was among the first to use genetics to track human migration patterns. Blending anthropology and genetics, he followed the spread of genetic variations to track how humans populated the world and spawned a new field, genetic geography. Hailed as a breakthrough in the understanding of human evolution, The History and Geography of Human Genes by Cavalli-Sforza et al. (1994) is the most comprehensive treatment of human genetic variation available. It offers the first full-scale reconstruction of where human populations originated and the paths by which they spread worldwide.

Peter Bellwood’s First Migrants: Ancient Migration in Global Perspective (2013) is one of the first publications to outline the complex global story of human migration and dispersal throughout prehistory. Drawing on archaeological, linguistic, and biological evidence, Bellwood charts migration and population dispersal across all regions of the world and emphasizes how major migrations have shaped population diversity in each of them.

Jared Diamond, the Pulitzer Prize-winning author of Guns, Germs, and Steel: The Fates of Human Societies (1999), convincingly argues that geographical and environmental factors shaped the modern world and are responsible for its broadest patterns. Societies that had had a head start in food production advanced in every respect and adventured on sea and land to conquer and decimate preliterate cultures. He chronicles how the modern world, and its inequalities, came to be.

David Reich’s Who We Are and How We Got Here: Ancient DNA and the New Science of the Human Past (2018) is a compelling book. It outlines the astounding advances in genomics that have proved as important as archaeology, linguistics, and written records for understanding our ancestry and that have profoundly changed our understanding of human history. Drawing on advances in DNA sequencing, Reich, a geneticist, shows the effects of migrations and the mongrel nature of humanity.

Genetically, How Different and Unequal are We?

Humans are genetically 99.9% identical. Geneticists studying human genomic diversity work with the tiny fraction of the genome (about a tenth of a percent) that confers uniqueness on every human. It is primarily on this small fraction of the genome that evolutionary forces have acted during the evolution of modern humans from their common ancestor (Majumder & Balasubramanian, 2006).

Nevertheless, when we look around, we are struck by the astonishing variety in the human population in terms of size, shape, and facial features. We also differ greatly in our susceptibility to diseases and in our athletic, mathematical, and musical abilities. These differences extend to differences between group averages. Most of these average differences are inconspicuous, but some, such as skin color, stand out. So, what explains this discrepancy between DNA evidence and visible difference? James F. Crow, another influential population geneticist, notes that we are unique and unequal by nature, because a tiny fraction (1/1000 of six billion base pairs) is still six million different base pairs per cell. Thus there is plenty of room for genetic differences among us. Although we differ in a very tiny proportion of our DNA, we differ by many DNA bases (Crow, 2002).
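The arithmetic behind Crow's point can be checked in a couple of lines. This is only an illustrative sketch; the function name and its parameters are assumptions made for the example, and the two input figures are the ones quoted above.

```python
def differing_base_pairs(total_base_pairs, differing_per_thousand=1):
    """Number of base pairs expected to differ between two people."""
    # Integer arithmetic avoids floating-point rounding on large counts.
    return total_base_pairs * differing_per_thousand // 1000


if __name__ == "__main__":
    # Six billion base pairs per cell, the figure cited from Crow (2002).
    print(differing_base_pairs(6_000_000_000))  # prints 6000000
```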

What is the connection between gene spread and language distribution?

Cavalli-Sforza was the first to propose the correspondence between the spread of genes and languages. According to him, most gene patterns found in human populations are likely to be consequences of demographic expansions due to developments affecting food availability, transportation, or military power. During such expansions, both genes and languages are spread to potentially vast areas (Cavalli-Sforza, 1997).

Cavalli-Sforza was one of the first scientists to use genetic information to understand the relationships between different human populations at the level of DNA. His book Genes, Peoples, and Languages (2000), comprising five lectures, is a synthesis of a lifetime of work tracking the past hundred thousand years of human evolution. The first chapter, “Genes and History,” is about human diversity and its social and cultural implications. The chapter “Languages and Genes” demonstrates many stimulating correspondences between genetic and linguistic diversity, shining a light both on the demographic shifts that shaped our genome and on the effects that these shifts may have had on the distribution of modern languages.

Ethnic India: Unity in Diversity?

Contemporary ethnic India has enormous genetic, cultural, and linguistic diversity. Research has shown that, except for Africa, India harbors more genetic diversity than other comparable global regions.

The Anthropological Survey of India (ASI) carried out the colossal task of collecting and cataloging the linguistic, geographic, and sociological features of all ethnic groups across India. It published an authoritative 43-volume work titled The People of India in 1992 (Singh, 1992). Indians are a mosaic, a patchwork quilt, and a rainbow coalition of 450 tribal communities speaking over 750 dialects. These are classified into the Austro-Asiatic (AA), Dravidian (DR), Indo-European (IE), and Tibeto-Burman (TB) language families. Tribal communities make up 8% of India’s population. The non-tribals speak languages that belong to the Indo-European (IE) or Dravidian (DR) families. IE and DR speakers have contributed significantly to Indian society and cultural development. They are also known to have been affected by waves of migration into India since prehistoric times. And then there is the stratification into castes, a phenomenon unique to India.

“No population is, or ever could be, pure.”
David Reich

The genome revolution in the study of the human past has revealed that great mixtures of highly divergent populations have repeatedly contributed to the diversity of life. Evolutionary biologists have used the metaphor of a tree to represent this evolution. Reich (2018) posits that a better metaphor may be a trellis, branching and remixing far back into the past. This is true of most of the modern world, whether Africa, Europe, or the Indian subcontinent. Remarkable parallels can be found in the prehistories of two similarly sized subcontinents of Eurasia: Europe and India. Reich (2018) observes that just as present-day Europeans have a strong genetic affinity to early farmers from Anatolia (present-day Turkey), consistent with the migration of Anatolian farmers into Europe some 9,000 years ago, present-day people of India have a strong affinity to ancient Iranian farmers, suggesting that Near Eastern farming expanded eastward to the Indus Valley after 9,000 years ago and affected the population of India. His team’s work also revealed that the present-day population of India has a strong genetic affinity to ancient steppe people. Reich’s team revised their earlier model of two types of ancestry among Indian people, the Ancestral North Indians (ANI) and the Ancestral South Indians (ASI). Reich acknowledges that the picture of population movements in India is far less crisp than in Europe because of the lack of ancient DNA from South Asia and the outstanding mystery of the ancestry of the people of the Indus Valley civilization.

According to Basu et al. (2016), genomic reconstruction of the history of extant populations of India reveals five distinct ancestral components and a complex structure. Their study, based on a systematic analysis of genome-wide data using multiple robust statistical methods, reveals four major ancestries in mainland India. In addition, a distinct ancestry of the populations of the Andaman archipelago was identified and found to be co-ancestral to Oceanic people. Analysis of ancestral haplotype blocks revealed that extant mainland populations admixed widely irrespective of ancestry. However, admixtures between populations were not always symmetric, and the practice of admixture was rapidly replaced by endogamy about 70 generations ago, predominantly among upper castes and Indo-European speakers. This estimated time coincides with the historical period of formulation and adoption of sociocultural norms restricting intermarriage in large social strata. A similar replacement observed among tribal populations was temporally less uniform.
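The figure of about 70 generations can be turned into a rough span of years once a generation length is chosen. The sketch below is only an illustration: the generation lengths of 20, 25, and 30 years are hypothetical values picked for the example and are not taken from the study, and the function name is likewise an assumption.

```python
def generations_to_years(generations, years_per_generation):
    """Convert a count of generations into elapsed years."""
    return generations * years_per_generation


if __name__ == "__main__":
    # Hypothetical generation lengths in years; not taken from the study.
    for years_per_generation in (20, 25, 30):
        years = generations_to_years(70, years_per_generation)
        print(f"{years_per_generation} years/generation -> about {years} years ago")
```

Under these assumed generation lengths, 70 generations corresponds to roughly 1,400 to 2,100 years before the present; the result is only as good as the generation length fed in.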

However, the fundamental genomic unity of India lies with a relatively small number of founding female lineages in India (Majumder & Balasubramanian, 2006).

India: Languages and Ethnicity

Using linguistics as a vehicle, Peggy Mohan traces the historical evolution of languages in India and interweaves their connection to the history of migrations and social structures in her book Wanderers, Kings, Merchants (2021). As its subtitle, “The Story of India through Its Languages,” indicates, the book tells the story of India through its languages. Mohan argues that the structure of various Indian languages, ancient and modern, says something about the social history of the peoples who spoke them. Drawing analogies between linguistics and genetics, Mohan posits that over time, a language mutates, and populations split up and move about, like genes. Using linguistic markers such as “retroflex” consonants, Mohan argues that, unlike other languages of the “Proto-Indo-European” family to which it belongs, Sanskrit adopted these consonants from the Dravidian languages that were already present in India, which suggests that Dravidians and Dravidian languages predate Sanskrit there. A possible connection between the Dravidian languages and the Mitanni language (of the Mitanni empire) has also been studied and used to suggest that Dravidian languages (and Dravidians) might also have come from that part of Asia (Brown, 1930).

Admixture and Endogamy

While successive migrations and gene flow created extensive genetic diversity, inflexible sociocultural barriers structured this diversity into different endogamous groups called “castes.” The caste, a hierarchically arranged endogamous group, is unique in that it governs most of life’s rites, including the choice of mating partner. Several studies have examined the influence of various cultural practices on patterns of human genetic variation. In India, the centuries-old caste system and the resulting social structuring of the contemporary population have significantly influenced genetic variation. Khan et al. (2007) found that the genetic affinities of Indians, and of different caste groups, towards Caucasians or East Asians are distributed along a cline in which north Indians, upper-caste populations, and Muslim populations are genetically closer to Caucasians.

Basu et al. (2016) have provided evidence that gene flow ended abruptly with the imposition of social norms introduced during the reign of the ardent Hindu Gupta rulers, a period known as the age of Vedic Brahminism. The period was marked by strictures laid down in the Dharmasastra (the ancient compendium of moral laws and principles for religious duty and righteous conduct to be followed by a Hindu) and enforced through the powerful state machinery of a developing political economy. These strictures and their enforcement resulted in a shift to endogamy.

Bose et al. (2021) integrate linguistics, social structure, and geography to model genetic diversity within India and demonstrate that endogamy and language families are pivotal in studying the genetic stratification of Indian populations. This is in sharp contrast to what has been seen in other parts of the world, where geography is a significant contributor to shaping the genetic structure of populations. As Reich (2018) observes, genetically speaking, India is not one large population but is composed of many small populations.

Genetic Heritage of India: A story of Migrations, Admixture, and Endogamy

The genetic heritage of the people of India is diverse and has evolved distinctively, owing to India’s unique place in the geography and history of human evolution.

To learn about India’s genetic heritage, who better to ask than an expert who has been studying our genomic footprints in the sands of time for more than four decades and who goes by the moniker Gene Guru in India? In this episode of InfoFire, listen to Partha Majumder, National Science Chair, Govt. of India, and Distinguished Professor and Founding Director, National Institute of Biomedical Genomics, Kalyani, WB, India, for his perspective on how Indians came to be who they are and what factors shaped our ancestry and destiny. After elaborating on the differences between statistical sciences and data sciences, Majumder explains the journey of the people of India: leaving Africa under pressure on resources or out of curiosity, taking one of two probable routes (the coastal route and the land route), and then arriving in India. He explains the four significant waves of migration to India and how admixture is the major source of the diversity of populations. He underscores that such population genetic data have social implications and that their interpretation and use call for extreme caution. He quotes Maya Angelou to highlight the beauty and strengths of our genetic diversity:

“It is time for the preachers, the rabbis, the priests and pundits, and the professors to believe in the awesome wonder of diversity so that they can teach those who follow them. It is time for parents to teach young people early on that in diversity, there is beauty, and there is strength. We all should know that diversity makes for a rich tapestry, and we must understand that all the threads of the tapestry are equal in value no matter their color; equal in importance no matter their texture.”
—Maya Angelou


Ali-Khan, S. E., Krakowski, T., Tahir, R., & Daar, A. S. (2011). The use of race, ethnicity, and ancestry in human genetic research. The HUGO journal, 5(1), 47-63.

Basu, A., Mukherjee, N., Roy, S., Sengupta, S., Banerjee, S., Chakraborty, M., … & Majumder, P. P. (2003). Ethnic India: A genomic view, with special reference to peopling and structure. Genome Research, 13(10), 2277–2290.

Basu, A., Sarkar-Roy, N., & Majumder, P. P. (2016). Genomic reconstruction of the history of extant populations of India reveals five distinct ancestral components and a complex structure. Proceedings of the National Academy of Sciences, 113(6), 1594–1599.

Bellwood, P. (2014). First Migrants: Ancient Migration in Global Perspective. Germany: Wiley.

Bloss, C. S., Jeste, D. V., & Schork, N. J. (2011). Genomics for disease treatment and prevention. Psychiatric Clinics, 34(1), 147–166.

Bose, A., Platt, D. E., Parida, L., Drineas, P., & Paschou, P. (2021). Integrating linguistics, social structure, and geography to model genetic diversity within India. Molecular Biology and Evolution, 38(5), 1809–1819.

Byeon, Y. J. J., Islamaj, R., Yeganova, L., Wilbur, W. J., Lu, Z., Brody, L. C., & Bonham, V. L. (2021). The evolving use of ancestry, ethnicity, and race in genetics research—A survey spanning seven decades. The American Journal of Human Genetics, 108(12), 2215–2223.

Campbell, M. C., & Tishkoff, S. A. (2008). African genetic diversity: Implications for human demographic history, modern human origins, and complex disease mapping. Annual Review of Genomics and Human Genetics, 9, 403–433.

Cavalli-Sforza, L. L. (1997). Genes, peoples, and languages. Proceedings of the National Academy of Sciences, 94(15), 7719–7724.

Cavalli-Sforza, L. L. (2000). Genes, Peoples, and Languages. New York: North Point Press/Farrar, Straus, and Giroux.

Cela-Conde, C. J., & Ayala, F. J. (2007). Human evolution: Trails from the past. Oxford University Press.

Crow, J. F. (2002). Unequal by nature: A geneticist’s perspective on human differences. Daedalus, 131(1), 81–88.

Diamond, J. M. (1999). Guns, germs, and steel: The fates of human societies. W.W. Norton.

Dobzhansky, T. (1937). Genetics and the origin of species.

Huntington, S. P. (2011). The clash of civilizations and the remaking of world order (Simon & Schuster paperback). Simon & Schuster Paperbacks.

Huxley, J. S. (1942). Evolution: The modern synthesis. George Allen & Unwin.

Khan, F., Pandey, A. K., Tripathi, M., Talwar, S., Bisen, P. S., Borkar, M., & Agrawal, S. (2007). Genetic affinities between endogamous and inbreeding populations of Uttar Pradesh. BMC Genetics, 8, 12. doi: 10.1186/1471-2156-8-12.

Majumder, P. P., & Balasubramanian, D. (2006). Our footprints on the sands of time. Resonance, 11(1), 32–50.

Mohan, P. R. (2021). Wanderers, kings, merchants: The story of India through its languages. Penguin Random House India Pvt. Ltd.

Reich, D. (2018). Who we are and how we got here: Ancient DNA and the new science of the human past. Oxford University Press.

Schramm, K., Skinner, D., & Rottenburg, R. (2012). Identity politics and the new genetics: Re/creating categories of difference and belonging. Berghahn Books.

Sen, A. (2007). Identity and violence: The illusion of destiny. Penguin Books India.

Singh, K. S. (1992). People of India (Vols. 1-43) India: Anthropological Survey of India.

Cite this article in APA as: Urs, S. (2022, October 18). Deciphering the genetic heritage of the people of India: Fireside chat with Partha Majumder. Information Matters, Vol. 2, Issue 11.

Shalini Urs

Dr. Shalini Urs is an information scientist with a 360-degree view of information and has researched issues ranging from the theoretical foundations of information sciences to Informatics. She is an institution builder whose brainchild is the MYRA School of Business, founded in 2012. She also founded the International School of Information Management, the first Information School in India, as an autonomous constituent unit of the University of Mysore in 2005 with grants from the Ford Foundation and Informatics India Limited. She is currently involved with Gooru India Foundation as a Board member and is actively involved in implementing Gooru’s Learning Navigator platform across schools. She is professor emerita at the Department of Library and Information Science of the University of Mysore, India. She conceptualized and developed the Vidyanidhi Digital Library and eScholarship portal in 2000 with funding from the Government of India, which became a national initiative with further funding from the Ford Foundation in 2002.