Blood samples were collected with informed consent from volunteers. We designate groups by their anthropological name as well as their geographic location, since it has been shown that both are required to specify an effectively endogamous group in India [1]. All DNA samples were genotyped on Affymetrix 6.0 arrays. We restricted most analyses to samples that appeared to be unrelated, and to 560,123 autosomal SNPs for which genotyping completeness was good and there were no signs of problematic genotyping. For some analyses we also intersected our data with Illumina 650Y genotyping of the Human Genome Diversity Panel [14] and HapMap [13,28], which produced a merged data set of 119,744 autosomal SNPs [14]. We carried out PCA using the EIGENSOFT software [17], assessed allele frequency differentiation among groups using F_ST, assessed inbreeding in each group using Wright's Fixation Index F [23], and computed standard errors using a Block Jackknife [33]. To detect the signature of founder events in linkage disequilibrium data, we studied all possible pairs of samples within each group and recorded whether they share 0, 1 or 2 alleles at each SNP (at SNPs where both individuals were heterozygous, we recorded 1 shared allele to account for the ambiguity in haplotype phase). Long stretches of allele sharing can reflect regions that are shared identical by descent from a common founder, and by measuring the exponential decay of allele sharing with distance, we inferred the age of the founder event (Figure S3). To test for a history of mixture, we applied the 3 Population and 4 Population Tests (Note S3). To infer the proportion of ancestry in each Indian Cline group in the absence of accurate ancestral populations, we used f3 Ancestry Estimation (Note S5).
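The Block Jackknife used for standard errors can be illustrated with a minimal sketch. This is not the code used in the study: it assumes the simplest delete-one-block form with equally weighted blocks, whereas genomic blocks containing different numbers of SNPs would normally call for a weighted variant. The function name `block_jackknife_se` and the toy data are hypothetical.

```python
import math


def block_jackknife_se(blocks, statistic):
    """Delete-one-block jackknife standard error (equal-weight form, an assumption).

    blocks    : list of lists, one list of per-SNP values per contiguous genomic block
    statistic : function mapping a flat list of values to a single number
    """
    n = len(blocks)
    if n < 2:
        raise ValueError("need at least two blocks")
    leave_one_out = []
    for i in range(n):
        # Recompute the statistic with block i removed.
        kept = [x for j, block in enumerate(blocks) if j != i for x in block]
        leave_one_out.append(statistic(kept))
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((t - mean_loo) ** 2 for t in leave_one_out)
    return math.sqrt(variance)


def mean(values):
    return sum(values) / len(values)


# Toy input: four single-value blocks (illustrative only).
se = block_jackknife_se([[1.0], [2.0], [3.0], [4.0]], mean)
```

Dropping whole blocks rather than single SNPs is what lets the procedure tolerate correlation between neighbouring SNPs, since linked markers are removed together.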
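The pairwise allele-sharing rule described above can be sketched as follows. The sketch assumes genotypes are coded as 0, 1 or 2 copies of one allele with no missing data; the coding, the function names and the toy genotypes are assumptions made for illustration, not details taken from the study.

```python
from itertools import combinations


def shared_alleles(g1, g2):
    """Alleles shared at one SNP by two diploid genotypes coded 0/1/2."""
    if g1 == 1 and g2 == 1:
        # Both heterozygous: phase is ambiguous, so record 1 shared allele.
        return 1
    return 2 - abs(g1 - g2)


def pairwise_sharing(genotypes):
    """Map every pair of samples in a group to its per-SNP sharing counts.

    genotypes : dict of sample name -> list of 0/1/2 codes, SNPs in the same order
    """
    return {
        (a, b): [shared_alleles(x, y) for x, y in zip(genotypes[a], genotypes[b])]
        for a, b in combinations(sorted(genotypes), 2)
    }


# Toy group of three samples at four SNPs (hypothetical data).
toy = {"s1": [0, 1, 2, 1], "s2": [0, 1, 0, 2], "s3": [2, 1, 2, 0]}
sharing = pairwise_sharing(toy)
```

Runs of consecutive SNPs with a nonzero count in such a vector are the raw material for the decay-with-distance analysis; fitting that decay is not shown here.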
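As a rough illustration of the statistics behind the 3 Population and 4 Population Tests, the sketch below computes the commonly used forms f3(C; A, B), the mean of (c - a)(c - b), and f4(A, B; C, D), the mean of (a - b)(c - d), directly from population allele frequencies. The exact estimators are specified in Note S3; this version deliberately omits any finite-sample bias correction and any significance assessment, and all names and frequencies are hypothetical.

```python
def f3(freq_c, freq_a, freq_b):
    """Naive f3(C; A, B): mean over SNPs of (c - a) * (c - b).

    No finite-sample correction is applied (a simplifying assumption).
    """
    products = [(c - a) * (c - b) for c, a, b in zip(freq_c, freq_a, freq_b)]
    return sum(products) / len(products)


def f4(freq_a, freq_b, freq_c, freq_d):
    """Naive f4(A, B; C, D): mean over SNPs of (a - b) * (c - d)."""
    products = [
        (a - b) * (c - d)
        for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)
    ]
    return sum(products) / len(products)


# Hypothetical allele frequencies at two SNPs.
p_a = [0.1, 0.2]
p_b = [0.9, 0.8]
p_c = [0.5, 0.5]  # intermediate between A and B at both SNPs
p_d = [0.4, 0.6]

f3_value = f3(p_c, p_a, p_b)
f4_value = f4(p_a, p_b, p_c, p_d)
```

In this toy case f3 is negative because the frequencies in C lie between those of A and B, which is the pattern a mixture of A-related and B-related ancestry tends to produce. In practice the value would be judged against a standard error, for example one obtained with a Block Jackknife.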