Robust Genotyping of Ancient DNA Samples

For the 390k analysis, we restricted to reads that not only mapped to the
human reference genome hg19 but also overlapped the
354,212 autosomal SNPs genotyped on the Human Origins array^{4}. We trimmed the last two nucleotides from
each sequence because we found that these are highly enriched in ancient DNA
damage even for UDG-treated libraries. We further restricted analyses to sites
with base quality ≥ 30.
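The read-level filters described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes ungapped alignments, that the "last two nucleotides" are the final two bases of each read as stored, and that base qualities are Phred-scaled integers. The names `Read`, `trim_read`, and `base_at_snp` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

MIN_BASE_QUALITY = 30  # base quality threshold stated in the text
TRIM_LENGTH = 2        # number of trailing nucleotides removed per read


@dataclass
class Read:
    start: int        # 0-based reference position of the first base (assumes no indels)
    bases: str        # read sequence
    quals: List[int]  # Phred-scaled qualities, one per base


def trim_read(read: Read, n: int = TRIM_LENGTH) -> Read:
    """Return a copy of the read without its last n bases."""
    if n <= 0:
        return read
    return Read(read.start, read.bases[:-n], read.quals[:-n])


def base_at_snp(read: Read, snp_pos: int,
                min_qual: int = MIN_BASE_QUALITY) -> Optional[str]:
    """Return the base the trimmed read shows at snp_pos, or None if the
    read does not cover the site or the base fails the quality filter."""
    trimmed = trim_read(read)
    offset = snp_pos - trimmed.start
    if offset < 0 or offset >= len(trimmed.bases):
        return None  # site not covered after trimming
    if trimmed.quals[offset] < min_qual:
        return None  # base quality below threshold
    return trimmed.bases[offset]
```

In a real workflow the same checks would be applied while iterating over aligned reads in a BAM file; the in-memory `Read` type here only keeps the example self-contained.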
We made no attempt to determine a diploid genotype at each SNP in each
sample. Instead, we used a single allele – randomly drawn from the two
alleles in the individual – to represent the individual at that
site^{20,39}. Specifically, we made an allele call at
each target SNP using majority rule over all sequences overlapping the SNP. When
each of the possible alleles was supported by an equal number of sequences, we
picked an allele at random. We set the allele to “no call” for
SNPs at which there was no read coverage.
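The calling rule above (majority vote, random tie-breaking, and "no call" when there is no coverage) might be sketched like this. The function name `call_allele` and the `"N"` placeholder for a missing call are hypothetical choices for illustration.

```python
import random
from collections import Counter
from typing import Optional, Sequence

NO_CALL = "N"  # hypothetical placeholder for "no call"


def call_allele(observed: Sequence[str],
                rng: Optional[random.Random] = None) -> str:
    """Pick one allele per site from the bases observed across reads.

    observed: bases from all reads overlapping the SNP (after filtering).
    rng: optional random.Random instance, useful for reproducible ties.
    """
    if not observed:
        return NO_CALL  # no read coverage at this SNP
    chooser = rng if rng is not None else random
    counts = Counter(observed)
    top = max(counts.values())
    # Alleles supported by the largest number of reads (sorted for determinism).
    tied = sorted(allele for allele, n in counts.items() if n == top)
    if len(tied) == 1:
        return tied[0]          # clear majority
    return chooser.choice(tied)  # equal support: choose at random
```

Passing a seeded `random.Random` makes tie-breaking repeatable across runs, which can help when comparing calls between pipeline versions.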
We restricted population genetic analysis to libraries with a minimum of
0.06-fold average coverage on the 390k SNP targets, and for which there was an
unambiguous sex determination based on the ratio of X to Y chromosome reads
(SI4) (Online Table 1).
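As a rough illustration of the coverage criterion, the sketch below assumes that "average coverage" means the mean number of reads overlapping each target SNP, taken over all targets including those with zero reads. That definition and the function names are assumptions; the sex-determination step is not shown because its details are given in SI4 rather than here.

```python
from typing import Sequence

MIN_MEAN_COVERAGE = 0.06  # threshold stated in the text


def mean_target_coverage(depths: Sequence[int]) -> float:
    """Mean read depth over all target SNPs (zero-depth sites included)."""
    if not depths:
        raise ValueError("no target SNPs supplied")
    return sum(depths) / len(depths)


def passes_coverage_filter(depths: Sequence[int],
                           threshold: float = MIN_MEAN_COVERAGE) -> bool:
    """True if the library reaches the minimum average coverage."""
    return mean_target_coverage(depths) >= threshold
```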
For individuals with multiple libraries per sample, we performed
a series of quality control analyses. First, we used the ADMIXTURE
software^{40,41} in supervised mode, using Kharia, Onge,
Karitiana, Han, French, Mbuti, Ulchi and Eskimo as reference populations. We
visually inspected the inferred ancestry components in each individual, and
removed individuals with evidence of heterogeneity in inferred ancestry
components across libraries. For all possible pairs of libraries for each
sample, we also computed statistics of the form D(Library₁,
Library₂; Probe, Mbuti), where
Probe is any of the same set of eight reference
populations, to determine whether there was significant evidence of the
Probe population being more closely related to one library
from an ancient individual than to another library from that same individual. None
of the individuals that we used had strong evidence of ancestry heterogeneity
across libraries. For samples passing quality control for which there were
multiple libraries per sample, we merged the sequences into a single BAM.
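To make the library-consistency test concrete, here is a sketch of one commonly used D-statistic estimator: a ratio of sums over sites of allele-frequency products. It is an assumption that this matches the exact estimator used, and the function name `d_statistic` is hypothetical. For pseudo-haploid libraries the first two frequencies are 0 or 1 at each site. Assessing significance needs a standard error, typically from a block jackknife, which is not shown.

```python
from typing import Optional, Sequence

Freq = Optional[float]  # reference-allele frequency; None marks missing data


def d_statistic(p1: Sequence[Freq], p2: Sequence[Freq],
                p3: Sequence[Freq], p4: Sequence[Freq]) -> float:
    """D(P1, P2; P3, P4) as a ratio of sums over sites.

    Positive values indicate that P1 shares more alleles with P3 than
    P2 does, relative to P4 (here: Library1, Library2, Probe, Mbuti).
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in zip(p1, p2, p3, p4):
        if any(v is None for v in (a, b, c, d)):
            continue  # skip sites missing in any of the four groups
        num += (a - b) * (c - d)
        den += (a + b - 2 * a * b) * (c + d - 2 * c * d)
    if den == 0:
        raise ValueError("no informative sites")
    return num / den
```

For two libraries from the same individual, the expectation is a value near zero for every Probe population; a consistent deviation would suggest contamination or a sample mix-up in one library.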
We called alleles on each merged BAM using the same procedure as for the
individual libraries. We used ADMIXTURE^{41} as well as PCA as implemented in EIGENSOFT^{42} (using the lsqproject: YES
option to project the ancient samples) to visualize the genetic
relationships of each set of samples with the same culture label with respect to
777 diverse present-day West Eurasians^{4}. We visually identified outlier individuals, and renamed
them for analysis either as outliers or by the name of the site at which they
were sampled (Extended Data Table 1). We
also identified two pairs of related individuals based on the proportion of
sites covered in pairs of ancient samples from the same population that had
identical allele calls using PLINK^{43}. From each pair of related individuals, we kept the one
with the most SNPs.
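The relatedness screen above was run with PLINK. The pure-Python sketch below is only a stand-in that illustrates the underlying quantity: the fraction of jointly covered sites at which two samples carry the same allele call. The function names are hypothetical, and no relatedness cutoff is given because the text does not state one.

```python
from typing import Optional, Sequence, Tuple

Call = Optional[str]  # one pseudo-haploid allele call per SNP; None = no call


def identity_rate(a: Sequence[Call], b: Sequence[Call]) -> Tuple[float, int]:
    """Return (fraction of identical calls, number of jointly covered sites)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no sites covered in both samples")
    matches = sum(x == y for x, y in shared)
    return matches / len(shared), len(shared)


def keep_better_covered(name_a: str, a: Sequence[Call],
                        name_b: str, b: Sequence[Call]) -> str:
    """From a related pair, return the name of the sample with more called SNPs."""
    covered_a = sum(x is not None for x in a)
    covered_b = sum(x is not None for x in b)
    return name_a if covered_a >= covered_b else name_b
```

Pairs whose identity rate stands out from the rest of the same population are candidates for relatedness; the retained individual is then the one with more called SNPs.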

Haak W., Lazaridis I., Patterson N., Rohland N., Mallick S., Llamas B., Brandt G., Nordenfelt S., Harney E., Stewardson K., Fu Q., Mittnik A., Bánffy E., Economou C., Francken M., Friederich S., Pena R.G., Hallgren F., Khartanovich V., Khokhlov A., Kunst M., Kuznetsov P., Meller H., Mochalov O., Moiseyev V., Nicklisch N., Pichler S.L., Risch R., Rojo Guerra M.A., Roth C., Szécsényi-Nagy A., Wahl J., Meyer M., Krause J., Brown D., Anthony D., Cooper A., Alt K.W., & Reich D. (2015). Massive migration from the steppe was a source for Indo-European languages in Europe. Nature, 522(7555), 207-211.

Publication 2015

Allele Diploid Eskimo populations Genotype Heterogeneity Library Nucleotides Sex determination Snps Y chromosome

Other organizations : University of Adelaide, Harvard University, Broad Institute, Johannes Gutenberg University Mainz, University of Tübingen, Institute of Archaeology, Research Centre for the Humanities, Stockholm University, Senckenberg Centre for Human Evolution and Palaeoenvironment, Universidad Autónoma de Madrid, Heritage Foundation, Peter the Great's Museum of Anthropology and Ethnography, Moscow Architectural Institute, University of Basel, Universitat Autònoma de Barcelona, Universidad de Valladolid, Archäologisches Landesmuseum Baden-Württemberg, Max Planck Institute for Evolutionary Anthropology, Hartwick College, Howard Hughes Medical Institute

Protocol cited in 173 other protocols

Variable analysis

independent variables

Restricting to reads that mapped to the human reference genome hg19 and overlapped the 354,212 autosomal SNPs genotyped on the Human Origins array
Trimming the last two nucleotides from each sequence
Restricting analyses to sites with base quality ≥ 30
Using a single allele - randomly drawn from the two alleles in the individual - to represent the individual at each site
Restricting population genetic analysis to libraries with a minimum of 0.06-fold average coverage on the 390k SNP targets
Performing quality control analysis to remove individuals with evidence of heterogeneity in inferred ancestry components across libraries

dependent variables

Allele calls at each target SNP using majority rule over all sequences overlapping the SNP
Inferred ancestry components using ADMIXTURE
Genetic relationships of each set of samples with the same culture label with respect to 777 diverse present-day West Eurasians using ADMIXTURE and PCA

control variables

Sex determination based on the ratio of X to Y chromosome reads
Keeping one individual from each pair of related individuals based on the proportion of sites covered in pairs of ancient samples from the same population that had identical allele calls
