Comprehensive Analysis of Global Tuberculosis Strains
Raw read data for 8,136 strains were downloaded from the SRA (see SRA accessions in Supplementary Table 1). Reads were mapped onto a reference strain of H37Rv (GenBank accession number CP003248.2) using BWA version 0.7.10 [68]. Variants were identified using Pilon version 1.11 as described [67]. The global M. tuberculosis lineage designations used in our analysis, as well as each strain's spoligotype, were predicted using digital spoligotyping as in Cohen et al., 2015 [6].
We eliminated 824 strains that did not pass our quality control filters. To pass, a strain needed an average sequencing depth of coverage >20X; a fraction of long insertions <0.2; an ambiguity rate <0.5 (to remove samples that were suspected to represent mixes of different genotypes); a number of low coverage bases <250,000; and a single match to one lineage in our lineage-prediction algorithm. We also eliminated strains for which Pilon failed three times. Of the remaining 7,312 samples, we removed 1,970 strains with no "country" metadata or description in a publication; 19 strains with substantial non-tuberculous mycobacteria contamination; and 13 additional duplicate patient samples. These filters resulted in a final set of 5,310 strains for analysis. Emu [69] was run to canonicalize variants.
We conducted phylogenetic analyses for the entire set of 5,310 strains, as well as for a subset corresponding to each lineage and each United Nations geographical subregion [23] with >30 strains (Supplementary Table 3). For each set, all sites with an unambiguous single nucleotide polymorphism (SNP) in at least one strain were combined into a concatenated alignment. Ambiguous positions were treated as missing data. The concatenated alignment was then used to generate a midpoint-rooted phylogenetic tree using FastTree [70] version 2.1.8.
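The quality control thresholds and the construction of the concatenated SNP alignment described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `StrainQC` record, its field names, the `passes_qc` and `build_snp_alignment` functions, the dictionary-based input layout, and the use of `N` as the missing-data character are all assumptions made for illustration. Only the numeric thresholds and the site-selection rule come from the text.

```python
from dataclasses import dataclass

UNAMBIGUOUS = set("ACGT")


@dataclass
class StrainQC:
    """Hypothetical per-strain QC summary; field names are illustrative."""
    mean_depth: float               # average sequencing depth of coverage
    long_insertion_fraction: float  # fraction of long insertions
    ambiguity_rate: float           # rate of ambiguous calls
    low_coverage_bases: int         # number of low coverage bases
    lineage_matches: int            # lineages matched by the lineage predictor


def passes_qc(s: StrainQC) -> bool:
    """Apply the thresholds quoted in the text; a strain must meet all of them."""
    return (
        s.mean_depth > 20
        and s.long_insertion_fraction < 0.2
        and s.ambiguity_rate < 0.5
        and s.low_coverage_bases < 250_000
        and s.lineage_matches == 1
    )


def build_snp_alignment(reference, calls):
    """Build a concatenated SNP alignment.

    reference: dict mapping genome position -> reference base.
    calls: dict mapping strain name -> {position: called base}; a position
           absent from a strain's dict is assumed to match the reference,
           and any call outside A/C/G/T is treated as ambiguous.
    Returns a dict mapping strain name -> aligned sequence string.
    """
    # Keep every site where at least one strain has an unambiguous SNP.
    sites = sorted(
        {
            pos
            for strain_calls in calls.values()
            for pos, base in strain_calls.items()
            if base in UNAMBIGUOUS and base != reference[pos]
        }
    )
    alignment = {}
    for strain, strain_calls in calls.items():
        row = []
        for pos in sites:
            base = strain_calls.get(pos, reference[pos])
            # Ambiguous positions become missing data (written here as 'N').
            row.append(base if base in UNAMBIGUOUS else "N")
        alignment[strain] = "".join(row)
    return alignment


if __name__ == "__main__":
    reference = {10: "A", 20: "C", 30: "G"}
    calls = {
        "s1": {10: "T", 30: "N"},
        "s2": {10: "N", 30: "A"},
        "s3": {},
    }
    for name, seq in build_snp_alignment(reference, calls).items():
        print(f">{name}\n{seq}")
```

Position 20 is dropped in this toy input because no strain carries an unambiguous SNP there, while positions 10 and 30 are retained even though some strains are ambiguous at them. Printing the rows as FASTA records, as in the `__main__` block, gives one plausible way to hand the alignment to a tree-building program.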
Manson A.L., Cohen K.A., Abeel T., Desjardins C.A., Armstrong D.T., Barry C.E. III, Brand J., Chapman S.B., Cho S.N., Gabrielian A., Gomez J., Jodals A.M., Joloba M., Jureen P., Lee J.S., Malinga L., Maiga M., Nordenberg D., Noroc E., Romancenco E., Salazar A., Ssengooba W., Velayati A.A., Winglee K., Zalutskaya A., Via L.E., Cassell G.H., Dorman S.E., Ellner J., Farnia P., Galagan J.E., Rosenthal A., Crudu V., Homorodean D., Hsueh P.R., Narayanan S., Pym A.S., Skrahina A., Swaminathan S., Van der Walt M., Alland D., Bishai W.R., Cohen T., Hoffner S., Birren B.W., & Earl A.M. (2017). Genomic analysis of globally diverse Mycobacterium tuberculosis strains provides insights into emergence and spread of multidrug resistance. Nature Genetics, 49(3), 395-402.
Publication 2017
Digital Genotypes Insertions M tuberculosis Non tuberculous mycobacteria Patient Single nucleotide polymorphisms Strain
Other organizations :
Broad Institute, Johns Hopkins University, National Institute of Allergy and Infectious Diseases, National Institutes of Health, South African Medical Research Council, Yonsei University, International Tuberculosis Research Center, Clinica de Pneumologie Iaşi, Makerere University, Public Health Agency of Sweden, Université des Sciences, des Techniques et des Technologies de Bamako, Phthisiopneumology Institute "Chiril Draganiuc", Delft University of Technology, Masih Daneshvari Hospital, Shahid Beheshti University of Medical Sciences, Republican Scientific and Practical Centre of Pulmonology and Tuberculosis, Brigham and Women's Hospital, Boston Medical Center, Boston University, National Taiwan University Hospital, National Institute of Research in Tuberculosis, Rutgers, The State University of New Jersey, Harvard University