Benchmarking Transcriptome Reconstruction Methods

Current gene annotations for S. pombe were downloaded as file ‘pombe_290110.gff’ from GeneDB (http://old.genedb.org/genedb/pombe/). RefSeq transcript gene annotations for mouse were downloaded in BED format from the UCSC mouse genome browser gateway (http://genome.ucsc.edu/cgi-bin/hgGateway?db=mm9). Protein-coding nucleotide sequences were extracted from the genome sequences based on the gene annotations using custom Perl scripts. The mouse reference coding sequences were further distilled to remove entirely identical sequences, which correspond to isoforms encoding identical proteins and to paralogous sequences: the original 19,947 genes encoding 23,881 transcripts were reduced to 19,857 genes encoding 22,717 non-identical coding transcripts.
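The distillation step above can be illustrated with a minimal Python sketch. It is not the authors' custom Perl code; the function name `collapse_identical_cds`, the record layout, and the choice to keep the first transcript seen for each distinct sequence are assumptions made only for illustration.

```python
def collapse_identical_cds(records):
    """Keep one record per distinct coding sequence.

    `records` is an iterable of (transcript_id, gene_id, sequence) tuples.
    Assumption: the first transcript seen for a given sequence is retained;
    the source does not say which representative was kept.
    """
    seen = {}
    for transcript_id, gene_id, sequence in records:
        key = sequence.upper()  # compare sequences case-insensitively
        if key not in seen:
            seen[key] = (transcript_id, gene_id)
    # Dicts preserve insertion order, so output follows first appearance.
    return [(t, g, seq) for seq, (t, g) in seen.items()]


def count_genes(records):
    """Number of distinct gene identifiers among the records."""
    return len({gene_id for _, gene_id, _ in records})
```

Applied to a full annotation set, comparing `len(records)` and `count_genes(records)` before and after collapsing gives the kind of transcript and gene counts reported in the text.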
Reconstructed transcript sequences (via de novo assembly, Scripture, or Cufflinks) were mapped to the reference coding sequences using BLAT (ref. 35). Full-length reference annotation mappings were defined as having at least 95% sequence identity covering the entire reference coding sequence and containing at most 5% insertions or deletions (cumulative gap content). In evaluating methods that leverage the strand-specific data (Trinity and Cufflinks), proper sense-strand mapping of sequences was required. Transcripts reconstructed by the alternative methods (Scripture, ABySS, and SOAPdenovo) were allowed to map to either strand. Fusion transcripts were identified as individual reconstructed transcripts that mapped as full-length to multiple reference coding sequences and lacked overlap among the matching regions within the reconstructed transcript. One-to-one mappings were required between reconstructed transcripts and reference transcripts, including alternatively spliced isoforms, with the exception of fusion transcripts.
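The full-length and fusion criteria can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation. It assumes each BLAT alignment has already been parsed from PSL output into a `Hit` record, with the reconstructed transcript as the query and the reference coding sequence as the target. The `Hit` class, the function names, and the exact formulas for identity and gap content are assumptions, since the text does not specify how these quantities were computed.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass
class Hit:
    """One alignment; fields mirror PSL columns (coordinates are 0-based, half-open)."""
    query: str        # qName: reconstructed transcript
    target: str       # tName: reference coding sequence
    strand: str       # "+" or "-"
    matches: int      # matches
    mismatches: int   # misMatches
    rep_matches: int  # repMatches
    q_gap_bases: int  # qBaseInsert
    t_gap_bases: int  # tBaseInsert
    q_start: int      # qStart
    q_end: int        # qEnd
    t_start: int      # tStart
    t_end: int        # tEnd
    t_size: int       # tSize


def is_full_length(hit, min_identity=0.95, max_gap=0.05, require_sense=False):
    """Apply the 95% identity / whole-reference / 5% gap criteria to one hit."""
    aligned = hit.matches + hit.mismatches + hit.rep_matches
    if aligned == 0 or hit.t_size == 0:
        return False
    # Assumed definition: identity over aligned (non-gap) columns.
    identity = (hit.matches + hit.rep_matches) / aligned
    # The alignment must span the reference from its first to its last base.
    covers_reference = hit.t_start == 0 and hit.t_end == hit.t_size
    # Assumed definition: cumulative gap bases relative to reference length.
    gap_fraction = (hit.q_gap_bases + hit.t_gap_bases) / hit.t_size
    strand_ok = hit.strand == "+" or not require_sense
    return (identity >= min_identity and covers_reference
            and gap_fraction <= max_gap and strand_ok)


def find_fusions(hits, require_sense=False):
    """Map each query to the references it joins as a candidate fusion.

    A query qualifies when it has full-length hits to at least two different
    references whose matching regions on the query do not overlap.
    """
    full_by_query = defaultdict(list)
    for hit in hits:
        if is_full_length(hit, require_sense=require_sense):
            full_by_query[hit.query].append(hit)

    fusions = {}
    for query, full_hits in full_by_query.items():
        partners = set()
        for a, b in combinations(full_hits, 2):
            disjoint = a.q_end <= b.q_start or b.q_end <= a.q_start
            if a.target != b.target and disjoint:
                partners.update((a.target, b.target))
        if partners:
            fusions[query] = sorted(partners)
    return fusions
```

Setting `require_sense=True` corresponds to the evaluation of the strand-specific methods, while the default allows mapping to either strand. Enforcing the one-to-one assignment between reconstructed and reference transcripts would be a further step that this sketch does not attempt.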

Grabherr M.G., Haas B.J., Yassour M., Levin J.Z., Thompson D.A., Amit I., Adiconis X., Fan L., Raychowdhury R., Zeng Q., Chen Z., Mauceli E., Hacohen N., Gnirke A., Rhind N., di Palma F., Birren B.W., Nusbaum C., Lindblad-Toh K., Friedman N., & Regev A. (2011). Trinity: reconstructing a full-length transcriptome without a genome from RNA-Seq data. Nature Biotechnology, 29(7), 644-652.

Publication 2011

Coding sequence Coding sequences Deletions Gene annotations Genes Genome Insertions Isoforms Mouse Proteins

Other organizations : Broad Institute, Massachusetts Institute of Technology, University of Massachusetts Chan Medical School, Science for Life Laboratory, Uppsala University, Hebrew University of Jerusalem

Protocol cited in 1 869 other protocols

Variable analysis

independent variables

Reconstructed transcript sequences (via de novo assembly, Scripture, or Cufflinks)

dependent variables

Mappings of reconstructed transcript sequences to the reference coding sequences
Full-length reference annotation mappings (at least 95% sequence identity covering the entire reference coding sequence and containing at most 5% insertions or deletions)
Sense-strand mapping of sequences (for methods that leverage the strand-specific data, Trinity and Cufflinks)
Fusion transcripts (individual reconstructed transcripts that mapped as full-length to multiple reference coding sequences and lacked overlap among the matching regions within the reconstructed transcript)

control variables

Reference coding sequences extracted from the genome sequences based on the gene annotations
Mouse reference coding sequences distilled to remove entirely identical sequences corresponding to isoforms encoding identical proteins and paralogous sequences
