The Iso-Seq method for sequencing full-length transcripts was developed by PacBio during the same time period as the genome assembly. We therefore used this technique to improve characterization of transcript isoforms expressed in cattle, using a diverse set of tissues collected from L1 Dominette 01449 upon euthanasia. The data were collected using an early version of the Iso-Seq library protocol [26] as suggested by PacBio. Briefly, RNA was extracted from each tissue using Trizol reagent as directed (Thermo Fisher). Then 2 μg of RNA was selected for poly(A) tails and converted into complementary DNA (cDNA) using the SMARTer PCR cDNA Synthesis Kit (Clontech). The cDNA was amplified in bulk with 12–14 rounds of PCR in 8 separate reactions, then pooled and size-selected into 1–2, 2–3, and 3–6 kb fractions using the BluePippin instrument (Sage Science). Each size fraction was separately re-amplified in 8 additional reactions of 11 PCR cycles. The products of each size fraction amplification were pooled and purified using AMPure PB beads (Pacific Biosciences) as directed, and converted to SMRTbell libraries using the Template Prep Kit v1.0 (PacBio) as directed. Iso-Seq was conducted for 22 tissues: 21 from the reference cow (abomasum, aorta, atrium, cerebral cortex, duodenum, hypothalamus, jejunum, liver, longissimus dorsi muscle, lung, lymph node, mammary gland, medulla oblongata, omasum, reticulum, rumen, subcutaneous fat, temporal cortex, thalamus, uterine myometrium, and ventricle) and the testis of her sire. The size fractions were sequenced on the RS II instrument in either 4 SMRT Cells (for each of the 2 smaller fractions) or 5 SMRT Cells (for the largest fraction). Isoforms were identified using the Cupcake ToFU pipeline [27] without using a reference genome.
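The amplification and sequencing effort implied by this protocol can be tallied with some simple arithmetic. The sketch below is illustrative only: it assumes that every one of the 22 tissues had all three size fractions sequenced at the stated number of SMRT Cells, which the text does not state explicitly. All variable names are hypothetical.

```python
# Values taken from the protocol description above.
smrt_cells_per_fraction = {"1-2 kb": 4, "2-3 kb": 4, "3-6 kb": 5}
first_pcr_cycles = (12, 14)   # bulk amplification, min and max rounds
second_pcr_cycles = 11        # re-amplification of each size fraction
n_tissues = 22                # 21 cow tissues plus the sire's testis

# SMRT Cells needed for one tissue across its three size fractions.
cells_per_tissue = sum(smrt_cells_per_fraction.values())

# ASSUMPTION: all tissues were sequenced with the full complement of cells,
# so this is an upper-level estimate rather than a reported figure.
total_cells = cells_per_tissue * n_tissues

# Cumulative PCR cycles experienced by any sequenced molecule (min, max).
total_pcr_cycles = tuple(c + second_pcr_cycles for c in first_pcr_cycles)

print(f"SMRT Cells per tissue: {cells_per_tissue}")
print(f"Estimated SMRT Cells overall: {total_cells}")
print(f"Cumulative PCR cycles: {total_pcr_cycles[0]}-{total_pcr_cycles[1]}")
```

The cumulative cycle count is worth keeping in mind when interpreting the data, because two rounds of amplification can skew the relative abundance of transcripts; this is one reason the Iso-Seq data are better suited to defining isoform structure than to quantification.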
Short-read–based RNA-seq data derived from tissues of Dominette were available in the GenBank database because her tissues have been a freely distributed resource for the research community. To complement and extend these data, and to ensure that the tissues used for Iso-Seq were also represented by RNA-seq data for quantitative analysis and confirmation of the isoforms observed in Iso-Seq, we generated additional data, avoiding overlap with existing public data. Specifically, the TruSeq stranded mRNA LT kit (Illumina, Inc.) was used as directed to create RNA-seq libraries, which were sequenced to ≥30 million reads for each tissue sample. The Dominette tissues that were sequenced in this study include abomasum, anterior pituitary, aorta, atrium, bone marrow, cerebellum, duodenum, frontal cortex, hypothalamus, KPH fat (internal organ fat taken from the covering on the kidney capsule), lung, lymph node, mammary gland (lactating), medulla oblongata, nasal mucosa, omasum, reticulum, rumen, subcutaneous fat, temporal cortex, thalamus, uterine myometrium, and ventricle. RNA-seq libraries were also sequenced from the testis of her sire. All public datasets, together with the newly sequenced RNA-seq and Iso-Seq datasets, were used to annotate the assembly, to improve the representation of low-abundance and tissue-specific transcripts, and to properly annotate potential tissue-specific isoforms of each gene.
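The relationship between the two tissue lists can be checked with a simple set comparison. The sketch below uses only the Dominette tissue names given in the text; treating "mammary gland (lactating)" as the same tissue as "mammary gland" is an assumption, as are the variable names. Iso-Seq tissues absent from the newly sequenced RNA-seq list are presumably those already covered by public data, although the text does not say so for each tissue.

```python
# Dominette tissues used for Iso-Seq (sire testis excluded).
iso_seq = {
    "abomasum", "aorta", "atrium", "cerebral cortex", "duodenum",
    "hypothalamus", "jejunum", "liver", "longissimus dorsi muscle", "lung",
    "lymph node", "mammary gland", "medulla oblongata", "omasum",
    "reticulum", "rumen", "subcutaneous fat", "temporal cortex", "thalamus",
    "uterine myometrium", "ventricle",
}

# Dominette tissues newly sequenced by RNA-seq in this study.
# ASSUMPTION: "mammary gland (lactating)" is normalized to "mammary gland".
new_rna_seq = {
    "abomasum", "anterior pituitary", "aorta", "atrium", "bone marrow",
    "cerebellum", "duodenum", "frontal cortex", "hypothalamus", "KPH fat",
    "lung", "lymph node", "mammary gland", "medulla oblongata",
    "nasal mucosa", "omasum", "reticulum", "rumen", "subcutaneous fat",
    "temporal cortex", "thalamus", "uterine myometrium", "ventricle",
}

shared = iso_seq & new_rna_seq          # tissues with both data types generated here
iso_seq_only = iso_seq - new_rna_seq    # Iso-Seq tissues relying on other RNA-seq data
rna_seq_only = new_rna_seq - iso_seq    # tissues with new RNA-seq but no Iso-Seq

print(f"Both data types: {len(shared)} tissues")
print(f"Iso-Seq only: {sorted(iso_seq_only)}")
print(f"New RNA-seq only: {sorted(rna_seq_only)}")
```

A comparison like this makes it easy to see which Iso-Seq isoforms can be confirmed directly against matched RNA-seq from this study and which depend on previously deposited datasets.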