Automated TAL Effector Gene Assembly

For each strain, a list of raw reads for tal gene regions was generated by using blasr (Chaisson & Tesler, 2012) to align reads to the BLS256 tal gene sequences, following the PacBio hgap Whitelisting protocol (PacBio, 2013a). Next, a modification of the RS_PreAssembler protocol included in SMRTAnalysis 2.0 was run on these reads. In this modification, which we designated the RS_PreAssembler_TALs protocol, the ‘whiteList’ parameter for the filtering step was set to the tal gene read list. The minimum read-length cut-off was set to 4000 and the seed read-length cut-off was set to 16000 to ensure that short-read to long-read alignments used for correction would be long enough to be unambiguous. The maxLCPLength was set to 14, as recommended for data generated with the XL-C2 enzyme and chemistry (PacBio, 2013b). Specifically, the blasr options string was changed to ‘-minReadLength 4000 -maxScore -1000 -bestn 24 -maxLCPLength 14 -nCandidates 24’.
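As an illustration only, the blasr options string quoted above can be rebuilt from its individual parameters. The following is a minimal Python sketch, not part of SMRTAnalysis or the pbx toolkit; the helper name build_blasr_options and the dictionary layout are hypothetical, and the minus sign in the maxScore value is assumed to be an ordinary ASCII hyphen.

```python
# Hypothetical helper: joins flag/value pairs into one option string.
# The parameter names and values are those quoted in the text; the
# function itself is an illustration, not code from SMRTAnalysis or pbx.

def build_blasr_options(params):
    """Return '-flag value' pairs joined by spaces, in insertion order."""
    return " ".join(f"-{flag} {value}" for flag, value in params.items())


# Values as reported for the RS_PreAssembler_TALs protocol.
TAL_BLASR_PARAMS = {
    "minReadLength": 4000,
    "maxScore": -1000,   # assumed ASCII minus sign
    "bestn": 24,
    "maxLCPLength": 14,
    "nCandidates": 24,
}

options = build_blasr_options(TAL_BLASR_PARAMS)
print(options)
```

Keeping the values in one mapping makes it easy to see which settings differ from the defaults of the unmodified protocol.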
After preassembly, corrected reads were trimmed to estimated QV50 windows and filtered to those longer than 4000 bp using the SMRTAnalysis 2.0 trimFastqByQVWindow.py utility. Based on comparison with the reference genomes, these reads are typically 97% accurate. Reads were assembled using the Minimo assembler of amos 3.1.0 (Treangen et al., 2011), using NUCmer 3.1 (Kurtz et al., 2004) for the overlap step, for all 16 combinations of minimum overlap length (500, 1000, 2000 and 3000) and minimum overlap per cent identity (91, 93, 95 and 97). Contig sets generated by each of these assemblies were polished separately with the RS_Resequencing protocol included in SMRTAnalysis 2.0. This protocol aligns reads to the assembled regions and uses the Quiver algorithm to call the consensus, regularly achieving 99.999% accuracy in regions with ≥60× coverage (Chin et al., 2013). For this, read filtering settings were set to those used for preassembly, the ‘Place Repeats Randomly’ option was unchecked and all other settings were left at defaults.
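The 16 assembly runs are the Cartesian product of the four minimum overlap lengths and the four minimum overlap per cent identities. A minimal Python sketch of that grid follows; the function and key names are hypothetical, and the sketch only enumerates the settings without invoking Minimo or NUCmer.

```python
from itertools import product

# Values reported in the text for the overlap step.
MIN_OVERLAP_LENGTHS = (500, 1000, 2000, 3000)
MIN_OVERLAP_IDENTITIES = (91, 93, 95, 97)


def assembly_grid(lengths=MIN_OVERLAP_LENGTHS, identities=MIN_OVERLAP_IDENTITIES):
    """Enumerate every (overlap length, per cent identity) combination."""
    return [
        {"min_overlap_length": length, "min_overlap_identity": identity}
        for length, identity in product(lengths, identities)
    ]


grid = assembly_grid()
for settings in grid:
    # Each entry corresponds to one assembly that is polished separately.
    print(settings["min_overlap_length"], settings["min_overlap_identity"])
```

Each of the 16 entries corresponds to one contig set that is then polished on its own.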
RVD sequences were determined from the 16 polished tal gene assemblies using a consensus approach. For each contig across all polished assemblies, encoded TAL effector CRRs were extracted and split into RVD sequences by conserved boundaries. A sorted list of unique RVD sequences and the number of times each was encountered in the 16 assemblies was inspected (e.g. File S1, available in the online Supplementary Material), and sequences ending in frameshifts or other anomalies that were prefixes of other, more frequently occurring sequences were discarded. The resulting list was retained as the correct RVD sequences. As an additional measure in case any tal genes were incompletely assembled before polishing, assemblies of the polished contigs in each set were carried out, again with Minimo, and the RVD sequence consensus process was repeated. In all cases the results were identical.
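The counting and prefix-discard step of this consensus approach can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the pbx implementation: each RVD sequence is assumed to be already extracted as a list of RVD strings, a sequence is treated as truncated when it is a proper prefix of a sequence seen more often, and explicit frameshift or anomaly markers are not modelled. The function name and the toy data are hypothetical.

```python
from collections import Counter


def consensus_rvd_sequences(assemblies):
    """Count unique RVD sequences across assemblies and drop any sequence
    that is a proper prefix of a more frequently observed sequence.

    `assemblies` is a list of assemblies, each a list of RVD sequences,
    each RVD sequence a list of RVD strings (assumed input layout).
    Returns (sequence, count) pairs sorted by descending count.
    """
    counts = Counter(tuple(seq) for assembly in assemblies for seq in assembly)
    kept = []
    for seq, n in counts.items():
        is_truncated = any(
            len(other) > len(seq) and other[: len(seq)] == seq and m > n
            for other, m in counts.items()
        )
        if not is_truncated:
            kept.append((seq, n))
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept


# Toy data (hypothetical): three assemblies, one with a truncated gene.
toy_assemblies = [
    [["NI", "HD", "NG", "NN"], ["HD", "HD", "NG"]],
    [["NI", "HD", "NG", "NN"], ["HD", "HD", "NG"]],
    [["NI", "HD"], ["HD", "HD", "NG"]],
]

result = consensus_rvd_sequences(toy_assemblies)
for seq, n in result:
    print(n, "-".join(seq))
```

In the toy data, the two-RVD sequence seen once is a prefix of a four-RVD sequence seen twice, so it is dropped and two sequences remain.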
This workflow for assembly of tal genes and extraction of encoded RVD sequences, which we have named the pbx toolkit, is automated and available on GitHub (https://github.com/boglab/pbx). The only required input is the path to a folder containing bas.h5 and bax.h5 files of raw sequence reads. Additional options allow specifying the sequences to use for identifying tal gene reads and the conserved repeat boundaries to use for RVD sequence determination. This enables the workflow to be easily adapted for use with other Xanthomonas genomes.

Booher N.J., Carpenter S.C., Sebra R.P., Wang L., Salzberg S.L., Leach J.E, & Bogdanove A.J. (2015). Single molecule real-time sequencing of Xanthomonas oryzae genomes reveals a dynamic structure and complex TAL (transcription activator-like) effector gene relationships. Microbial Genomics, 1(4), e000032.

Publication 2015

Other organizations : Cornell University, Icahn School of Medicine at Mount Sinai, Johns Hopkins University, Colorado State University

Variable analysis

independent variables

Alignment method (BLASR)
Parameters for the RS_PreAssembler_TALs protocol (minimum read-length, seed read-length, maxLCPLength, BLASR options)
Minimum overlap length (500, 1000, 2000, 3000) and minimum overlap percent identity (91, 93, 95, 97) for Minimo assembler
Filtering and polishing settings for the RS_Resequencing protocol

dependent variables

Assembled TAL gene sequences
Determined RVD sequences

control variables

Reference genomes for comparison
Conserved repeat boundaries for RVD sequence determination

controls

Positive control: Comparison of assembled TAL gene sequences with reference genomes
Negative control: Not explicitly mentioned
