The reference gene sequences used in the search can be obtained from a validated GenBank record whose annotations describe the names and qualifiers of all of the genes in the plastome. Plann searches the newly assembled plastome for sequences similar to those known genes and maps the matched genes to their corresponding genomic locations. Because it searches only one sequence against one other sequence, the process is very fast: a run should take only a few seconds.
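The mapping step can be illustrated with a minimal sketch. This is not Plann's own code (Plann is written in Perl); it only assumes that the reference genes were searched against the new plastome with BLASTN and that the hits were written in the standard 12-column tabular format (`-outfmt 6`). The names `GeneLocation`, `parse_blast_hit`, and `best_locations` are hypothetical and exist only for this illustration.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class GeneLocation(NamedTuple):
    gene: str        # query ID, assumed here to be the reference gene name
    start: int       # 1-based leftmost coordinate on the new plastome
    end: int         # 1-based rightmost coordinate on the new plastome
    strand: str      # "+" or "-"
    identity: float  # percent identity reported by BLASTN


def parse_blast_hit(line: str) -> Tuple[GeneLocation, float]:
    """Parse one BLASTN tabular line; return the location and its bit score."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError("expected 12 tab-separated columns")
    sstart, send = int(fields[8]), int(fields[9])
    # In tabular BLASTN output a minus-strand hit has sstart > send.
    strand = "+" if sstart <= send else "-"
    location = GeneLocation(
        gene=fields[0],
        start=min(sstart, send),
        end=max(sstart, send),
        strand=strand,
        identity=float(fields[2]),
    )
    return location, float(fields[11])


def best_locations(lines: Iterable[str]) -> Dict[str, GeneLocation]:
    """Keep the highest-scoring hit for each gene."""
    best: Dict[str, Tuple[GeneLocation, float]] = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        location, score = parse_blast_hit(line)
        if location.gene not in best or score > best[location.gene][1]:
            best[location.gene] = (location, score)
    return {gene: loc for gene, (loc, _) in best.items()}
```

Keeping only the best-scoring hit per gene is a simplification made for this sketch; genes with several exons, or genes duplicated in the inverted repeat, would need more than one location per gene.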
Plann consists of Perl scripts contained in a GitHub repository (https://github.com/daisieh/plann/releases/tag/v1.1) and licensed under a BSD open-source license. It uses two freely available command-line tools from NCBI: BLASTN and tbl2asn. The graphical user interface (GUI) application Sequin, also available from NCBI, can be used to generate the template file required by tbl2asn and to validate the output of tbl2asn. Plann has been tested on Unix and Unix-like platforms, including Linux and Mac OS X. To use Plann, make sure that both BLASTN and tbl2asn are directly available as executables from the command line, then execute the script plann.pl. The required input files are the new plastome sequence (in FASTA format), the reference plastome (in GenBank format), and the Sequin template file. The output is a Sequin file, ready for submission to NCBI, and a text report listing the genes that could not be aligned. These problematic genes can be edited manually in a text editor and added back to the Sequin .tbl file, which can then be rerun through tbl2asn. If the sequence itself turns out to be incorrect, it can be edited and run through Plann again.
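Before running plann.pl, it can help to confirm that the two NCBI tools are reachable from the command line. The following is a minimal sketch of such a check and is not part of Plann; it assumes the executables are installed under the names `blastn` and `tbl2asn`, and the helper name `missing_tools` is hypothetical.

```python
import shutil

# Assumed executable names; adjust if the tools are installed under other names.
REQUIRED_TOOLS = ("blastn", "tbl2asn")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("blastn and tbl2asn are both available")
```

If either tool is reported as missing, its installation directory needs to be added to the PATH before Plann is run.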
To validate the annotations produced by Plann, we reciprocally annotated plastomes of taxa at varying phylogenetic distances. The reference GenBank records used were NC_024735.1 (Populus balsamifera), NC_009143.1 (P. trichocarpa), NC_024734.1 (P. fremontii), NC_024681.1 (Salix interior), and NC_012224.1 (Jatropha curcas). This analysis can be found in the repository at test/analysis.sh. The comparative results are presented in Fig. 1. Nearly all features were successfully annotated within a genus. Even for a distantly related pair of taxa, Plann was able to identify nearly 70% of the features present in the annotation.