Details of the new and updated lineage data sets, as well as the new software developments that make up BUSCO v3, are presented in the Supplementary Material online and in the user guide at http://busco.ezlab.org. BUSCO has been developed and tested on Linux; the codebase is written in Python and runs with the standard Python packages. BUSCO is licensed and freely distributed under the MIT Licence. The BUSCO v3 source code is available through the GitLab project, https://gitlab.com/ezlab/busco, and is also provided as a virtual machine with dependencies preinstalled.
Versions and accessions of all the genome assemblies, annotated gene sets, or transcriptomes assessed by BUSCO as part of this study are detailed in the Supplementary Material online, along with the settings used for each analysis. The Augustus ab initio gene prediction analyses are described in detail in the Supplementary Material online. To compute the coverage scores, the predicted protein sequences were aligned against their respective reference annotations using BLASTp (e.g., a coverage score of 100% means that every amino acid of a reference protein is found in the predicted protein with no insertions, deletions, or substitutions). Details of the preprocessing, BUSCO completeness analyses, and postprocessing of the rodent data sets for the phylogenomics study are all presented in the Supplementary Material online. Proteins selected for the superalignment were aligned using MAFFT (Katoh and Standley 2013) and filtered with trimAl (Capella-Gutiérrez et al. 2009), and the maximum likelihood tree was built using RAxML (Stamatakis 2014).
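The coverage score described above can be illustrated with a minimal Python sketch. This is not the procedure used in the study (which is specified in the Supplementary Material); it assumes BLASTp was run with the standard 12-column tabular output (`-outfmt 6`), with predicted proteins as queries and reference proteins as subjects, and it assumes that coverage is taken as the number of identical aligned residues in the best-scoring HSP divided by the reference protein length. The function names `coverage_score` and `parse_best_hits` are hypothetical.

```python
def coverage_score(pident: float, align_len: int, ref_len: int) -> float:
    """Percentage of reference residues matched identically in one HSP.

    Assumption: coverage = identical residues / reference length.
    pident and align_len are columns 3 and 4 of BLAST tabular output;
    ref_len must be supplied separately (e.g., from the reference FASTA).
    """
    if ref_len <= 0:
        raise ValueError("ref_len must be positive")
    identical = round(pident / 100.0 * align_len)
    # Cap at 100 in case rounding pushes the ratio slightly above it.
    return min(100.0, 100.0 * identical / ref_len)


def parse_best_hits(lines):
    """Keep the highest-bitscore HSP per (query, subject) pair.

    Expects tab-separated lines with the 12 default columns:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    Returns {(qseqid, sseqid): (bitscore, pident, length)}.
    """
    best = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        key = (fields[0], fields[1])
        hit = (float(fields[11]), float(fields[2]), int(fields[3]))
        if key not in best or hit[0] > best[key][0]:
            best[key] = hit
    return best
```

Under these assumptions, a predicted protein that aligns to a 250-residue reference over all 250 positions at 100% identity scores 100%, whereas any insertion, deletion, or substitution lowers the number of identical residues and therefore the score.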