The A. thaliana bHLH proteins reported by Bailey et al. (2003), Heim et al. (2003), and Toledo-Ortiz et al. (2003) were retrieved from The Arabidopsis Information Resource (http://www.arabidopsis.org/). A clear bHLH domain was not found in At1g31050 (AtbHLH111) or At1g22380 (AtbHLH152), so these proteins were excluded from this study; At2g20095 (AtbHLH133) and At4g38071 (AtbHLH131) could not be found in any database. A data set of predicted O. sativa L. ssp. japonica bHLH proteins was retrieved from the Plant TFDB (Guo et al. 2008) and combined with the bHLH protein sequences reported by Li et al. (2006b), which were retrieved from the Rice Genome Annotation Project (http://rice.plantbiology.msu.edu/). Eleven new proteins were numbered following the nomenclature style of Li et al. (2006b), whereas a clear bHLH domain was not found in Os01g65080 (OsbHLH033), Os04g35000 (OsbHLH145), Os11g02054 (OsbHLH160), or Os12g02020 (OsbHLH161). A data set of predicted Physcomitrella patens bHLH proteins was retrieved from the Plant TFDB (Guo et al. 2008). A direct search for genes annotated as bHLH was performed on the genome assembly of Selaginella moellendorffii v1.0 (http://www.jgi.doe.gov/). HMMsearch (Eddy 1998) was used with the PFAM profile hidden Markov model (pHMM) HLH_ls.hmm (http://pfam.sanger.ac.uk/) to screen the genome assemblies of Cyanidioschyzon merolae (Matsuzaki et al. 2004), Chlamydomonas reinhardtii v3.0 (Merchant et al. 2007), Ostreococcus tauri v2.0 (Palenik et al. 2007), and Thalassiosira pseudonana v3.0 (Armbrust et al. 2004), as well as the draft assemblies of Chlorella vulgaris C-169 and Volvox carteri (http://www.jgi.doe.gov/).
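As an illustration of how hits from a profile HMM screen of this kind might be collected, the following minimal Python sketch parses a tabular hit report and keeps targets below an E-value cutoff. It assumes the per-target table layout written by the `--tblout` option of later HMMER releases, which may differ from the output produced by the HMMsearch version used here. The function names, the sample records, the `1e-3` cutoff, and the `excluded` set are hypothetical; no threshold is reported in this study.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    target: str    # name of the matched protein sequence
    query: str     # name of the profile HMM
    evalue: float  # full-sequence E-value
    score: float   # full-sequence bit score


def parse_tblout(text: str) -> list[Hit]:
    """Parse per-target hit lines (assumed HMMER3 --tblout layout)."""
    hits = []
    for line in text.splitlines():
        # Skip blank lines and '#' comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        # 18 fixed columns followed by a free-text description.
        fields = line.split(maxsplit=18)
        if len(fields) < 18:
            continue
        hits.append(Hit(fields[0], fields[2], float(fields[4]), float(fields[5])))
    return hits


def filter_hits(hits, max_evalue=1e-3, excluded=frozenset()):
    """Keep hits at or below the (hypothetical) E-value cutoff,
    dropping any target listed in `excluded`."""
    return [h for h in hits if h.evalue <= max_evalue and h.target not in excluded]


# Hypothetical sample records, for illustration only.
SAMPLE_TBLOUT = """\
# target name  accession  query name  accession  E-value  score  bias ...
seqA - HLH PF00010.1 1.2e-10 40.5 0.1 2.0e-10 39.8 0.1 1.0 1 0 0 1 1 1 1 hypothetical protein
seqB - HLH PF00010.1 0.5 5.0 0.0 0.7 4.5 0.0 1.0 1 0 0 1 1 1 1 -
"""

if __name__ == "__main__":
    for hit in filter_hits(parse_tblout(SAMPLE_TBLOUT)):
        print(hit.target, hit.evalue)
```

The `excluded` argument mirrors the manual step described above, in which proteins without a clear bHLH domain were set aside after inspection; a score cutoff alone would not replace that inspection.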
Five Homo sapiens and four Amphimedon queenslandica (demosponge) sequences representing the major metazoan groups of bHLH proteins (based on Jones 2004; Simionato et al. 2007) were retrieved from GenBank; group F proteins are not clearly alignable to other bHLH proteins (Ledent et al. 2002) and so were not used in this study. The Saccharomyces cerevisiae bHLH proteins reported by Robinson and Lopes (2000) were retrieved from http://www.yeastgenome.org/.
For simplicity, all sequences were renamed according to supplementary table S1 (Supplementary Material online). The complete amino acid sequences of all proteins can be found in supplementary data 1 (Supplementary Material online).
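A renaming step of this kind can be sketched as a lookup applied to FASTA headers. The sketch below is only illustrative: the function name `rename_fasta`, the identifiers, and the mapping are hypothetical and are not taken from supplementary table S1.

```python
def rename_fasta(fasta_text: str, name_map: dict[str, str]) -> str:
    """Replace the identifier on each FASTA header line using name_map.

    Identifiers absent from name_map are kept unchanged. Any free-text
    description after the identifier is dropped in this sketch.
    """
    out = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            parts = line[1:].split(maxsplit=1)
            old_id = parts[0] if parts else ""
            out.append(">" + name_map.get(old_id, old_id))
        else:
            # Sequence lines pass through untouched.
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Hypothetical identifiers and sequences, for illustration only.
    example = ">gene_0001 some description\nMKV\n>gene_0002\nMAA"
    print(rename_fasta(example, {"gene_0001": "Xx_bHLH001"}))
```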