Eight expressed sequence tag (EST) libraries were obtained from the National Center for Biotechnology Information (NCBI) database (GenBank) in May 2007. The libraries represented Symbiodinium clade A (2,163 sequences); Symbiodinium clade C (5,156 sequences); Amphidinium carterae (3,383 sequences); Alexandrium tamarense (10,885 sequences); Heterocapsa triquetra (6,807 sequences); Karenia brevis (6,986 sequences); Karlodinium micrum (16,532 sequences); and Lingulodinium polyedrum (3,639 sequences). These libraries differed widely in their source material. For example, the Symbiodinium clade A library was generated from cells that had been in culture for over 25 years [37], whereas the clade C library encompasses Symbiodinium cDNAs isolated from the staghorn coral Acropora aspera exposed to a variety of stresses, including elevated temperature, ammonium supplementation, and seawater with different inorganic carbon concentrations [38]. The other dinoflagellate EST libraries were obtained from cultures grown and harvested under a variety of conditions, including isolation during different phases of growth or at different time points in the daily cycle [39]–[43].
Using the two Symbiodinium datasets as queries, a Perl script [44] linking the BLASTn output files from the BLAST v2.2.15 package (http://www.ncbi.nlm.nih.gov/) was used to retrieve homologous sequences from the six non-Symbiodinium dinoflagellate target libraries with an e-value threshold of 10⁻²⁵. This relatively stringent cutoff was chosen to restrict the integration of paralogous genes and to limit the inclusion of short sequence fragments (<200 bp). Sequence identity within each homologous group was assessed at the protein level using BLASTx. Eighty-four sequence alignments containing all homologous sequences retrieved in the BLAST analyses were created in the BioEdit v5.0.9 sequence alignment software [69] using ClustalW [70], then checked and manually edited. Because individual EST alignments contain sequences from either a single Symbiodinium clade (A or C) or both clades plus other dinoflagellates (see Table S1), candidate genes suitable for downstream characterization were shortlisted based on the presence of conserved regions that would allow forward and reverse primers to be designed. To facilitate work on all clades of Symbiodinium, alignments containing contigs from both Symbiodinium libraries (A and C) were prioritized. Symbiodinium clades A and C represent the most ancestral and the most derived Symbiodinium lineages, respectively, so primers targeting these highly divergent clades would most likely also recover Symbiodinium from all other clades (B, D, E, F, G, H and I). Non-Symbiodinium sequences were also included in these alignments, because they provided information on how variable a given candidate gene was between dinoflagellate groups, while also allowing for the design of ‘Symbiodinium-specific’ primers in variable regions or ‘dinoflagellate-specific’ primers in conserved regions.
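The e-value filtering step described above can be illustrated with a minimal Python sketch. It is not the Perl script cited in the text [44]; it assumes the BLASTn results were written in the standard 12-column tabular format (query ID, subject ID, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, e-value, bit score). The function name `parse_tabular_hits` and all sequence identifiers in the example are hypothetical.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-25  # threshold stated in the text


def parse_tabular_hits(lines, max_evalue=E_VALUE_CUTOFF):
    """Group subject IDs by query ID, keeping hits with e-value <= max_evalue.

    Assumes 12-column tab-separated BLAST output; comment lines, blank
    lines, and malformed lines are skipped.
    """
    groups = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 12:
            continue
        query_id, subject_id = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue <= max_evalue:
            groups[query_id].add(subject_id)
    return dict(groups)


# Hypothetical hits: two pass the cutoff, one (1e-10) does not.
example = [
    "symA_001\tkbrevis_17\t88.5\t420\t48\t0\t1\t420\t10\t429\t1e-60\t250",
    "symA_001\tatam_03\t80.0\t150\t30\t0\t1\t150\t5\t154\t1e-10\t70",
    "symC_042\tkmicrum_9\t85.0\t300\t45\t0\t1\t300\t1\t300\t3e-40\t180",
]
homolog_groups = parse_tabular_hits(example)
```

Each key in `homolog_groups` is a Symbiodinium query and each value is the set of target-library sequences that would go forward into an alignment for that query.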
In the single case where no Symbiodinium clade A contig was available for comparison with clade C (the calmodulin gene; Table 1, Table S1), the non-Symbiodinium dinoflagellate contigs were used for primer design. Finally, gene alignments were sorted again to identify those that allowed for the design of primers yielding amplicons between 150 bp and 1000 bp in length. Forward and reverse Symbiodinium-specific primers were designed across the conserved regions of the candidate genes using MacVector v11.0.2 (MacVector Inc., NC, USA), minimizing self/duplex hybridization and internal secondary structure problems.
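The two selection criteria described above (conserved regions suitable for primers, and an amplicon of 150 bp to 1000 bp) can be sketched in Python as follows. This is an illustration only and does not reproduce the manual alignment inspection or the MacVector primer design used in the study. It assumes equal-length aligned sequences with `-` as the gap character, treats a region as conserved only when every sequence has the same base, and uses a minimum conserved run of 18 columns as an assumed stand-in for primer length. Amplicon length is counted in alignment columns, which approximates base pairs only when the region contains few gaps. The function names are hypothetical.

```python
def conserved_runs(aligned_seqs, min_len=18):
    """Return half-open (start, end) column ranges that are identical
    and gap-free across all aligned sequences and span >= min_len columns."""
    n_cols = len(aligned_seqs[0])
    runs, start = [], None
    for col in range(n_cols):
        column = {seq[col].upper() for seq in aligned_seqs}
        conserved = len(column) == 1 and "-" not in column
        if conserved and start is None:
            start = col  # a conserved run begins here
        elif not conserved and start is not None:
            if col - start >= min_len:
                runs.append((start, col))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and n_cols - start >= min_len:
        runs.append((start, n_cols))
    return runs


def primer_site_pairs(runs, min_amplicon=150, max_amplicon=1000):
    """Pair an upstream (forward) run with a downstream (reverse) run when
    the span from forward start to reverse end fits the amplicon window."""
    pairs = []
    for i, fwd in enumerate(runs):
        for rev in runs[i + 1:]:
            amplicon = rev[1] - fwd[0]
            if min_amplicon <= amplicon <= max_amplicon:
                pairs.append((fwd, rev, amplicon))
    return pairs


# Toy alignment: 20 conserved columns, 200 variable columns, 20 conserved columns.
toy_alignment = [
    "A" * 20 + "C" * 200 + "G" * 20,
    "A" * 20 + "T" * 200 + "G" * 20,
]
runs = conserved_runs(toy_alignment)
pairs = primer_site_pairs(runs)
```

In the toy alignment the two conserved blocks flank a variable region, giving one candidate forward/reverse site pair with a 240-column span, which falls inside the 150 to 1000 window.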