Single amino acid polymorphisms in the minke whale and bottlenose dolphin genes were compared with those in the cow and pig genes by multiple sequence alignment using ClustalW2 (ref. 69). Protein sequences of the fin whale and finless porpoise were predicted by aligning the raw reads to, and substituting them into, the minke whale and bottlenose dolphin scaffolds, respectively. Artifacts were removed from the alignments manually, and alignments were retained only if they had ≥1/2 coverage and ≥1/2 well-matched amino acids (consensus symbol ‘*’, ‘:’ or ‘.’). To exclude individual variation, only amino acid changes shared by all the whales tested (four minke whales and two bottlenose dolphins) were used. Amino acid changes likely to affect protein function (‘probably damaging’ or ‘possibly damaging’) were predicted using PolyPhen-2 (ref. 70).
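The alignment filter and the shared-change rule described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names `passes_filter` and `shared_changes` are hypothetical, and the sketch assumes that "coverage" means the fraction of alignment columns with no gap in any sequence and that "well-matched" means the fraction of columns whose ClustalW consensus symbol is ‘*’, ‘:’ or ‘.’.

```python
from typing import List

# ClustalW consensus symbols that mark identical or conserved columns.
CONSERVED = set("*:.")


def passes_filter(seqs: List[str], consensus: str, min_frac: float = 0.5) -> bool:
    """Keep an alignment only if coverage and conservation both reach min_frac.

    Assumption: coverage = fraction of columns with no gap in any sequence;
    well-matched = fraction of columns with a conserved consensus symbol.
    """
    length = len(consensus)
    if length == 0 or any(len(s) != length for s in seqs):
        return False
    covered = sum(1 for i in range(length) if all(s[i] != "-" for s in seqs))
    matched = sum(1 for c in consensus if c in CONSERVED)
    return covered / length >= min_frac and matched / length >= min_frac


def shared_changes(foreground: List[str], background: List[str]) -> List[int]:
    """Return 0-based columns where every foreground sequence carries the same
    residue and that residue is absent from all background sequences."""
    positions = []
    for i in range(len(foreground[0])):
        fg = {s[i] for s in foreground}
        bg = {s[i] for s in background}
        if len(fg) == 1 and "-" not in fg and "-" not in bg and fg.isdisjoint(bg):
            positions.append(i)
    return positions


if __name__ == "__main__":
    # Toy aligned sequences, invented for illustration only.
    whales = ["MKTAY", "MKTAY"]
    refs = ["MKSAY", "MKSAY"]
    if passes_filter(whales + refs, "**:**"):
        print(shared_changes(whales, refs))
```

Requiring a single residue across all foreground sequences mirrors the stated goal of excluding individual variation: a change carried by only some individuals is dropped.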
PSGs were identified on the basis of dN/dS ratios using branch-site likelihood ratio tests for single-copy gene families with a conservative 10% false discovery rate (FDR) criterion (ref. 10). For the PSGs of the minke whale, the minke whale was used as the foreground branch, and the cow and pig were used as the background branches. For the PSGs of the bottlenose dolphin, the bottlenose dolphin was used as the foreground branch. The coding sequences of the single-copy orthologous genes were aligned using PRANK (ref. 71), and alignments shorter than 150 bp excluding gaps were discarded. The codeml program in the PAML package was used to calculate the log likelihoods for the alternative model and the null model. The FDR was determined on the basis of the q values calculated using the q-value library in R (ref. 72). All the PSGs were mapped to KEGG pathways and assigned GO terms on the basis of their P values, which were calculated by Fisher’s exact test with a 10% FDR. The over-representation of glutathione and glutathione disulfide was validated experimentally using kidney Sp1K cells from an Atlantic spotted dolphin (S. frontalis). Additional information regarding the methods used to identify rapidly evolving GO categories and copy number variations is provided in the