Following the Illumina pipeline, we applied a series of checking and filtering
steps to the reads, removing low-quality reads, adaptor sequences and duplicates
(Supplementary Methods).
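The three filters named above can be illustrated with a minimal Python sketch. The actual criteria are given in the Supplementary Methods and are not reproduced here, so the quality threshold, the allowed fraction of low-quality bases, the adaptor string and the function name `filter_reads` are all assumptions chosen for illustration.

```python
from typing import Iterable, Iterator, Tuple

Read = Tuple[str, str, str]  # (name, sequence, Phred+33 quality string)


def filter_reads(
    reads: Iterable[Read],
    adaptor: str = "AGATCGGAAGAGC",  # placeholder adaptor, an assumption
    min_qual: int = 20,              # assumed Phred cutoff for a "low-quality" base
    max_low_frac: float = 0.4,       # assumed maximum fraction of low-quality bases
) -> Iterator[Read]:
    """Yield reads that pass the low-quality, adaptor and duplicate filters."""
    seen = set()
    for name, seq, qual in reads:
        if not seq:
            continue
        # Low-quality filter: count bases whose Phred score is below min_qual.
        low = sum(1 for c in qual if ord(c) - 33 < min_qual)
        if low / len(seq) > max_low_frac:
            continue
        # Adaptor filter: drop reads that contain the adaptor sequence.
        if adaptor in seq:
            continue
        # Duplicate filter: keep only the first read with a given sequence.
        if seq in seen:
            continue
        seen.add(seq)
        yield name, seq, qual
```

Exact-sequence matching is the simplest possible definition of a duplicate; paired-end data would normally be deduplicated on both mates together.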
The reads that passed the above filtering and correction steps were assembled
with SOAPdenovo 1.04 (http://soap.genomics.org.cn/), covering contig
construction, scaffold construction and gap filling (Supplementary Methods).
Finally, we extended scaffold length using 20-kb-span paired-end data generated
on the 454 platform and 105-kb-span BAC-end data downloaded from NCBI
(http://www.ncbi.nlm.nih.gov/nucgss?term=BOT01) (Supplementary Methods). The
B. oleracea genome size was estimated from the distribution curve of 17-mer
frequency (Supplementary Methods).
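A common way to turn a k-mer frequency curve into a genome size estimate is to divide the total number of k-mers by the depth at the main peak of the curve, G ≈ N_kmer / D_peak. The Python sketch below follows that general approach; it is not taken from the Supplementary Methods, and the function names and the `min_depth` cutoff used to discard likely error k-mers are assumptions.

```python
from collections import Counter
from typing import Dict, Iterable


def kmer_histogram(reads: Iterable[str], k: int = 17) -> Dict[int, int]:
    """Return {depth: number of distinct k-mers observed at that depth}.

    Reverse complements are not merged here, which a production k-mer
    counter would normally do.
    """
    counts = Counter()
    for seq in reads:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return dict(Counter(counts.values()))


def estimate_genome_size(hist: Dict[int, int], min_depth: int = 2) -> float:
    """Estimate genome size as (total k-mers) / (peak depth).

    Depths below min_depth are ignored as probable sequencing errors
    (the cutoff is an assumption for this sketch).
    """
    usable = {d: n for d, n in hist.items() if d >= min_depth}
    peak_depth = max(usable, key=usable.get)          # depth with most distinct k-mers
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers / peak_depth
```

For example, a histogram with 50, 100 and 50 distinct 17-mers at depths 19, 20 and 21 contains 4,000 k-mers in total and peaks at depth 20, giving an estimate of 200 positions.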
To anchor the assembled scaffolds onto pseudo-chromosomes, we developed a genetic
map using a doubled haploid population of 165 lines derived from an F1 cross
between two homozygous lines, 02–12 (sequenced) and 0188 (re-sequenced). The
genetic map contains 1,227 simple sequence repeat and single nucleotide
polymorphism markers in nine linkage groups, which span a total of 1,180.2 cM
with an average of 0.96 cM between adjacent loci (ref. 16). To position these
markers on the scaffolds, marker primers were compared with the scaffold
sequences using e-PCR (parameters -n2 -g1 -d 400-800), and the best-scoring
match was chosen when a marker had multiple matches.
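The marker-positioning step can be sketched as follows. The text does not define "best-scoring", so this sketch assumes that the hit with the fewest mismatches plus gaps wins. It also assumes a simple placement rule (majority linkage group, mean cM of the supporting markers). The `Hit` record and both function names are hypothetical, and the code operates on already-parsed hits rather than on e-PCR output directly.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Hit(NamedTuple):
    marker: str
    scaffold: str
    start: int
    end: int
    mismatches: int
    gaps: int


def best_hits(hits: Iterable[Hit]) -> Dict[str, Hit]:
    """Keep one hit per marker: the one with the fewest mismatches + gaps."""
    best: Dict[str, Tuple[int, Hit]] = {}
    for h in hits:
        score = h.mismatches + h.gaps  # assumed scoring: lower is better
        if h.marker not in best or score < best[h.marker][0]:
            best[h.marker] = (score, h)
    return {marker: hit for marker, (_, hit) in best.items()}


def anchor_scaffolds(
    best: Dict[str, Hit],
    genetic_map: Dict[str, Tuple[str, float]],  # marker -> (linkage group, cM)
) -> Dict[str, Tuple[str, float]]:
    """Assign each scaffold to its majority linkage group and a mean cM."""
    per_scaffold = defaultdict(list)
    for marker, hit in best.items():
        if marker in genetic_map:
            per_scaffold[hit.scaffold].append(genetic_map[marker])
    placed = {}
    for scaffold, positions in per_scaffold.items():
        group = Counter(lg for lg, _ in positions).most_common(1)[0][0]
        cms = [cm for lg, cm in positions if lg == group]
        placed[scaffold] = (group, sum(cms) / len(cms))
    return placed
```

As a consistency check on the reported map density, 1,180.2 cM divided by 1,227 markers is approximately 0.96 cM per marker.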
We validated the B. oleracea genome assembly by comparing it with the published
physical map constructed using 73,728 BAC clones
(http://lulu.pgml.uga.edu/fpc/WebAGCoL/brassica/WebFPC/; ref. 17) and with a
genetic map from B. napus (ref. 18) (Supplementary Methods). Eleven
Sanger-sequenced B. oleracea BAC sequences were used to assess the assembled
genome with MUMmer 3.22 (http://mummer.sourceforge.net/) (Supplementary
Methods).
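One simple metric for a BAC-based assessment of this kind is the fraction of each BAC covered by alignments to the assembly. The sketch below computes that fraction from a list of alignment intervals on the BAC, such as coordinates parsed from an aligner's output. The metric, the 1-based inclusive coordinate convention and the function name are assumptions for illustration; the criteria actually used are described in the Supplementary Methods.

```python
from typing import Iterable, Tuple


def covered_fraction(intervals: Iterable[Tuple[int, int]], length: int) -> float:
    """Fraction of a sequence of the given length covered by the intervals.

    Intervals are 1-based and inclusive; reversed pairs (end < start),
    as reported for reverse-strand alignments, are normalised first.
    Overlapping or directly adjacent intervals are merged before summing.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted((min(a, b), max(a, b)) for a, b in intervals):
        if cur_end is None or start > cur_end + 1:
            # Close the previous merged block and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / length
```

For a 400-bp sequence with alignments at 1-100, 50-150 and 201-300, the merged blocks span 150 + 100 = 250 bp, so the covered fraction is 0.625.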
Liu S., Liu Y., Yang X., Tong C., Edwards D., Parkin I.A., Zhao M., Ma J., Yu J., Huang S., Wang X., Wang J., Lu K., Fang Z., Bancroft I., Yang T.J., Hu Q., Wang X., Yue Z., Li H., Yang L., Wu J., Zhou Q., Wang W., King G.J., Pires J.C., Lu C., Wu Z., Sampath P., Wang Z., Guo H., Pan S., Yang L., Min J., Zhang D., Jin D., Li W., Belcram H., Tu J., Guan M., Qi C., Du D., Li J., Jiang L., Batley J., Sharpe A.G., Park B.S., Ruperao P., Cheng F., Waminal N.E., Huang Y., Dong C., Wang L., Li J., Hu Z., Zhuang M., Huang Y., Huang J., Shi J., Mei D., Liu J., Lee T.H., Wang J., Jin H., Li Z., Li X., Zhang J., Xiao L., Zhou Y., Liu Z., Liu X., Qin R., Tang X., Liu W., Wang Y., Zhang Y., Lee J., Kim H.H., Denoeud F., Xu X., Liang X., Hua W., Wang X., Wang J., Chalhoub B., & Paterson A.H. (2014). The Brassica oleracea genome reveals the asymmetrical evolution of polyploid genomes. Nature Communications, 5, 3930.