The SDF files of all lipids in LipidBlast were constructed as follows. The SDF files of PC, lysoPC, PE, lysoPE, PG, PI, PS, and PA were downloaded from LIPID MAPS (ref. 26). The SDF files for the other lipid classes were generated from the SMILES codes written in LipidBlast using ChemAxon JChem 6.3.0 molconvert (http://www.chemaxon.com), for a total of 117,343 SDF files. These also included plasmenyl PC, PE, sphingomyelin, and cholesterol ester as lipid classes, although these lipids were not the focus of the algal lipid identification. The PaDEL descriptor software was used to calculate 1D and 2D molecular descriptors and PubChem fingerprints from the SDF files (ref. 12). Exact masses were also calculated with ChemAxon JChem molconvert. Redundant and uniform variables were then excluded, and a total of 464 compound descriptors were used as predictor variables in the regression analysis. The in-house retention times of 254 lipids were used for model development. Because the number of predictor variables (compound descriptors) was considerably higher than the number of samples (254 lipids in the training set), partial least squares regression (PLS-R) was used to construct the retention time prediction model (ref. 11). The PLS-R program was written in Visual Basic for Applications, and the source code can be downloaded at http://prime.psc.riken.jp/. Seven-fold cross-validation was used to calculate the predictive residual sum of squares (PRESS) and the Q2 value. The final model included six latent variables, chosen on the basis of the PRESS and Q2 values and the retention time information from the training samples. In this study, the retention times of 1,808 newly identified lipids from nine algal species were used for validation: identifications based on accurate precursor ion masses and MS/MS spectra were additionally confirmed by retention time matching.
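The selection of the number of latent variables by cross-validation described above can be illustrated with a minimal sketch. It is not the authors' Visual Basic for Applications program; it is a plain-Python single-response PLS (NIPALS) written only for illustration. The function names (`fit_pls1`, `predict_pls1`, `cross_validate`), the interleaved fold assignment, the synthetic data, and the definition Q2 = 1 - PRESS / total sum of squares about the overall mean are all assumptions, since the source does not specify them.

```python
import math
import random

def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model with the NIPALS algorithm (illustrative)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered descriptors
    f = [v - y_mean for v in y]                                # centered response
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in descriptor space most covariant with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:          # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt                        # y-loading
        # Deflate X and y before extracting the next latent variable.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps

def predict_pls1(model, x):
    """Predict the response for one descriptor vector by sequential deflation."""
    x_mean, y_mean, comps = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat

def cross_validate(X, y, n_components, k=7):
    """Return (PRESS, Q2) from k-fold cross-validation (interleaved folds assumed)."""
    n = len(X)
    press = 0.0
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        held_out = [i for i in range(n) if i % k == fold]
        model = fit_pls1([X[i] for i in train], [y[i] for i in train], n_components)
        press += sum((y[i] - predict_pls1(model, X[i])) ** 2 for i in held_out)
    y_mean = sum(y) / n
    ss_total = sum((v - y_mean) ** 2 for v in y)
    return press, 1.0 - press / ss_total

# Synthetic stand-in for descriptors and retention times (not real data).
rng = random.Random(0)
X_demo = [[rng.gauss(0, 1) for _ in range(5)] for _ in range(28)]
y_demo = [3.0 + 1.5 * r[0] - 0.8 * r[1] + rng.gauss(0, 0.1) for r in X_demo]
results = {a: cross_validate(X_demo, y_demo, a) for a in range(1, 5)}
best = min(results, key=lambda a: results[a][0])  # lowest PRESS
print("latent variables with lowest PRESS:", best)
```

In practice the number of latent variables is often taken where PRESS stops decreasing appreciably rather than at its strict minimum; the exact criterion behind the six latent variables reported here is not stated in the text.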