For model building, we used soil profile data from ca. 150,000 unique sites spread over all continents (Fig 3; see acknowledgments for a full list). These data were imported, cleaned and merged into a single global compilation of soil points with unique column names and IDs. Preparing this global compilation of standardized soil training points took several months of work; translating and cleaning up soil properties and soil classes was especially time-consuming.
About 15–20% of the original soil profile data was reported only in a national classification system, e.g. the Canadian and Brazilian classification systems. Since some information is better than none, we translated national classification systems, where possible, to the two international classification systems (World Reference Base, WRB, and USDA). For translation we used published correlation tables, either reported in Krasilnikov et al. [22] or published on agency websites; see e.g. the correlation for the Canadian soil classification system (http://sis.agr.gc.ca/cansis/taxa/) and the correlation for the Brazilian classification system (http://www.pedologiafacil.com.br/classificacao.php). We also consulted numerous local soil classification experts and requested their feedback and corrections on the online correlation tables (distributed via Google spreadsheets). Some national classification systems, such as the Australian soil classification system, are simply too different from the USDA and WRB systems to allow satisfactory correlation. These data were therefore not used. The full list of correlation tables is available from ISRIC’s GitHub account at https://github.com/ISRICWorldSoil. Another time-consuming operation was merging laboratory measurements and field observations and harmonizing them to a standard format.
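As a rough illustration of the correlation-table translation of national soil classes described above, the following minimal Python sketch maps a national class to a WRB reference group and drops profiles from systems that cannot be correlated. The table entries, field names (`system`, `national_class`, `wrb`) and function names are hypothetical placeholders, not taken from the published correlation tables or from the authors' R scripts.

```python
from typing import Optional

# Hypothetical excerpt of a correlation table:
# (national system, national class) -> WRB reference group.
# Entries are illustrative only; the real tables are far larger.
CORRELATION = {
    ("canadian", "chernozemic"): "Chernozems",
    ("brazilian", "latossolos"): "Ferralsols",
}

# Systems judged too different from WRB/USDA to correlate (per the text).
UNCORRELATED_SYSTEMS = {"australian"}


def translate_class(system: str, national_class: str) -> Optional[str]:
    """Return the WRB group for a national class, or None if unknown."""
    key = (system.strip().lower(), national_class.strip().lower())
    if key[0] in UNCORRELATED_SYSTEMS:
        return None
    return CORRELATION.get(key)


def harmonize(profiles):
    """Keep profiles that have, or can be translated to, a WRB class."""
    kept = []
    for p in profiles:
        # Prefer a class that was already reported in WRB.
        wrb = p.get("wrb") or translate_class(p["system"], p["national_class"])
        if wrb is not None:
            kept.append({**p, "wrb": wrb})
    return kept
```

A profile whose class cannot be translated is dropped here rather than guessed, which mirrors the decision in the text not to use data that could not be satisfactorily correlated.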
In some cases, missing values in the original tables had been coded as "0", which can seriously distort prediction models; in other cases we implemented and applied functions to locate and correct typos and other gross errors. Some variables, such as soil organic carbon, had to be derived by converting from soil organic matter (e.g. dividing by 1.724) and/or by removing the CaCO3 (calcium carbonate) contribution from total carbon. Nevertheless, the majority of soil variables from the various national soil profile databases appeared to be compatible and relatively easy to merge: soil scientists across continents do measure similar things, but often express the results using different measurement units, vocabularies and standards. We imported all original tables as-is and then documented all conversion functions in R scripts (available via ISRIC’s GitHub account), to support reproducible research and so that the conversion functions can be further modified and improved in the future. The majority of the points (excluding LUCAS points and other data sets with specific restrictive terms of use) and legends used for model building and for producing SoilGrids are also available for public use via ISRIC’s WoSIS Web Feature Service (http://www.isric.org/data/wosis) and/or ISRIC’s institutional GitHub account.
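The two cleaning steps mentioned above, recoding zero-coded missing values and deriving soil organic carbon, might look roughly like the following Python sketch. It is not the authors' R code. The field names (`oc`, `som`, `total_c`, `caco3`), the function names and the order of preference are assumptions made for illustration. The factor 1.724 comes from the text, while the carbon mass fraction of CaCO3 (about 0.12, from molar masses) is an added assumption about how the carbonate contribution would be removed. All values are assumed to be in the same mass-percentage units.

```python
SOM_TO_SOC_DIVISOR = 1.724   # conversion factor quoted in the text
C_FRACTION_OF_CACO3 = 0.12   # assumption: ~12.01 / 100.09 by molar mass


def soc_from_som(som_pct):
    """Organic carbon estimated from soil organic matter."""
    return som_pct / SOM_TO_SOC_DIVISOR


def soc_from_total_carbon(total_c_pct, caco3_pct):
    """Organic carbon as total carbon minus carbonate carbon (floored at 0)."""
    return max(total_c_pct - C_FRACTION_OF_CACO3 * caco3_pct, 0.0)


def organic_carbon(record, zero_coded=()):
    """Derive organic carbon for one record (a dict of measurements).

    zero_coded lists the fields in which the source table is known to
    use 0 as a missing-value code; those zeros are treated as missing.
    """
    def get(name):
        value = record.get(name)
        if value is None or (name in zero_coded and value == 0):
            return None
        return value

    oc, som = get("oc"), get("som")
    total_c, caco3 = get("total_c"), get("caco3")
    if oc is not None:            # measured value takes precedence
        return oc
    if som is not None:
        return soc_from_som(som)
    if total_c is not None and caco3 is not None:
        return soc_from_total_carbon(total_c, caco3)
    return None                   # nothing usable: leave as missing
```

Making the zero-coding explicit per field avoids discarding genuine zeros, for example a real CaCO3 content of 0 in a non-calcareous soil.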
Hengl T., Mendes de Jesus J., Heuvelink G.B., Ruiperez Gonzalez M., Kilibarda M., Blagotić A., Shangguan W., Wright M.N., Geng X., Bauer-Marschallinger B., Guevara M.A., Vargas R., MacMillan R.A., Batjes N.H., Leenaars J.G., Ribeiro E., Wheeler I., Mantel S., & Kempen B. (2017). SoilGrids250m: Global gridded soil information based on machine learning. PLoS ONE, 12(2), e0169748.