The range size and speaker population size of each language were estimated from the Ethnologue, 16th edition [25], the most authoritative and the only globally comprehensive source of basic information about languages and their speakers. The data were assembled in a Geographical Information System by Global Mapping International as the WLMS database [33], which provides georeferenced polygons showing each language's geographical range together with information on speaker population size. Languages that are given as points or that have no known location or population size were excluded, leaving 6359 (92% of the 6909 known languages) and 6569 (95%) languages in the analyses of range size and population size, respectively. Range size was defined as the total area (km²) of all polygons for each language, and speaker population size as the latest estimate of the total number of mother-tongue speakers given in the polygon attributes.
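The aggregation described above can be illustrated with a minimal sketch. The record layout, language codes and values below are hypothetical and are not taken from the WLMS database. The sketch also assumes one reading of the exclusion rule: range size requires polygon geometry, whereas population size only requires a known speaker estimate.

```python
from collections import defaultdict

# Hypothetical records: (language_code, geometry_type, area_km2, speakers).
# 'speakers' is assumed to already hold the latest mother-tongue estimate.
records = [
    ("aaa", "polygon", 1200.0, 5000),
    ("aaa", "polygon", 300.0, 5000),
    ("bbb", "point", 0.0, 150),      # point only: no range size
    ("ccc", "polygon", 80.0, None),  # unknown population size
]

def summarise(records):
    """Return (range_size, population) dictionaries keyed by language code."""
    range_size = defaultdict(float)
    population = {}
    for code, geometry, area_km2, speakers in records:
        if geometry == "polygon":
            # Range size is the summed area of all polygons of a language.
            range_size[code] += area_km2
        if speakers is not None:
            population[code] = speakers
    return dict(range_size), population

range_size, population = summarise(records)
print(range_size)
print(population)
```

In this toy input, language "bbb" contributes to the population analysis but not to the range analysis, which mirrors how the two analyses can retain different numbers of languages.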
Speaker growth rates were estimated using the index of linguistic diversity (ILD) database [34], updated with the Ethnologue, 16th edition. This database provides information on temporal changes in speaker population size (i.e. estimates of speaker population size and the corresponding survey years) between 1949 and 2005 for 1500 languages selected at random from the Ethnologue. The ILD database is currently the only global database with information on changes in the population size of languages. To estimate speaker growth rate, we selected languages with at least three records of speaker population size, including at least one non-zero record. This yielded 649 languages, including 24 that became extinct after 1949, for the analysis of speaker growth rate. This sample represents approximately 9% of all known languages, but the languages included are well scattered across the globe, roughly following the distribution of all languages (see electronic supplementary material, figures S1 and S2). The biases in range size and speaker population size between the 649 languages and all available languages in the ILD and WLMS databases were also very small (see the electronic supplementary material, figure S3, for more detail). Thus, we expect the effect of drawing conclusions from this sample of 649 languages to be minimal. The level of intergenerational transmission in each language was derived from the Atlas of the World's Languages in Danger [15] (see the electronic supplementary material, appendix B for more detail).
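The selection criterion for the growth-rate sample (at least three records, at least one of them non-zero) can be written as a short filter. This is a sketch only: the dictionary layout, language codes and values are hypothetical and do not reproduce the ILD database, and the growth-rate estimation itself is not shown.

```python
# Hypothetical time series: language code -> list of (survey_year, speakers).
ild = {
    "aaa": [(1950, 1200), (1975, 900), (2000, 400)],
    "bbb": [(1960, 300), (1990, 250)],           # only two records
    "ccc": [(1955, 0), (1980, 0), (2001, 0)],    # no non-zero record
    "ddd": [(1949, 40), (1970, 5), (1995, 0)],   # extinct after 1949
}

def is_eligible(series, min_records=3):
    """True if the series has enough records and at least one non-zero count."""
    has_enough = len(series) >= min_records
    has_nonzero = any(speakers > 0 for _year, speakers in series)
    return has_enough and has_nonzero

selected = sorted(code for code, series in ild.items() if is_eligible(series))
print(selected)
```

A language whose last record is zero, such as "ddd" here, still passes the filter, which is how languages that became extinct during the survey period remain in the sample.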
Data on potential drivers of extinction risk were derived from different global data sources (electronic supplementary material, appendix C). Since the records used for estimating speaker growth rates were mostly collected between 1978 and 2000 (see the electronic supplementary material, figure S4), we used data sources from this period as far as possible. Although information on gross domestic product (GDP) and globalization was available only at the country level, these data fit the purpose of this analysis, given that the economic status and degree of globalization of a country, rather than of each speaker, are expected to cause language shifts through educational developments [19] and the economic benefits of speaking national and global languages [17]. Language richness in each cell was defined as the total number of languages whose geographical range overlaps that cell, based on the WLMS database. The land area of a latitudinal band was calculated as the sum of the land area of all grid cells at the same latitude at 2° resolution.
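The two grid-based quantities defined above can be sketched as follows. The sketch assumes that the overlap between each language range and the 2° grid has already been computed, so each language is represented by the set of cells it overlaps. The cell keys, language codes and land areas are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical inputs. A cell is keyed by the (latitude, longitude) of its
# lower-left corner on a 2-degree grid.
language_cells = {
    "aaa": {(0, 10), (0, 12)},
    "bbb": {(0, 12), (2, 12)},
}
cell_land_area_km2 = {(0, 10): 49000.0, (0, 12): 30000.0, (2, 12): 48000.0}

def language_richness(language_cells):
    """Count, for each cell, the languages whose range overlaps it."""
    richness = Counter()
    for cells in language_cells.values():
        richness.update(cells)  # each language counts once per cell
    return richness

def band_land_area(cell_land_area_km2):
    """Sum the land area of all cells sharing the same latitude."""
    bands = defaultdict(float)
    for (lat, _lon), area in cell_land_area_km2.items():
        bands[lat] += area
    return dict(bands)

richness = language_richness(language_cells)
bands = band_land_area(cell_land_area_km2)
print(richness[(0, 12)])
print(bands)
```

Storing ranges as sets of cells keeps the richness count independent of how many polygons a language has inside a given cell.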