The selection of the correct keyword(s) when examining online queries is key for valid results [51]. Thus, many factors should be taken into consideration when using Google Trends data in order to ensure a valid analysis.
Google Trends is not case sensitive, but it does take into account accents, plural or singular forms, and spelling mistakes. Therefore, whatever keyword or combination of keywords is chosen, part of the relevant queries will not be captured in the analysis.
To partly overcome this limitation, the “+” feature can be used to include the most commonly encountered misspellings, which are selected and entered manually; however, some results will always be missing, as not all possible spelling variations can be included. In addition, an incorrect spelling of a word could be used even more often than the correct one, in which case the analysis will not be trivial. In most cases, however, the correct spelling is the most commonly used, and the analysis can proceed as usual. For example, gonorrhea is often misspelled, mainly as “Gonorrea,” which is also the Spanish term for the disease. As depicted in Figure 4a, both terms have high search volumes. Therefore, to include more results, both terms can be entered as a single search term by using the “+” feature (Figure 4b). In this way, queries with the correct and the incorrect spelling are aggregated in the results. Note that this is not limited to two terms; the “+” feature can be used for multiple keywords or for results in multiple languages in a region.
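As a minimal sketch of how such a combined query could be assembled before it is entered into Google Trends, the following Python helper joins manually chosen spelling variants with “+”. The function name `combine_terms` and the spacing around “+” are assumptions made for illustration; the helper only builds the query string and does not contact Google Trends.

```python
def combine_terms(variants):
    """Join manually selected keyword variants into one "+" query string.

    Duplicates are dropped case-insensitively, because Google Trends
    is not case sensitive. The " + " spacing is an assumption.
    """
    seen = set()
    unique = []
    for term in variants:
        cleaned = " ".join(term.split())  # collapse stray whitespace
        key = cleaned.casefold()
        if cleaned and key not in seen:
            seen.add(key)
            unique.append(cleaned)
    return " + ".join(unique)


# Correct spelling plus the most common misspelling from the example above.
print(combine_terms(["gonorrhea", "Gonorrea", "gonorrhea"]))
```

The list of variants still has to be chosen by hand, as described above; the helper only removes duplicates and formats the query.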
In the case of accents, before choosing the keywords to be examined, the variations in interest between the terms with and without accents and special characters should be explored. For example, measles translates into “Sarampión,” “ošpice,” “mässling,” and “Ιλαρά” in Spanish, Slovenian, Swedish, and Greek, respectively. As depicted in Figure 5, in Spanish and Greek, the term without the accent is searched for in higher volumes; in Slovenian, the term with the accent is mostly used; and in Swedish, the term without the accent is almost nonexistent. Thus, in Greek searches, the term without the accent should be selected; in Slovenian and Swedish searches, the terms with accents should be used; and for Spanish, as both terms yield significant results, either both terms combined with the “+” feature or the term without the accent should be selected.
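To explore both forms of a term, the accent-free variant can be generated programmatically rather than typed by hand. The sketch below uses Python's standard `unicodedata` module to strip combining marks; the helper names `strip_accents` and `accent_variants` are hypothetical, and the output is only a list of candidate keywords to compare in Google Trends, not a statement about which form is searched more.

```python
import unicodedata


def strip_accents(term):
    """Return the term with accents and other combining marks removed."""
    decomposed = unicodedata.normalize("NFD", term)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def accent_variants(term):
    """Return the original term and, if different, its accent-free form."""
    plain = strip_accents(term)
    return [term] if plain == term else [term, plain]


# The translations of "measles" used in the example above.
for word in ["Sarampión", "ošpice", "mässling", "Ιλαρά"]:
    print(accent_variants(word))
```

Which variant to keep, or whether to combine both, still depends on the observed search volumes in each language.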
Another important aspect is the use of quotation marks when selecting the keyword. This applies only to keywords with two or more words. For example, breast cancer can be searched with or without quotes. The term breast cancer without quotes will yield results that include the words “breast” and “cancer” in any possible combination and order; for example, the queries “breast cancer screening” and “breast and colon cancer” are both included in the results. With quotes, however, the term “breast cancer” is included as is; for example, “breast cancer screening,” “living with breast cancer,” and “breast cancer patient.” As shown in Figure 6a, the results are almost identical in this case. However, this does not always hold. As depicted in Figure 6b, the picture is clearly different for “HIV test”: when searching for HIV test with and without quotes, the search volumes differ, although the trends are very similar, though not identical.
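The difference between the two matching rules can be illustrated with a toy model in Python. This is only a sketch of the behavior described above, not Google's actual matching logic; the function names `matches_unquoted` and `matches_quoted` are hypothetical, and whitespace tokenization is a simplifying assumption.

```python
def matches_unquoted(keyword, query):
    """True if every word of the keyword occurs in the query, in any order."""
    query_words = query.casefold().split()
    return all(word in query_words for word in keyword.casefold().split())


def matches_quoted(keyword, query):
    """True if the keyword's words occur contiguously and in order."""
    kw = keyword.casefold().split()
    q = query.casefold().split()
    return any(q[i:i + len(kw)] == kw for i in range(len(q) - len(kw) + 1))


# Example queries taken from the paragraph above.
for query in ["breast cancer screening",
              "breast and colon cancer",
              "living with breast cancer"]:
    print(query,
          matches_unquoted("breast cancer", query),
          matches_quoted("breast cancer", query))
```

Under this model, “breast and colon cancer” counts toward the unquoted keyword only, which is the kind of query that produces the volume gap between the quoted and unquoted forms.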
Finally, when researching with Google Trends, the options of “search term” and “disease” (or “topic”) are available when entering a keyword. Whereas “search term” gives results for all queries that include the selected term, “disease” includes various keywords that fall within the category or, as Google describes it, “topics are a group of terms that share the same concept in any language.”
Therefore, it is imperative that keyword selection is conducted with caution and that the available options and features are carefully explored and analyzed. This will ensure validity of the results.
Mavragani A, & Ochoa G. (2019). Google Trends in Infodemiology and Infoveillance: Methodology Framework. JMIR Public Health and Surveillance, 5(2), e13439.