We used a semi-automatic approach to identify publications related to ICI efficacy. We initially performed manual, keyword-based searches on PubMed. In a recent meta-analysis that compared predictors of ICI efficacy (15), the authors provided 20 sets of search terms used to find the 55 predictors included in their comparisons, such as (‘Predictive biomarker’ AND ‘immunotherapy’). We used these search terms to perform our first round of literature searches.
Although these initial keyword-based searches did yield publications related to ICI efficacy, testing different keyword combinations in the PubMed user interface was tedious because results from different searches were difficult to compare and quantify. We therefore used an automated tool, Literature Scanner (LISC) (30). LISC searches PubMed through the Entrez Programming Utilities (EUtils) application programming interface (API) and looks for keywords within publication titles and abstracts. Additionally, each LISC search (called a ‘scan’) can be saved in a customizable database structure, which provides a clear record of the searches we performed and the publications we identified in all previous searches.
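As an illustration of the kind of request that a title/abstract keyword search against PubMed involves, the following is a minimal sketch and not LISC's own code. The helper name `build_esearch_url`, the default `retmax` value and the choice of JSON output are assumptions made for this example. The ESearch endpoint, the `db`, `term`, `retmax` and `retmode` parameters and the `[tiab]` (Title/Abstract) field tag are taken from the public E-utilities and PubMed documentation. The function only builds the URL and does not contact the server.

```python
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(terms, retmax=100):
    """Return an ESearch URL matching any of `terms` in titles or abstracts.

    `build_esearch_url` is a hypothetical helper written for this sketch.
    """
    # Quote each phrase and restrict it to the Title/Abstract field.
    tagged = [f'"{term}"[tiab]' for term in terms]
    query = " OR ".join(tagged)
    params = {"db": "pubmed", "term": query, "retmax": retmax, "retmode": "json"}
    # urlencode percent-encodes quotes, brackets and spaces in the query.
    return f"{ESEARCH_URL}?{urlencode(params)}"


url = build_esearch_url(["Immune checkpoint blockade", "ICI"])
print(url)
```

Because every search is expressed as a set of parameters rather than as clicks in a web form, searches of this kind can be stored and compared programmatically.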
To use LISC, we needed to supply a query in the form of a logical expression of search terms. Because each LISC scan takes one query, we wanted to optimize our query to capture many relevant publications that involve a diverse set of keywords. To do so, we examined the search terms used by the authors of the meta-analysis (15). After removing stop words and combining synonyms, the most common words in their search terms were ‘blockade’, ‘checkpoint’, ‘immunotherapy’ and ‘predict’. For this reason, in the first LISC scan, we used the query
("ICI" OR "Immune checkpoint blockade" OR "Immune checkpoint immunotherapy") AND "predict". While the results contained many publications relevant to ICI efficacy, we found that many of them involved cancer patients who did not actually undergo ICI treatment (mostly from The Cancer Genome Atlas (TCGA) (31) or the Chinese Glioma Genome Atlas (CGGA) (32)). These publications used established predictors (mostly TIDE (25) and Immunophenoscore (33)) as proxies of ICI response instead of measuring actual patient response. We therefore expanded our LISC query to exclude these publications. We also found that some publications used the expression levels of ICI targets (e.g. CTLA-4 mRNA level) as a proxy of ICI response rather than measuring the responses of patients or model organisms. We found that adding actual ICI drug names to our LISC query effectively reduced the proportion of such publications, probably because publications that mention specific ICI drug names are more likely to be related to clinical trials of those drugs. Combining all these findings, our final LISC query was
("Pembrolizumab" OR "Tremelimumab" OR "Ipilimumab" OR "Durvalumab" OR "Nivolumab") AND ("Predict" OR "predict response") NOT ("TIDE" OR "immunophenoscore" OR "TCGA"), which retrieved a large number of publications, a high proportion of which were relevant to ICI efficacy predictors.
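The two steps described above, ranking candidate keywords and then assembling a Boolean query from inclusion and exclusion groups, can be sketched as follows. This is an illustration under stated assumptions rather than the procedure we actually ran: only the first entry of `SEARCH_TERMS` appears in the text, the other entries, the stop-word list and the synonym map are hypothetical, and the helpers `keyword_counts`, `or_group` and `build_query` are names introduced for this example.

```python
import re
from collections import Counter

# Hypothetical inputs: only the first search string is quoted in the text.
SEARCH_TERMS = [
    "Predictive biomarker AND immunotherapy",
    "immune checkpoint blockade AND prediction",
    "checkpoint inhibitor immunotherapy AND predictor of response",
]
STOP_WORDS = {"and", "or", "of", "the"}  # assumed stop-word list
SYNONYMS = {  # assumed synonym map, collapsing variants onto one keyword
    "predictive": "predict",
    "prediction": "predict",
    "predictor": "predict",
}


def keyword_counts(search_terms):
    """Count keywords after dropping stop words and merging synonyms."""
    counts = Counter()
    for search in search_terms:
        for word in re.findall(r"[a-z]+", search.lower()):
            if word in STOP_WORDS:
                continue
            counts[SYNONYMS.get(word, word)] += 1
    return counts


def or_group(terms):
    """Join quoted terms into a parenthesized OR group."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(include, require, exclude):
    """Combine three OR groups as: include AND require NOT exclude."""
    return f"{or_group(include)} AND {or_group(require)} NOT {or_group(exclude)}"


counts = keyword_counts(SEARCH_TERMS)
print(counts.most_common(3))

final_query = build_query(
    include=["Pembrolizumab", "Tremelimumab", "Ipilimumab", "Durvalumab", "Nivolumab"],
    require=["Predict", "predict response"],
    exclude=["TIDE", "immunophenoscore", "TCGA"],
)
print(final_query)
```

Keeping the drug names, the prediction terms and the exclusions in separate lists makes it straightforward to vary one group at a time and compare the resulting scans.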