Literature relevant to the genes was identified by querying PubMed using the approved HGNC gene symbol, name, and alias, together with a set of noncoding region terms (introns OR promoter OR UTR OR miRNA OR insulator OR enhancer OR silencer), and restricted to abstracts that have been indexed to the MeSH term “human.” The full texts of these papers were downloaded via PubGet (http://pubget.com/) and EndNote (http://www.endnote.com/) and converted into plain text using pdf2text (http://www.foolabs.com/xpdf/home.html).
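As an illustration of how such a query could be assembled, the following minimal Python sketch builds one search string per gene. The function name `build_query`, the quoting of gene terms, the AND/OR grouping, and the `"humans"[MeSH Terms]` field tag are assumptions made for this example; the exact query syntax used in the study is not given in the text.

```python
# Noncoding region terms as listed in the text.
NONCODING_TERMS = [
    "introns", "promoter", "UTR", "miRNA",
    "insulator", "enhancer", "silencer",
]


def build_query(symbol, name, aliases=()):
    """Return a PubMed-style query string for one gene.

    Assumed grouping: (symbol OR name OR aliases) AND (region terms)
    AND a human MeSH filter. This grouping is hypothetical.
    """
    gene_terms = [symbol, name, *aliases]
    # Quote each gene term so multi-word names are searched as phrases.
    gene_clause = " OR ".join(f'"{term}"' for term in gene_terms)
    region_clause = " OR ".join(NONCODING_TERMS)
    return f'({gene_clause}) AND ({region_clause}) AND "humans"[MeSH Terms]'


if __name__ == "__main__":
    # Illustrative gene only; not taken from the study's gene list.
    print(build_query("TP53", "tumor protein p53", ["p53"]))
```

The resulting string could be pasted into the PubMed search box or passed to a search client; only the string construction is shown here.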
The full text of these articles was searched for the word stems “bind” and “muta” co-occurring within a single paragraph. The pdf2text conversion software keeps each paragraph together as a single line. Therefore, the two stems did not need to occur in the same sentence, only in the same paragraph. The word stem “bind” was chosen because it can represent DNA binding or RNA binding activities regardless of the assay used, while the word stem “muta” (for mutated or mutant or mutagenesis) indicates that studies were performed to assess whether that nucleotide or region is necessary and sufficient for activity.
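The paragraph-level search described above can be sketched in a few lines of Python. This is a minimal sketch, assuming that each line of the converted text file corresponds to one paragraph and that stems are matched as case-insensitive substrings; the function name `paragraphs_with_both_stems` is hypothetical and not part of the original pipeline.

```python
def paragraphs_with_both_stems(text, stems=("bind", "muta")):
    """Return the lines (paragraphs) of `text` that contain every stem.

    Matching is a case-insensitive substring test, so "bind" matches
    "binding" and "Binds", and "muta" matches "mutant" and "mutagenesis".
    """
    hits = []
    for line in text.splitlines():
        lowered = line.lower()
        # Keep the paragraph only if all stems occur somewhere in it.
        if all(stem in lowered for stem in stems):
            hits.append(line)
    return hits


if __name__ == "__main__":
    sample = (
        "Protein binding was lost. The mutant promoter was inactive.\n"
        "Only binding was measured in this paragraph.\n"
        "Mutagenesis of the site was performed."
    )
    for paragraph in paragraphs_with_both_stems(sample):
        print(paragraph)
```

Because the test is a plain substring match, it also matches the stems inside longer unrelated words (for example “muta” in “transmutation”), so retrieved paragraphs would still need manual review.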