Genetic Association Studies

Genetic Association Studies are a powerful tool for uncovering connections between genetic variants and complex traits or diseases.
These studies examine the relationships between specific genetic markers, such as single nucleotide polymorphisms, and the risk or prevalence of a particular phenotype.
By analyzing large populations, researchers can identify genetic factors that contribute to the development or progression of complex conditions like Alzheimer's disease, diabetes, or cancer.
Genetic Association Studies provide valuable insights into the underlying genetic architecture of human health and disease, guiding the development of personalized medicine and targeted therapeutic interventions.
Their findings can also inform our understanding of the biological pathways and mechanisms involved in various physiological and pathological processes.
Despite their utility, the reproducibility of Genetic Association Studies can be challenged by factors like sample size, population stratification, and statistical power.
Leveraging innovative technologies like AI-driven protocol optimization can enhance the reliability and impact of this important field of genetic research.

Most cited protocols related to «Genetic Association Studies»

Table 1 summarizes information about the pathway and disease databases that KOBAS 2.0 incorporates. Specifically, KEGG PATHWAY (3) and Reactome (29,30) are general pathway databases, whereas PID (27) and Panther (31) focus on signaling pathways and BioCyc (28) focuses on metabolic pathways. PID has only human data, whereas the others are multispecies databases. OMIM (http://www.ncbi.nlm.nih.gov/omim/) contains information on all known Mendelian disorders and genes. KEGG DISEASE (32) collects knowledge on genetic and environmental factors of diseases. FunDO (33,34) is generated from GeneRIF using Disease Ontology Lite, a condensed version of Disease Ontology. GAD (35) and NHGRI GWAS Catalog (36) both collect data from genetic association studies: GAD includes data from both candidate-gene and GWAS studies, whereas NHGRI GWAS Catalog covers only GWAS studies.

Table 1. Pathway and disease databases supported by KOBAS 2.0 [a]

Database name | Data content | File format | Number of species | Number of pathways or diseases in human | Number of genes mapped to KEGG GENES/all genes in human | URL
KEGG PATHWAY | Pathway | Text | 1327 | 220 | 5595/5595 | http://www.genome.jp/kegg/pathway.html
PID Curated | Pathway | XML | 1 | 192 | 2782/3315 | http://pid.nci.nih.gov/
PID BioCarta | Pathway | XML | 1 | 254 | 1907/2391 | http://pid.nci.nih.gov/
PID Reactome | Pathway | XML | 1 | 996 | 3783/4405 | http://pid.nci.nih.gov/
BioCyc | Pathway | Text and Table | 6 | 277 | 1087/1120 | http://biocyc.org/
Reactome | Pathway | Table | 22 | 68 | 4366/4534 | http://www.reactome.org/ReactomeGWT/entrypoint.html
Panther | Pathway | Table | 43 | 154 | 2170/2207 | http://www.pantherdb.org/
OMIM | Disease | Table | 1 | 4990 | 3792/3792 | http://www.ncbi.nlm.nih.gov/omim
KEGG DISEASE | Disease | Text | 1 | 323 | 798/798 | http://www.genome.jp/kegg/disease/
FunDO | Disease | Table | 1 | 561 | 3888/4029 | http://django.nubic.northwestern.edu/fundo/
GAD | Disease | Table | 1 | 3770 | 3164/3238 | http://geneticassociationdb.nih.gov/
NHGRI | Disease | Table | 1 | 369 | 1975/2191 | http://www.genome.gov/gwastudies/

[a] The numbers in this table are summarized from the KOBAS 2.0 backend database as updated on November 23, 2010. All analyses using KOBAS 2.0 in this article are based on this data version.

KOBAS 2.0 downloaded the raw data files from each database. As shown in Table 1, the file formats include plain text, XML and table. We have written parsers for all the data files. For each pathway or disease database, we retrieve the gene-term mapping by parsing the raw data files. We retrieve the gene annotation and gene-ID relations from KEGG GENES and BioMart (37). To integrate across different databases, we mapped the genes in all databases to KEGG GENES and KEGG ORTHOLOGY (KO). The gene-pathway and gene-disease data are stored in our backend SQL relational database. The FASTA protein sequence files were preprocessed for BLAST. KOBAS 2.0 backend data is updated every 3 months.
Publication 2011
Amino Acid Sequence Gene Annotation Genes Genetic Association Studies Genome Genome-Wide Association Study Hereditary Diseases Homo sapiens Reproduction Signal Transduction Pathways
Single-variant association tests analyzing the 16,144 advanced AMD cases and 17,832 controls of European ancestry were based on the Firth bias-corrected likelihood ratio test [69], which is recommended for genetic association studies that include rare variants [70], as implemented in EPACTS (see Web Resources). Analyses were adjusted for two principal components and source of DNA (whole-blood or whole-genome amplified DNA). Allele dosages of the imputed data were utilized. Sensitivity analyses were conducted to evaluate the influence of alternative association tests, alternative covariate adjustment including age or sex, or up to 10 principal components instead of two, as well as the influence of restricting to population-based controls, or to controls aged 50 years or older. Genomic control correction [71] was used to account for potential population stratification using all genotyped variants with minor allele count ≥ 20 outside of 20 previously described AMD loci [6,9]. As usual for genome-wide association studies, we considered P-values ≤ 5 × 10^-8 as genome-wide significant.
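The genomic control step and the genome-wide threshold described above can be illustrated with a minimal sketch. It assumes 1-df chi-square test statistics are already available; the function names and the toy statistics are hypothetical, and the convention of never letting the inflation factor fall below 1 is an assumption, not something stated in the excerpt. In the study, the inflation factor was estimated only from variants outside the known AMD loci; the sketch does not model that selection.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364
GENOME_WIDE_ALPHA = 5e-8


def chi2_1df_pvalue(stat):
    """Upper-tail P-value of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(stat / 2.0))


def genomic_control(chi2_stats):
    """Return (lambda_gc, corrected statistics)."""
    lam = median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # assumption: never inflate the statistics
    return lam, [s / lam for s in chi2_stats]


# Made-up statistics, for illustration only.
stats = [0.1, 0.3, 0.5, 0.9, 1.4, 35.0]
lam, corrected = genomic_control(stats)
hits = [s for s in corrected if chi2_1df_pvalue(s) <= GENOME_WIDE_ALPHA]
print(f"lambda_GC = {lam:.3f}, genome-wide hits after correction: {len(hits)}")
```

In this toy input the largest statistic is genome-wide significant before correction but not after, which is the kind of deflation the correction is meant to produce when the median statistic is inflated.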
To identify independently associated variants, we adopted a sequential forward selection approach: We first computed single variant association for each of the > 12 million variants. Then we selected the variant with the smallest P-value and its flanking ±5 Mb region, repeating the process until no genome-wide significant variant (P ≤ 5 × 10^-8) was left, yielding a set of 10 Mb regions. Within each of these large regions, we re-analyzed each variant conditioning on the top variant, and repeated this process by adding the previously identified genome-wide significant variant(s) within the respective 10 Mb region. This yielded one or more independently associated genome-wide significant variant(s) per 10 Mb region.
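The first half of this procedure, carving out ±5 Mb regions around successive top variants, can be sketched as follows. This is only an illustration under simplifying assumptions: variants are plain `(chromosome, position, P-value)` tuples, the function name and the example data are hypothetical, and the conditional re-analysis within each region is not implemented.

```python
GENOME_WIDE_ALPHA = 5e-8
FLANK_BP = 5_000_000


def select_regions(variants, alpha=GENOME_WIDE_ALPHA, flank=FLANK_BP):
    """Pick the most significant variant, claim its +/- flank window,
    drop every variant inside that window, and repeat until no
    genome-wide significant variant remains.

    variants: iterable of (chrom, pos, pvalue)
    returns:  list of (chrom, start, end, top_variant_pos)
    """
    remaining = sorted(variants, key=lambda v: v[2])
    regions = []
    while remaining and remaining[0][2] <= alpha:
        chrom, pos, _ = remaining[0]
        start, end = max(0, pos - flank), pos + flank
        regions.append((chrom, start, end, pos))
        # Filtering preserves the P-value ordering of the remaining variants.
        remaining = [
            v for v in remaining
            if not (v[0] == chrom and start <= v[1] <= end)
        ]
    return regions


# Made-up variants, for illustration only.
variants = [
    ("1", 10_000_000, 1e-20),
    ("1", 12_000_000, 1e-10),  # falls inside the first region
    ("1", 30_000_000, 3e-9),
    ("2", 5_000_000, 1e-6),    # not genome-wide significant
]
regions = select_regions(variants)
print(regions)
```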
A locus region was defined by a genome-wide significant variant and its correlated variants (r² ≥ 0.5) ± 500 kb; overlapping locus regions were merged to one locus, so some loci contained more than one index variant (details in Supplementary Figure 3).
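Merging overlapping locus regions is an interval-merge problem. The sketch below assumes each locus region has already been reduced to a `(chromosome, start, end, index variant)` tuple; the function name, variant labels, and coordinates are hypothetical and chosen only to show how one merged locus can end up with two index variants.

```python
def merge_loci(regions):
    """Merge overlapping locus regions on the same chromosome.

    regions: iterable of (chrom, start, end, index_variant)
    returns: list of dicts with the merged span and all index variants
    """
    merged = []
    for chrom, start, end, index_variant in sorted(regions):
        last = merged[-1] if merged else None
        if last and last["chrom"] == chrom and start <= last["end"]:
            # Overlap: extend the current locus and record the extra index variant.
            last["end"] = max(last["end"], end)
            last["index_variants"].append(index_variant)
        else:
            merged.append({
                "chrom": chrom,
                "start": start,
                "end": end,
                "index_variants": [index_variant],
            })
    return merged


# Made-up regions, for illustration only.
loci = merge_loci([
    ("6", 31_400_000, 32_600_000, "varA"),
    ("6", 32_500_000, 33_200_000, "varB"),
    ("10", 124_000_000, 124_900_000, "varC"),
])
for locus in loci:
    print(locus)
```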
In order to derive independent effect sizes (log odds ratios) for all identified variants, we computed a fully conditioned logistic regression model including all identified variants.
Publication 2015
Alleles BLOOD Europeans Genetic Association Studies Genome Genome-Wide Association Study Hypersensitivity
We conducted a genetic association study with three stages as displayed in Figure 1. Stage 1 consisted of the Myocardial Infarction Genetics Consortium (MIGen), a collection of 2,967 cases of early-onset MI (in men ≤50 years old or women ≤60 years old) and 3,075 age- and sex-matched controls free of MI from six international sites: Boston and Seattle in the United States as well as Sweden, Finland, Spain, and Italy (Table 1). At each site, MI was diagnosed on the basis of autopsy evidence of fatal MI or a combination of chest pain, electrocardiographic evidence of MI, or elevation of one or more cardiac biomarkers (creatine kinase or cardiac troponin). The mean age at the time of MI was 41 years among male cases and 47 years among female cases.
We took forward SNPs into two stages of replication (Stages 2 and 3, Figure 1). 1441 SNPs were tested in Stage 2 based on two criteria: i) strength of statistical evidence in Stage 1 (1433 SNPs from loci with P < 10^-3 in Stage 1) or ii) belonging to one of eight reported loci from recent genome-wide association studies for coronary artery disease (a common SNP from each of 9p21.3, near CXCL12, SMAD3, MTHFD1L, MIA3, near CELSR2/PSRC1/SORT1, 2q36, and PCSK9) [3,7].
Stage 2 consisted of in silico comparisons with four recently completed GWAS for MI consisting of a symmetric effective sample size of up to 3,942 cases of MI and 3,942 controls. These studies included the Wellcome Trust Case Control Consortium Coronary Heart Disease study [3], German MI Family Study I [3], PennCATH, and MedStar (Supplementary Table 1). In each Stage 2 study, the analysis was restricted to the phenotype of MI with an age of onset threshold of <66 years for men or women. Although this age cutoff is slightly less restrictive than that used in Stage 1, this cutoff is at or below the mean age of first MI in the US (65 years for men and 70 years for women).
Thirty-three SNPs were taken forward to Stage 3, which consisted of genotyping in six additional studies with a symmetric effective sample size of up to 5,469 cases of MI and 5,469 controls. These six studies included Acute MI Gene Study/Dortmund Health Study, Verona Heart Study [29], Mid-America Heart Institute Study [30], Irish Family Study [31], German MI Family Study II, and INTERHEART [32] (European ancestry and South Asian ancestry each analyzed separately) (Supplementary Table 2). Stage 3 comprised 25 SNPs with the best combined statistical evidence for MI from Stages 1 and 2 (P < 10^-5) and the eight previously reported SNPs discussed above. In each Stage 3 study, the analysis was restricted to the phenotype of MI and in four of the six studies, an age of onset threshold was established at <66 years for men or women.
Publication 2009
Asian Persons Autopsy Biological Markers CELSR2 protein, human Chemokine CXCL12 Chest Pain Coronary Artery Disease Creatine Kinase DNA Replication Electrocardiography Europeans Genes Genetic Association Studies Genome-Wide Association Study Heart Heart Disease, Coronary Males migen Myocardial Infarction PCSK9 protein, human Phenotype Single Nucleotide Polymorphism SMAD3 protein, human SORT1 protein, human Troponin Woman
Lipoprotein fractions for Women’s Genome Health Study (WGHS) samples (N = 23,170) were measured using the LipoProtein-II assay (Liposcience Inc., Raleigh, NC) and Framingham Heart Study Offspring samples (N = 2,900) were measured with the LipoProtein-I assay (Liposcience Inc., Raleigh, NC) [47]. Additional information on sub-fraction measurements can be found in Supplementary Fig. 7. Log transformations were used for non-normalized traits. All models were adjusted for age, sex, and PCs. The genetic association analysis of WGHS used SNP genotypes imputed from the HapMap r22 CEU reference panel using MACH. 16,730 out of 23,170 WGHS participants were fasting for 8 hours prior to blood draw (72.2%).
Publication 2013
Biological Assay BLOOD CASP8 protein, human Genetic Association Studies Genome Genotype HapMap Healthy Volunteers Lipoproteins Woman
A multidisciplinary group developed the STREGA Statement by using literature review, workshop presentations and discussion and iterative electronic correspondence after the workshop. Thirty-three of 74 invitees participated in the STREGA workshop in Ottawa, Ontario, Canada, in June, 2006. Participants included epidemiologists, geneticists, statisticians, journal editors and graduate students.
Before the workshop, an electronic search was performed to identify existing reporting guidance for genetic association studies. Workshop participants were also asked to identify any additional guidance. They prepared brief presentations on existing reporting guidelines, empirical evidence on reporting of genetic association studies, the development of the STROBE Statement and several key areas for discussion that were identified on the basis of consultations before the workshop. These areas included the selection and participation of study participants, rationale for choice of genes and variants investigated, genotyping errors, methods for inferring haplotypes, population stratification, assessment of Hardy–Weinberg equilibrium (HWE), multiple testing, reporting of quantitative (continuous) outcomes, selectively reporting study results, joint effects and inference of causation in single studies. Additional resources to inform workshop participants were the HuGENet handbook [57,58], examples of data extraction forms from systematic reviews or meta-analyses, articles on guideline development [59,60] and the checklists developed for STROBE. To harmonize our recommendations for genetic association studies with those for observational epidemiologic studies, we communicated with the STROBE group during the development process and sought their comments on the STREGA draft documents. We also provided comments on the developing STROBE Statement and its associated explanation and elaboration document [56].
Publication 2009
Epidemiologists Genetic Association Studies Genetic Selection Haplotypes Joints Student

Most recent protocols related to «Genetic Association Studies»

There were 8 GWAS studies on DM in East Asians [18–22] recorded in the GWAS Catalog [27]. These GWAS studies identified more than 380 SNPs with genome-wide significance [27]. This study focused on the relationships between the top 95 significant DM SNPs (with a p-value < 10^-16) and carotid atherosclerosis (Additional file 1: Table S1). These top significant DM SNPs or their closely linked SNPs were considered for the genetic association study.
We used the Ensembl Genome Browser [28] to retrieve the linkage disequilibrium (LD) data for the Southern Han Chinese population in 1000 Genomes Project Phase 3 [29]. The cut-off LD (r²) value of linkage was set at 0.80. Among these top 95 significant DM SNPs, 22 are in LD with more significant SNPs, leaving a total of 73 independent DM SNPs. Among these independent DM SNPs, rs3816157, rs2844623, rs610930, rs12549902, rs13266634, and rs2383208 are directly genotyped on the array plate. In addition, another 38 SNPs on the array plate are closely linked with DM SNPs (r² > 0.80). Consequently, a total of 44 SNPs were regarded as the candidate SNPs.
In the study, we used a plate (Axiom® CHB 1 Array Plate; Affymetrix Ltd, Santa Clara, CA, USA) to determine the genotypes of these 44 DM SNPs. All genotyping was performed by the National Center for Genome Medicine, Academia Sinica, Taiwan. The frequency distributions of genotypes of these 44 DM SNPs in CP-positive and -negative individuals are shown in Additional file 1: Table S2. The call rates of all typed SNPs were greater than 95% and the relative frequencies of the minor alleles of all typed SNPs were greater than 5%. However, SNP rs13342692 was excluded from association analysis for violation of the Hardy–Weinberg Equilibrium, leaving a total of 43 SNPs for association evaluation.
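The LD-based reduction to independent SNPs can be sketched as below. The sketch takes the sentence literally: a SNP is dropped when it is in LD with any more significant SNP. It assumes "in LD" means r² at or above the 0.80 cut-off and that pairs missing from the LD table are unlinked. The function name, SNP labels, P-values, and r² values are hypothetical.

```python
R2_CUTOFF = 0.80


def independent_snps(pvalues, r2, cutoff=R2_CUTOFF):
    """Keep a SNP only if it is not in LD with any more significant SNP.

    pvalues: dict mapping SNP id -> P-value
    r2:      dict mapping frozenset({snp_a, snp_b}) -> r-squared
             (assumption: missing pairs are treated as unlinked)
    """
    ordered = sorted(pvalues, key=pvalues.get)
    kept = []
    for i, snp in enumerate(ordered):
        more_significant = ordered[:i]
        linked = any(
            r2.get(frozenset((snp, other)), 0.0) >= cutoff
            for other in more_significant
        )
        if not linked:
            kept.append(snp)
    return kept


# Made-up inputs, for illustration only.
pvalues = {"snp1": 1e-30, "snp2": 1e-25, "snp3": 1e-20}
r2 = {
    frozenset(("snp1", "snp2")): 0.95,  # snp2 is tagged by snp1
    frozenset(("snp1", "snp3")): 0.10,
}
kept = independent_snps(pvalues, r2)
print(kept)
```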
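The three quality-control filters mentioned here (call rate, minor allele frequency, Hardy–Weinberg equilibrium) can be sketched from genotype counts alone. The excerpt does not say which HWE test or significance threshold was used, so the 1-df chi-square test and the 0.05 cut-off below are assumptions, as are the function names and the example counts.

```python
import math

MIN_CALL_RATE = 0.95
MIN_MAF = 0.05
HWE_ALPHA = 0.05  # assumption: the excerpt does not state the HWE threshold


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of observed genotype counts against
    Hardy-Weinberg expectations. Requires a polymorphic SNP."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, math.erfc(math.sqrt(stat / 2.0))


def passes_qc(counts, n_attempted):
    """counts: (n_aa, n_ab, n_bb) for called samples."""
    n_aa, n_ab, n_bb = counts
    n_called = n_aa + n_ab + n_bb
    if n_called / n_attempted <= MIN_CALL_RATE:
        return False
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    if min(freq_a, 1.0 - freq_a) <= MIN_MAF:
        return False
    _, p_hwe = hwe_chi2(n_aa, n_ab, n_bb)
    return p_hwe >= HWE_ALPHA


# Made-up genotype counts, for illustration only.
print(passes_qc((298, 489, 213), n_attempted=1020))  # close to HWE
print(passes_qc((400, 200, 400), n_attempted=1020))  # strong heterozygote deficit
```

The MAF filter runs before the HWE test so that monomorphic SNPs, which would give zero expected counts, never reach the chi-square calculation.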
Publication 2023
Carotid Atherosclerosis Chinese East Asian People Genetic Association Studies Genome Genome-Wide Association Study Genotype Single Nucleotide Polymorphism
The data for continuous variables are shown as mean ± s.d. Comparisons between control and GDM subjects were performed parametrically using an independent samples t-test. Differences in asymmetrically distributed variables between the two groups were assessed using the Mann–Whitney U test. The chi-square (χ2) test was used to examine the deviation of genotype distribution from Hardy–Weinberg equilibrium and to estimate the differences in frequencies between the GDM and control groups. Odds ratios (ORs) and 95% confidence intervals (CIs) were computed to assess the relative risk for GDM associated with the genetic variations by logistic regression analysis or χ2 analysis. The biochemical indexes after correction for differences in age and delivery BMI between the two groups were estimated using covariance analysis. Statistical significance was set as P < 0.05. Statistical analyses were conducted using SPSS 21.0 (IBM).
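For a single unadjusted 2×2 comparison, the odds ratio and its 95% confidence interval can be computed directly from the cell counts. The sketch below uses the log-based (Woolf) interval, which coincides with the Wald interval from an unadjusted logistic regression on one binary predictor; it is not the SPSS procedure used in the study, and the function name and counts are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% normal quantile


def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=Z_95):
    """Odds ratio with a log-based (Woolf) confidence interval.
    All four cell counts must be positive."""
    a, b, c, d = case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts: variant carriers vs non-carriers among cases and controls.
or_, lo, hi = odds_ratio_ci(60, 140, 40, 160)
print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

Adjusting for covariates such as age or BMI, as the excerpt describes, requires a full logistic regression model rather than this closed-form calculation.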
A power calculation based on sample size and the minor allele frequency of the SNP MPO G-463A (significance level = 0.05, prevalence = 0.15) was conducted by the Genetic Association Study (GAS) Power Calculator (http://csg.sph.umich.edu/abecasis/gas_power_calculator/index.html).
Publication 2023
Genetic Association Studies Genetic Diversity Genotype Obstetric Delivery
To facilitate investigations of nominated risk variants, members of IPDGC have created a PD GWAS locus browser (https://pdgenetics.shinyapps.io/GWASBrowser/) that makes relevant statistics and datasets available to the public [30]. Throughout the hackathon, our team continued the development of this browser through the addition of new datasets and features. To identify secondary association signals at each locus from the Nalls et al. 2019 study, we performed conditional analysis using the Genome-wide Complex Trait Analysis (GCTA) tool [31,32]. Locus zoom plots were added to display the results of this conditional analysis (Fig. 4b) [33]. Power calculations were done for each risk variant by Nalls et al. 2019 to determine if the findings were sufficiently powered. To do so, we followed methods used by the Genetic Association Study Power Calculator tool (https://csg.sph.umich.edu/abecasis/gas_power_calculator/), using summary statistics from Nalls et al. 2019, a disease prevalence of 0.01, and a significance level of 0.05 as input. We queried blood gene expression data included in the AMP PD version 2.5 release to measure expression levels in PD cases and controls. We obtained TPM expression at baseline for samples that had case or control status and no PD mutations in whole-genome sequencing data, leaving a total of 1710 samples. Expression data for each gene was displayed in a violin plot and added to the expression section of the browser (Fig. 4b). The literature section of the browser was updated to display a description, PubMed hit count, and word cloud plot for each gene within 1 Mb of a PD risk variant. Our last addition to the browser was a display of user statistics. We used the googleAnalyticsR package [34] to record and visualize the number of visits for the browser and each risk variant within a period specified by the user (Fig. 4b).
Publication 2023
BLOOD Gene Expression Genes Genetic Association Studies Genome Genome-Wide Association Study Mutation Polygenic Traits
UKB participants’ records have been linked with inpatient hospital codes, primary care data, and death registry for longitudinal follow-up. Outcome events were gathered from hospital admissions and death registry data using International Classification of Diseases (ICD) 10 codes that were aligned with the diagnostic algorithm in the UKB (https://biobank.ndph.ox.ac.uk/showcase/ukb/docs/alg_outcome_main.pdf). Incident events were defined as events occurring after baseline and after the first drug prescription date. Because only very limited information on ischemic vs. hemorrhagic stroke subtypes before baseline exists in the UKB, recurrent ischemic strokes were defined as ischemic strokes occurring in individuals with a history of any stroke at baseline, as defined in the UKB field 42006. The ICD-10 codes used for ascertainment of the outcomes are supplied in Supplemental Table S6. Stroke outcomes in the UKB have been routinely used in genetic association studies, including the most recent GWAS of stroke risk [11].
Publication Preprint 2023
Cerebrovascular Accident Diagnosis Genetic Association Studies Genome-Wide Association Study Hemorrhagic Stroke Inpatient Primary Health Care Stroke, Ischemic
The DEGs resulting from the above analyses, i.e., those that scored highly in the centrality tests and were part of identified functional modules, were validated by text-mining using databases such as DisGeNET, MalaCards, and HuGE Genopedia.
The DisGeNET database was used to obtain the genes associated with ATH and NAFLD. DisGeNET is a discovery platform containing one of the largest publicly available collections of genes and variants associated with human diseases [106]. The latest update available is version 7 (June 2020) containing 1,134,942 gene–disease associations (GDAs) between 21,671 genes and 30,170 diseases and traits. The data contained in this database come from the most popular repositories used by the scientific community. In addition, these data are expanded and enriched with information extracted from scientific literature using state-of-the-art text-mining tools.
MalaCards is an integrated database of human pathologies and their annotations. This database is organized into disease cards containing information, annotations, connections between other diseases, as well as genes associated with each disease. It currently contains 22,091 disease entries, which come from 75 sources [107].
HuGE Genopedia is a database that focuses on genetic association studies summarized in Human Genome Epidemiology (HuGE). Following its latest available data update, it contained 16,498 genes and 3416 diseases. Using a single gene as a query, it provides summary information on diseases that have been studied in association with the given query [108].
Publication 2023
Genes Genes, vif Genetic Association Studies Genome, Human Hereditary Diseases Homo sapiens Non-alcoholic Fatty Liver Disease

Top products related to «Genetic Association Studies»

Sourced in United States, Germany
The ABI PRISM 7700 system is a real-time PCR instrument designed for nucleic acid detection and quantification. It utilizes fluorescence-based detection technology to provide accurate and sensitive measurement of target sequences.
Sourced in United States, Japan, United Kingdom, Germany, Israel, Thailand
SPSS version 17.0 is a statistical software package originally released by SPSS Inc., which IBM later acquired. It provides a comprehensive set of tools for data analysis, including data manipulation, visualization, and predictive modeling. The software is designed to handle a wide range of data types and offers a user-friendly interface for conducting complex statistical analyses.
Sourced in Germany, United States, Spain, United Kingdom, Japan, Netherlands, France, China, Canada, Italy, Australia, Switzerland, India, Brazil, Norway
The QIAamp DNA Blood Mini Kit is a laboratory kit designed for the extraction and purification of genomic DNA from small volumes of whole blood, buffy coat, plasma, or serum samples. It utilizes a silica-based membrane technology to efficiently capture and wash DNA, while removing contaminants and inhibitors.
The Illumina 317 Chip is a genotyping microarray (a BeadChip carrying roughly 317,000 SNP markers) designed for genome-wide association studies. It is a genotyping array, not a sequencing platform.
The Illumina 370 Duo Chip is a high-density genotyping microarray designed for genome-wide association studies. It enables the measurement of genomic variations across the human genome.
The Illumina Human Hap 600 is a high-density genotyping array that enables genome-wide association studies (GWAS) and other genomic research applications. The array assesses over 600,000 genetic markers across the human genome.
Sourced in United States, Japan, China, Germany, United Kingdom, Singapore, Spain, France, Canada, Italy, Switzerland, Belgium
The 7300 Real-Time PCR System is a laboratory instrument designed for quantitative real-time polymerase chain reaction (qPCR) analysis. It provides precise detection and quantification of target DNA sequences in samples. The system includes a thermal cycler, optical detection module, and analysis software to enable real-time monitoring of PCR amplification.
Sourced in Canada, United States
The Oragene kits are a line of products designed for the collection and stabilization of DNA samples from saliva. These kits provide a non-invasive method for obtaining high-quality genetic material that can be used in various research and diagnostic applications.
Sourced in United States
The TaqMan Allelic Discrimination Protocol is a real-time PCR-based method used for the detection and genotyping of single nucleotide polymorphisms (SNPs) or other sequence variants. It utilizes TaqMan probes labeled with different fluorescent reporters to discriminate between allelic variants in a DNA sample.

More about "Genetic Association Studies"

Genetic Association Studies (GAS) are a powerful tool for uncovering connections between genetic variants, such as single nucleotide polymorphisms (SNPs), and complex traits or diseases.
These studies examine the relationships between specific genetic markers and the risk or prevalence of a particular phenotype.
By analyzing large populations, researchers can identify genetic factors that contribute to the development or progression of complex conditions like Alzheimer's disease, diabetes, or cancer.
GAS provide valuable insights into the underlying genetic architecture of human health and disease, guiding the development of personalized medicine and targeted therapeutic interventions.
Their findings can also inform our understanding of the biological pathways and mechanisms involved in various physiological and pathological processes.
Despite their utility, the reproducibility of GAS can be challenged by factors like sample size, population stratification, and statistical power.
Leveraging innovative technologies like AI-driven protocol optimization can enhance the reliability and impact of this important field of genetic research.
Researchers can utilize tools like the ABI PRISM 7700 system, SPSS version 17.0, QIAamp DNA Blood Mini Kit, Illumina 317 Chip, Illumina 370 Duo Chip, Illumina Human Hap 600, and 7300 Real-Time PCR System to conduct GAS.
Additionally, Oragene kits and TaqMan allelic discrimination protocols can be employed to collect and analyze genetic data.
By incorporating these technologies and best practices, researchers can streamline their workflow, improve the reliability of their findings, and advance our understanding of the genetic underpinnings of complex traits and diseases.