The largest database of trusted experimental protocols

Gene Clusters

Gene clusters are groups of genes that are located close together on a chromosome and are often co-expressed or functionally related.
These clusters play important roles in a variety of biological processes, such as immune system function, developmental regulation, and metabolic pathways.
Understanding the structure and organization of gene clusters is crucial for unraveling the complex genetic mechanisms underlying various diseases and traits.
PubCompare.ai's AI-driven platform enhances reproducibility and accuracy in gene cluster research by helping researchers locate the best protocols from literature, pre-prints, and patents using intelligent comparisons.
This cutting-edge technology can optimize research and reduce the risk of experimental errors, empowering scientists to make new discoveries in this vital field of study.

Most cited protocols related to «Gene Clusters»

To create the annotations network, ClueGO provides predefined functional analysis settings ranging from general to very specific ones. Furthermore, the user can adjust the analysis parameters to focus on terms, e.g. in certain GO level intervals, with particular evidence codes or with a certain number and percentage of associated genes. An optional redundancy reduction feature (Fusion) assesses GO terms in a parent–child relation sharing similar associated genes and preserves the more representative parent or child term. The relationship between the selected terms is defined based on their shared genes in a similar way as described by Huang et al. (2007). ClueGO first creates a binary gene-term matrix with the selected terms and their associated genes. Based on this matrix, a term–term similarity matrix is calculated using chance-corrected kappa statistics to determine the association strength between the terms. Since the term–term matrix is of categorical origin, the kappa statistic was found to be the most suitable method. Finally, the created network represents the terms as nodes, which are linked based on a predefined kappa score level. The kappa score level threshold can initially be adjusted on a positive scale from 0 to 1 to restrict the network connectivity in a customized way. The size of the nodes reflects the enrichment significance of the terms. The network is automatically laid out using the Organic layout algorithm supported by Cytoscape. The functional groups are created by iterative merging of initially defined groups based on the predefined kappa score threshold. The final groups are fixed or randomly colored and overlaid with the network. Functional groups represented by their most significant (leading) term are visualized in the network, providing an insightful view of their interrelations. Other ways of selecting the group leading term, e.g. based on the number or percentage of genes per term, are also provided.
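The kappa-based linking described above can be illustrated with a minimal sketch. It is not ClueGO's own code: the function names `cohen_kappa` and `term_network`, the toy term identifiers, and the choice to use the union of all annotated genes as the gene universe are assumptions made for illustration only.

```python
def cohen_kappa(a, b):
    """Chance-corrected agreement between two equal-length binary vectors."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("vectors must be non-empty and of equal length")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    neither = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    observed = (both + neither) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:
        return 0.0  # degenerate case: both terms cover all genes or none
    return (observed - expected) / (1 - expected)


def term_network(gene_term, threshold):
    """Return (term, term, kappa) edges whose kappa score meets the threshold.

    gene_term maps a term identifier to the set of genes associated with it.
    The gene universe is assumed to be the union of all listed genes.
    """
    terms = sorted(gene_term)
    genes = sorted(set().union(*gene_term.values()))
    # Binary gene-term matrix, stored as one 0/1 row per term.
    rows = {t: [1 if g in gene_term[t] else 0 for g in genes] for t in terms}
    edges = []
    for i, t1 in enumerate(terms):
        for t2 in terms[i + 1:]:
            k = cohen_kappa(rows[t1], rows[t2])
            if k >= threshold:
                edges.append((t1, t2, round(k, 3)))
    return edges
```

Raising the threshold towards 1 removes edges between terms that share fewer genes, which is how the adjustable kappa score level restricts network connectivity.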
As an alternative to the kappa score grouping, the GO hierarchy using parent–child relationships can be used to create functional groups.
When comparing two gene clusters, another original feature of ClueGO allows switching the visualization of the groups on the network to the cluster distribution over the terms. Besides the network, ClueGO provides overview charts showing the groups and their leading term as well as detailed term histograms for both cluster-specific and common terms.
Like BiNGO, ClueGO can be used in conjunction with GOlorize for functional analysis of a Cytoscape gene network. The created networks, charts and analysis results can be saved as a project in a specified folder and used for further analysis.
Publication 2009
Child Gene Clusters Gene Regulatory Networks Genes Genes, vif Parent
We used STRUCTURE [1,2] as a benchmark for the performance of DAPC. We analysed all simulated datasets with STRUCTURE v2.1, using the admixture model with correlated allele frequencies to determine the optimal number of genetic clusters and to assign individuals to groups. Computations were performed on the computer resources of the Computational Biology Service Unit at Cornell University (http://cbsuapps.tc.cornell.edu/). For each run, results were based on a Markov Chain Monte Carlo (MCMC) of 100,000 steps, of which the first 20,000 were discarded as burn-in. Analyses were run with numbers of clusters (k) ranging from 1 to 8 for the island and hierarchical island models (Figure 2a-b), from 1 to 15 for the hierarchical stepping stone (Figure 2c), and from 1 to 30 for the stepping stone (Figure 2d). Ten runs were performed for each k value. We employed the approach of Evanno et al. [57] to assess the optimal number of clusters. In order to assess assignment success, STRUCTURE was run by enforcing k to its true value. Individuals were assigned to clusters using CLUMPP 1.1.2 [58], which accounts for the variability in individual membership probabilities across the different runs. To obtain results comparable to DAPC, individuals were assigned to the cluster to which they had the highest probability to belong.
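As a rough illustration of the two post-processing steps mentioned above, the sketch below computes an Evanno-style ΔK from replicate log-likelihoods and assigns each individual to its most probable cluster. The function names and input layout are hypothetical, and the second difference is taken on the per-k mean log-likelihoods, which is one common way of applying the method rather than a detail confirmed by the text.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Map each interior k to |L(k+1) - 2L(k) + L(k-1)| / sd(L(k)).

    lnp_by_k maps k to the list of log-likelihoods from its replicate runs.
    """
    avg = {k: mean(v) for k, v in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in avg or k + 1 not in avg:
            continue  # delta K is undefined at the smallest and largest k
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        delta[k] = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1]) / sd
    return delta


def best_k(lnp_by_k):
    """Return the k with the largest delta K."""
    delta = evanno_delta_k(lnp_by_k)
    return max(delta, key=delta.get)


def assign_individuals(memberships):
    """Assign each individual to the cluster index with the highest probability."""
    return [max(range(len(q)), key=q.__getitem__) for q in memberships]
```

Because ΔK needs both neighbours of k, it cannot select the smallest or largest k tested.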
Publication 2010
1,2-diarachidonoyl-glycero-3-phosphocholine Calculi Gene Clusters MLL protein, human
In order to rapidly annotate the accessory genes surrounding the detected core signature genes in the various types of secondary metabolite biosynthesis gene clusters, we constructed a database of all gene clusters contained in the latest NCBI nt database (15 February 2011). To do so, pHMMs described above were used to detect all secondary metabolite biosynthesis gene cluster signature genes in the nr database. The accession numbers of all hits meeting the described cut-offs were extracted and used to download the corresponding GenPept files. If the taxonomy identifier included ‘bacteria’ or ‘fungi’, the nucleotide source accession number was extracted. The corresponding nucleotide GenBank files were then downloaded as well, and cross-checked for presence of the queried protein accession number. For each nucleotide GenBank file, gene clusters were detected as described above. Amino acid sequences of all genes contained within the gene clusters were written to a FASTA file with headers containing key information, and a summary of all detected gene clusters (nucleotide accession, nucleotide description, cluster number, cluster type, protein accession numbers) was written to a text file. To construct the smCOGs, clustering of all gene cluster proteins was performed using OrthoMCL (21), and consensus annotations were manually assigned based on the frequencies of the five most prevalent annotations of each smCOG in GenBank. For each smCOG, a seed alignment was created from 100 randomly picked sequences using MUSCLE 3.5 (22), and a pHMM of each smCOG was generated based on the conserved core of each alignment (Supplementary Figure S1). Within the antiSMASH software pipeline, the smCOG pHMMs are used for functional annotation of all accessory genes within the gene clusters.
After assignment of an smCOG to a gene (based on the highest-scoring pHMM on its sequence above a certain e-value threshold), the predicted protein sequence is aligned to the smCOG seed alignment, and a rough neighbor-joining phylogenetic tree is calculated using FastTree 2 (23) and visualized with TreeGraph 2 (24) (Supplementary Figure S1).
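Two of the bookkeeping steps above, tabulating the five most prevalent annotations per smCOG and picking 100 sequences for a seed alignment, can be sketched as follows. This is an illustrative sketch and not antiSMASH code: the function names, the case-insensitive counting, and the behaviour for groups with fewer than 100 members are all assumptions.

```python
import random
from collections import Counter


def top_annotations(annotations, n=5):
    """Return the n most frequent annotation strings with their counts.

    Annotations are compared case-insensitively (an assumption made here);
    empty strings are ignored.
    """
    counts = Counter(a.strip().lower() for a in annotations if a.strip())
    return counts.most_common(n)


def pick_seed_sequences(sequences, k=100, rng_seed=0):
    """Randomly pick k sequences for a seed alignment.

    If fewer than k sequences are available, all of them are returned
    (assumed behaviour; the source does not describe this case).
    """
    sequences = list(sequences)
    if len(sequences) <= k:
        return sequences
    return random.Random(rng_seed).sample(sequences, k)
```

The ranked list only supports the curator; in the described workflow the consensus annotation itself was assigned manually.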
Publication 2011
Amino Acid Sequence Anabolism Bacteria Fungi Gene Annotation Gene Clusters Gene Products, Protein Genes Muscle Tissue Nucleotides Proteins

Protocol full text hidden due to copyright restrictions


Publication 2015
Chromatin Copy Number Polymorphism Cytoplasmic Granules Gene Clusters Genes Genetic Heterogeneity Malignant Neoplasms Oncogenes
Five analysis modules will be executed in PGAP after checking and pre-preparation (Supplementary Fig. S1). They are cluster analysis of functional genes, pan-genome profile analysis, genetic variation analysis of functional genes, species evolution analysis and function enrichment analysis of gene clusters. Among all these five modules, the cluster analysis of functional genes module is the basis for the whole program, as other modules are dependent on the orthologous clusters' output from cluster analysis of functional genes. As for species evolution analysis, it is dependent on the results from genetic variation analysis of functional genes and orthologous clusters (Supplementary Material).
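The module dependencies described above can be written down as a small graph and ordered so that each module runs after the ones it depends on. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the module labels are hypothetical shorthand for the five analyses and are not PGAP command-line options.

```python
from graphlib import TopologicalSorter

# Each key depends on the modules in its value set (labels are illustrative).
DEPENDS_ON = {
    "cluster_analysis": set(),
    "pan_genome_profile": {"cluster_analysis"},
    "genetic_variation": {"cluster_analysis"},
    "function_enrichment": {"cluster_analysis"},
    "species_evolution": {"cluster_analysis", "genetic_variation"},
}


def execution_order(depends_on):
    """Return one valid run order in which dependencies always come first."""
    return list(TopologicalSorter(depends_on).static_order())
```

Any order returned this way starts with the cluster analysis and places the genetic variation analysis before the species evolution analysis.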
Publication 2011
Biological Evolution Gene Clusters Gene Modules Genes Genetic Diversity Genetic Profile

Most recent protocols related to «Gene Clusters»

Example 17

Since interferon signaling is spontaneously activated in a subset of cancer cells and exposes potential therapeutic vulnerabilities, it was tested whether there is evidence for similar endogenous interferon activation in primary human tumors. An IFN-GES threshold was computed to predict ADAR dependency across the CCLE cell lines and was determined to be a z-score above 2.26 (FIG. 66, panel A). This threshold was applied to The Cancer Genome Atlas (TCGA) tumors, to identify primary cancers with similarly high interferon activation. Restricting the analysis to the 4,072 samples analyzed by TCGA with at least 70% tumor purity as estimated by the ABSOLUTE algorithm (Carter et al. (2012) Nat. Biotechnol. 30:413-421), 2.7% of TCGA tumors displayed IFN-GESs above this threshold (FIG. 66, panel B). GSEA of amplified genes in these high-purity, high-interferon tumors revealed the top pathway as “Type I Interferon Receptor Binding”, comprising 17 genes that all encode type I interferons and are clustered on chromosome 9p21.3 (FIG. 67).
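The thresholding step above amounts to a purity filter followed by a z-score cut-off. A minimal sketch, assuming each sample is a dictionary with hypothetical keys `purity` (a fraction between 0 and 1) and `ifn_ges_z` (the IFN-GES z-score), might look like this; the 2.26 and 70% defaults come from the text, everything else is illustrative.

```python
def high_interferon_fraction(samples, z_threshold=2.26, min_purity=0.70):
    """Count high-purity samples whose IFN-GES z-score exceeds the threshold.

    Returns (n_high, n_pure, fraction). Samples below min_purity are excluded
    before the fraction is computed.
    """
    pure = [s for s in samples if s["purity"] >= min_purity]
    if not pure:
        return 0, 0, 0.0
    high = [s for s in pure if s["ifn_ges_z"] > z_threshold]
    return len(high), len(pure), len(high) / len(pure)
```

The comparison is strict, so a sample sitting exactly at the threshold is not counted as high, matching the phrase "a z-score above 2.26".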

Furthermore, analysis of TCGA copy number data showed that the interferon gene cluster including IFN-β (IFNB1), IFN-ε (IFNE), IFN-ω (IFNW1), and all 13 subtypes of IFN-α on chromosome 9p21.3, proximal to the CDKN2A/CDKN2B tumor suppressor locus, is one of the most frequently homozygously deleted regions in the cancer genome. The interferon genes comprise 16 of the 26 most frequently deleted coding genes across 9,853 TCGA cancer specimens for which ABSOLUTE copy number data are available (FIG. 66, panels C and D). Interferon signaling and activation, whether in tumors with high IFN-GESs or in those with deletions in chromosome 9p, therefore represent a biomarker to stratify patients who benefit from interferon modulating therapies.

In summary, specific cancer cell lines have been identified with elevated IFN-β signaling triggered by an activated cytosolic DNA sensing pathway, conferring dependence on the RNA editing enzyme, ADAR1. In cells with low, basal interferon signaling, the cGAS-STING pathway is inactive and PKR levels are reduced (FIG. 68, panel A). Upon cGAS-STING activation, interferon signaling and PKR protein levels are elevated but ADAR1 is still able to suppress PKR activation (FIG. 68, panel B). However, once ADAR1 is deleted, the abundant PKR becomes activated and leads to downstream signaling and cell death (FIG. 68, panel C). This is also shown in normal cell lines (e.g. A549 and NCI-H1437) once exogenous interferon is introduced (FIG. 68, panel D). ADAR1 deficiency in cell lines with high interferon levels, whether from endogenous or exogenous sources, led to phosphorylation and activation of PKR, ATF4-mediated gene expression, and apoptosis. Recent studies have shown that cGAS activation and innate interferon signaling, induced by cytosolic DNA released from the nucleus by DNA damage and genome instability (Mackenzie et al. (2017) Nature 548:461-465; Harding et al. (2017) Nature 548:466-470), led to elevated interferon-related gene expression signatures, which have been linked to resistance to DNA damage, chemotherapy, and radiation in cancer cells (Weichselbaum et al. (2008) Proc. Natl. Acad. Sci. USA 105:18490-18495). In high-interferon tumors, blocking ADAR1 might be effective to induce PKR-mediated apoptotic pathways while upregulating type I interferon signaling, which could contribute to anti-tumor immune responses (Parker et al. (2016) Nature 16:131-144). Alternatively, in tumors without activated interferon signaling, ADAR1 inhibition can be combined with localized interferon inducers, such as STING agonists, chemotherapy, or radiation.
Generation of specific small molecule inhibitors targeting ADAR1 exploits this novel vulnerability in lung and other cancers and serves to enhance innate immunity in combination with immune checkpoint inhibitors.

Patent 2024
agonists Apoptosis ATF4 protein, human Biological Markers CDKN2A Gene Cell Death Cell Lines Cell Nucleus Cells Chromogranin A Chromosome Deletion Chromosomes, Human, Pair 3 Cytosol DNA Damage Electromagnetic Radiation Enzymes Gene, Cancer Gene Clusters Gene Expression Genes Genome Genomic Instability Homo sapiens IFNAR2 protein, human Immune Checkpoint Inhibitors Immunity, Innate inhibitors Interferon-alpha Interferon Inducers interferon omega 1 Interferons Interferon Type I Lung Malignant Neoplasms Neoplasms Oncogenes Patients Pharmacotherapy Phosphorylation Proteins Psychological Inhibition Response, Immune Tumor Suppressor Genes

Example 1

This example describes the generation of a marker-free B. subtilis strain expressing allulose epimerase. Briefly, in a first step, a B. subtilis strain was transformed with a cassette encoding the BMCGD1 epimerase and including an antibiotic resistance marker. This cassette recombined into the Bacillus chromosome and knocked out 8 kb of DNA, including a large sporulation gene cluster and the lysine biosynthesis gene lysA. In a second step, a second cassette was recombined into the B. subtilis chromosome, restoring the lysA gene and removing DNA encoding the antibiotic resistance. E. coli strain 39 A10 from the Keio collection was used to passage plasmid DNA prior to transformation of B. subtilis. The relevant phenotype is a deficiency in the DNA methylase HsdM in an otherwise wild-type K-12 strain of E. coli.

In detail, a cassette of 5120 bp (SEQ ID NO:1; synthetic DNA from IDT, Coralville, Iowa) was synthesized and cloned into a standard ampicillin resistant pIDT vector. The synthetic piece encoded 700 bp upstream of lysA on the B. subtilis chromosome, the antibiotic marker cat (651 bp), the DNA-binding protein lacI (1083 bp), and the allulose epimerase (894 bp), and included 700 bp of homology in dacF. This vector was transformed into E. coli strain 39 A10 (Baba et al., 2006), and plasmid DNA was prepared and transformed into B. subtilis strains 1A751 and 1A976.

Transformants were selected on LB supplemented with chloramphenicol. The replicon for pIDT is functional in E. coli but does not work in Gram positive bacteria such as B. subtilis. The colonies that arose therefore represented an integration event into the chromosome. In strain 1A751, the colony morphology on the plates was used to distinguish between single and double recombination events. The double recombination event would knock out genes required for sporulation, whereas the single recombination would not. After three days on LB plates, colonies capable of sporulation were brown and opaque; sporulation-deficient colonies were more translucent.

B. subtilis strain 1A976 with the allulose epimerase cassette is auxotrophic for histidine and lysine and can achieve very high transformation efficiency upon xylose induction. A 1925 bp synthetic DNA (SEQ ID NO:2) was amplified by primers (SEQ ID NO:3, SEQ ID NO:4) and Taq polymerase (Promega). This PCR product encoded the lysA gene that was deleted by dropping in the epimerase cassette and 500 bp of homology to lacI. A successful double recombination event of this DNA should result in colonies that are prototrophic for lysine and sensitive to chloramphenicol; i.e., the entire cat gene should be lost.

Transformants were selected on Davis minimal media supplemented with histidine. Colonies that arose were characterized by PCR and streaking onto LB with and without chloramphenicol. Strains that amplified the introduced DNA and that were chloramphenicol sensitive were further characterized, and their chromosomal DNA was extracted.

Strain 1A751 containing the chloramphenicol-resistant allulose epimerase cassette was transformed with this chromosomal DNA and selected on Davis minimal media supplemented with histidine. Transformants were streaked onto LB with and without chloramphenicol and characterized enzymatically as described below.

Patent 2024
Ampicillin Anabolism Antibiotic Resistance, Microbial Antibiotics Bacillus Bacillus subtilis Chloramphenicol Chromosomes Cloning Vectors DNA, A-Form DNA-Binding Proteins Epimerases Escherichia coli Gene Clusters Gene Knockout Techniques Genes Gram-Positive Bacteria Histidine Lysine Methyltransferase Oligonucleotide Primers Phenotype Plasmids psicose Recombination, Genetic Replicon Strains Taq Polymerase Xylose
Using Equation 5, we calculated a fold-difference in gene expression (gi) from the counts of an individual gene from sample i (ci), the counts of that same gene in the corresponding induced wild-type sample (cwt), and total sequencing counts (cT).
The variance in the fold-difference is then calculated from counts and the fold-difference in gene expression using Equation 6.
A 99% confidence interval, Z-score, and P-value were calculated following the method used in our deep sequencing analysis. Full equations are included in Mehlhoff et al. (2020).
We primarily identified genes and, more broadly, gene clusters and associated pathways, from the set of values in which there was a > 2-fold difference in gene expression with P < 0.001 significance. We utilized EcoCyc (Keseler et al. 2017) to further identify interconnected genes and search for patterns across genes that potentially play a role in deleterious collateral fitness effects.
Publication 2023
Gene Clusters Gene Expression Genes
To detect alternative mRNA associations with AUD we used Leafcutter version 0.2.9 (ref. 23). Leafcutter is a powerful transcriptome-wide splicing method that uses a Dirichlet-multinomial generalized linear regression to identify differentially spliced genes. A differentially spliced gene generally is composed of multiple clusters, each of which includes various alternative splicing events, such as exon-skipping (see Fig. 1), intron retention, alternative acceptor or alternative donor splice sites, which we annotated with the Vertebrate Alternative Splicing and Transcription Database (https://vastdb.crg.eu/wiki/Main_Page). Each splicing event corresponds to a change in percent spliced in (ΔPSI or dPSI) metric. In our AUD analyses, a positive ΔPSI for an exon skipping event would suggest that an individual with AUD is more likely to skip a certain exon than someone without AUD. We utilized the default filtering parameters of Leafcutter that filtered out splicing clusters with < 5 samples per intron, < 3 samples per group, and required at least 20 reads, which resulted in 18,685 unique genes across human brain regions. Human differential splicing analyses covaried for sex, age, brain pH, PMI, and smoking status. Note that Leafcutter performs analyses at the cluster level, calculating a cluster p-value, and then performs a Benjamini–Hochberg False Discovery Rate (BH-FDR) multiple testing correction. Differentially spliced genes/clusters were those that survived a standard BH-FDR adjusted p-value < 0.05. We corrected p-values for multiple testing within brain regions and thus, our analyses do not account for multiple testing across tissues or samples. Since only 21 genes were differentially spliced in primates (BH-FDR < 0.05), we defined significant differential splicing with a nominal p-value threshold < 0.05. When possible, primate differential splicing analyses controlled for age (NAc sample).
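For readers unfamiliar with the BH-FDR step, the following is a minimal standalone sketch of the Benjamini–Hochberg adjustment applied to a list of cluster-level p-values. It is not Leafcutter's implementation, and the function names are chosen here for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant(pvalues, alpha=0.05):
    """Flag tests whose BH-adjusted p-value falls below alpha."""
    return [q < alpha for q in benjamini_hochberg(pvalues)]
```

Applying the function separately to each brain region's p-values would mirror the within-region correction described above; it would not control the error rate across regions.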
We assessed linear correlations of the ΔPSI across all significant alternative splicing events that were common across brain regions.
To assess the overlap between human and primate results we used a Fisher’s Exact test at the gene-level and restricted analyses to homologous genes identified by biomaRt (ref. 24) and only used results from analogous regions of the brain (CEA, NAc, and PFC). In humans, we compared our differential splicing analyses with differentially expressed genes. Differential expression analyses leveraged featureCounts to count aligned RNA-seq reads and used DESeq2 (ref. 25) to determine differential expression. Differential expression analyses used the same covariates and p-value adjustment as differential splicing analyses. Previous differential splicing analyses of these data (ref. 7) used rMATS (ref. 26), which focuses on individual splicing events (rather than broader clusters within genes) and leverages a joint likelihood function combining binomial and normal distributions.
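A gene-level overlap test of this kind can be sketched with the hypergeometric distribution, using only the standard library. The sketch below computes a one-sided Fisher's exact p-value for enrichment of the overlap. The one-sided choice, the function name, and the argument layout are assumptions, since the text does not say which alternative was tested.

```python
from math import comb


def fisher_overlap_pvalue(n_both, n_a_only, n_b_only, n_neither):
    """One-sided P(overlap >= observed) for a 2x2 table of homologous genes.

    n_both:    genes significant in both species
    n_a_only:  genes significant only in species A
    n_b_only:  genes significant only in species B
    n_neither: genes significant in neither
    """
    a_total = n_both + n_a_only
    b_total = n_both + n_b_only
    total = a_total + n_b_only + n_neither
    denom = comb(total, b_total)
    upper = min(a_total, b_total)
    # Sum hypergeometric probabilities for overlaps at least as large as observed.
    tail = sum(
        comb(a_total, k) * comb(total - a_total, b_total - k)
        for k in range(n_both, upper + 1)
    )
    return tail / denom
```

The gene universe here is the sum of the four cells, so in the setting above it would be the set of homologous genes tested in both species.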
Publication 2023
Alternative Splice Sites Brain Exons Gene Clusters Genes Genetic Testing Homo sapiens Introns Joints Primates Retention (Psychology) RNA, Messenger RNA-Seq Tissue Donors Tissues Transcription, Genetic Transcriptome Vertebrates
Using Monocle2, we inferred the cell lineage trajectory of the MMD system. We first used the transcript count data (e.g. UMI) and created an object with the parameter "expressionFamily = negbinomial.size()" following the Monocle2 tutorial. The "differentialGeneTest" function was used to derive differentially expressed genes (DEGs) from each cluster, and genes with a q-value < 0.05 were used to order the cells in pseudotime analysis. Furthermore, DEGs along the pseudotime were detected and analyzed using the "differentialGeneTest" function after the cell trajectories were constructed.
Publication 2023
Gene Clusters Genes Genes, vif Physiology, Cell

Top products related to «Gene Clusters»

Sourced in United States, China, Germany, United Kingdom, Hong Kong, Canada, Switzerland, Australia, France, Japan, Italy, Sweden, Denmark, Cameroon, Spain, India, Netherlands, Belgium, Norway, Singapore, Brazil
The HiSeq 2000 is a high-throughput DNA sequencing system designed by Illumina. It utilizes sequencing-by-synthesis technology to generate large volumes of sequence data. The HiSeq 2000 is capable of producing up to 600 gigabases of sequence data per run.
Sourced in United States, Germany, Spain, United Kingdom, Netherlands
Ingenuity Pathway Analysis is a software tool designed to analyze and interpret biological and chemical systems. It provides a comprehensive suite of analytical and prediction capabilities to help users understand the complex relationships between genes, proteins, chemicals, and diseases.
Sourced in United States, China, United Kingdom, Japan, Germany, Canada, Hong Kong, Australia, France, Italy, Switzerland, Sweden, India, Denmark, Singapore, Spain, Cameroon, Belgium, Netherlands, Czechia
The NovaSeq 6000 is a high-throughput sequencing system designed for large-scale genomic projects. It utilizes Illumina's sequencing by synthesis (SBS) technology to generate high-quality sequencing data. The NovaSeq 6000 can process multiple samples simultaneously and is capable of producing up to 6 Tb of data per run, making it suitable for a wide range of applications, including whole-genome sequencing, exome sequencing, and RNA sequencing.
Sourced in United States
Cell Ranger is a software suite that enables the analysis of single-cell transcriptomics data. It provides tools for sample demultiplexing, barcode processing, gene counting, and data aggregation.
Sourced in United States, China, Germany, United Kingdom, Canada, Switzerland, Sweden, Japan, Australia, France, India, Hong Kong, Spain, Cameroon, Austria, Denmark, Italy, Singapore, Brazil, Finland, Norway, Netherlands, Belgium, Israel
The HiSeq 2500 is a high-throughput DNA sequencing system designed for a wide range of applications, including whole-genome sequencing, targeted sequencing, and transcriptome analysis. The system utilizes Illumina's proprietary sequencing-by-synthesis technology to generate high-quality sequencing data with speed and accuracy.
Sourced in United States, China, Japan, Germany, United Kingdom, Canada, France, Italy, Australia, Spain, Switzerland, Netherlands, Belgium, Lithuania, Denmark, Singapore, New Zealand, India, Brazil, Argentina, Sweden, Norway, Austria, Poland, Finland, Israel, Hong Kong, Cameroon, Sao Tome and Principe, Macao, Taiwan, Province of China, Thailand
TRIzol reagent is a monophasic solution of phenol, guanidine isothiocyanate, and other proprietary components designed for the isolation of total RNA, DNA, and proteins from a variety of biological samples. The reagent maintains the integrity of the RNA while disrupting cells and dissolving cell components.
Sourced in United States, Japan, Sweden
Expression Console software is a core analysis tool that enables users to visualize and analyze data from a variety of gene expression experiments. The software provides a user-friendly interface for organizing, processing, and interpreting gene expression data.
Sourced in United States, China, Germany, United Kingdom, Spain, Australia, Italy, Canada, Switzerland, France, Cameroon, India, Japan, Belgium, Ireland, Israel, Norway, Finland, Netherlands, Sweden, Singapore, Portugal, Poland, Czechia, Hong Kong, Brazil
The MiSeq platform is a benchtop sequencing system designed for targeted, amplicon-based sequencing applications. The system uses Illumina's proprietary sequencing-by-synthesis technology to generate sequencing data. The MiSeq platform is capable of generating up to 15 gigabases of sequencing data per run.
Sourced in Germany, United States, United Kingdom, Netherlands, Spain, Japan, Canada, France, China, Australia, Italy, Switzerland, Sweden, Belgium, Denmark, India, Jamaica, Singapore, Poland, Lithuania, Brazil, New Zealand, Austria, Hong Kong, Portugal, Romania, Cameroon, Norway
The RNeasy Mini Kit is a laboratory kit designed for the purification of total RNA from a variety of sample types, including animal cells, tissues, and other biological materials. The kit utilizes a silica-based membrane technology to selectively bind and isolate RNA molecules, allowing for efficient extraction and recovery of high-quality RNA.

More about "Gene Clusters"

Gene clusters are groups of genes located closely together on a chromosome, often co-expressed or functionally related.
These genetic arrangements play crucial roles in various biological processes, such as immune function, developmental regulation, and metabolic pathways.
Understanding the structure and organization of gene clusters is essential for unraveling the complex genetic mechanisms underlying diverse diseases and traits.
PubCompare.ai's cutting-edge AI-driven platform enhances reproducibility and accuracy in gene cluster research by helping scientists locate the best protocols from literature, preprints, and patents using intelligent comparisons.
This innovative technology can optimize research and reduce the risk of experimental errors, empowering researchers to make new discoveries in this vital field of study.
The HiSeq 2000 and NovaSeq 6000 are high-throughput sequencing platforms that can be employed to study gene expression patterns within gene clusters.
Ingenuity Pathway Analysis, a powerful bioinformatics tool, can further elucidate the functional relationships and interactions between genes within these clusters.
The Cell Ranger software, used in conjunction with the 10x Genomics platform, enables single-cell analysis of gene expression, providing insights into the heterogeneity within gene clusters.
The HiSeq 2500, another high-throughput sequencer, and the MiSeq platform can be utilized for targeted sequencing of specific gene clusters of interest.
The TRIzol reagent is commonly used for RNA extraction, while the RNeasy Mini Kit facilitates purification of high-quality RNA samples for downstream gene expression analyses.
The Expression Console software can be employed to analyze and visualize gene expression data, further enhancing the understanding of gene cluster dynamics.