
Synteny

Synteny refers to the conservation of genomic neighborhoods and ordering of genes across related species.
It is a powerful tool for comparative genomics, enabling researchers to identify functional elements and study evolutionary relationships between organisms.
PubCompare.ai's AI-driven platform helps scientists locate the best protocols for synteny analysis by comparing them side-by-side, improving reproducibility and efficiency in their research.

Most cited protocols related to «Synteny»

Figure 1 describes the overall pipeline implemented in ABACAS. It uses MUMmer (Kurtz et al. 2004 (link)) to find alignment positions and identify areas of synteny of the contigs against the reference. The output is then processed to generate a pseudomolecule, taking overlapping contigs and gaps into account. MUMmer's alignment programs, NUCmer and PROmer, are run first, followed by the ‘delta-filter’ and ‘show-tiling’ utilities.

A flow-chart describing the pipeline implemented in ABACAS.

Gaps in the pseudomolecule are represented by N's. ABACAS automatically extracts gaps on the pseudomolecule and, based on flanking sequences above a base quality threshold, designs primers for gap closure using Primer3 (Koressaar and Remm 2007 (link)). As part of the primer design step the uniqueness of the sequence is checked by running a sensitive NUCmer alignment. ABACAS allows users to adjust parameters such as melting temperature, size, flanking region and size of contig ends to exclude from picking primers. It then produces a list of sense and antisense primer oligos as well as a detailed Primer3 output that contains additional information on each gap position.
ABACAS generates a comparison file that can be used to visualize ordered and oriented contigs in ACT, the Artemis Comparison Tool (Carver et al. 2008 (link)). Synteny is represented by red bars where colour intensity decreases with lower values of percent identity between comparable blocks. Information on contigs such as the orientation, percent identity, coverage and overlap with other contigs can also be visualized by loading the output feature file on ACT. Contigs that were not mapped can be included separately. Repetitive regions in the reference can also be identified using a MUMmer self-comparison and visualized in ACT alongside quality of the contigs.
If not all of the contigs are mapped, there is an option to run tBLASTx (Altschul et al. 1997) on the contigs that are not included in the pseudomolecule, using sequences from the reference that correspond to the gaps. Additional contigs can then be dragged and dropped to the correct position in the pseudomolecule using ACT.
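As a rough illustration of how a pseudomolecule with N-filled gaps can be built from a contig tiling, the following Python sketch places contigs at their reference start positions, pads the space between them with N's, and then lists the gap coordinates. It is not ABACAS code: the function names, the tiling tuple layout, and the choice to trim the overlapping prefix of a contig are all assumptions made for this example.

```python
import re

def reverse_complement(seq):
    """Reverse-complement a DNA string (N stays N)."""
    comp = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(comp)[::-1]

def build_pseudomolecule(tiling, contigs):
    """Join contigs into one sequence ordered by reference position.

    tiling  -- list of (ref_start, contig_id, strand), 0-based (hypothetical layout)
    contigs -- dict mapping contig_id to sequence
    """
    parts = []
    pos = 0  # current length of the pseudomolecule
    for ref_start, cid, strand in sorted(tiling):
        seq = contigs[cid]
        if strand == "-":
            seq = reverse_complement(seq)
        if ref_start > pos:
            # uncovered reference stretch: represent it as a run of N's
            parts.append("N" * (ref_start - pos))
            pos = ref_start
        elif ref_start < pos:
            # contig overlaps the previous one: drop the overlapping prefix
            seq = seq[pos - ref_start:]
        parts.append(seq)
        pos += len(seq)
    return "".join(parts)

def find_gaps(pseudo):
    """Return (start, end) half-open intervals of every run of N's."""
    return [(m.start(), m.end()) for m in re.finditer(r"N+", pseudo)]
```

The gap intervals returned by `find_gaps` are the positions whose flanking sequences would be handed to a primer design step.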
Publication 2009
2',5'-oligoadenylate Base Sequence Oligonucleotide Primers Repetitive Region Synteny
In gene order comparisons, it is necessary to work with blocks of genes conserved in two or more genomes; trying to work with one gene at a time is not a robust procedure, especially with flowering plants, because most of these genomes have a whole genome duplication (WGD) in their history. The fractionation process ensuing from WGD deletes duplicate genes in a partially random pattern from one or the other duplicate (homeologous) chromosome, independently in two or more descendants of a duplicated genome [4 (link)]. This pattern, together with the possibility for some genes to transpose into different positions in the genome, makes it hard to identify unambiguously orthologous genes that are in the same gene order in two genomes. A set of five or ten genes in the same order, with few intervening genes, in two genomes can be confidently identified as a conserved syntenic block [5 (link), 6 (link)].
However, the notion of block adjacency encounters a number of operational problems. The genes in a syntenic block in one genome may differ somewhat from those in the same block in the other genome. The minimum number of genes needed to establish a block is a parameter that must be determined by empirical experimentation, as is the number of genes allowed to intervene between two pairs of orthologs within a block in the two genomes. We will avoid these practical problems in our simulations by excluding fractionation or other gene loss, duplication and small transpositions from our model.
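The two parameters discussed above (minimum block size and number of intervening genes allowed) can be made concrete with a small Python sketch. It chains ortholog pairs that advance in both genomes, in a consistent direction, with at most `max_gap` intervening genes. The function name, the input layout, and the default values are assumptions for illustration only and are not taken from the cited study.

```python
def syntenic_blocks(pairs, min_genes=5, max_gap=2):
    """Group ortholog pairs into collinear blocks.

    pairs     -- iterable of (index_in_A, index_in_B): gene order positions
                 of orthologous genes in genomes A and B
    min_genes -- minimum number of ortholog pairs needed to call a block
    max_gap   -- maximum number of intervening genes tolerated between
                 consecutive pairs, in either genome
    """
    blocks, current, direction = [], [], 0
    for a, b in sorted(pairs):
        if current:
            pa, pb = current[-1]
            da, db = a - pa, b - pb
            step = 1 if db > 0 else -1  # +1 same orientation, -1 inverted
            close = 0 < da <= max_gap + 1 and 0 < abs(db) <= max_gap + 1
            if close and direction in (0, step):
                current.append((a, b))
                direction = step
                continue
            # chain is broken: keep it only if it is long enough
            if len(current) >= min_genes:
                blocks.append(current)
        current, direction = [(a, b)], 0
    if len(current) >= min_genes:
        blocks.append(current)
    return blocks
```

Lowering `min_genes` or raising `max_gap` yields more and longer blocks at the cost of more spurious ones, which is why both values usually need empirical tuning.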
Publication 2016
Chromosomes Conserved Synteny Gene Order Genes Genes, Duplicate Genome Magnoliopsida Radiotherapy Dose Fractionations Synteny
We constructed the paired-end DNA libraries with insert sizes larger than 2 kb by self-ligation of the DNA fragments and merging the two ends of the DNA fragment. We randomly fragmented the circularized DNA and enriched the fragments crossing the merged boundaries using magnetic beads with biotin and streptavidin. The sequencing process followed the manufacturer’s instructions (Illumina), and the fluorescent images were processed to sequences using the Illumina data processing pipeline (v1.1).
The genome sequence was assembled with short reads using SOAPdenovo software6 (http://soap.genomics.org.cn), which adopts the de Bruijn graph data structure to construct contigs7 (link). The reads were then realigned to the contig sequence, and the paired-end relationship between the reads was transferred to linkage between contigs. We constructed scaffolds starting with short paired-ends and then iterated the scaffolding process, step by step, using longer insert size paired-ends. To fill the intra-scaffold gaps, we used the paired-end information to retrieve read pairs that had one read well-aligned on the contigs and another read located in the gap region, then did a local assembly for the collected reads.
Known transposable elements were identified using RepeatMasker (version 3.2.6)14 against the Repbase31 (link) transposable element library (version 2008-08-01), and highly diverged transposable elements were identified with RepeatProteinMask14 by aligning the genome sequence to the curated transposable-element-related proteins. A de novo panda repeat library was constructed using RepeatModeller14. Using evidence-based gene prediction, the human and dog genes (Ensembl release 52) were projected onto the panda genome, and the gene loci were defined by using both sequence similarity and whole-genome synteny information. De novo gene prediction was performed using Genscan16 (link) and Augustus17 (link). A reference gene set was created by merging all of the gene sets. The sequencing reads were mapped on the panda genome sequence using SOAPaligner8 (link), and heterozygous SNPs were identified by SOAPsnp9 (link).
Publication 2009
Amino Acid Sequence Biotin DNA Library DNA Transposable Elements Genes Genetic Loci Genome Heterozygote Homo sapiens Ligation Selfish DNA Single Nucleotide Polymorphism Streptavidin Synteny
Sequenced reads were assembled and attempts were made to assign the largest contiguous blocks of sequence to chromosomes using a genetic linkage map21 (link), fingerprint map and synteny with the chicken genome assembly Gallus_gallus-2.1, a revised version of the original draft6 (link) (Supplementary Note 1).
Publication 2010
Chickens Chromosomes Genome Linkage, Genetic Synteny
RATT is programmed in ‘bash’ and ‘PERL’ and its design is illustrated in Figure 1 and Supplementary Figure S1. First, two sequences are compared using ‘nucmer’ from the MUMmer package (17 (link)) to define sequence regions that share synteny. Those regions are filtered using configurable parameters depending on the type of annotation mapping that is being attempted. Preset parameters are provided for transfers between assembly versions, strains or species (see Supplementary Table S1). To be included, the minimum nucleotide sequence identity between synteny blocks must be 40%. Synteny information is stored as a base range in the query and its associated base range in the reference. However, this information alone is inadequate to map the annotation because insertions or deletions (indels) change the relative distance between mapped synteny blocks. The coordinates are therefore sequentially adjusted across a synteny block by calling indels using ‘show-snp’ from the MUMmer package. Accurately calling indels within repetitive regions presents a particular challenge. Therefore, RATT recalibrates the adjusted coordinates using single nucleotide polymorphisms (SNPs, also called using ‘show-snp’) as unambiguous anchor points within synteny blocks. In transfers between very closely related sequences (e.g. successive assembly versions), SNPs may occur with insufficient frequency to perform this coordinate adjustment. In such cases, RATT modifies the query by inserting a faux SNP every 300 bp to aid in the recalibrating step. The final sequence and transferred annotations remain unaffected.
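The idea of adjusting coordinates across a synteny block can be sketched in a few lines of Python. This is a simplified illustration and not RATT's implementation: the class name, the `(ref_pos, delta)` indel representation, and the omission of the SNP-based recalibration step are all assumptions.

```python
from bisect import bisect_left
from itertools import accumulate

class SyntenyBlock:
    """A reference range paired with a query range, plus the indels between them."""

    def __init__(self, ref_start, ref_end, query_start, indels):
        # indels: list of (ref_pos, delta); delta > 0 means the query carries
        # extra bases at that point, delta < 0 means bases are missing
        self.ref_start = ref_start
        self.ref_end = ref_end
        self.query_start = query_start
        indels = sorted(indels)
        self.positions = [p for p, _ in indels]
        # running sum of length changes, so a lookup is a single bisect
        self.offsets = list(accumulate(d for _, d in indels))

    def to_query(self, ref_pos):
        """Translate a reference coordinate into a query coordinate."""
        if not self.ref_start <= ref_pos <= self.ref_end:
            raise ValueError("position lies outside the synteny block")
        i = bisect_left(self.positions, ref_pos)  # indels strictly upstream
        shift = self.offsets[i - 1] if i else 0
        return self.query_start + (ref_pos - self.ref_start) + shift
```

Without the cumulative shift, every feature downstream of the first indel would be placed at the wrong query position, which is the problem the paragraph above describes.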

Workflow of RATT.

Once the coordinates within synteny blocks have been defined, RATT proceeds to the annotation-mapping step, whereby each feature within a reference EMBL file is associated with new coordinates on the query (Supplementary Figure S1B). A feature is not mapped (and is put in the non-transferred bin file), if it bridges a synteny break and if its coordinate boundaries match different chromosomes, different DNA strands, or if the new mapped distance of its coordinates has increased by more than 20 kb. If a short sequence from the beginning, middle or the end of a feature can be placed within a synteny region, mapping is attempted (see Supplementary Figure S1B). In addition, if the exons of a single gene model map to different gene regions, the model is split and identified in the output file. The bin is an EMBL-format file that can be loaded onto the reference sequence for analysis (see Figure 2, brown colour track). Further outputs include statistics about transferred features or the amount of synteny conserved between the reference and query, as well as Artemis-readable files showing SNPs, indels and regions that lack synteny between the compared sequences, see the example on the sourceforge site.
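One possible reading of the rejection rules above is sketched below in Python: a feature is sent to the non-transferred bin when its two boundaries land on different chromosomes or strands, or when the mapped span has grown by more than 20 kb. The `MappedPos` type and `can_transfer` function are hypothetical names for this example, and the sketch ignores the partial-placement and exon-splitting cases.

```python
from typing import NamedTuple, Optional

class MappedPos(NamedTuple):
    """Where one feature boundary landed on the query (hypothetical type)."""
    chrom: str
    strand: str
    pos: int

MAX_GROWTH = 20_000  # 20 kb threshold quoted in the text

def can_transfer(ref_start: int, ref_end: int,
                 new_start: Optional[MappedPos],
                 new_end: Optional[MappedPos],
                 max_growth: int = MAX_GROWTH) -> bool:
    """Return True if the mapped feature passes the simple sanity checks."""
    if new_start is None or new_end is None:
        return False  # a boundary fell outside every synteny block
    if new_start.chrom != new_end.chrom:
        return False  # boundaries map to different chromosomes
    if new_start.strand != new_end.strand:
        return False  # boundaries map to different DNA strands
    old_len = abs(ref_end - ref_start)
    new_len = abs(new_end.pos - new_start.pos)
    return new_len - old_len <= max_growth
```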

Transfer of annotation from the M. tuberculosis strain H37Rv onto the strain F11 sequence, over a deletion. The genomes of H37Rv (upper) and F11 (lower) are shown using the Artemis Comparison Tool (ACT). The source H37Rv annotation (light blue) is directly mapped onto F11 by RATT (green) except for those features corresponding to a region that is unique to the source strain that cannot be transferred and are written to a separate output file (brown).

Although two sequences may be related, differences can occur, such as a change in the start or stop codons of a protein-coding sequence. Therefore, we implemented a correction algorithm in RATT (see Supplementary Figure S1C). Figure 3 shows examples of the correction step. First, the start codon is checked. If it is not present, the upstream sequence is searched for a new start codon (Figure 3A). If a stop codon is found, the first start codon downstream is used. In the absence of any start codon, an error is recorded in the results file. If the sequence between exons has no stop codon and a length divisible by three bases but the splice acceptor or donor sequences are wrong, then the intron is eliminated. Likewise, frameshifts previously introduced into the reference to maintain conceptual translations (for instance, in apparent pseudogenes) will also be removed from coding sequences in the query. RATT will also detect, and attempt to fix, incorrect splice sites. As splice sites are difficult to annotate correctly, RATT only tries to correct a gene model that has one wrong splice site. If one incorrect splice site is detected, the closest alternative splice donor or acceptor is found that, when used, generates no frame shifts. Next, RATT searches for genes or exons with internal stop codons, further than 150 bp from the 3′-end. If the introduction of a frameshift would generate a model without internal stop codons, the model is corrected (Figure 3C). Stop codons are corrected last: if a model has less than five internal stops in its last exon, the model is shortened to the first stop codon (Figure 3B). If the model has no stop codon it is extended downstream until a stop codon is found.
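The start and stop codon checks can be illustrated with a minimal Python sketch for a single-exon, forward-strand coding sequence. It is a simplification under stated assumptions and not RATT's code: the function names are invented, the splice-site and frameshift corrections are left out, and the "fewer than five internal stops" condition is not modelled.

```python
STARTS = frozenset({"ATG"})  # pass e.g. {"ATG", "GTG", "TTG"} for bacteria
STOPS = frozenset({"TAA", "TAG", "TGA"})

def fix_start(seq, start, end, starts=STARTS):
    """Return a corrected 0-based start for the CDS seq[start:end], or None."""
    if seq[start:start + 3] in starts:
        return start  # the start codon is already in place
    # walk upstream codon by codon until a start or a stop codon is met
    pos = start - 3
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOPS:
            break
        if codon in starts:
            return pos
        pos -= 3
    # no usable upstream start: take the first in-frame start downstream
    for pos in range(start + 3, end - 2, 3):
        if seq[pos:pos + 3] in starts:
            return pos
    return None  # caller would record an error for this model

def fix_stop(seq, start):
    """Return the end coordinate just past the first in-frame stop, or None.

    Scanning from the start both shortens a model that runs through an
    internal stop and extends a model that lacks a stop codon.
    """
    for pos in range(start, len(seq) - 2, 3):
        if seq[pos:pos + 3] in STOPS:
            return pos + 3
    return None
```

Passing a different `starts` set is how organism-specific start codons would be accommodated in this sketch.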

RATT corrections of transferred annotations. Annotations from H37Rv were transferred onto the F11 sequence (pale blue), corrected (green) and then compared with the existing strain F11 annotation in EMBL (yellow). (A and B) The correction of start and stop codons, respectively. In a more complex mapping situation (C), where all three reading frames are shown for clarity, RATT maps a large single coding sequence (CDS) from H37Rv to a locus within F11 that includes several in-frame stop codons. By inserting a frameshift (i.e. to indicate a pseudogene) the conceptual translation is preserved. This contrasts with two overlapping genes predicted as part of the F11 genome project.

Different criteria can be specified depending on the translation that an organism uses (e.g. bacterial TTG and GTG start codons) or whether unusual splice sites are used. RATT is programmed in PERL and was tested in UNIX/LINUX environments. The output can be loaded into Artemis/ACT. The list and explanation of all the output files can be found at the sourceforge site.
Publication 2011
Bacteria Base Sequence Chromosomes Codon, Initiator Codon, Terminator Contrast Media Deletion Mutation Exons Frameshift Mutation Gene Deletion Genes Genes, Overlapping Genome INDEL Mutation Insertion Mutation Introns Light Microtubule-Associated Proteins Mycobacterium tuberculosis H37Rv Open Reading Frames Pseudogenes Reading Frames Repetitive Region Single Nucleotide Polymorphism Strains Synteny Tissue Donors

Most recent protocols related to «Synteny»

To confirm the detected duplications, a combination of BLAST and synteny was used on the de novo-assembled genome. Only the insertions that segregate in the 6 new genomes were used (398). For each gene, the corresponding sequence from the TAIR10 annotation was located in the target genome using BLAST (see Additional file 1: Fig. S6). A threshold of 70% sequence identity as well as 70% of the initial sequence length was used. The presence of a match within 20 kb of the predicted peak position was interpreted as confirmation.
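The confirmation criteria (at least 70% identity, at least 70% of the gene length, and a match within 20 kb of the predicted peak) can be expressed as a small filter. The Python sketch below assumes BLAST hits have already been parsed into dictionaries; the key names and the function name are hypothetical, not part of the published workflow.

```python
def is_confirmed(hits, gene_length, chrom, peak_pos,
                 min_identity=70.0, min_length_frac=0.70,
                 max_distance=20_000):
    """Return True if any hit supports an insertion near the predicted peak.

    hits -- iterable of dicts with keys: chrom, start, end, pident, length
            (an assumed layout for parsed BLAST tabular output)
    """
    for h in hits:
        if h["chrom"] != chrom:
            continue
        if h["pident"] < min_identity:
            continue
        if h["length"] < min_length_frac * gene_length:
            continue
        lo, hi = sorted((h["start"], h["end"]))  # hits may be on the minus strand
        if lo <= peak_pos <= hi:
            distance = 0
        else:
            distance = min(abs(lo - peak_pos), abs(hi - peak_pos))
        if distance <= max_distance:
            return True
    return False
```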
Publication 2023
Genes Genome Insertion Mutation Synteny
For nucleotide-based comparative synteny analysis, we ran lastz77 (version 1.04.00) with the notransition and nogapped options and step = 20 to search for homologous sequences of two genomic sequences, the P. pacificus reference genome versus the P. exspectatus de novo assembly. For P. pacificus, the ‘El_Paco’ reference genome was used36 (link). Only homology at unique sites in the P. exspectatus genome was selected. Pairs of 100 kb non-overlapping sliding windows of the two genome sequences having at least 10 kb sequence homology are visualized in the circos plot (Fig. 2a). To visualize any small homology in the dot-plot analysis, we calculated the P. pacificus genome coordinate homologous to any nucleotides of P. exspectatus as distance from the start site of homology divided by the length of homology. The plot is downsized by 1/100 for visualization. Inversions larger than 100 kb were detected manually from dot-plot analysis, and the position was identified in the 10 kb scale.
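The window-based filtering step (pairs of 100 kb windows sharing at least 10 kb of homology) can be sketched as follows. This Python example assumes the alignments are already available as simple tuples, and it assigns each alignment to a window by its start coordinate instead of splitting it across window boundaries. Both are simplifications, and the function name is hypothetical.

```python
from collections import defaultdict

def window_pairs(alignments, window=100_000, min_homology=10_000):
    """Sum aligned length per pair of non-overlapping windows.

    alignments -- iterable of (chrom_a, start_a, chrom_b, start_b, length)
    Returns {(chrom_a, window_index_a, chrom_b, window_index_b): total_length}
    for window pairs whose total homology reaches min_homology.
    """
    totals = defaultdict(int)
    for chrom_a, start_a, chrom_b, start_b, length in alignments:
        key = (chrom_a, start_a // window, chrom_b, start_b // window)
        totals[key] += length
    return {k: v for k, v in totals.items() if v >= min_homology}
```

Each retained key corresponds to one link that would be drawn between the two genomes in a circular plot.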
For the protein-based comparative synteny analysis, one-to-one orthologues between species were identified as best reciprocal hits with the help of the get_BRH.pl script from the Perl Package for Customized Annotation Computing package73 (link). For C. elegans, the dataset of WormBase ParaSite release 14 (https://parasite.wormbase.org) was used.
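Best reciprocal hits are simple to express in code. The following Python sketch is a generic illustration of the idea and is not the get_BRH.pl script: it assumes each search result has been reduced to `(query, subject, bitscore)` tuples, and the function names are invented for this example.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores -- iterable of (query, subject, bitscore)
    """
    best = {}
    for query, subject, score in scores:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def best_reciprocal_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

A gene whose best hit points back to a different gene is dropped, which is what restricts the result to one-to-one pairs.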
Publication 2023
Caenorhabditis elegans Genome Homologous Sequences Inversion, Chromosome Nucleotides Parasites Proteins Synteny
Genes and gene trees were downloaded from Ensembl v.92 (ref. 68 (link)) and Ensembl Plants v.41 (ref. 69 (link)). Ensembl v.92 gene trees were edited for poorly supported duplication nodes as described previously70 (link), as part of the standard build procedure for the Genomicus synteny database. Of note, this step only marginally improves ancestral genome reconstructions and is not a prerequisite to use AGORA. The species trees for the extant and ancestral genomes from Ensembl v.92 and Ensembl Plants v.41 are described in Supplementary File 1.
Publication 2023
Genes Genome Plants Reconstructive Surgical Procedures Synteny Trees
In a previous study, gDNA extracted from Blue 58, CS, and Th. ponticum was digested using HaeIII (New England Biolabs, America) and then specific fragments (400–450 bp) were selected for the specific-locus amplified fragment sequencing (SLAF-seq) analysis. The SLAFs of 4Ag were obtained following a comparison with the A-genome, D-genome, and the SLAFs of CS and Th. ponticum. Our group developed 573 markers specific to chromosome 4Ag on the basis of these SLAFs of 4Ag (Liu et al., 2018a (link)), of which 223 markers specific to chromosome 4AgS were used for the subsequent analysis. First, the SLAFs related to these markers were aligned with the Thinopyrum elongatum (Host) D.R. Dewey (2n = 2x = 14, EE) genome (Wang et al., 2020a (link)) to reveal the homoeologous group that 4Ag belongs to and with the Th. ponticum genome (unpublished) to identify the specific chromosome. The comparisons were completed using Bowtie2, with one mismatch accepted. A sequence was considered correct if the physical distance between the forward and reverse SLAFs was less than 500 bp. These SLAFs were further mapped to a specific chromosome to determine their physical positions using BLAST, with an E-value cutoff of 1e-5. Depending on the physical positions, some markers were selected and amplified by PCR using the Blue 58 gDNA. The amplified products were sequenced and aligned with the Th. ponticum and Th. elongatum genomes. Syntenic relationships were visualized using MapChart 2.32 (Voorrips, 2002 (link)). Finally, the chromosomal regions of four wheat–Th. ponticum translocation lines (L1, WTT139, WTT146, and WTT323) were determined according to a PCR amplification of these selected markers. The PCR amplification and detection were performed as described by Liu et al. (2018a) (link).
Publication 2023
Chromosomes Genome Physical Examination Synteny Translocation, Chromosomal Triticum aestivum
The whole-genome assembly sequences of QG10 were compared with the rice reference genome sequence (Oryza_sativa_MSU7 version) using the software package MUMmer v3 (Kurtz et al., 2004 (link)). Based on the MUMmer results, the sequence variations and SVs were further re-called using the software package BLAST. The synteny/inversion comparisons were analyzed using GenomeSyn_Win.v1 (Zhou et al., 2022 (link)). At the site of each sequence variant, the genotypic information for QG10, Nipponbare, and the elite variety having important genes was called according to the results of the one-to-one alignments. The allelic information of sequence variants was detected based on gff files from the Oryza_sativa_MSU7 version. The software packages ClustalW v1.8.3 (Thompson et al., 1994 (link)) and BLAST v2.2.31 were used to re-detect the sequence variations and to perform detailed haplotype analyses for the well-characterized genes in rice (Zhao et al., 2018b (link)).
Publication 2023
Alleles FCER2 protein, human Genes Genetic Diversity Genome Genotype Haplotypes Inversion, Chromosome Oryza sativa Synteny

Top products related to «Synteny»

Sourced in United States, China, Germany, United Kingdom, Hong Kong, Canada, Switzerland, Australia, France, Japan, Italy, Sweden, Denmark, Cameroon, Spain, India, Netherlands, Belgium, Norway, Singapore, Brazil
The HiSeq 2000 is a high-throughput DNA sequencing system designed by Illumina. It utilizes sequencing-by-synthesis technology to generate large volumes of sequence data. The HiSeq 2000 is capable of producing up to 600 gigabases of sequence data per run.
Sourced in United States, China, Germany, United Kingdom, Canada, Switzerland, Sweden, Japan, Australia, France, India, Hong Kong, Spain, Cameroon, Austria, Denmark, Italy, Singapore, Brazil, Finland, Norway, Netherlands, Belgium, Israel
The HiSeq 2500 is a high-throughput DNA sequencing system designed for a wide range of applications, including whole-genome sequencing, targeted sequencing, and transcriptome analysis. The system utilizes Illumina's proprietary sequencing-by-synthesis technology to generate high-quality sequencing data with speed and accuracy.
Sourced in Germany, United States, United Kingdom, Canada, China, Spain, Netherlands, Japan, France, Italy, Switzerland, Australia, Sweden, Portugal, India
The DNeasy Plant Mini Kit is designed for the isolation and purification of DNA from plant samples. It uses silica-based membrane technology to extract and concentrate DNA from a variety of plant materials.
Sourced in United States
Lasergene v7.1 is a software suite for DNA and protein sequence analysis and visualization. It provides a range of tools for sequence alignment, primer design, and sequence manipulation. The core function of Lasergene v7.1 is to enable users to analyze and manage biological sequence data.
Sourced in United Kingdom, United States
The PromethION platform is a high-throughput, real-time DNA/RNA sequencing system developed by Oxford Nanopore Technologies. It utilizes nanopore technology to detect and analyze the electrical signals generated as molecules pass through nanopores, enabling long-read sequencing.
Sourced in United States, China, Germany, United Kingdom, Australia, Canada, India, Switzerland, Cameroon, Portugal, Brazil, Japan
The HiSeq platform is a high-throughput DNA sequencing system developed by Illumina. The core function of the HiSeq platform is to perform large-scale genomic analysis by generating high-quality sequence data efficiently.
Sourced in Germany, United States
The Large-Construct Kit is a laboratory kit designed for the preparation and purification of large DNA constructs. It provides a reliable and efficient method for isolating and concentrating high-molecular-weight DNA fragments, enabling researchers to work with larger genetic materials for various applications.
Sourced in United States, Germany
Illustrator CS6 is a vector graphics editing software that allows users to create and manipulate vector-based images, such as logos, illustrations, and graphics. It provides a range of tools and features for designing and editing vector artwork.
Sourced in United States
Lasergene Genomics Suite is a comprehensive software package for sequence analysis, assembly, and visualization. It provides tools for nucleic acid and protein sequence analysis, multiple sequence alignment, phylogenetic tree construction, and more.
Sourced in United States
MegAlign is a multiple sequence alignment software tool. It aligns DNA, RNA, or protein sequences and displays the alignment results. MegAlign supports various alignment algorithms and provides visualization options to analyze the alignment.

More about "Synteny"

Synteny analysis is a key technique in bioinformatics and genomics, helping scientists understand the structure and evolution of genomes.
Some related terms and concepts include genomic co-linearity, chromosome-level sequence alignments, and the identification of orthologous genes.
Synteny analysis typically starts from sequence data generated on high-throughput DNA sequencing platforms such as the HiSeq 2000, HiSeq 2500, and PromethION.
Data analysis and visualization tools like Lasergene v7.1, Illustrator CS6, and the Lasergene Genomics Suite can also be employed.
The DNeasy Plant Mini Kit and Large-Construct Kit are examples of molecular biology tools that can be used to prepare samples for synteny analysis.
Researchers may also utilize methods like whole-genome alignment and local collinearity detection to identify syntenic regions between species.
Synteny analysis is a crucial component of comparative genomics, allowing scientists to uncover the evolutionary history and functional relationships between different organisms.
By leveraging the power of this technique, researchers can gain unprecedented insights into the structure, organization, and evolution of genomes, furthering our understanding of biology and the natural world.
Exploring the latest advancements in synteny analysis can unlock new possibilities in your research endeavors.