The largest database of trusted experimental protocols

Base Pairing

Base pairing is a fundamental concept in molecular biology, where complementary nucleic acid bases (adenine-thymine in DNA, adenine-uracil in RNA, and guanine-cytosine in both) form stable hydrogen bonds, enabling the double-helix structure of DNA and the secondary structure of RNA.
This process is essential for accurate genetic information storage, replication, and expression.
Understanding base pairing is crucial for a wide range of biological research, from DNA sequencing to structural biology and drug design.
Experience the power of this core molecular mechanism and accelerate your research with innovative tools that optimize base pairing protocols.

Most cited protocols related to «Base Pairing»

Table 1 illustrates the wide range of operations that BEDTools supports. Many of the tools have extensive parameters that allow user-defined overlap criteria and fine control over how results are reported. Importantly, we have also defined a concise format (BEDPE) to facilitate comparisons of discontinuous features (e.g. paired-end sequence reads) to each other (pairToPair), and to genomic features in traditional BED format (pairToBed). This functionality is crucial for interpreting genomic rearrangements detected by paired-end mapping, and for identifying fusion genes or alternative splicing patterns by RNA-seq. To facilitate comparisons with data produced by current DNA sequencing technologies, intersectBed and pairToBed compute overlaps between sequence alignments in BAM format (Li et al., 2009), and a general purpose tool is provided to convert BAM alignments to BED format, thus facilitating the use of BAM alignments with all other BEDTools (Table 1). The following examples illustrate the use of intersectBed to isolate single nucleotide polymorphisms (SNPs) that overlap with genes, pairToBed to create a BAM file containing only those alignments that overlap with exons, and intersectBed coupled with samtools to create a SAM file of alignments that do not intersect (-v) with repeats.

Summary of supported operations available in the BEDTools suite

Utility              Description
intersectBed*        Returns overlaps between two BED files.
pairToBed            Returns overlaps between a BEDPE file and a BED file.
bamToBed             Converts BAM alignments to BED or BEDPE format.
pairToPair           Returns overlaps between two BEDPE files.
windowBed            Returns overlaps between two BED files within a user-defined window.
closestBed           Returns the closest feature to each entry in a BED file.
subtractBed*         Removes the portion of an interval that is overlapped by another feature.
mergeBed*            Merges overlapping features into a single feature.
coverageBed*         Summarizes the depth and breadth of coverage of features in one BED file relative to another.
genomeCoverageBed    Histogram or a ‘per base’ report of genome coverage.
fastaFromBed         Creates FASTA sequences from BED intervals.
maskFastaFromBed     Masks a FASTA file based upon BED coordinates.
shuffleBed           Permutes the locations of features within a genome.
slopBed              Adjusts features by a requested number of base pairs.
sortBed              Sorts BED files in useful ways.
linksBed             Creates HTML links from a BED file.
complementBed*       Returns intervals not spanned by features in a BED file.

Utilities in bold support sequence alignments in BAM. Utilities with an asterisk were compared with Galaxy and found to yield identical results.

Other notable tools include coverageBed, which calculates the depth and breadth of genomic coverage of one feature set (e.g. mapped sequence reads) relative to another; shuffleBed, which permutes the genomic positions of BED features to allow calculations of statistical enrichment; mergeBed, which combines overlapping features; and utilities that search for nearby yet non-overlapping features (closestBed and windowBed). BEDTools also includes utilities for extracting and masking FASTA sequences (Pearson and Lipman, 1988) based upon BED intervals. Tools with similar functionality to those provided by Galaxy were directly compared for correctness using the ‘knownGene’ and ‘RepeatMasker’ tracks from the hg19 build of the human genome. The results from all analogous tools were found to be identical (Table 1).
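The merging of overlapping features described for mergeBed can be illustrated with a minimal Python sketch. This is not BEDTools code: the function name `merge_intervals`, the `(chrom, start, end)` tuple layout, and the choice to also merge intervals that merely touch are assumptions made for illustration, using 0-based half-open coordinates as in BED.

```python
from typing import List, Tuple

# (chrom, start, end) with 0-based, half-open coordinates (BED-style).
Interval = Tuple[str, int, int]


def merge_intervals(features: List[Interval]) -> List[Interval]:
    """Merge overlapping (or touching) intervals on the same chromosome."""
    merged: List[Interval] = []
    # Sorting by chromosome, then start, lets one linear pass do the merge.
    for chrom, start, end in sorted(features):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            # Extend the previous interval instead of opening a new one.
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


example = [("chr1", 10, 20), ("chr1", 15, 30), ("chr1", 40, 50), ("chr2", 5, 8)]
print(merge_intervals(example))
```

Sorting first is the design choice that keeps the pass linear after the sort; an unsorted input would otherwise require comparing every pair of features.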
Publication 2010
Exons Gene Fusion Gene Rearrangement Genes Genome Genome, Human Sequence Alignment Single Nucleotide Polymorphism
Some sequences, or even entire reads, can be overrepresented in FASTQ data. Analysis of these overrepresented sequences provides an overview of certain sequencing artifacts such as PCR over-duplication, polyG tails and adapter contamination. FASTQC offers an overrepresented sequence analysis module; however, according to the author’s introduction, FASTQC only tracks the first 1 M reads of the input file to conserve memory. We suggest that inferring the overall distribution from the first 1 M reads is not a reliable solution, as the initial reads in Illumina FASTQ data usually originate from the edges of flowcell lanes, which may have lower quality and different patterns than the overall distribution.
Unlike FASTQC, fastp samples all reads evenly to evaluate overrepresented sequences and eliminate partial distribution bias. To achieve an efficient implementation of this feature, we designed a two-step method. In the first step, fastp completely analyzes the first 1.5 M base pairs of the input FASTQ to obtain a list of sequences with relatively high occurrence frequency in different sizes. In the second step, fastp samples the entire file and counts the occurrence of each sequence. Finally, the sequences with high occurrence frequency are reported.
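The two-step method described above can be sketched in Python as follows. This is an illustrative approximation, not fastp's implementation: the function names, the substring sizes, the `min_count` threshold and the sampling `step` are all hypothetical values chosen for the example, and the second step counts reads that contain a candidate, not every occurrence.

```python
from collections import Counter
from typing import Dict, List, Sequence


def find_candidates(reads: Sequence[str], base_budget: int = 1_500_000,
                    sizes: Sequence[int] = (10, 20), min_count: int = 2) -> List[str]:
    """Step 1: fully analyze reads until the base budget is spent."""
    seen: Counter = Counter()
    used = 0
    for read in reads:
        if used >= base_budget:
            break
        used += len(read)
        for k in sizes:
            # Count every substring of length k in this read.
            for i in range(len(read) - k + 1):
                seen[read[i:i + k]] += 1
    return [seq for seq, n in seen.items() if n >= min_count]


def count_sampled(reads: Sequence[str], candidates: Sequence[str],
                  step: int = 20) -> Dict[str, int]:
    """Step 2: sample every `step`-th read across the whole input."""
    counts = {c: 0 for c in candidates}
    for idx, read in enumerate(reads):
        if idx % step:
            continue  # skip reads that are not part of the even sample
        for c in candidates:
            if c in read:
                counts[c] += 1
    return counts


reads = ["AAAACCCC", "AAAACCCC", "GGGGTTTT"]
candidates = find_candidates(reads, sizes=(8,), min_count=2)
print(count_sampled(reads, candidates, step=1))
```

Restricting the second pass to a short candidate list is what keeps memory bounded while still sampling the entire file.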
Besides the occurrence frequency, fastp also records the positions of overrepresented sequences. This information is quite useful for diagnosing sequence quality issues. Some sequences tend to appear in the read head whereas others appear more often in the read tail. The distribution of overrepresented sequences is visualized in the HTML report. Figure 5 shows a demonstration of overrepresented sequence analysis results.
Publication 2018
Head Memory Poly G Sequence Analysis Tail
Genomic analyses often seek to compare features that are discovered in an experiment to known annotations for the same species. When genomic features from two distinct sets share at least one base pair in common, they are defined as ‘intersecting’ or ‘overlapping’. For example, a typical question might be ‘Which of my novel genetic variants overlap with exons?’ One straightforward approach to identify overlapping features is to iterate through each feature in set A and repeatedly ask if it overlaps with any of the features in set B. While effective, this approach is unreasonably slow when screening for overlaps between, for example, millions of DNA sequence alignments and the RepeatMasker (Smit et al., 1996–2004) track for the human genome. This inefficiency is compounded when asking more complicated questions involving many disparate sets of genomic features. BEDTools was developed to efficiently address such questions without requiring an installation of the UCSC or Galaxy browsers. The BEDTools suite is designed for use in a UNIX environment and works seamlessly with existing UNIX utilities (e.g. grep, awk, sort, etc.), thereby allowing complex experiments to be conducted with a single UNIX pipeline.
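The straightforward approach described above, which the text calls effective but slow, can be written as a short Python sketch. It is shown only to make the definition of ‘overlapping’ concrete and is not how BEDTools is implemented; the function names and the `(chrom, start, end)` tuple layout with 0-based half-open coordinates are assumptions for illustration.

```python
from typing import List, Tuple

# (chrom, start, end) with 0-based, half-open coordinates (BED-style).
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if a and b share at least one base pair on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def naive_intersect(set_a: List[Interval], set_b: List[Interval]) -> List[Interval]:
    """Return features of A that overlap any feature of B.

    Every feature in A is compared against every feature in B, so the
    work grows with len(set_a) * len(set_b).
    """
    return [a for a in set_a if any(overlaps(a, b) for b in set_b)]


variants = [("chr1", 100, 101), ("chr1", 500, 501), ("chr2", 100, 101)]
exons = [("chr1", 50, 150), ("chr2", 200, 300)]
print(naive_intersect(variants, exons))
```

With millions of alignments in A and a genome-wide annotation track in B, the product of the two set sizes is what makes this nested comparison impractical.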
Publication 2010
Base Pairing Exons Genetic Diversity Genome Genome, Human Sequence Alignment

Protocol full text hidden due to copyright restrictions

Publication 2014
Cell Lines Cell Nucleus Cells Formaldehyde Ligation Microtubule-Associated Proteins Nucleotides Streptavidin Technique, Dilution

Protocol full text hidden due to copyright restrictions

Publication 2013
Base Pairing Chromatin Immunoprecipitation Sequencing Genes Genome Genome, Human Macrophage-1 Antigen Mus Transcription Factor

Most recent protocols related to «Base Pairing»

Example 5

We studied the effect of CH25H KO in AD pathogenesis. sgRNAs targeting CH25H were designed, with sgRNA1 being SEQ ID NO: 1 and sgRNA2 being SEQ ID NO: 2. See FIGS. 12A and 12B. With the two sgRNAs, the CH25H gene was knocked out by the CRISPR/Cas9 method so that 46 base pairs (bp) were deleted in the exon of the CH25H gene (see FIG. 12C), resulting in CH25H knockout (KO) mice.

In the CH25H KO mice, the deletion of the 46 bp fragment of the CH25H gene was detected, with the 488 bp band being the deleted CH25H gene and the 534 bp band being the wild-type gene. The expression of CH25H mRNA in the CH25H KO mice was significantly reduced (FIG. 12E).

Once crossed to 5XFAD mice, the CH25H KO mice showed a phenotype similar to that of STAT1 KO mice. Aβ was greatly reduced in both immunostaining and ELISA quantification (FIGS. 13A, 13B and 13C, respectively). Conversely, mice injected with 25-OHC had a significantly high amount of Aβ (FIGS. 11A, 11B and 11C).

To test the effect of reduced Aβ on cognitive abilities, the mice were examined in the water maze. 5XFAD mice gradually learned to locate the platform underneath the water, while the CH25H KO mice took significantly (p<0.05) less time to find the platform, indicating that they performed better in the learning and memory task (FIG. 14).

Patent 2024
Cognition Deletion Mutation Enzyme-Linked Immunosorbent Assay Exons Genes Memory Mice, Knockout Mice, Laboratory pathogenesis Phenotype RNA RNA, Messenger STAT1 protein, human
As detailed above, RNA and DNA were extracted in parallel and RNA was subsequently reverse transcribed. DNA and cDNA extracts were used as templates for PCR-based amplification for 16S rRNA gene/transcript V4 amplicon generation using the following primers: 515F-ACACTGACGACATGGTTCTACAGTGCCAGCMGCCGCGGTAA and 806R-TACGGTAGCAGAGACTTGGTCTGGACTACNVGGGTWTCTAAT, and the following thermocycling program: 94°C for 3 min, followed by 32 × [94°C for 45 s, 50°C for 60 s, 72°C for 90 s], 72°C for 10 min, and a 4°C hold. Amplicon sequencing was performed on an Illumina MiSeq platform. Sequence analyses were performed using the DADA2 package [63] implemented in R. Briefly, forward and reverse reads were trimmed with the filterAndTrim() command using the following parameters: trimLeft = c(20,20), maxEE = c(2,2), phix = TRUE, multithread = TRUE, minLen = 120, followed by error assessments and independent forward and reverse read de-replication. Sequencing errors were removed using the dada() command and error-free forward and reverse reads were merged using the mergePairs() command, specifying overhang trimming and a minimum overlap of 120 base pairs. The resulting amplicon sequence variants (ASVs) were assigned taxonomy by alignment against the SILVA 132 database [64]. ASV count tables and taxonomy assignments were merged into an S4 object for diversity analysis and summary visualization using vegan in phyloseq [65].
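To make the trimming parameters above concrete, the following Python sketch applies the same three criteria to a single read: remove the first 20 bases, discard reads shorter than 120 bases, and discard reads whose expected errors exceed 2, where expected errors are the sum of 10^(-Q/10) over the Phred quality scores. It is a simplified illustration and not the DADA2 implementation (which is an R package); the function names are hypothetical, and PhiX removal and paired-read handling are not modeled.

```python
from typing import List, Optional, Tuple


def expected_errors(quals: List[int]) -> float:
    """Sum of per-base error probabilities implied by Phred scores."""
    return sum(10 ** (-q / 10) for q in quals)


def filter_and_trim(seq: str, quals: List[int], trim_left: int = 20,
                    max_ee: float = 2.0, min_len: int = 120
                    ) -> Optional[Tuple[str, List[int]]]:
    """Return the trimmed read, or None if it fails a filter."""
    # Remove the leading bases (e.g. primer-derived sequence).
    seq, quals = seq[trim_left:], quals[trim_left:]
    if len(seq) < min_len:
        return None  # too short after trimming
    if expected_errors(quals) > max_ee:
        return None  # too many expected errors
    return seq, quals


good = filter_and_trim("A" * 150, [30] * 150)
print(len(good[0]) if good else None)
```

For paired-end data, as in the protocol above, a read pair would be kept only if both the forward and the reverse read pass.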
Publication 2023
DNA, Complementary DNA Replication Gene Amplification Genes Genetic Diversity Oligonucleotide Primers RNA, Ribosomal, 16S Vegan
Output files from Illumina MiSeq were first run through FastQC (Andrews et al. 2018) to check read quality. The paired-end reads were merged using PEAR (Stamatakis et al. 2014) set to a minimum assembly length of 150 base pairs, allowing for high quality scores at both ends of the sequence. Adapters were trimmed from the ends of the antibiotic resistance genes' coding sequence using Trimmomatic (Bolger et al. 2014). Enrich2 (Rubin et al. 2017) was used to count the frequency of each allele for use in calculating selection coefficients and associated statistical measures. We set Enrich2 to filter out any reads containing bases with a quality score below 20, bases marked as N, or mutations at more than one codon.
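The three read filters listed above can be expressed as a small Python sketch. This is not Enrich2 code or its configuration format; the function names are hypothetical, and the sketch assumes each read has already been trimmed to the in-frame coding sequence and has the same length as the wild-type sequence.

```python
from typing import List


def count_mutated_codons(read: str, wild_type: str) -> int:
    """Number of codons that differ from the wild-type sequence."""
    return sum(read[i:i + 3] != wild_type[i:i + 3]
               for i in range(0, len(wild_type), 3))


def keep_read(read: str, quals: List[int], wild_type: str, min_q: int = 20) -> bool:
    """Apply the quality, N-base and single-codon filters to one read."""
    if len(read) != len(wild_type) or len(quals) != len(read):
        return False  # assumption: reads must span the whole coding sequence
    if any(q < min_q for q in quals):
        return False  # at least one low-quality base
    if "N" in read:
        return False  # ambiguous base call
    return count_mutated_codons(read, wild_type) <= 1


wt = "ATGAAACCC"
print(keep_read("ATGGGGCCC", [30] * 9, wt))  # one codon changed
print(keep_read("ATGGGGCCA", [30] * 9, wt))  # two codons changed
```

Counting differences per codon instead of per base means a read with two or three substitutions inside the same codon still passes, which matches a filter defined on codons.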
Fitness of an allele (wi) was calculated from the enrichment of the synonyms of the wild-type gene (εwt), the enrichment of allele i (εi) and the fold increase in the number of cells during the growth competition experiment (r) as described by Equation 1. We utilize the frequency of wildtype synonymous alleles as the reference instead of the frequency of wildtype because wildtype synonyms occurred more frequently in the library and wildtype sequencing counts are more prone to being affected by the artifact of PCR template jumping during the preparation of barcoded amplicons for deep-sequencing. Detailed derivations of the following equations (Equations 1–6) can be found in our previous work (Mehlhoff et al. 2020).
We calculate the variance in the fitness as
where the frequency of allele (fi) is calculated from counts of that allele (ci) and the total sequencing counts (cT).
From the variance in fitness, we calculated a 99% confidence interval. Additionally, we calculated a P-value using a 2-tailed test. Details of the Z-score and P-value equations are available in Mehlhoff et al. (2020).
We estimated the number of false positives that would be included at P < 0.01 and P < 0.001 significance in order to correct for multiple testing (Storey and Tibshirani 2003) in our DMS datasets as described previously (Mehlhoff et al. 2020). For TEM-1, we estimated that our data would contain approximately 55.0 false positives on average at P < 0.01 significance and an estimated 5.6 false positives on average at P < 0.001 significance for a single replica (Mehlhoff et al. 2020). Those values are 44.1 and 4.3 (CAT-I), 52.8 and 5.3 (NDM-1), and 33.8 and 3.4 (aadB) at P < 0.01 and P < 0.001 significance, respectively. We chose to report the frequency of mutations having fitness effects that met the P-value criteria in both replica experiments to limit the occurrence of false positives.
Publication 2023
Alleles Antibiotic Resistance, Microbial AT-001 Chloramphenicol O-Acetyltransferase Codon DNA Library Genes Mutation Open Reading Frames Pears
The TEM-1, NDM-1, CAT-I, aadB, and aac(6′)-Im antibiotic resistance genes were individually placed under control of the IPTG-inducible tac promoter on pSKunk1, a minor variant of plasmid pSKunk3 (AddGene plasmid #61531) (Firnberg and Ostermeier 2012). The CAT-I gene was amplified from pKD3 (AddGene plasmid #45604). An A to C mutation was made at base pair 219 within the CAT-I gene using the QuikChange Lightning Site-Directed Mutagenesis kit (Agilent) to match the native E. coli CAT-I sequence. The NDM-1, aadB, and aac(6′)-Im genes were ordered as gene fragments with adapters from Twist Bioscience. We verified the correct size and antibiotic resistance gene sequence for each plasmid using agarose electrophoresis gels and Sanger sequencing, respectively, before transforming the resulting plasmids into electrocompetent NEB 5-alpha LacIq cells, which contain an F′ episome encoding LacI. We produced these electrocompetent cells starting from a single tube of NEB 5-alpha LacIq chemically competent cells. Chemically competent cells were plated on LB-agar containing 10 μg/ml tetracycline to ensure the presence of the F′ episome. A single colony was selected to produce electrocompetent cells, with aliquots of the resulting electrocompetent cells used for all experiments within this study. All growth experiments were conducted in LB media supplemented with glucose (2% w/v) and spectinomycin (50 µg/ml) to maintain the pSKunk1 plasmid except where otherwise noted. Expression of the antibiotic resistance proteins was induced by the addition of 1 mM IPTG to exponentially growing cultures.
Publication 2023
Agar Antibiotic Resistance, Microbial Base Pairing Cells Chloramphenicol O-Acetyltransferase Electrophoresis, Agar Gel Episomes Escherichia coli Genes Glucose Isopropyl Thiogalactoside Mutagenesis, Site-Directed Mutation Pancreatic alpha Cells Plasmids Proteins Spectinomycin Tetracycline
Each sequencing run was imported to QIIME2-2022.2 [62, 63] and processed individually. The datasets were divided into two distinct pipelines: (1) data that targeted the 16S rRNA gene V4 region of Bacteria and/or Archaea and (2) data that targeted the V3–V4 region of Bacteria and/or Archaea. For V4 datasets, the data were processed with cutadapt to remove sequencing primers corresponding to the respective study [64]. In total, three 515F primers that targeted the V4 region of the 16S rRNA gene were used across studies (5′-GTGCCAGCMGCCGCGGTAA-3′ (n = 1033) [31], 5′-GTGYCAGCMGCCGCGGTAA-3′ (n = 1219) [33], and 5′-ACACTGACGACATGGTTCTACAGTGCCAGCMGCCGCGGTAA-3′ (n = 79) [31, 32]; Supplementary Table 1). Next, the data were processed with DADA2 for quality control and denoising using a max error rate of three [65]. Although all runs were paired-end reads, the V4 samples were processed as single-end reads and the forward reads were truncated at 130 base pairs (bp) with the DADA2 program. The error rates, truncation, and single-end options were selected based on the quality and sequence length (Supplementary Table 1) of the lowest-quality reads across all datasets. The two V3–V4 datasets (n = 31 samples) were processed with the cutadapt program, which was used to select forward sequences that contained sequences similar to the 515F primers used in the V4 studies. The forward primer 515FY [33] was used as the target sequence using a 0.4 error rate to allow for some differences in bases. The selected sequences were then processed with DADA2 and truncated at 240 bp with a max error rate of one. Afterward, if a study had multiple Illumina sequencer runs, the runs were first merged, and then all studies were merged into one count table and sequence file. The vsearch cluster-features-de-novo function was then used to cluster the data by 99% similarity [66].
The classify-consensus-vsearch option was then used for taxonomy assignments with the SILVA-138-99 database [67]. The data were then filtered to remove mitochondria and chloroplast reads. All analyses were conducted at the ASV level.
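The primer-matching and truncation steps above rely on degenerate (IUPAC) bases such as Y and M in the 515F primers and on a tolerated error rate. The Python sketch below illustrates the idea under simplifying assumptions: it only checks for the primer at the very start of the read, counts mismatches without allowing insertions or deletions, and sets the number of allowed errors to the error rate times the primer length, rounded down. It is not the cutadapt or DADA2 implementation, and the function names are hypothetical.

```python
from typing import Optional

# Bases each IUPAC code stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_mismatches(primer: str, read: str) -> int:
    """Count read bases not permitted by the primer's IUPAC codes."""
    return sum(base not in IUPAC[code] for code, base in zip(primer, read))


def trim_primer(read: str, primer: str, max_error_rate: float = 0.1) -> Optional[str]:
    """Strip a 5' primer if it matches within the error budget, else None."""
    if len(read) < len(primer):
        return None
    allowed = int(max_error_rate * len(primer))
    if primer_mismatches(primer, read) > allowed:
        return None
    return read[len(primer):]


def truncate(read: str, length: int) -> Optional[str]:
    """Cut a read to a fixed length; reads that are too short are dropped."""
    return read[:length] if len(read) >= length else None


primer_515fy = "GTGYCAGCMGCCGCGGTAA"
read = "GTGCCAGCAGCCGCGGTAA" + "ACGT" * 40
trimmed = trim_primer(read, primer_515fy)
print(len(truncate(trimmed, 130)) if trimmed else None)
```

With the 19-base 515FY primer, an error rate of 0.4 in this sketch tolerates up to 7 mismatched positions, which shows how permissive that setting is compared with the default-like 0.1 used in the example call.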
Publication 2023
Archaea Bacteria Chloroplasts Genes Genes, Bacterial Mitochondria Oligonucleotide Primers Ribosomal RNA Genes RNA, Ribosomal, 16S

Top products related to «Base Pairing»

Sourced in United States, China, Germany, United Kingdom, Canada, Switzerland, Sweden, Japan, Australia, France, India, Hong Kong, Spain, Cameroon, Austria, Denmark, Italy, Singapore, Brazil, Finland, Norway, Netherlands, Belgium, Israel
The HiSeq 2500 is a high-throughput DNA sequencing system designed for a wide range of applications, including whole-genome sequencing, targeted sequencing, and transcriptome analysis. The system utilizes Illumina's proprietary sequencing-by-synthesis technology to generate high-quality sequencing data with speed and accuracy.
Sourced in United States, China, Germany, United Kingdom, Hong Kong, Canada, Switzerland, Australia, France, Japan, Italy, Sweden, Denmark, Cameroon, Spain, India, Netherlands, Belgium, Norway, Singapore, Brazil
The HiSeq 2000 is a high-throughput DNA sequencing system designed by Illumina. It utilizes sequencing-by-synthesis technology to generate large volumes of sequence data. The HiSeq 2000 is capable of producing up to 600 gigabases of sequence data per run.
Sourced in United States, Germany, Canada, China, France, United Kingdom, Japan, Netherlands, Italy, Spain, Australia, Belgium, Denmark, Switzerland, Singapore, Sweden, Ireland, Lithuania, Austria, Poland, Morocco, Hong Kong, India
The Agilent 2100 Bioanalyzer is a lab instrument that provides automated analysis of DNA, RNA, and protein samples. It uses microfluidic technology to separate and detect these biomolecules with high sensitivity and resolution.
Sourced in United States, Germany, China, United Kingdom, Australia, France, Italy, Canada, Japan, Austria, India, Spain, Switzerland, Cameroon, Netherlands, Czechia, Sweden, Denmark
The NextSeq 500 is a high-throughput sequencing system designed for a wide range of applications, including gene expression analysis, targeted resequencing, and small RNA discovery. The system utilizes reversible terminator-based sequencing technology to generate high-quality, accurate DNA sequence data.
Sourced in United States, China, United Kingdom, Hong Kong, France, Canada, Germany, Switzerland, India, Norway, Japan, Sweden, Cameroon, Italy
The HiSeq 4000 is a high-throughput sequencing system designed for generating large volumes of DNA sequence data. It utilizes Illumina's sequencing-by-synthesis technology to produce accurate and reliable results. The HiSeq 4000 can generate up to 1.5 terabases of data per run, making it suitable for a wide range of applications, including whole-genome sequencing, targeted sequencing, and transcriptome analysis.
Sourced in United States, China, United Kingdom, Japan, Germany, Canada, Hong Kong, Australia, France, Italy, Switzerland, Sweden, India, Denmark, Singapore, Spain, Cameroon, Belgium, Netherlands, Czechia
The NovaSeq 6000 is a high-throughput sequencing system designed for large-scale genomic projects. It utilizes Illumina's sequencing by synthesis (SBS) technology to generate high-quality sequencing data. The NovaSeq 6000 can process multiple samples simultaneously and is capable of producing up to 6 Tb of data per run, making it suitable for a wide range of applications, including whole-genome sequencing, exome sequencing, and RNA sequencing.
Sourced in United States, China, Japan, Germany, United Kingdom, Canada, France, Italy, Australia, Spain, Switzerland, Netherlands, Belgium, Lithuania, Denmark, Singapore, New Zealand, India, Brazil, Argentina, Sweden, Norway, Austria, Poland, Finland, Israel, Hong Kong, Cameroon, Sao Tome and Principe, Macao, Taiwan, Province of China, Thailand
TRIzol reagent is a monophasic solution of phenol, guanidine isothiocyanate, and other proprietary components designed for the isolation of total RNA, DNA, and proteins from a variety of biological samples. The reagent maintains the integrity of the RNA while disrupting cells and dissolving cell components.
Sourced in United States, China, Germany, United Kingdom, Spain, Australia, Italy, Canada, Switzerland, France, Cameroon, India, Japan, Belgium, Ireland, Israel, Norway, Finland, Netherlands, Sweden, Singapore, Portugal, Poland, Czechia, Hong Kong, Brazil
The MiSeq platform is a benchtop sequencing system designed for targeted, amplicon-based sequencing applications. The system uses Illumina's proprietary sequencing-by-synthesis technology to generate sequencing data. The MiSeq platform is capable of generating up to 15 gigabases of sequencing data per run.
Sourced in Germany, United States, United Kingdom, Netherlands, Spain, Japan, Canada, France, China, Australia, Italy, Switzerland, Sweden, Belgium, Denmark, India, Jamaica, Singapore, Poland, Lithuania, Brazil, New Zealand, Austria, Hong Kong, Portugal, Romania, Cameroon, Norway
The RNeasy Mini Kit is a laboratory kit designed for the purification of total RNA from a variety of sample types, including animal cells, tissues, and other biological materials. The kit uses silica-based membrane technology to selectively bind and isolate RNA molecules, allowing for efficient extraction and recovery of high-quality RNA.
Sourced in United States, Germany, Canada, United Kingdom, France, China, Japan, Spain, Ireland, Switzerland, Singapore, Italy, Australia, Belgium, Denmark, Hong Kong, Netherlands, India
The 2100 Bioanalyzer is a laboratory instrument from Agilent Technologies. It is a microfluidic platform designed for the analysis of DNA, RNA, and proteins. The 2100 Bioanalyzer uses lab-on-a-chip technology to perform automated electrophoretic separations and detection.

More about "Base Pairing"

Molecular biology is a fascinating field that delves into the fundamental building blocks of life - nucleic acids and their intricate base pairing mechanisms.
At the heart of this discipline lies the concept of base pairing, where complementary nucleic acid bases (adenine-thymine, guanine-cytosine) form stable hydrogen bonds, enabling the iconic double-helix structure of DNA and the secondary structure of RNA.
This process is crucial for accurate genetic information storage, replication, and expression.
Understanding base pairing is essential for a wide range of biological research, from DNA sequencing using cutting-edge technologies like the HiSeq 2500, HiSeq 2000, NextSeq 500, HiSeq 4000, and NovaSeq 6000, to structural biology and drug design.
The Agilent 2100 Bioanalyzer and the MiSeq platform are just a few of the innovative tools that researchers use to study and optimize base pairing protocols.
These advanced instruments, coupled with powerful reagents like the TRIzol reagent and the RNeasy Mini Kit, enable researchers to gain unprecedented insights into the complex world of molecular biology.
Discover how PubCompare.ai enhances research accuracy by locating the best protocols from literature, pre-prints, and patents using AI-driven comparisons.
Identify the most effective products and accelerate your research with this innovative tool.
Experience the power of PubCompare.ai today and unlock the secrets of the molecular world.