Bioinformatics toolbox

Manufactured by MathWorks

Sourced in United States

The Bioinformatics Toolbox is a MATLAB toolbox that provides a comprehensive set of functions and algorithms for the analysis and processing of biological data. It includes tools for sequence analysis, phylogenetics, and biological pathway modeling.

Lab products found in correlation

Matlab, by MathWorks (8 mentions); Matlab 2020a, by MathWorks (1 mention); Lc q tof, by Bruker (1 mention); Next generation sequencing, by Eurofins (1 mention); Sds 2, by Thermo Fisher Scientific (1 mention); Expression console, by Thermo Fisher Scientific (1 mention); Masslynx v4, by Waters Corporation (1 mention); Pls toolbox, by Eigenvector Research (1 mention); Statistics toolbox, by MathWorks (1 mention); Pls toolbox 8, by Eigenvector Research (1 mention); Mcr als toolbox, by MathWorks (1 mention); Matlab 2019b, by MathWorks (1 mention); Hiseq 2500, by Illumina (1 mention); Matlab r2017b, by MathWorks (1 mention); G2565ca microarray scanner system, by Agilent Technologies (1 mention)

Spelling variants of this product

MATLAB Bioinformatics Toolbox (9 citations)

20 protocols using bioinformatics toolbox

Predicting TF Binding Site Types

We trained a random forest (RF) classifier to predict whether a TF binding site is direct or indirect from the proximal binding of other TFs in the co-binding region. We used the TreeBagger implementation of RF in MATLAB (MATLAB and Bioinformatics Toolbox Release 2015b, The MathWorks, Inc., Natick, Massachusetts, United States). Specifically, from the region-TF matrix (159,204 × 167), we took the rows that contained either direct or indirect binding sites of the TF, used the columns corresponding to the direct or indirect binding of the TF as the prediction target, and used the remaining columns (binding of the other TFs) as the features. For each sequence-specific co-binding TF, the dTF and iTF columns were combined, ignoring the motif information. The prediction is therefore harder, because it is based only on the identity, not the motif information, of the co-binding TFs. For each sequence-specific TF, we trained five RFs, each on a distinct random subset (80%) of the data, and tested each on the remaining 20%. The prediction accuracy values of the five classifiers were then averaged for each TF. The correlation between direct and indirect binding of a TF (shown in Fig. 5c) was computed using the corresponding TF vectors in the module-TF matrix.
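
The evaluation scheme described above (five random 80/20 splits with averaged accuracy) can be sketched in a few lines of Python. This is only an illustration of the hold-out procedure, not of the TreeBagger model: the `fit` callback, the `fit_majority` stand-in classifier, and the fixed `seed` are all hypothetical names and choices introduced here.

```python
import random

def repeated_holdout_accuracy(features, labels, fit, n_repeats=5,
                              train_frac=0.8, seed=0):
    """Average test accuracy over several random train/test splits.

    `fit(train_x, train_y)` must return a function mapping one feature
    row to a predicted label. It stands in for any classifier.
    """
    rng = random.Random(seed)          # fixed seed: an assumption, for repeatability
    n = len(labels)
    n_train = int(round(train_frac * n))
    accuracies = []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)               # a fresh random split on every repeat
        train, test = idx[:n_train], idx[n_train:]
        predict = fit([features[i] for i in train],
                      [labels[i] for i in train])
        correct = sum(predict(features[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)

def fit_majority(train_x, train_y):
    """Hypothetical stand-in classifier: always predict the majority label."""
    majority = max(set(train_y), key=train_y.count)
    return lambda row: majority
```

In practice the `fit` argument would wrap a real random forest implementation; the majority-class stand-in is there only to keep the sketch self-contained.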

Guo Y., & Gifford D.K. (2017). Modular combinatorial binding among human trans-acting factors reveals direct and indirect factor binding. BMC Genomics, 18, 45.

Diurnal Gene Expression Analysis of Three Plants

The diurnal expression data with 4-h intervals for Arabidopsis thaliana were obtained from Mockler et al. [75] and adjusted to a 2-h interval time series by interpolation using the SRS1 cubic spline function (http://www.srs1software.com/). The diurnal expression data with 2-h intervals for K. fedtschenkoi were generated in this study. The diurnal expression data with 2-h intervals for Ananas comosus were obtained from Ming et al. [4]. The gene expression data were normalized by Z-score transformation. Hierarchical clustering of gene expression was performed for genes in each ortholog group using the Bioinformatics Toolbox in Matlab (Mathworks, Inc.) based on Spearman correlation (Supplementary Method 14).
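
As a rough illustration of the two numerical steps named above, the following Python sketch computes a Z-score transformation and a Spearman-based distance between two expression profiles. It assumes the sample standard deviation (n - 1) for the Z-score and a distance of one minus the Spearman correlation; neither detail is stated in the protocol, and the clustering step itself is not shown.

```python
import math

def zscore(values):
    """Center to mean 0 and scale by the sample standard deviation (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = average_rank
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def spearman_distance(x, y):
    """One minus the Spearman rank correlation (assumed distance form)."""
    return 1.0 - pearson(ranks(x), ranks(y))
```

Because Spearman correlation depends only on ranks, the Z-score step does not change the resulting distances; it matters for the heatmap scale rather than for the tree.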

Yang X., Hu R., Yin H., Jenkins J., Shu S., Tang H., Liu D., Weighill D.A., Cheol Yim W., Ha J., Heyduk K., Goodstein D.M., Guo H.B., Moseley R.C., Fitzek E., Jawdy S., Zhang Z., Xie M., Hartwell J., Grimwood J., Abraham P.E., Mewalal R., Beltrán J.D., Boxall S.F., Dever L.V., Palla K.J., Albion R., Garcia T., Mayer J.A., Don Lim S., Man Wai C., Peluso P., Van Buren R., De Paoli H.C., Borland A.M., Guo H., Chen J.G., Muchero W., Yin Y., Jacobson D.A., Tschaplinski T.J., Hettich R.L., Ming R., Winter K., Leebens-Mack J.H., Smith J.A., Cushman J.C., Schmutz J., & Tuskan G.A. (2017). The Kalanchoë genome provides insights into convergent evolution and building blocks of crassulacean acid metabolism. Nature Communications, 8, 1899.

Quantitative Phosphoproteomics Analysis

To identify phosphopeptides with significant changes in abundance relative to the 0 s LightR-Src condition, we used a paired Student’s t-test (p-value < 0.05). Additionally, to ensure that phosphopeptide changes were not simply background, we filtered for peptides with average abundances (n = 3) that were >1.4 relative to the 0 s LightR-Src condition. The average abundance for each phosphopeptide was log2-transformed and visualized using MATLAB (version R2019b, Bioinformatics Toolbox version 4.13, MathWorks). Data were plotted with the ‘clustergram’ function with hierarchical clustering using Euclidean distance.

Shaaya M., Fauser J., Zhurikhina A., Conage-Pough J.E., Huyot V., Brennan M., Flower C.T., Matsche J., Khan S., Natarajan V., Rehman J., Kota P., White F.M., Tsygankov D., & Karginov A.V. (2020). Light-regulated allosteric switch enables temporal and subcellular control of enzyme activity. eLife, 9, e60647.

Downstream Cytokine/Chemokine/Growth Factor Analysis

Cited 1 time

Downstream analysis of data exported from xPONENT was performed using MATLAB 2020a with Bioinformatics Toolbox (MathWorks). Sample technical replicates were averaged, and the lowest dilution was generally used for all cytokines/chemokines/growth factors (C/C/GF). When the measured value exceeded the top of the standard curve, the lower 50-fold dilution value was used instead. No 50-fold dilution values exceeded the top standard value. When the measured value fell below the lowest standard, the concentration was linearly interpolated between the blank and the lowest standard MFI values, provided that the value was 4 standard deviations above background wells [27, 144, 145]. If all samples contained low values below the lowest standard, the cytokine was excluded from further analysis. For heatmap visualization and principal component analysis, C/C/GFs were z-scored across samples. Hierarchical clustering was done using Euclidean distance. For fold change samples, non-stimulated C/C/GFs with very low expression (>50% of samples below the lowest standard or any sample measuring 0 concentration) were removed from the analysis. For principal component plots, error ellipses were drawn using the error ellipse function (AJ Johnson (2020). error_ellipse, MATLAB Central File Exchange).
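
The rule for values below the lowest standard can be sketched as follows. This is a minimal Python illustration under stated assumptions: the blank is taken to have zero concentration, "4 standard deviations above background" is read as "at least blank MFI plus 4 SD", and the function and parameter names are hypothetical.

```python
def concentration_below_lowest_standard(mfi, blank_mfi, low_std_mfi,
                                        low_std_conc, background_sd,
                                        n_sd=4.0):
    """Estimate a concentration for a signal below the lowest standard.

    Interpolates linearly between the blank (assumed concentration 0) and
    the lowest standard. Returns None when the signal is not at least
    `n_sd` background standard deviations above the blank.
    """
    if mfi < blank_mfi + n_sd * background_sd:
        return None                     # indistinguishable from background
    fraction = (mfi - blank_mfi) / (low_std_mfi - blank_mfi)
    return low_std_conc * fraction
```

A signal halfway between the blank and the lowest standard therefore maps to half the lowest standard concentration, as long as it clears the background threshold.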

Wang A.J., Allen A., Sofman M., Sphabmixay P., Yildiz E., & Griffith L.G. (2021). Engineering Modular 3D Liver Culture Microenvironments In Vitro to Parse the Interplay between Biophysical and Biochemical Microenvironment Cues on Hepatic Phenotypes. Advanced NanoBiomed Research, 2(1), 2100049.

Identification and Analysis of Positively Selected Orthologs

Cited 2 times

The orthologous gene pairs between two species were identified by combining the Best Reciprocal Hits (BRH) and OrthoMCL strategies. The coding sequences were aligned using PAL2NAL [58], guided by the protein sequence alignment generated by MAFFT (linsi; version 7.045b) [59], and gaps in the alignment were removed. The gapless coding sequence alignments were used for Ka/Ks ratio calculation using the Bioinformatics Toolbox in Matlab (Mathworks, Inc.) with a 50-codon sliding window. To identify positively selected sites, coding sequences from Arabidopsis, maize, rice and Agave were aligned by TranslatorX [60] using the standalone script. The HyPhy package was used to identify positively selected sites as described [61], and the FUBAR and REL model tests as implemented in the Datamonkey webserver were used with default settings [62]. Because we used a sliding window to study the protein regions under positive selection, we tested the Ka/Ks-positive regions against the null hypothesis that Ka/Ks equals one using a one-sided t-test, as described by Schmid and Yang (2008) [63].

Yin H., Guo H.B., Weston D.J., Borland A.M., Ranjan P., Abraham P.E., Jawdy S.S., Wachira J., Tuskan G.A., Tschaplinski T.J., Wullschleger S.D., Guo H., Hettich R.L., Gross S.M., Wang Z., Visel A., & Yang X. (2018). Diel rewiring and positive selection of ancient plant proteins enabled evolution of CAM photosynthesis in Agave. BMC Genomics, 19, 588.

Phylogenetic Tree Construction from MSAs

The trees were constructed using subsamples of 4,000 sequences from the original MSAs. The Jukes–Cantor pairwise distance was calculated for the subsamples; it is defined as a maximum-likelihood estimate of the number of substitutions based on the Hamming distance between two sequences. The phylogenetic trees were constructed using the neighbor-joining method, assuming equal variance and independence of evolutionary distance estimates, as in refs. 81 and 82. No ancestral sequences were reconstructed. Both the Jukes–Cantor distance and the neighbor-joining implementations are provided in the Bioinformatics Toolbox in MATLAB (MathWorks, Inc.).
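
For reference, the Jukes–Cantor correction maps the fraction of differing sites p to a distance d = -b ln(1 - p/b), where b = (k - 1)/k and k is the alphabet size. The Python sketch below implements that formula for aligned, equal-length sequences. The protocol does not state the alphabet, so `alphabet_size` is an assumption left to the caller (4 for nucleotides, 20 for amino acids), and gap handling is not addressed.

```python
import math

def hamming_fraction(seq_a, seq_b):
    """Fraction of aligned positions at which the two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def jukes_cantor_distance(seq_a, seq_b, alphabet_size=4):
    """Jukes-Cantor distance: d = -b * ln(1 - p / b), b = (k - 1) / k."""
    b = (alphabet_size - 1) / alphabet_size
    p = hamming_fraction(seq_a, seq_b)
    if p >= b:
        return math.inf                 # saturated: the correction is undefined
    return -b * math.log(1.0 - p / b)
```

The corrected distance always exceeds the raw fraction p for p > 0, because it accounts for multiple substitutions at the same site.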

de la Paz J.A., Nartey C.M., Yuvaraj M., & Morcos F. (2020). Epistatic contributions promote the unification of incompatible models of neutral molecular evolution. Proceedings of the National Academy of Sciences of the United States of America, 117(11), 5873-5882.

Identifying High-Affinity Egr-1 Binding Sites

Analysis of high-affinity sites for Egr-1 in the human genome was conducted using MATLAB software together with the Bioinformatics Toolbox (MathWorks, Inc.; Natick, MA). First, all possible 9-bp sequences were generated. For each of them, the difference in the binding free energy, ΔΔG, for Egr-1 was predicted from the ΔΔG data for single substitutions, and the number of base-pair matches with the Egr-1 recognition sequence was counted. High-affinity sequences for Egr-1 were identified as those that exhibit ΔΔG < 1.3 kcal/mol and ≥ 6 base-pair matches with the Egr-1 recognition sequence. The total number of each of these high-affinity Egr-1-binding sequences in the human genome was counted using the GenBank GRCh38.p7 assembly.
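
The enumeration step can be sketched in Python as below. The additive model sums a per-position penalty for every base that differs from the recognition sequence, then applies the two thresholds from the text. Everything numeric in the usage lines is a placeholder: the recognition sequence `GCGTGGGCG` is a commonly cited Egr-1 consensus that the protocol itself does not spell out, and the uniform 0.5 kcal/mol penalty is invented purely so the sketch runs. Real use would substitute the measured single-substitution ΔΔG values.

```python
from itertools import product

BASES = "ACGT"

def high_affinity_sites(recognition, penalty, ddg_cutoff=1.3, min_matches=6):
    """List (sequence, predicted ddG, matches) for every passing k-mer.

    `penalty[(position, base)]` is the ddG in kcal/mol for placing `base`
    at `position` instead of the recognition-sequence base. Penalties are
    assumed to add independently across positions.
    """
    hits = []
    for combo in product(BASES, repeat=len(recognition)):
        seq = "".join(combo)
        matches = sum(a == b for a, b in zip(seq, recognition))
        ddg = sum(penalty[(i, base)]
                  for i, base in enumerate(seq) if base != recognition[i])
        if ddg < ddg_cutoff and matches >= min_matches:
            hits.append((seq, ddg, matches))
    return hits

# Placeholder inputs (assumptions, not data from the protocol).
recognition = "GCGTGGGCG"
placeholder_penalty = {(i, b): 0.5 for i in range(len(recognition)) for b in BASES}
hits = high_affinity_sites(recognition, placeholder_penalty)
```

Counting each hit in a genome assembly is a separate step that this sketch leaves out.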

Chattopadhyay A., Zandarashvili L., Luu R.H., & Iwahara J. (2016). Thermodynamic additivity for impacts of base-pair substitutions on association of the Egr-1 zinc-finger protein with DNA. Biochemistry, 55(47), 6467-6474.

Automated Metabolite Analysis in MATLAB

First, the raw data obtained with the Bruker LC/Q-TOF were converted to NetCDF (or mzXML) files using the Compass DataAnalysis software 5.2 (Bruker, Germany) and imported into the MATLAB R2020a computing and visualization environment. The import could be performed either with the MSroi app in MATLAB, as described by Pérez-Cova et al. [21], or directly with the mass spectrometry functions of the Bioinformatics Toolbox (The Mathworks, Inc., 2020b).
The data from the amino acid standard were imported using the MSroi GUI app as a single chromatographic run (single-sample option), while the data from the 7 replicate fish embryo samples were imported as multiple chromatographic runs (multi-sample option) arranged as a column-wise augmented data matrix (see Fig. 1).

Yamamoto F.Y., Pérez-López C., Lopez-Antia A., Lacorte S., de Souza Abessa D.M., & Tauler R. (2023). Linking MS1 and MS2 signals in positive and negative modes of LC-HRMS in untargeted metabolomics using the ROIMCR approach. Analytical and Bioanalytical Chemistry, 415(25), 6213-6225.

Next-Generation Sequencing of PCR Product

The PCR product was subjected to Next-Generation Sequencing by Eurofins Genomics. Sequences were generated with a MiSeq system using a 2 × 250 paired-end module. A single run was conducted, which yielded 20.95 million reads and 5.24 gigabase pairs of data. The percentage of reads with a Q score above 30 was 76.19%, and the mean Q score was 30.59. The paired-end reads were combined using the FLASH program [24]. The result was a set of 8,573,790 individual nucleotide sequences (about 82% yield). Sequences were translated and converted to FASTA format using Matlab with the Bioinformatics Toolbox (Mathworks). Searches within this database were carried out using BLAST (National Library of Medicine).
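
The translate-and-export step can be illustrated with a short Python sketch. It assumes the standard genetic code, translation in the first reading frame only, and `X` for codons containing ambiguous bases; the protocol does not specify these details, and the record names are hypothetical.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product("TCAG", repeat=3), _AMINO_ACIDS)}

def translate(dna):
    """Translate frame 1 of a DNA string; '*' marks stops, 'X' unknown codons."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):   # trailing partial codon is ignored
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)

def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text with wrapped lines."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

The resulting FASTA text can then be turned into a searchable database with standard BLAST tooling.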

Turner K.B., Naciri J., Liu J.L., Anderson G.P., Goldman E.R., & Zabetakis D. (2016). Next-Generation Sequencing of a Single Domain Antibody Repertoire Reveals Quality of Phage Display Selected Candidates. PLoS ONE, 11(2), e0149393.

Comparative Analysis of Diurnal Gene Expression in Arabidopsis and Agave

The Arabidopsis–Agave orthologous gene pairs were identified by combining the OrthoMCL strategy with reciprocal best hits (RBH) based on BLASTp with an E-value cutoff of 1e-5. The diurnal expression data for Arabidopsis thaliana were obtained from Mockler et al. (2007) [9]. Both Arabidopsis and Agave plants were grown under a 12 h light:12 h dark photoperiod. The Arabidopsis expression data were collected at 0, 4, 8, 12, 16, 20, and 24 h, whereas the Agave data were collected at 0, 3, 6, 9, 12, 15, 18, and 21 h after the start of the light period [14]. The cubic interpolation algorithm implemented in Matlab (Mathworks, Inc.) was used to simulate the gene expression levels at additional time points, so that both time-course data sets consisted of the same time points: 0, 3, 4, 6, 8, 9, 12, 15, 16, 18, 20, and 21 h after the start of the light period. The gene expression data were normalized by Z-score transformation. Hierarchical clustering of gene expression was performed using the Bioinformatics Toolbox in Matlab (Mathworks, Inc.).
