Comprehensive TALEN Off-Target Analysis

Every replicate, treated and untreated control alike, is processed independently from alignment up to cluster definition, as described in (39). An overlap analysis is then performed to unify the clusters from the different replicates: clusters that overlap or are separated by less than 1,500 bp are merged and considered a single translocation event [see Turchiano et al. (2021) for details]. Depending on the number of replicates, the user can define the minimum number of replicates in which a site must be found and the minimum number of samples in which the site must differ significantly from the untreated control (i.e., the number of reads is significantly higher in treated vs. untreated samples based on Fisher’s exact test).
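The merging and replicate filter described above can be sketched as follows. This is a minimal illustration, not the T-CAST implementation: the `Cluster` tuple, the function names, and the half-open coordinate convention are assumptions made for the example, and the Fisher's exact test step is left out.

```python
from collections import namedtuple

# Hypothetical record for one cluster found in one replicate.
Cluster = namedtuple("Cluster", ["chrom", "start", "end", "replicate"])

MAX_GAP = 1500  # clusters closer than this (in bp) are merged, per the text


def merge_clusters(clusters, max_gap=MAX_GAP):
    """Merge clusters that overlap or lie less than max_gap bp apart.

    Returns a list of (chrom, start, end, replicates) tuples, where
    `replicates` is the set of replicate labels supporting the merged site.
    """
    merged = []
    for c in sorted(clusters, key=lambda c: (c.chrom, c.start, c.end)):
        if merged:
            chrom, start, end, reps = merged[-1]
            # A negative gap means the two clusters overlap.
            if chrom == c.chrom and c.start - end < max_gap:
                merged[-1] = (chrom, start, max(end, c.end), reps | {c.replicate})
                continue
        merged.append((c.chrom, c.start, c.end, {c.replicate}))
    return merged


def filter_by_replicates(sites, min_replicates):
    """Keep merged sites supported by at least min_replicates replicates."""
    return [s for s in sites if len(s[3]) >= min_replicates]


clusters = [
    Cluster("chr1", 100, 200, "rep1"),
    Cluster("chr1", 1000, 1100, "rep2"),  # 800 bp from the previous cluster
    Cluster("chr1", 5000, 5100, "rep1"),  # 3,900 bp away, stays separate
    Cluster("chr2", 100, 200, "rep2"),
]
sites = merge_clusters(clusters)
shared = filter_by_replicates(sites, min_replicates=2)
```

Here `shared` keeps only the chr1 site spanning 100 to 1100, since it is the only one seen in both replicates.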
Barcode hopping: We introduced an additional filter to eliminate artifacts generated by barcode hopping events. These events are identified by their low reads:hits ratio in comparison to real translocation events, based on the distribution of log10(reads:hits).
Coverage: For the remaining sites, the read coverage is calculated in order to identify highly covered regions. Each site is divided into 100 bins of equal size. For each site, the coordinates of the bin with the highest coverage across all replicates are used for downstream analysis instead of the whole-site coordinates. This new feature restricts the alignment against the target sequence to a smaller, highly covered region, which makes the alignment more specific and less prone to identifying false-positive OMTs/HMTs.
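A minimal sketch of the two quantities used in these filters is given below. It assumes that coverage can be approximated by counting read positions pooled across replicates, and that sites use half-open coordinates; the function names are hypothetical. The cutoff applied to log10(reads:hits) is not stated in this text, so the sketch only computes the ratio and does not apply any threshold.

```python
import math

N_BINS = 100  # each site is divided into 100 equal-size bins, per the text


def log10_reads_per_hit(reads, hits):
    """log10 of the reads:hits ratio; low values suggest barcode hopping."""
    return math.log10(reads / hits)


def best_bin(start, end, positions, n_bins=N_BINS):
    """Return (bin_start, bin_end) of the most covered bin in [start, end).

    `positions` are read positions pooled across all replicates (an
    assumption of this sketch). Ties go to the leftmost bin.
    """
    width = (end - start) / n_bins
    counts = [0] * n_bins
    for p in positions:
        if start <= p < end:
            idx = min(int((p - start) / width), n_bins - 1)
            counts[idx] += 1
    top = max(range(n_bins), key=counts.__getitem__)
    return (start + round(top * width), start + round((top + 1) * width))


# A 1,000 bp site has 10 bp bins; three reads fall in the bin [10, 20).
region = best_bin(0, 1000, [5, 15, 15, 17, 990])
ratio = log10_reads_per_hit(reads=1000, hits=10)
```

The returned bin coordinates would then replace the whole-site coordinates in the alignment step.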
Alignment: A new TALEN-specific substitution matrix, inspired by (18), was implemented (Supplementary Tables S12), and the analysis was restricted to four TALEN combinations: LF.LR, LF.RR, RF.RR, and RF.LR (L/RX, left/right; XF/R, forward/reverse). To determine the best combination, i.e., the one most likely to cleave an off-target site, spacer lengths from 8 to 28 bp are tested for each combination. Artificial sequences representing the binding sites of two TALEN arms separated by a spacer “N_k” of k nucleotides (k ∈ 8:28) are tested. N can match any base without cost, so the length of the spacer does not by itself influence the alignment score. An example sequence is shown in Supplementary Figure S2B. The alignment score is calculated using the pairwiseAlignment function from the Biostrings R package with a “local-global” alignment type. The TALEN combinations and spacer lengths are first selected based on two criteria: a) the first (5′) aligned base is a T, and b) the last (3′) aligned base is an A. They are then ordered by alignment score, and the highest-scoring one is defined as the most probable TALEN combination and spacer length for a given target site. The same approach was applied to randomly selected regions across the entire genome to determine the overall distribution of the alignment score on random sequences. The p value of a given combination and spacer length is assessed based on the empirical cumulative distribution function. Sites with p values below 0.05 are considered OMTs. HMTs and NBSs were classified as described in (39).
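To make the candidate generation and the empirical p value concrete, here is a small sketch under stated assumptions. It assumes that "reverse" means the reverse complement of an arm's binding sequence placed downstream of the spacer, and that the p value is the fraction of random-region scores at or above the observed score. The alignment itself (pairwiseAlignment with the TALEN-specific substitution matrix) is not reproduced, and the function names and example arm sequences are hypothetical.

```python
from bisect import bisect_left

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")
COMBINATIONS = ("LF.LR", "LF.RR", "RF.RR", "RF.LR")


def revcomp(seq):
    """Reverse complement of a DNA sequence (N stays N)."""
    return seq.translate(COMPLEMENT)[::-1]


def candidate_targets(left, right, spacers=range(8, 29)):
    """Yield (combination, spacer_length, sequence) for every candidate.

    Each sequence is: forward arm + 'N' * k + reverse-complemented arm.
    """
    arms = {"L": left, "R": right}
    for combo in COMBINATIONS:
        first, second = combo.split(".")
        for k in spacers:
            yield combo, k, arms[first[0]] + "N" * k + revcomp(arms[second[0]])


def empirical_p_value(score, random_scores):
    """Fraction of random-region scores >= the observed score."""
    ordered = sorted(random_scores)
    n_ge = len(ordered) - bisect_left(ordered, score)
    return n_ge / len(ordered)


# Toy arm sequences, far shorter than real TALEN binding sites.
candidates = list(candidate_targets("TACG", "TGGA"))
first_candidate = candidates[0]
p = empirical_p_value(10, [1, 5, 10, 12])
```

With the default range this produces 4 combinations times 21 spacer lengths, i.e. 84 candidate sequences per site. Because TALEN binding sites start with a T, each candidate built this way begins with T and ends with A, which matches the two selection criteria in the text.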

Rhiel M., Geiger K., Andrieux G., Rositzka J., Boerries M., Cathomen T., & Cornu T.I. (2023). T-CAST: An optimized CAST-Seq pipeline for TALEN confirms superior safety and efficacy of obligate-heterodimeric scaffolds. Frontiers in Genome Editing, 5, 1130736.

Publication 2023

Arms Binding sites Genome Hmts Nucleotides Replicate Sequence alignment Talen Translocation

Corresponding Organization : University Medical Center Freiburg

Other organizations : German Cancer Research Center, Heidelberg University

Variable analysis

independent variables

Treated samples
Untreated control samples

dependent variables

Overlapping clusters from several replicates
Minimum number of replicates where the site was found
Minimum number of samples in which the site was significantly different from untreated control (i.e., the number of reads was significantly higher in treated vs. untreated based on Fisher's exact test)
Highly covered regions (coordinates of bin with the highest coverage across all replicates)

control variables

Untreated control samples

controls

Untreated control samples
