We used CNNs built from multiple cell structures, each consisting of two one-dimensional convolution layers, one pooling layer, and one dropout layer. The convolution layers are designed to extract features with high-dimensional abstract representations. The pooling layer keeps the number of model parameters tractable through pooling operations. The dropout layer prevents overfitting of the model by randomly setting some of the input units to 0. Four prediction methods were established based on four different network structures composed of the cell structures described above. One-hot encoded data were fed into the network with four cell structures and fully connected layers, while neighboring methylation state encoding data, RNA word embedding data, and Gene2vec-processed data were fed into networks with two cell structures (Fig. 10). The final result was obtained by applying a voting strategy to the four prediction probabilities.
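As a rough illustration of this architecture, the sketch below stacks such cell structures in Keras. The filter counts, kernel sizes, dropout rate, and input shape are placeholder assumptions for illustration only, not the settings used in our experiments.

```python
# Illustrative sketch only: one "cell structure" (two 1D convolutions,
# max pooling, dropout) stacked n_cells times and followed by fully
# connected layers. Filter counts, kernel sizes, dropout rate, and input
# shape are assumptions, not the configuration reported in this work.
import tensorflow as tf
from tensorflow.keras import layers

def cell_structure(filters=64, kernel_size=5, dropout_rate=0.3):
    return [
        layers.Conv1D(filters, kernel_size, padding="same", activation="relu"),
        layers.Conv1D(filters, kernel_size, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),   # downsampling keeps parameters tractable
        layers.Dropout(dropout_rate),       # rate corresponds to 1 - P
    ]

def build_network(n_cells, input_shape):
    """n_cells = 4 for the one-hot network, 2 for the other three encodings."""
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    for _ in range(n_cells):
        for layer in cell_structure():
            x = layer(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)
    outputs = layers.Dense(1, activation="sigmoid")(x)  # binary probability
    return tf.keras.Model(inputs, outputs)

# e.g. a one-hot encoded sequence of length 41 over 4 nucleotides (assumed shape)
model = build_network(n_cells=4, input_shape=(41, 4))
```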
Taking the one-hot encoded sequence as an example, the input data matrix $X_n$ was first fed into a 1D convolutional layer, which used a convolutional filter $W_f \in \mathbb{R}^H$, where $H$ is the length of the filter vector. The output feature $A_i$ at the $i$th position was computed by
$$A_i = \mathrm{ReLU}\!\left(\sum_{h=1}^{H} W_f^{h}\, X_{n,i+h} + b_f\right),$$
where $\mathrm{ReLU}(x) = \max(0, x)$ is the rectified linear unit function and $b_f \in \mathbb{R}$ is a bias (Mairal et al. 2014). These convolutional operations are equivalent to filtering a data block of length $H$ in the sequence with a sliding filter window at each position $i$.
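A minimal NumPy sketch of this operation, with an assumed filter length of H = 3 and toy input values, is shown below; it slides the filter window along a single input channel and applies the ReLU at each position.

```python
# Minimal NumPy sketch of the convolution formula above, for a single
# filter w of length H applied to a one-channel input x (toy values assumed).
import numpy as np

def conv1d_feature(x, w, b):
    H = len(w)
    # A_i = ReLU(sum_{h=1..H} w[h] * x[i+h] + b) at every valid position i
    out = np.array([w @ x[i:i + H] + b for i in range(len(x) - H + 1)])
    return np.maximum(out, 0.0)   # ReLU

x = np.array([0.0, 1.0, 0.0, 1.0, 1.0, 0.0])
w = np.array([0.5, -1.0, 0.5])   # assumed filter length H = 3
print(conv1d_feature(x, w, b=0.1))
```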
Next, a max-pooling layer was used to reduce the dimensionality of the output generated by the multiple convolutional filter operations. Max pooling is a form of nonlinear downsampling that outputs the maximum of each subregion.
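For example, assuming non-overlapping subregions of size 2, max pooling halves the length of a feature map, as in the following sketch.

```python
# Max pooling as nonlinear downsampling: each non-overlapping subregion of
# the feature map is replaced by its maximum (subregion size of 2 assumed).
import numpy as np

def max_pool1d(a, pool_size=2):
    n = len(a) // pool_size                              # drop any trailing remainder
    return a[:n * pool_size].reshape(n, pool_size).max(axis=1)

print(max_pool1d(np.array([0.2, 0.9, 0.1, 0.4, 0.7, 0.3])))   # -> [0.9 0.4 0.7]
```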
To reduce overfitting, we added a dropout layer in which individual nodes were either “dropped out” of the network with probability 1 − P or kept with probability P at each training stage. This not only prevented overfitting, but also effectively integrated many thinned network structures, generating more robust features that generalize better to new data.
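The sketch below illustrates this behavior; the 1/P rescaling ("inverted dropout") is an assumed, commonly used implementation detail that preserves the expected activation, not necessarily the exact scheme used here.

```python
# Training-time dropout: each unit is zeroed with probability 1 - P and kept
# with probability P. The 1/P rescaling (inverted dropout) is an assumed detail.
import numpy as np

def dropout(a, keep_prob, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    mask = rng.random(a.shape) < keep_prob   # keep each unit with probability P
    return np.where(mask, a / keep_prob, 0.0)

print(dropout(np.ones(8), keep_prob=0.75))
```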
Finally, a flattening layer was used to transform the multidimensional data into a single dimension. Fully connected layers with a ReLU activation function and an output layer then predict the binary classification probability using the sigmoid activation function (Han and Moraga 1995):
$$\hat{y}(x) = \mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}.$$
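The following sketch shows how this classification head combines the pieces: the pooled feature maps are flattened, passed through a fully connected ReLU layer, and mapped by a single sigmoid unit to the binary probability. All weight shapes below are arbitrary placeholders.

```python
# Classification head: flatten the pooled feature maps, apply a fully connected
# ReLU layer, then a single sigmoid unit. All weight shapes are placeholders.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(feature_maps, w1, b1, w2, b2):
    x = feature_maps.ravel()                 # flatten to a single dimension
    h = np.maximum(w1 @ x + b1, 0.0)         # fully connected layer with ReLU
    return sigmoid(w2 @ h + b2)              # predicted probability y_hat

rng = np.random.default_rng(0)
fmap = rng.random((5, 4))                    # e.g. 5 positions x 4 filters
w1, b1 = rng.normal(size=(8, 20)), np.zeros(8)
w2, b2 = rng.normal(size=8), 0.0
print(predict(fmap, w1, b1, w2, b2))
```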