Analysis approaches for the identification and prediction of N6-methyladenosine sites

www.tandfonline.com/doi/full/10.1080/15592294.2022.2158284

Top Highlights

  • Because both the transcript structure and the relative position on the transcript are found to be related to the occurrence and function of RNA sub-molecular events, transcript annotation could be used as an information source for predicting m6A modification.

  • In order to construct robust and precise machine learning predictors for predicting RNA modification sites, multiple features were designed and extracted to encode RNA sequences.

  • Most m6A site prediction methods and web servers extracted input features from sequence-derived information and other genomic information and predicted m6A sites by various machine learning approaches. Finally, the performance of the measures is evaluated (Figure 3).

  • In the computational approaches published at present, features are mainly divided into six categories [110], including RNA primary sequence-derived features, nucleotide physicochemical properties, predicted RNA structural features, position-weighted matrix, RNA sequence similarity feature and genomic-derived features.

  • Numerous tools have been developed for feature extraction and modelling of primary sequences, such as BioSeq-Analysis [111,112], PyFeat [113] and BioSeq-BLM [114].

  • Furthermore, the perturb method and the SFS are also used for feature selection.

  • Therefore, geographic encoding of transcripts might be used for deep learning models applied to RNA transcripts.

  • Compared to other deep learning models, the transcript region information incorporated into genomic features by WHISTLE greatly improves its performance.

  • Combined with one-hot encoding, more informative and interpretable sub-molecular geographic descriptors of transcripts are provided.

  • Natural language processing is used for feature extraction and classification of m6A methylation sites with consideration of context information.

  • In addition to predicting m6A-containing sequences, the biological features surrounding m6A could be characterized to elucidate its regulatory code.

  • Besides, conservation analysis of individual m6A sites is achieved by a novel scoring framework, ConsRM.

  • However, information regarding the position relative to the boundaries of the long-range is neglected.

  • In addition, one-hot encoding is widely used to describe the transcript region [129], but it may result in an incomplete landscape of the local transcript structure.

  • To fill the gap, three novel encoding methods, landmarkTX, gridTX, and chunkTX, were developed by Geo2vec

  • Furthermore, experimental results indicate that the base, upstream, and downstream information of m6A sites are all critical to detection.

  • Matthews Correlation Coefficient (MCC)

  • The higher the AUC and AUPRC values, the better the prediction performance.

  • K-fold cross-validation test

  • jackknife validation test
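
The evaluation terms highlighted above (MCC, K-fold cross-validation, jackknife test) can be illustrated with a minimal sketch. It is not taken from the article: the function names `mcc`, `k_fold_indices` and `jackknife_indices` are hypothetical, the MCC is computed from the standard confusion-matrix formula, and the jackknife test is treated as leave-one-out cross-validation, which is how the term is commonly used in this literature.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention (an assumption here): return 0.0 when the
    # denominator is zero, i.e. when a row or column of the matrix is empty.
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for K-fold cross-validation.

    Samples are split into k contiguous folds; shuffle the data first
    if its order is not random.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    # The first (n_samples % k) folds get one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def jackknife_indices(n_samples):
    """Jackknife (leave-one-out) test: K-fold with k equal to n_samples."""
    return k_fold_indices(n_samples, n_samples)
```

In use, each `(train, test)` pair would drive one round of model fitting and prediction, and the pooled confusion-matrix counts would be passed to `mcc`. MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which makes it informative when positive and negative sites are imbalanced.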
