Leader: Graziano Pesole (UNIBA); Other collaborator(s): Ernesto Picardi
We plan to analyze and extract early detection markers of age-related diseases from biochemical, epigenetic, transcriptomic, metabolomic and proteomic data, and implement innovative artificial intelligence (AI)-based strategies to process heterogeneous data.
Brief description of the activities and of the intermediate results: We are developing new computational tools for the study of RNA editing events using direct RNA sequencing with Nanopore technology. In particular, a basecaller prototype for inosines is being optimized. In parallel, a deep learning algorithm is being improved for the identification of A-to-I RNA editing events in RNAseq experiments obtained with Illumina technology. A manuscript describing a new method for characterizing C-to-U RNA editing events in Nanopore direct RNA sequencing data was published in January 2024.
A tool called REDInet, for the identification of RNA editing events in RNAseq experiments from human samples, has recently been completed. The algorithm will be described in a manuscript that is expected to be submitted for publication by the end of July 2024. In the meantime, the REDIportal database, which is used to feed the REDInet algorithm, has been updated. Finally, a tool for the identification of inosines in direct RNA sequencing data using Nanopore technology is being optimized.
The manuscript describing the deep learning algorithm implemented in the REDInet tool was submitted for publication in August and is under review. A preprint is currently available at https://www.researchsquare.com/article/rs-4900829/v1. REDInet is the first bioinformatics tool designed to identify A-to-I RNA editing events by deep learning, using a model trained on more than 15 million known editing events in more than 9000 RNAseq experiments from the international GTEx project. The A-to-I editing events used by REDInet in the training phase are available in the REDIportal database. In the meantime, REDIportal has been updated to include A-to-I edits from more than 9000 RNAseq experiments from the TCGA project, so that the impact of RNA editing can be assessed in more than 30 different tumor types. To facilitate the study of RNA editing, REDIportal has also been updated to include specific cross-links to Database Core Resources such as Ensembl, UniProt, RNAcentral and PRIDE. A manuscript describing the main updates was submitted in August to the journal Nucleic Acids Research for the annual database issue. A computational framework for the identification of inosines in direct RNA sequencing data using Nanopore technology is also being developed. A first prototype, called NanoSpeech, has been developed and is being optimized.
Efforts have been intensified to complete the NanoSpeech basecalling prototype for direct inosine calling in native RNA sequences produced using Oxford Nanopore technology. Software for creating training datasets from native RNA sequences produced using Nanopore technology has also been developed. A model for the simultaneous calling of inosine and m6A from native RNA sequences is under development to study the interconnection between different epitranscriptomic modifications. The manuscript describing the deep learning algorithm implemented in the REDInet tool is still under review. The manuscript describing the update of the REDIportal database has been accepted for publication in the journal Nucleic Acids Research and will be available online in the January 2025 Special Database Issue.
A. Fonzino, C. Manzari, P. Spadavecchia, U. Munagala, S. Torrini, S. Conticello, G. Pesole, E. Picardi. Unraveling C-to-U RNA editing events from direct RNA sequencing. RNA Biol. 2024 Jan;21(1):1-14. doi: 10.1080/15476286.2023.2290843.
L. Grasso, A. Fonzino, C. Manzari, T. Leonardi, E. Picardi, C. Gissi, F. Lazzaro, G. Pesole, M. Muzi-Falconi. Detection of ribonucleotides embedded in DNA by Nanopore sequencing. Commun Biol 7, 491 (2024). doi: 10.1038/s42003-024-06077-w.