Research Project 8

Exploring the transcriptional landscape of environmental samples using metatranscriptomics long reads

About the Project

CEA – Genoscope

Évry, France

France Denoeud

Main Supervisor

Jean-Marc Aury


Université Paris-Saclay

PhD enrolment

Yearly salary

30.080,16 € (Gross)

Research Objectives

Marine environments remain largely unexplored and unobserved. Planktonic eukaryotes have been shown to harbor non-canonical splice sites and original splicing mechanisms, but many of them cannot be cultured. We aim to explore the transcriptional landscape (intron structure, organization and evolution) in marine metatranscriptomic samples. Our objectives are to:

  • Evaluate and develop a method to predict complete protein sequences from LrRNA-seq.
  • Reconstruct full exon/intron structures by combining metatranscriptomics and metagenomics long reads of a given sample.
  • Characterize new introns and splicing mechanisms, and better understand the evolution of introns.
  • Use this resource for the annotation of MAGs (metagenome-assembled genomes).

Envisioned Secondments

  • CRG (Guigo): assess/compare gene annotation methods using LrRNA-seq data.
  • Stockholm University (Sahlin): assess novel mapping algorithms for long transcripts.

About the Main Supervisor and Host Group

France Denoeud


Our lab (LBGB: Bioinformatics for Genomics and Biodiversity Lab) has been a pioneer in the use of RNA sequencing and has developed tools dedicated to the annotation of eukaryotic genomes. Since 2017 we have been part of the ASTER project (Algorithms and software for third generation RNA sequencing). Through this project and its strong network of experts in algorithmics and genomics, we have participated in several projects based on long-read RNA sequencing. Clustering long reads that originate from the same gene is an important step, because it allows correction at the gene/transcript level. This was one of the motivations for developing the CARNAC-LR tool. We have also developed NaS, an error-correction method based on the combination of short and long reads. Such error-correction methods are generally designed for DNA reads and rely on individual heuristics that preclude their use for error-correcting transcriptomic reads. We have summarized these results in a complete benchmark. The ASTER project was also a good opportunity to sequence RNA samples and monitor the evolution of the long-read technology provided by Oxford Nanopore. Our results suggest potential biases in the long-read approach, as well as the great potential of direct RNA sequencing. At Genoscope, we also have strong expertise in metatranscriptomics and metagenomics, and have generated and analyzed long-read data from marine samples of the TARA expeditions. In addition, we have extensively studied intron structure and evolution in several marine organisms.