Research Project 7

Interactome predictions from long-read transcriptome sequencing data

About the Project

Max Planck Institute for Molecular Genetics 

Berlin, Germany

Yalan Bi

Early Stage Researcher

Ralf Herwig

Main Supervisor

Freie Universitaet Berlin

PhD enrolment

Research Objectives

Long-read sequencing (LRS) improves the annotation of protein isoforms expressed in a specific biological context and allows the construction of isoform-aware protein-protein interaction networks (IsoPPIs). These IsoPPIs can be used as a scaffold to integrate multi-omics data and to analyze expression effects at the network level. Our objectives are:

  • Catalogue LRS-based isoform expression across a large number of publicly available and proprietary datasets.
  • Develop an analysis pipeline to predict IsoPPIs from these data and additional PPI and domain information.
  • Apply and validate these IsoPPIs in the context of drug toxicity prediction and cancer drug resistance mechanisms.
  • Implement methods in the IsoTools software.

Envisioned Secondments

  • EI: Method application to cancer datasets.
  • Biobam: Development of isoform network visualization options.

Early Stage Researcher

Yalan Bi

Max Planck Institute

I hold a Bachelor’s degree in Biological Science from Xiamen University, China. After completing my Bachelor’s degree, I moved to the Netherlands and earned my Master’s degree in Molecular Biology and Biotechnology from the University of Groningen. Since 2020, I have been working at the European Bioinformatics Institute (EMBL-EBI), initially in Zerbino’s group and later in the Gene Expression team.

Through these positions, I have developed skills in addressing biological questions with computational approaches and have become proficient in statistical inference on large-scale datasets and in software development and optimization. This background paved the way for a new chapter in my academic career, leading me to pursue my PhD studies in Herwig’s group at the Max Planck Institute for Molecular Genetics (MPIMG), as part of the LongTREC Marie Skłodowska-Curie Actions Doctoral Network.

The LongTREC network is a collaboration among prestigious research institutes and organizations across Europe, with the mission to advance current methods and establish standardized workflows for long-read transcriptome sequencing (LRTS) analysis. LRTS is revolutionary, enabling us to obtain full-length transcripts without fragmentation and with remarkable accuracy. However, there is still room for further advancements in sequencing methods and analytical approaches, which can significantly benefit the scientific community in transcriptome research.

My specific project will focus on interactome predictions from LRTS data for biomedical applications. We will collect a large number of existing proprietary and public LRTS datasets and develop a method to predict protein domains from the expressed isoforms, map domain-domain interactions (DDIs) to protein-protein interactions (PPIs), and perform functional analysis. The results of this project will help elucidate and predict cellular functions in patients, ultimately aiding in therapy selection and risk prediction.
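
To illustrate the idea of mapping DDIs onto PPIs at the isoform level, here is a minimal Python sketch. It is not the project's method and not the IsoTools API: the function name, the data structures, and all gene, isoform, and domain identifiers are hypothetical toy values. It assumes that a gene-level PPI is kept for a pair of isoforms only if the two isoforms carry at least one known interacting domain pair.

```python
from itertools import product


def predict_isoform_ppis(gene_ppis, isoforms_by_gene, domains_by_isoform, ddis):
    """Return isoform pairs supported by at least one known DDI.

    All inputs are hypothetical toy structures:
      gene_ppis          - iterable of (gene_a, gene_b) known interactions
      isoforms_by_gene   - dict: gene -> list of expressed isoform IDs
      domains_by_isoform - dict: isoform ID -> set of predicted domain IDs
      ddis               - iterable of (domain_a, domain_b) known interactions
    """
    # Store DDIs as unordered pairs so (D1, D3) and (D3, D1) match alike.
    ddi_set = {frozenset(pair) for pair in ddis}
    predicted = set()
    for gene_a, gene_b in gene_ppis:
        isoform_pairs = product(
            isoforms_by_gene.get(gene_a, []),
            isoforms_by_gene.get(gene_b, []),
        )
        for iso_a, iso_b in isoform_pairs:
            doms_a = domains_by_isoform.get(iso_a, set())
            doms_b = domains_by_isoform.get(iso_b, set())
            # Keep the isoform pair if any domain pair is a known DDI.
            if any(
                frozenset((da, db)) in ddi_set
                for da, db in product(doms_a, doms_b)
            ):
                predicted.add((iso_a, iso_b))
    return predicted


# Toy example: isoform A-202 lacks domain D1 and so loses the interaction.
gene_ppis = [("GENE_A", "GENE_B")]
isoforms_by_gene = {"GENE_A": ["A-201", "A-202"], "GENE_B": ["B-201"]}
domains_by_isoform = {
    "A-201": {"D1", "D2"},
    "A-202": {"D2"},
    "B-201": {"D3"},
}
ddis = [("D1", "D3")]

predicted = predict_isoform_ppis(
    gene_ppis, isoforms_by_gene, domains_by_isoform, ddis
)
print(predicted)
```

In this toy case only the pair A-201 and B-201 is retained, which shows how an isoform that lacks an interaction domain can drop out of the network even though its gene is annotated as an interaction partner.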

As a member of the LongTREC network, I am eagerly looking forward to contributing to the advancement of LRTS analysis in collaboration with fellow members.

About the Main Supervisor and Host Group

Ralf Herwig

Max Planck Institute

The group has developed computational methods for RNA-seq and MeDIP-seq and works on the integrative analysis of these data to elucidate how methylation, gene expression and genome structure interact in human (disease) processes related to cancer, diabetes and drug toxicity.

Research covers i) the development of computational methods for the analysis of molecular data, often in collaborative research projects with a focus on human diseases, and ii) the integration and interpretation of these data in the context of biological networks.