BITS Meeting 2021 .:. BITS Bioinformatics Italian Society

Invited Speakers

Keynotes

Pedro Beltrao PhD, EMBL-EBI, Cambridge UK

Pedro Beltrao is a group leader at EMBL-EBI. He earned a PhD in Biology from the University of Aveiro in 2007, conducting his research at EMBL-Heidelberg. He was a postdoctoral researcher at the University of California San Francisco before joining EMBL-EBI as a group leader in 2013.

Talk title: Disease and functional relevance of human protein phosphorylation

Abstract: Cells need to constantly adapt to changes in conditions and use post-translational regulation as a fast way to transfer information from sensors to effectors of cellular responses. Advances in mass spectrometry now allow us to identify post-translational modification (PTM) sites at large scale and to quantify their changes across different conditions. While over 100,000 phosphorylation sites have been identified in human proteins, fewer than 5% of these have a known functional role or known kinase regulator. We have developed computational and experimental approaches to rank which phosphosites are more likely to be critical for the cell. In our latest effort, we used a machine learning approach to score human phosphosites according to their relevance for organismal fitness, using information such as the degree of conservation or regulation. This functional score can identify phosphosites that are more likely to show phenotypes when mutated or to be associated with human diseases. We have also worked on approaches that try to predict the kinase-kinase regulatory network and on how to use large-scale phosphoproteomics to infer the activation state of kinases. As an example, we have applied these approaches to study the changes in kinase signalling across tumour samples or occurring during SARS-CoV-2 infection. Based on the viral phosphorylation studies, we could show how SARS-CoV-2 infection promoted casein kinase II (CK2) and p38 MAPK activation and the inhibition of cell-cycle kinases. These were linked to production of diverse cytokines, cell cycle arrest and stimulation of filopodial protrusions. These approaches are starting to give us a less biased understanding of the kinase signalling network and are uncovering the importance of phospho-regulation across multiple aspects of cell biology and disease.

Annalisa Marsico PhD, Helmholtz Zentrum, Munich DE

She has been group leader of the Computational RNA Biology team at the Institute of Computational Biology (Helmholtz Zentrum München) since 2019. She obtained a PhD in Bioinformatics at the Technical University (TU) of Dresden as part of the International IMPRS PhD graduate school of the Max Planck Institute for Molecular Cell Biology (2009). She was a postdoctoral researcher at the Max Planck Institute for Molecular Genetics (MPIMG) Berlin (2009-2013), and then assistant professor at the Freie Universität Berlin and junior group leader at the Max Planck Institute (2013-2019).

Talk title: Machine learning models of post-transcriptional regulation of RNAs

Abstract: Transcriptional and post-transcriptional gene regulation occur at several levels and are highly controlled and interconnected processes, whose alterations contribute to the genesis and progression of complex diseases. Many newly discovered (long) non-coding RNAs, as well as RNA Binding Proteins (RBPs), have been found to be crucial players in establishing complex regulatory processes in the cell. They also represent a new class of biomarkers and/or therapeutic targets in several biomedical applications. In this talk, I will first introduce the statistical models and machine learning approaches developed in our lab to detect protein-RNA interactions from high-throughput CLIP-seq data and to characterize in vivo RBP binding preferences. Second, I will show two case studies where such models have been applied to predict protein-RNA interactions in the absence of experimental data. In the first application, I will show how we can characterize, in silico, the RBP binding landscape of enhancer-like long non-coding RNAs, and in the second application I will show how we can generate an in silico map of host RBP – viral RNA interactions in SARS-CoV-2 infection.

Jacques van Helden PhD, Aix-Marseille University and IFB, Marseille FR

He is professor of bioinformatics at Aix-Marseille Université (Marseille, France). Since 1997, his research has consisted of designing, developing, assessing and applying bioinformatics and statistical approaches to analyse genomic sequences, gene regulation (http://rsat.eu/) and biomolecular networks (regulation, protein interactions, metabolism). His teaching activities cover various domains of bioinformatics and life sciences: fundamentals of bioinformatics, genomics, analysis of regulatory sequences, analysis of biomolecular networks, biostatistics, evolutionary biology, science & society. He is also the co-director of the Institut Français de Bioinformatique (IFB).

Talk title: Investigating the origins of SARS-CoV-2 in coronavirus sequences

Abstract: Eighteen months after the beginning of the COVID-19 pandemic, the origins of the SARS-CoV-2 coronavirus remain elusive. Understanding the origin of this virus is, however, an important challenge, since tracing the history of its emergence might be crucial to prevent future epidemics. In February 2020, early publications claimed that the proximal origin could only be the natural zoonosis of a virus resulting from recombination between bat and pangolin coronaviruses. However, the pangolin hypothesis has subsequently been dismissed, and no other intermediate host has been identified despite tens of thousands of samples collected from natural sites and farms.

Despite the high politicisation of the debates and the complexity of the context, the scientific community should be able to address the origin of the virus as a scientific question, and to lead an evidence-based and prejudice-free investigation. During this talk, I will discuss the different scenarios on the origin – natural or synthetic – of the virus, and summarise the results of some bioinformatics analyses led by ourselves and by colleagues to gain insights into different plausible hypotheses. Even though the data currently available is not sufficient to firmly assert whether SARS-CoV-2 results from a zoonotic emergence or from a laboratory strain, a close analysis of genomic sequences already provides us with some hints to evaluate the possible scenarios.

Alex Graudenzi PhD, Institute of Molecular Bioimaging and Physiology, Consiglio Nazionale delle Ricerche (IBFM-CNR)

Alex Graudenzi is a tenured researcher at the Institute of Molecular Bioimaging and Physiology, Consiglio Nazionale delle Ricerche (IBFM-CNR), Milan. He is co-head of the Data and Computational Biology Group of the Univ. of Milan-Bicocca and director of the Lake Como Workshop and School on Cancer, Development and Complexity. His research activity lies at the boundaries of computer science, bioinformatics and complex systems, with the goal of delivering innovative computational methods for the investigation of complex biological systems and phenomena, according to an inherently cross-disciplinary vision of biomedical research. He is the author of 50+ publications in indexed international journals and conference proceedings, recipient of several research grants and awards, and computational task leader in a large number of international research projects.

Talk title: Characterizing the evolution of SARS-CoV-2 and the generation of genomic variants from sequencing data of viral samples

Abstract: The exceptional gravity of the COVID-19 pandemic has prompted a surge of studies that analyze SARS-CoV-2 consensus sequences to track the evolution and spread of the virus. Yet, most approaches do not account for intra-host genomic diversity, which results from the complex interplay between host-related mutational processes and the transmission dynamics of genomic variants. The characterization of such diversity can be achieved by leveraging raw sequencing data of SARS-CoV-2 samples, which are starting to become available in public databases. To this end, we first introduce VERSO, a two-step framework for the characterization of viral evolution from sequencing data of viral samples, which is an improvement over standard phylogenomic approaches for consensus sequences. VERSO exploits an efficient algorithmic strategy to return robust phylogenies from clonal variant profiles, even under sampling limitations. It then leverages variant frequency patterns to characterize the intra-host genomic diversity of samples, revealing undetected infection chains, validated with contact tracing data, and allowing one to pinpoint possibly hazardous variants. We then present a second framework, named VirMutSig, which aims at identifying and quantifying the mechanisms responsible for the generation of genomic variants. Using an NMF-based approach previously applied successfully to the analysis of cancer mutational processes, it allows the de novo discovery of statistically significant viral mutational signatures, i.e., nucleotide substitution patterns, the existence of which suggests that distinct hosts may respond in different ways to SARS-CoV-2 infection.

José Duarte PhD, Assistant Project Scientist - RCSB Protein Data Bank, SDSC

José Duarte is an Assistant Project Scientist at the RCSB Protein Data Bank at SDSC. He received an MSc degree in Physics from Universidad Complutense (Madrid) and later an MSc in Bioinformatics from Birkbeck College, University of London. He then pursued a PhD in Structural Bioinformatics at the University of Zurich under the supervision of Dr. Guido Capitani from the Paul Scherrer Institute in Switzerland. He worked as a software engineer at the Max Planck Institute for Molecular Genetics in Berlin and later as a postdoc at the ETH Zurich. He joined the RCSB Protein Data Bank in 2015, where he acts as the Scientific Team Lead for the San Diego site of the RCSB PDB. His research is centered on structural bioinformatics algorithms, with a focus on protein quaternary structure analysis and evolution.

Talk title: The PDB at 50: a history of structural bioinformatics

Abstract: The year 2021 marks the 50th anniversary of the creation of the Protein Data Bank. This is a momentous point in the history of Structural Biology, with over 170,000 structures now stored in the PDB archive. The Structural Bioinformatics field came into existence with the birth of the PDB and then grew hand in hand with it. I will cover the major milestones in Structural Bioinformatics over those 50 years and give my view on the next big challenges ahead.