Pavel Baranov (Pasha), a chemist by training, is Professor of Biomolecular Informatics at University College Cork, Ireland (https://lapti.ucc.ie). He is best known for his work on alternative genetic decoding and for the development of computational and data resources for ribosome profiling. Pasha earned his PhD at Lomonosov Moscow State University and MPI for Molecular Genetics. During his doctoral studies, he utilized biochemical methods and computer modeling to investigate the topology of ribosomal RNA within the ribosome. In 1999, he moved to the University of Utah, where he pioneered the use of comparative sequence analysis for studying genes that use alternative decoding mechanisms, such as ribosomal frameshifting. In 2007, Pasha established a research lab at University College Cork, where he continues to investigate the diversity of genetic decoding and regulation of gene expression at the translational level. His research group has developed and maintains RiboSeq.Org, which encompasses popular ribosome profiling resources, including GWIPS-viz, Trips-Viz, and RiboGalaxy. Pasha is also a co-founder of Eirna Bio, a company that specializes in comprehensive translatomics analyses.
Talk title: Junk translation, translation of junk and translation that does not make proteins.
Abstract: The application of ribosome profiling to study mRNA translation in human cells has uncovered a vast amount of translation outside of annotated protein-coding regions. This discovery led to the proposition that non-standard or non-canonical mechanisms of mRNA translation are more prevalent than previously thought. However, a closer examination of the data suggests that non-standard mechanisms are in fact extremely rare, and the "extra translation" observed with ribosome profiling can be explained through standard mechanisms. The discrepancy between current annotations and ribosome profiling data arises from an overreliance on annotation methods that are based on comparative sequence analysis, and inadequate abstract representations of mRNA translation. Many of the translated RNA regions that fall outside of current annotations do not evolve under strong purifying selection, as is typical for annotated protein-coding regions. Nonetheless, this neutral translation is not necessarily phenotypically inconsequential, and therefore requires further characterization. Furthermore, ribosome profiling revealed translation of very short regions that are unlikely to code for proteins, yet many are clearly functional and evolve under strong purifying selection, such as regulatory uORFs (upstream Open Reading Frames). Given these findings, it is important to rethink how protein-coding information is annotated and devise alternative representations.
Erik Bongcam-Rudloff is a Swedish scientist and full professor in Bioinformatics at the Swedish University of Agricultural Sciences (SLU) in Uppsala, Sweden. He is a renowned expert in the field of bioinformatics and has made significant contributions to the development of bioinformatics research and education in Sweden and internationally. He is Chair of EMBnet, the Global Bioinformatics Network, and his research interests include genomics, transcriptomics, and epigenomics, as well as the application of bioinformatics in various fields such as agriculture, medicine, and conservation biology. He has also been involved in several international initiatives aimed at promoting the use of bioinformatics for sustainable development.
Talk title: Revolutionising Farming: How Bioinformatics is Changing the Face of Precision Agriculture.
Abstract: Bioinformatics is revolutionizing precision agriculture by combining computational and biological sciences to create a more sustainable and efficient farming system. By analyzing large amounts of data from plant genomes, environmental sensors, and crop yield data, bioinformatics can identify the most favorable growing conditions for crops and optimize fertilization and irrigation strategies. In addition to the benefits of bioinformatics, artificial intelligence (AI) based systems are poised to accelerate the process of modernizing agriculture as we know it. AI can quickly analyze large datasets from a wide range of sources, such as satellite imagery, weather data, and soil samples, to provide farmers with real-time insights about their crops. This approach leads to reduced use of resources, lower environmental impact, and higher yields. Additionally, bioinformatics allows for the identification of genes that control desirable traits in crops, which can be used to develop new cultivars that are more resistant to pests and disease, and have higher nutritional value. With bioinformatics, precision agriculture is becoming more precise, sustainable, and productive, leading the way to a brighter future for our planet and its inhabitants. During my presentation, I will provide examples of what can be successful in this field, as well as potential pitfalls to avoid.
Prof. Chris Evelo is the founding head of the Department of Bioinformatics - BiGCaT at Maastricht University and a PI in the Maastricht Center for Systems Biology (MaCSBio). He holds a chair in Bioinformatics for Integrative Systems Biology, aiming to better interpret experimental data through integration in data models that build on structuring existing knowledge. He is a co-lead of the ELIXIR Interoperability Platform, Deputy Head of the Dutch ELIXIR Node and board member of the Open PHACTS foundation, a think tank for large-scale knowledge structuring of relations between chemicals, gene products, pathways and diseases. For pathway analysis, Chris’ group developed PathVisio, a modular open-source pathway curation and analysis tool, and apps to link pathway analysis to network analysis in Cytoscape to allow network extension with targeted relationships. Chris was a scientific advisor for the recent IMI project for translational quantitative systems biology (TransQST) and a partner in the COST CHARME project for the harmonisation of standards in biology and in the EU H2020 toxicology projects EU-ToxRisk and OpenRiskNet. Chris co-leads the systems biology work package in the European Joint Project for Rare Diseases. In the nutrition domain, he is part of the management team of the nutrigenomics organisation NuGO, and a partner in the Food Nutrition Security Cloud (FNS-cloud) project and the new NUTRIOME Marie Curie Doctoral Network. He helped conceptualise and develop the phenotype database. For data organisation, he was also a member of the IMI project FAIRplus, which wrote the FAIR-cookbook, where he worked on mappings between equivalent FAIR data identifiers and ontological terms.
Talk title: For a nutritionist the proof of the pudding comes after the eating.
Abstract: If you enjoy the fabulous Italian kitchen, your first thought is probably not “How do I study whether this is healthy?” But for nutritional scientists, that is a very important question. Unfortunately, it is not an easy question to answer. You typically want to use a wide variety of data about food, consumption, lifestyle and biological effects. For that purpose, biological effects are often assessed at a molecular level using, for instance, transcriptomics analysis. The complexity of the interactions between diet, lifestyle and human biology often leads to a buffered system response, where there are no specific molecular endpoints that stand out and show strong responses to diet changes, even though we often can observe that people get healthier when changing diets. In part, that is because False Discovery Rate corrections are a double-edged sword: correcting for false positives also makes you lose a lot of real positives. Often it looks like the findings are actually there, but then they disappear during the analysis. Systems biology approaches that reuse data and take into account what we know about biological processes and pathways can help to solve this. After all, it is unlikely that random, false positive gene products all show up in the same process or pathway. And seeing the results in the context of known biological processes often leads to better understanding. It also adds a possibility to extend and evaluate the biological context using methods from network biology. Of course, all this is not only true for nutrition, and we apply the same approaches in fields like toxicology, rare disease and cancer. To get all this to work better, we need FAIR data deposition and reuse infrastructure, machine- and human-readable descriptions of pathways and gene sets relevant to nutritional health, and interoperability and analysis tools and workflows.
For that reason, we connect to infrastructure developments as they happen in many projects, especially ELIXIR.
Preparata Lecture
Silvio Tosatto, Full Professor in Bioinformatics, Chair of Biochemistry, has been PI of the BioComputing UP lab at the Department of Biomedical Sciences (University of Padova) since 2002. Previously, he earned his MSc and PhD from the University of Mannheim (Germany). His work in protein bioinformatics focuses on the structural and mechanistic aspects of complex systems (e.g. cancer) as well as the provision of services and databases for the scientific community. Prof. Tosatto is heavily involved in ELIXIR, the European infrastructure for biological data, where he is Deputy Head of Node for the Italian node, ExCo (co-lead) of the Data Platform and co-lead of the Machine Learning (ML) focus group. His lab is part of the Gene Ontology, InterPro, Pfam and PDBe-KB consortia, where it contributes data on intrinsically disordered and repetitive proteins from the MobiDB, DisProt, PED and RepeatsDB databases hosted in Padova. His ongoing work in evaluating ML methods has prompted the DOME recommendations for ML publications and the CAID experiment for assessing predictors of intrinsically disordered proteins.
Talk title: CAID 2: Lessons from the Second Critical Assessment of Protein Intrinsic Disorder Prediction.
Abstract: Protein intrinsic disorder (ID) is a complex and context-dependent phenomenon that covers a continuum between fully disordered states and folded states with long dynamic regions. The lack of a ground truth that fits all ID flavors and the potential for order-to-disorder transitions depending on specific conditions make ID prediction challenging. The second round of the Critical Assessment of protein Intrinsic Disorder prediction (CAID 2) challenge aimed to evaluate the performance of different prediction methods across different benchmarks, leveraging the annotation provided by the DisProt database, which stores the coordinates of ID regions when there is experimental evidence in the literature. The CAID 2 challenge demonstrated varying performance of different prediction methods across different benchmarks, highlighting the need for continued development of more versatile and efficient prediction software. Depending on the application, researchers may need to balance performance with execution time when selecting a predictor. AlphaFold seems to be a good ID predictor, but it is better at detecting the absence of order than ID regions as defined in DisProt. The CAID 2 predictors can be freely used through the CAID Prediction Portal, and CAID has been integrated into OpenEBench, which will become the official platform for running future CAID challenges.
BITS Lecture
Mario Cannataro is a full professor of computer engineering and the director of the Data Analytics research center at the University "Magna Græcia" of Catanzaro, Italy. His current research interests include bioinformatics, health informatics, artificial intelligence, data mining, and parallel computing. He has published 5 books and more than 300 papers in international journals and conference proceedings. Mario Cannataro is Editor-in-Chief of the Encyclopedia of Bioinformatics and Computational Biology 2nd Ed., and Associate Editor of Briefings in Bioinformatics and IEEE/ACM Transactions on Computational Biology and Bioinformatics journals. He is a Senior Member of ACM, ACM SIGBio, IEEE, IEEE Computer Society, BITS (Bioinformatics Italian Society) and SIBIM (Italian Society of Biomedical Informatics). He regularly co-organizes international workshops on bioinformatics and high-performance computing in primary conferences such as ACM-BCB, IEEE-BIBM and ICCS.
Talk title: High Performance Analysis of Omics Data: 20 Years Experiences at University Magna Graecia of Catanzaro.
Abstract: Omics sciences (e.g. genomics, proteomics, and interactomics, to cite a few) are gaining increasing interest in the scientific community due to the availability of novel, high-throughput platforms for the investigation of the cell machinery, and have a central role in the so-called P4 (predictive, preventive, personalized and participatory) medicine and in particular in cancer research. High-throughput experimental platforms, such as next generation sequencing, microarray, mass spectrometry, clinical diagnostic tools, and medical imaging, are producing overwhelming volumes of molecular and clinical data, and the storage, integration, and analysis of such data are today the main bottleneck of bioinformatics pipelines. In addition, textual documents, such as clinical records, Electronic Health Records, and patients’ blogs describing their healthcare experiences (Narrative Medicine), are increasingly used in biomedical research to extract patients’ opinions and sentiments about their healthcare experience, by using NLP, Text Mining, and Sentiment Analysis methods. The talk reviews the main omics data and discusses some parallel and distributed bioinformatics tools developed in the last 20 years at University Magna Graecia of Catanzaro and their application in cancer research. Some recent applications of Sentiment Analysis to mine patients’ blogs and questionnaires, as well as recent initiatives to exploit Electronic Health Records in COVID-19 research, are also discussed.
Invited Talks
Gabriella Casalino is currently an Assistant Professor at the Computational Intelligence Laboratory (CILab) of the Informatics Department of the University of Bari, working on machine learning techniques applied to the Web Economy domain. This position has been funded by the Italian Ministry of University and Research (M.U.R.) through a European project. Her research activity is focused on Computational Intelligence, with a particular interest in data analysis. She is currently working on three main themes: Intelligent Data Analysis, Computational Intelligence for eHealth, and Data Stream Mining. She is active in the computer science community as a reviewer for international journals and conferences. She is also involved in the organizing committees of international conferences such as IEEE EAIS, Eusflat, and FUZZ-IEEE. She is Associate Editor of the Journal of Intelligent and Fuzzy Systems and Guest Editor of several special issues (IEEE SMC magazine, IEEE Transactions on Computational Social Systems). She was a visiting researcher at the Université de Mons (Belgium), the Polish Academy of Sciences and the Warsaw University of Technology (Poland), the University of Ghent (Belgium), and the Adam Mickiewicz University in Poznan (Poland). She is a Senior Member of the IEEE and received the FUZZ-IEEE best paper award.
Talk title: Explainable Artificial Intelligence in Biomedicine for trustworthy decision.
Abstract: With the surge of biomedical data science, more and more AI techniques are employed to discover knowledge, unveil latent data behavior, generate new insight, and seek optimal strategies in decision-making. Different AI methods have been proposed and developed in almost all biomedical data science fields, ranging from drug discovery to electronic medical records (EMRs) data automation, single-cell RNA sequencing, early disease diagnosis, and healthcare analytics. The AI methods also generate a massive amount of data that brings not only unprecedented progress in biomedical fields but also new challenges. One of the key challenges is the explainability of AI in biomedical data science problem-solving. Explainability means that an AI method or system should not only bring good results but also be transparent, i.e., let users know why this way is the optimal one rather than the others. When AI models cannot explain themselves well, they run a high risk of making incorrect decisions, which decreases their trustworthiness and reliability, even if they have advantages in accuracy, speed, or revealing hidden data relationships. Therefore, it is urgent to develop explainable Artificial Intelligence (XAI) methods that act more transparently to provide reliable results along with good interpretations of ‘why it works’ rather than only ‘it works’. Furthermore, biomedical data science requires a higher standard for the AI methods’ interpretability and transparency because of its special subjects and application domains. It can be hard or even dangerous to believe the results from non-transparent AI methods because opacity can be harmful and unpredictable.
Anna De Grassi is a scientist and associate professor in Applied Biology at the University of Bari, Italy. She has been working in the bioinformatics field for fifteen years encompassing a variety of topics in genomics research. She worked on vertebrate genomics during her Ph.D. at the University of Bari, she then moved to cancer genomics as a post-doc and staff scientist at the European Institute of Oncology (Milan), and to bacterial genomics as a maître de conférences at the EPHE (Paris). Coming back to the University of Bari, she developed a method for identifying structurally-relevant protein sites in the mitochondrial carrier family. She also coordinated an interdisciplinary group for discovering and validating new pathogenic DNA variants in human inherited disorders. She was also a member of the BROWSer Company (Bioinformatics Resource for Omics Wide Services) and co-authored three international patents for cancer diagnosis and drug design.
Talk title: Genome-Wide Identification and Validation of Gene Expression Biomarkers in the Diagnosis of Ovarian Serous Cystadenocarcinoma.
Abstract: Despite ovarian serous cystadenocarcinoma (OSCA) being a high-incidence type of cancer, limited molecular screening methods are available and the diagnosis mostly occurs at a late stage. This study aimed to assess the potential of gene expression for identifying OSCA-specific molecular biomarkers to improve diagnosis. A genome-wide survey was performed on high-throughput RNA-sequencing experiments on hundreds of ovarian cancer samples and healthy ovarian tissues, providing several putative OSCA biomarkers, which were then validated on an independent sample set using a different RNA-quantification technology. Combinations of gene expression biomarkers were identified, which showed high accuracy in discriminating OSCA tissues from their normal counterparts and from other tumor types. This story tells of a simple and unconventional bioinformatics pipeline that can concretely improve the molecular diagnosis of OSCA.