Dominik Heider (born 1982) is Professor for Bioinformatics at the Department of Mathematics and Computer Science at the University of Marburg, Germany. He studied Computer Science at the University of Muenster from 2002 until 2006 and subsequently started his PhD studies at the Department of Experimental Tumorbiology and the Department of Computer Science at the University of Muenster. After receiving his PhD in 2008, he worked as a postdoc at the Department of Bioinformatics at the University of Duisburg-Essen, where he finished his habilitation thesis in 2012. He then became an Associate Director and Head of the Clinical and Diagnostic Bioinformatics at QIAGEN, and in 2014 he accepted a professorship for Bioinformatics at the Straubing Center of Science and an adjunct professorship at the Technical University of Muenchen, before joining the University of Marburg in 2016. His main research focus is the development of bioinformatics solutions for next-generation sequencing (NGS) data, e.g., machine learning algorithms for predicting drug resistance of pathogens or for modeling diseases. In another main part of his research, he aims to develop new methods and algorithms for analyzing (meta-)genomic and (meta-)transcriptomic data of microorganisms, as well as for genome assembly and functional annotation. Since NGS technologies have great potential in biomedical research but data processing is still limited by computational power, he also investigates techniques based on high-performance computing. Dominik Heider is an Associate Editor for the international journals BMC Bioinformatics and BioData Mining. Moreover, he is a member of the PC of the German Conference on Bioinformatics. He is also a member of the German Society for Computer Science and FaBI.
Pietro Liò is a Reader in Computational Biology in the Computer Laboratory, the Department of Computer Science of the University of Cambridge, and a member of the Artificial Intelligence group of the Computer Laboratory.
He has an MA from Cambridge, a PhD in Complex Systems and Non Linear Dynamics (School of Informatics, Department of Engineering of the University of Firenze, Italy) and a PhD in (Theoretical) Genetics (University of Pavia, Italy).
Andrew C.R. Martin
Andrew Martin studied Biochemistry at the University of Oxford where he stayed for his D.Phil. After working at the National Institute for Medical Research in London and a few years as a self-employed scientific software developer, he joined the group of Professor Dame Janet Thornton, FRS at University College London. He moved from there to Inpharmatica, a UCL spin-out company, and then to the University of Reading as a Lecturer in Bioinformatics. He returned to UCL in 2004 where he is now a Reader in Bioinformatics and Computational Biology. His research focuses on two main areas: the sequence and structure of antibodies and the effects of mutations on protein structure and function. He has published over 80 papers and reviews, six book chapters and a co-authored book on moonlighting proteins. As well as widely used web-based software, he has developed software that has been downloaded over 8,000 times. He has consulted for a number of companies and acted as an expert witness in several patent disputes related both to antibodies and to general bioinformatics. He is also an advisor to the WHO International Nonproprietary Names (INN) committee on the naming of antibody-based drugs.
Preparata Lecture - To be announced
Talk at BITS2017
BITS Lecture - To be announced
Talk at BITS2017
The development of computational approaches for predictive modeling of diseases or for drug resistance prediction has opened a new era in precision medicine. Clinical decision support systems have been designed to assist in molecular diagnostics (MDx) or companion diagnostics (CDx) and so enhance therapeutic success. These systems are typically based on statistical or machine learning models that were built from clinical data. The main pitfall of computational models for precision medicine developed in academia, however, is that most of the software is developed by individuals on a one-person-one-project basis. Thus, these researchers develop software in a prototype-centered manner: they aim for quickly publishable results and attend neither to regulatory requirements for software development processes nor to the documentation and maintenance needed for MDx or CDx software. One important aspect of producing reliable computational models, however, is evaluation in clinical trials, which is a necessary but not sufficient condition for application in MDx or CDx scenarios.
Main aspects of the talk:
- Statistical and machine learning models for precision medicine
- Clinical evaluation and clinical trials
- Regulatory aspects of software for MDx and CDx applications
The aim of the talk is to describe the design and implementation of different types of neural network architectures to perform inference on gene expression and methylation data. The models were chosen such that each of them explores different properties of the epigenetic data in order to overcome the problem of having sparse and imbalanced datasets. All of these models were subsequently compared in order to assess their pattern recognition abilities.
The talk will contain tutorial elements, so beginners who want to practice deep learning could also benefit by bringing their laptops.
High-throughput sequencing platforms are increasingly used to screen patients with genetic disease for pathogenic mutations, but prediction of the effects of mutations remains challenging.
We have developed SAAPdap (Single Amino Acid Polymorphism Data Analysis Pipeline), which uses a set of rule-based analyses for predicting the likely local effects of mutations. The results of these analyses are then passed to SAAPpred (Single Amino Acid Polymorphism Predictor), which uses a random forest to predict whether a mutation is likely to be pathogenic. The method gives a fully cross-validated Matthews correlation coefficient (MCC) of 0.692 and a partially cross-validated MCC of 0.944 (where the same protein is allowed in training and test sets, but not the same mutation). This considerably outperforms well-known methods such as MutationAssessor, SIFT and PolyPhen2 (MCC between 0.452 and 0.572).
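For readers unfamiliar with the MCC values quoted above, the following is a minimal sketch of how the Matthews correlation coefficient is computed from the four counts of a binary confusion matrix. The function name `mcc` and the example counts are hypothetical and chosen only for illustration; they are not taken from SAAPpred or from the results reported in the abstract.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp/tn: true positives/negatives, fp/fn: false positives/negatives.
    Returns a value in [-1, 1]; 1 is perfect prediction, 0 is no better
    than chance, -1 is total disagreement.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Hypothetical counts, for illustration only (not data from the talk).
    print(round(mcc(tp=90, tn=80, fp=20, fn=10), 3))
```

Because the MCC uses all four cells of the confusion matrix, it remains informative on unbalanced datasets, where plain accuracy can look high simply by favouring the majority class.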
We have also extended the method to create SAAPpred-myh7, which is able to distinguish between the two major clinical phenotypes (hypertrophic cardiomyopathy, HCM, and dilated cardiomyopathy, DCM) associated with mutations in the beta-myosin heavy chain (MYH7) gene product (Myosin-7). Despite having a small and unbalanced dataset, we achieve an MCC=0.53, and post hoc removal of machine learning models that performed particularly badly further increased the performance (MCC=0.61). Thus our method for performing the difficult task of differential phenotype prediction is competitive with other methods performing the simpler task of pathogenicity prediction.