Approaching intriguing problems with machine learning: the full picture from data availability to methodology development and assessment, with the adaptive immune system as a case study

Biostatistical seminar with Geir Kjetil Ferkingstad Sandve, Professor, Biomedical Informatics Research Group (BMI), Section of Machine learning, Department of Informatics (IFI), University of Oslo.

Abstract

As statisticians and machine learners, we often talk almost exclusively about the methodology we develop, although we typically agree that appropriate data and rigorous method assessment are equally important.
Here I will present how, through a broad interdisciplinary collaboration, we have tried to address the full spectrum of aspects that influence success when approaching an intriguing problem with machine learning. At the core is the development of a novel deep learning architecture whose components are tailored to insights from the application domain. Looking ahead, we have also critically analysed how large language models might improve on the more classic deep learning approaches to the problem. To support the methodology development, we have initiated separate projects to generate both experimental and synthetic data. To support interoperability, reproducibility and rigorous assessment of the developed methodology, we have developed a software platform for machine learning in the domain and initiated an international competition to benchmark competing methods in the field.
The machine learning problem underlying these developments is the question of how the adaptive immune system recognises foreign threats such as viruses, bacteria or cancer. This is essentially a DNA sequence classification problem, known to be driven by complex, higher-order interactions. There is strong interest in better solutions to this computational problem, as they could accelerate drug development and allow earlier diagnosis of disease.

Published Nov. 14, 2023 9:46 AM - Last modified Dec. 6, 2023 12:06 PM