Combining low-dimensional clinical and high-dimensional molecular data in a survival prediction model
Speaker: Ricardo De Bin, Associate Professor, Department of Mathematics, University of Oslo.
Combining low-dimensional clinical and high-dimensional molecular sources of information in a survival prediction model is not straightforward. Several issues arise from their difference in nature: the former is characterized by a few predictors whose predictive value is usually well validated in the biomedical literature; the latter by a large number of predictors and a low signal-to-noise ratio. Different strategies have been proposed to combine the two sources of data efficiently, mainly aiming to fully exploit the clinical information despite the noise in the molecular part. In this talk we show how these strategies work in practice, with a special focus on their performance when used within a statistical boosting procedure. Merging the power of a machine learning algorithm with the interpretability of a statistical model, boosting is one of the most interesting approaches for dealing with both low- and high-dimensional sources of data. The results are illustrated through two real-data examples.