Two uses of stagewise regression: from landmarking in cancer patients to deep learning for SNPs
Speaker: Harald Binder, Professor, Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center, Johannes Gutenberg University Mainz, Germany.
At the end of the seminar simple food and refreshments will be served.
Regularization techniques make it possible to estimate the parameters of a regression model and to perform variable selection even when the number of covariates is larger than the number of observations. Penalized likelihood approaches, e.g. the Lasso for variable selection, are the most prominent way of performing such regularization. Stagewise regression techniques, also known as componentwise boosting, provide an alternative to such penalized approaches. Starting from estimates all equal to zero, one parameter is updated in each of a potentially large number of steps. Typically, this does not stop at the maximum of a global criterion, such as a penalized log-likelihood, which makes such approaches difficult to treat analytically and might be seen as a disadvantage. Yet, there are two distinct advantages: optimization in settings where no (overall) likelihood is available, and speed. I will show two applications where these advantages play out. In a dynamic prediction application with data from hepatocellular carcinoma patients, effects of patient characteristics on the cumulative incidence of death (in the presence of competing risks) are investigated using pseudo-value regression models at different landmarks. Stagewise regression makes it possible to couple variable selection between landmarks without enforcing similarity between parameter estimates. As a result, the approach selects a set of variables that is relevant for all landmarks. In a second application, the aim is to search for potentially complex patterns in molecular data, namely single nucleotide polymorphisms (SNPs), which might be relevant for patient prognosis. The patterns are modeled by a deep learning approach, specifically deep Boltzmann machines, which cannot be applied in a straightforward manner if the number of SNPs is larger than the number of observations.
Thus, stagewise regression is used to obtain a rather crude (implicit) estimate of the joint distribution of SNPs, which forms the basis for partitioned training of deep Boltzmann machines. The resulting SNP patterns are seen to be relevant with respect to clinical outcomes.
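
For readers unfamiliar with the stagewise idea mentioned in the abstract (start from all-zero estimates, update one parameter per step, stop after a fixed number of steps), the following is a minimal illustrative sketch. It assumes a plain squared-error loss, which is not what the talk's applications use; the function name `stagewise_fit`, the step size and the toy data are hypothetical choices made only for this illustration and are not taken from the speaker's work.

```python
def stagewise_fit(X, y, steps=200, step_size=0.1):
    """Componentwise stagewise least squares (illustrative sketch only).

    X is a list of rows (lists of floats), y a list of floats.
    Returns the coefficient list; covariates never selected keep 0.0.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p          # start from estimates all equal to zero
    resid = list(y)           # residuals of the current fit
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(steps):
        best_j, best_gain, best_coef = None, 0.0, 0.0
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            # univariate least-squares fit of the residuals on covariate j
            coef = sum(X[i][j] * resid[i] for i in range(n)) / col_ss[j]
            gain = coef * coef * col_ss[j]  # drop in residual sum of squares
            if gain > best_gain:
                best_j, best_gain, best_coef = j, gain, coef
        if best_j is None:    # no covariate improves the fit any further
            break
        # update only the selected parameter, by a damped step
        delta = step_size * best_coef
        beta[best_j] += delta
        for i in range(n):
            resid[i] -= delta * X[i][best_j]
    return beta


# Toy data (assumed): 4 observations, 6 covariates, y = 2 * first covariate.
X = [
    [1.0, 1.0, 1.0, 1.0, 1.0, 0.0],
    [-1.0, 1.0, -1.0, 1.0, 0.0, 1.0],
    [1.0, -1.0, -1.0, 1.0, 0.0, 0.0],
    [-1.0, -1.0, 1.0, 1.0, 0.0, 0.0],
]
y = [2.0, -2.0, 2.0, -2.0]
beta = stagewise_fit(X, y)
```

In this toy setting there are more covariates than observations, yet only the first coefficient is ever updated, so variable selection falls out of the update rule itself. The number of steps acts as the regularization parameter: stopping early leaves the selected coefficient shrunken towards zero.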