
High-dimensional statistics

We work on methodological problems related to the analysis of high-dimensional data.


About the group

High-dimensional statistics is about analyzing data where the number of variables (or features) is comparable to, or larger than, the number of observations. This type of data has become common in medical research, as well as in other sciences, in recent decades. Analysis of high-dimensional data has thus become an important topic in statistics, with links to machine learning and artificial intelligence. Such data pose a number of new methodological challenges, as traditional methods may break down in unexpected ways.
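As one illustration of how a traditional method can break down, the sketch below fits ordinary least squares when there are more variables than observations. It is only a minimal example under stated assumptions: the data are simulated, the response is pure noise, and the function name `ols_breakdown_demo` and its parameters are hypothetical.

```python
import numpy as np

def ols_breakdown_demo(n=20, p=50, seed=0):
    """Fit least squares with more variables (p) than observations (n)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))   # simulated design matrix (assumption)
    y = rng.standard_normal(n)        # response is pure noise, unrelated to X

    # The rank of X (and hence of X'X) is at most n < p, so X'X is singular
    # and the least-squares coefficients are not unique.
    rank = int(np.linalg.matrix_rank(X))

    # lstsq returns the minimum-norm solution among infinitely many.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)

    # The fit reproduces the noise almost exactly on the training data.
    train_residual = float(np.linalg.norm(y - X @ beta))
    return rank, train_residual

rank, train_residual = ols_breakdown_demo()
print(rank, train_residual)
```

Even though the response carries no signal, the fitted model matches the training data essentially perfectly, which is why regularization, screening or selection is typically needed in this regime.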

We are interested in all aspects of the analysis of high-dimensional data, but our research has mainly focused on two topics: data integration, and variable screening and selection.

Data integration

Multiple data sources give complementary information about systems or individuals, from different angles and on different scales. Each source contributes unique information, but there is also considerable overlap between sources, often in the presence of heavy noise. Disentangling signal from noise in complex and high-dimensional data is therefore a key step. Much of the existing work has focused on genomics, where one has to deal with different layers or levels of genomic information. Efficient integration of complementary information from these multiple levels can greatly facilitate the discovery of true causes and states of disease in specific sub-groups of patients sharing a common genetic background. Our focus has mainly been on data integration by means of dimension reduction techniques, an area in which we have made a number of contributions.

Variable selection and feature screening

The selection or identification of a sparse and manageable set of important variables is a crucial problem in many high-dimensional settings. In ultra-high-dimensional situations, a pre-screening step might also be necessary. We have contributed to this literature by developing robust feature screening methods, as well as methods for variable selection in high-dimensional measurement error problems, high-dimensional mixed models and high-dimensional mediation problems. Our current focus is on variable selection with error control, much of which falls under the heading of post-selection inference.
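To make the idea of a pre-screening step concrete, the sketch below ranks variables by their absolute marginal correlation with the response and keeps the top few. This is plain marginal correlation screening on simulated data, shown for illustration only; it is not one of the robust screening methods developed by the group, and the function name `marginal_screen` and all parameter values are hypothetical.

```python
import numpy as np

def marginal_screen(X, y, keep):
    """Return the indices of the `keep` variables most correlated with y."""
    Xc = X - X.mean(axis=0)           # center each variable
    yc = y - y.mean()                 # center the response
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    corr = np.abs(Xc.T @ yc) / denom  # absolute marginal correlations
    top = np.argsort(corr)[::-1][:keep]
    return np.sort(top)

# Simulated example (assumption): 2000 variables, only the first two matter.
rng = np.random.default_rng(1)
n, p = 100, 2000
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + rng.standard_normal(n)

selected = marginal_screen(X, y, keep=20)
print(selected)
```

After screening, a variable selection method would be applied to the much smaller retained set. Marginal screening can miss variables that matter only jointly with others, which is one reason more refined screening methods are of interest.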

Published Feb. 24, 2011 8:29 PM - Last modified Sep. 1, 2023 10:54 AM

Contact

Dept. of Biostatistics
Domus Medica
Gaustad
Sognsvannsveien 9
0372 Oslo
 

Group leader

Participants
