Post doc Sahar Hassani
Identifying the polygenic basis of the human brain and psychiatric disorders
Sahar Hassani, PhD
Owing to the rapid development of systems biology, researchers face many new challenges: handling the large number of data sets generated by different –omics techniques, integrating and analyzing them, and finally interpreting the results in a meaningful way. Collecting the data from each technique in a separate data matrix results in a multi-block multivariate data set containing different types of measurements on the same samples. An example of a multi-block data set is illustrated in Figure 1. As the figure shows, the blocks of a multi-block data set always contain the same set of samples but different sets of variables.
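To make the structure concrete, here is a minimal sketch of a multi-block data set in Python with NumPy. The block names (`snp`, `mri`, `clinical`), their sizes, and the helper `check_multiblock` are hypothetical and chosen only for illustration; they are not taken from Figure 1, and the values are random numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 6  # the same individuals are measured in every block

# Hypothetical blocks: rows are samples, columns are block-specific variables.
blocks = {
    "snp": rng.normal(size=(n_samples, 10)),
    "mri": rng.normal(size=(n_samples, 4)),
    "clinical": rng.normal(size=(n_samples, 3)),
}


def check_multiblock(blocks):
    """Return the shared number of samples, or raise if the blocks disagree."""
    row_counts = {name: X.shape[0] for name, X in blocks.items()}
    if len(set(row_counts.values())) != 1:
        raise ValueError(f"blocks do not share one sample set: {row_counts}")
    return next(iter(row_counts.values()))


shared_n = check_multiblock(blocks)

# Because the rows line up, the blocks can be placed side by side.
X_concat = np.hstack(list(blocks.values()))
print(shared_n, X_concat.shape)
```

The only structural requirement is that the rows of every block refer to the same samples in the same order; the number and type of columns may differ freely between blocks.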
Several statistical methods have been applied in systems biology, and the use of chemometric approaches for integrating and analyzing systems biology data has recently increased. Different chemometric methods are potentially available for integrating –omics data and detecting variable and sample patterns. An important challenge is to decide which method to use for analyzing –omics data sets and how to pre-process the data sets for this purpose. In addition, special attention must be given to the validity of the detected patterns, because visual perception can be misleading: a scientist's mind is always looking for patterns and groupings.
During my Ph.D. studies, I worked on developing multi-block methods for integrating different types of systems biology data and investigating the co-variation patterns among the measured variables. A special focus was given to validating the results of the multi-block methods, and several graphical tools were introduced for this purpose. A multi-block framework was also built for pre-processing, integrating and analyzing –omics data sets. To build the framework, I used exploratory chemometric approaches: the unsupervised Consensus Principal Component Analysis (CPCA) and the supervised Multi-block Partial Least Squares Regression (MB-PLSR).
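As an illustration of the unsupervised part, the following is a minimal sketch of CPCA in Python with NumPy. It is not the framework described above. It assumes one common formulation in which every block is autoscaled and divided by the square root of its number of variables, the global (super) scores are obtained as PCA scores of the concatenated blocks, and each block is deflated with the global scores. The function names, block names and random data are hypothetical.

```python
import numpy as np


def autoscale(X):
    """Center each column and scale it to unit variance."""
    Xc = X - X.mean(axis=0)
    sd = Xc.std(axis=0, ddof=1)
    sd[sd == 0] = 1.0  # leave constant columns at zero instead of dividing by 0
    return Xc / sd


def cpca_sketch(blocks, n_components=2):
    """Return global scores, block scores and block weights (assumed variant)."""
    # Block scaling so that large blocks do not dominate (an assumption).
    resid = {k: autoscale(X) / np.sqrt(X.shape[1]) for k, X in blocks.items()}
    X = np.hstack(list(resid.values()))

    # Global scores taken as PCA scores of the concatenated matrix.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    global_scores = U[:, :n_components] * s[:n_components]

    n = X.shape[0]
    block_scores = {k: np.zeros((n, n_components)) for k in blocks}
    block_weights = {k: np.zeros(n_components) for k in blocks}
    for a in range(n_components):
        t = global_scores[:, a]
        for k in blocks:
            p = resid[k].T @ t / (t @ t)      # block loading for component a
            norm = np.linalg.norm(p)
            if norm > 0:
                block_scores[k][:, a] = resid[k] @ (p / norm)
            block_weights[k][a] = norm        # contribution of block k
            resid[k] = resid[k] - np.outer(t, p)  # deflate with global score
    return global_scores, block_scores, block_weights


rng = np.random.default_rng(0)
blocks = {
    "snp": rng.normal(size=(8, 12)),
    "mri": rng.normal(size=(8, 5)),
    "clinical": rng.normal(size=(8, 3)),
}
global_scores, block_scores, block_weights = cpca_sketch(blocks)
print(global_scores.shape, {k: v.shape for k, v in block_scores.items()})
```

In this sketch the global scores summarize the pattern shared across blocks, while each block's scores show how that pattern appears within the block. Each global score is the sum of the block scores weighted by the block weights, which gives a simple consistency check on the decomposition.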
The focus of my current postdoctoral research project at NORMENT is to identify human brain variation in psychiatric disorders. I plan to apply the above-mentioned multi-block framework to different types of data from psychiatric disorders, e.g. the multi-block data set in Figure 1. The ongoing challenge will be to integrate these different measurements, analyze them in light of background knowledge and interpret the outcomes. The multi-block tools make it possible to investigate the common underlying patterns in such complex multi-block data sets: both the pattern shared by all data blocks and how this global pattern is expressed in each block. The eventual aim of my project is to discover single nucleotide polymorphisms (SNPs) and patterns of brain variation associated with different clinical diagnoses.
The overall aim of the current project “Identifying the polygenic basis of the human brain and psychiatric disorders” (Research Council of Norway) is to identify common and rare genetic factors associated with altered brain development, and to explore their role in neurodevelopmental disorders. The objectives are to generate genetic atlases of the human cortex based on MRI and SNP data and to discover SNPs associated with individual genetic divisions of the cortical cluster maps. Further, we want to combine MRI and genetic information for patient stratification and risk prediction.