Journal article
A sparse PLS for variable selection when integrating omics data
KA Lê Cao, D Rossouw, C Robert-Granié, P Besse
Statistical Applications in Genetics and Molecular Biology | Published: 2008
Abstract
Recent biotechnology advances allow for multiple types of omics data, such as transcriptomic, proteomic or metabolomic data sets, to be integrated. The problem of feature selection has been addressed several times in the context of classification, but needs to be handled in a specific manner when integrating data. In this study, we focus on the integration of two-block data that are measured on the same samples. Our goal is to combine integration and simultaneous variable selection of the two data sets in a one-step procedure using a Partial Least Squares regression (PLS) variant to facilitate the biologists' interpretation. A novel computational methodology called "sparse PLS" is introduced ...