Journal article

Fusing Data Mining, Machine Learning and Traditional Statistics to Detect Biomarkers Associated with Depression

Joanna F Dipnall, Julie A Pasco, Michael Berk, Lana J Williams, Seetal Dodd, Felice N Jacka, Denny Meyer

PLOS ONE | Public Library of Science | Published: 2016

Abstract

BACKGROUND: Atheoretical large-scale data mining techniques using machine learning algorithms have promise in the analysis of large epidemiological datasets. This study illustrates the use of a hybrid methodology for variable selection that took account of missing data and complex survey design to identify key biomarkers associated with depression from a large epidemiological study. METHODS: The study used a three-step methodology amalgamating multiple imputation, a machine learning boosted regression algorithm and logistic regression, to identify key biomarkers associated with depression in the National Health and Nutrition Examination Study (2009-2010). Depression was measured using the Pa..
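The three-step idea named in the abstract (impute, boost to rank variables, then fit a logistic regression) can be illustrated with a minimal sketch. This is not the authors' implementation. It uses only the Python standard library and synthetic data, and it swaps in simple stand-ins: random hot-deck draws for multiple imputation, least-squares boosted decision stumps for the boosted regression algorithm, and gradient-descent logistic regression with coefficients averaged across imputed datasets. The complex survey design is ignored entirely. All function names (`impute`, `boosted_importance`, `fit_logistic`, `three_step`, `make_data`) are hypothetical.

```python
import math
import random

def impute(rows, rng):
    """Fill each missing value (None) with a random observed value from its column."""
    observed = [[v for v in col if v is not None] for col in zip(*rows)]
    return [[v if v is not None else rng.choice(observed[j])
             for j, v in enumerate(row)] for row in rows]

def best_stump(X, residuals):
    """Return the one-split stump with the lowest squared error on the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X))[:-1]:
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - ml) ** 2 for r in left)
                   + sum((r - mr) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best

def boosted_importance(X, y, rounds=10, rate=0.3):
    """Boost stumps on residuals; importance = total squared-error reduction per feature."""
    pred = [sum(y) / len(y)] * len(y)
    importance = [0.0] * len(X[0])
    for _ in range(rounds):
        res = [yi - pi for yi, pi in zip(y, pred)]
        mean = sum(res) / len(res)
        base = sum((r - mean) ** 2 for r in res)
        stump = best_stump(X, res)
        if stump is None:  # no feature has more than one distinct value
            break
        sse, j, t, ml, mr = stump
        importance[j] += base - sse
        pred = [p + rate * (ml if row[j] <= t else mr) for row, p in zip(X, pred)]
    return importance

def fit_logistic(X, y, steps=300, rate=0.1):
    """Logistic regression by batch gradient descent; returns [intercept, coefficients...]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for row, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            err = 1 / (1 + math.exp(-z)) - yi
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj - rate * g / len(X) for wj, g in zip(w, grad)]
    return w

def three_step(rows, y, n_imputations=3, keep=2, seed=0):
    """Impute several times, rank variables by boosting, refit a logistic model."""
    rng = random.Random(seed)
    completed = [impute(rows, rng) for _ in range(n_imputations)]        # step 1
    total = [0.0] * len(rows[0])
    for X in completed:                                                  # step 2
        for j, imp in enumerate(boosted_importance(X, y)):
            total[j] += imp
    selected = sorted(sorted(range(len(total)), key=lambda j: -total[j])[:keep])
    fits = [fit_logistic([[row[j] for j in selected] for row in X], y)   # step 3
            for X in completed]
    coefs = [sum(f[k] for f in fits) / len(fits) for k in range(keep + 1)]
    return selected, coefs

def make_data(n=150, seed=1):
    """Synthetic data: columns 0 and 2 drive the outcome; about 10% of values are missing."""
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(4)]
        p = 1 / (1 + math.exp(-(2.0 * x[0] - 2.0 * x[2])))
        y.append(1 if rng.random() < p else 0)
        rows.append([None if rng.random() < 0.1 else v for v in x])
    return rows, y

if __name__ == "__main__":
    rows, y = make_data()
    selected, coefs = three_step(rows, y)
    print("selected columns:", selected)
    print("pooled coefficients (intercept first):", coefs)
```

Averaging the coefficients across imputed datasets gives only the pooled point estimate. A fuller analysis would also combine the within- and between-imputation variances and apply the survey weights.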
