Fusing Data Mining, Machine Learning and Traditional Statistics to Detect Biomarkers Associated with Depression
Joanna F Dipnall, Julie A Pasco, Michael Berk, Lana J Williams, Seetal Dodd, Felice N Jacka, Denny Meyer
PLOS ONE | PUBLIC LIBRARY OF SCIENCE | Published: 2016
BACKGROUND: Atheoretical large-scale data mining techniques using machine learning algorithms have promise in the analysis of large epidemiological datasets. This study illustrates the use of a hybrid methodology for variable selection that took account of missing data and complex survey design to identify key biomarkers associated with depression from a large epidemiological study. METHODS: The study used a three-step methodology amalgamating multiple imputation, a machine learning boosted regression algorithm and logistic regression, to identify key biomarkers associated with depression in the National Health and Nutrition Examination Study (2009-2010). Depression was measured using the Pa..
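The three-step workflow named in METHODS (multiple imputation, boosted regression for variable selection, then logistic regression) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation. All function and variable names are hypothetical. Random draws from observed values stand in for a proper multiple-imputation model, the boosting step uses simple one-split stumps, and the complex survey design is ignored.

```python
import math
import random

def sigmoid(z):
    # Clamp the argument so math.exp stays within floating-point range.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def impute_once(rows, rng):
    """Step 1 (simplified assumption): replace each missing value (None)
    with a random draw from the observed values of the same column."""
    observed = [[r[j] for r in rows if r[j] is not None]
                for j in range(len(rows[0]))]
    return [[v if v is not None else rng.choice(observed[j])
             for j, v in enumerate(r)] for r in rows]

def boosted_importance(X, y, rounds=15, rate=0.3):
    """Step 2 (simplified assumption): gradient boosting with one-split
    stumps on the logistic loss. Returns, per column, the total reduction
    in squared error of the residual fit credited to that column."""
    n, p = len(X), len(X[0])
    score = [0.0] * n
    importance = [0.0] * p
    for _ in range(rounds):
        resid = [yi - sigmoid(s) for yi, s in zip(y, score)]
        total = sum(resid)
        best = None
        for j in range(p):
            # Candidate thresholds: every distinct value except the largest,
            # so both sides of the split are non-empty.
            for t in sorted(set(x[j] for x in X))[:-1]:
                left = [r for x, r in zip(X, resid) if x[j] <= t]
                sl, nl = sum(left), len(left)
                sr, nr = total - sl, n - nl
                gain = sl * sl / nl + sr * sr / nr - total * total / n
                if best is None or gain > best[0]:
                    best = (gain, j, t, sl / nl, sr / nr)
        if best is None:
            break
        gain, j, t, mean_left, mean_right = best
        importance[j] += gain
        score = [s + rate * (mean_left if x[j] <= t else mean_right)
                 for s, x in zip(score, X)]
    return importance

def fit_logistic(X, y, steps=300, rate=0.5):
    """Step 3: unweighted logistic regression by gradient descent.
    Returns [intercept, coef_1, ..., coef_p]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for x, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wk * xk for wk, xk in zip(w[1:], x))) - yi
            grad[0] += err
            for k in range(p):
                grad[k + 1] += err * x[k]
        w = [wk - rate * g / n for wk, g in zip(w, grad)]
    return w

# Synthetic data (assumption): 4 candidate "biomarkers"; only column 0
# is related to the binary outcome. About 10% of values are set missing.
rng = random.Random(0)
full = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(120)]
y = [1 if rng.random() < sigmoid(2.0 * r[0]) else 0 for r in full]
rows = [[None if rng.random() < 0.1 else v for v in r] for r in full]

m = 3  # number of imputed datasets
imputed = [impute_once(rows, rng) for _ in range(m)]

# Average the boosted importance over imputations and keep the top 2 columns.
per_dataset = [boosted_importance(X, y) for X in imputed]
mean_importance = [sum(col) / m for col in zip(*per_dataset)]
selected = sorted(range(4), key=lambda j: -mean_importance[j])[:2]

# Fit the logistic model on the selected columns in each imputed dataset
# and pool the coefficients by averaging (the point estimate in Rubin's rules).
fits = [fit_logistic([[r[j] for j in selected] for r in X], y) for X in imputed]
pooled = [sum(c) / m for c in zip(*fits)]

print("selected columns:", selected)
print("pooled coefficients:", pooled)
```

Averaging importance across the imputed datasets before selecting variables is one possible design choice; a real analysis would also need pooled standard errors and survey weights, both of which this sketch omits.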
Awarded by NHMRC
Michael Berk is supported by an NHMRC Senior Principal Research Fellowship 1059660 and Lana J Williams is supported by an NHMRC Career Development Fellowship 1064272. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.