Fusing data mining, machine learning and traditional statistics to detect biomarkers associated with depression

JF Dipnall; JA Pasco; M Berk; LJ Williams; S Dodd; FN Jacka; D Meyer

Journal article

PLOS ONE | Published: 2016

DOI: 10.1371/journal.pone.0148195

Open access

Abstract

Background: Atheoretical large-scale data mining techniques using machine learning algorithms have promise in the analysis of large epidemiological datasets. This study illustrates the use of a hybrid methodology for variable selection that took account of missing data and complex survey design to identify key biomarkers associated with depression from a large epidemiological study.

Methods: The study used a three-step methodology amalgamating multiple imputation, a machine learning boosted regression algorithm and logistic regression, to identify key biomarkers associated with depression in the National Health and Nutrition Examination Study (2009-2010). Depression was measured using the Pat..

University of Melbourne Researchers

Michael Berk Author

Grants

Awarded by National Health and Medical Research Council

Funding Acknowledgements

Michael Berk is supported by an NHMRC Senior Principal Research Fellowship 1059660 and Lana J Williams is supported by an NHMRC Career Development Fellowship 1064272. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.