Journal article

Predicting gene essentiality in Caenorhabditis elegans by feature engineering and machine-learning

Tulio L Campos, Pasi K Korhonen, Paul W Sternberg, Robin B Gasser, Neil D Young

Computational and Structural Biotechnology Journal | ELSEVIER | Published : 2020


Defining genes that are essential for life has major implications for understanding critical biological processes and mechanisms. Although essential genes have been identified and characterised experimentally using functional genomic tools, it is challenging to predict such genes with confidence from molecular and phenomic data sets using computational methods. Using extensive data sets available for the model organism Caenorhabditis elegans, we constructed a machine-learning (ML)-based workflow for the prediction of essential genes on a genome-wide scale. We identified strong predictors for such genes and showed that trained ML models consistently achieve highly accurate classification...


Awarded by U.S. National Institutes of Health

Funding Acknowledgements

This research was funded by grants from the National Health and Medical Research Council (NHMRC) of Australia and the Australian Research Council (ARC) to RBG, PKK and/or NDY. Other support to RBG was from Melbourne Water. NDY was supported by a Career Development Fellowship, and PKK by an Early Career Research Fellowship, both from the NHMRC. TLC was a recipient of a Research Training Program Scholarship from the Australian Government and is also supported by the Oswaldo Cruz Foundation (Fiocruz/Brazil). PWS was supported by U.S. National Institutes of Health grant U24-HG002223.