An Evaluation of Machine Learning Approaches for the Prediction of Essential Genes in Eukaryotes Using Protein Sequence-Derived Features
Tulio L Campos, Pasi K Korhonen, Robin B Gasser, Neil D Young
Computational and Structural Biotechnology Journal | Elsevier | Published: 2019
The availability of whole-genome sequences and associated multi-omics data sets, combined with advances in gene knockout and knockdown methods, has enabled large-scale annotation and exploration of gene and protein functions in eukaryotes. Knowing which genes are essential for the survival of eukaryotic organisms is paramount for an understanding of the basic mechanisms of life, and could assist in identifying intervention targets in eukaryotic pathogens and cancer. Here, we studied essential gene orthologs among selected species of eukaryotes, and then employed a systematic machine-learning approach, using protein sequence-derived features and selection procedures, to investigate essential ...
This research was funded by grants from the National Health and Medical Research Council (NHMRC) and the Australian Research Council (ARC) to RBG and NDY. Other support from Yourgene Bioscience and Melbourne Water Corporation is gratefully acknowledged (RBG). NDY is supported by a Career Development Fellowship, and PKK by an Early Career Research Fellowship, from the NHMRC. TLC is a recipient of a Research Training Program Scholarship from the Australian Government and is also supported by the Oswaldo Cruz Foundation (Fiocruz/Brazil).