Accurate multistage prediction of protein crystallization propensity using deep-cascade forest with sequence-based features.
Yi-Heng Zhu, Jun Hu, Fang Ge, Fuyi Li, Jiangning Song, Yang Zhang, Dong-Jun Yu
Brief Bioinform | Published : 2021
X-ray crystallography is the major approach for determining atomic-level protein structures. Because not all proteins can be easily crystallized, accurate prediction of protein crystallization propensity provides critical help in guiding experimental design and improving the success rate of X-ray crystallography experiments. This study has developed a new machine-learning-based pipeline that uses a newly developed deep-cascade forest (DCF) model with multiple types of sequence-based features to predict protein crystallization propensity. Based on the developed pipeline, two new protein crystallization propensity predictors, denoted as DCFCrystal and MDCFCrystal, have been implemented. DCFCry..View full abstract