Journal article

Unsupervised pattern recognition of mixed data structures with numerical and categorical features using a mixture regression modelling framework

Shu-Kay Ng, Richard Tawiah, Geoffrey J McLachlan

Pattern Recognition | Elsevier | Published : 2019


In the present era of “Big Data”, data collection involving a massive number of features with a mix of variable types is commonplace. Mixture model-based techniques for statistical cluster analysis of mixed numerical and categorical feature data have their limitations, due to the difficulty in specifying appropriate component densities when common multivariate distributions become invalid. This problem is particularly apparent in applications where the outcome feature variables are in a categorical form. An example of such an application is the analysis of binary morbidity data in a national health survey, where the aims are to quantify heterogeneous comorbidity patterns of health conditions and..

Awarded by Australian Research Council

Funding Acknowledgements

The authors wish to thank the Editor, an Associate Editor, and three reviewers for helpful comments on the paper. This work was supported by the Australian Research Council (Grant number DP170100907). The authors have no competing interests to declare.