Journal article
A Bayesian mixture model for clustering and selection of feature occurrence rates under mean constraints
Q Li, M Guindani, BJ Reich, HD Bondell, M Vannucci
Statistical Analysis and Data Mining | WILEY | Published: 2017
DOI: 10.1002/sam.11350
Abstract
In this paper, we consider the problem of modeling a matrix of count data, where multiple features are observed as counts over a number of samples. Due to the nature of the data generating mechanism, such data are often characterized by a high number of zeros and overdispersion. In order to take into account the skewness and heterogeneity of the data, some type of normalization and regularization is necessary for conducting inference on the occurrences of features across samples. We propose a zero-inflated Poisson mixture modeling framework that incorporates a model-based normalization through prior distributions with mean constraints, as well as a feature selection mechanism, which allows u..
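As a rough illustration of the data characteristics the abstract describes (many zeros and overdispersion in a samples-by-features count matrix), the sketch below simulates counts from a plain zero-inflated Poisson distribution and compares each feature's variance and zero fraction with what an ordinary Poisson of the same mean would give. It is not the authors' model: it has no mixture prior, no mean constraints, and no feature selection. All function names, rates, and the zero-inflation probability are hypothetical values chosen for illustration.

```python
import math
import random
from statistics import mean, pvariance


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate using Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_zip(rate, zero_prob, rng):
    """Zero-inflated Poisson: a structural zero with probability zero_prob,
    otherwise an ordinary Poisson(rate) draw."""
    if rng.random() < zero_prob:
        return 0
    return sample_poisson(rate, rng)


def simulate_counts(n_samples, rates, zero_prob, seed=0):
    """Return an n_samples x len(rates) matrix (list of rows) of ZIP counts.
    The rates and zero_prob are illustrative assumptions, not fitted values."""
    rng = random.Random(seed)
    return [[sample_zip(r, zero_prob, rng) for r in rates]
            for _ in range(n_samples)]


def summarize(counts):
    """Per-feature mean, variance, observed zero fraction, and the zero
    probability a plain Poisson with the same mean would imply."""
    summaries = []
    for column in zip(*counts):
        m = mean(column)
        summaries.append({
            "mean": m,
            "variance": pvariance(column),
            "zero_fraction": sum(1 for c in column if c == 0) / len(column),
            "poisson_zero_prob": math.exp(-m),
        })
    return summaries


counts = simulate_counts(n_samples=2000, rates=[2.0, 8.0], zero_prob=0.4)
summary = summarize(counts)
for j, s in enumerate(summary):
    print(f"feature {j}: mean={s['mean']:.2f} var={s['variance']:.2f} "
          f"zeros={s['zero_fraction']:.2f} "
          f"poisson_zeros={s['poisson_zero_prob']:.2f}")
```

Under these assumptions, each simulated feature should show a variance above its mean and more zeros than a Poisson with the same mean would predict, which is the pattern that motivates a zero-inflated model.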