Conference Proceedings

Exploiting Worker Correlation for Label Aggregation in Crowdsourcing

Yuan Li, Benjamin Rubinstein, Trevor Cohn, Kamalika Chaudhuri (ed.), Ruslan Salakhutdinov (ed.)

Proceedings of the 36th International Conference on Machine Learning | International Machine Learning Society | Published: 2019


Crowdsourcing has emerged as a core component of data science pipelines. From collected noisy worker labels, aggregation models that incorporate worker reliability parameters aim to infer a latent true annotation. In this paper, we argue that existing crowdsourcing approaches do not sufficiently model worker correlations observed in practical settings; we propose in response an enhanced Bayesian classifier combination (EBCC) model, with inference based on a mean-field variational approach. An introduced mixture of intra-class reliabilities---connected to tensor decomposition and item clustering---induces inter-worker correlation. EBCC does not suffer the limitations of existing correlation m..



Awarded by Australian Research Council

Funding Acknowledgements

This work was sponsored by Facebook AI and the Defense Advanced Research Projects Agency Information Innovation Office (I2O) under the Low Resource Languages for Emergent Incidents (LORELEI) program issued by DARPA/I2O under Contract No. HR0011-15-C-0114. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government. Benjamin Rubinstein was supported by the Australian Research Council, DP150103710.