Conference Proceedings

Effective and scalable authorship attribution using function words

Y Zhao, J Zobel, GG Lee (ed.), A Yamada (ed.), H Meng (ed.), SH Myaeng (ed.)

INFORMATION RETRIEVAL TECHNOLOGY, PROCEEDINGS | SPRINGER-VERLAG BERLIN | Published: 2005

Abstract

Techniques for identifying the author of an unattributed document can be applied to problems in information analysis and in academic scholarship. A range of methods have been proposed in the research literature, using a variety of features and machine learning approaches, but the methods have been tested on very different data and the results cannot be compared. It is not even clear whether the differences in performance are due to feature selection or other variables. In this paper we examine the use of a large publicly available collection of newswire articles as a benchmark for comparing authorship attribution methods. To demonstrate the value of having a benchmark, we experimentally comp..

