Conference Proceedings

Application of Information Retrieval Techniques for Source Code Authorship Attribution

Steven Burrows, Alexandra L Uitdenbogerd, Andrew Turpin, X Zhou (ed.), H Yokota (ed.), K Deng (ed.), Q Liu (ed.)

Proceedings of the 14th International Conference on Database Systems for Advanced Applications | SPRINGER-VERLAG BERLIN | Published: 2009


Authorship attribution assigns works of contentious authorship to their rightful owners, solving cases of theft, plagiarism and authorship disputes in academia and industry. In this paper we investigate the application of information retrieval techniques to attribution of authorship of C source code. In particular, we explore novel methods for converting C code into documents suitable for retrieval systems, experimenting with 1,597 student programming assignments. We investigate several possible program derivations, partition attribution results by original program length to measure effectiveness on modest and lengthy programs separately, and evaluate three different methods for interpreting ..
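As an illustration of what "converting C code into documents suitable for retrieval systems" might look like, the following is a minimal sketch only. The abstract does not specify the conversion, so everything here is an assumption: the keyword set, the collapsing of identifiers to `ID` and numbers to `NUM`, the n-gram size, and the function names `tokenize` and `to_ngram_document` are all hypothetical and are not taken from the paper.

```python
import re

# Hypothetical feature set: a small sample of C keywords (assumption, not from the paper).
KEYWORDS = {"if", "else", "for", "while", "return", "int", "char", "void"}

# Identifiers, integer literals, a few two-character operators, then single symbols.
TOKEN_RE = re.compile(
    r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|&&|\|\||\+\+|--|[-+*/%=<>!&|(){}\[\];,]"
)


def tokenize(source):
    """Map C source text to a list of coarse feature tokens."""
    tokens = []
    for lexeme in TOKEN_RE.findall(source):
        first = lexeme[0]
        if first.isalpha() or first == "_":
            # Keep keywords; collapse every other identifier to one symbol.
            tokens.append(lexeme if lexeme in KEYWORDS else "ID")
        elif first.isdigit():
            # Collapse integer literals to one symbol.
            tokens.append("NUM")
        else:
            # Operators and punctuation are kept as-is.
            tokens.append(lexeme)
    return tokens


def to_ngram_document(source, n=3):
    """Join overlapping token n-grams into a space-separated 'document'."""
    tokens = tokenize(source)
    grams = ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return " ".join(grams)


if __name__ == "__main__":
    print(to_ngram_document("int x = 0; return x;"))
```

Each resulting n-gram acts as a "word" that a text retrieval system could index; a program of unknown authorship, converted the same way, could then be used as a query against indexed programs of known authorship. Whether this matches the derivations evaluated in the paper cannot be determined from the abstract alone.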

