Conference Proceedings

Evaluating Transfer Learning for Simplifying GitHub READMEs

Haoyu Gao, Christoph Treude, Mansooreh Zahedi, S Chandra (ed.), K Blincoe (ed.), P Tonella (ed.)

Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering | Association for Computing Machinery | Published: 2023

Abstract

Software documentation captures detailed knowledge about a software product, e.g., code, technologies, and design. It plays an important role in the coordination of development teams and in conveying ideas to various stakeholders. However, software documentation can be hard to comprehend if it is written with jargon and complicated sentence structure. In this study, we explored the potential of text simplification techniques in the domain of software engineering to automatically simplify GitHub README files. We collected software-related pairs of GitHub README files consisting of 14,588 entries, aligned difficult sentences with their simplified counterparts, and trained a Transformer-based m..
