Journal article

When Fine-Tuning LLMs Meets Data Privacy: An Empirical Study of Federated Learning in LLM-Based Program Repair

W Luo, JW Keung, B Yang, H Ye, CL Goues, TF Bissyandé, H Tian, B Le

ACM Transactions on Software Engineering and Methodology | Association for Computing Machinery (ACM) | Published: 2026

Abstract

Software systems have been evolving rapidly and inevitably introducing bugs at an increasing rate, leading to significant maintenance costs. While large language models (LLMs) have demonstrated remarkable potential in enhancing software development and maintenance practices, particularly in automated program repair (APR), they rely heavily on high-quality code repositories. Most code repositories are proprietary assets that capture the diversity and nuances of real-world industry software practices, which public datasets cannot fully represent. However, obtaining such data from various industries is hindered by data privacy concerns, as companies are reluctant to share their proprietary code.
