Journal article
When Fine-Tuning LLMs Meets Data Privacy: An Empirical Study of Federated Learning in LLM-Based Program Repair
W Luo, JW Keung, B Yang, H Ye, CL Goues, TF Bissyandé, H Tian, B Le
ACM Transactions on Software Engineering and Methodology | Association for Computing Machinery (ACM) | Published: 2026
DOI: 10.1145/3733599
Abstract
Software systems have been evolving rapidly and inevitably introducing bugs at an increasing rate, leading to significant maintenance costs. While large language models (LLMs) have demonstrated remarkable potential in enhancing software development and maintenance practices, particularly in automated program repair (APR), they rely heavily on high-quality code repositories. Most code repositories are proprietary assets that capture the diversity and nuances of real-world industry software practices, which public datasets cannot fully represent. However, obtaining such data from various industries is hindered by data privacy concerns, as companies are reluctant to share their proprietary code. ...
Grants
Awarded by Government of Western Australia