When Fine-Tuning LLMs Meets Data Privacy: An Empirical Study of Federated Learning in LLM-Based Program Repair

W Luo; JW Keung; B Yang; H Ye; CL Goues; TF Bissyandé; H Tian; B Le

Journal article

ACM Transactions on Software Engineering and Methodology | Association for Computing Machinery (ACM) | Published : 2026

DOI: 10.1145/3733599

Abstract

Software systems have been evolving rapidly and inevitably introducing bugs at an increasing rate, leading to significant maintenance costs. While large language models (LLMs) have demonstrated remarkable potential in enhancing software development and maintenance practices, particularly in automated program repair (APR), they rely heavily on high-quality code repositories. Most code repositories are proprietary assets that capture the diversity and nuances of real-world industry software practices, which public datasets cannot fully represent. However, obtaining such data from various industries is hindered by data privacy concerns, as companies are reluctant to share their proprietary code.

University of Melbourne Researchers

Bach Le Author

Grants

Awarded by Government of Western Australia

Citation metrics

Scopus: 3

Dimensions: 11

Keywords

Networking and Information Technology R&D (NITRD)

46 Information and Computing Sciences

4612 Software Engineering