The impact of automated feature selection techniques on the interpretation of defect models

J Jiarpakdee; C Tantithamthavorn; C Treude

Journal article

Empirical Software Engineering | Springer | Published: 2020

DOI: 10.1007/s10664-020-09848-1

Abstract

The interpretation of defect models heavily relies on software metrics that are used to construct them. Prior work often uses feature selection techniques to remove metrics that are correlated and irrelevant in order to improve model performance. Yet, conclusions that are derived from defect models may be inconsistent if the selected metrics are inconsistent and correlated. In this paper, we systematically investigate 12 automated feature selection techniques with respect to the consistency, correlation, performance, computational cost, and the impact on the interpretation dimensions. Through an empirical investigation of 14 publicly-available defect datasets, we find that (1) 94–100% of the..

University of Melbourne Researchers

Christoph Treude (Author)

Related Projects (1)

Automatically summarising and measuring software development activity

Grants

Awarded by Australian Research Council

Funding Acknowledgements

C. Tantithamthavorn is supported by the Australian Research Council's Discovery Early Career Researcher Award (DECRA) funding scheme (DE200100941). C. Treude is supported by the Australian Research Council's Discovery Early Career Researcher Award (DECRA) funding scheme (DE180100153).