Conference Proceedings

Improving k-Nearest Neighbour Classification with Distance Functions Based on Receiver Operating Characteristics

Md Rafiul Hassan, M Maruf Hossain, James Bailey, Kotagiri Ramamohanarao, W Daelemans (ed.), B Goethals (ed.), K Morik (ed.)

MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, PART I, PROCEEDINGS | SPRINGER-VERLAG BERLIN | Published : 2008

Abstract

The k-nearest neighbour (k-NN) technique, due to its interpretable nature, is a simple and intuitively appealing method for classification problems. However, choosing an appropriate distance function for k-NN can be challenging, and an inferior choice can make the classifier highly vulnerable to noise in the data. In this paper, we propose a new method for determining a good distance function for k-NN. Our method is based on the area under the Receiver Operating Characteristics (ROC) curve, a well-known measure of the quality of binary classifiers. It computes weights for the distance function, based on ROC properties within an appropriate neighbou..
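
The general idea of deriving distance weights from ROC properties can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are cut off in the abstract above. It assumes a simpler, global scheme: each feature is scored by its own AUC on the training data, the hypothetical weight 2·|AUC − 0.5| is assigned to it, and k-NN then votes using a weighted Euclidean distance. All function names, the weighting formula, and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def feature_auc(values, labels):
    """AUC of one feature used as a score for the positive class (label 1).

    Computed as the Mann-Whitney statistic: the fraction of
    (positive, negative) pairs ranked correctly, with ties counted as 0.5.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        return 0.5  # AUC is undefined with one class; treat as uninformative
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_weights(X, y):
    """Hypothetical weighting (an assumption): w_j = 2 * |AUC_j - 0.5|.

    A feature that separates the classes perfectly gets weight 1,
    a feature no better than chance gets weight 0.
    """
    n_features = len(X[0])
    return [
        2.0 * abs(feature_auc([row[j] for row in X], y) - 0.5)
        for j in range(n_features)
    ]


def weighted_distance(a, b, w):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(wj * (aj - bj) ** 2 for aj, bj, wj in zip(a, b, w)))


def knn_predict(X, y, query, w, k=3):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(zip(X, y), key=lambda pair: weighted_distance(pair[0], query, w))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (made up): feature 0 separates the classes, feature 1 is mostly noise.
X_train = [[0.1, 5.0], [0.2, 1.0], [0.3, 9.0], [0.8, 2.0], [0.9, 8.0], [1.0, 4.0]]
y_train = [0, 0, 0, 1, 1, 1]

weights = auc_weights(X_train, y_train)
prediction = knn_predict(X_train, y_train, [0.85, 5.0], weights, k=3)
```

In this sketch the noisy feature receives a small weight, so it contributes little to the distance. The paper's method is described as using ROC properties within a neighbourhood, which this global weighting does not attempt to reproduce.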
