Detection of near-duplicate images for web search

JJ Foo, J Zobel, R Sinha, SMM Tahaghoghi

Proceedings of the 6th ACM International Conference on Image and Video Retrieval, CIVR 2007 | Published: 2007


Among the vast numbers of images on the web are many duplicates and near-duplicates, that is, variants derived from the same original image. Such near-duplicates appear in many web image searches and may represent infringements of copyright or indicate the presence of redundancy. While methods for identifying near-duplicates have been investigated, there has been no analysis of the kinds of alterations that are common on the web or evaluation of whether real cases of near-duplication can in fact be identified. In this paper we use popular queries and a commercial image search service to collect images that we then manually analyse for instances of near-duplication. We show that such duplicat..

