Conference Proceedings
Fast document ranking for large scale information retrieval
M Persin, J Zobel, R Sacks-Davis
Lecture Notes in Computer Science Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics | Published: 1994
Abstract
For large document databases, evaluation of ranked queries can be expensive in CPU time, memory usage, and disk traffic. It has been shown that memory usage can be dramatically reduced by use of a simple filtering heuristic that eliminates most documents from consideration. In this paper we show that, by designing inverted indexes explicitly to support filtering, CPU time and disk traffic can also be dramatically reduced. The principle of the index design is that inverted lists are sorted by in-document frequency rather than by document number. In the context of compressed indexes, such a re-ordering could result in a large increase in index size. We show, however, that it is possible to use ..
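The index design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_index` and `rank`, the `min_freq` cutoff, and the use of raw term counts as the score are all assumptions made for illustration, and index compression is not modelled. The point it shows is that when each inverted list is ordered by in-document frequency, a filter can stop reading a list at the first entry that falls below the cutoff, because every later entry is at most as frequent.

```python
from collections import defaultdict


def build_index(docs):
    """Map each term to a list of (doc_id, freq) pairs, highest freq first."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        counts = defaultdict(int)
        for term in text.lower().split():
            counts[term] += 1
        for term, freq in counts.items():
            index[term].append((doc_id, freq))
    for postings in index.values():
        # Order by descending in-document frequency; ties by document number.
        postings.sort(key=lambda p: (-p[1], p[0]))
    return dict(index)


def rank(index, query_terms, min_freq=2):
    """Score documents, abandoning each list once freq drops below min_freq.

    The score is a plain sum of term counts, a hypothetical stand-in for a
    real similarity measure. Returns (ranked_docs, postings_read).
    """
    scores = defaultdict(int)
    postings_read = 0
    for term in query_terms:
        for doc_id, freq in index.get(term, []):
            if freq < min_freq:
                break  # the list is frequency-sorted, so the rest is lower still
            scores[doc_id] += freq
            postings_read += 1
    ranked = sorted(scores.items(), key=lambda s: (-s[1], s[0]))
    return ranked, postings_read


if __name__ == "__main__":
    docs = [
        "index index index list",
        "index list",
        "index index query",
        "list list list list",
    ]
    index = build_index(docs)
    ranked, postings_read = rank(index, ["index", "list"], min_freq=2)
    print(ranked, postings_read)
```

With a document-number ordering, the same filter would have to scan every entry in each list to find the high-frequency ones; here the number of postings read shrinks as `min_freq` rises.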