Conference Proceedings

External sorting with on-the-fly compression

J Yiannis, J Zobel, A James (ed.), B Lings (ed.), M Younas (ed.)

NEW HORIZONS IN INFORMATION MANAGEMENT | SPRINGER-VERLAG BERLIN | Published: 2003

Abstract

Evaluating a query can involve manipulation of large volumes of temporary data. When the volume of data becomes too great, activities such as joins and sorting must use disk, and cost minimisation involves complex trade-offs. In this paper, we explore the effect of compression on the cost of external sorting. Reduction in the volume of data potentially allows costs to be reduced - through reductions in disk traffic and numbers of temporary files - but on-the-fly compression can be slow and many compression methods do not allow random access to individual records. We investigate a range of compression techniques for this problem, and develop successful methods based on common letter sequences.

