Conference Proceedings

Using random sampling to build approximate tries for efficient string sorting

R Sinha, J Zobel, SL Martins (ed.)

EXPERIMENTAL AND EFFICIENT ALGORITHMS | SPRINGER-VERLAG BERLIN | Published : 2004

Abstract

Algorithms for sorting large datasets can be made more efficient with careful use of memory hierarchies and reduction in the number of costly memory accesses. In earlier work, we introduced burstsort, a new string-sorting algorithm that on large sets of strings is almost twice as fast as previous algorithms, primarily because it is more cache efficient. Burstsort dynamically builds a small trie that is used to rapidly allocate each string to a bucket. In this paper, we introduce new variants of our algorithm: SR-burstsort, DR-burstsort, and DRL-burstsort. These algorithms use a random sample of the strings to construct an approximation to the trie prior to sorting. Our experimental results w..

View full abstract

University of Melbourne Researchers