I am recently working on an algorithm, which surprisingly spends more than half of its time on sorting huge partially ordered arrays of 64-bit integer pairs (one for key and the other for value). Naturally, I want to optimize sorting such arrays. Initially, I tried my implementation of introsort. The program takes about 90 seconds on a sample data set. I then switched to my iterative mergesort in the same library. It takes 55 seconds. I guess the mergesort is faster because the arrays are partially ordered. However, my implementation of mergesort requires a temporary array of the same size. As the arrays are huge, it is unacceptable to allocate this array for real data. It seems that implementing an in-place mergesort is quite challenging. Then I think of radix sort, which I have not implemented before.
My radix sort implementation is here. It is not written as a library, but it should be easy to be adapted to other data types. The C program is quite simple and is not much different from existing ones.
How about the performance? With radix sort, my program takes 35 seconds using little extra working space. I get 100% speedup by replacing introsort with integer-only radix sort. To evaluate the performance of radix sort separately, I put the code in my old ksort_test.cc. Here are the CPU seconds spent on sorting 50 million random or sorted integers:
Algorithm | Sorted? | Mac CPU (sec) | Linux CPU |
---|---|---|---|
STL introsort | No | 4.9 | 5.1 |
STL introsort | Yes | 0.9 | 1.1 |
STL stablesort | No | 6.7 | 6.1 |
STL stablesort | Yes | 2.0 | 2.0 |
STL heapsort | No | 54.1 | 32.2 |
STL heapsort | Yes | 4.5 | 4.2 |
libc qsort | No | 11.3 | 9.7 |
ac’s radix | No | 1.9 | 2.0 |
ac’s radix | Yes | 0.8 | 0.9 |
ac’s combsort | No | 11.9 | 11.5 |
ac’s introsort | No | 5.5 | 5.7 |
ac’s introsort | Yes | 7.4 | 6.8 |
ac’s mergesort | No | 6.1 | 6.6 |
ac’s mergesort | Yes | 2.1 | 2.3 |
ac’s heapsort | No | 54.4 | 31.8 |
ac’s heapsort | Yes | 4.9 | 6.6 |
ph’s heapsort | No | 54.6 | 30.8 |
ph’s quicksort | No | 5.6 | 5.8 |
ph’s mergesort | No | 7.0 | 6.9 |
Please see my old post for the information on other algorithms and implementations. You can also clone my klib repository and play with ksort_test.cc by yourself.
What do ac’s, ph’s stand for?
ac=attractive chaos, i.e. those in my klib libraries
ph=Paul Hsieh. These are thought to be efficient implementations of heap/quick/merge sorts.
[…] A quick note on radix sort (attractivechaos.wordpress.com) […]