Again, this post is a follow-up of this page. Source code is available here.
AVL Tree vs. Red-Black Tree
If you google “avl vs. red-black”, the first returned page will largely answer your question. In summary, the height of an AVL tree is at most ~1.44log(N), lower than the maximum height of a red-black tree, ~2log(N). However, upon insertion, AVL tree may need O(log(N)) operations to rebalance the tree, but red-black tree requires O(1). This means AVL tree is probably faster on searching but slower on insertion. In addition, red-black tree is easier to be implemented as a persistent data structure.
Practical analysis is not so clear as theoretical analysis. I think AVL and red-black may have quite similar overall performance. Firstly, although the maximum possible height of a AVL tree is better, both AVL and red-black trees are probably well balanced without reaching the maximum height frequently. Secondly, although AVL tree has an O(log(N)) loop upon insertion, it is a very tight loop with a small constant, while red-black tree may have a loop of small size with big constant (amortized O(1), though). I am not sure whether red-black tree really wins on insertion.
In Daniel’s benchmark, AVL tree and red-black tree have quite similar speed (in consideration that Daniel implemented AVL tree). In my benchmark, libavl’s red-black tree and AVL tree also deliver similar speed.
On memory, a standard AVL tree requires two pointers and one balancing factor at each node, while a red-black tree can be implemented using two or three pointers plus a color bit (either the balancing factor or the color bit will cost the size of a pointer due to memory alignment). A red-black tree with two pointers is usually slower than a red-black tree with three pointers (libavl_rb_cpp vs. NP_rbtree), but the difference is marginal. In my benchmark, a red-black tree with two pointers has pretty similar performance in comparison to an AVL tree (libavl_rb_cpp vs. libavl_avl_cpp).
In all, my opinion is AVL and red-black are very similar in many aspects. Which to use does not matter too much if you are not implementing a persistent data structure.
Splay Tree vs. AVL/RB Tree
Search on a splay tree may be O(N) if elements are inserted in order. This “feature” alone pushes me away. Although wiki says it is possible to avoid the worst case, I am not aware of a practical implementation. On random input, splay tree is also slower than AVL and red-black trees.
One of the advantages of a splay tree is it only requires two pointers with no additional information. This is helpful when memory is critical, but I would rather use a modified red-black to achieve the same memory footprint. Another advantage of splay tree is it is self-adjusting. I have not tested this in my benchmark.
Treap v. AVL/RB Tree
I do not know much about treap. It seems to me treap does not guarantee O(log(N)) search. Fortunately, in practice the worst O(N) case should almost never happen due to the random behaviour of a treap. In my benchmark, treap is slower than splay tree on random input. Probably it is not the best choice.
Implementations
As to red-black tree, SGI STL’s set/map is really a good one. You can find other C++ implementations (TN_rbtree libavl_rb_cpp) in my benchmark. They are also efficient, but incomplete. Libavl_rb_cpp is adapted from libavl_rb which only uses two pointers. NP_rbtree (FreeBSD’s tree.h) is the best choice for a C programmer.
As to AVL tree, I am not aware of good AVL implementations in either C or C++. Stlavlmap is supposed to be efficient, but I cannot compile it. WK_avl may be a good library, judged from its source code, but it is not easy to use. GM_avl uses recursion and so cannot be efficient. Libavl_avl_cpp is my ported version of libavl_avl. It is efficient but incomplete. You may finish it if you really like to try an AVL tree. BTW, libavl is in fact a very high-quality library, but using void* makes it inferior to others. Someone should improve its API using C macros or C++ template.
Concluding Remarks
For general purpose, red-black tree might still be the best choice. It is among the fastest binary search trees and more easily adapted to a persistent data structure. AVL tree is equally good when persistence is not needed (well, all my applications do not require persistence). At last, do not forget my previous post: in-memory B-tree can be both faster and more light-weighted than red-black/AVL trees.
Update
Sambal told me that on insertion, red-black only guarantees amortized O(1). The worst case can be O(log(N)) with large constant. Then the advantage of red-black tree on insertion is less obvious.
Update the claim about the memory usage of AVL and red-black tree.
Why you no prove SGLIB?. It’s a good library
and very fast in my tests.
http://sglib.sourceforge.net/
Thanks a lot, Guillermo. I did not know this library. Now I have added sglib_rbtree to the table. SGLIB is a very good library. It also saves the parent pointer and so has smaller memory footprint. However, sglib_rbtree uses recursion to insert/erase an element, which causes a little overhead on speed.
I did not put sglib’s hash table in my benchmark as apparently it only allows fixed-sized hash table, while all the hash table libraries in the benchmark are dynamic.
Why would a RB-tree make a better persistent data structure than an AVL-tree?
As I understand it, any time you modify any node in a persistent tree, you must replace it. This means you must modify its parent to point to the new version. Recursively, you end up replacing the entire ancestry. Any change, therefore will require that O(h) new nodes are created, where h is the height where the change occurred. The only advantage of a RB-tree is that inserts are local and do not work their way up most of the tree. In persistent trees, however, every change always goes all the way to the top.
It seems, with a smaller height, persistent AVL-trees will have faster inserts, use less memory and have faster searches than persistent RB-trees.
I think that an AVL is better than a red black in search,insertion or deletion, takes less memory than Rb, so i don’t see the advantages of RB trees and why we continue to use them?
@mikou
If you read these figures, you will find AVL/RB has quite similar performance in terms of speed and memory (and actually also in implementation difficulty). I do not think either of them outperforms the other. People perfer RB maybe because theorists and textbooks like RB. In addition, as I pointed out, it is possible to implement RB (libavl_rb_cpp) to use only two pointers with the color bit encoded in one of the pointers, but we can never do this for AVL.
i implemented deletion and adjonction for RB AVL and hashcoding to see and make compare between them, i saw that AVL are better than Rb in memory capacity(we don’t use the sentinel) in time because the height is at most =|1| (for rb height >1) so it takes less time to do the 3 operations…
@mikou
Thanks for your interest. Nonetheless, I would suggest you compare your implementation with other implementations such as STL/libavl and so on. You can also find other libraries in the source code package I provide in this page. Note that the fact that your RB is slower than your AVL does not necessarily mean RB is in general worse than AVL; maybe you have not got the best way to implement RB. Of course benchmarking algorithms is actually benchmarking implementations, but you should at least make sure your implementation is comparable to the best before coming to a final conclusion. It would be interesting to see an AVL implementation that could outperform the best RB library. If this is the case, I will surely add your library to the benchmark.
Can we say tha hashcoding and avl or rb are independant i mean we can’t compare hashcoding with the two others
Not that I’m impressed a lot, but this is a lot more than I expected when I stumpled upon a link on SU telling that the info is quite decent. Thanks.
@mikou
Hash table, avl and rb are different, but they share something in common. They are all dictionary data structures: we put things in it and get them back efficiently. In this sense, they are comparable.