TL;DR: The code is available in klib/kavl.h with a toy example in comments and at the end of this post. kavl_test.c tests correctness. Insertion performance is evaluated separately.
Motivation
I need a container which, upon each insertion, tells me the number of smaller objects than the inserted one. A natural choice is a binary search tree. We store at each node the number of objects descended from the node. On insertion, we sum over numbers on nodes immediately left to the search path to find the answer. This algorithm sounds easy but is not implemented in existing libraries. In addition, I also want to learn how AVL tree and intrusive containers work down to every detail. So, here we go.
Definition
An intrusive container is a container that requires each object in it to have one or multiple predefined member variables. Such a container intrudes the object definition – this is how it is named.
Implementation overview
kavl.h is broadly similar to khash.h. It requires you to expand a macro to insert the actual implementation into your code before using it. As an intrusive container, kavl.h doesn’t call the malloc() equivalent inside the library. In fact, it doesn’t even depend on libc. Like my other container implementations, kavl.h strives for performance. It avoids recursion, and doesn’t keep a pointer to the parent node – this saves space at the cost of code complexity.
A popular way to implement intrusive containers is to use offsetof, as is described in this blog post. This strategy avoids all the macro magic, but makes it impossible to inline simple comparisons. It is less efficient.
The advantage of intrusive containers
A non-intrusive container allocates memory inside the library. It is non-trivial (if possible at all) to replace the allocator used in the library. A true intrusive container lets you allocate memory in whatever way you prefer. You can opt to a custom heap allocator, a memory pool or even allocate on stack, which may help performance a little if used correctly.
In addition, when storing strings or other variable-length data, an intrusive tree/list may reduce one heap allocation per node. In case of kavl.h, you can define a tree node with a flexible array member:
struct my_node { int len; KAVL_HEAD(struct my_node) head; char str[]; };
This way, you can allocate the node along with the string, which again may help performance.
The disadvantage
With an intrusive container, you have to take care of all memory management. This is inconvenient and opens doors to potential memory leaks. At least in C, the APIs of intrusive containers are less intuitive and harder to understand, requiring users to have a deeper knowledge in language features.
The myth
The Boost library argues that intrusive containers are faster with less stress on memory management. They tried to prove this with a benchmark. That goes a little too far. Intrusive lists shine there mainly because their programs “allocate” list nodes from a pre-allocated vector. In practice, we still have to allocate each node individually on heap when deletions are involved or when we can’t preallocate all nodes. Intrusive containers can be faster, but most often they are not. Even when they are faster, the performance gap is small.
It is believed among C programmers that intrusive data structures are a great way to achieve generic programming. This is only partially true. First, of common containers, only lists and binary search trees (BSTs) can be made truly intrusive in the sense that they need no heap allocation inside the libraries. Dynamic chaining-based hash tables still have to allocate the bucket array on heap, and they are often slower than open-addressing hash tables and should be avoided anyway. Second, only intrusive lists, the least useful data structure, can be implemented efficiently without ugly macros everywhere. For BSTs, we still have to use the macro magic to achieve the performance of type-specific code. Intrusive containers are not a general solution to generic programming in C.
Conclusions
To most developers, non-intrusive containers are the better choice. However, when you implement a memory allocator or when you micro-manage memory for the best performance, you will appreciate the flexibility of intrusive containers. Combined with a simple memory pool, kavl.h does speed up my program in the end.
Code example
The following implements the AVL tree example on wiki.
#include <stdio.h> #include <string.h> #include <stdlib.h> #include "kavl.h" struct my_node { char key; KAVL_HEAD(struct my_node) head; }; #define my_cmp(p, q) (((q)->key < (p)->key) - ((p)->key < (q)->key)) KAVL_INIT(my, struct my_node, head, my_cmp) int main(void) { const char *str = "MNOLKQOPHIA"; // from wiki, except a duplicate struct my_node *root = 0; int i, l = strlen(str); for (i = 0; i < l; ++i) { // insert in the input order struct my_node *q, *p = malloc(sizeof(*p)); p->key = str[i]; q = kavl_insert(my, &root, p, 0); if (p != q) free(p); // if already present, free } kavl_itr_t(my) itr; kavl_itr_first(my, root, &itr); // place at first do { // traverse const struct my_node *p = kavl_at(&itr); putchar(p->key); free((void*)p); // free node } while (kavl_itr_next(my, &itr)); putchar('\n'); return 0; }
Leave a comment