Feeds:
Posts
Comments

It seems that the Computer Language Benchmarks Game has dropped LuaJIT very recently. I wish the site could have a ChangeLog to state the reason.

Getopt for Lua

When I switched to a new programming language, one of the first thing I do is to find or implement a getopt() that is compatible with the elegant Berkeley getopt.c.

When I started to actually use Lua two months ago, I also spent some time to look for a getopt function, but none of them is satisfactory. PosixGetOpt seems to bind the Posix C library and may have compatibility issue. CommandLineModule is powerful, but seems overkilling. AlternativeGetOpt tries to mimic the Berkeley getopt, but its functionality is very limited in comparison to the C version. There is a getopt module in lua-stdlib, but it has massive dependencies and is not Berkeley compatible. lua-alt-getopt is the closest I can find, but I need a lightweight version without the getopt_long support, such that I can copy-paste a function without worrying about dependency.

In the end I implemented my own getopt for Lua. It is a single function in 50 lines. The following shows an example about how to use this function.

for opt, optarg in os.getopt(arg, 'a:b') do
    print(opt, optarg)
end

BTW, I have also started to build my own Lua library. The goal is still: free, efficient and independent. If you want to use a function, you may just copy and paste one or a few relevant functions. The length of a dependency chain is maximally 3 right now.

Amazed by LuaJIT

I have kept looking for a replacement of Perl for several years. Now I have found it: Lua, although the decision is not made based on the language itself, but on its implementation LuaJIT.

LuaJIT is by far faster than all the other scripting languages and even comes close to the speed of Java with fewer lines of code and a smaller memory footprint. To further confirm the efficiency of LuaJIT, I implemented matrix multiplication in C, Lua, JavaScript and Perl. On my laptop, the C implementation multiplies two 1000×1000 matrices in 2.0 seconds (BTW, 1.4 sec if I use “float”; 0.9 if SSE is used; 26.8 sec without matrix transpose), LuaJIT-jit in 2.3 seconds, LuaJIT-interpreter in 24 sec, JavaScript in 40 sec with V8, Lua-5.1.4 in 64 sec and Perl in 283 sec. As a dynamically typing scripting language, LuaJIT achieves a speed comparable to C, which is simply amazing.

Not only that, LuaJIT fixes IMO a major weakness of Lua: the lack of native bit operations; the upcoming Foreign Function Interface (FFI) library, according to the LuaJIT roadmap 2011, will definitely make Lua one of the best scripting languages to bind dynamic libraries, even surpassing Python’s elegant ctypes library.

With the unparalleled efficiency and the addition of important features, LuaJIT makes Lua the best scripting language at present in my opinion. Mike Pall, the single developer of LuaJIT, commented in an interesting discussion that it is possible to implement a JIT compiler for Javascript, a language similar to Lua in many aspects, as efficient as LuaJIT. But Javascript is not a general-purpose programming language by design. Standardizing additional language features would take years. As to other scripting languages, my impression is their complexity is the major obstacle to the implementation of an efficient JIT compiler.

Probably more developers are concerned about the lack of standard libraries in Lua. Personally, I do not see why Lua cannot be a general-purpose scripting language. Probably the creators of Lua just did not intend to or have energy to implement a comprehensive standard library. I hope someone may organize a group of good programmers to develop such a library. Furthermore, with the upcoming FFI library in LuaJIT, we may be able to easily call library routines, which may solve the lack of library issue again with one man only.

LuaJIT is the future of all scripting languages. Even if LuaJIT were not adopted as widely as I wish it to be, I hope the advanced techniques and ideas developed in LuaJIT can be incorporated into other interpreters and JIT compilers.

Why not Lua?

By design, Lua has a lot in common to Javascript. According to the Compute Language Benchmark Game Lua is faster than Javascript/V8 (Lua is slower, but LuaJIT is faster) and much more memory efficient. Lua is also believed to be easier to be embedded. However, the most important reason that I gave up Lua in the end is the lack regex which I heavily rely on. Other minor reasons include:

  • I prefer “{}” over “begin–end”.
  • Javascript is more popular.
  • Coupled with HTML+CSS, Javascript can be used to develop GUI.

Why Javascript shell?

In comparison to Lua and most other general-purpose languages, Javascript itself is useless for my purposes: processing text files. This is because Javascript is unable to open a file on the filesystem. One has to extend Javascript at the level of Javascript engine.

Why v8?

A great complication to extend Javascript is caused by the variety of Javascript engines which are all incompatible with each other in APIs. I choose v8 mainly because of its efficiency and independency:

  • SpiderMonkey is inefficient
  • TraceMonkey and JaegerMonkey are bloatware (over 70MB tar-ball to download)
  • Tamarin is believed to be slower than SpiderMonkey according to wiki.
  • Rhino is written in Java and is probably as slow as SpiderMonkey.
  • SquirrelFish/Nitro are integrated too closely with WebKit.

V8 is not perfect. It is poorly documented, lacks examples and important features (e.g. calling a destructor when an object is out of the scope). Nonetheless, v8 is generally well designed. It is speedy, standalone and not bloated (about 2.6MB compressed without testsuite and so on). It is much younger than Lua after all. I should not ask for too much.

Why not node.js, v8cgi or Narwhal?

These are all great Javascript shells. I am not satisfied with them because they provide too much I do not need, but lack key features I need most. As I said, the component I desperately need is the ability to read a file. While these implementations all support file reading, node.js uses unbuffered I/O and only supports fread() like calls; v8cgi and Narwhal, which both follow the commonJS spec, only offers an API to read the entire file into the memory.

What does K8 offer?

For now, very little: a File object with next() and close() methods. Nonetheless, K8 does file processing in the right way. The File object is a buffered file stream which can be both uncompressed or zlib/gzip compressed. It offers the most useful API next() which reads the next line or token. This simple object opens a variety of new applications to me.

The k8.js Javascript library

As I have just started to write Javascript, I have to build my own toolbox as most javascript libraries deal with web applications. This k8.js is a start. It implements getopt(), a few matrix operations and a FASTA/Q format parser. Probably more will come if I continue to write Javascript.

Get K8

For people who are interested (I guess none, though), it is available at github.

Most of the source codes from this blog have been put into my github repository. These are the master copies, meaning that most recent changes will be committed to github first.

First, sorry for my laziness — I have not posted good/useful posts for a long time. I did not do nothing, but the results were made public under my true name rather than here. Sorry.

This is another rant.

I use C and Perl and see them as the best combination, which has unfortunately prevented me from systematically learning other programming languages, although I wished to. I know some Java and Python, because so many are using them; I also know a bit D, Scheme and Tcl. Nevertheless, I have never written a serious program in them.

Now the situation has been changed a little. Out of curiosity, I wrote something in Javascript. Fooled by its name, I originally thought Javascript is just a scripting version of Java, but gradually, I found some nice features about javascript, which from my angle are listed below:

  • Simplicity. The basic elements in Javascript are simple: we have all the C-like flow controls and then objects, with which we can do most of programming.
  • Versatility. Although simple, javascript comes with a very powerful object system, which allows us to mimic procedural programming, object-oriented programming and even functional programming.
  • Efficiency. While many other scripting languages achieve the same level of simplicity and versatility, few achieve the same speed of javascript (more exactly, the javascript interpreter/compiler). It seems to be the fastest scripting language in the Computer Language Benchmark Games. It is true that modern javascript engines compile Javascript to native code behind the scene, but personally, I do not feel a long delay like when I compile a C++ program.
  • Usefulness and uniqueness. Javascript is the only programming language that provides dynamic web contents at the client side. Personally, I found writing a graphical interface with HTML+Javascript is much easier than using C++ and Java (although writing a professional GUI in html+javascript requires efforts).
  • Back to succinctness. What I hate most about the modern programming is bloatware. I frequently feel that many programmers are writing unnecessarily complicated programs for simple tasks. In the world of javascript, in contrast, we care about the code size again because a large javascript program means more data transfer, slower responsiveness and a worse user experience.

(This post is unfinished. I need to point out the weakness of javascript. It is not perfect.)

Although I do not use D, I always see it as one of the most attractive programming languages, smartly balancing efficiency, simplicity and extensibility. At the same time, I keep getting frustrated when I see such an elegant thing fade away gradually given that a) D has dropped out of top 20 in TIOBE Programming Community Index and b) it was not evaluated by the Computer Language Benchmarks Game any more. Most programmers know why this happens. I am simply frustrated.

D is falling while Go is rising. I do appreciate the philosophy behind the design of Go and trust Rob Pike and Ken Thompson to deliver another great product, but right now I do not see Go as a replacement of any mainstream programming languages as long as it is way slower than Java, not to speak C/C++. To me, Go’s rising is merely due to the support from Google. It is good as a research project, but it needs time to reach the critical mass in practice.

While reading the Computer Language Benchmarks Game, I am really amazed by LuaJIT. Probably I am going to try it some day.

After nearly 5 years, zlib finally gets updated. In the release notes, we see this: “Wholesale replacement of gz* functions with faster versions”. How much faster? Let’s see.

In the following table, gzread/gzgetc/gzgets are direct zlib function calls. ks_getc/ks_getuntil are from my generic buffered wrapper.

Metrics

1.2.3 (CPU sec)

1.2.5 (CPU sec)
gzread(), 4KB buffer

2.69

2.87
gzgetc()

73.25

7.34
gzgets(), max 4KB

73.29

4.32
ks_getc()

4.06

4.38
ks_getuntil()

3.57

3.76

It seems that gzread() in 1.2.5 becomes slightly slower, which also slows down my ks_getc/ks_getuntil. But gzgetc/gzgets show a 10-fold speedup. This greatly helps to simplify programming as we do not need to implement a buffer by ourselves (although my wrapper is slightly more efficient). Greak work!

Benchmarking source code:

#include <zlib.h>
#include <stdio.h>
#include <time.h>
#include <stdint.h>
#include <stdlib.h>
#include "kseq.h"

#define BUF_SIZE 4096
KSTREAM_INIT(gzFile, gzread, BUF_SIZE)

int main(int argc, char *argv[])
{
	gzFile fp;
	clock_t t;
	if (argc == 1) {
		fprintf(stderr, "Usage: kseq_bench <in.gz>\n");
		return 1;
	}
	{
		uint8_t *buf = malloc(BUF_SIZE);
		fp = gzopen(argv[1], "r");
		t = clock();
		while (gzread(fp, buf, BUF_SIZE) > 0);
		fprintf(stderr, "[gzread] %.2f sec\n", (float)(clock() - t) / CLOCKS_PER_SEC);
		gzclose(fp);
		free(buf);
	}
	{
		kstream_t *ks;
		fp = gzopen(argv[1], "r");
		ks = ks_init(fp);
		t = clock();
		while (ks_getc(ks) >= 0);
		fprintf(stderr, "[ks_getc] %.2f sec\n", (float)(clock() - t) / CLOCKS_PER_SEC);
		ks_destroy(ks);
		gzclose(fp);
	}
	{
		kstream_t *ks;
		kstring_t *s;
		int dret;
		s = calloc(1, sizeof(kstring_t));
		fp = gzopen(argv[1], "r");
		ks = ks_init(fp);
		t = clock();
		while (ks_getuntil(ks, '\n', s, &dret) >= 0);
		fprintf(stderr, "[ks_getuntil] %.2f sec\n", (float)(clock() - t) / CLOCKS_PER_SEC);
		ks_destroy(ks);
		gzclose(fp);
		free(s->s); free(s);
	}
	{
		fp = gzopen(argv[1], "r");
		t = clock();
		while (gzgetc(fp) >= 0);
		fprintf(stderr, "[gzgetc] %.2f sec\n", (float)(clock() - t) / CLOCKS_PER_SEC);
		gzclose(fp);
	}
	{
		char *buf = malloc(BUF_SIZE);
		fp = gzopen(argv[1], "r");
		t = clock();
		while (gzgets(fp, buf, BUF_SIZE) > 0);
		fprintf(stderr, "[gzgets] %.2f sec\n", (float)(clock() - t) / CLOCKS_PER_SEC);
		gzclose(fp);
		free(buf);
	}
	return 0;
}

and the source code of kseq.h:

#ifndef AC_KSEQ_H
#define AC_KSEQ_H

#include <ctype.h>
#include <string.h>
#include <stdlib.h>

#define KS_SEP_SPACE 0 // isspace(): \t, \n, \v, \f, \r
#define KS_SEP_TAB   1 // isspace() && !' '
#define KS_SEP_MAX   1

#define __KS_TYPE(type_t)						\
	typedef struct __kstream_t {				\
		unsigned char *buf;						\
		int begin, end, is_eof;					\
		type_t f;								\
	} kstream_t;

#define ks_eof(ks) ((ks)->is_eof && (ks)->begin >= (ks)->end)
#define ks_rewind(ks) ((ks)->is_eof = (ks)->begin = (ks)->end = 0)

#define __KS_BASIC(type_t, __bufsize)								\
	static inline kstream_t *ks_init(type_t f)						\
	{																\
		kstream_t *ks = (kstream_t*)calloc(1, sizeof(kstream_t));	\
		ks->f = f;													\
		ks->buf = malloc(__bufsize);								\
		return ks;													\
	}																\
	static inline void ks_destroy(kstream_t *ks)					\
	{																\
		if (ks) {													\
			free(ks->buf);											\
			free(ks);												\
		}															\
	}

#define __KS_GETC(__read, __bufsize)						\
	static inline int ks_getc(kstream_t *ks)				\
	{														\
		if (ks->is_eof && ks->begin >= ks->end) return -1;	\
		if (ks->begin >= ks->end) {							\
			ks->begin = 0;									\
			ks->end = __read(ks->f, ks->buf, __bufsize);	\
			if (ks->end < __bufsize) ks->is_eof = 1;		\
			if (ks->end == 0) return -1;					\
		}													\
		return (int)ks->buf[ks->begin++];					\
	}

#ifndef KSTRING_T
#define KSTRING_T kstring_t
typedef struct __kstring_t {
	size_t l, m;
	char *s;
} kstring_t;
#endif

#ifndef kroundup32
#define kroundup32(x) (--(x), (x)|=(x)>>1, (x)|=(x)>>2, (x)|=(x)>>4, (x)|=(x)>>8, (x)|=(x)>>16, ++(x))
#endif

#define __KS_GETUNTIL(__read, __bufsize)								\
	static int ks_getuntil(kstream_t *ks, int delimiter, kstring_t *str, int *dret) \
	{																	\
		if (dret) *dret = 0;											\
		str->l = 0;														\
		if (ks->begin >= ks->end && ks->is_eof) return -1;				\
		for (;;) {														\
			int i;														\
			if (ks->begin >= ks->end) {									\
				if (!ks->is_eof) {										\
					ks->begin = 0;										\
					ks->end = __read(ks->f, ks->buf, __bufsize);		\
					if (ks->end < __bufsize) ks->is_eof = 1;			\
					if (ks->end == 0) break;							\
				} else break;											\
			}															\
			if (delimiter > KS_SEP_MAX) {								\
				for (i = ks->begin; i < ks->end; ++i)					\
					if (ks->buf[i] == delimiter) break;					\
			} else if (delimiter == KS_SEP_SPACE) {						\
				for (i = ks->begin; i < ks->end; ++i)					\
					if (isspace(ks->buf[i])) break;						\
			} else if (delimiter == KS_SEP_TAB) {						\
				for (i = ks->begin; i < ks->end; ++i)					\
					if (isspace(ks->buf[i]) && ks->buf[i] != ' ') break; \
			} else i = 0; /* never come to here! */						\
			if (str->m - str->l < i - ks->begin + 1) {					\
				str->m = str->l + (i - ks->begin) + 1;					\
				kroundup32(str->m);										\
				str->s = (char*)realloc(str->s, str->m);				\
			}															\
			memcpy(str->s + str->l, ks->buf + ks->begin, i - ks->begin); \
			str->l = str->l + (i - ks->begin);							\
			ks->begin = i + 1;											\
			if (i < ks->end) {											\
				if (dret) *dret = ks->buf[i];							\
				break;													\
			}															\
		}																\
		if (str->l == 0) {												\
			str->m = 1;													\
			str->s = (char*)calloc(1, 1);								\
		}																\
		str->s[str->l] = '\0';											\
		return str->l;													\
	}

#define KSTREAM_INIT(type_t, __read, __bufsize) \
	__KS_TYPE(type_t)							\
	__KS_BASIC(type_t, __bufsize)				\
	__KS_GETC(__read, __bufsize)				\
	__KS_GETUNTIL(__read, __bufsize)

#endif

khash.h

I put my khash.h library here. It is now compatible with VC++ 8.0. I believe it is quite stable now.

/* The MIT License

   Copyright (c) 2008, 2009 by attractor <attractor@live.co.uk>

   Permission is hereby granted, free of charge, to any person obtaining
   a copy of this software and associated documentation files (the
   "Software"), to deal in the Software without restriction, including
   without limitation the rights to use, copy, modify, merge, publish,
   distribute, sublicense, and/or sell copies of the Software, and to
   permit persons to whom the Software is furnished to do so, subject to
   the following conditions:

   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
   BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
   ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
   CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
   SOFTWARE.
*/

/*
  An example:

#include "khash.h"
KHASH_MAP_INIT_INT(32, char)
int main() {
	int ret, is_missing;
	khiter_t k;
	khash_t(32) *h = kh_init(32);
	k = kh_put(32, h, 5, &ret);
	if (!ret) kh_del(32, h, k);
	kh_value(h, k) = 10;
	k = kh_get(32, h, 10);
	is_missing = (k == kh_end(h));
	k = kh_get(32, h, 5);
	kh_del(32, h, k);
	for (k = kh_begin(h); k != kh_end(h); ++k)
		if (kh_exist(h, k)) kh_value(h, k) = 1;
	kh_destroy(32, h);
	return 0;
}
*/

/*
  2009-09-26 (0.2.4):

    * Improve portability

  2008-09-19 (0.2.3):

	* Corrected the example
	* Improved interfaces

  2008-09-11 (0.2.2):

	* Improved speed a little in kh_put()

  2008-09-10 (0.2.1):

	* Added kh_clear()
	* Fixed a compiling error

  2008-09-02 (0.2.0):

	* Changed to token concatenation which increases flexibility.

  2008-08-31 (0.1.2):

	* Fixed a bug in kh_get(), which has not been tested previously.

  2008-08-31 (0.1.1):

	* Added destructor
*/


#ifndef __AC_KHASH_H
#define __AC_KHASH_H

/*!
  @header

  Generic hash table library.

  @copyright Heng Li
 */

#define AC_VERSION_KHASH_H "0.2.4"

#include <stdlib.h>
#include <string.h>
#include <limits.h>

/* compipler specific configuration */

#if UINT_MAX == 0xffffffffu
typedef unsigned int khint32_t;
#elif ULONG_MAX == 0xffffffffu
typedef unsigned long khint32_t;
#endif

#if ULONG_MAX == ULLONG_MAX
typedef unsigned long khint64_t;
#else
typedef unsigned long long khint64_t;
#endif

#ifdef _MSC_VER
#define inline __inline
#endif

typedef khint32_t khint_t;
typedef khint_t khiter_t;

#define __ac_HASH_PRIME_SIZE 32
static const khint32_t __ac_prime_list[__ac_HASH_PRIME_SIZE] =
{
  0ul,          3ul,          11ul,         23ul,         53ul,
  97ul,         193ul,        389ul,        769ul,        1543ul,
  3079ul,       6151ul,       12289ul,      24593ul,      49157ul,
  98317ul,      196613ul,     393241ul,     786433ul,     1572869ul,
  3145739ul,    6291469ul,    12582917ul,   25165843ul,   50331653ul,
  100663319ul,  201326611ul,  402653189ul,  805306457ul,  1610612741ul,
  3221225473ul, 4294967291ul
};

#define __ac_isempty(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&2)
#define __ac_isdel(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&1)
#define __ac_iseither(flag, i) ((flag[i>>4]>>((i&0xfU)<<1))&3)
#define __ac_set_isdel_false(flag, i) (flag[i>>4]&=~(1ul<<((i&0xfU)<<1)))
#define __ac_set_isempty_false(flag, i) (flag[i>>4]&=~(2ul<<((i&0xfU)<<1)))
#define __ac_set_isboth_false(flag, i) (flag[i>>4]&=~(3ul<<((i&0xfU)<<1)))
#define __ac_set_isdel_true(flag, i) (flag[i>>4]|=1ul<<((i&0xfU)<<1))

static const double __ac_HASH_UPPER = 0.77;

#define KHASH_INIT(name, khkey_t, khval_t, kh_is_map, __hash_func, __hash_equal) \
	typedef struct {													\
		khint_t n_buckets, size, n_occupied, upper_bound;				\
		khint32_t *flags;												\
		khkey_t *keys;													\
		khval_t *vals;													\
	} kh_##name##_t;													\
	static inline kh_##name##_t *kh_init_##name() {						\
		return (kh_##name##_t*)calloc(1, sizeof(kh_##name##_t));		\
	}																	\
	static inline void kh_destroy_##name(kh_##name##_t *h)				\
	{																	\
		if (h) {														\
			free(h->keys); free(h->flags);								\
			free(h->vals);												\
			free(h);													\
		}																\
	}																	\
	static inline void kh_clear_##name(kh_##name##_t *h)				\
	{																	\
		if (h && h->flags) {											\
			memset(h->flags, 0xaa, ((h->n_buckets>>4) + 1) * sizeof(khint32_t)); \
			h->size = h->n_occupied = 0;								\
		}																\
	}																	\
	static inline khint_t kh_get_##name(const kh_##name##_t *h, khkey_t key) \
	{																	\
		if (h->n_buckets) {												\
			khint_t inc, k, i, last;									\
			k = __hash_func(key); i = k % h->n_buckets;					\
			inc = 1 + k % (h->n_buckets - 1); last = i;					\
			while (!__ac_isempty(h->flags, i) && (__ac_isdel(h->flags, i) || !__hash_equal(h->keys[i], key))) { \
				if (i + inc >= h->n_buckets) i = i + inc - h->n_buckets; \
				else i += inc;											\
				if (i == last) return h->n_buckets;						\
			}															\
			return __ac_iseither(h->flags, i)? h->n_buckets : i;		\
		} else return 0;												\
	}																	\
	static inline void kh_resize_##name(kh_##name##_t *h, khint_t new_n_buckets) \
	{																	\
		khint32_t *new_flags = 0;										\
		khint_t j = 1;													\
		{																\
			khint_t t = __ac_HASH_PRIME_SIZE - 1;						\
			while (__ac_prime_list[t] > new_n_buckets) --t;				\
			new_n_buckets = __ac_prime_list[t+1];						\
			if (h->size >= (khint_t)(new_n_buckets * __ac_HASH_UPPER + 0.5)) j = 0;	\
			else {														\
				new_flags = (khint32_t*)malloc(((new_n_buckets>>4) + 1) * sizeof(khint32_t));	\
				memset(new_flags, 0xaa, ((new_n_buckets>>4) + 1) * sizeof(khint32_t)); \
				if (h->n_buckets < new_n_buckets) {						\
					h->keys = (khkey_t*)realloc(h->keys, new_n_buckets * sizeof(khkey_t)); \
					if (kh_is_map)										\
						h->vals = (khval_t*)realloc(h->vals, new_n_buckets * sizeof(khval_t)); \
				}														\
			}															\
		}																\
		if (j) {														\
			for (j = 0; j != h->n_buckets; ++j) {						\
				if (__ac_iseither(h->flags, j) == 0) {					\
					khkey_t key = h->keys[j];							\
					khval_t val;										\
					if (kh_is_map) val = h->vals[j];					\
					__ac_set_isdel_true(h->flags, j);					\
					while (1) {											\
						khint_t inc, k, i;								\
						k = __hash_func(key);							\
						i = k % new_n_buckets;							\
						inc = 1 + k % (new_n_buckets - 1);				\
						while (!__ac_isempty(new_flags, i)) {			\
							if (i + inc >= new_n_buckets) i = i + inc - new_n_buckets; \
							else i += inc;								\
						}												\
						__ac_set_isempty_false(new_flags, i);			\
						if (i < h->n_buckets && __ac_iseither(h->flags, i) == 0) { \
							{ khkey_t tmp = h->keys[i]; h->keys[i] = key; key = tmp; } \
							if (kh_is_map) { khval_t tmp = h->vals[i]; h->vals[i] = val; val = tmp; } \
							__ac_set_isdel_true(h->flags, i);			\
						} else {										\
							h->keys[i] = key;							\
							if (kh_is_map) h->vals[i] = val;			\
							break;										\
						}												\
					}													\
				}														\
			}															\
			if (h->n_buckets > new_n_buckets) {							\
				h->keys = (khkey_t*)realloc(h->keys, new_n_buckets * sizeof(khkey_t)); \
				if (kh_is_map)											\
					h->vals = (khval_t*)realloc(h->vals, new_n_buckets * sizeof(khval_t)); \
			}															\
			free(h->flags);												\
			h->flags = new_flags;										\
			h->n_buckets = new_n_buckets;								\
			h->n_occupied = h->size;									\
			h->upper_bound = (khint_t)(h->n_buckets * __ac_HASH_UPPER + 0.5); \
		}																\
	}																	\
	static inline khint_t kh_put_##name(kh_##name##_t *h, khkey_t key, int *ret) \
	{																	\
		khint_t x;														\
		if (h->n_occupied >= h->upper_bound) {							\
			if (h->n_buckets > (h->size<<1)) kh_resize_##name(h, h->n_buckets - 1); \
			else kh_resize_##name(h, h->n_buckets + 1);					\
		}																\
		{																\
			khint_t inc, k, i, site, last;								\
			x = site = h->n_buckets; k = __hash_func(key); i = k % h->n_buckets; \
			if (__ac_isempty(h->flags, i)) x = i;						\
			else {														\
				inc = 1 + k % (h->n_buckets - 1); last = i;				\
				while (!__ac_isempty(h->flags, i) && (__ac_isdel(h->flags, i) || !__hash_equal(h->keys[i], key))) { \
					if (__ac_isdel(h->flags, i)) site = i;				\
					if (i + inc >= h->n_buckets) i = i + inc - h->n_buckets; \
					else i += inc;										\
					if (i == last) { x = site; break; }					\
				}														\
				if (x == h->n_buckets) {								\
					if (__ac_isempty(h->flags, i) && site != h->n_buckets) x = site; \
					else x = i;											\
				}														\
			}															\
		}																\
		if (__ac_isempty(h->flags, x)) {								\
			h->keys[x] = key;											\
			__ac_set_isboth_false(h->flags, x);							\
			++h->size; ++h->n_occupied;									\
			*ret = 1;													\
		} else if (__ac_isdel(h->flags, x)) {							\
			h->keys[x] = key;											\
			__ac_set_isboth_false(h->flags, x);							\
			++h->size;													\
			*ret = 2;													\
		} else *ret = 0;												\
		return x;														\
	}																	\
	static inline void kh_del_##name(kh_##name##_t *h, khint_t x)		\
	{																	\
		if (x != h->n_buckets && !__ac_iseither(h->flags, x)) {			\
			__ac_set_isdel_true(h->flags, x);							\
			--h->size;													\
		}																\
	}

/* --- BEGIN OF HASH FUNCTIONS --- */

/*! @function
  @abstract     Integer hash function
  @param  key   The integer [khint32_t]
  @return       The hash value [khint_t]
 */
#define kh_int_hash_func(key) (khint32_t)(key)
/*! @function
  @abstract     Integer comparison function
 */
#define kh_int_hash_equal(a, b) ((a) == (b))
/*! @function
  @abstract     64-bit integer hash function
  @param  key   The integer [khint64_t]
  @return       The hash value [khint_t]
 */
#define kh_int64_hash_func(key) (khint32_t)((key)>>33^(key)^(key)<<11)
/*! @function
  @abstract     64-bit integer comparison function
 */
#define kh_int64_hash_equal(a, b) ((a) == (b))
/*! @function
  @abstract     const char* hash function
  @param  s     Pointer to a null terminated string
  @return       The hash value
 */
static inline khint_t __ac_X31_hash_string(const char *s)
{
	khint_t h = *s;
	if (h) for (++s ; *s; ++s) h = (h << 5) - h + *s;
	return h;
}
/*! @function
  @abstract     Another interface to const char* hash function
  @param  key   Pointer to a null terminated string [const char*]
  @return       The hash value [khint_t]
 */
#define kh_str_hash_func(key) __ac_X31_hash_string(key)
/*! @function
  @abstract     Const char* comparison function
 */
#define kh_str_hash_equal(a, b) (strcmp(a, b) == 0)

/* --- END OF HASH FUNCTIONS --- */

/* Other necessary macros... */

/*!
  @abstract Type of the hash table.
  @param  name  Name of the hash table [symbol]
 */
#define khash_t(name) kh_##name##_t

/*! @function
  @abstract     Initiate a hash table.
  @param  name  Name of the hash table [symbol]
  @return       Pointer to the hash table [khash_t(name)*]
 */
#define kh_init(name) kh_init_##name()

/*! @function
  @abstract     Destroy a hash table.
  @param  name  Name of the hash table [symbol]
  @param  h     Pointer to the hash table [khash_t(name)*]
 */
#define kh_destroy(name, h) kh_destroy_##name(h)

/*! @function
  @abstract     Reset a hash table without deallocating memory.
  @param  name  Name of the hash table [symbol]
  @param  h     Pointer to the hash table [khash_t(name)*]
 */
#define kh_clear(name, h) kh_clear_##name(h)

/*! @function
  @abstract     Resize a hash table.
  @param  name  Name of the hash table [symbol]
  @param  h     Pointer to the hash table [khash_t(name)*]
  @param  s     New size [khint_t]
 */
#define kh_resize(name, h, s) kh_resize_##name(h, s)

/*! @function
  @abstract     Insert a key to the hash table.
  @param  name  Name of the hash table [symbol]
  @param  h     Pointer to the hash table [khash_t(name)*]
  @param  k     Key [type of keys]
  @param  r     Extra return code: 0 if the key is present in the hash table;
                1 if the bucket is empty (never used); 2 if the element in
				the bucket has been deleted [int*]
  @return       Iterator to the inserted element [khint_t]
 */
#define kh_put(name, h, k, r) kh_put_##name(h, k, r)

/*! @function
  @abstract     Retrieve a key from the hash table.
  @param  name  Name of the hash table [symbol]
  @param  h     Pointer to the hash table [khash_t(name)*]
  @param  k     Key [type of keys]
  @return       Iterator to the found element, or kh_end(h) is the element is absent [khint_t]
 */
#define kh_get(name, h, k) kh_get_##name(h, k)

/*! @function
  @abstract     Remove a key from the hash table.
  @param  name  Name of the hash table [symbol]
  @param  h     Pointer to the hash table [khash_t(name)*]
  @param  k     Iterator to the element to be deleted [khint_t]
 */
#define kh_del(name, h, k) kh_del_##name(h, k)


/*! @function
  @abstract     Test whether a bucket contains data.
  @param  h     Pointer to the hash table [khash_t(name)*]
  @param  x     Iterator to the bucket [khint_t]
  @return       1 if containing data; 0 otherwise [int]
 */
#define kh_exist(h, x) (!__ac_iseither((h)->flags, (x)))

/*! @function
  @abstract     Get key given an iterator
  @param  h     Pointer to the hash table [khash_t(name)*]
  @param  x     Iterator to the bucket [khint_t]
  @return       Key [type of keys]
 */
#define kh_key(h, x) ((h)->keys[x])

/*! @function
  @abstract     Get value given an iterator
  @param  h     Pointer to the hash table [khash_t(name)*]
  @param  x     Iterator to the bucket [khint_t]
  @return       Value [type of values]
  @discussion   For hash sets, calling this results in segfault.
 */
#define kh_val(h, x) ((h)->vals[x])

/*! @function
  @abstract     Alias of kh_val()
 */
#define kh_value(h, x) ((h)->vals[x])

/*! @function
  @abstract     Get the start iterator
  @param  h     Pointer to the hash table [khash_t(name)*]
  @return       The start iterator [khint_t]
 */
#define kh_begin(h) (khint_t)(0)

/*! @function
  @abstract     Get the end iterator
  @param  h     Pointer to the hash table [khash_t(name)*]
  @return       The end iterator [khint_t]
 */
#define kh_end(h) ((h)->n_buckets)

/*! @function
  @abstract     Get the number of elements in the hash table
  @param  h     Pointer to the hash table [khash_t(name)*]
  @return       Number of elements in the hash table [khint_t]
 */
#define kh_size(h) ((h)->size)

/*! @function
  @abstract     Get the number of buckets in the hash table
  @param  h     Pointer to the hash table [khash_t(name)*]
  @return       Number of buckets in the hash table [khint_t]
 */
#define kh_n_buckets(h) ((h)->n_buckets)

/* More conenient interfaces */

/*! @function
  @abstract     Instantiate a hash set containing integer keys
  @param  name  Name of the hash table [symbol]
 */
#define KHASH_SET_INIT_INT(name)										\
	KHASH_INIT(name, khint32_t, char, 0, kh_int_hash_func, kh_int_hash_equal)

/*! @function
  @abstract     Instantiate a hash map containing integer keys
  @param  name  Name of the hash table [symbol]
  @param  khval_t  Type of values [type]
 */
#define KHASH_MAP_INIT_INT(name, khval_t)								\
	KHASH_INIT(name, khint32_t, khval_t, 1, kh_int_hash_func, kh_int_hash_equal)

/*! @function
  @abstract     Instantiate a hash map containing 64-bit integer keys
  @param  name  Name of the hash table [symbol]
 */
#define KHASH_SET_INIT_INT64(name)										\
	KHASH_INIT(name, khint64_t, char, 0, kh_int64_hash_func, kh_int64_hash_equal)

/*! @function
  @abstract     Instantiate a hash map containing 64-bit integer keys
  @param  name  Name of the hash table [symbol]
  @param  khval_t  Type of values [type]
 */
#define KHASH_MAP_INIT_INT64(name, khval_t)								\
	KHASH_INIT(name, khint64_t, khval_t, 1, kh_int64_hash_func, kh_int64_hash_equal)

typedef const char *kh_cstr_t;
/*! @function
  @abstract     Instantiate a hash map containing const char* keys
  @param  name  Name of the hash table [symbol]
 */
#define KHASH_SET_INIT_STR(name)										\
	KHASH_INIT(name, kh_cstr_t, char, 0, kh_str_hash_func, kh_str_hash_equal)

/*! @function
  @abstract     Instantiate a hash map containing const char* keys
  @param  name  Name of the hash table [symbol]
  @param  khval_t  Type of values [type]
 */
#define KHASH_MAP_INIT_STR(name, khval_t)								\
	KHASH_INIT(name, kh_cstr_t, khval_t, 1, kh_str_hash_func, kh_str_hash_equal)

#endif /* __AC_KHASH_H */

Here is a small example:

#include <stdio.h>
#include <assert.h>
#include <time.h>
#include <stdlib.h>
#include <string.h>

#include "khash.h"
KHASH_SET_INIT_STR(str)
KHASH_SET_INIT_INT(int)

static int data_size = 5000000;
static unsigned *int_data;
static char **str_data;

void ht_init_data()
{
	int i;
	char buf[256];
	khint32_t x = 11;
	printf("--- generating data... ");
	int_data = (unsigned*)calloc(data_size, sizeof(unsigned));
	str_data = (char**)calloc(data_size, sizeof(char*));
	for (i = 0; i < data_size; ++i) {
		int_data[i] = (unsigned)(data_size * ((double)x / UINT_MAX) / 4) * 271828183u;
		sprintf(buf, "%x", int_data[i]);
		str_data[i] = strdup(buf);
		x = 1664525L * x + 1013904223L;
	}
	printf("done!\n");
}
void ht_destroy_data()
{
	int i;
	for (i = 0; i < data_size; ++i) free(str_data[i]);
	free(str_data); free(int_data);
}
void ht_khash_int()
{
	int i, ret;
	unsigned *data = int_data;
	khash_t(int) *h;
	unsigned k;

	h = kh_init(int);
	for (i = 0; i < data_size; ++i) {
		k = kh_put(int, h, data[i], &ret);
		if (!ret) kh_del(int, h, k);
	}
	printf("[ht_khash_int] size: %u\n", kh_size(h));
	kh_destroy(int, h);
}
void ht_khash_str()
{
	int i, ret;
	char **data = str_data;
	khash_t(str) *h;
	unsigned k;

	h = kh_init(str);
	for (i = 0; i < data_size; ++i) {
		k = kh_put(str, h, data[i], &ret);
		if (!ret) kh_del(str, h, k);
	}
	printf("[ht_khash_int] size: %u\n", kh_size(h));
	kh_destroy(str, h);
}
void ht_timing(void (*f)(void))
{
	clock_t t = clock();
	(*f)();
	printf("[ht_timing] %.3lf sec\n", (double)(clock() - t) / CLOCKS_PER_SEC);
}
int main(int argc, char *argv[])
{
	if (argc > 1) data_size = atoi(argv[1]);
	ht_init_data();
	ht_timing(ht_khash_int);
	ht_timing(ht_khash_str);
	ht_destroy_data();
	return 0;
}

I was reading some interesting articles about realizing object-oriented programming in ANSI C. It seems that most people commenting on these articles think this is a good idea in general. I want to say something different, though. In my view, it is fine to realize some basic OOP bits such as encapsulation and constructor, but we should not go too far.

In fact, most of well-formed C projects contain some basic OOP bits. To avoid using too many global variables or a long list of arguments of a C function, we usually put related variables in a struct and transfers a pointer to the struct between functions. Frequently we define functions to (de)allocate memory for the struct. Occasionally we even put the definition of the struct in .c files rather than .h to completely hide the details of the struct. This is basic encapsulation and (de)constructor. We frequently use “static” functions inside a source file. This is private function. We should stop here, though.

Most of these OOP-in-C articles further mimic methods, inheritance, messaging and more OOP bits. However, all these things come at the cost of speed and/or space efficiency. For example, although we may use pointers to functions to mimic methods, pointers take memory and prevent the compiler from inlining simple functions. If we really want to following the C++ methodology to make everything objects, the overhead on these bits is huge.

The most frequent motivation to using OOP in C is because the programmer needs portability while (s)he only knows OOP or thinks OOP is better.  I do not want to argue if OOP is better than procedural programming, but I really think it is big fault to try to mimic all the OOP bits in C in an unnecessarily  complicated way given all the overhead on performance. If you have to use C in your project, learn and be good at procedural programming which has been proved to be at least as good as OOP on a lot of practical applications.