With the completion of the sudoku solving benchmark (my last post), my programming language benchmark is also regarded to be completed (still with a few implementations missing). This post gives more context and analyses of the benchmark.
Design
This benchmark is comprised of four tasks:
- solving 1000 Sudoku puzzles
- multiplying two 1000×1000 matrices
- matching URI or URI|Email in a concatenated Linux HowTo file
- counting the occurrences of words using a dictionary
The first two tasks focus on evaluating the performance of the ability of translating the language source code into machine code. For these two tasks, most of CPU time is spent on the benchmarking programs. The last two tasks focus on evaluating the efficiency of the companion libraries. For these two tasks, most of CPU time is spent on the library routines. These tasks are relatively simple and cannot be easily hand optimized for better performance.
Results and discussions
The complete results are available here. The following figure shows the CPU time for Sudoku solving and matrix multiplication, both evaluating the language implementation itself (click for a larger figure):

In the plots, a number in red indicates that the corresponding implementation requires explicit compilation; in blue shows that the implementation applies a Just-In-Time compilation (JIT); in black implies the implementation interprets the program but without JIT.
The overall message is the following. Languages compiled into machine code (C and D) are slightly faster than languages compiled into bytecode (Java and C#); compilers tend to be faster than Just-In-Time (JIT) interpreters (LuaJIT, PyPy and V8); JIT interpreters are much faster than the conventional interpreters (Perl, CPython and Ruby). Between compilers, C is still the winner with a thin margin. Between interpreters, LuaJIT and V8 pull ahead. There is little surprising for most language implementations, perhaps except the few with very bad performance.
On the other hand, the comparison of the library performance yields a vastly different picture (again, click to enlarge):

This time, even conventional interpreters may approach or even surpass the optimized C implementation (Perl vs. C for simple regex matching). Some compiled languages at their early ages may perform badly.
Conclusions
The quality of libraries is a critical part of a programming language. This benchmark is one of few clearly separating the performance of the language implementation itself and its companion libraries. While compiled languages are typically one or two orders of magnitude faster than interpreted languages, library performance may be very similar. For algorithms heavily rely on library routines, the choice of programming language does not matter too much. It is likely to come up with a benchmark to beat C/C++ in a certain application.
All the benchmarking programs are distributed under the MIT/X11 license. Please follow the links below for the source code and the complete results:
- Source code: https://github.com/attractivechaos/plb
- Complete results: http://attractivechaos.github.com/plb/
There are actually more to say about each specific language implementation, but perhaps I’d better leave the controversial part to readers.