Archive for the ‘thinking’ Category

With the completion of the Sudoku solving benchmark (my last post), my programming language benchmark can also be regarded as complete, although a few implementations are still missing. This post gives more context and analysis of the benchmark.

Design

This benchmark comprises four tasks:

  1. solving 1000 Sudoku puzzles
  2. multiplying two 1000×1000 matrices
  3. matching URI or URI|Email in a concatenated Linux HowTo file
  4. counting the occurrences of words using a dictionary

The first two tasks evaluate how well a language implementation translates source code into machine code. For these two tasks, most of the CPU time is spent in the benchmarking programs themselves. The last two tasks evaluate the efficiency of the companion libraries. For these two tasks, most of the CPU time is spent in library routines. All four tasks are relatively simple and cannot easily be hand-optimized for better performance.

Results and discussions

The complete results are available here. The following figure shows the CPU time for Sudoku solving and matrix multiplication, both of which evaluate the language implementation itself:

In the plots, a number in red indicates that the corresponding implementation requires explicit compilation; a number in blue indicates that the implementation uses Just-In-Time (JIT) compilation; a number in black indicates that the implementation interprets the program without JIT.

The overall message is the following. Languages compiled into machine code (C and D) are slightly faster than languages compiled into bytecode (Java and C#); compilers tend to be faster than Just-In-Time (JIT) interpreters (LuaJIT, PyPy and V8); JIT interpreters are much faster than conventional interpreters (Perl, CPython and Ruby). Among compilers, C is still the winner by a thin margin. Among interpreters, LuaJIT and V8 pull ahead. There is little surprise for most language implementations, except perhaps for the few with very bad performance.

On the other hand, the comparison of library performance yields a vastly different picture:

This time, even conventional interpreters may approach or even surpass the optimized C implementation (Perl vs. C for simple regex matching). Some young compiled languages, on the other hand, may perform badly.

Conclusions

The quality of its libraries is a critical part of a programming language. This benchmark is one of the few that clearly separate the performance of the language implementation itself from that of its companion libraries. While compiled languages are typically one or two orders of magnitude faster than interpreted languages, library performance may be very similar. For algorithms that rely heavily on library routines, the choice of programming language does not matter much. It is quite possible to come up with a benchmark in which another language beats C/C++ in a certain application.

All the benchmarking programs are distributed under the MIT/X11 license. Please follow the links below for the source code and the complete results:

There is actually more to say about each specific language implementation, but perhaps I’d better leave the controversial part to readers.

Read Full Post »

Amazed by LuaJIT

I have been looking for a replacement for Perl for several years. Now I have found it: Lua, although the decision is based not on the language itself but on its implementation, LuaJIT.

LuaJIT is far faster than all the other scripting language implementations and even comes close to the speed of Java, with fewer lines of code and a smaller memory footprint. To further confirm the efficiency of LuaJIT, I implemented matrix multiplication in C, Lua, JavaScript and Perl. On my laptop, the C implementation multiplies two 1000×1000 matrices in 2.0 seconds (BTW, 1.4 sec if I use “float”; 0.9 sec if SSE is used; 26.8 sec without matrix transpose), LuaJIT with JIT in 2.3 seconds, the LuaJIT interpreter in 24 sec, JavaScript in 40 sec with V8, Lua-5.1.4 in 64 sec and Perl in 283 sec. For a dynamically typed scripting language, LuaJIT achieves a speed comparable to C, which is simply amazing.
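The large gap between the transposed and non-transposed C timings comes from memory access order. The following is a minimal sketch of the transpose trick, not the actual benchmark program; the names `mat_alloc`, `mat_free` and `mat_mul` are hypothetical and chosen only for illustration. Transposing the second matrix first lets the inner loop walk two rows sequentially instead of striding down a column.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate a zeroed n x n matrix: one contiguous data block plus row pointers.
 * (Hypothetical helper; the layout of the real benchmark may differ.) */
double **mat_alloc(int n)
{
    assert(n > 0);
    double **m = malloc((size_t)n * sizeof(double *));
    double *data = calloc((size_t)n * n, sizeof(double));
    if (m == NULL || data == NULL) {
        free(m);
        free(data);
        return NULL;
    }
    for (int i = 0; i < n; ++i)
        m[i] = data + (size_t)i * n;
    return m;
}

void mat_free(double **m)
{
    if (m != NULL) {
        free(m[0]); /* the contiguous data block */
        free(m);    /* the row pointers */
    }
}

/* Compute c = a * b for n x n matrices. Returns 0 on success, -1 if the
 * temporary transposed copy of b cannot be allocated. */
int mat_mul(int n, double **a, double **b, double **c)
{
    double **bt = mat_alloc(n);
    if (bt == NULL)
        return -1;
    /* Transpose b so that column j of b becomes row j of bt. */
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            bt[j][i] = b[i][j];
    for (int i = 0; i < n; ++i) {
        const double *ai = a[i];
        for (int j = 0; j < n; ++j) {
            const double *bj = bt[j];
            double sum = 0.0;
            /* Both ai and bj are read sequentially, which is cache friendly. */
            for (int k = 0; k < n; ++k)
                sum += ai[k] * bj[k];
            c[i][j] = sum;
        }
    }
    mat_free(bt);
    return 0;
}
```

A caller would allocate `a`, `b` and `c` with `mat_alloc(1000)`, fill `a` and `b`, and call `mat_mul(1000, a, b, c)`. Without the transpose, the inner loop would read `b[k][j]` for increasing `k`, jumping a whole row ahead on each step.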

Not only that, LuaJIT fixes what is IMO a major weakness of Lua: the lack of native bit operations. The upcoming Foreign Function Interface (FFI) library, according to the LuaJIT roadmap 2011, will definitely make Lua one of the best scripting languages for binding dynamic libraries, even surpassing Python’s elegant ctypes library.

With its unparalleled efficiency and the addition of important features, LuaJIT makes Lua the best scripting language at present in my opinion. Mike Pall, the single developer of LuaJIT, commented in an interesting discussion that it is possible to implement a JIT compiler for JavaScript, a language similar to Lua in many aspects, that is as efficient as LuaJIT. But JavaScript is not a general-purpose programming language by design. Standardizing additional language features would take years. As to other scripting languages, my impression is that their complexity is the major obstacle to implementing an efficient JIT compiler.

Probably more developers are concerned about the lack of standard libraries in Lua. Personally, I do not see why Lua cannot be a general-purpose scripting language. Probably the creators of Lua just did not intend, or did not have the energy, to implement a comprehensive standard library. I hope someone will organize a group of good programmers to develop such a library. Furthermore, with the upcoming FFI library in LuaJIT, we may be able to call existing library routines easily, in which case the lack-of-library issue would once again be solved by one man alone.

LuaJIT is the future of all scripting languages. Even if LuaJIT were not adopted as widely as I wish it to be, I hope the advanced techniques and ideas developed in LuaJIT can be incorporated into other interpreters and JIT compilers.

Read Full Post »

Although I do not use D, I have always seen it as one of the most attractive programming languages, smartly balancing efficiency, simplicity and extensibility. At the same time, I keep getting frustrated when I see such an elegant thing gradually fade away, given that a) D has dropped out of the top 20 of the TIOBE Programming Community Index and b) it is no longer evaluated by the Computer Language Benchmarks Game. Most programmers know why this happens. I am simply frustrated.

D is falling while Go is rising. I do appreciate the philosophy behind the design of Go and trust Rob Pike and Ken Thompson to deliver another great product, but right now I do not see Go as a replacement for any mainstream programming language as long as it is way slower than Java, not to speak of C/C++. To me, Go’s rise is merely due to the support from Google. It is good as a research project, but it needs time to reach critical mass in practice.

While reading the Computer Language Benchmarks Game, I am really amazed by LuaJIT. Probably I am going to try it some day.

Read Full Post »

Just now I got an email from a mailing list, saying that C++ helps to greatly reduce coding time in comparison to C. I have heard this argument a lot. But is it true?

C++ can possibly accelerate development in two ways: firstly, OOP (Object-Oriented Programming) helps to organize large projects, and secondly, STL (Standard Template Library) saves time on reimplementing frequently used subroutines. However, I do not find that C++ OOP helps me greatly. To me, it is not right to classify a programming language as strictly procedure-oriented or object-oriented. It is only right to say that a development methodology is procedure-oriented or object-oriented. We can effectively mimic the fundamental OOP ideas in C, a so-called procedure-oriented language, by packaging related data in a struct and passing a pointer to the struct to subroutines. I know C++ programmers would argue that doing it this way is far from OOP, but it captures the essence of OOP, and in practice this simple and natural idea is sufficient to organize large projects. The large number of existing C projects, such as the Linux kernel, gcc and Emacs, proves this. With OOP ideas, we can use C to organize large projects without difficulty. C++ does not provide more power; it only introduces more complicated concepts.
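To make the struct-plus-pointer idea concrete, here is a minimal sketch assuming a hypothetical integer stack; the type `IntStack` and the functions `stack_new`, `stack_push`, `stack_pop` and `stack_free` are illustrative names of my own, not taken from any of the projects mentioned above. The struct plays the role of the object, and every "method" takes a pointer to it as its first argument.

```c
#include <assert.h>
#include <stdlib.h>

/* The "object": related data packaged in one struct. */
typedef struct {
    size_t n;    /* number of items currently stored */
    size_t cap;  /* allocated capacity */
    int *items;  /* heap-allocated storage */
} IntStack;

/* "Constructor": returns an empty stack, or NULL on allocation failure. */
IntStack *stack_new(void)
{
    IntStack *s = malloc(sizeof(IntStack));
    if (s == NULL)
        return NULL;
    s->n = 0;
    s->cap = 0;
    s->items = NULL;
    return s;
}

/* "Method": push x onto the stack. Returns 0 on success, -1 on failure. */
int stack_push(IntStack *s, int x)
{
    assert(s != NULL);
    if (s->n == s->cap) {
        size_t cap = s->cap ? s->cap * 2 : 4;
        int *p = realloc(s->items, cap * sizeof(int));
        if (p == NULL)
            return -1;
        s->items = p;
        s->cap = cap;
    }
    s->items[s->n++] = x;
    return 0;
}

/* "Method": pop the top item into *x. Returns -1 if the stack is empty. */
int stack_pop(IntStack *s, int *x)
{
    assert(s != NULL && x != NULL);
    if (s->n == 0)
        return -1;
    *x = s->items[--s->n];
    return 0;
}

/* "Destructor": release the storage and the object itself. */
void stack_free(IntStack *s)
{
    if (s != NULL) {
        free(s->items);
        free(s);
    }
}
```

Callers only touch the stack through these functions, much as they would through the methods of a C++ class; what C does not give for free is inheritance and access control.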

I do not use STL most of the time. I have implemented most of the useful subroutines in C/C++ by myself. I actually spend less time using my own library than using STL, as I am very familiar with my own code. Of course, implementing an efficient and yet generic library by myself takes a lot of time, but I really learn a lot in this invaluable process. I can hardly imagine how a programmer without a firm grasp of data structures, which can only be gained by implementing them oneself, can ever write good programs. To this end, I agree that for elementary programmers using STL reduces coding time; but this is achieved at the cost of weakening the ability to write better programs. And for an advanced programmer, using STL may help but probably does not save much time.

Note that I am not saying C++ is a bad language as a whole. In fact, I use C++ template functions a lot and C++ template classes at times. In this post, I just want to emphasize the importance of focusing on the art of programming instead of on artificial concepts or on the degree of laziness a language allows.

Read Full Post »

What is Moral?

When I was little, I was taught that morality is the guideline followed by the majority of people. When I was older, I became skeptical about this definition: would cheating be moral if the majority of people cheated each other? Would the minority be immoral if cheating became moral? According to the definition I learnt, cheating could be moral, but this strongly contradicts my view of the world. Then what is moral?

Now I think I have found my own definition. Morality is the guideline that adds value to the whole human community. With this definition, cheating is immoral as it wastes human resources and increases the cost of discovering the truth. A behaviour adopted by the majority of people can still be immoral.

In real life, I do see people regard cheating as normal behaviour from time to time. The argument I have heard many times is “if I had been honest, I would not have achieved”. Well, I can do nothing but avoid such people. It is regrettable and dangerous when the immoral become the majority while I am not among the majority. I could behave like the majority, but then I would ruin myself.

Read Full Post »

Language War

Language war denotes the debate over which is the best programming language, especially between languages with similar applications like C/C++, C++/Java and Perl/Python. A lot of programmers, including me, are absorbed in this debate. They write articles; they design benchmarks. However, they can never come to a unanimous conclusion. Language war is never ending.

The long-lasting debate itself implies that there is no best programming language. All the mainstream programming languages have their own advantages and disadvantages. The answer to “which is the best programming language” largely depends on how a programmer weighs the advantages and disadvantages. It is subjective rather than objective: different programmers weigh them differently.

However, it is still beneficial to be involved in the language war. The intense debate makes us think about strengths and weaknesses of programming languages that we might not be aware of otherwise. I will also post my opinions here in the future.

Read Full Post »