Feeds:
Posts
Comments

Posts Tagged ‘Dart’

I have played with Dart a little bit. Although overall the language is interesting and full of potentials, it indeed has some rough edges.

1) Flawed comma operator.

main() {
	int i = 1, j = 2;
	i = 2, j = 3;
}

Dart will accept line 2 but report a syntax error at line 3. In C and Java, the line is perfectly legitimate.

2) Non-zero integers are different from “true”.

main() {
	if (1) print("true");
	else print("false");
}

The above program will output “false”, which will surprise most C/Java/Lua/Perl programmers.

3) No “real” dynamic arrays.

main() {
	var a = [];
	a[0] = 1;
}

Dart will report a run-time error at line 3. Most scripting languages will automatically expand an array. I know disabling this feature helps to prevent errors, but I always feel it is very inconvenient.

4) No easy ways to declare a constant-sized array. As Dart does not automatically expand arrays, to declare an array of size 10, you have to do this:

main() {
	var a = new List(10);
}

It is more verbose than “int a[10]” in C.

5) No on-stack replacement (OSR). I discussed this point in my last post, but I still feel it is necessary to emphasize again: if you do not know well how Dart works or are not careful enough, the bottleneck of your code may be interpreted but not compiled and then you will experience bad performance. The Dart developers argued that Dart is tuned for real-world performance, but in my view, if a language does not work well with micro-benchmarks, it has a higher chance to deliever bad performance in larger applications.

6) Lack of C/Perl-like file reading. The following is the Dart way to read a file by line:

main() {
	List<String> argv = new Options().arguments;
	var fp = new StringInputStream(new File(argv[0]).openInputStream(), Encoding.ASCII);
	fp.onLine = () {
		print(fp.readLine());
	};
}

Note that you have to use callback to achieve that. This is firstly verbose in comparison to other scripting languages and more importantly, it is very awkward to work with multiple files at the same time. I discussed the motivation of the design with Dart developers and buy their argument that such APIs are useful for event driven server applications. However, for most programmers in my field whose routine work is text processing for huge files, lack of C-like I/O is a showstopper. Although the Dart developers pointed out that openSync() and readSyncList() are closer to C APIs, openSync() does not work on STDIN (and Dart’s built-in STDIN still relies on callback). APIs are also significantly are lacking. For example, dart:io provides APIs to read the entire file as lines, but no APIs to read a single line. In my field, the golden rule is to always avoid reading the entire file into memory. This is probably the recommendation for most text processing.

On file I/O, Dart is not alone. Node.js and most shells built upon Javascript barely provides usable file I/O APIs. It is clear that these developers do not understand the needs of a large fraction of UNIX developers and bioinformatics developers like me.

Summary: Generally, Dart is designed for web development and for server-side applications. The design of Dart (largely the design of APIs) does not fit well for other applications. In principle, nothing stops Dart becoming a widely used general-purpose programming language like Python, but in this trend, it will only be a replacement, at the best, of javascript, or perhaps more precisely, node.js. At the same time, I admit I know little about server-side programming. It would be really good if different camps of programmers work together to come to a really great programming language. Dart is not there, at least not yet.

Read Full Post »

Dart: revisiting matrix multiplication

First of all, I know little about JIT and VM. Some of what I said below may well be wrong, so read this blog post with a grain of salt.

My previous microbenchmark showed that dart is unable to optimize the code to achieve comparable speed to LuaJIT. Vyacheslav Egorov commented that the key reason is that I have not “warmed up” the code. John McCutchan further wrote an article about how to perform microbenchmarks, also emphasizing the importance of warm-up. After reading a thread on the Dart mailing list, I know better about the importance of warm-up now.

If I am right, JIT can be classified as more traditional method JIT whereby the VM compiles a method/function to machine code at a time, and tracing JIT whereby the VM may optimize a single loop. There is a long discussion on Lambda the Ultimate about them. Typically, method JIT needs to identify hot functions and then compile AFTER the method call is finished. What I did not know previously is On Stack Replacement (OSR), with which we are able to compile a method (or part of the method?) to machine code while it is running. This in some way blurs the boundary between method JIT and tracing JIT.

Among popular JIT implementations, V8 and Java use method JIT with OSR, while Pypy and LuaJIT use tracing JIT. They are all able to perform well for matrix multiplication even if the hot method is called only once. In my previous post, Dart has bad performance because it uses method JIT but without OSR. It is unable to optimize the hot function while it is being executed. The Dart development team argued that the lack of OSR is because implementing OSR is complicated and “experience with Javascript and Java programs has shown that it very rarely benefits real applications.”

I hold the opposite opinion, strongly. There is no clear distinction between benchmarks and real applications. It is true that in web development, a program rarely spends more than a few seconds in a function called only once, but there are more real applications than web. In my daily work, I may need to do Smith-Waterman alignment between two long sequences or to compute the first few eigenvalues of a huge positive matrix. The core functions will be called only once. I have also written many one-off scripts having only a main function. Without OSR, Dart won’t perform better than Perl/Python, either, I guess. If the Dart development team want Dart to be widely adopted in addition to web development, OSR will be a key feature (well, a general-purpose language may not be the goal of Dart, which would be a pity!). I wholeheartedly hope they can implement OSR in future.

Fortunately, before OSR gets implemented in Dart (if ever), there is a simpler and more practical solution than warm-up: hoisting the content in the hot loop into a function to allow Dart compiles that function to machine code after it is called for a few times (though to do this, you need to know which loop is hot).

At the end of the post is an updated implementation of matrix multiplication, where “mat_mul1()” and “mat_mul2()” have the same functionality but differ in the use of function. The new implementation (mat_mul2) multiplies two 500×500 matrices in 1.0 second, as opposed to 14 seconds by the old one (mat_mul1). This is still much slower than LuaJIT (0.2 second) and V8 (0.3 second), but I would expect Dart to catch up in the future. Actually Vyacheslav commented that a nightly build might have already achieved or approached that.

SUMMARY: Dart as of now only compiles an entire method to machine code, but it cannot compile the method while it is running. Therefore, if the hot method is called only once, it will not be compiled and you will experience bad performance. An effective solution is to hoist the content of the hot loop to a separate function such that Dart can compile the function after it is executed a few times.

mat_transpose(a)
{
	int m = a.length, n = a[0].length; // m rows and n cols
	var b = new List(n);
	for (int j = 0; j < n; ++j) b[j] = new List<double>(m);
	for (int i = 0; i < m; ++i)
		for (int j = 0; j < n; ++j)
			b[j][i] = a[i][j];
	return b;
}

mat_mul1(a, b)
{
	int m = a.length, n = a[0].length, s = b.length, t = b[0].length;
	if (n != s) return null;
	var x = new List(m), c = mat_transpose(b);
	for (int i = 0; i < m; ++i) {
		x[i] = new List<double>(t);
		for (int j = 0; j < t; ++j) {
			double sum = 0.0;
			for (int k = 0; k < n; ++k) sum += a[i][k] * c[j][k];
			x[i][j] = sum;
		}
	}
	return x;
}

mat_mul2(a, b)
{
	inner_loop(t, n, ai, c)
	{
		var xi = new List<double>(t);
		for (int j = 0; j < t; ++j) {
			double sum = 0.0;
			for (int k = 0; k < n; ++k) sum += ai[k] * c[j][k];
			xi[j] = sum;
		}
		return xi;
	}

	int m = a.length, n = a[0].length, s = b.length, t = b[0].length;
	if (n != s) return null;
	var x = new List(m), c = mat_transpose(b);
	for (int i = 0; i < m; ++i)
		x[i] = inner_loop(t, n, a[i], c);
	return x;
}

mat_gen(int n)
{
	var a = new List(n);
	double t = 1.0 / n / n;
	for (int i = 0; i < n; ++i) {
		a[i] = new List<double>(n);
		for (int j = 0; j < n; ++j)
			a[i][j] = t * (i - j) * (i + j);
	}
	return a;
}

main()
{
	int n = 500;
	var a = mat_gen(n), b = mat_gen(n);
	var c = mat_mul2(a, b);
	print(c[n~/2][n~/2]);
}

Read Full Post »

The dart programming language

The first dart SDK is released today. Since the initial announcement, most web developers have been strongly against dart. The typical argument is that javascript meets our needs and even if it does not there are a bunch of other languages translated to javascript. Why do we need a new one? Because google can take control over it?

While these arguments are true, I see dart in the angle of a command-line tool developer. For javascript or a language translated to javascript, such as coffeescript, it cannot provide basic file I/O and system utilities, which makes it not suitable for developing command-line tools at all. A few years ago when I investigated nodejs, it did not provide proper file I/O, either (it seems much better now, but I have not tried). Another problem with Javascript is that it was not designed for JIT at the beginning. Naively, a language designed with JIT in mind is likely to perform better.

From a quick look, Dart apparently overcomes the major weakness of javascript mentioned above. It has clean C++-like syntax with modern language features, inherites the flexibility of javascript, supports at least basic I/O and system utilities (perhaps a replacement of nodejs?), and is designed for JIT from the beginning. I have not evaluated its performance, but I would expect it will compete or outperform V8 in the long run, though the release note seems to suggest that right now V8 is faster. I will evaluate its performance when I have time.

I have to admit that I am a little anti-google in general (not much), but I applaud google’s decision on developing the dart programming language amidst massively axing other projects. From a quick tour, it seems to be the closest to the ideal programming language in my mind (EDIT: I should add that this is the ideal scripting language; no matter how much I like dart, I will always use C for performance-critical tasks).

Read Full Post »