<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Attractive Chaos</title>
	<atom:link href="http://attractivechaos.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://attractivechaos.wordpress.com</link>
	<description>Just another WordPress.com weblog</description>
	<lastBuildDate>Wed, 15 May 2013 22:31:51 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='attractivechaos.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Attractive Chaos</title>
		<link>http://attractivechaos.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://attractivechaos.wordpress.com/osd.xml" title="Attractive Chaos" />
	<atom:link rel='hub' href='http://attractivechaos.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Does packed struct hurt performance on x86_64?</title>
		<link>http://attractivechaos.wordpress.com/2013/05/02/does-packed-struct-hurt-performance-on-x86_64/</link>
		<comments>http://attractivechaos.wordpress.com/2013/05/02/does-packed-struct-hurt-performance-on-x86_64/#comments</comments>
		<pubDate>Thu, 02 May 2013 20:34:49 +0000</pubDate>
		<dc:creator>attractivechaos</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://attractivechaos.wordpress.com/?p=1302</guid>
		<description><![CDATA[Most C programmers know that in a C struct, members have to be aligned in memory. Take the following struct as an example: The two members of this struct take 5 bytes in total. However, because &#8220;val&#8221; has to be aligned with the longer &#8220;key&#8221;, &#8220;sizeof(UnpackedStruct)&#8221; returns 8. 3 bytes are wasted in this struct. [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Most C programmers know that in a C struct, members have to be aligned in memory. Take the following struct as an example:</p>
<pre class="brush: cpp; title: ; notranslate">
typedef struct {
  unsigned key;
  unsigned char val;
} UnpackedStruct;
</pre>
<p>The two members of this struct take 5 bytes in total. However, the struct must be aligned like its widest member, the 4-byte &#8220;key&#8221;, so its size is rounded up to a multiple of 4 and &#8220;sizeof(UnpackedStruct)&#8221; returns 8. The 3 bytes of padding after &#8220;val&#8221; are wasted. This waste of memory is the key reason why <a href="https://github.com/attractivechaos/klib/blob/master/khash.h">my khash library</a> uses two separate arrays to keep keys and values, even though this leads to more cache misses.</p>
<p>Khash was initially written about 10 years ago when I was young and foolish. I later learned that with gcc/clang, it is possible to byte-pack the struct:</p>
<pre class="brush: cpp; title: ; notranslate">
typedef struct {
  unsigned key;
  unsigned char val;
}  __attribute__ ((__packed__)) PackedStruct;
</pre>
<p>With this, &#8220;sizeof(PackedStruct)&#8221; returns 5. Then why does gcc not use this by default? Is it because unaligned memory access hurts performance? A Google search pointed me to <a href="http://stackoverflow.com/questions/3454673/can-attribute-packed-affect-the-performance-of-a-program">this question</a> on StackOverflow. There was a discussion, but no clear conclusion.</p>
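<p>The sizes quoted above can be checked with a minimal sketch like the following. It assumes gcc or clang (for the packed attribute) and a platform where &#8220;unsigned&#8221; is 4 bytes, such as x86_64; the helper name report_layout() is made up for this illustration.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
  unsigned key;
  unsigned char val;
} UnpackedStruct;

typedef struct {
  unsigned key;
  unsigned char val;
} __attribute__ ((__packed__)) PackedStruct;

/* Hypothetical helper: print the size and the offset of "val" for both
 * layouts and return the number of bytes saved per element by packing. */
size_t report_layout(void)
{
  /* Packing can only remove padding, never add any. */
  assert(sizeof(PackedStruct) <= sizeof(UnpackedStruct));
  printf("UnpackedStruct: size=%zu, offsetof(val)=%zu\n",
         sizeof(UnpackedStruct), offsetof(UnpackedStruct, val));
  printf("PackedStruct:   size=%zu, offsetof(val)=%zu\n",
         sizeof(PackedStruct), offsetof(PackedStruct, val));
  return sizeof(UnpackedStruct) - sizeof(PackedStruct);
}
```

<p>Calling report_layout() from main() shows that &#8220;val&#8221; sits at the same offset in both structs; only the trailing padding differs.</p>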
<p>Hash tables have become the bottleneck of my recent work, so I decided to revisit the question: does a packed struct hurt performance on x86_64 CPUs? As usual, I did a very simple benchmark: with khash, I insert/delete 50 million (uint32_t,uint8_t) integer pairs stored in either the packed or the unpacked struct shown above and see whether the performance differs. The following table shows the CPU time on my x86_64 laptop:</p>
<table border="1" cellspacing="0">
<tr style="background-color:lightgray;">
<th>Key type
<th>Value type
<th>Size per element (bytes)
<th>CPU seconds</tr>
<tr>
<td>Unsigned
<td>uint8_t
<td>5
<td>10.249</tr>
<tr>
<td>UnpackedStruct
<td>N/A
<td>8
<td>9.429</tr>
<tr>
<td>PackedStruct
<td>N/A
<td>5
<td>9.287</tr>
</table>
<p>The table says it all: <b>on x86_64 CPUs, a packed struct array does not hurt performance in comparison to an unpacked struct array</b>. With both gcc and clang, the packed struct is consistently faster, perhaps because it takes less space, which might help cache performance. The source code can be <a href="https://github.com/attractivechaos/klib/blob/master/test/khash_test.c">found here</a>.</p>
<p>Finally, it should be noted that x86 CPUs have been optimized for unaligned memory access. On other CPUs, the results may be very different. Perhaps that is why gcc does not pack structs by default.</p>
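<p>For a quick check on another CPU, a much simpler test than the khash benchmark may be enough. The following is only a sketch under stated assumptions: it is not the benchmark used for the table, it measures sequential array access rather than hash table operations, and all names in it are hypothetical.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

typedef struct { uint32_t key; uint8_t val; } UnpackedElem;
typedef struct { uint32_t key; uint8_t val; } __attribute__ ((__packed__)) PackedElem;

/* Fill an array of n unpacked elements, then sum all keys and values. */
uint64_t sum_unpacked(size_t n)
{
  UnpackedElem *a = malloc(n * sizeof(*a));
  uint64_t sum = 0;
  assert(a != NULL);
  for (size_t i = 0; i < n; ++i) { a[i].key = (uint32_t)i; a[i].val = (uint8_t)i; }
  for (size_t i = 0; i < n; ++i) sum += (uint64_t)a[i].key + a[i].val;
  free(a);
  return sum;
}

/* Same work on packed elements; most keys are now unaligned. */
uint64_t sum_packed(size_t n)
{
  PackedElem *a = malloc(n * sizeof(*a));
  uint64_t sum = 0;
  assert(a != NULL);
  for (size_t i = 0; i < n; ++i) { a[i].key = (uint32_t)i; a[i].val = (uint8_t)i; }
  for (size_t i = 0; i < n; ++i) sum += (uint64_t)a[i].key + a[i].val;
  free(a);
  return sum;
}

/* CPU seconds spent in f(n); the checksum is stored in *sum so that the
 * two variants can be compared for correctness. */
double cpu_seconds(uint64_t (*f)(size_t), size_t n, uint64_t *sum)
{
  clock_t start = clock();
  *sum = f(n);
  return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

<p>Calling cpu_seconds(sum_unpacked, n, &amp;s) and cpu_seconds(sum_packed, n, &amp;s) with a large n gives two timings to compare. Both checksums must be equal, and an optimizing compiler may vectorize the loops, so the numbers say little about hash table workloads.</p>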
]]></content:encoded>
			<wfw:commentRss>http://attractivechaos.wordpress.com/2013/05/02/does-packed-struct-hurt-performance-on-x86_64/feed/</wfw:commentRss>
		<slash:comments>8</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/047ebc7bb9ff37a0da844413856e92cb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">attractivechaos</media:title>
		</media:content>
	</item>
		<item>
		<title>Performance of Rust and Dart in Sudoku Solving</title>
		<link>http://attractivechaos.wordpress.com/2013/04/06/performance-of-rust-and-dart-in-sudoku-solving/</link>
		<comments>http://attractivechaos.wordpress.com/2013/04/06/performance-of-rust-and-dart-in-sudoku-solving/#comments</comments>
		<pubDate>Sat, 06 Apr 2013 04:03:39 +0000</pubDate>
		<dc:creator>attractivechaos</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[benchmark]]></category>

		<guid isPermaLink="false">http://attractivechaos.wordpress.com/?p=1225</guid>
		<description><![CDATA[Introduction About two years ago I evaluated the performance of ~20 compilers and interpreters on sudoku solving, matrix multiplication, pattern matching and dictionary operations. Two years later, I decided to update a small part of the benchmark on Sudoku solving. I chose this problem because it is practically and algorithmically interesting, and simple enough to be [&#8230;]]]></description>
				<content:encoded><![CDATA[<h3>Introduction</h3>
<p>About two years ago I <a href="http://attractivechaos.wordpress.com/2011/06/22/my-programming-language-benchmark-analyses/">evaluated</a> the performance of ~20 compilers and interpreters on sudoku solving, matrix multiplication, pattern matching and dictionary operations. Two years later, I decided to update a small part of the benchmark on Sudoku solving. I chose this problem because it is practically and algorithmically interesting, and simple enough to be easily ported to multiple languages. Meanwhile, I am also adding two new programming languages: Mozilla&#8217;s <a href="http://www.rust-lang.org/">Rust</a> and Google&#8217;s <a href="http://www.dartlang.org/">Dart</a>. They are probably the most promising languages announced in the past two years.</p>
<h3>Results</h3>
<p>In this small benchmark, I am implementing Sudoku solvers in multiple programming languages. The algorithm, adapted from <a href="http://magictour.free.fr/suexco.txt">Guenter Stertenbrink&#8217;s solver</a>, was first implemented in C and then ported to other languages. <a href="https://github.com/attractivechaos/plb/blob/master/sudoku/sudoku_v1.c">The C source code</a> briefly describes the method. For more information about Sudoku solving in general, please see <a href="http://attractivechaos.wordpress.com/2011/06/19/an-incomplete-review-of-sudoku-solver-implementations/">my other post</a>.</p>
<p>Before I show the results, there are a couple of caveats to note:</p>
<ul>
<li>Solving Sudoku is NP-hard. The choice of the solving algorithm will dramatically affect the speed. For example, <a href="https://github.com/attractivechaos/plb/blob/master/sudoku/sudoku_v1.rs">my Rust implementation</a> is ~2500 times faster than <a href="https://github.com/mozilla/rust/blob/master/src/test/bench/sudoku.rs">the one in the Rust official repository</a>. For a language benchmark, we must implement exactly the same algorithm.
<li>I am mostly familiar with C but am pretty much a newbie in other programming languages. I am sure some implementations are not optimal. If you can improve the code, please send me a pull request. I am happy to replace it with a better version.
</ul>
<p>The following table shows the CPU time for solving <a href="https://github.com/attractivechaos/plb/blob/master/sudoku/sudoku.txt">20 hard Sudokus</a> repeated <strike>50</strike> 500 times (thus <strike>1000</strike> 10000 Sudokus in total). The programs, which are <a href="https://github.com/attractivechaos/plb/tree/master/sudoku">freely available</a>, are compiled and run on my Mac laptop with a 2.66GHz Core i7 CPU.</p>
<table border="1" cellpadding="100" cellspacing="0">
<tr style="background-color:lightgray;">
<th>Compiler/VM
<th>Version
<th>Language
<th>Option
<th>CPU time (sec)</tr>
<tr>
<th>clang
<td>425.0.27 (3.2svn)
<td>C
<td>-O2
<td>8.92</tr>
<tr>
<th>llvm-gcc
<td>4.2.1
<td>C
<td>-O2
<td>9.23</tr>
<tr>
<th>dmd
<td>2.062
<td>D2
<td>-O -release<br />-noboundscheck
<td>11.54<br />11.47</tr>
<tr>
<th>rust
<td>0.6
<td>Rust
<td>--opt-level 3
<td>11.51</tr>
<tr>
<th>java
<td>1.6.0_37
<td>Java
<td>-d64
<td>11.57</tr>
<tr>
<th>go
<td>1.1beta 20130406
<td>Go
<td>(default)<br />-gcflags -B
<td>14.96<br />13.78</tr>
<tr>
<th>dart
<td>0.4.4.4-r20810
<td>Dart
<td>
<td>21.42</tr>
<tr>
<th>v8
<td>3.16.3
<td>Javascript
<td>
<td>28.19</tr>
<tr>
<th>luajit
<td>2.0.1
<td>Lua
<td>
<td>30.66</tr>
<tr>
<th>pypy
<td>2.0-beta-130405
<td>Python
<td>
<td>44.29</tr>
</table>
<p>In this small benchmark, C still takes the crown of speed, <strike>Other statically typed languages are about twice as slow</strike> but Rust and D are very close to C. It is pretty amazing that Rust, as a new language, is that performant given that the developers have not put much effort into speed so far.</p>
<p>Among dynamically typed languages, Dart, V8 and LuaJIT are similar in speed, roughly 2.5 to 3.5 times as slow as C. A factor of 3 is arguably not much for many applications. I really hope some day I can use a handy dynamically typed language for most programming. Pypy is slower here, but it is more than twice as fast as the version two years ago.</p>
<h3>Related resources</h3>
<ul>
<li><a href="http://attractivechaos.github.io/plb/">Old benchmark results</a>.
<li><a href="https://github.com/attractivechaos/plb/tree/master/sudoku">Sudoku solver source code based on the same algorithm</a>.
<li><a href="http://attractivechaos.github.io/plb/kudoku.html">Online Sudoku solver using the same algorithm</a>.
<li><a href="https://github.com/attractivechaos/plb/tree/master/sudoku/incoming">Third-party sudoku solvers</a>.
<li><a href="http://www.reddit.com/r/programming/comments/1bs479/performance_of_rust_and_dart_in_sudoku_solving/">Reddit discussions</a>.
<li><a href="https://news.ycombinator.com/item?id=5502884">Hacker News discussions</a>.
</ul>
<h3>Update</h3>
<ul>
<li>I forgot to use `-release&#8217; with dmd. The new result looks much better. Sorry for my mistake.
<li>Mac ships gcc-4.2.1 only due to licensing issues. I have just tried both gcc 4.7.2 and gcc 4.8 from MacPorts. The executables compiled by them take 0.99 second to run, slower than gcc-4.2.1.
<li>Updated to the latest Go compiled from the repository.
<li>Updated the Python implementation (thanks to <a href="https://github.com/rob-smallshire">Rob Smallshire</a>).
<li>Updated the Dart implementation (thanks to <a href="https://github.com/jwendel">jwendel</a>).
<li>Updated the Rust implementation (thanks to <a href="https://github.com/dotdash">dotdash</a>).
<li>Made input 10 times larger to reduce the fraction of time spent on VM startup. Dart/V8/LuaJIT have short VM startup time, but Java is known to have a long startup.
<li>Updated the Go implementation (thanks to <a href="https://github.com/spaolacci">Sébastien Paolacci</a>).
<li>Updated the Python implementation.
</ul>
]]></content:encoded>
			<wfw:commentRss>http://attractivechaos.wordpress.com/2013/04/06/performance-of-rust-and-dart-in-sudoku-solving/feed/</wfw:commentRss>
		<slash:comments>60</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/047ebc7bb9ff37a0da844413856e92cb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">attractivechaos</media:title>
		</media:content>
	</item>
		<item>
		<title>Weekend project: K8 revived</title>
		<link>http://attractivechaos.wordpress.com/2012/12/24/weekend-project-k8-revived/</link>
		<comments>http://attractivechaos.wordpress.com/2012/12/24/weekend-project-k8-revived/#comments</comments>
		<pubDate>Mon, 24 Dec 2012 16:08:09 +0000</pubDate>
		<dc:creator>attractivechaos</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[C]]></category>
		<category><![CDATA[Javascript]]></category>

		<guid isPermaLink="false">http://attractivechaos.wordpress.com/?p=1221</guid>
		<description><![CDATA[Around a weekend two years ago, I wrote a Javascript shell, K8, based on Google&#8217;s V8 Javascript engine. It aimed to provide basic file I/O that was surprisingly lacking from nearly all Javascript shells at that time. I have spent little time on that project since then. K8 is not compatible with the latest V8 any [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>Around a weekend two years ago, I <a href="http://attractivechaos.wordpress.com/2011/01/13/the-k8-javascript-shell/">wrote</a> a Javascript shell, <a href="https://github.com/attractivechaos/k8">K8</a>, based on Google&#8217;s <a href="http://code.google.com/p/v8/">V8 Javascript engine</a>. It aimed to provide basic file I/O that was surprisingly lacking from nearly all Javascript shells at that time. I have spent little time on that project since then. K8 is not compatible with the latest V8 any more.</p>
<p>Two years later, the situation of Javascript shells has not changed much. Most of them, including Dart, still lack usable file I/O for general-purpose text processing, one of the most fundamental functionalities in other programming languages, from the low-level C to Java/D to the high-level Perl/Python. Web developers seem to follow a distinct programming paradigm in comparison to typical Unix programmers and programmers in my field.</p>
<p>This weekend, I revived K8, partly as an exercise and partly as my response to the lack of appropriate file I/O APIs in Javascript. K8 is written in a 600-line C++ file. It is much smaller than other JS shells, but it provides the features that I need most and that are lacking from Javascript and other JS shells. You can find the documentation on <a href="https://github.com/attractivechaos/k8">K8 github</a>. I will only show an example:</p>
<pre class="brush: jscript; title: ; notranslate">
var x = new Bytes(), y = new Bytes();
x.set('foo'); x.set([0x20,0x20]); x.set('bar'); x.set('F', 0); x[3]=0x2c;
print(x.toString())   // output: 'Foo, bar'
y.set('BAR'); x.set(y, 5)
print(x)              // output: 'Foo, BAR'
x.destroy(); y.destroy()

if (arguments.length) { // read and print file
  var x = new Bytes(), s = new iStream(new File(arguments[0]));
  while (s.readline(x) &gt;= 0) print(x)
  s.close(); x.destroy();
}
</pre>
]]></content:encoded>
			<wfw:commentRss>http://attractivechaos.wordpress.com/2012/12/24/weekend-project-k8-revived/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/047ebc7bb9ff37a0da844413856e92cb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">attractivechaos</media:title>
		</media:content>
	</item>
		<item>
		<title>My Concerns with the Dart Programming Language</title>
		<link>http://attractivechaos.wordpress.com/2012/12/07/my-concerns-with-the-dart-programming-language/</link>
		<comments>http://attractivechaos.wordpress.com/2012/12/07/my-concerns-with-the-dart-programming-language/#comments</comments>
		<pubDate>Fri, 07 Dec 2012 18:23:44 +0000</pubDate>
		<dc:creator>attractivechaos</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[Dart]]></category>

		<guid isPermaLink="false">http://attractivechaos.wordpress.com/?p=1214</guid>
		<description><![CDATA[I have played with Dart a little bit. Although overall the language is interesting and full of potential, it indeed has some rough edges. 1) Flawed comma operator. Dart will accept line 2 but report a syntax error at line 3. In C, the line is perfectly legitimate. 2) Non-zero integers are different [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>I have played with Dart a little bit. Although overall the language is interesting and full of potential, it indeed has some rough edges.</p>
<p>1) Flawed comma operator.</p>
<pre class="brush: cpp; title: ; notranslate">
main() {
	int i = 1, j = 2;
	i = 2, j = 3;
}
</pre>
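<p>Dart will accept line 2 but report a syntax error at line 3. In C, the line is perfectly legitimate. (Java also rejects it: it allows comma-separated expressions only in the header of a &#8220;for&#8221; loop.)</p>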
<p>2) Non-zero integers are different from &#8220;true&#8221;.</p>
<pre class="brush: cpp; title: ; notranslate">
main() {
	if (1) print(&quot;true&quot;);
	else print(&quot;false&quot;);
}
</pre>
<p>The above program will output &#8220;false&#8221;, which will surprise most C/Java/Lua/Perl programmers.</p>
<p>3) No &#8220;real&#8221; dynamic arrays.</p>
<pre class="brush: cpp; title: ; notranslate">
main() {
	var a = [];
	a[0] = 1;
}
</pre>
<p>Dart will report a run-time error at line 3. Most scripting languages will automatically expand an array. I know disabling this feature helps to prevent errors, but I always feel it is very inconvenient.</p>
<p>4) No easy way to declare a constant-sized array. As Dart does not automatically expand arrays, to declare an array of size 10, you have to do this:</p>
<pre class="brush: cpp; title: ; notranslate">
main() {
	var a = new List(10);
}
</pre>
<p>It is more verbose than &#8220;int a[10]&#8221; in C.</p>
<p>5) No on-stack replacement (OSR). I discussed this point in my last post, but I still feel it is necessary to emphasize it again: if you do not know well how Dart works or are not careful enough, the bottleneck of your code may be interpreted rather than compiled, and then you will experience bad performance. The Dart developers argued that Dart is tuned for real-world performance, but in my view, if a language does not work well with micro-benchmarks, it has a higher chance of delivering bad performance in larger applications.</p>
<p>6) Lack of C/Perl-like file reading. The following is the Dart way to read a file by line:</p>
<pre class="brush: cpp; title: ; notranslate">
main() {
	List&lt;String&gt; argv = new Options().arguments;
	var fp = new StringInputStream(new File(argv[0]).openInputStream(), Encoding.ASCII);
	fp.onLine = () {
		print(fp.readLine());
	};
}
</pre>
<p>Note that you have to use a callback to achieve that. Firstly, this is verbose in comparison to other scripting languages; more importantly, it is very awkward to work with multiple files at the same time. I discussed the motivation of the design with the Dart developers and buy their argument that such APIs are useful for event-driven server applications. However, for most programmers in my field, whose routine work is text processing of huge files, the lack of C-like I/O is a showstopper. Although the Dart developers pointed out that openSync() and readSyncList() are closer to C APIs, openSync() does not work on STDIN (and Dart&#8217;s built-in STDIN still relies on callbacks). APIs are also significantly lacking. For example, dart:io provides APIs to read the entire file as lines, but no APIs to read a single line. In my field, the golden rule is to always avoid reading the entire file into memory. This is probably the recommendation for most text processing.</p>
<p>On file I/O, Dart is not alone. Node.js and most shells built upon Javascript barely provide usable file I/O APIs. It is clear that these developers do not understand the needs of a large fraction of UNIX developers and bioinformatics developers like me.</p>
<p><b>Summary:</b> Generally, Dart is designed for web development and for server-side applications. The design of Dart (largely the design of its APIs) does not fit other applications well. In principle, nothing stops Dart from becoming a widely used general-purpose programming language like Python, but on its current course, it will at best only be a replacement for Javascript, or perhaps more precisely, node.js. At the same time, I admit I know little about server-side programming. It would be really good if different camps of programmers worked together to come up with a really great programming language. Dart is not there, at least not yet.</p>
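<p>For comparison, this is roughly what &#8220;C-like&#8221; line reading means here. It is a minimal sketch using only standard C; the function name copy_lines() and the 4096-byte buffer are choices made for this illustration. It works on any stream, including stdin, and never holds more than one buffer of the file in memory.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy "in" to "out" one chunk at a time with fgets()
 * and return the number of lines seen. A line longer than the buffer is
 * read in several chunks but counted once. */
long copy_lines(FILE *in, FILE *out)
{
  char buf[4096];
  size_t len = 0;
  long n = 0;
  assert(in != NULL && out != NULL);
  while (fgets(buf, sizeof(buf), in) != NULL) {
    len = strlen(buf);
    fputs(buf, out);
    if (len > 0 && buf[len - 1] == '\n') ++n; /* a complete line ended here */
  }
  /* Count a final line that is not terminated by a newline. */
  if (len > 0 && buf[len - 1] != '\n') ++n;
  return n;
}
```

<p>A call such as copy_lines(stdin, stdout) behaves like a minimal &#8220;cat&#8221;, and several files can be processed side by side simply by holding several FILE pointers.</p>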
]]></content:encoded>
			<wfw:commentRss>http://attractivechaos.wordpress.com/2012/12/07/my-concerns-with-the-dart-programming-language/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/047ebc7bb9ff37a0da844413856e92cb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">attractivechaos</media:title>
		</media:content>
	</item>
		<item>
		<title>Dart: revisiting matrix multiplication</title>
		<link>http://attractivechaos.wordpress.com/2012/10/26/dart-revisiting-matrix-multiplication/</link>
		<comments>http://attractivechaos.wordpress.com/2012/10/26/dart-revisiting-matrix-multiplication/#comments</comments>
		<pubDate>Fri, 26 Oct 2012 14:56:52 +0000</pubDate>
		<dc:creator>attractivechaos</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[Dart]]></category>

		<guid isPermaLink="false">http://attractivechaos.wordpress.com/?p=1200</guid>
		<description><![CDATA[First of all, I know little about JIT and VM. Some of what I said below may well be wrong, so read this blog post with a grain of salt. My previous microbenchmark showed that Dart is unable to optimize the code to achieve comparable speed to LuaJIT. Vyacheslav Egorov commented that the key reason [&#8230;]]]></description>
				<content:encoded><![CDATA[<p>First of all, I know little about JIT and VM. Some of what I said below may well be wrong, so read this blog post with a grain of salt.</p>
<p><a href="http://attractivechaos.wordpress.com/2012/10/18/initial-evaluation-of-the-dart-performance/">My previous microbenchmark</a> showed that Dart is unable to optimize the code to achieve comparable speed to LuaJIT. Vyacheslav Egorov <a href="http://attractivechaos.wordpress.com/2012/10/18/initial-evaluation-of-the-dart-performance/#comment-1139">commented</a> that the key reason is that I have not &#8220;warmed up&#8221; the code. John McCutchan further wrote <a href="http://www.dartlang.org/articles/benchmarking/">an article</a> about how to perform microbenchmarks, also emphasizing the importance of warm-up. After reading <a href="https://groups.google.com/a/dartlang.org/forum/?fromgroups=#!topic/misc/xNUU2lp_Dt0">a thread</a> on the Dart mailing list, I understand the importance of warm-up better now.</p>
<p>If I am right, JIT compilers can be classified into the more traditional method JIT, whereby the VM compiles one method/function to machine code at a time, and tracing JIT, whereby the VM may optimize a single loop. There is a <a href="http://lambda-the-ultimate.org/node/3851">long discussion</a> on Lambda the Ultimate about them. Typically, a method JIT needs to identify hot functions and then compiles them AFTER the method call has finished. What I did not know previously is On Stack Replacement (OSR), with which the VM is able to compile a method (or part of the method?) to machine code while it is running. This in some way blurs the boundary between method JIT and tracing JIT.</p>
<p>Among popular JIT implementations, V8 and Java use method JIT with OSR, while Pypy and LuaJIT use tracing JIT. They are all able to perform well for matrix multiplication even if the hot method is called only once. In my previous post, Dart had bad performance because it uses method JIT but without OSR. It is unable to optimize the hot function while it is being executed. The Dart development team argued that the lack of OSR is because implementing OSR is complicated and &#8220;experience with Javascript and Java programs has shown that it very rarely benefits real applications.&#8221;</p>
<p>I hold the opposite opinion, strongly. There is no clear distinction between benchmarks and real applications. It is true that in web development, a program rarely spends more than a few seconds in a function called only once, but there are more real applications than web applications. In my daily work, I may need to do Smith-Waterman alignment between two long sequences or to compute the first few eigenvalues of a huge positive matrix. The core functions will be called only once. I have also written many one-off scripts having only a main function. Without OSR, Dart won&#8217;t perform better than Perl/Python, either, I guess. If the Dart development team wants Dart to be widely adopted beyond web development, OSR will be a key feature (well, a general-purpose language may not be the goal of Dart, which would be a pity!). I wholeheartedly hope they can implement OSR in the future.</p>
<p>Fortunately, before OSR gets implemented in Dart (if ever), there is a simpler and more practical solution than warm-up: hoisting the content of the hot loop into a function, which allows Dart to compile that function to machine code after it has been called a few times (though to do this, you need to know which loop is hot).</p>
<p>At the end of the post is an updated implementation of matrix multiplication, where &#8220;mat_mul1()&#8221; and &#8220;mat_mul2()&#8221; have the same functionality but differ in whether the inner loops are hoisted into a separate function. The new implementation (mat_mul2) multiplies two 500&#215;500 matrices in 1.0 second, as opposed to 14 seconds by the old one (mat_mul1). This is still much slower than LuaJIT (0.2 second) and V8 (0.3 second), but I would expect Dart to catch up in the future. Actually Vyacheslav commented that a nightly build might have already achieved or approached that.</p>
<p><b>SUMMARY:</b> Dart as of now only compiles an entire method to machine code, but it cannot compile the method while it is running. Therefore, if the hot method is called only once, it will not be compiled and you will experience bad performance. An effective solution is to hoist the content of the hot loop to a separate function such that Dart can compile the function after it is executed a few times.</p>
<pre class="brush: cpp; title: ; notranslate">
mat_transpose(a)
{
	int m = a.length, n = a[0].length; // m rows and n cols
	var b = new List(n);
	for (int j = 0; j &lt; n; ++j) b[j] = new List&lt;double&gt;(m);
	for (int i = 0; i &lt; m; ++i)
		for (int j = 0; j &lt; n; ++j)
			b[j][i] = a[i][j];
	return b;
}

mat_mul1(a, b)
{
	int m = a.length, n = a[0].length, s = b.length, t = b[0].length;
	if (n != s) return null;
	var x = new List(m), c = mat_transpose(b);
	for (int i = 0; i &lt; m; ++i) {
		x[i] = new List&lt;double&gt;(t);
		for (int j = 0; j &lt; t; ++j) {
			double sum = 0.0;
			for (int k = 0; k &lt; n; ++k) sum += a[i][k] * c[j][k];
			x[i][j] = sum;
		}
	}
	return x;
}

mat_mul2(a, b)
{
	inner_loop(t, n, ai, c)
	{
		var xi = new List&lt;double&gt;(t);
		for (int j = 0; j &lt; t; ++j) {
			double sum = 0.0;
			for (int k = 0; k &lt; n; ++k) sum += ai[k] * c[j][k];
			xi[j] = sum;
		}
		return xi;
	}

	int m = a.length, n = a[0].length, s = b.length, t = b[0].length;
	if (n != s) return null;
	var x = new List(m), c = mat_transpose(b);
	for (int i = 0; i &lt; m; ++i)
		x[i] = inner_loop(t, n, a[i], c);
	return x;
}

mat_gen(int n)
{
	var a = new List(n);
	double t = 1.0 / n / n;
	for (int i = 0; i &lt; n; ++i) {
		a[i] = new List&lt;double&gt;(n);
		for (int j = 0; j &lt; n; ++j)
			a[i][j] = t * (i - j) * (i + j);
	}
	return a;
}

main()
{
	int n = 500;
	var a = mat_gen(n), b = mat_gen(n);
	var c = mat_mul2(a, b);
	print(c[n~/2][n~/2]);
}
</pre>
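<p>For readers more at home in C, the restructuring done in mat_mul2() can be sketched as follows. C is compiled ahead of time, so this shows only the shape of the change, not a speed-up. The flat row-major layout and the function signatures are assumptions of this sketch, not part of the Dart code above.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* The hoisted hot loop: compute one output row xi (t values) from row ai
 * of a (n values) and c, the transpose of b stored as t rows of n values. */
static void inner_loop(int t, int n, const double *ai, const double *c, double *xi)
{
  for (int j = 0; j < t; ++j) {
    double sum = 0.0;
    for (int k = 0; k < n; ++k) sum += ai[k] * c[j * n + k];
    xi[j] = sum;
  }
}

/* Multiply the m-by-n matrix a by the n-by-t matrix b (both row-major).
 * Returns a newly allocated m-by-t matrix, or NULL if allocation fails;
 * the caller frees the result. */
double *mat_mul2(int m, int n, int t, const double *a, const double *b)
{
  assert(m > 0 && n > 0 && t > 0);
  double *c = malloc((size_t)t * n * sizeof(double));
  double *x = malloc((size_t)m * t * sizeof(double));
  if (c == NULL || x == NULL) { free(c); free(x); return NULL; }
  for (int k = 0; k < n; ++k)          /* c = transpose of b */
    for (int j = 0; j < t; ++j)
      c[j * n + k] = b[k * t + j];
  for (int i = 0; i < m; ++i)          /* one call per output row */
    inner_loop(t, n, a + (size_t)i * n, c, x + (size_t)i * t);
  free(c);
  return x;
}
```

<p>As in the Dart version, the outer function only loops over rows, and all the arithmetic happens in a small function that is called m times. That repeated call is what gives a method JIT without OSR the chance to compile the hot code.</p>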
]]></content:encoded>
			<wfw:commentRss>http://attractivechaos.wordpress.com/2012/10/26/dart-revisiting-matrix-multiplication/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/047ebc7bb9ff37a0da844413856e92cb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">attractivechaos</media:title>
		</media:content>
	</item>
		<item>
		<title>Initial evaluation of the dart performance</title>
		<link>http://attractivechaos.wordpress.com/2012/10/18/initial-evaluation-of-the-dart-performance/</link>
		<comments>http://attractivechaos.wordpress.com/2012/10/18/initial-evaluation-of-the-dart-performance/#comments</comments>
		<pubDate>Thu, 18 Oct 2012 04:24:52 +0000</pubDate>
		<dc:creator>attractivechaos</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://attractivechaos.wordpress.com/?p=1181</guid>
		<description><![CDATA[A quick post. I implemented matrix multiplication in Dart. It takes Dart 12 seconds to multiply two 500&#215;500 matrices. In contrast, LuaJIT does the same job in less than 0.2 seconds. Perl takes 26 seconds. This means that Dart fails to JIT the critical loop even though I am trying to use explicit typing. Dart [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=attractivechaos.wordpress.com&#038;blog=4545823&#038;post=1181&#038;subd=attractivechaos&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>A quick post. I implemented matrix multiplication in Dart. Dart takes 12 seconds to multiply two 500&#215;500 matrices. In contrast, LuaJIT does the same job in less than 0.2 seconds, and Perl takes 26 seconds. This suggests that Dart fails to JIT-compile the critical loop even though I use explicit typing. Dart does not yet match the performance of V8 and LuaJIT. The source code is appended. It looks almost the same as C++.</p>
<pre class="brush: cpp; title: ; notranslate">
mat_transpose(a)
{
	int m = a.length, n = a[0].length; // m rows and n cols
	var b = new List(n);
	for (int j = 0; j &lt; n; ++j) b[j] = new List&lt;double&gt;(m);
	for (int i = 0; i &lt; m; ++i)
		for (int j = 0; j &lt; n; ++j)
			b[j][i] = a[i][j];
	return b;
}

mat_mul(a, b)
{
	int m = a.length, n = a[0].length, s = b.length, t = b[0].length;
	if (n != s) return null;
	var x = new List(m), c = mat_transpose(b);
	for (int i = 0; i &lt; m; ++i) {
		x[i] = new List&lt;double&gt;(t);
		for (int j = 0; j &lt; t; ++j) {
			double sum = 0;
			var ai = a[i], cj = c[j];
			for (int k = 0; k &lt; n; ++k) sum += ai[k] * cj[k];
			x[i][j] = sum;
		}
	}
	return x;
}

mat_gen(int n)
{
	var a = new List(n);
	double t = 1.0 / n / n;
	for (int i = 0; i &lt; n; ++i) {
		a[i] = new List&lt;double&gt;(n);
		for (int j = 0; j &lt; n; ++j)
			a[i][j] = t * (i - j) * (i + j);
	}
	return a;
}

main()
{
	int n = 500;
	var a = mat_gen(n);
	var b = mat_gen(n);
	var c = mat_mul(a, b);
	print(c[n~/2][n~/2]);
}
</pre>
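<p>For reference, here is a minimal C sketch of the same computation: transpose the second matrix first so that the inner loop walks two rows sequentially. It is not the code used for the timings above. The helper names (<code>mat_alloc</code>, <code>mat_free</code>) and the contiguous row-pointer layout are assumptions made for this sketch.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Allocate an m-by-n matrix of zeros: row pointers into one contiguous block. */
double **mat_alloc(int m, int n)
{
    double **a = malloc(m * sizeof(double *));
    double *data = calloc((size_t)m * n, sizeof(double));
    assert(a != NULL && data != NULL);
    for (int i = 0; i < m; ++i) a[i] = data + (size_t)i * n;
    return a;
}

void mat_free(double **a)
{
    free(a[0]); /* the contiguous data block */
    free(a);    /* the row pointers */
}

/* Same test matrix as mat_gen() in the Dart code. */
double **mat_gen(int n)
{
    double **a = mat_alloc(n, n), t = 1.0 / n / n;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            a[i][j] = t * (i - j) * (i + j);
    return a;
}

/* Return the n-by-m transpose of the m-by-n matrix a. */
double **mat_transpose(int m, int n, double **a)
{
    double **b = mat_alloc(n, m);
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j)
            b[j][i] = a[i][j];
    return b;
}

/* Return x = a * b, where a is m-by-n and b is n-by-t. */
double **mat_mul(int m, int n, int t, double **a, double **b)
{
    double **x = mat_alloc(m, t), **c = mat_transpose(n, t, b);
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < t; ++j) {
            const double *ai = a[i], *cj = c[j]; /* both rows are read sequentially */
            double sum = 0.0;
            for (int k = 0; k < n; ++k) sum += ai[k] * cj[k];
            x[i][j] = sum;
        }
    }
    mat_free(c);
    return x;
}
```

<p>To mirror <code>main()</code> above, one would call <code>mat_gen(500)</code> twice, pass both results to <code>mat_mul(500, 500, 500, a, b)</code> and print <code>c[250][250]</code>.</p>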
]]></content:encoded>
			<wfw:commentRss>http://attractivechaos.wordpress.com/2012/10/18/initial-evaluation-of-the-dart-performance/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/047ebc7bb9ff37a0da844413856e92cb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">attractivechaos</media:title>
		</media:content>
	</item>
		<item>
		<title>The dart programming language</title>
		<link>http://attractivechaos.wordpress.com/2012/10/17/the-dart-programming-language/</link>
		<comments>http://attractivechaos.wordpress.com/2012/10/17/the-dart-programming-language/#comments</comments>
		<pubDate>Wed, 17 Oct 2012 13:24:43 +0000</pubDate>
		<dc:creator>attractivechaos</dc:creator>
				<category><![CDATA[development]]></category>
		<category><![CDATA[Dart]]></category>

		<guid isPermaLink="false">http://attractivechaos.wordpress.com/?p=1178</guid>
		<description><![CDATA[The first Dart SDK was released today. Since the initial announcement, most web developers have been strongly against Dart. The typical argument is that JavaScript meets our needs, and even where it does not, there are plenty of other languages that compile to JavaScript. Why do we need a new one? Because Google can take [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=attractivechaos.wordpress.com&#038;blog=4545823&#038;post=1178&#038;subd=attractivechaos&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>The first Dart SDK was <a href="http://blog.chromium.org/2012/10/dart-m1-release.html">released today</a>. Since the initial announcement, most web developers have been strongly against Dart. The typical argument is that JavaScript meets our needs, and even where it does not, there are plenty of other languages that compile to JavaScript. Why do we need a new one? Because Google can take control of it?</p>
<p>While these arguments are valid, I look at Dart from the angle of a command-line tool developer. JavaScript, or a language translated to JavaScript such as CoffeeScript, cannot provide basic file I/O and system utilities, which makes it unsuitable for developing command-line tools. A few years ago when I investigated Node.js, it did not provide proper file I/O either (it seems much better now, but I have not tried it). Another problem with JavaScript is that it was not designed for JIT compilation from the beginning. Naively, a language designed with JIT in mind is likely to perform better.</p>
<p>From a quick look, Dart apparently overcomes the major weaknesses of JavaScript mentioned above. It has <a href="http://www.dartlang.org/docs/dart-up-and-running/ch02.html">clean C++-like syntax</a> with modern language features, inherits the flexibility of JavaScript, supports at least basic I/O and system utilities (perhaps a replacement for Node.js?), and is designed for JIT from the beginning. I have not evaluated its performance, but I would expect it to compete with or outperform V8 in the long run, though the release notes seem to suggest that V8 is faster right now. I will evaluate its performance when I have time.</p>
<p>I have to admit that I am a little anti-Google in general (not much), but I applaud Google&#8217;s decision to develop the Dart programming language while massively axing other projects. From a quick tour, it seems to be the closest to <a href="http://attractivechaos.wordpress.com/2011/10/13/the-ideal-programming-language-in-my-mind-a-rant/">the ideal programming language in my mind</a> (EDIT: I should add that this is the ideal scripting language; no matter how much I like Dart, I will always use C for performance-critical tasks).</p>
]]></content:encoded>
			<wfw:commentRss>http://attractivechaos.wordpress.com/2012/10/17/the-dart-programming-language/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/047ebc7bb9ff37a0da844413856e92cb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">attractivechaos</media:title>
		</media:content>
	</item>
		<item>
		<title>An update on radix sort</title>
		<link>http://attractivechaos.wordpress.com/2012/06/10/an-update-on-radix-sort/</link>
		<comments>http://attractivechaos.wordpress.com/2012/06/10/an-update-on-radix-sort/#comments</comments>
		<pubDate>Sun, 10 Jun 2012 03:43:41 +0000</pubDate>
		<dc:creator>attractivechaos</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://attractivechaos.wordpress.com/?p=1169</guid>
		<description><![CDATA[As sorting is the bottleneck of my application, I decided to optimize the code further with reference to this wonderful article. The optimized version is about 40% faster than my original one. It is now about 2.5 times as fast as STL&#8217;s std::sort on random 32-bit integer arrays. The optimized version is slightly slower [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=attractivechaos.wordpress.com&#038;blog=4545823&#038;post=1169&#038;subd=attractivechaos&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>As sorting is the bottleneck of my application, I decided to optimize the code further with reference to <a href="http://www.drdobbs.com/architecture-and-design/221600153">this wonderful article</a>. The optimized version is about 40% faster than my original one. It is now about 2.5 times as fast as STL&#8217;s std::sort on random 32-bit integer arrays. The optimized version is slightly slower than the implementation in that article, but it is much faster on sorted arrays.</p>
<p>For sorting large integer arrays, radix sort is the king. It is much faster than other standard algorithms and is arguably simpler.</p>
<pre class="brush: cpp; title: ; notranslate">
#include &lt;stdint.h&gt; // for uint64_t

#define rstype_t uint64_t // type of the array
#define rskey(x) (x) // specify how to get the integer from rstype_t

#define RS_MIN_SIZE 64 // for an array smaller than this, use insertion sort

typedef struct {
  rstype_t *b, *e; // begin and end of each bucket
} rsbucket_t;

void rs_insertsort(rstype_t *beg, rstype_t *end) // insertion sort
{
  rstype_t *i;
  for (i = beg + 1; i &lt; end; ++i)
    if (rskey(*i) &lt; rskey(*(i - 1))) {
      rstype_t *j, tmp = *i;
      for (j = i; j &gt; beg &amp;&amp; rskey(tmp) &lt; rskey(*(j-1)); --j)
        *j = *(j - 1);
      *j = tmp;
    }
}
// sort between [$beg, $end); take radix from &quot;&gt;&gt;$s&amp;((1&lt;&lt;$n_bits)-1)&quot;
void rs_sort(rstype_t *beg, rstype_t *end, int n_bits, int s)
{
  rstype_t *i;
  int size = 1&lt;&lt;n_bits, m = size - 1;
  rsbucket_t *k, b[size], *be = b + size; // b[] keeps all the buckets

  for (k = b; k != be; ++k) k-&gt;b = k-&gt;e = beg;
  for (i = beg; i != end; ++i) ++b[rskey(*i)&gt;&gt;s&amp;m].e; // count radix
  for (k = b + 1; k != be; ++k) // set start and end of each bucket
    k-&gt;e += (k-1)-&gt;e - beg, k-&gt;b = (k-1)-&gt;e;
  for (k = b; k != be;) { // in-place classification based on radix
    if (k-&gt;b != k-&gt;e) { // the bucket is not full
      rsbucket_t *l;
      if ((l = b + (rskey(*k-&gt;b)&gt;&gt;s&amp;m)) != k) { // different bucket
        rstype_t tmp = *k-&gt;b, swap;
        do { // swap until we find an element in bucket $k
          swap = tmp; tmp = *l-&gt;b; *l-&gt;b++ = swap;
          l = b + (rskey(tmp)&gt;&gt;s&amp;m);
        } while (l != k);
        *k-&gt;b++ = tmp; // push the found element to $k
      } else ++k-&gt;b; // move to the next element in the bucket
    } else ++k; // move to the next bucket
  }
  for (b-&gt;b = beg, k = b + 1; k != be; ++k) k-&gt;b = (k-1)-&gt;e; // reset k-&gt;b
  if (s) { // if $s is non-zero, we need to sort buckets
    s = s &gt; n_bits? s - n_bits : 0;
    for (k = b; k != be; ++k)
      if (k-&gt;e - k-&gt;b &gt; RS_MIN_SIZE) rs_sort(k-&gt;b, k-&gt;e, n_bits, s);
      else if (k-&gt;e - k-&gt;b &gt; 1) rs_insertsort(k-&gt;b, k-&gt;e);
  }
}

void radix_sort(rstype_t *beg, rstype_t *end)
{
  if (end - beg &lt;= RS_MIN_SIZE) rs_insertsort(beg, end);
  else rs_sort(beg, end, 8, sizeof(rskey(*beg)) * 8 - 8);
}
</pre>
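<p>For comparison with the in-place, most-significant-byte-first code above, here is a minimal sketch of the simpler least-significant-byte-first variant. It is not the implementation benchmarked in this post, and the name <code>lsd_radix_sort</code> is made up for the sketch. Unlike the code above, it needs a temporary array as large as the input and always makes eight passes over 64-bit keys.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sort a[0..n) in ascending order, one byte per pass, lowest byte first. */
void lsd_radix_sort(uint64_t *a, size_t n)
{
    uint64_t *src = a, *dst, *tmp;
    if (n < 2) return;
    tmp = malloc(n * sizeof(uint64_t)); /* extra working space of the same size */
    assert(tmp != NULL);
    dst = tmp;
    for (int s = 0; s < 64; s += 8) {
        size_t cnt[256] = {0}, pos = 0;
        /* count how many keys fall into each of the 256 buckets */
        for (size_t i = 0; i < n; ++i) ++cnt[src[i] >> s & 0xff];
        /* turn the counts into the start offset of each bucket */
        for (int r = 0; r < 256; ++r) {
            size_t c = cnt[r];
            cnt[r] = pos;
            pos += c;
        }
        /* stable scatter: keys keep their relative order within a bucket */
        for (size_t i = 0; i < n; ++i) dst[cnt[src[i] >> s & 0xff]++] = src[i];
        uint64_t *t = src; src = dst; dst = t;
    }
    /* eight passes means an even number of swaps, so the result is back in a */
    free(tmp);
}
```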
<p>EDIT: Just found <a href="https://github.com/gorset/radix">this implementation</a>. It is as fast as mine and is simpler. Recommended.</p>
]]></content:encoded>
			<wfw:commentRss>http://attractivechaos.wordpress.com/2012/06/10/an-update-on-radix-sort/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/047ebc7bb9ff37a0da844413856e92cb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">attractivechaos</media:title>
		</media:content>
	</item>
		<item>
		<title>A quick note on radix sort</title>
		<link>http://attractivechaos.wordpress.com/2012/06/07/a-quick-note-on-radix-sort/</link>
		<comments>http://attractivechaos.wordpress.com/2012/06/07/a-quick-note-on-radix-sort/#comments</comments>
		<pubDate>Thu, 07 Jun 2012 05:42:04 +0000</pubDate>
		<dc:creator>attractivechaos</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://attractivechaos.wordpress.com/?p=1153</guid>
		<description><![CDATA[I have recently been working on an algorithm that, surprisingly, spends more than half of its time sorting huge, partially ordered arrays of 64-bit integer pairs (one integer for the key and the other for the value). Naturally, I want to optimize sorting such arrays. Initially, I tried my implementation of introsort. The program takes about 90 seconds [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=attractivechaos.wordpress.com&#038;blog=4545823&#038;post=1153&#038;subd=attractivechaos&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>I have recently been working on an algorithm that, surprisingly, spends more than half of its time sorting huge, partially ordered arrays of 64-bit integer pairs (one integer for the key and the other for the value). Naturally, I want to optimize sorting such arrays. Initially, I tried <a href="https://github.com/attractivechaos/klib/blob/master/ksort.h">my implementation</a> of <a href="http://en.wikipedia.org/wiki/Introsort">introsort</a>. The program takes about 90 seconds on a sample data set. I then switched to the iterative mergesort in the same library, which takes 55 seconds. I guess mergesort is faster because the arrays are partially ordered. However, my implementation of mergesort requires a temporary array of the same size. As the arrays are huge, allocating this array is unacceptable for real data, and implementing an in-place mergesort seems quite challenging. Then I thought of radix sort, which I had not implemented before.</p>
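<p>To make the memory cost concrete, here is a minimal sketch of an iterative (bottom-up) mergesort on key/value pairs. It is not the klib code. The type <code>pair_t</code> and the function <code>merge_sort_pairs</code> are hypothetical names, and the single <code>malloc</code> call is the temporary array of the same size mentioned above.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct { uint64_t key, val; } pair_t; /* hypothetical pair type */

/* Stable bottom-up mergesort of a[0..n) by key. */
void merge_sort_pairs(pair_t *a, size_t n)
{
    pair_t *src = a, *dst, *tmp;
    if (n < 2) return;
    tmp = malloc(n * sizeof(pair_t)); /* temporary array as large as the input */
    assert(tmp != NULL);
    dst = tmp;
    for (size_t w = 1; w < n; w *= 2) { /* w is the length of the sorted runs */
        for (size_t lo = 0; lo < n; lo += 2 * w) {
            size_t mid = lo + w < n ? lo + w : n;
            size_t hi = lo + 2 * w < n ? lo + 2 * w : n;
            size_t i = lo, j = mid, k = lo;
            /* merge src[lo..mid) and src[mid..hi); ties take the left run */
            while (i < mid && j < hi)
                dst[k++] = src[j].key < src[i].key ? src[j++] : src[i++];
            while (i < mid) dst[k++] = src[i++];
            while (j < hi) dst[k++] = src[j++];
        }
        pair_t *t = src; src = dst; dst = t; /* the merged runs become the input */
    }
    if (src != a) memcpy(a, src, n * sizeof(pair_t));
    free(tmp);
}
```

<p>For an array of <code>n</code> pairs this allocates another <code>16 * n</code> bytes, which is the cost that rules it out for the huge arrays described above.</p>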
<p>My radix sort implementation is <a href="https://gist.github.com/2886685">here</a>. It is not written as a library, but it should be easy to adapt to other data types. The C program is quite simple and not much different from <a href="http://www.drdobbs.com/architecture-and-design/221600153?pgno=1">existing ones</a>.</p>
<p>How about the performance? With radix sort, my program takes 35 seconds and uses little extra working space. Replacing introsort with integer-only radix sort makes the program more than twice as fast (90 seconds down to 35). To evaluate the performance of radix sort separately, I put the code in my old <a href="https://github.com/attractivechaos/klib/blob/master/test/ksort_test.cc">ksort_test.cc</a>. Here are the CPU seconds spent on sorting 50 million random or sorted integers:</p>
<table border="1" cellpadding="4" cellspacing="0">
<tr>
<th>Algorithm
<th>Sorted?
<th>Mac CPU (sec)
<th>Linux CPU (sec)</tr>
<tr>
<th>STL introsort
<td>No
<td>4.9
<td>5.1</tr>
<tr>
<th>STL introsort
<td>Yes
<td>0.9
<td>1.1</tr>
<tr>
<th>STL stablesort
<td>No
<td>6.7
<td>6.1</tr>
<tr>
<th>STL stablesort
<td>Yes
<td>2.0
<td>2.0</tr>
<tr>
<th>STL heapsort
<td>No
<td>54.1
<td>32.2</tr>
<tr>
<th>STL heapsort
<td>Yes
<td>4.5
<td>4.2</tr>
<tr>
<th>libc qsort
<td>No
<td>11.3
<td>9.7</tr>
<tr>
<th>ac&#8217;s radix
<td>No
<td>1.9
<td>2.0</tr>
<tr>
<th>ac&#8217;s radix
<td>Yes
<td>0.8
<td>0.9</tr>
<tr>
<th>ac&#8217;s combsort
<td>No
<td>11.9
<td>11.5</tr>
<tr>
<th>ac&#8217;s introsort
<td>No
<td>5.5
<td>5.7</tr>
<tr>
<th>ac&#8217;s introsort
<td>Yes
<td>7.4
<td>6.8</tr>
<tr>
<th>ac&#8217;s mergesort
<td>No
<td>6.1
<td>6.6</tr>
<tr>
<th>ac&#8217;s mergesort
<td>Yes
<td>2.1
<td>2.3</tr>
<tr>
<th>ac&#8217;s heapsort
<td>No
<td>54.4
<td>31.8</tr>
<tr>
<th>ac&#8217;s heapsort
<td>Yes
<td>4.9
<td>6.6</tr>
<tr>
<th>ph&#8217;s heapsort
<td>No
<td>54.6
<td>30.8</tr>
<tr>
<th>ph&#8217;s quicksort
<td>No
<td>5.6
<td>5.8</tr>
<tr>
<th>ph&#8217;s mergesort
<td>No
<td>7.0
<td>6.9</tr>
</table>
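<p>CPU seconds of the kind reported in the table can be measured with the standard <code>clock()</code> function. The following is only a sketch of such a measurement for libc <code>qsort</code>, not the code in ksort_test.cc. The function names, the xorshift generator and the seed are assumptions made for illustration.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Three-way comparison of two 32-bit unsigned integers for qsort(). */
static int cmp_u32(const void *a, const void *b)
{
    uint32_t x = *(const uint32_t *)a, y = *(const uint32_t *)b;
    return (x > y) - (x < y);
}

/* Fill a[0..n) with pseudo-random values from a small xorshift generator. */
void fill_random(uint32_t *a, size_t n, uint32_t seed)
{
    uint32_t x = seed ? seed : 1; /* xorshift must not start from zero */
    for (size_t i = 0; i < n; ++i) {
        x ^= x << 13; x ^= x >> 17; x ^= x << 5;
        a[i] = x;
    }
}

int is_sorted(const uint32_t *a, size_t n)
{
    for (size_t i = 1; i < n; ++i)
        if (a[i - 1] > a[i]) return 0;
    return 1;
}

/* CPU seconds spent in one qsort() call on n integers (n > 0).
 * If presorted is non-zero, the input is sorted before the clock starts.
 * Returns -1.0 if the output is not sorted. */
double time_qsort(size_t n, int presorted)
{
    uint32_t *a = malloc(n * sizeof(uint32_t));
    clock_t t0, t1;
    int ok;
    assert(a != NULL);
    fill_random(a, n, 11);
    if (presorted) qsort(a, n, sizeof(uint32_t), cmp_u32); /* not timed */
    t0 = clock();
    qsort(a, n, sizeof(uint32_t), cmp_u32);
    t1 = clock();
    ok = is_sorted(a, n);
    free(a);
    return ok ? (double)(t1 - t0) / CLOCKS_PER_SEC : -1.0;
}
```

<p>Calling <code>time_qsort(50000000, 0)</code> and <code>time_qsort(50000000, 1)</code> would correspond to the "No" and "Yes" rows for one algorithm; the absolute numbers depend on the machine and compiler.</p>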
<p>Please see <a href="http://attractivechaos.wordpress.com/2008/08/28/comparison-of-internal-sorting-algorithms/">my old post</a> for the information on other algorithms and implementations. You can also clone my <a href="https://github.com/attractivechaos/klib">klib repository</a> and play with <a href="https://github.com/attractivechaos/klib/blob/master/test/ksort_test.cc">ksort_test.cc</a> by yourself.</p>
]]></content:encoded>
			<wfw:commentRss>http://attractivechaos.wordpress.com/2012/06/07/a-quick-note-on-radix-sort/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/047ebc7bb9ff37a0da844413856e92cb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">attractivechaos</media:title>
		</media:content>
	</item>
		<item>
		<title>Rants on font size in Linux UI</title>
		<link>http://attractivechaos.wordpress.com/2012/06/05/rants-on-font-size-in-linux-ui/</link>
		<comments>http://attractivechaos.wordpress.com/2012/06/05/rants-on-font-size-in-linux-ui/#comments</comments>
		<pubDate>Tue, 05 Jun 2012 17:13:39 +0000</pubDate>
		<dc:creator>attractivechaos</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://attractivechaos.wordpress.com/?p=1151</guid>
		<description><![CDATA[Recently I saw Linus&#8217; post on GNOME 3.4, where he complained that it is hard to make fonts smaller. This reminded me of my recent experience installing Mint/Ubuntu in a virtual machine. I had exactly the same problem. In Unity, I wanted the font to be smaller, but I found I had to [&#8230;]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=attractivechaos.wordpress.com&#038;blog=4545823&#038;post=1151&#038;subd=attractivechaos&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Recently I saw <a href="https://plus.google.com/102150693225130002912/posts/UkoAaLDpF4i">Linus&#8217; post</a> on GNOME 3.4, where he complained that it is hard to make fonts smaller. This reminded me of my recent experience installing Mint/Ubuntu in a virtual machine. I had exactly the same problem. In Unity, I wanted the font to be smaller, but I found I had to install a third-party tweak tool that is not part of Ubuntu. I deleted the VM immediately.</p>
<p>Linus blamed GNOME for not being customizable enough. I blame Unity, too (in Xfce, it is easy to change font sizes), but I think a more fundamental problem is the default settings of GNOME 3/Unity/Xfce. On Mac, I have always been happy with the default system fonts and font sizes (I changed font settings in only a couple of applications). I have not heard of Windows users complaining about the system defaults, either.</p>
<p>No matter how fancy a user interface looks from afar, if the fonts are ugly or the wrong size, it is a complete failure. Why haven&#8217;t UI designers realized this simple fact even today? It really amazes me that UI designers can live with such big, ugly fonts, and love them so much that they do not even allow end users to change them!</p>
<p>Sorry for the rant.</p>
]]></content:encoded>
			<wfw:commentRss>http://attractivechaos.wordpress.com/2012/06/05/rants-on-font-size-in-linux-ui/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/047ebc7bb9ff37a0da844413856e92cb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">attractivechaos</media:title>
		</media:content>
	</item>
	</channel>
</rss>
