
ZOOM [PMID:18684737] is still unavailable even though the manuscript has gone online. For the time being, there is no way to confirm whether its benchmarks are unbiased. Fortunately, we can glean some information from what the authors have presented. The ZOOM paper gives the program's memory consumption: 2.9GB for 12 million short reads. As far as I know, Eland uses only 30~40% of that. Had Anthony Cox been willing to use more memory, he could have made Eland faster. In addition, ZOOM apparently reports only the unique best hit, according to this post. Eland counts the occurrences of all 0-, 1- and 2-mismatch hits anyway. In fact, if you read the Eland source code, you will see that counting occurrences is non-trivial, and that it is much more expensive to implement counting in ZOOM than in Eland. I bet ZOOM would be slower than Eland if it generated the same output as Eland. Of course, it is not always necessary to generate the same output, but detecting unique best hits alone is not good enough for some applications, such as detecting structural variations. There we need to know whether there are many suboptimal hits with scores close to the best one.
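
To make the difference between the two kinds of output concrete, here is a minimal sketch in Python. It is a brute-force toy, not how Eland or ZOOM is actually implemented; the function names `count_hits` and `unique_best_hit` are my own, and the scan over every reference position is only an assumption made to keep the example short.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def hit_distances(read, reference, max_mismatches=2):
    """Yield (position, mismatches) for every window within the mismatch limit."""
    n = len(read)
    for pos in range(len(reference) - n + 1):
        d = hamming(read, reference[pos:pos + n])
        if d <= max_mismatches:
            yield pos, d


def count_hits(read, reference, max_mismatches=2):
    """Eland-style summary: how many hits there are at 0, 1, 2, ... mismatches."""
    counts = Counter(d for _, d in hit_distances(read, reference, max_mismatches))
    return [counts[k] for k in range(max_mismatches + 1)]


def unique_best_hit(read, reference, max_mismatches=2):
    """Unique-best-hit summary: (position, mismatches), or None if the best hit is tied."""
    hits = list(hit_distances(read, reference, max_mismatches))
    if not hits:
        return None
    best = min(d for _, d in hits)
    best_hits = [(pos, d) for pos, d in hits if d == best]
    return best_hits[0] if len(best_hits) == 1 else None
```

For the toy read `ACGTAC` against the reference `ACGTACTTTTACGTAGTTTT`, `count_hits` returns `[1, 1, 0]` while `unique_best_hit` returns `(0, 0)`. Only the first output tells us that a 1-mismatch hit sits right behind the best one.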

Although I believe the benchmark is biased, the ZOOM paper itself presents an elegant idea: using non-contiguous seeds. I also believe ZOOM has been highly tuned for performance by a group of strong programmers. Furthermore, even if ZOOM is slower than Eland, I would still prefer ZOOM because it eliminates Eland's hard limits of 32bp on read length and two on the number of mismatches. I do not mind if ZOOM is a bit slower. I just want to see a fair comparison.
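
For readers unfamiliar with non-contiguous (spaced) seeds, the general idea can be sketched as follows. This is only an illustration of the concept, not ZOOM's actual seed design: the mask `110111` is a made-up example, and `seed_key`, `build_index` and `seed_hits` are hypothetical names.

```python
from collections import defaultdict

# Hypothetical seed mask: '1' = position must match, '0' = position is ignored.
MASK = "110111"


def seed_key(window, mask=MASK):
    """Keep only the bases at the '1' positions of the mask."""
    return "".join(base for base, m in zip(window, mask) if m == "1")


def build_index(reference, mask=MASK):
    """Map each masked window of the reference to the positions where it occurs."""
    index = defaultdict(list)
    w = len(mask)
    for pos in range(len(reference) - w + 1):
        index[seed_key(reference[pos:pos + w], mask)].append(pos)
    return index


def seed_hits(read, index, mask=MASK):
    """Candidate reference positions whose masked window equals the read's."""
    return index.get(seed_key(read[:len(mask)], mask), [])
```

With the reference `TTACGTACTT`, the read `ACTTAC` differs from the window at position 2 (`ACGTAC`) only at the ignored position, so `seed_hits` still finds position 2. A contiguous mask such as `111111` would miss it.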

Update: ZOOM has been released and I have tried it a bit. The claim in the paper is honest: ZOOM is indeed faster than Eland while allowing longer reads. Nonetheless, ZOOM only reports unique best hits, so my claim here is fair, too: Eland gives more information, which is one of the reasons it is slower. Although ZOOM can output the top N hits, apparently this can only be achieved by dumping all the potential hits, which is impractical for human alignment.

Note that I pay so much attention to ZOOM mainly because it is a great piece of software. In terms of efficiency and usability, ZOOM is much better than many similar programs that I have not mentioned here.
