Eland is a computer program that aligns short oligonucleotides against a reference genome. It was written by Anthony Cox of Illumina. The source code is freely available to buyers of the company's machines. Eland was the first program for short read alignment, and it has profoundly influenced most of its successors.
Eland guarantees full sensitivity for hits with up to two mismatches. To achieve this, Eland divides a read into four parts of approximately equal length. Six noncontiguous seed templates can then be constructed, one for each pair of parts, with each template covering two parts. Eland applies each seed template to the reads and indexes the subsequences selected by the template. After indexing, it scans the reference genome base by base and looks up each k-mer in the index to find hits. Because two mismatches can fall in at most two of the four parts, at least two parts of any such hit match exactly, and the template covering those two parts finds it. The seeding strategy therefore guarantees that all hits with up to two mismatches are found.
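The seeding strategy described above can be illustrated with a minimal Python sketch. It is not Eland's implementation: the function names (`split_parts`, `seed_templates`, `find_hits`), the dictionary-based index, and the assumption that all reads share one length are choices made here for illustration only, and the sketch ignores reverse-complement strands and every performance concern.

```python
from collections import defaultdict
from itertools import combinations


def split_parts(length, n_parts=4):
    """Return (start, end) bounds of n_parts roughly equal segments."""
    base, extra = divmod(length, n_parts)
    bounds, start = [], 0
    for i in range(n_parts):
        end = start + base + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def seed_templates(length):
    """Six templates; each lists the read positions covered by two of the four parts."""
    parts = split_parts(length)
    return [
        [p for (s, e) in (parts[i], parts[j]) for p in range(s, e)]
        for i, j in combinations(range(4), 2)
    ]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_hits(reads, reference, max_mismatches=2):
    """Map read id -> sorted list of (position, mismatches) with at most max_mismatches."""
    length = len(reads[0])
    assert all(len(r) == length for r in reads), "sketch assumes equal-length reads"
    templates = seed_templates(length)

    # Index the reads once per template, keyed by the bases the template selects.
    indexes = []
    for template in templates:
        index = defaultdict(list)
        for read_id, read in enumerate(reads):
            index["".join(read[p] for p in template)].append(read_id)
        indexes.append(index)

    # Scan the reference base by base and look up each window in every index.
    hits = defaultdict(set)
    for pos in range(len(reference) - length + 1):
        window = reference[pos:pos + length]
        for template, index in zip(templates, indexes):
            key = "".join(window[p] for p in template)
            for read_id in index.get(key, ()):
                mismatches = hamming(reads[read_id], window)
                if mismatches <= max_mismatches:
                    hits[read_id].add((pos, mismatches))
    return {read_id: sorted(found) for read_id, found in hits.items()}
```

Each candidate found through a template is verified with a full Hamming-distance comparison, so the templates act only as a filter. For two or fewer mismatches the filter loses nothing, because at least one of the six templates covers two mismatch-free parts.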
Eland might be the first widely used program that achieves full sensitivity given a threshold. Most of Eland's successors adopted this idea, and some software, such as Maq and SeqMap, even uses the same seeding strategy. The SOLiD read mapping software and ZOOM further extend the idea of noncontiguous seed templates to achieve full sensitivity with fewer templates.
Eland is so fast that its speed effectively sets a goal for all its successors. Although several programs are claimed to match it, they do not retain the same useful information as Eland. For example, a program frequently reports only the unique best hit without counting the occurrences of a read. Counting greatly helps to reduce false alignments in some cases, but implementing it is non-trivial. Comparing Eland with a program that does not count occurrences is therefore unfair.
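To illustrate why occurrence counts carry information that a bare "unique best hit" does not, the following Python sketch summarizes a read's hits by mismatch level. The function name `summarize_hits` and the keys of the returned dictionary are hypothetical and do not reproduce Eland's actual output format. The input is assumed to be a list of (position, mismatches) pairs.

```python
from collections import Counter


def summarize_hits(hits, max_mismatches=2):
    """Summarize a list of (position, mismatches) pairs for one read.

    Returns the number of hits at each mismatch level together with the
    best hit position, which is reported only when it is unique.
    """
    counts = Counter(mismatches for _, mismatches in hits)
    per_level = [counts.get(k, 0) for k in range(max_mismatches + 1)]

    # The best level is the lowest mismatch count that has at least one hit.
    best_level = next((k for k, c in enumerate(per_level) if c), None)
    if best_level is None:
        return {"counts": per_level, "best": None, "unique": False}

    best_positions = [pos for pos, mm in hits if mm == best_level]
    unique = len(best_positions) == 1
    return {
        "counts": per_level,
        "best": best_positions[0] if unique else None,
        "unique": unique,
    }
```

Under this sketch, a read with one exact hit and no other hits and a read with one exact hit plus many one-mismatch hits both have a unique best hit, but only the counts reveal that the second placement is far less trustworthy.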
Eland is not perfect, though. Natively, it does not support reads longer than 32 bp, paired-end mapping, or gapped alignment. Eland provides several scripts to overcome these limitations, but with reduced power. In particular, even with the help of the additional scripts, Eland will miss a read alignment if the first 32 bp of the read is nonunique or if the read has more than 10 identical hits in the reference genome. These limitations leave room for other short read aligners.