What does Hamming distance mean? What is the Hamming distance of Lexogen’s i5 and i7 indices?

Hamming distance is the number of nucleotide exchanges (substitutions only) needed to convert one index sequence into another of the same length. Lexogen’s 6 nt i5 and i7 indices are designed with a minimum Hamming distance of 3, meaning that each 6 nt i7 index differs from every other 6 nt i7 index at 3 or more positions, and likewise for the i5 indices.
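
As an illustration of the definition above, here is a minimal Python sketch that counts substitutions between two index sequences and finds the minimum distance within a set. The function names and the four 6 nt sequences are made up for this example; they are not actual Lexogen indices.

```python
from itertools import combinations


def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(indices: list[str]) -> int:
    """Return the smallest Hamming distance between any two indices in a set."""
    return min(hamming_distance(a, b) for a, b in combinations(indices, 2))


# Hypothetical 6 nt sequences, for illustration only (not Lexogen indices).
example_indices = ["ACGTAC", "ACTGCC", "TGGTAA", "CAGACT"]

print(hamming_distance("ACGTAC", "ACTGCC"))    # 3
print(min_pairwise_distance(example_indices))  # 3
```

A set is said to have a minimum Hamming distance of 3 when no pair of its indices is closer than 3 substitutions, as is the case for this toy set.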

With a minimum Hamming distance of 3, up to 2 errors (mismatches) in an index read can be detected. Error correction, however, is limited to 1 error, because an index read with 2 errors may lie closer to a different index than to its true one. For the 6 nt indexing system, we recommend turning off error correction during demultiplexing and allowing zero mismatches. This results in a slightly higher rate of unidentified reads, but it avoids mis-assignment, so the reads assigned to each library are more accurate (see also FAQ 5.5).
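
The numbers above follow from a general rule for codes with minimum distance d: up to d - 1 errors can be detected and up to (d - 1) / 2 errors, rounded down, can be corrected. The Python sketch below applies this rule and shows a simplified index assignment with a configurable mismatch limit. It is an illustration only: `assign_index` and the example sequences are hypothetical and do not represent any particular demultiplexing tool.

```python
from typing import Optional


def detectable_errors(min_distance: int) -> int:
    """Errors that can always be detected for a given minimum distance."""
    return min_distance - 1


def correctable_errors(min_distance: int) -> int:
    """Errors that can always be corrected for a given minimum distance."""
    return (min_distance - 1) // 2


def assign_index(read_index: str, known_indices: list[str],
                 max_mismatches: int = 0) -> Optional[str]:
    """Return the single known index within max_mismatches substitutions.

    Returns None if no index, or more than one index, qualifies.
    """
    hits = [
        known for known in known_indices
        if len(known) == len(read_index)
        and sum(x != y for x, y in zip(known, read_index)) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


# Hypothetical 6 nt sequences, for illustration only (not Lexogen indices).
known = ["ACGTAC", "ACTGCC", "TGGTAA", "CAGACT"]

print(detectable_errors(3), correctable_errors(3))    # 2 1
print(assign_index("ACGTAA", known, max_mismatches=0))  # None
print(assign_index("ACGTAA", known, max_mismatches=1))  # ACGTAC
```

In this toy set, the read ACGTAA is 1 substitution away from ACGTAC but only 2 away from TGGTAA. If it actually originated from TGGTAA with 2 sequencing errors, allowing 1 mismatch would assign it to the wrong library, whereas allowing zero mismatches leaves it unassigned.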

The design of the advanced Lexogen UDI 12 nt Unique Dual Indexing system is not based solely on Hamming distance, which takes only nucleotide substitutions into account. Instead, it is built on a global distance measure that accounts for substitutions, insertions, and deletions. This improves the error correction capability for the 12 nt UDIs when Lexogen’s idemuxCPP Tool is used for demultiplexing and error correction. For a set of 96 samples with 12 nt UDI read-out, the minimum distance is 5, which allows confident correction of up to 2 index sequence errors. This is unique to Lexogen’s patented 12 nt UDI indexing system.
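
To show why a measure that includes insertions and deletions behaves differently from Hamming distance, the Python sketch below computes the standard Levenshtein (edit) distance. This is an assumption made for illustration: the text above does not name the exact distance measure used for the 12 nt UDIs, and the code is not taken from idemuxCPP. The sequences are made up.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of substitutions,
    insertions, and deletions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # deletion from a
                current[j - 1] + 1,                    # insertion into a
                previous[j - 1] + (char_a != char_b),  # substitution or match
            ))
        previous = current
    return previous[-1]


# Hypothetical 12 nt sequences: the second is the first shifted by one base.
original = "ACGTACGTACGT"
shifted = "CGTACGTACGTA"

mismatches = sum(x != y for x, y in zip(original, shifted))
print(mismatches)                        # 12
print(edit_distance(original, shifted))  # 2
```

A single-base shift makes every position mismatch under Hamming distance, yet only two edits (one deletion and one insertion) separate the sequences. An index set designed with an insertion- and deletion-aware measure keeps indices apart under this kind of error as well.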

For further information on Hamming distance and the index sequence design of Lexogen Indexing Solutions, check out the RNA LEXICON Chapter #9 – Indexing Strategies and Solutions.
