BurrowsWheelerTransformation (BWT)
The BWT is named after Michael Burrows and David J. Wheeler, who had published this procedure in their article "A Blocksorting Lossless Data Compression Algorithm" in 1994. The fundamental algorithm had been developed by David J. Wheeler in 1983.
Principle of the algorithm
A block of original data will be converted into a certain form and will be sorted afterwards. The result is a sequence of ordered data that reflect frequently appearing symbol combinations by repetitions.
Example: transformed text data
PeterｷPiperｷpickedｷaｷpeckｷofｷpickledｷpeppers.
ｷAｷpeckｷofｷpickledｷpeppersｷPeterｷPiperｷpicked.
v
覧覧覧覧覧覧覧覧覧覧覧覧覧覧覧
BurrowsWheelerTransformation
覧覧覧覧覧覧覧覧覧覧覧覧覧覧覧
v
.dkkAaddsrrffrrsdｷｷeeiiiieeeeppkllkppppttppPP
ooppppPPcccccckkｷｷｷｷｷｷiipp.ｷｷｷｷｷｷｷeeeeeeeerree
Both data blocks consist of identical symbols, only varying in their order. The transformed data may be encoded substantially better, e.g. by adaptive Huffman or arithmetic coding.
The total amount of symbols slightly increases, because additional information are required allowing the reconstruction of the original data.
In real application the number of symbols in a block is however very large and the additional expenditure very small. A typical block size amounts to 900 KByte.
< ^ >

Huffman Code []
Adaptive Huffman Code []
Arithmetic Coding []

