Data Compression


Criteria

Survey Formats

Basics

Basic Terms

Variable Length Codes

Interpretation

Prefix Codes

Distribution of Code Lengths

Code Trees

Compression Methods

Data Formats


Glossary

Index


Download


www.BinaryEssence.com

Interpretation


The main requirement is unambiguous interpretation. To reconstruct the original symbols the coding scheme must be completely reversible. For the following examples rules can be established allowing unique decoding of an arbitrary data stream:


   Example A     Example B
   a  -  0       a  -  1
   b  -  10      b  -  10
   c  -  11      c  -  100

In the first example (A) an initial 0 always identifies the symbol 'a'. If a new code starts with a 1, the following value distinguishs between the symbols 'b' and 'c'. Afterwards a new code will follow at any time.


In the second example (B) a new code is always initiated by the value 1. The number of 0s following identify the particular symbol (no 0 -> 'a', one 0 -> 'b', two 0 -> 'c').


In contrast to this stands the following example. An unambiguous interpretation of this code is impossible because the structure of the code does not allow conclusions about the particular code length.


   a  -  0          ("0000" -- "aaaa")
   b  -  00         ("0000" -- "bb")
   c  -  01

The sequence "0000" could be interpreted as "aaaa" or "bb" as well. Such data cannot be decoded without explicit information about the code length. This circumstances lead to the prefix property.


 <   ^   > 

Variable Length Codes Variable Length Codes Prefix Codes (Prefix Property)