Data Compression


Criteria

Survey Formats

Basics

Compression Methods

Data Formats


Huffman Code

Example

Characteristics

Variants

Dynamic Huffman Code

Adaptive Huffman Code

Initialization

Algorithm

Table Code Tree

Update Procedure

New Symbols

Formats

Decoding

Update Code Tree

Decoding

Example


Glossary

Index


Download


www.BinaryEssence.com

Formats for New Symbols


Immediately after the control character (NYA) follows the new symbol which is not encoded by the Huffman code. In principle a large variety of formats is available to encode the new symbol. The simplest example would be linear encoding of the entire range of values.


With regard to the compression, it is advisable to select a more efficient format adapting the code length to the number of remaining symbols. Through this the code length decreases continuously until the last two symbols can be encoded with a single bit.


For realization an ordered list must be defined by both the encoder and decoder. The size of this list will be reduced step-by-step during the coding process.


Subsequently a method which adjusts the required code length to the number of remained symbols is described. The range of values is exploited better in contrast to linear coding with constant code length. Two different lengths which are assigned according to the following scheme are used:


Z: number of remaining symbols
b: required code length [in bit]
r: "remainder", that cannot be
   encoded with code length b
n: code number (1 - Z)

conditions:  Z = 2b + r
             and
             0 <= r < 2b

code length range of values  coding
 b+1 bit     1    to 2r       n-1
   b bit     2r+1 to Z        n-r-1

Example with 5 symbols:


Z = 5
b = 2 bit
r = 1

conditions:
Z = 2b + r     5  = 22 + 1
0 <= r < 2b    0 <=  1 < 4

code length range of values  coding
  3 bit       1 -  2         n-1
  2 bit       3 -  5         n-1-1

symbol  code word
  1       000
  2       001
  3       01
  4       10
  5       11

Example with 66 symbols:


Z = 66
b = 6 bit
r = 2

conditions:
Z = 2b + r     66 = 26 + 2
0 <= r < 2b    0 <=  2 < 64

code length range of values  coding
  7 bit       1 -  4         n-1
  6 bit       5 - 66         n-2-1

symbol  code word
  1      0d  0000 000b
  2      1d  0000 001b
  3      2d  0000 010b
  4      3d  0000 011b
  5      2d  0000 10b
  6      3d  0000 11b
  7      4d  0001 00b
 ..     ...  ...
 64     61d  1111 01b
 65     62d  1111 10b
 66     63d  1111 11b

 <   ^   > 

Integration of New Symbols Integration of New Symbols Decoding of New Symbols