www.BinaryEssence.com

Characteristics of Entropy


The entropy H(X) is the sum of the terms

- p(x) log2 p(x)

over all symbols x of the alphabet X, where p(x) is the probability of the symbol x.

In the following example X is {a, b, c, d, r}.


Example: abracadabra


 Symbol  Freq.    p(x)    - p(x) log2 p(x)
  (x)
   a       5      0.45     0.52
   b       2      0.18     0.45
   r       2      0.18     0.45
   c       1      0.09     0.31
   d       1      0.09     0.31
         ----             ------
          11               2.04

In this example the entropy of the information source is 2.04 bits per symbol, i.e. the sequence "abracadabra" can be encoded with an average code length of 2.04 bits per symbol at best.
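The table can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `entropy_table` is chosen here for illustration, and the symbol probabilities are assumed to be the relative frequencies within the message itself.

```python
from collections import Counter
from math import log2


def entropy_table(message):
    """Return (rows, total); each row is (symbol, freq, p, -p * log2(p))."""
    counts = Counter(message)
    n = len(message)
    rows = []
    for symbol, freq in counts.most_common():
        p = freq / n  # relative frequency used as probability (assumption)
        rows.append((symbol, freq, p, -p * log2(p)))
    # The entropy is the sum of the per-symbol terms.
    total = sum(row[3] for row in rows)
    return rows, total


rows, h = entropy_table("abracadabra")
for symbol, freq, p, term in rows:
    print(f"{symbol}  {freq:2d}  {p:.2f}  {term:.2f}")
print(f"entropy: {h:.2f} bits per symbol")  # entropy: 2.04 bits per symbol
```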


Common coding procedures such as Huffman coding can only approximate this limit, because each symbol is assigned a whole number of bits. Arithmetic coding comes closer to it.
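How close Huffman coding gets can be checked for the example. The following sketch computes only the code lengths of a Huffman code for "abracadabra", not the code words, and compares the resulting average with the entropy. The helper name `huffman_code_lengths` and the heap-based construction are choices made for this illustration.

```python
import heapq
from collections import Counter
from math import log2


def huffman_code_lengths(message):
    """Return {symbol: code length in bits} for a Huffman code of message.

    Note: a message with only one distinct symbol yields length 0 here.
    """
    counts = Counter(message)
    # Heap entries: (weight, unique tie breaker, {symbol: depth in subtree}).
    heap = [(freq, i, {sym: 0}) for i, (sym, freq) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol in them moves one level down.
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


message = "abracadabra"
counts = Counter(message)
n = len(message)
lengths = huffman_code_lengths(message)
total_bits = sum(counts[s] * lengths[s] for s in counts)
average = total_bits / n
entropy = -sum(c / n * log2(c / n) for c in counts.values())
print(f"Huffman: {average:.2f} bits per symbol, entropy: {entropy:.2f}")
# Huffman: 2.09 bits per symbol, entropy: 2.04
```

For this message the Huffman code needs 23 bits in total, slightly more than the 11 x 2.04 = 22.4 bits suggested by the entropy.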



