
Dividing into Intervals
On the basis of a well-known alphabet the probability of all symbols has to be determined and converted into intervals. The size of the interval depends linearly on the symbol's probability. If this is 50% for example, then the associated sub-interval covers the half of the current interval. Usually the initial interval is [0: 1) for the encoding of the first symbol:
Example sub-division of the interval [0.0; 1.0)
Frequency
Symbol Probability Sub-Interval
a 4 0.4 [0.0; 0.4)
b 2 0.2 [0.4; 0.6)
c 2 0.2 [0.6; 0.8)
d 1 0.1 [0.8; 0.9)
e 1 0.1 [0.9; 1.0)
Sum 10 1.0 [0.0; 1.0)
Example sub-division of the interval [0.9; 1.0)
Frequency
Symbol Probability Sub-Interval
a 4 0.4 [0.90; 0.94)
b 2 0.2 [0.94; 0.96)
c 2 0.2 [0.96; 0.98)
d 1 0.1 [0.98; 0.99)
e 1 0.1 [0.99; 1.00)
Example sub-division of the interval [0.99; 1.00)
Frequency
Symbol Probability Sub-Interval
a 4 0.4 [0.990; 0.994)
b 2 0.2 [0.994; 0.996)
c 2 0.2 [0.996; 0.998)
d 1 0.1 [0.998; 0.999)
e 1 0.1 [0.999; 1.000)
Accordingly a rational number in the range of 0.999 to < 1.000 represents the string "eee".
< ^ >
|