
Symbol
In the context of data compression and information theory, a symbol is the smallest unit into which data is structured. Any set of data is a sequence of a corresponding number of symbols. The type and structure of a symbol depend on the implementation.
Most procedures use the byte as the basic unit. This follows from conventional computer architecture, in which the byte is the smallest addressable unit. For larger units, common systems use multiples of a byte (e.g. 2-byte or 4-byte integer values). Applications designed for a specific type of content use more complex data structures, but internally they are byte-oriented as well.
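The choice of symbol width can be illustrated with a minimal Python sketch. The function name split_symbols, the width parameter, and the big-endian byte order are assumptions made for this illustration only; they are not taken from any particular compression procedure.

```python
def split_symbols(data: bytes, width: int = 1) -> list[int]:
    """Split data into fixed-length symbols of `width` bytes each.

    Each symbol is returned as an unsigned integer. Big-endian byte
    order is an assumption chosen for this illustration.
    """
    if width < 1:
        raise ValueError("width must be at least 1 byte")
    if len(data) % width != 0:
        raise ValueError("data length must be a multiple of the symbol width")
    return [
        int.from_bytes(data[i:i + width], "big")
        for i in range(0, len(data), width)
    ]


sample = b"\x01\x02\x03\x04"
print(split_symbols(sample, 1))  # four 1-byte symbols
print(split_symbols(sample, 2))  # two 2-byte symbols
print(split_symbols(sample, 4))  # one 4-byte symbol
```

The same four bytes yield four, two, or one symbol depending on the width, which shows that the number of symbols in a data set depends on how the implementation defines a symbol.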
Uncompressed data usually uses fixed-length symbols, e.g. one byte each. Most lossless compression procedures convert them into variable-length symbols. In this way the algorithms reflect the probability of each symbol: frequent symbols receive shorter codes, rare symbols longer ones.
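The relation between symbol probability and code length can be sketched as follows. The example treats each byte as one symbol and computes the information-theoretic ideal length of -log2(p) bits per symbol. The function name ideal_code_lengths is hypothetical, and the sketch only measures lengths; it does not construct actual codes as a real coder (e.g. Huffman or arithmetic coding) would.

```python
import math
from collections import Counter


def ideal_code_lengths(data: bytes) -> dict[int, float]:
    """Map each byte symbol to its ideal code length in bits, -log2(p).

    p is the relative frequency of the symbol in `data`.
    """
    counts = Counter(data)
    total = len(data)
    return {
        symbol: -math.log2(count / total)
        for symbol, count in counts.items()
    }


# 'a' occurs 4 times, 'b' twice, 'c' and 'd' once each (8 symbols total).
lengths = ideal_code_lengths(b"aaaabbcd")
for symbol, bits in sorted(lengths.items()):
    print(chr(symbol), bits)
# prints:
# a 1.0
# b 2.0
# c 3.0
# d 3.0
```

With a fixed length of one byte, these 8 symbols occupy 64 bits. With the ideal variable lengths they would need 4*1 + 2*2 + 1*3 + 1*3 = 14 bits, not counting any information required to describe the code itself.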