Abstract:
This report presents some preliminary work on a recently described “Block Sorting”
lossless (or text) compression algorithm. While having little apparent relationship to
established techniques, its performance places it firmly among the best known
compressors. The original paper did little more than present the algorithm, together with
strong advice on efficient implementation. Here, the algorithm is restated in data compression
terms and various measurements are made on aspects of its operation.
Consideration of the possible efficiency of text compression leads to the revival of ideas
by Shannon as the basis of a text compressor and then to the classification of the Block
Sorting compressor as an example of this “new” type. Finally, this work leads to a
reconsideration of the meaning of escape codes in PPM-style compressors and a suggested
technique for better estimating escape probabilities.