Abstract:
In his 1951 work on the information content of English text, Shannon described
a method of recoding the input text, a technique which has apparently lain dormant for the
ensuing 45 years. Whereas traditional compressors exploit symbol frequencies and symbol
contexts, Shannon’s method adds the concept of “symbol ranking”, as in ‘the next symbol is
the one 3rd most likely in the present context’. This report describes an implementation of his
method and shows that it forms the basis of a good text compressor. The recent “acb”
compressor of Buynovsky is shown to belong to the general class of symbol ranking
compressors.
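
The idea of symbol ranking can be illustrated with a minimal sketch. It is not Shannon's procedure nor the implementation described in this report: it assumes a simple adaptive frequency model over a fixed-length context, an alphabet known to both encoder and decoder, and zero-based ranks (so "3rd most likely" appears as rank 2). The names `rank_encode`, `rank_decode` and the `order` parameter are hypothetical.

```python
from collections import Counter, defaultdict


def _ordered_candidates(counts, ctx, alphabet):
    # Most frequent symbol in this context first; ties broken alphabetically
    # so that encoder and decoder always agree on the ordering.
    return sorted(alphabet, key=lambda s: (-counts[ctx][s], s))


def rank_encode(text, order=2):
    """Recode text as a list of ranks under an order-`order` context model."""
    counts = defaultdict(Counter)  # context -> symbol frequencies seen so far
    alphabet = sorted(set(text))   # assumption: shared with the decoder
    ranks = []
    for i, sym in enumerate(text):
        ctx = text[max(0, i - order):i]
        ranks.append(_ordered_candidates(counts, ctx, alphabet).index(sym))
        counts[ctx][sym] += 1      # update the model after coding the symbol
    return ranks, alphabet


def rank_decode(ranks, alphabet, order=2):
    """Invert rank_encode by rebuilding the same model symbol by symbol."""
    counts = defaultdict(Counter)
    out = []
    for r in ranks:
        ctx = "".join(out[max(0, len(out) - order):])
        sym = _ordered_candidates(counts, ctx, alphabet)[r]
        out.append(sym)
        counts[ctx][sym] += 1
    return "".join(out)


if __name__ == "__main__":
    sample = "abracadabra abracadabra"
    ranks, alphabet = rank_encode(sample)
    print(ranks)
    print(rank_decode(ranks, alphabet) == sample)
```

On predictable text the rank stream is dominated by small values, and it is this skewed distribution that a later coding stage can exploit.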