A Fast, Constant-Order, Symbol Ranking Text Compressor

Show simple item record

dc.contributor.author Fenwick, P. en
dc.date.accessioned 2009-04-08T04:03:11Z en
dc.date.available 2009-04-08T04:03:11Z en
dc.date.issued 1997-04 en
dc.identifier.citation Computer Science Technical Reports 145 (1997) en
dc.identifier.issn 1173-3500 en
dc.identifier.uri http://hdl.handle.net/2292/3488 en
dc.description.abstract Recent work on “symbol ranking” text compressors included a version which combines moderate compression with high speed and has been described in a poster at the 1997 Data Compression Conference. That work is extended here, especially to produce a compressor capable of efficient hardware implementation. The compressor is based on a conventional set-associative cache, with LRU update. To process a symbol, a context of the three preceding symbols is hashed to access a particular cache line. If the access “hits”, the LRU index is emitted as the recoded symbol, or otherwise the symbol is emitted and the LRU status updated. Several versions are described, with some improving the performance by maintaining a Move-To-Front list of symbols and emitting some symbols as “short” 5-bit indices into this table, or by encoding runs of correctly-predicted symbols. The final performance is in the range of 3.5 – 4 bits per byte over the Calgary Corpus, with software speeds of up to 1 Mbyte/s. The best of the hardware designs should run at 100Mbyte/s with discrete components, or 200–300 Mbyte/s with a completely integrated design. en
dc.publisher Department of Computer Science, The University of Auckland, New Zealand en
dc.relation.ispartofseries Computer Science Technical Reports en
dc.rights.uri https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm en
dc.source.uri http://www.cs.auckland.ac.nz/staff-cgi-bin/mjd/csTRcgi.pl?serial en
dc.title A Fast, Constant-Order, Symbol Ranking Text Compressor en
dc.type Technical Report en
dc.subject.marsden Fields of Research::280000 Information, Computing and Communication Sciences en
dc.rights.holder The author(s) en


Files in this item

Find Full text

This item appears in the following Collection(s)

Show simple item record

Share

Search ResearchSpace


Browse

Statistics