dc.contributor.author |
Fenwick, P. |
en |
dc.date.accessioned |
2009-04-08T04:03:11Z |
en |
dc.date.available |
2009-04-08T04:03:11Z |
en |
dc.date.issued |
1997-04 |
en |
dc.identifier.citation |
Computer Science Technical Reports 145 (1997) |
en |
dc.identifier.issn |
1173-3500 |
en |
dc.identifier.uri |
http://hdl.handle.net/2292/3488 |
en |
dc.description.abstract |
Recent work on “symbol ranking” text compressors included a version which combines moderate
compression with high speed and has been described in a poster at the 1997 Data Compression
Conference. That work is extended here, especially to produce a compressor capable of efficient
hardware implementation.
The compressor is based on a conventional set-associative cache, with LRU update. To process a
symbol, a context of the three preceding symbols is hashed to access a particular cache line. If the
access “hits”, the LRU index is emitted as the recoded symbol, or otherwise the symbol is emitted
and the LRU status updated. Several versions are described, with some improving the performance
by maintaining a Move-To-Front list of symbols and emitting some symbols as “short” 5-bit indices
into this table, or by encoding runs of correctly-predicted symbols.
The final performance is in the range of 3.5 – 4 bits per byte over the Calgary Corpus, with software
speeds of up to 1 Mbyte/s. The best of the hardware designs should run at 100Mbyte/s with discrete
components, or 200–300 Mbyte/s with a completely integrated design. |
en |
dc.publisher |
Department of Computer Science, The University of Auckland, New Zealand |
en |
dc.relation.ispartofseries |
Computer Science Technical Reports |
en |
dc.rights.uri |
https://researchspace.auckland.ac.nz/docs/uoa-docs/rights.htm |
en |
dc.source.uri |
http://www.cs.auckland.ac.nz/staff-cgi-bin/mjd/csTRcgi.pl?serial |
en |
dc.title |
A Fast, Constant-Order, Symbol Ranking Text Compressor |
en |
dc.type |
Technical Report |
en |
dc.subject.marsden |
Fields of Research::280000 Information, Computing and Communication Sciences |
en |
dc.rights.holder |
The author(s) |
en |