Run Length Encoding compressor program, 16 bit header version
Written by Shaun Case 1991 in Borland C++ 2.0
with sizeof (int) == 2
This program and its source code are Public Domain.
This program should be portable to any machine with
2 byte short ints and 8 bit bytes.
What is run length encoding?
Run Length Encoding, also known as RLE, is a method of compressing data
that has a lot of "runs" of bytes (or bits) in it. A "run" is a series
of bytes that are all the same. For instance, the string "THIS IS A
VEEEEEEEEEEEEEEEEEEEEEEEERY INTERESTING SENTENCE" has a run of 23 'E's
in it. This could be compressed in the following manner:
THIS IS A V23ERY INTERESTING SENTENCE
resulting in a savings of 20 characters. A further savings of one
character can be realized if the sequence "23" is replaced by a single
byte with the value 23.
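As a rough illustration of what a "run" is, the following C sketch measures
the run that starts at a given position in a buffer. The name run_length_at
is made up for this example and is not taken from the program itself.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the original program: returns how many
   consecutive bytes equal to buf[pos] occur starting at pos.
   Returns 0 if pos is past the end of the buffer. */
size_t run_length_at(const unsigned char *buf, size_t len, size_t pos)
{
    size_t n = 0;

    if (pos >= len)
        return 0;

    /* Advance while the next byte still matches the first byte of the run. */
    while (pos + n < len && buf[pos + n] == buf[pos])
        n++;

    return n;
}
```

For the 5 byte buffer "AAAAB", this returns 4 at position 0 and 1 at
position 4. An encoder could compare the result against a minimum run
length to decide whether a run is worth compressing.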
However, if the text to be encoded is arbitrary, then it may contain
numbers as well as letters, and bytes of all possible values. The decoder
therefore needs some way to tell a compressed run from a sequence that
is to be passed straight through. To that end, the following file format
was used:
========= tech info =========
16 bit header version.
File format:
    13 bytes : original filename, followed by:
    [ 16 bit header + data ][ 16 bit header + data ][ 16 bit header + data ]
    etc.
    header:
        [lo byte][hi byte] ==> turn into 16 bit int ==>
        bit 15     : 1 if following byte is a run
        bit 14 - 0 : length of run (max 32767, min 4)
    data:
        1 byte : which character run consists of
            *** OR ***
    header:
        [lo byte][hi byte] ==> turn into 16 bit int ==>
        bit 15     : 0 if following bytes are sequence
        bit 14 - 0 : length of sequence (max 32767)
    data:
        (header & 0x7FFF) bytes of data
        : data bytes copied to output stream unchanged
===============================
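The block layout above can be sketched as a small in-memory decoder. This is
an illustration written from the format description only, not the program's
own code: the function name rle16_decode_blocks, the macro names, and the
buffer-based interface are all assumptions. It expects input that starts
after the 13 byte filename field, and it accepts any run length rather than
enforcing the minimum of 4.

```c
#include <stddef.h>
#include <string.h>

#define RLE_RUN_FLAG 0x8000u   /* bit 15: set when the block is a run      */
#define RLE_LEN_MASK 0x7FFFu   /* bits 14 - 0: length of run or sequence   */

/* Hypothetical decoder for the [header + data] blocks described above.
   Returns the number of bytes written to out, or (size_t)-1 if the input
   is truncated or the output buffer is too small. */
size_t rle16_decode_blocks(const unsigned char *in, size_t in_len,
                           unsigned char *out, size_t out_cap)
{
    size_t ip = 0, op = 0;

    while (ip < in_len) {
        unsigned header, count;

        /* Header is stored as [lo byte][hi byte]. */
        if (in_len - ip < 2)
            return (size_t)-1;
        header = (unsigned)in[ip] | ((unsigned)in[ip + 1] << 8);
        ip += 2;
        count = header & RLE_LEN_MASK;

        if (count > out_cap - op)
            return (size_t)-1;

        if (header & RLE_RUN_FLAG) {
            /* Run: one data byte, repeated count times. */
            if (ip >= in_len)
                return (size_t)-1;
            memset(out + op, in[ip], count);
            ip += 1;
        } else {
            /* Sequence: count data bytes copied unchanged. */
            if (count > in_len - ip)
                return (size_t)-1;
            memcpy(out + op, in + ip, count);
            ip += count;
        }
        op += count;
    }
    return op;
}
```

For example, the 8 input bytes 05 80 'E' 03 00 'a' 'b' 'c' hold a run header
0x8005 followed by 'E', then a sequence header 0x0003 followed by "abc", and
decode to the 8 bytes "EEEEEabc".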