Abstract:
This paper proposes an efficient text compression technique based
on a dynamic source encoding scheme using Lempel-Ziv algorithms. A statistical
analysis is performed to count the occurrence of each character in the original
source. Accordingly, each character is replaced by a weighted fixed-length
code in lieu of arbitrary codes such as ASCII. This replacement generates an
equivalent binary source with two symbols, zero and one. Thereafter, the
bitwise Lempel-Ziv algorithms can be applied to the nth-order extension of the
binary source, which contains 2^n symbols. As a result, the proposed technique
achieves a higher compression ratio than techniques that use static and
arbitrary source encoding schemes. In the decompression process, a header file
is received that lists all characters sorted in descending order of their
frequency of occurrence, along with the extension order (n). Thus, each
character can be restored from the code shared between the compression and
decompression processes.
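
The pipeline summarized above can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: the "weighted" fixed-length code is taken to be each character's frequency rank written in binary, the Lempel-Ziv stage is a plain LZ78 over n-bit symbols, and the original character count is passed to the decoder so that padding bits can be discarded. All function names are hypothetical.

```python
from collections import Counter


def build_header(text):
    """Characters sorted by descending frequency (ties broken by code point)."""
    counts = Counter(text)
    return sorted(counts, key=lambda ch: (-counts[ch], ch))


def code_width(header):
    """Bits needed for a fixed-length code over the header's characters."""
    return max(1, (len(header) - 1).bit_length())


def to_bits(text, header):
    """Assumption: each character's code is its frequency rank in binary."""
    width = code_width(header)
    rank = {ch: i for i, ch in enumerate(header)}
    return "".join(format(rank[ch], f"0{width}b") for ch in text)


def lz78_encode(bits, n):
    """LZ78 over n-bit symbols (the nth-order extension of the binary source).

    Returns a list of (dictionary index, n-bit symbol) pairs.
    """
    padded = bits + "0" * (-len(bits) % n)  # pad to a multiple of n bits
    symbols = [padded[i:i + n] for i in range(0, len(padded), n)]
    dictionary = {(): 0}
    phrase, out = (), []
    for sym in symbols:
        candidate = phrase + (sym,)
        if candidate in dictionary:
            phrase = candidate
        else:
            out.append((dictionary[phrase], sym))
            dictionary[candidate] = len(dictionary)
            phrase = ()
    if phrase:
        # Flush a trailing phrase that is already in the dictionary.
        out.append((dictionary[phrase[:-1]], phrase[-1]))
    return out


def lz78_decode(pairs):
    """Rebuild the (padded) bit string from (index, symbol) pairs."""
    phrases = [""]
    out = []
    for index, sym in pairs:
        entry = phrases[index] + sym
        phrases.append(entry)
        out.append(entry)
    return "".join(out)


def from_bits(bits, header, length):
    """Map fixed-length codes back to characters using the shared header.

    `length` (the original character count) is an assumed extra header field.
    """
    width = code_width(header)
    return "".join(
        header[int(bits[i * width:(i + 1) * width], 2)] for i in range(length)
    )


text = "abracadabra"
header = build_header(text)
pairs = lz78_encode(to_bits(text, header), n=3)
restored = from_bits(lz78_decode(pairs), header, len(text))
```

In this sketch the decoder needs only the sorted character list, the extension order n, and the character count, which mirrors the role the abstract assigns to the header file.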