Author: Arkadi Kagan
Document: Entropy Compression Methods.

PPM - Prediction by Partial Matching.

Prediction by Partial Matching is a method to predict the next symbol depending on the n previous ones. This method is also called prediction by a Markov Model [3] of order n. For text in a natural language like English it is intuitively clear, and has been confirmed by researchers [1], that the probability of each symbol depends strongly on the preceding symbols. The most obvious way to exploit this property is to combine it with one of the known entropy-coding methods that gather statistics. The natural candidates are Huffman and Arithmetic coding, in either their Static or Adaptive variants.

The key issue is to change the alphabet to be an alphabet of sequences rather than an alphabet of single symbols. Notice the definition of a sequence y:
y = [x(i-n) ... x(i)].
If x(i-n) or other members of the sequence are not available (at the very beginning of the stream), there are two options to handle the situation: either encode the leading symbols without prediction, or fall back to a shorter context. The first option is obvious and needs no further clarification. The second option can be enhanced and deserves a more detailed explanation. I will not discuss it directly here because it can be implemented as a variant of Multilevel PPM.
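For illustration, the order-n statistics described above can be gathered with a few lines of code. This is a minimal sketch, not a reference implementation; the function name and the sample text are my own.

```python
from collections import defaultdict

def build_context_model(data: str, order: int):
    """Count how often each symbol follows each length-`order` context.

    The resulting table approximates P(next symbol | previous `order`
    symbols) -- the Markov-model statistics that PPM is built on.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(order, len(data)):
        context = data[i - order:i]
        counts[context][data[i]] += 1
    return counts

model = build_context_model("the theme of the thesis", order=2)
# In this sample, the context "th" is followed by 'e' every time,
# so an entropy coder would give 'e' a very short code there.
```
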

Multilevel PPM.

Instead of constructing the compression alphabet from a constant-order Markov Model, it is possible to decide, for every position in the input, which order of the model to use, keeping sequences of different lengths in one alphabet. The last step is a loop over the input, and here we face two very probable dangers:
  1. To find a segmentation whose summed weight is far from maximal and therefore far from optimal.
  2. To build a correct but wasteful algorithm that needs too much memory and CPU time.
The best solution I could think of is a modification of the Viterbi Algorithm.
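The Viterbi-style pass mentioned above can be sketched as a dynamic program over the input: for every prefix, keep the best-weighted split found so far. This is an illustrative sketch under my own assumptions (weights are given as a plain dictionary, and every single symbol is present in it so a split always exists).

```python
def best_segmentation(data: str, weight: dict):
    """Split `data` into dictionary sequences with maximal total weight
    using dynamic programming (a Viterbi-style modification)."""
    n = len(data)
    best = [float("-inf")] * (n + 1)   # best[i]: max weight of data[:i]
    best[0] = 0.0
    back = [0] * (n + 1)               # back[i]: length of last sequence
    max_len = max(len(s) for s in weight)
    for i in range(1, n + 1):
        for length in range(1, min(max_len, i) + 1):
            s = data[i - length:i]
            if s in weight and best[i - length] + weight[s] > best[i]:
                best[i] = best[i - length] + weight[s]
                back[i] = length
    # Recover the segmentation by walking the back-pointers.
    parts, i = [], n
    while i > 0:
        parts.append(data[i - back[i]:i])
        i -= back[i]
    return parts[::-1], best[n]

parts, total = best_segmentation("abab", {"a": 1, "b": 1, "ab": 3, "ba": 1})
# -> (["ab", "ab"], 6.0)
```

The loop examines each position against at most max_len candidate sequences, so the cost is O(n * max_len) -- this is what keeps the search for the maximal-weight split from becoming wasteful.
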

The far corner of Multilevel PPM is to let the maximal order of the Markov Model be as large as the length of the input stream.
An implementation of this PPM modification would come closest to an optimal variant of the Lempel-Ziv compression scheme. Unfortunately, no existing implementation can work with a reasonable amount of memory and processing time.

The opposite far corner is to set the maximal Markov Model order to zero. In this case we get the ordinary entropy coder that was taken as the basis for PPM.

Some skipped details.

The weight.

Notice the undefined entity called weight. The only restrictions I put on it are those needed to use the modification of the Viterbi Algorithm. If another algorithm is used, these restrictions could be different.
The simplest variants for choosing the weight are as follows:
  1. If s = s1a (sequence s1 followed by symbol a), then weight[s] = weight[s1] + weight[a].
    The rest of the laws for this variant are obvious.
  2. The weight of a sequence is equal to its length.
    Here we get a variant of LZ77.
Endless variants are possible, and I am not yet ready to classify all of them.
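The two variants above can be written down directly. The function names are mine; variant 1 is shown in its simplest form, where a sequence's weight is just the sum of its symbols' weights.

```python
def additive_weight(seq, symbol_weight):
    """Variant 1: if s = s1 + a, then weight[s] = weight[s1] + weight[a],
    so the weight of a sequence is the sum of its symbols' weights."""
    return sum(symbol_weight[c] for c in seq)

def length_weight(seq):
    """Variant 2: the weight of a sequence is simply its length,
    which rewards long matches and yields LZ77-like behaviour."""
    return len(seq)
```
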

The basis algorithm.

One entity that was not discussed is the basis algorithm that PPM is built on.
Huffman or Arithmetic coders can be used to further improve the compression rate. However, simpler methods can be chosen for simplicity or as an attempt to achieve higher modularity.
One of these simpler methods is to store the target symbol indexes as plain integers of constant bit-length.
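Such a constant bit-length encoding of symbol indexes can be sketched as follows. This is an illustration, not part of any described implementation; the choice to pack bits most-significant-first is mine.

```python
import math

def pack_indexes(indexes, alphabet_size):
    """Write each index with the same fixed number of bits,
    ceil(log2(alphabet_size)), instead of an entropy code."""
    bits = max(1, math.ceil(math.log2(alphabet_size)))
    out = bytearray()
    acc = acc_len = 0
    for idx in indexes:
        acc = (acc << bits) | idx     # append `bits` bits to the accumulator
        acc_len += bits
        while acc_len >= 8:           # emit every complete byte
            acc_len -= 8
            out.append((acc >> acc_len) & 0xFF)
    if acc_len:                       # flush the last partial byte
        out.append((acc << (8 - acc_len)) & 0xFF)
    return bytes(out), bits

packed, bits = pack_indexes([1, 2, 3, 0], 4)
# -> packed == b'\x6c', bits == 2  (01 10 11 00 packed into one byte)
```
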

The gathered statistics.

As with any compression based on gathered statistics, there are three main ways to pass the statistics from the compressor to the decompressor.
  1. Static coding.
    All statistics must be passed somehow from the compressor to the decompressor. One variant is to encode the statistics data the way it is done for a Canonical Huffman Tree.
  2. Adaptive coding.
    Statistics are not passed at all. Instead, they are gathered symmetrically during encoding and decoding. This requires a way to encode symbols that are encountered for the first time.
    In the case of Multilevel PPM, unlike ordinary Huffman or Arithmetic coding, Adaptive coding can reduce the compression rate dramatically.
  3. Fixed statistics.
    This variant is useful, if at all, only in very specific cases where the statistics of the source are constant and well investigated.
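Variant 2, the symmetric gathering of statistics, can be illustrated with a minimal model that the encoder and the decoder both run identically. This is a sketch; initializing every count to 1 is one simple way to give first-time symbols a nonzero probability instead of using an explicit escape code.

```python
class AdaptiveModel:
    """Encoder and decoder construct this from the same alphabet and call
    update() after every symbol, so no statistics table is transmitted."""

    def __init__(self, alphabet):
        # Count 1 for every symbol: even a symbol seen for the first
        # time has nonzero probability.
        self.counts = {a: 1 for a in alphabet}
        self.total = len(self.counts)

    def probability(self, symbol):
        """Current estimate of P(symbol), fed to the entropy coder."""
        return self.counts[symbol] / self.total

    def update(self, symbol):
        """Symmetric update performed after encoding/decoding `symbol`."""
        self.counts[symbol] += 1
        self.total += 1
```
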


[1] Data Compression Using Adaptive Coding and Partial String Matching.
John G. Cleary and Ian H. Witten.

[2] Prediction by Partial Matching (PPM) FAQ.
Maksim Smirnov and Dmitri Shkarin.

[3] A Tutorial on Hidden Markov Models.
Rakesh Dugad and U. B. Desai.