Compression : News Postings : RM vs. DCC95

Here are two mail messages I wrote in response to replies to a news article I posted with the title "arithmetic encoding revisited revisited". I have since lost the original article in a hard drive crash. If anyone has this article, I would REALLY appreciate a copy!!

From bloom Thu Jun 22 09:34:18 1995
Date: Thu, 22 Jun 95 09:34:14 PDT
From: bloom (Charles Bloom)
To: noesis@ucscb.UCSC.EDU
Subject: Re: Huffman beats Arithmetic !!
Cc: bloom
Content-Length: 4291
Status: RO
X-Lines: 152

> liked the article. common for people to neglect the simple
> when given what looks like better / more complex. linked lists
> and bubble sort are good examples.
> but...that's not why i wrote :)
> i thought i knew bunches about compression, but alas, i've never
> heard of either:
> 	>DCC95 code : 10k bps
> 	>Rissanen-mohiudin : 15k bps
> could you (briefly) describe these or send me off to an ftp
> site.
> thanks
> --kyle

response posted to comp.compression.research 6-22-95


"DCC95 code" refers to the source code for "Arithmetic 
 Encoding Revisited" by Moffat, Neal & Witten.  It is
 available at : pub/arith_coder/arith_coder.tar.gz

"Rissanen-mohiudin" refers to the technique in
 "A Multiplication-Free Multialphabet Arithmetic Code"
 by J. Rissanen and K.M. Mohiuddin
 I'm afraid there is no publicly available version of this
 code...  maybe I'll post mine...
  the basic idea is:

normal arithcoding:
 you have a range defined by Low (L) and Range (R)
 you want to shrink that range as set by SymbolTotal,
 SymbolLow, and SymbolHigh (St,Sl,Sh) ( Sl < Sh <= St )
 (St = cumprobtotal, Sl and Sh are the lower and upper
 cumprobs of a symbol)

L = L + (Sl/St)*R
R = ((Sh/St) - (Sl/St))*R

rewrite as:
L = L + Sl*(R/St)
R = (Sh-Sl)*(R/St)
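For concreteness, here's a hedged C sketch of one full-precision step, using the second (rearranged) form so the divide R/St is done only once.  The struct and function names are mine, not from any published coder.

```c
#include <stdint.h>

/* One full-precision arithcoding step (second form above):
   one divide r = R/St, then two multiplies. */
typedef struct { uint32_t L, R; } Coder;

static void narrow(Coder *c, uint32_t Sl, uint32_t Sh, uint32_t St)
{
    uint32_t r = c->R / St;   /* R/St, computed once */
    c->L += r * Sl;           /* L = L + Sl*(R/St)   */
    c->R  = r * (Sh - Sl);    /* R = (Sh-Sl)*(R/St)  */
}
```

e.g. with R = 65536 and (Sl,Sh,St) = (1,3,4), this gives L += 16384 and R = 32768.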

rissanen-mohiuddin coding:
  the idea is to eliminate the multiply and divide by
 setting R/St = 1

thus L = L + Sl
     R = Sh - Sl

tada!  The key is: the only thing that matters about Sl,Sh,St
 is their ratios (i.e. you can scale them without changing
 their meaning; this is obvious in the first form of the
 formulas above, in which only ratios of cumprobs appear),
 so you can change them so that St = R

If you actually scaled St up to R, this would be
 pointless (because it would require mult & div)
but you can approximately scale St to R by
 just shifting St up to the neighborhood of R.
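A minimal C sketch of that approximate scaling (names are mine): shift the cumprobs up together until St lands just under R, then update with no multiply or divide.  The leftover code space E = R - St is not allocated here.

```c
#include <stdint.h>

/* Rissanen-Mohiuddin-style step (sketch): shift the cumprobs up
   until (R>>1) < St <= R, so R/St is approximately 1, then update
   with no mult or div.  The extra space E = R - St is ignored. */
typedef struct { uint32_t L, R; } Coder;

static void rm_narrow(Coder *c, uint32_t Sl, uint32_t Sh, uint32_t St)
{
    while (St <= (c->R >> 1)) {   /* scale St into the neighborhood of R */
        Sl <<= 1;
        Sh <<= 1;
        St <<= 1;
    }
    c->L += Sl;                   /* L = L + Sl  (R/St taken as 1) */
    c->R  = Sh - Sl;              /* R = Sh - Sl                   */
}
```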

Now for a more theoretical discussion.
 Once St is within the region (R>>1) < St < R
 you have a problem : there is extra code space
	E = R - St
 Not using this extra code space would cause a
 serious penalty in compression (about 0.5 bpb worse).

Thus, you must put this extra code space (E) onto
 some symbol.

The fastest way to do this is to put it on the last
 symbol of the alphabet:
   if ( Sh == St )
     R = R - Sl
     L = L + Sl
(note the difference between this and the original)
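In C this last-symbol allocation might look like (function name mine; assumes St has already been shifted into the neighborhood of R):

```c
#include <stdint.h>

/* Put all of E = R - St on the last symbol of the alphabet (sketch). */
static void rm_narrow_last(uint32_t *L, uint32_t *R,
                           uint32_t Sl, uint32_t Sh, uint32_t St)
{
    *L += Sl;
    if (Sh == St)       /* last symbol: give it R - Sl, not Sh - Sl */
        *R -= Sl;
    else
        *R  = Sh - Sl;
}
```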

A simple, pretty good way is to put E on the MPS:

     if C > MPS
	Sl += E
	Sh += E
     else if C == MPS
	Sh += E
     else
	no change
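Sketched in C (C is the symbol's index, MPS the index of the most probable symbol; the function name is mine):

```c
#include <stdint.h>

/* Give the extra code space E to the MPS: widen the MPS interval by E
   and shift every symbol above it up by E (sketch). */
static void give_e_to_mps(int C, int MPS, uint32_t E,
                          uint32_t *Sl, uint32_t *Sh)
{
    if (C > MPS) {          /* above the MPS: shift up by E */
        *Sl += E;
        *Sh += E;
    } else if (C == MPS) {  /* the MPS itself: widen by E   */
        *Sh += E;
    }                       /* below the MPS: unchanged     */
}
```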

Now, the interesting thing : the best way of all
 to allocate E is to distribute it in the same
 ratio as the counts.  Thus, we wish to add
 to each count some fraction of itself, until
   St == R  (i.e. until all of E is used up)
 To do this, we cannot just double each count,
 because St > E ; however, we may be able to
 add count/2 to each count:
 (note that adding count/2 to each count
 is the same as adding cumprob/2 to each cumprob)
	if (St>>1) <= E
	  Sl += Sl>>1
	  Sh += Sh>>1
	  E -= St>>1
 Then try adding count/4 to each count,
 etc. until E == 0 (or thereabouts; E < 16 or so
 should be fine)
Note that this is not completely right because
of integer division rounding effects (i.e.
St>>1 will be larger than the actual sum of all
the shifted counts, because the odd-numbered counts
will drop their least significant bit)
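Here is a hedged C sketch of that shift-and-add distribution (names mine; per the rounding caveat, it stops once E is small rather than exactly zero):

```c
#include <stdint.h>

/* Distribute E over the cumprobs roughly in proportion to the counts:
   add cumprob>>k to each cumprob whenever the total added, St>>k,
   still fits in E.  Stops once E < 16 or the shift bottoms out. */
static void spread_e(uint32_t *Sl, uint32_t *Sh, uint32_t St, uint32_t E)
{
    for (int k = 1; E >= 16 && (St >> k) > 0; k++) {
        if ((St >> k) <= E) {
            *Sl += *Sl >> k;
            *Sh += *Sh >> k;
            E   -= St  >> k;
        }
    }
}
```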

 Notez: this is identical to the DCC95 code!

 Thus : I show that the DCC95 code is just a slick
 way to allocate the extra-code-space from the
 Rissanen-Mohiuddin code !!!

 Pretty cool, huh?

sidenote: I'm not really positive that this 
 result is the same as the DCC95 code, but it
 looks a whole lot like it... definitely, the
 viewpoint/perspective is different, but I
 think the resultant algorithm is the same...

notez: here's an area of research for someone to
 explore (tell me about it if you do) :
	a compromise between the speed of 
 	just adding E to the MPS and properly
	distributing E among the symbols could
	be found, such as properly distributing E
	among the top four most probable symbols
	etc.  Perhaps, for high-order context
	modelling, you could use a very small
	alphabet of the most probable
	symbols and so could very easily
	distribute E...
Charles Bloom

From bloom Thu Jun 22 18:36:27 1995
Date: Thu, 22 Jun 95 18:36:25 PDT
From: bloom (Charles Bloom)
Subject: Re: "Arithmetic Coding Revisited" Revisited
Cc: bloom
Content-Length: 3232
Status: RO
X-Lines: 94

Hi, Dr. Moffat, 

Thanks for the response.

> > > Yes, the RM method is essentially a special case of the DCC'95 code,
> > > when b-f=2.  This relationship was noted in the paper. 
> > 
> > b-f = 2 and they don't bother with spreading the extra code space:
> > they just drop it on the MPS which is the major difference in speed..
> Well, the same is true for the DCC code.  It does b-f iterations of
> ``spreading'' the error, and then puts all residual on the last
> symbol.  The code as distributed does not ensure that the last is the
> MPS, but the paper notes that for best efficiency the extra should be
> given to the MPS.

Exactly, except that the RM code doesn't even do those b-f iterations !
Once you have scaled St such that 
	(R>>1) < St < R  
then there are no more shifts for the RM coder to do, all you do is:
	L += Sl
	R = Sh - Sl
( in the terminology of my original message; review :
  L and R describe the current setting of the arithcoder
  Sl,Sh,&St are symbol cumprobs supplied by the model
    St is the total of all probs
    Sl & Sh are the lower & upper cumprobs
  for full precision you would do:
	L += Sl*R/St
	R = (Sh-Sl)*R/St )

Whereas the DCC95 coder does:
	E = R - St
	Sl += Sl>>1
	Sh += Sh>>1
	E -= St>>1
	(and so on for >>2, >>3, ... while the shift still fits in E)
	L += Sl
	R = Sh - Sl

note that the DCC95 coder looks a little different than this
because of the fact that it does the normalization of
  (R>>1) < St < R
and the allocation of E in one single merged loop...

the RM code eliminates all of that b-f loop by just doing the last two
instructions.  The RM code sticks a whole lot on the last symbol : E,
whereas the DCC95 code sticks much less than E on the last symbol...

> > BTW is the full paper version of "arithcoding revisited" available yet?
> > (I asked once before and was told that it was still in draft stage)..
> No, not yet. Too many other things have intervened.
> Sorry...

I'll just keep waiting ....

BTW the reason I have investigated this topic is because I developed
my own mult&div-free arithcoder, and discovered that it was the same
as the DCC95 & RM coders.  I never had a really good understanding of
the RM & DCC95 coders until I rediscovered them :^>

Essentially the problem comes down to this:
	I find that despite many advances, arithcoders are still
too slow for practical use : Static (two-pass) Huffman is about 10 times
faster than the DCC95 coder, and is even 5 times faster than the
very crude RM coder - and static huffman often outperforms the RM coder
for compression ratio (on large, evenly-distributed order-0 alphabets) !
	The DCC95 coder is further slowed by the complex model
maintenance required to maintain the partially-normalized counts, and
the RM coder is slowed by the same need to maintain
	(R>>1) < St < R 
i.e. partial normalization ...

	Is there a solution to this debacle?  The problem is that I
need to be able to do order-1 with the speed of Static Huffman (order-0)
and I had hoped that an approximate arithcoder would do it, but I
find that it does not...

Charles Bloom


Charles Bloom / cb at my domain