Here are two mail letters I wrote in response to responses to a new article I posted with the title "Arithmetic Encoding Revisited Revisited". I have since lost the original article in a hard drive crash. If anyone has this article, I would REALLY appreciate a copy!!

From bloom Thu Jun 22 09:34:18 1995
Date: Thu, 22 Jun 95 09:34:14 PDT
From: bloom (Charles Bloom)
To: noesis@ucscb.UCSC.EDU
Subject: Re: Huffman beats Arithmetic !!
Cc: bloom
Content-Length: 4291
Status: RO
X-Lines: 152

> liked the article. common for people to neglect the simple
> when given what looks like better / more complex. linked lists
> and bubble sort are good examples.
>
> but...that's not why i wrote :)
>
> i thought i knew bunches about compression, but alas, i've never
> heard of either:
>
> >DCC95 code : 10k bps
> >Rissanen-mohiudin : 15k bps
>
> could you (briefly) describe these or send me off to an ftp
> site.
> thanks
> --kyle

response posted to comp.compression.research 6-22-95

Hi!

"DCC95 code" refers to the source code for "Arithmetic Coding Revisited"
by Moffat, Neal & Witten.  It is available at
munnari.oz.au : pub/arith_coder/arith_coder.tar.gz

"Rissanen-Mohiuddin" refers to the technique in "A Multiplication-Free
Multialphabet Arithmetic Code" by J. Rissanen and K.M. Mohiuddin.
I'm afraid there is no publicly available version of this code...
maybe I'll post mine...

the basic idea is:

normal arithcoding:

  you have a range defined by Low (L) and Range (R).  you want to shrink
  that range as set by SymbolTotal, SymbolLow, and SymbolHigh (St,Sl,Sh)
  ( Sl < Sh <= St )
  (St = cumprobtotal, Sl and Sh are the lower and upper cumprobs of a symbol)

    L = L + (Sl/St)*R
    R = ((Sh/St) - (Sl/St))*R

  rewrite as:

    L = L + Sl*(R/St)
    R = (Sh-Sl)*(R/St)

Rissanen-Mohiuddin coding:

  the idea is to eliminate the multiply & divide by setting R/St = 1, thus:

    L = L + Sl
    R = Sh - Sl

  tada!

The key is: since the only thing which matters with Sl,Sh,St is their ratios (i.e.
you can scale them without changing their meaning) (this is obvious in the
first form of the above formulas, in which only ratios of cumprobs appear),
you can therefore change them so that St = R.

If you actually scaled St up to R, this would be pointless (because it
would require mult & div), but you can approximately scale St to R by just
shifting St up to the neighborhood of R.

Now I will go into a more theoretical discussion.

Once St is within the region (R>>1) < St < R you have a problem: there is
extra code space

  E = R - St

Not using this extra code space would cause a serious penalty in
compression (about 0.5 bpb worse).  Thus, you must put this extra code
space (E) onto some symbol.

The fastest way to do this is to put it on the last symbol of the alphabet:

  if ( Sh == St )
    R = R - Sl
  L = L + Sl

(note the difference between this and the original)

A simple, pretty good way is to put E on the MPS, thus:

  if C > MPS
    Sl += E
    Sh += E
  else if C == MPS
    Sh += E
  else
    no change

Now, the interesting thing: the best way of all to allocate E is to
distribute it in the same ratio as the counts.  Thus, we wish to add to
each count some fraction of itself, until St == R.  To do this, we cannot
just double each count, because St > E; however, we may be able to add
count/2 to each count.  also note that adding count/2 to each count is the
same as adding cumprob/2 to each cumprob.  thus:

  if (St>>1) <= E
  {
    Sl += Sl>>1
    Sh += Sh>>1
    E  -= St>>1
  }

if not, try it for adding count/4 to each count, etc., until E == 0 (or
thereabouts; E < 16 or so should be fine).

Note that this is not completely right because of integer division
rounding effects (i.e. St>>1 will be larger than the actual sum of all
shifted-down counts, because the odd-numbered counts will drop their
least-significant one).

Notez: this is identical to the DCC95 code!  Thus: I show that the DCC95
code is just a slick way to allocate the extra-code-space from the
Rissanen-Mohiuddin code !!!

Pretty cool, huh?
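(A minimal C sketch of the count/2, count/4 spreading idea above, working
directly on one symbol's cumprobs; the function name and interface are my
own invention, not from either paper's code:)

```c
#include <assert.h>

/* Distribute the extra code space E over the counts in ratio to the
   counts: add cumprob/2, cumprob/4, ... to Sl and Sh, subtracting the
   corresponding St/2, St/4, ... from E whenever it still fits.
   Approximate because of integer shift rounding, as noted above. */
static void spread_extra(unsigned *sl, unsigned *sh, unsigned st, unsigned e)
{
    unsigned slup = *sl >> 1;   /* cumprob/2, then /4, /8, ...          */
    unsigned shup = *sh >> 1;
    unsigned stup = st  >> 1;   /* how much of E this fraction uses up  */

    while (stup > 0 && e > 0) {
        if (stup <= e) {        /* can afford this fraction of every count */
            *sl += slup;
            *sh += shup;
            e   -= stup;
        }
        slup >>= 1;
        shup >>= 1;
        stup >>= 1;
    }
}
```

For example, with St = 100 and E = 70, a symbol with (Sl,Sh) = (40,60)
ends up at roughly (67,100): it gets its half, its eighth, and its
sixteenth, with a small remainder of E left over from rounding.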
Sidenote: I'm not really positive that this result is the same as the
DCC95 code, but it looks a whole lot like it... definitely, the
viewpoint/perspective is different, but I think the resultant algorithm
is the same...

Notez: here's an area of research for someone to explore (tell me about
it if you do): a compromise between the speed of just adding E to the MPS
and properly distributing E among the symbols could be found, such as
properly distributing E among the top-four-most-probable symbols, etc.
Perhaps, for high-order context modelling, you could use a very small
alphabet of the most probable symbols and so could very easily
distribute E...

Charles Bloom
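(For concreteness, here is a C sketch of one full RM-style encode step
using the simple put-E-on-the-MPS allocation from the mail above; the
function name and the sym_cmp convention - negative/zero/positive for a
symbol below/at/above the MPS in cumprob order - are my own:)

```c
#include <assert.h>

/* One Rissanen-Mohiuddin-style encode step.  Assumes the model has
   already scaled its counts so that (R>>1) < St <= R, so no multiply
   or divide is needed.  The extra code space E = R - St is given
   entirely to the most probable symbol. */
static void rm_encode_step(unsigned *L, unsigned *R,
                           unsigned sl, unsigned sh, unsigned st,
                           int sym_cmp)
{
    unsigned e = *R - st;           /* extra code space */

    if (sym_cmp > 0) {              /* symbol sits above the MPS: shift up */
        sl += e;
        sh += e;
    } else if (sym_cmp == 0) {      /* the MPS itself absorbs the extra */
        sh += e;
    }                               /* below the MPS: no change */

    *L += sl;                       /* the mult/div-free shrink */
    *R  = sh - sl;
}
```

E.g. with L = 0, R = 120, St = 100 and the MPS at (Sl,Sh) = (40,60),
coding the MPS gives L = 40 and R = 40 instead of the 20 a plain
St-scaled range would give.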

From bloom Thu Jun 22 18:36:27 1995
Date: Thu, 22 Jun 95 18:36:25 PDT
From: bloom (Charles Bloom)
To: alistair@cs.mu.OZ.AU
Subject: Re: "Arithmetic Coding Revisited" Revisited
Cc: bloom
Content-Length: 3232
Status: RO
X-Lines: 94

Hi, Dr. Moffat,

Thanks for the response.

> > > Yes, the RM method is essentially a special case of the DCC'95 code,
> > > when b-f=2.  This relationship was noted in the paper.
> >
> > b-f = 2 and they don't bother with spreading the extra code space:
> > they just drop it on the MPS which is the major difference in speed..
>
> Well, the same is true for the DCC code.  It does b-f iterations of
> ``spreading'' the error, and then puts all residual on the last
> symbol.  The code as distributed does not ensure that the last is the
> MPS, but the paper notes that for best efficiency the extra should be
> given to the MPS.

Exactly, except that the RM code doesn't even do those b-f iterations!
Once you have scaled St such that (R>>1) < St < R then there are no more
shifts for the RM coder to do; all you do is:

  L += Sl
  R = Sh - Sl

( in the terminology of my original message; review:
  L and R describe the current setting of the arithcoder
  Sl,Sh,&St are symbol cumprobs supplied by the model
  St is the total of all probs
  Sl & Sh are the lower & upper cumprobs for the symbol
  for full precision you would do:
    L += Sl*R/St
    R = (Sh-Sl)*R/St
)

Whereas the DCC95 coder does:

  E = R - St
  Slup = Sl>>1
  Shup = Sh>>1
  Stup = St>>1
  for(i=0;i<b-f;i++)
  {
    if ( Stup <= E ) { Sl += Slup; Sh += Shup; E -= Stup; }
    Slup >>= 1;
    Shup >>= 1;
    Stup >>= 1;
  }
  L += Sl
  R = Sh - Sl

note that the DCC95 coder looks a little different than this because it
does the normalization of (R>>1) < St < R and the allocation of E in one
single merged loop... the RM code eliminates all of that b-f loop by
just doing the last two instructions.  The RM code sticks a whole lot on
the last symbol: E, whereas the DCC95 code sticks much less than E on
the last symbol...

> > BTW is the full paper version of "arithcoding revisited" available yet?
> > (I asked once before and was told that it was still in draft stage)..
>
> No, not yet.  Too many other things have intervened.

Sorry... I'll just keep waiting ....

BTW the reason I have investigated this topic is because I developed my
own mult&div-free arithcoder, and discovered that it was the same as the
DCC95 & RM coders.  I never had a really good understanding of the RM &
DCC95 coders until I rediscovered them :^>

Essentially the problem comes down to this: I find that despite many
advances, arithcoders are still too slow for practical use: static
(two-pass) Huffman is about 10 times faster than the DCC95 coder, and is
even 5 times faster than the very crude RM coder - and static Huffman
often outperforms the RM coder for compression ratio (on large,
evenly-distributed order-0 alphabets)!

The DCC95 coder is further slowed by the complex model maintenance
required to maintain the partially-normalized counts, and the RM coder
is slowed by the same need to maintain (R>>1) < St < R, i.e. partial
normalization...

Is there a solution to this debacle?  The problem is that I need to be
able to do order-1 with the speed of static Huffman (order-0), and I had
hoped that an approximate arithcoder would do it, but I find that it
does not...

Charles Bloom
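(The partial normalization both mails complain about - shifting the
counts up until St sits in the neighborhood of R - is itself only a few
shifts per symbol; a C sketch, with my own names, using the convention
(R>>1) < St <= R for the target region:)

```c
#include <assert.h>

/* Scale one symbol's cumprobs (and the total) up by powers of two until
   the total St lands in (R>>1, R].  This is the "partial normalization"
   the RM coder must maintain; the cost is in doing it for the whole
   model, not in this per-symbol loop. */
static void scale_to_range(unsigned *sl, unsigned *sh, unsigned *st,
                           unsigned r)
{
    while (*st <= (r >> 1)) {
        *sl <<= 1;
        *sh <<= 1;
        *st <<= 1;
    }
}
```

For example, (Sl,Sh,St) = (4,6,10) with R = 100 is shifted three times,
to (32,48,80), leaving extra code space E = 100 - 80 = 20 to allocate by
one of the schemes discussed above.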

Charles Bloom / cb at my domain