Compression : News Postings : Minimum Entropy





this article differs from the posted version in these ways:

11-5-95 :	1. set of { Dn : 0<=n<=N } changed so that N is finite,
					not infinite

11-5-96 : 2. some typos corrected



In article <479diq$mrj@net.auckland.ac.nz>, mtimmerm@microstar.com (Matt Timmermans) says:
>Perhaps you should try to define exactly what you mean by "minimum
>entropy".  My definition, which you didn't didn't like, was:
> 
>   Given a string S and an invertable transformation F(), define
>   Entropy(S,F) to be the size of F(S).
> 
>   The "minimum entropy" of a string S is the minimum Entropy(S,F) for all
>   invertable transformations F.
> 
>I have demonstrated that the minimum entropy of any string is zero.  It
>also works if you take Entropy(S,F) to be Shannons_Entropy(F(S)).
> 
>I suspect that any definition you can think of will be equivalent to
>Kolmogorov complexity or zero.

Nope.  Your definition is true, but totally non-useful.

Thus, I have thought about this problem, and have come up with:

"The Bloom Minimum Entropy".

This is defined as:

Given a string S which you wish to find the B.M.E. of :

let Fn(S) be a reversible transform of S (the nth one)
let L(Fn(S)) be the length of the output of Fn

let Dn = a random, structured data set.

	Dn can be generated in several ways, and the choice of method is important

	Way 1 ) randomly grab an email message.
	or: randomly grab a newsgroup posting
	or: randomly grab a data file from ftp
	or: randomly grab a web page

	Way 2 ) build a Markov state machine with text-like properties,
	and use it to generate data which is random but also structured.
	You must use a different FSM for each n.
	(a sketch of such a generator follows below)

	Why we choose these ways : Dn must be random, but must also
	be structured (i.e. it must have realistic data structure, and
	be compressible; it should not be random noise with no
	correlation on any level).

	In both cases, we want a very large data set.

	Way 3 ) let Dn = a totally random set of length n

	I'm not sure which one of these ways will be most practical.

	Ways 2 & 3 are better for information-theoretic analysis.
	Way 1, I think, will give better real-world estimates of
	Minimum Entropy.
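Here is a minimal sketch of Way 2, in Python : an order-1 Markov chain over
characters, trained on a seed text.  The seed text below is just a stand-in
(any realistic corpus would do), and a different chain (a different seed, or a
higher order) would be used for each n.

import random
from collections import defaultdict

def build_chain(seed_text):
    # order-1 transition table : char -> list of observed successor chars
    chain = defaultdict(list)
    for a, b in zip(seed_text, seed_text[1:]):
        chain[a].append(b)
    return chain

def generate(chain, length, rng):
    # walk the chain to emit `length` characters of random-but-structured data
    states = list(chain.keys())
    out = [rng.choice(states)]
    for _ in range(length - 1):
        successors = chain.get(out[-1]) or states   # dead end : restart anywhere
        out.append(rng.choice(successors))
    return "".join(out)

# stand-in seed text; each n would get its own seed (hence its own FSM)
seed = "the quick brown fox jumps over the lazy dog " * 20
Dn = generate(build_chain(seed), 10000, random.Random(0))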

Way 4 for choosing the Dn set :

  take S, of length N

  let Di , ( 0 <= i < N ) = S with the byte 0 put in position i
  let Di , ( N <= i < 2N ) = S with the byte 1 put in position (i - N)
  ... and so on up to i < 256N (one set of N modifications per byte value)

  let Di , ( 256N <= i < 257N ) = S with the two bytes '00 00' put in position (i - 256N)
  ... 65536 such sets of (N-1) modifications, one per two-byte value

  then again for all 3-byte modifications, etc.

  up through the 256^N modifications of the whole of S, each in only 1 position

  THEN

  do it all again for S+S (i.e. S concatenated with S)

  then again for S+S+S ...

  up to the N-fold extension S+S+...+S

the result is a huge set, "most" of whose members "look like" S

thus Fmin will most likely be the same F which gives Qmin.
(a sketch of generating part of this set follows the ranking below)

The ways are better in this order:

Way 4 (mods of S) > Way 1 (net grabs) > Way 2 (markov machine) > Way 3 (random)
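A minimal sketch (Python) of the single-byte layer of Way 4.  I read
"put in position i" as overwriting the byte at position i, which matches the
(N-1) count given for the two-byte modifications; the full construction (the
multi-byte changes, then S+S, S+S+S, ...) is far too large to materialize, so
this just yields the first layer lazily.

def way4_single_byte(S):
    N = len(S)
    for b in range(256):                         # one set of N mods per byte value
        for i in range(N):
            yield S[:i] + bytes([b]) + S[i+1:]   # overwrite position i with b

S = b"an example string S"
count = sum(1 for _ in way4_single_byte(S))      # 256 * len(S) = 4864 members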


-----

(note that the indices n in Fn and Dn are independent).

let the set { Di } have N entries , N >= 1,000,000

	    N
let Z(Fn) = SUM ( L(Fn(Di)) )
	    i=0

let Zmin = Z(Fn) for some n, such that Zmin <= Z(Fn) for all n

let Q(Fn(S)) = L(Fn(S)) + Z(Fn) - Zmin

let Qmin = Q(Fn(S)) for some n, such that Qmin <= Q(Fn(S)) for all n

The minimum entropy of S is Qmin

TADA!

Thank you, thank you, please hold your applause until after the show.

Note that the n for Qmin might not be the same as the n for Zmin.
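A minimal sketch (Python) of the definitions above.  The standard-library
codecs zlib, bz2 and lzma stand in for the candidate transforms Fn (the real
definition ranges over all reversible transforms, which we obviously cannot
enumerate), lengths are counted in bytes here, and D is whatever { Di } set
was chosen by one of the ways above; the tiny D built at the end is only so
the sketch runs end-to-end.

import bz2, lzma, zlib

candidates = {                       # stand-in Fn's
    "zlib": lambda x: zlib.compress(x, 9),
    "bz2":  lambda x: bz2.compress(x, 9),
    "lzma": lambda x: lzma.compress(x),
}

def L(x):                            # L() : length of a transform's output
    return len(x)

def Z(F, D):                         # Z(Fn) = SUM over i of L(Fn(Di))
    return sum(L(F(Di)) for Di in D)

def bloom_min_entropy(S, D):
    Zs   = {name: Z(F, D) for name, F in candidates.items()}
    Zmin = min(Zs.values())
    # Q(Fn(S)) = L(Fn(S)) + Z(Fn) - Zmin
    Qs = {name: L(F(S)) + Zs[name] - Zmin for name, F in candidates.items()}
    return min(Qs.values()), Qs

S = b"the string S whose minimum entropy we want"
D = [b"structured sample text number %d, vaguely like S" % i for i in range(200)]
Qmin_value, Q_per_candidate = bloom_min_entropy(S, D)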

Consider an example :

let Fmin = the F which results in Zmin

let G(X) = a compressor :
	{
	if ( X == S ) outputbit(0)
	else outputbit(1), output Fmin(X)
	}

then L(G(S)) = 1 bit

BUT Z(G) = Zmin + ( 1 bit for each D )

Therefore Z(G) = Zmin + N bits

Q(G(S)) = 1 bit + Zmin + N bits - Zmin

Q(G(S)) = (N+1) bits

Clearly, G is not a good compressor.

Thus we see that N >= L(S) in bits is absolutely required.

The basic purpose of this definition is to eliminate all
trivial compressors.
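A minimal sketch (Python) of the counterexample above, with zlib standing in
for Fmin and lengths counted in bits, so the per-member flag-bit penalty in
Z(G) is visible.

import zlib

def Fmin_bits(x):                    # stand-in for Fmin, output length in bits
    return 8 * len(zlib.compress(x, 9))

def G_bits(x, S):                    # the "cheating" compressor G above
    if x == S:
        return 1                     # outputbit(0) : 1 bit total
    return 1 + Fmin_bits(x)          # outputbit(1), then Fmin(x)

S = b"the string being measured"
D = [b"member %d of the D set, vaguely text-like data" % i for i in range(10000)]

Z_Fmin = sum(Fmin_bits(d) for d in D)
Z_G    = sum(G_bits(d, S) for d in D)
Zmin   = min(Z_Fmin, Z_G)            # = Z_Fmin, since Z_G = Z_Fmin + len(D)

Q_Fmin = Fmin_bits(S) + Z_Fmin - Zmin   # a few hundred bits
Q_G    = G_bits(S, S) + Z_G    - Zmin   # = len(D) + 1 bits : G is heavily penalized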

Therefore, Qmin could be the Kolmogorov complexity, or less.

I'm not sure whether it will be less or not.




Note that the Bloom Minimum Entropy is not designed
to be a measure of the entropy of a set with
respect to a particular compressor.

I.e. Q(F(S)) is NOT a measurement of the entropy of S
with respect to F.

Qmin IS a valid minimum entropy, and Qmin is ALWAYS less
than or equal to L(S).

Because :

Qmin <= L(Fmin(S)) <= "Kolmogorov Complexity"(S) <= Order0_ShannonEntropy(S) <= L(S)

The proof of all of these inequalities is trivial.

For example :

	if the Kolmogorov compressor is better than Fmin, then
	Fmin is simply the Kolmogorov compressor,
	so L(Fmin(S)) == K.Complexity(S) ;
	otherwise Fmin is better than K.C. ,
	so L(Fmin(S)) <= K.Complexity(S) .
	Either way, L(Fmin(S)) <= K.Complexity(S) .

all other terms of the inequality can be proved in a
similar manner.




Charles Bloom / cb at my domain