This article is changed from the posted version in these ways:
  11-5-95 : 1. The set { Dn : 0 <= n <= N } changed so that N is
               finite, not infinite.
  11-5-96 : 2. Some typos corrected.

In article <479diq$mrj@net.auckland.ac.nz>, mtimmerm@microstar.com
(Matt Timmermans) says:
>Perhaps you should try to define exactly what you mean by "minimum
>entropy".  My definition, which you didn't like, was:
>
>  Given a string S and an invertible transformation F(), define
>  Entropy(S,F) to be the size of F(S).
>
>  The "minimum entropy" of a string S is the minimum Entropy(S,F) over
>  all invertible transformations F.
>
>I have demonstrated that the minimum entropy of any string is zero.  It
>also works if you take Entropy(S,F) to be Shannons_Entropy(F(S)).
>
>I suspect that any definition you can think of will be equivalent to
>Kolmogorov complexity or zero.

Nope.  Your definition is true, but totally non-useful.  Thus, I have
thought about this problem, and have come up with "The Bloom Minimum
Entropy".  It is defined as follows.

Given a string S whose B.M.E. you wish to find:

  let Fn(S)    = a reversible transform of S (the nth one)
  let L(Fn(S)) = the length of the output of Fn
  let Dn       = a random, structured data set

Dn could be generated in several ways, and the choice is important.

Way 1) randomly grab an email message
       or: randomly grab a newsgroup posting
       or: randomly grab a data file from ftp
       or: randomly grab a web page

Way 2) create a Markov state machine with text-like properties, to
       create data which is random but also structured.  You must use
       a different FSM for each n.

Why we choose these ways: Dn must be random, but must also be
structured (i.e. it must have realistic data structure, and be
compressible; it should not be random noise with no correlation on any
level).  In both cases, we want a very large data set.

Way 3) let Dn = a totally random set of length n

I'm not sure which of these ways will be more practical.
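A Way-2 style generator can be sketched quickly.  This is a minimal
illustration, not part of the article: an order-1 Markov chain trained on
a seed text (the seed string, function names, and lengths here are all my
own hypothetical choices), producing data which is random but retains the
seed's local structure, so it stays compressible.

```python
import random

def train_markov(seed_text):
    """Build an order-1 Markov model: for each character, collect the
    list of characters that follow it in the seed text."""
    model = {}
    for a, b in zip(seed_text, seed_text[1:]):
        model.setdefault(a, []).append(b)
    return model

def generate(model, length, rng):
    """Emit `length` characters by walking the chain; fall back to a
    random state if the current character has no recorded successor."""
    out = [rng.choice(list(model))]
    for _ in range(length - 1):
        succ = model.get(out[-1])
        out.append(rng.choice(succ) if succ else rng.choice(list(model)))
    return "".join(out)

rng = random.Random(0)  # a different seed per n gives a different FSM
model = train_markov("the quick brown fox jumps over the lazy dog " * 3)
sample = generate(model, 40, rng)
```

Using a different random seed (or a different training text) for each n
plays the role of "a different FSM for each n".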
Ways 2 & 3 are better for info-theoretic mathematical analysis; Way 1,
I think, will give better real-world-usage estimates of the Minimum
Entropy.

Way 4 for choosing the Dn set:

  take S, of length N
  let Di , (   0  <= i <   N  ) = S with a 0 put in position i
  let Di , (   N  <= i <  2N  ) = S with a 1 put in position i
      ... up to i < 256N
  let Di , ( 256N <= i < 257N ) = S with the two bytes '00' put in
      position i
      ... 65536 more sets of (N-1) modifications
  then again for all 3-byte modifications, etc., up through the 256^N
      modifications of the whole S in only one position
  THEN do it all again for S+S (i.e. S concatenated with S),
  then again for S+S+S ... up to N-fold S+S+S+S+S+S... extensions

The result is a huge set, "most" of whose members "look like" S; thus
Fmin will most likely be the same F which gives Qmin.

The ways are better in this order:
  Way 4 (mods of S) > Way 1 (net grabs) > Way 2 (Markov machine)
                    > Way 3 (random)

-----

(Note that the indices n in Fn and Dn are independent.)

  let Dn have N entries, N >= 1,000,000

                  N
  let Z(Fn)   =  SUM  L(Fn(Di))
                 i=0

  let Zmin     = Z(Fn) for some n, such that Zmin <= Z(Fn) for all n

  let Q(Fn(S)) = L(Fn(S)) + Z(Fn) - Zmin

  let Qmin     = Q(Fn(S)) for some n, such that Qmin <= Q(Fn(S))
                 for all n

The minimum entropy of S is Qmin.

TADA!  Thank you, thank you, please hold your applause until after the
show.

Note that the n for Qmin might not be the same as the n for Zmin.
Consider an example:

  let Fmin = the F which results in Zmin
  let G(X) = a compressor :
      { if ( X == S ) outputbit(0)
        else outputbit(1), output Fmin(X) }

  then L(G(S)) = 1 bit
  BUT  Z(G)    = Zmin + ( 1 bit for each D )
  so   Z(G)    = Zmin + N bits
       Q(G(S)) = 1 bit + Zmin + N bits - Zmin
       Q(G(S)) = (N+1) bits

Clearly, G is not a good compressor.  Thus we see that N >= L(S) in
bits is absolutely required.  The basic purpose of this definition is
to eliminate all trivial compressors.  Therefore, Qmin could be the
Kolmogorov complexity, or less.  I'm not sure whether it will be less
or not.
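The Z / Q / Qmin definitions and the G counterexample can be exercised
on a toy scale.  This is a sketch under my own assumptions, not the
article's procedure: zlib at three settings stands in for the family
{ Fn }, a list of repetitive strings stands in for { Di }, and lengths
are counted in bytes rather than bits.

```python
import zlib

def bme(S, transforms, D):
    """Sketch of the Bloom Minimum Entropy:
    Z(Fn) = sum over Di of L(Fn(Di)); Zmin = min Z(Fn);
    Q(Fn(S)) = L(Fn(S)) + Z(Fn) - Zmin; return Qmin = min Q."""
    Z = [sum(len(F(d)) for d in D) for F in transforms]
    Zmin = min(Z)
    Q = [len(F(S)) + Zn - Zmin for F, Zn in zip(transforms, Z)]
    return min(Q)

# Hypothetical family { Fn }: zlib at three compression levels.
honest = [lambda b, lvl=lvl: zlib.compress(b, lvl) for lvl in (1, 6, 9)]

# Structured, compressible stand-in for the Di set, with more entries
# than L(S) so the Z(Fn) - Zmin penalty can do its job.
D = [("message number %d, " % i).encode() * 3 for i in range(200)]
S = b"the quick brown fox jumps over the lazy dog " * 4

# The "cheater" G from the text: ~1 byte for S itself, but it pays a
# 1-byte tag on every Di, so Z(G) exceeds Zmin by len(D) and Q(G(S))
# is huge even though L(G(S)) is tiny.
cheat = lambda b: b"\x00" if b == S else b"\x01" + zlib.compress(b, 9)

q_honest = bme(S, honest, D)
q_all = bme(S, honest + [cheat], D)
```

Because the D set here is larger than L(S), including the cheater does
not change Qmin: the penalty term eliminates the trivial compressor,
which is exactly the point of the definition.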

Note that the Bloom Minimum Entropy is not designed to be a measure of
the entropy of a set with respect to a particular compressor; i.e.
Q(F(S)) is NOT a measurement of the entropy of S with respect to F.

Qmin IS a valid minimum entropy, and Qmin is ALWAYS less than or equal
to L(S), because:

  Qmin <= L(Fmin(S)) <= KolmogorovComplexity(S)
       <= Order0_ShannonEntropy(S) <= L(S)

The proof of each of these inequalities is trivial.  For example:

  if the Kolmogorov coder is better than Fmin,
      then Fmin = the Kolmogorov coder,
      so   L(Fmin(S)) == K.Complexity(S)
  else Fmin is better than K.C.,
      so   L(Fmin(S)) <= K.Complexity(S)

All the other terms of the inequality can be proved in a similar
manner.
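The last link of the chain, Order0_ShannonEntropy(S) <= L(S), is easy
to check numerically.  A minimal sketch (my own helper, counting both
sides in bits over a string I made up):

```python
import math
from collections import Counter

def order0_shannon_bits(data):
    """Order-0 Shannon entropy of a byte string, in total bits:
    -sum over byte values c of n_c * log2(n_c / n)."""
    n = len(data)
    return -sum(c * math.log2(c / n) for c in Counter(data).values())

S = b"the quick brown fox jumps over the lazy dog " * 4
h0 = order0_shannon_bits(S)   # total bits under an order-0 model
# L(S) in bits is 8 * len(S); h0 can never exceed it, since log2 of
# any byte probability is at most 8 in magnitude.
```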

Charles Bloom / cb at my domain