ÊÊÊÊÊÊÊÊÊÊÊ The quantity now known as Kolmolgorov Complexity was developed independently by Solomonoff, Kolmolgorov, and Chaitin during the 1960s.Ê The Kolmolgorov Complexity, K(x)+c as defined by Kolmolgorov, is the length of the shortest binary computer program that can reproduce an object, up to a machine-dependent additive constant.Ê (2,4)Ê If this constant seems disconcerting, think of it as an emulator or dictionary from one system to another.Ê Like potential energy in a physical system, it contains an arbitrary constant that becomes meaningless when we consider differences in complexity.Ê This idea of a shortest recursive function eliminates the need for a probability distribution and random variables.Ê In fact, it leads to incredibly deep ideas about the nature of randomness and the limits of formal reasoning.Ê For example, we can prove the incompleteness of axiomatic systems (and hence of mathematics) in 5 easy steps (4).
1. Write down your complete axiomatic system with f symbols.
2. Define ãx is randomä to mean that the shortest symbolic description of finite string x is as long as x.
3. Consider some string of length n (which you write down with log n symbols) much greater than f (greater than log n + f, specifically)
4. Search the axiomatic system for a proof that this (some) string is random.Ê When you find it, name it x and write down its n symbols.Ê You described it with log n + f < n symbols.
5. But you were proving that it was random, which contradicts log n + f < n.
Chaitin has extended Gšdelâs incompleteness result to a form in which the natural numbers are shown to be random as defined above.Ê The problem that concerns us, however, is related to the inference problem considered by Solomonoff.Ê The first part of his idea was to define the ãuniversal probabilityä that a binary program with some known output could be produced by random coin flips.Ê This involves a sum of the form 2^(-L) where l is the program length in bits and the sum is over all programs that produce the desired output.Ê It is apparent from the exponential falloff that the sum will be dominated by the shortest program.Ê Hence, M(x)~ 2^-K(x).Ê If we use this as a prior probability in Bayesâs Rule, then the probability that the string x will be followed by a 1 rather than by a 0 is (4)
M(x1)
-----------------
M(x0)+M(x1)
This method was proved by Solomonoff to be optimal induction, with the caveat that neither K(x) nor (by extension) M(x) are computable in finite amount of time.Ê We can, however, approximate K(x) by C(x) where C(x) is the length of x encoded with the best data-compression scheme at our disposal.Ê It follows that random objects are incompressible.Ê
We will use a related scheme to calculate distances for phylogenic trees.