Overview
Which string is more complex?
0101010101010101010111001010001011101010They have the same length. But string 1 can be described as “Print ‘01’ 10 times.” String 2 has no pattern; you just have to print it. String 2 has higher Kolmogorov Complexity.
Core Idea
Shortest Description: The complexity of an object is the size of the smallest file that can generate it.
- Simple objects have short programs.
- Random objects have long programs (incompressible).
Formal Definition (if applicable)
$$ K(s) = \min { |p| : U(p) = s } $$
- $K(s)$: Kolmogorov complexity of string $s$.
- $|p|$: Length of program $p$.
- $U$: Universal Turing Machine.
Intuition
- Fractals: The Mandelbrot set is infinitely detailed, but the program to generate it is very short. Therefore, the Mandelbrot set has low Kolmogorov complexity.
- Pi ($\pi$): Infinite digits, but a short program can calculate them. Low complexity.
- Random Noise: High complexity. You can’t compress it.
Examples
- Occam’s Razor: The simplest explanation is usually the best. In math terms: Pick the hypothesis with the lowest Kolmogorov complexity.
- Zip Files: A practical approximation of Kolmogorov complexity.
Common Misconceptions
- “Complexity = Randomness.” (Sort of. Random things are maximally complex in this definition because they have no structure.)
- “We can calculate it.” (No, it is uncomputable. You can never prove you found the shortest program).
Related Concepts
- Chaitin’s Constant ($\Omega$): The probability that a random program will halt. It is a number that contains the answer to every mathematical problem, but it is uncomputable.
- Minimum Description Length (MDL): A principle in statistics/machine learning.
Applications
- Inductive Inference: Solomonoff Induction.
- Biology: Measuring the complexity of DNA sequences.
Criticism / Limitations
Since it’s uncomputable, we can’t use it directly. We have to use approximations like standard compression algorithms.
Further Reading
- Li & Vitányi, An Introduction to Kolmogorov Complexity and Its Applications