Overview

This is the theorem that started it all: Shannon’s Source Coding Theorem. It says: “You can compress data down to its entropy, but no further.”

Core Idea

Asymptotic Equipartition Property (AEP): For a long i.i.d. sequence of $n$ random variables, the outcome will almost certainly belong to a “Typical Set” of only about $2^{nH(X)}$ sequences, each with probability about $2^{-nH(X)}$. We only need to assign codes to the Typical Set, which takes about $nH(X)$ bits, i.e. $H(X)$ bits per symbol. The “Atypical” outcomes are so rare we can ignore them (or use long codes).
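
A minimal empirical sketch of the AEP, assuming an i.i.d. Bernoulli source; the bias of 0.2, the sample size, and the names below are illustrative, not from the text:

```python
import math
import random

p = 0.2                                               # illustrative bias: P(X = 1)
H = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))  # entropy, ~0.722 bits

n = 100_000
rng = random.Random(0)
seq = [1 if rng.random() < p else 0 for _ in range(n)]

# Per-symbol "surprise" of the realized sequence: -(1/n) log2 p(x_1..x_n)
log_prob = sum(math.log2(p if x else 1 - p) for x in seq)
print(f"H(X)                    = {H:.4f} bits/symbol")
print(f"-(1/n) log2 p(sequence) = {-log_prob / n:.4f} bits/symbol")
# The AEP says these two numbers agree for large n: the drawn
# sequence is overwhelmingly likely to be "typical".
```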

Formal Definition

$$ H(X) \le L < H(X) + 1 $$ No uniquely decodable code can have an expected length $L$ (bits per symbol) below the entropy $H(X)$, and an optimal symbol code (e.g. a Huffman code) comes within one bit of it. Encoding blocks of $n$ symbols at a time shrinks that overhead to $1/n$ bit per symbol, so the achievable rate approaches $H(X)$.
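
As a sanity check on the bound, here is a sketch that builds a Huffman code for a made-up four-symbol distribution and compares its expected length to the entropy; the probabilities are illustrative only:

```python
import heapq
import math

# Made-up source distribution (illustrative, not from the text)
probs = {"a": 0.5, "b": 0.25, "c": 0.15, "d": 0.10}
H = -sum(p * math.log2(p) for p in probs.values())

# Huffman's algorithm: repeatedly merge the two least probable nodes,
# prepending a bit to every codeword inside each merged subtree.
heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
heapq.heapify(heap)
tiebreak = len(heap)
while len(heap) > 1:
    p1, _, left = heapq.heappop(heap)
    p2, _, right = heapq.heappop(heap)
    merged = {s: "0" + w for s, w in left.items()}
    merged.update({s: "1" + w for s, w in right.items()})
    heapq.heappush(heap, (p1 + p2, tiebreak, merged))
    tiebreak += 1
code = heap[0][2]                      # e.g. {'a': '0', 'b': '10', ...}

L = sum(probs[s] * len(w) for s, w in code.items())
print(f"H(X) = {H:.4f} bits, expected code length L = {L:.4f} bits")
assert H <= L < H + 1                  # the source coding bound
```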

Intuition

If you are packing a suitcase:

  • Entropy: The volume of your clothes.
  • Compression: Vacuum packing.
  • Theorem: You can vacuum pack the air out, but you can’t make the clothes disappear. There is a minimum size.

Examples

  • English Text: Entropy is ~1.5 bits per letter once context is taken into account, while plain text is stored at 8 bits (one byte) per character. So English can in principle be compressed by roughly 80% (1 − 1.5/8 ≈ 81%).
  • Coin Flips: If the coin is fair (H = 1 bit per flip), you can’t compress the flips at all. If it’s biased (H < 1), you can; the sketch below makes this concrete.
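
A minimal experiment, assuming flips are packed 8 per byte and using zlib as an off-the-shelf compressor; the seed, sample size, and bias of 0.9 are made up for illustration, and zlib won’t reach the entropy limit exactly, but the fair/biased gap is clear:

```python
import random
import zlib

def packed_flips(p, n, seed=0):
    """n Bernoulli(p) coin flips, packed 8 flips per byte."""
    rng = random.Random(seed)
    out = bytearray()
    for _ in range(n // 8):
        byte = 0
        for _ in range(8):
            byte = (byte << 1) | (1 if rng.random() < p else 0)
        out.append(byte)
    return bytes(out)

for p in (0.5, 0.9):
    data = packed_flips(p, 100_000)
    ratio = len(zlib.compress(data, 9)) / len(data)
    print(f"P(heads) = {p}: compressed to {ratio:.0%} of original size")
# Fair flips (H = 1 bit/flip) stay near 100%; biased flips (H ≈ 0.47) shrink.
```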

Common Misconceptions

  • “I found a compression algorithm that compresses every file.” Impossible, by a counting (pigeonhole) argument: there are strictly fewer files shorter than $n$ bits than files of exactly $n$ bits, so no lossless scheme can map every length-$n$ file to a distinct shorter one.
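
Concretely, counting binary files of each length:

$$ \underbrace{2^0 + 2^1 + \cdots + 2^{n-1}}_{\text{files shorter than } n \text{ bits}} = 2^n - 1 \;<\; 2^n = \#\{\text{files of exactly } n \text{ bits}\} $$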

Related Concepts

  • Rate-Distortion Theory: The analogous limit for lossy compression. (How much quality must you give up for a given size?)
  • Kraft’s Inequality: A constraint on codeword lengths, $\sum_i 2^{-\ell_i} \le 1$, satisfied by every uniquely decodable code and achievable by a prefix code; a worked instance follows.
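
For instance, the prefix code {0, 10, 110, 111} meets Kraft’s inequality with equality:

$$ 2^{-1} + 2^{-2} + 2^{-3} + 2^{-3} = \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{1}{8} + \tfrac{1}{8} = 1 $$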

Applications

  • File Formats: ZIP, GZIP, 7z.
  • Communication: Optimizing bandwidth usage.

Criticism / Limitations

The achievability half is asymptotic: the entropy rate is guaranteed only in the limit of long block lengths. For short messages, per-symbol overhead means you might not get close to the entropy limit.

Further Reading

  • Shannon, C. E. (1948). “A Mathematical Theory of Communication.” Bell System Technical Journal, 27, 379–423 and 623–656.