Overview

The t-test is one of the most common statistical tests. It was developed by William Sealy Gosset under the pseudonym “Student” while working for Guinness Brewery to monitor the quality of stout.

Core Idea

It compares the difference between two means relative to the variance in the data. If the difference between groups is large compared to the spread within groups, the result is significant.

Formal Definition (if applicable)

$$ t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} $$

(For independent samples with unequal variances). The resulting t-statistic is compared to a t-distribution with degrees of freedom based on sample size.

Intuition

Imagine comparing the height of men and women.

  • Numerator: Difference in average height (Signal).
  • Denominator: Variability in height within each gender (Noise).
  • t-score: Signal / Noise. A high t-score means the difference is real, not just noise.

Examples

  • Medical: Comparing blood pressure in patients taking a drug vs. a placebo.
  • Business: Comparing conversion rates of two different landing pages (though Z-test is often used for proportions).

Common Misconceptions

  • “Data must be normal”: The t-test is robust to deviations from normality, especially with larger sample sizes (thanks to CLT).
  • “It proves the means are different”: It only shows that the observed difference is unlikely to have occurred by chance if the true means were the same.

Applications

Used extensively in scientific research, A/B testing, and quality assurance.

Criticism / Limitations

It only compares two groups. Using multiple t-tests to compare many groups increases the risk of Type I errors (false positives); ANOVA should be used instead.

Further Reading

  • “The probable error of a mean” by Student (1908)