Frisch–Waugh–Lovell Theorem

Theorem

Let the model be y = β₁X₁ + β₂X₂ + ε. Let ỹ and X̃₁ denote the residuals from regressing y and X₁ on X₂, respectively. Then there are three equivalent ways to obtain β̂₁:

  1. β̂₁ = the X₁ entry of (X′X)⁻¹X′y, where X = [X₁ X₂] (regress y on X₁ and X₂ jointly)
  2. β̂₁ = (X̃₁′X̃₁)⁻¹X̃₁′ỹ (regress ỹ on X̃₁)
  3. β̂₁ = (X̃₁′X̃₁)⁻¹X̃₁′y (regress y on X̃₁)
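
The equivalence of the three methods can be checked numerically. The following is a minimal sketch with NumPy, assuming simulated data with no intercept; the sample size, the true coefficients, the 0.7 dependence of X₁ on X₂, and the helper name `residualize` are all illustrative assumptions, not values taken from the page.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
beta1, beta2 = 2.0, -1.5  # hypothetical true coefficients

# Simulated data (assumption): X1 is correlated with the control X2.
x2 = rng.normal(size=n)
x1 = 0.7 * x2 + rng.normal(size=n)
y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)


def residualize(v, z):
    """Return the residuals from an OLS regression of v on the columns of z."""
    coef, *_ = np.linalg.lstsq(z, v, rcond=None)
    return v - z @ coef


X2 = x2.reshape(-1, 1)
X = np.column_stack([x1, x2])

# Method 1: regress y on X1 and X2 jointly, keep the X1 coefficient.
b_full = np.linalg.lstsq(X, y, rcond=None)[0][0]

# Partial X2 out of both y and X1.
y_tilde = residualize(y, X2)
x1_tilde = residualize(x1, X2)

# Method 2: regress residualized y on residualized X1.
b_fwl = (x1_tilde @ y_tilde) / (x1_tilde @ x1_tilde)

# Method 3: regress raw y on residualized X1.
b_half = (x1_tilde @ y) / (x1_tilde @ x1_tilde)

print(b_full, b_fwl, b_half)
```

The three printed values should agree up to floating-point error. Method 3 works because X̃₁ is orthogonal to X₂, so the part of y explained by X₂ contributes nothing to X̃₁′y.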

Intuition

ỹ and X̃₁ are the parts of y and X₁ orthogonal to X₂, that is, the variation left unexplained by the control. Regressing ỹ on X̃₁ isolates the partial effect of X₁. The raw univariate regression of y on X₁ alone conflates this effect with the X₁–X₂ correlation, producing omitted variable bias when X₂ also affects y. The bottom-right panel illustrates method 2; the sampling distributions below show the consequence for bias.
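
The bias described above can be reproduced with a small Monte Carlo experiment. This is a sketch under stated assumptions, not the page's own simulation: the parameter values (`beta1`, `beta2`, `rho`, `n`) are hypothetical, and both regressors are constructed with unit variance so that the raw slope should centre on β₁ + β₂·ρ. Only the count of 500 simulations is taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_sims = 200, 500               # 500 simulations, as in the text; n is assumed
beta1, beta2, rho = 1.0, 1.0, 0.6  # hypothetical parameters

full_est = np.empty(n_sims)
raw_est = np.empty(n_sims)

for s in range(n_sims):
    # X1 and X2 both have unit variance and correlation rho.
    x2 = rng.normal(size=n)
    x1 = rho * x2 + np.sqrt(1 - rho**2) * rng.normal(size=n)
    y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)

    # Full model: coefficient on X1 when X2 is included (same as FWL).
    X = np.column_stack([x1, x2])
    full_est[s] = np.linalg.lstsq(X, y, rcond=None)[0][0]

    # Raw univariate regression of y on X1 alone (X2 omitted).
    raw_est[s] = (x1 @ y) / (x1 @ x1)

# Omitted variable bias: beta2 * Cov(X1, X2) / Var(X1) = beta2 * rho here.
expected_raw = beta1 + beta2 * rho

print("full model mean:", full_est.mean())
print("raw univariate mean:", raw_est.mean())
print("predicted raw centre:", expected_raw)
```

Under these assumptions the full-model estimates centre on β₁, while the raw estimates centre near β₁ + β₂·ρ. Setting `rho` or `beta2` to zero removes the gap, which matches the condition in the text that bias arises only when X₂ is correlated with X₁ and also affects y.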

[Figure: four scatter panels: Raw Y vs X₁; Residual Y vs X₁; Y vs Residual X₁; ★ FWL: Residual Y vs Residual X₁. Each panel shows the OLS fit, its 95% CI, and the true slope.]

[Figure: Sampling distribution of β̂₁ across 500 simulations at current parameters. Legend: ★ Full model / FWL (unbiased); Raw univariate (biased).]