By Joseph G. Eisenhauer
Misinterpretation of regression results, particularly confusion among correlation, causation, and predictive power, remains a challenge in statistics education. This paper introduces the “perfect” regression as a pedagogical device for illustrating spurious correlation and overfitting. Using extreme cases with small samples and randomly generated data, it shows how exhausting the degrees of freedom by adding polynomial and interaction terms can mechanically produce regressions that fit the data exactly, despite lacking substantive meaning or predictive validity. These examples highlight the limitations of relying on measures of fit and statistical significance without context. The approach is intended for advanced high school and introductory college-level courses and is designed to foster critical statistical reasoning about regression results.
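
The mechanism described above can be illustrated with a minimal sketch. It is not the paper's own code: it assumes NumPy, and the sample size, seed, and function names (`r_squared`, `r2_by_degree`, `out_of_sample_r2`) are hypothetical choices made for illustration. With n observations of pure noise, adding polynomial terms until the model has n parameters drives R² to 1, even though the fitted curve says nothing about fresh data.

```python
import numpy as np


def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def r2_by_degree(x, y):
    """In-sample R^2 of polynomial fits of degree 1 through n - 1."""
    n = len(x)
    results = []
    for degree in range(1, n):
        # Columns 1, x, x^2, ..., x^degree (degree + 1 parameters).
        X = np.vander(x, N=degree + 1, increasing=True)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        results.append(r_squared(y, X @ beta))
    return results


def out_of_sample_r2(x, y, y_new):
    """Fit the saturated polynomial to y, then score it against y_new."""
    X = np.vander(x, N=len(x), increasing=True)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return r_squared(y_new, X @ beta)


if __name__ == "__main__":
    rng = np.random.default_rng(0)       # illustrative seed
    x = np.arange(1.0, 6.0)              # five distinct x values
    y = rng.normal(size=x.size)          # pure noise, unrelated to x
    y_new = rng.normal(size=x.size)      # fresh noise at the same x values

    for degree, r2 in enumerate(r2_by_degree(x, y), start=1):
        print(f"degree {degree}: in-sample R^2 = {r2:.6f}")
    print(f"saturated model on new data: R^2 = {out_of_sample_r2(x, y, y_new):.6f}")
```

In-sample R² cannot fall as terms are added, and it reaches 1 (up to rounding) once the number of parameters equals the number of observations. Scoring the same fitted curve against a second draw of noise shows that the perfect fit carries no predictive value.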
