Interactive Learning Guide

Regression
you can feel

Every concept has a live calculator. Change a number, see the answer update instantly. No static examples — just direct interaction with the maths.

Try this now: Go to Simple Linear → click on the chart to add data points, watch the line fit live. Then go to Beyond Linear → drag the polynomial degree slider to 10 and watch it overfit.
5 Live Calculators · Real-time Fitting · 7 Modules · 12 Quiz Questions
📐
01 — Foundations
What is Regression?

OLS objective, correlation vs causation, t-tests. Includes an R² & significance calculator.

📈
02 — Simple Linear
Click-to-Fit Canvas

Click to add points, right-click to remove. Full OLS stats update in real time. Prediction calculator included.

🧮
03 — Multiple Regression
Salary & VIF Calculators

Live salary predictor with partial effects. Multicollinearity VIF explorer. Model comparison tool.

🔍
04 — Assumptions
Violation Simulator

Generate data with specific violations, see how residual plots look and what goes wrong with inference.

🚀
05 — Beyond Linear
Logistic, Poly, Ridge/Lasso

Drag the degree slider, tune regularisation λ. Overfitting vs underfitting made visceral.

🩺
06 — Diagnostics
Four Live Diagnostic Plots

Switch between dataset types and watch all four R-style diagnostic plots update simultaneously.

01 — Foundations

What is Regression?

The core idea, OLS, correlation vs causation, and inference — with a live significance calculator

Regression asks: "If I know X, how well can I predict Y?" You have data, you draw the best possible straight line through it, and that line becomes your model. Given a new X, you read off the predicted Y. Everything else in regression is refinement of this simple idea.

You're a coffee shop owner. Hotter days → more iced coffees sold. You plot temperature vs cups sold. Regression finds the best line — the one that minimises total prediction error. Now you can order the right amount of coffee for tomorrow based on the weather forecast.

The Regression Framework
Y — Response

What you're predicting

The dependent variable / outcome / target. House price, test score, blood pressure. Goes on the vertical axis.

X — Predictor

What you're using to predict

The independent variable / feature / covariate. House size, hours studied, dosage. Goes on the horizontal axis.

β — Coefficient

The effect size

How much Y changes per unit of X. The slope. This is usually what you care about — it quantifies the relationship.

ε — Residual

What the model gets wrong

Actual Y minus predicted Ŷ. Good models have small, random residuals. Patterned residuals signal a missing variable or wrong model form.

The Linear Model
\[Y = \beta_0 + \beta_1 X + \varepsilon\]
β₀ = intercept (Y when X=0). β₁ = slope (ΔY per unit ΔX). ε = everything the model doesn't capture.

OLS (Ordinary Least Squares) finds the line that minimises the sum of squared vertical distances from each point to the line. "Squared" punishes big errors more than small ones, and makes the loss function smooth so calculus can find the exact minimum.

OLS Loss Function Explorer

Drag the slope and intercept to see how RSS changes. OLS finds the unique (β₀, β₁) that minimises RSS.

Controls: slope (2.0), intercept (1.00) · Readouts: Your RSS, OLS RSS (min), Excess over OLS · Chart: Current Line vs OLS Best Fit
Closed-Form Solution
\[\hat{\beta}_1=\frac{\sum(x_i-\bar{x})(y_i-\bar{y})}{\sum(x_i-\bar{x})^2},\quad\hat{\beta}_0=\bar{y}-\hat{\beta}_1\bar{x}\]
OLS has an exact algebraic solution — no iteration needed. The optimal line always passes through (x̄, ȳ).
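The closed-form solution translates directly to code; a minimal sketch with invented data (roughly y = 1 + 2x):

```python
def ols_fit(xs, ys):
    """Closed-form OLS: the slope and intercept that minimise RSS."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx           # slope
    b0 = y_bar - b1 * x_bar  # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

# Illustrative data, not from the page's calculators
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
b0, b1 = ols_fit(xs, ys)  # b1 = 1.97, b0 = 1.09
```

Note that `b0 + b1 * x_bar` recovers `y_bar` exactly, confirming the line passes through the point of means.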

Correlation (−1 to +1) measures linear association strength. Regression gives a predictive equation. But neither implies causation — that requires experiments, instrumental variables, or other causal methods.

Correlation Calculator — Live

Adjust the sliders to generate data with a specific correlation. See how the regression line and R² change.

Controls: target correlation (0.70), sample size (50) · Readouts: Sample r, R² (variance explained), Slope β₁, Interpretation
Remember: r = 0 means no linear relationship — not "no relationship". X and X² can have r ≈ 0 but a perfect non-linear relationship. And even r = 0.9 doesn't mean X causes Y.
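That first caveat is easy to verify in code: X and X² can be perfectly related yet have r exactly 0 when X is symmetric about zero. A sketch:

```python
def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    xb, yb = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xb) * (y - yb) for x, y in zip(xs, ys))
    sxx = sum((x - xb) ** 2 for x in xs)
    syy = sum((y - yb) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # a perfect (deterministic) relationship
r = pearson_r(xs, ys)      # r = 0: no *linear* association at all
```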
🧮 Statistical Inference Calculator

Enter your regression output — get p-values, confidence intervals, and a plain-English verdict instantly.

Inputs: β̂₁ (2.5), SE (0.80), n (30) · Outputs: t-statistic, p-value (two-sided), 95% CI lower/upper, Significant at 5%?, Plain-English Verdict

Adjust the sliders to explore how β̂₁, SE, and n jointly determine significance.
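The calculator's arithmetic is straightforward; a sketch using its default inputs (β̂₁ = 2.5, SE = 0.80, n = 30) with a normal approximation standing in for the exact t distribution (close for n this large):

```python
import math

def inference(beta_hat, se, n, alpha=0.05):
    """t-statistic, two-sided p-value, and CI for a slope estimate.
    Uses the normal approximation to the t distribution (good for large n)."""
    t = beta_hat / se
    # two-sided p-value via the standard normal CDF
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(t) / math.sqrt(2))))
    z = 1.959963985  # 97.5th percentile of the standard normal
    ci = (beta_hat - z * se, beta_hat + z * se)
    return t, p, ci

t, p, ci = inference(2.5, 0.80, 30)
# t = 3.125, p ≈ 0.002, CI excludes 0 → significant at the 5% level
```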

02 — Simple Linear

One line, live

Click the canvas to add data — OLS fits instantly with full statistics and a prediction calculator

Click anywhere on the chart to add data points. The OLS regression line fits immediately. Right-click to remove a point. Watch what happens to slope, R², and the prediction calculator when you add an outlier.

Live Regression Canvas — Left-click: add point · Right-click: remove nearest · Drag: move point
OLS Results — Live: Fitted Equation (add ≥ 2 points to fit)
Prediction Calculator — Input: X (5.0) · Outputs: Predicted Ŷ, 95% Prediction Interval, 95% Confidence Interval

PI vs CI: Prediction interval (wider) covers a single new observation. Confidence interval (narrower) covers the mean of Y at that X.
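That width difference comes straight from the formulas: the PI carries an extra "+1" noise term under the square root. A sketch with invented data, using the normal critical value in place of the exact t quantile:

```python
def intervals(xs, ys, x0, z=1.96):
    """Approximate 95% CI for the mean response and PI for a new
    observation at x0, for simple linear regression."""
    n = len(xs)
    xb = sum(xs) / n
    yb = sum(ys) / n
    sxx = sum((x - xb) ** 2 for x in xs)
    b1 = sum((x - xb) * (y - yb) for x, y in zip(xs, ys)) / sxx
    b0 = yb - b1 * xb
    rss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = (rss / (n - 2)) ** 0.5             # residual standard error
    h = 1 / n + (x0 - xb) ** 2 / sxx       # leverage-style term
    y_hat = b0 + b1 * x0
    ci = (y_hat - z * s * h ** 0.5, y_hat + z * s * h ** 0.5)
    pi = (y_hat - z * s * (1 + h) ** 0.5, y_hat + z * s * (1 + h) ** 0.5)
    return y_hat, ci, pi  # PI is always wider: it adds the +1 noise term

# Illustrative data
y_hat, ci, pi = intervals([1, 2, 3, 4, 5], [3.1, 4.9, 7.2, 8.8, 11.0], 3.0)
```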

03 — Multiple Regression

Many predictors, one model

Live salary calculator, partial effects decomposition, and VIF multicollinearity explorer

This model predicts salary from four inputs. Every slider updates the prediction, equation, and waterfall chart instantly. The chart shows each variable's contribution — the "partial effect" of each predictor while controlling for the others.

💼 Salary Prediction — Multiple Regression Model
Controls: four predictor sliders · Readouts: Live Equation, Predicted Salary, 95% Prediction Interval, Model R² = 0.79 · Chart: Contribution Breakdown

VIF (Variance Inflation Factor) measures how much a predictor is explained by the others. VIF = 1 is perfect independence. VIF > 10 means severe multicollinearity — your coefficient estimates become wildly unstable.
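With just two predictors, the "explained by the others" part reduces to their pairwise correlation r, so VIF is a one-liner. A sketch:

```python
def vif_two_predictors(r):
    """VIF when there are exactly two predictors with correlation r.
    Here R_j^2 (X1 regressed on X2) is simply r^2, so VIF = 1/(1 - r^2)."""
    return 1 / (1 - r ** 2)

vif_independent = vif_two_predictors(0.0)   # 1.0 — perfect independence
vif_severe = vif_two_predictors(0.95)       # ≈ 10.3 — severe multicollinearity
```

Since standard errors inflate by √VIF, a VIF of 10.3 means coefficient standard errors roughly triple.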

📡 Multicollinearity VIF Calculator

Drag the correlation between predictors. Watch how VIF, standard errors, and coefficient stability all respond.

Control: predictor correlation (0.30) · Readouts: VIF (X₁ and X₂), SE inflation factor, Effective n reduction, Severity, What this means
Multiple Regression — Core Theory
The Full Model
\[Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p + \varepsilon\]
Each βⱼ = effect of Xⱼ on Y, holding all other Xₖ constant ("ceteris paribus").
Matrix Form — OLS Solution
\[\hat{\boldsymbol{\beta}} = (\mathbf{X}^\top\mathbf{X})^{-1}\mathbf{X}^\top\mathbf{Y}\]
X is the n×(p+1) design matrix. This single formula gives all p+1 coefficients simultaneously.
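For the p = 1 case the matrix formula can be expanded by hand; a pure-Python sketch with an explicit 2×2 inverse (data invented for illustration):

```python
def ols_matrix(xs, ys):
    """OLS via the normal equations for one predictor plus intercept:
    beta = (X^T X)^{-1} X^T y, where each row of X is [1, x_i]."""
    n = len(xs)
    # X^T X is the 2x2 matrix [[n, Sx], [Sx, Sxx]]
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    # X^T y is the vector [Sy, Sxy]
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Explicit 2x2 inverse applied to X^T y
    det = n * sxx - sx * sx
    b0 = (sxx * sy - sx * sxy) / det
    b1 = (n * sxy - sx * sy) / det
    return b0, b1

# Same illustrative data as a closed-form fit would use; results agree
b0, b1 = ols_matrix([1, 2, 3, 4, 5], [3.1, 4.9, 7.2, 8.8, 11.0])
```

For general p you would use a linear-algebra library rather than hand-inverting, but the structure is identical.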
Adjusted R²

Penalised goodness of fit

R² always rises when you add variables. Adjusted R² penalises each extra parameter. Use this when comparing models of different sizes.

F-test

Are any predictors useful?

Tests H₀: β₁ = ⋯ = βₚ = 0. A significant F-test means at least one βⱼ ≠ 0 — but it doesn't tell you which.
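Both statistics are simple functions of R², n, and p. A sketch using the salary model's R² = 0.79 with a hypothetical n = 50 and p = 4 (the sample size is an assumption):

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2: penalises each of the p predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def f_statistic(r2, n, p):
    """Overall F statistic for H0: beta_1 = ... = beta_p = 0."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))

adj = adjusted_r2(0.79, 50, 4)  # ≈ 0.771 — slightly below the raw R^2
f = f_statistic(0.79, 50, 4)    # ≈ 42.3 — compared against an F(p, n-p-1) distribution
```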

04 — Assumptions

When does it break?

Simulate specific violations — see the residual plots and quantify how inference goes wrong

⚗️ Assumption Violation Simulator
Controls: violation strength (0.50), sample size (60), true slope β₁ = 2.00 · Readouts: Estimated β̂₁, Reported SE, True SE (robust), SE inflation · Chart: Residuals vs Fitted (select a violation type above)
1. Linearity
E[ε|X] = 0. Residuals should show no pattern vs fitted values.
STATUS: OK
2. Independence
Residuals uncorrelated. Violated by time series or clustered data.
STATUS: OK
3. Homoscedasticity
Constant error variance across all X values. No fan-shaped plots.
STATUS: OK
4. Normal Errors
Residuals ≈ Normal. Needed for exact inference; CLT relaxes this for large n.
STATUS: OK
05 — Beyond Linear

When a line isn't enough

Logistic regression, polynomial overfitting, and regularisation — all live

Logistic regression is for binary outcomes (yes/no, pass/fail). A sigmoid function squashes the linear prediction into a probability between 0 and 1. The threshold you choose trades off false positives vs false negatives.
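The sigmoid step fits in a few lines; the coefficients below mirror the calculator's default sliders (an assumption about what those sliders control):

```python
import math

def predict_prob(x, b0, b1):
    """Sigmoid squashes the linear score b0 + b1*x into (0, 1)."""
    return 1 / (1 + math.exp(-(b0 + b1 * x)))

def classify(x, b0, b1, threshold=0.5):
    """Raising the threshold trades sensitivity for specificity."""
    return predict_prob(x, b0, b1) >= threshold

p = predict_prob(0.0, 0.8, 1.2)  # score = 0.8 → p ≈ 0.69
odds_ratio = math.exp(1.2)       # each unit of X multiplies the odds by ≈ 3.32
```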

🎯 Logistic Regression — Live Classification
Controls: model coefficients (0.8, 1.2), decision threshold (0.50) · Readouts: Accuracy, Sensitivity (recall), Specificity, Precision (PPV), Odds Ratio Interpretation
Move the threshold to explore the sensitivity/specificity tradeoff.

Polynomial regression adds X², X³ etc. as predictors — still "linear regression" (linear in parameters). The danger: high degree = overfitting. The model memorises training noise. Watch R² → 1 on training data while the curve goes wild.
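The extreme case of the degree slider is a polynomial that passes through every training point exactly. A sketch of this "perfect memorisation" using Lagrange interpolation (data invented for illustration):

```python
def lagrange(xs, ys, x):
    """Evaluate the degree-(n-1) polynomial through all n points —
    the extreme of overfitting: zero training error by construction."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Nearly linear data with a little noise
xs = [0, 1, 2, 3, 4, 5]
ys = [0.1, 1.2, 1.9, 3.2, 3.8, 5.1]
# The interpolant reproduces every training point exactly (train RMSE = 0,
# train R^2 = 1), yet between points it can swing away from the linear trend —
# exactly what the degree slider shows at high settings.
```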

Overfitting Explorer — Polynomial Degree
Controls: polynomial degree (1), noise level (1.0) · Readouts (test set = held-out 30%): Train R², Test R², Train RMSE, Test RMSE

Ridge (L2) shrinks all coefficients toward zero. Lasso (L1) can shrink some to exactly zero — automatic variable selection. Higher λ = more shrinkage = less overfitting but more bias. The sweet spot is found by cross-validation.
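For a single centred predictor both penalties have closed forms, which makes the shrink-vs-zero distinction concrete. A sketch under one common penalty convention (the sums below are illustrative, not from the page's data):

```python
def ridge_slope(sxy, sxx, lam):
    """Ridge for one centred predictor, objective 1/2*RSS + lam/2 * b^2:
    shrinks toward zero but never reaches it for finite lam."""
    return sxy / (sxx + lam)

def lasso_slope(sxy, sxx, lam):
    """Lasso for one centred predictor, objective 1/2*RSS + lam * |b|:
    soft-thresholding can set the slope exactly to zero."""
    mag = max(abs(sxy) - lam, 0.0)
    return (1 if sxy >= 0 else -1) * mag / sxx

sxy, sxx = 19.7, 10.0       # illustrative sums from some centred dataset
ridge_slope(sxy, sxx, 0.0)  # 1.97  — equals OLS at lam = 0
ridge_slope(sxy, sxx, 10.0) # 0.985 — shrunk, but still nonzero
lasso_slope(sxy, sxx, 30.0) # 0.0   — zeroed out: variable deselected
```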

🎛️ Ridge vs Lasso — Live Regularisation
Control: regularisation strength λ (0.0)

Coefficient values as λ increases — Ridge shrinks, Lasso zeros out

Train vs Test error — find the λ that minimises test error

Readouts: Coefficients set to 0 (Lasso), Train RMSE, Test RMSE, Total shrinkage, What is happening
06 — Diagnostics

Is my model any good?

All four R diagnostic plots, live. Switch dataset type and watch them all update simultaneously.

🩺 Four-Plot Diagnostic Dashboard
Residuals vs Fitted

Normal Q-Q

Scale-Location

Leverage vs Std Residuals (Cook's D)

07 — Quiz

Test Your Understanding

12 questions — interpretation, theory, scenarios, and diagnostics

Scoreboard: Score (0), Streak (0), Correct (0), Accuracy · Select an answer to begin.