
Regression Models

Predict continuous outcomes like total points scored. Foundation for player projections and statistical modeling.

📈 Regression Types

Linear Regression

Continuous outcomes (total points scored). Models linear relationship between predictors and response.

y = β₀ + β₁x + ε

Logistic Regression

Binary outcomes (over/under hit rate). Models probability of success.

P(y=1) = 1/(1+e^(-(β₀ + β₁x)))

Poisson Regression

Count data (goals, touchdowns, strikeouts). Models rate/count outcomes.

log(λ) = β₀ + β₁x
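The three model types above map onto `lm()` and `glm()` in base R. The following is a minimal sketch on simulated data; the variable names, coefficient values, and predictor range are illustrative assumptions, not values from this tutorial.

```r
# Simulated data only: every name and coefficient below is an assumption.
set.seed(1)
n <- 200
x <- runif(n, 10, 40)                      # e.g. minutes played

# Linear: continuous outcome, y = b0 + b1*x + noise
y_cont <- 5 + 0.8 * x + rnorm(n, sd = 3)
fit_lin <- lm(y_cont ~ x)

# Logistic: binary outcome, P(y = 1) = 1 / (1 + exp(-(b0 + b1*x)))
p <- plogis(-5 + 0.2 * x)
y_bin <- rbinom(n, size = 1, prob = p)
fit_logit <- glm(y_bin ~ x, family = binomial)

# Poisson: count outcome, log(lambda) = b0 + b1*x
lambda <- exp(-1 + 0.05 * x)
y_count <- rpois(n, lambda)
fit_pois <- glm(y_count ~ x, family = poisson)

# Coefficients are on the response scale (linear), the log-odds scale
# (logistic), and the log-rate scale (Poisson)
coef(fit_lin)
coef(fit_logit)
coef(fit_pois)
```

Only the `family` argument changes between the logistic and Poisson fits; the formula interface stays the same.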

True Model

Underlying relationship (usually unknown)

True Slope (β₁): 0.8 (range 0 to 2)
True Intercept (β₀): 5 (range -10 to 20)
Noise Level (σ): 3 (range 0.5 to 10)

Data Settings

Sample Size (n): 50 (range 10 to 200)
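The settings above can be reproduced offline by simulating from the true model and fitting a regression. This is a rough sketch: the seed and the range of minutes are assumptions, so the estimates will not match the page's own random draw exactly.

```r
set.seed(42)                 # arbitrary seed (assumption)
beta0 <- 5                   # true intercept
beta1 <- 0.8                 # true slope
sigma <- 3                   # noise level
n     <- 50                  # sample size

minutes <- runif(n, 10, 40)  # assumed predictor range
points  <- beta0 + beta1 * minutes + rnorm(n, sd = sigma)

fit <- lm(points ~ minutes)
est <- coef(fit)

est                               # estimated intercept and slope
abs(est[["minutes"]] - beta1)     # absolute error of the slope estimate
```

Rerunning with a larger `n` or a smaller `sigma` should pull the estimates closer to the true values.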

📊 Estimated Coefficients

Slope (β̂₁): 0.822
True: 0.8, Error: 0.022
Intercept (β̂₀): 4.51

R²: 86.7%
SE of Slope: ±0.046
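For simple linear regression, the slope's standard error is σ̂ / sqrt(Σ(xᵢ − x̄)²), where σ̂ is the residual standard error, and R² is 1 − SSR/SST. The sketch below computes both by hand on simulated data (the seed and predictor range are assumptions) and compares them with what `summary()` reports.

```r
set.seed(7)                  # arbitrary seed (assumption)
n <- 50
x <- runif(n, 10, 40)        # assumed predictor range
y <- 5 + 0.8 * x + rnorm(n, sd = 3)
fit <- lm(y ~ x)

res <- resid(fit)

# Residual standard error: n - 2 degrees of freedom (intercept + slope)
sigma_hat <- sqrt(sum(res^2) / (n - 2))

# Standard error of the slope
se_slope <- sigma_hat / sqrt(sum((x - mean(x))^2))

# R-squared: 1 - residual sum of squares / total sum of squares
r_squared <- 1 - sum(res^2) / sum((y - mean(y))^2)

# Compare with the values reported by summary()
s <- summary(fit)
c(manual = se_slope,  lm = s$coefficients["x", "Std. Error"])
c(manual = r_squared, lm = s$r.squared)
```

Because Σ(xᵢ − x̄)² grows with the sample size, the standard error shrinks as more games are added.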

Linear Regression Fit

Equation: Points = 4.51 + 0.822 × Minutes

Interpretation

Slope Interpretation:

For each additional minute played, the player scores 0.82 more points on average.

R² Interpretation:

Minutes played explains 87% of the variance in points scored.
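The slope interpretation can be checked directly from the fitted equation. A small sketch, with the coefficients copied from the equation above and a hypothetical helper `project_points()`:

```r
# Coefficients from the fitted equation: Points = 4.51 + 0.822 * Minutes
intercept <- 4.51
slope     <- 0.822

# Hypothetical helper: projected points for a given number of minutes
project_points <- function(minutes) intercept + slope * minutes

project_points(30)                        # 4.51 + 0.822 * 30 = 29.17
project_points(31) - project_points(30)   # one extra minute adds the slope
```

The difference between any two projections one minute apart equals the slope, which is exactly what "0.82 more points per additional minute" means.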

🏀 Player Projection Features

Usage-Based Features

  • Minutes per game
  • Field goal attempts
  • Shot attempts in paint
  • Usage rate
  • Touches per game

Matchup Features

  • Opponent defensive rating
  • Position-specific defense
  • Pace of opponent
  • Home/away adjustments
  • Rest days

Context Features

  • Vegas total (implied pace)
  • Spread (game script)
  • Teammate injuries
  • Back-to-back indicator
  • Season phase
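Several of these features can be derived from a plain game log. The sketch below builds a back-to-back indicator and a trailing minutes average in base R; the `games` data frame, its column names, and its values are made up purely for illustration.

```r
# Toy game log: dates and minutes are invented for illustration only
games <- data.frame(
  game_date = as.Date(c("2024-01-01", "2024-01-02", "2024-01-05",
                        "2024-01-07", "2024-01-08")),
  minutes   = c(32, 28, 35, 34, 30),
  home      = c(1, 0, 1, 1, 0)
)
games <- games[order(games$game_date), ]

# Days since the previous game (NA for the first game in the log)
games$days_since_last <- c(NA, as.numeric(diff(games$game_date)))

# Back-to-back indicator: played on the previous calendar day
games$back_to_back <- as.integer(
  !is.na(games$days_since_last) & games$days_since_last == 1
)

# Mean minutes over up to 3 previous games, excluding the current game
# so the feature only uses information available before tip-off
n <- nrow(games)
games$minutes_prev3 <- sapply(seq_len(n), function(i) {
  if (i == 1) return(NA_real_)
  mean(games$minutes[max(1, i - 3):(i - 1)])
})

games
```

Excluding the current game from the trailing average matters: including it would leak the outcome's own game into a predictor.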

R Code Equivalent

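# Linear regression for player projections
# Assumes player_games is a data frame with one row per game and columns
# points, minutes, opponent_def_rating, rest_days, and home
# (home is assumed to be coded 1 = home, 0 = away).

# Fit model
model <- lm(points ~ minutes + opponent_def_rating + rest_days + home,
            data = player_games)

# Summary: coefficients, standard errors, R-squared
summary(model)

# Predict for tonight's game
new_data <- data.frame(
  minutes = 34,
  opponent_def_rating = 110,
  rest_days = 2,
  home = 1
)
# interval = "prediction" returns the point estimate plus a 95% prediction interval
predict(model, newdata = new_data, interval = "prediction")

# Feature importance (standardized coefficients); requires the lm.beta package
library(lm.beta)
lm.beta(model)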

✅ Key Takeaways

  • Linear regression predicts continuous outcomes
  • More data → lower standard errors → more precise estimates
  • R² measures in-sample explanatory power, not necessarily predictive accuracy
  • Feature engineering is key for sports projections
  • Use cross-validation to avoid overfitting
  • Consider Poisson regression for count data (TDs, HRs)
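One way to act on the cross-validation takeaway is k-fold cross-validation written in base R. This is a minimal sketch on simulated data: the helper `cv_rmse()`, the column names, and the deliberately useless `noise_feature` are all assumptions for illustration.

```r
set.seed(123)                # arbitrary seed (assumption)
n <- 200
d <- data.frame(
  minutes       = runif(n, 10, 40),
  noise_feature = rnorm(n)   # unrelated to points by construction
)
d$points <- 5 + 0.8 * d$minutes + rnorm(n, sd = 3)

# Randomly assign each row to one of k folds
k <- 5
fold <- sample(rep(seq_len(k), length.out = n))

# Hypothetical helper: average out-of-fold RMSE for an lm() formula.
# It assumes the response column is named `points`.
cv_rmse <- function(formula, data, fold) {
  errs <- sapply(sort(unique(fold)), function(f) {
    train   <- data[fold != f, ]
    holdout <- data[fold == f, ]
    fit  <- lm(formula, data = train)
    pred <- predict(fit, newdata = holdout)
    sqrt(mean((holdout$points - pred)^2))
  })
  mean(errs)
}

rmse_simple <- cv_rmse(points ~ minutes, d, fold)
rmse_extra  <- cv_rmse(points ~ minutes + noise_feature, d, fold)
c(simple = rmse_simple, extra = rmse_extra)
```

Adding a predictor can only raise in-sample R², so comparing out-of-fold RMSE is a more honest check of whether an extra feature actually helps prediction.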

Pricing Models & Frameworks Tutorial
