Regression Models
Predict continuous outcomes like total points scored. Foundation for player projections and statistical modeling.
Regression Types
Linear Regression
Continuous outcomes (total points scored). Models linear relationship between predictors and response.
y = β₀ + β₁x + ε
Logistic Regression
Binary outcomes (over/under hit rate). Models probability of success.
P(y=1) = 1/(1+e^(-(β₀ + β₁x)))
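A minimal sketch of fitting this model in R with `glm`. The data are simulated, and the predictor (`minutes`), the outcome (`over`), and the true coefficients are assumptions chosen only for illustration.

```r
# Simulated example: does a player go over a points line? (all values hypothetical)
set.seed(42)
n <- 500
minutes <- runif(n, 20, 40)                 # assumed predictor
true_b0 <- -6                               # assumed true intercept (log-odds scale)
true_b1 <- 0.2                              # assumed true slope (log-odds per minute)

# plogis(z) computes 1 / (1 + exp(-z)), the logistic function
p_over <- plogis(true_b0 + true_b1 * minutes)
over <- rbinom(n, size = 1, prob = p_over)  # 1 = over hit, 0 = under

# Fit the logistic regression
logit_fit <- glm(over ~ minutes, family = binomial(link = "logit"))
coef(logit_fit)

# Predicted probability of hitting the over at 34 minutes
p_34 <- predict(logit_fit, newdata = data.frame(minutes = 34), type = "response")
p_34
```

The coefficients are on the log-odds scale, so `exp(coef(logit_fit)["minutes"])` gives the multiplicative change in the odds per additional minute.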
Poisson Regression
Count data (goals, touchdowns, strikeouts). Models rate/count outcomes.
log(λ) = β₀ + β₁x
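A comparable sketch for count outcomes, again on simulated data. The predictor (`shots`), the outcome (`goals`), and the true coefficients are hypothetical.

```r
# Simulated example: goals scored as a function of shots taken (all values hypothetical)
set.seed(7)
n <- 300
shots <- runif(n, 5, 25)                 # assumed predictor
lambda <- exp(-1 + 0.1 * shots)          # assumed true rate: log(lambda) = -1 + 0.1 * shots
goals <- rpois(n, lambda)                # observed counts

# Fit the Poisson regression with a log link
pois_fit <- glm(goals ~ shots, family = poisson(link = "log"))
coef(pois_fit)

# exp(slope) is the multiplicative change in the expected count per extra shot
rate_ratio <- exp(coef(pois_fit)["shots"])
rate_ratio

# Expected goals at 15 shots
expected_goals <- predict(pois_fit, newdata = data.frame(shots = 15), type = "response")
expected_goals
```

Because the model is linear in log(λ), effects are multiplicative on the count scale rather than additive.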
True Model
Underlying relationship (usually unknown)
Estimated Coefficients
Slope (β̂₁): 0.822 (true: 0.8, error: 0.022)
Intercept (β̂₀): 4.51
R²: 86.7%
SE of slope: ±0.046
Linear Regression Fit
Equation: Points = 4.51 + 0.822 × Minutes
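The experiment behind a fit like this can be sketched in R: draw data from a known true model, fit `lm`, and see how close the estimate lands. Only the true slope of 0.8 comes from the page; the intercept, noise level, minutes range, and sample sizes below are assumptions, so the numbers will not match the fit shown above.

```r
# Simulate from a known linear model and fit it (intercept, noise_sd, range are assumed)
simulate_fit <- function(n, slope = 0.8, intercept = 5, noise_sd = 3) {
  minutes <- runif(n, 10, 40)
  points <- intercept + slope * minutes + rnorm(n, sd = noise_sd)
  lm(points ~ minutes)
}

# Extract the standard error of the slope from a fitted model
se_slope <- function(fit) summary(fit)$coefficients["minutes", "Std. Error"]

set.seed(1)
fit_small <- simulate_fit(10)    # few games
fit_large <- simulate_fit(200)   # many games

# Estimated slopes and their standard errors at each sample size
c(n10 = unname(coef(fit_small)["minutes"]), n200 = unname(coef(fit_large)["minutes"]))
c(n10 = se_slope(fit_small), n200 = se_slope(fit_large))
```

With the larger sample the slope's standard error shrinks, so the estimate is typically closer to the true value.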
Interpretation
Slope Interpretation:
For each additional minute played, the player scores 0.82 more points on average.
R² Interpretation:
Minutes played explains 87% of the variance in points scored.
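To make both quantities concrete, the sketch below computes R² and the slope's standard error by hand and checks them against `summary(lm)`. The data are simulated with assumed parameters, not taken from the fit above.

```r
# Simulated data (intercept 5, slope 0.8, noise SD 3 are assumptions)
set.seed(3)
minutes <- runif(100, 10, 40)
points <- 5 + 0.8 * minutes + rnorm(100, sd = 3)
fit <- lm(points ~ minutes)

# R-squared: share of the variance in points not left in the residuals
resid_ss <- sum(residuals(fit)^2)
total_ss <- sum((points - mean(points))^2)
r2_manual <- 1 - resid_ss / total_ss

# SE of slope: sqrt(s^2 / sum((x - mean(x))^2)), with s^2 = RSS / (n - 2)
s2 <- resid_ss / (length(points) - 2)
se_manual <- sqrt(s2 / sum((minutes - mean(minutes))^2))

# Compare with the values reported by summary()
c(manual = r2_manual, lm = summary(fit)$r.squared)
c(manual = se_manual, lm = summary(fit)$coefficients["minutes", "Std. Error"])
```

The SE formula shows why precision improves with more games and with a wider spread of minutes: both increase the denominator.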
Player Projection Features
Usage-Based Features
- Minutes per game
- Field goal attempts
- Shot attempts in paint
- Usage rate
- Touches per game
Matchup Features
- Opponent defensive rating
- Position-specific defense
- Pace of opponent
- Home/away adjustments
- Rest days
Context Features
- Vegas total (implied pace)
- Spread (game script)
- Teammate injuries
- Back-to-back indicator
- Season phase
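A few of these features can be derived from a raw game log, as in the base R sketch below. The data frame, its column names, and every value are hypothetical. The implied team total uses the common convention that a negative spread marks the favorite.

```r
# Hypothetical raw game log; all column names and values are made up for illustration
games <- data.frame(
  game_date   = as.Date(c("2024-01-01", "2024-01-02", "2024-01-05", "2024-01-07")),
  location    = c("home", "away", "away", "home"),
  minutes     = c(34, 31, 36, 33),
  fga         = c(18, 15, 20, 17),
  vegas_total = c(228.5, 221.0, 235.5, 224.0),
  spread      = c(-4.5, 3.0, -1.5, -7.0)
)
games <- games[order(games$game_date), ]

# Rest days: full days off since the previous game (NA for the first game)
games$rest_days <- c(NA, as.numeric(diff(games$game_date)) - 1)

# Back-to-back indicator: played the day before
games$back_to_back <- as.integer(!is.na(games$rest_days) & games$rest_days == 0)

# Home/away as a 0/1 indicator
games$home <- as.integer(games$location == "home")

# Implied team total from the Vegas total and spread (assumed convention)
games$implied_team_total <- games$vegas_total / 2 - games$spread / 2

# Shot volume per minute as a simple usage proxy
games$fga_per_min <- games$fga / games$minutes

games[, c("game_date", "rest_days", "back_to_back", "home", "implied_team_total")]
```

The first game has no previous game, so its rest days are left as NA rather than guessed.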
R Code Equivalent
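# Linear regression for player projections
# Assumes player_games is a data frame with one row per player-game and the
# columns points, minutes, opponent_def_rating, rest_days, and home (0/1)
library(lm.beta)  # standardized coefficients

# Fit model
model <- lm(points ~ minutes + opponent_def_rating + rest_days + home,
            data = player_games)

# Summary: coefficients, standard errors, R-squared
summary(model)

# Predict for tonight's game
new_data <- data.frame(
  minutes = 34,
  opponent_def_rating = 110,
  rest_days = 2,
  home = 1
)
# Point projection with a 95% prediction interval for a single game
predict(model, new_data, interval = "prediction")

# Feature importance (standardized coefficients)
lm.beta(model)

Key Takeaways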
- Linear regression predicts continuous outcomes
- More data → lower standard errors → more precise estimates
- R² measures in-sample explanatory power, which does not guarantee out-of-sample predictive accuracy
- Feature engineering is key for sports projections
- Use cross-validation to detect overfitting and estimate out-of-sample error
- Consider Poisson regression for count data (TDs, HRs)
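As one way to act on the cross-validation point, here is a base R sketch of 5-fold cross-validation that compares a minutes-only model with a fuller one. The `player_games` data are simulated here, and the coefficients used to generate `points` are assumptions.

```r
# Simulated stand-in for player_games (all generating values are hypothetical)
set.seed(11)
n <- 200
player_games <- data.frame(
  minutes             = runif(n, 15, 40),
  opponent_def_rating = rnorm(n, 112, 4),
  rest_days           = sample(0:3, n, replace = TRUE),
  home                = rbinom(n, 1, 0.5)
)
player_games$points <- 5 + 0.8 * player_games$minutes +
  0.2 * (player_games$opponent_def_rating - 112) +
  1.0 * player_games$home + rnorm(n, sd = 3)

# Randomly assign each game to one of k folds
k <- 5
fold <- sample(rep(1:k, length.out = n))

# Average out-of-fold RMSE for a given model formula
cv_rmse <- function(formula) {
  errs <- sapply(1:k, function(i) {
    train <- player_games[fold != i, ]
    test  <- player_games[fold == i, ]
    fit   <- lm(formula, data = train)
    sqrt(mean((test$points - predict(fit, newdata = test))^2))
  })
  mean(errs)
}

rmse_simple <- cv_rmse(points ~ minutes)
rmse_full   <- cv_rmse(points ~ minutes + opponent_def_rating + rest_days + home)
c(simple = rmse_simple, full = rmse_full)
```

Each model is scored only on games it was not fitted to, so a feature that merely fits noise will not lower the cross-validated RMSE the way it raises in-sample R².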