Hierarchical Models
Model data with natural grouping structure. Borrow strength across groups while respecting heterogeneity. Perfect for small-sample sports data.
The Small Sample Problem
The Problem
A player has played 3 games this season and averages 28 points per game (PPG). Do we really believe they're a 28 PPG player?
Small samples have high variance, so extreme values are likely to be noise.
The Solution: Shrinkage
"Shrink" the estimate toward a population mean: the more data a group has, the less its estimate shrinks; the less data, the more it shrinks.
Hierarchical models do this automatically.
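To make the shrinkage idea concrete, here is a minimal Python sketch of a precision-weighted estimate under a simple normal-normal model. The league mean of 12 PPG, the between-player standard deviation of 5, and the game-to-game standard deviation of 8 are illustrative assumptions, not values from this page, and `shrink_estimate` is a hypothetical helper name.

```python
def shrink_estimate(sample_mean, n, pop_mean, between_var, within_var):
    """Shrink a group's sample mean toward the population mean.

    The weight placed on the population mean is the pooling factor:
        (within_var / n) / (between_var + within_var / n)
    """
    if n <= 0:
        # No data at all: fall back entirely on the population mean
        return pop_mean, 1.0
    sampling_var = within_var / n
    pooling = sampling_var / (between_var + sampling_var)
    estimate = pooling * pop_mean + (1.0 - pooling) * sample_mean
    return estimate, pooling


# Hypothetical inputs (assumptions for illustration only)
POP_MEAN = 12.0          # assumed league-average PPG
BETWEEN_VAR = 5.0 ** 2   # assumed variance of true PPG across players
WITHIN_VAR = 8.0 ** 2    # assumed game-to-game variance for one player

estimate, pooling = shrink_estimate(28.0, 3, POP_MEAN, BETWEEN_VAR, WITHIN_VAR)
print(f"3 games:  pooling={pooling:.2f}, estimate={estimate:.1f} PPG")

estimate_60, pooling_60 = shrink_estimate(28.0, 60, POP_MEAN, BETWEEN_VAR, WITHIN_VAR)
print(f"60 games: pooling={pooling_60:.2f}, estimate={estimate_60:.1f} PPG")
```

Under these assumed values, the 3-game estimate lands near 20.6 PPG, while the same 28 PPG average sustained over 60 games would stay near 27.3 PPG.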
Pooling Analysis
Low pooling: estimates stay close to raw averages (large samples or high between-group variance)
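The dependence on sample size and between-group variance can be tabulated directly. The sketch below uses the pooling-factor formula from the R code further down this page; the within-group variance of 100 and the grid of values are assumptions chosen only for illustration.

```python
def pooling_factor(n, between_var, within_var):
    """Weight on the global mean: 0 means no pooling, 1 means complete pooling."""
    sampling_var = within_var / n
    return sampling_var / (between_var + sampling_var)


WITHIN_VAR = 100.0  # assumed within-group variance (illustrative)

rows = []
for between_var in (4.0, 25.0, 100.0):
    for n in (2, 10, 50):
        rows.append((between_var, n, pooling_factor(n, between_var, WITHIN_VAR)))

print(f"{'between_var':>11} {'n':>4} {'pooling':>8}")
for between_var, n, p in rows:
    print(f"{between_var:>11.0f} {n:>4d} {p:>8.0%}")
```

Reading down the table, the pooling factor falls as `n` grows and as `between_var` grows, which is the "low pooling" regime described above.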
Pooling Strategies
Complete Pooling
Ignore groups, one estimate
Pro: Low variance
Con: High bias
No Pooling
Separate estimate per group
Pro: No bias
Con: High variance
Partial Pooling
Shrink toward global mean
Pro: Balanced
Con: Assumes structure
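The three strategies can be compared side by side on a toy data set. This is a rough sketch: the scores are made up, and the variance components come from a simple method-of-moments calculation chosen for brevity, not from the ML or REML fit a mixed-model package would produce.

```python
from statistics import mean

# Hypothetical points scored per game for three teams (made-up data)
games = {
    "A": [118, 104, 115, 105, 111, 117, 108, 110],
    "B": [88, 108],
    "C": [94, 106, 97, 103],
}

# Complete pooling: one estimate for everyone
all_points = [p for pts in games.values() for p in pts]
complete = mean(all_points)

# No pooling: each team's raw mean
raw = {team: mean(pts) for team, pts in games.items()}
sizes = {team: len(pts) for team, pts in games.items()}

# Rough variance components by method of moments (an assumption for brevity)
ss_within = sum((p - raw[t]) ** 2 for t, pts in games.items() for p in pts)
within_var = ss_within / (len(all_points) - len(games))
mean_of_means = mean(list(raw.values()))
var_of_means = sum((m - mean_of_means) ** 2 for m in raw.values()) / (len(raw) - 1)
avg_sampling_var = mean([within_var / n for n in sizes.values()])
between_var = max(var_of_means - avg_sampling_var, 0.0)

# Global mean: precision-weighted average of the team means
weights = {t: 1.0 / (between_var + within_var / sizes[t]) for t in games}
global_mean = sum(weights[t] * raw[t] for t in games) / sum(weights.values())

# Partial pooling: shrink each raw mean toward the global mean
pooling = {
    t: (within_var / sizes[t]) / (between_var + within_var / sizes[t])
    for t in games
}
partial = {t: pooling[t] * global_mean + (1 - pooling[t]) * raw[t] for t in games}

print(f"complete pooling: {complete:.1f} for every team")
for t in games:
    print(f"team {t} (n={sizes[t]}): no pooling {raw[t]:.1f}, "
          f"partial {partial[t]:.1f}, pooling factor {pooling[t]:.0%}")
```

Team B, with only two games, gets the largest pooling factor and moves furthest toward the global mean, while team A's eight games keep its estimate close to its raw average.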
Betting Applications
Player Props
Shrink early-season stats toward career/positional average
Team Ratings
Estimate team strength accounting for roster turnover
Situational Splits
Home/away, day/night with limited data
New Customers
Estimate lifetime value (LTV) from only a few transactions
R Code Equivalent
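# Hierarchical model with lme4
# (assumes a data frame `df` with columns `points` and `team`)
library(lme4)
# Random intercepts model: each team's intercept is drawn from a common distribution
model <- lmer(points ~ 1 + (1 | team), data = df)
# Extract estimates
fixed <- fixef(model)[["(Intercept)"]]  # Global mean (a single number)
random <- ranef(model)$team             # Team deviations from the global mean
shrunk_estimates <- fixed + random[["(Intercept)"]]
names(shrunk_estimates) <- rownames(random)  # same values as coef(model)$team
# Compare to raw means
raw_means <- aggregate(points ~ team, data = df, FUN = mean)
# Shrinkage factor: look up variance components by name rather than by row position
variance_components <- as.data.frame(VarCorr(model))
between_var <- variance_components$vcov[variance_components$grp == "team"]
within_var <- variance_components$vcov[variance_components$grp == "Residual"]
n_per_group <- 10  # assumed games per team; table(df$team) gives the actual counts
pooling <- (within_var / n_per_group) / (between_var + within_var / n_per_group)
cat(sprintf("Pooling factor: %.0f%%\n", pooling * 100))
Key Takeaways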
- Hierarchical models handle grouped data
- Shrinkage reduces noise in small samples
- More shrinkage with fewer observations
- Partial pooling = best of both worlds
- Essential for early-season projections
- Use lme4 (R) or PyMC (Python)