Machine Learning Interactive
Model Ensembling
Combine predictions from multiple models for better accuracy. Learn stacking, blending, and optimal weight selection for sports prediction models.
The Wisdom of Crowds

- Condorcet Jury Theorem: if each model is independently better than 50% accurate, the accuracy of a majority vote approaches 100% as the number of models grows.
- Error Cancellation: independent errors average out; bias remains, but variance decreases.
- Complementarity: different models capture different patterns in the data.
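The Condorcet effect is easy to check by simulation. A minimal R sketch (the function name and trial count are illustrative, not from the page):

```r
# Simulate independent binary classifiers, each correct with probability p,
# and measure how often a strict majority of them is correct.
majority_vote_accuracy <- function(n_models, p, n_trials = 10000) {
  set.seed(42)
  # Each row is one trial; 1 = that model is correct, 0 = wrong
  correct <- matrix(rbinom(n_trials * n_models, 1, p),
                    nrow = n_trials, ncol = n_models)
  mean(rowMeans(correct) > 0.5)  # strict majority correct
}

majority_vote_accuracy(1, 0.6)   # single model: close to 0.60
majority_vote_accuracy(15, 0.6)  # theory: 1 - pbinom(7, 15, 0.6), about 0.79
```

With 15 models that are each only 60% accurate, the majority vote already approaches 80%; the gain assumes independence, which real models only approximate.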
Ensemble Configuration (interactive demo)

[Interactive widget: sliders set the number of models (2-10), base model accuracy (55-75%), and model correlation (0.1-0.8); individual model accuracies are shown for RF, Linear, XGBoost, CatBoost, and LightGBM.]

Example result from the demo:

- Best single model: 63.9%
- Ensemble accuracy: 72.2%
- Ensemble lift: +8.3%
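The way correlation caps the ensemble gain can be reproduced offline. A hedged R sketch (the parameter values are illustrative, not the widget's exact numbers):

```r
# Build k models whose errors share a common component (pairwise
# correlation rho), then compare single-model MSE to the averaged ensemble.
set.seed(1)
n   <- 5000   # observations
k   <- 5      # models
rho <- 0.3    # assumed pairwise error correlation

truth  <- rnorm(n)
shared <- rnorm(n)  # error component common to all models
errors <- sapply(1:k, function(i) sqrt(rho) * shared +
                                  sqrt(1 - rho) * rnorm(n))
preds  <- truth + errors  # each column: one model's predictions

single_mse   <- mean((preds[, 1] - truth)^2)       # about 1
ensemble_mse <- mean((rowMeans(preds) - truth)^2)  # about rho + (1 - rho)/k
c(single = single_mse, ensemble = ensemble_mse)
```

The correlated part of the error (rho) never averages out, which is why the demo's lift shrinks as the correlation slider rises.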
Stacking Architecture

1. Base Models: XGBoost, Random Forest, LightGBM, CatBoost (diverse tree-based models)
2. Meta Features: out-of-fold (OOF) predictions, rank features, confidence scores
3. Meta Learner: Logistic Regression or Ridge, trained to learn the optimal blend
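The key mechanic in step 2 is that each row's meta-feature comes from a fold that did not train on that row. A minimal R sketch (the helper name and the linear base model are illustrative):

```r
# Generate out-of-fold predictions: for each fold, fit on the other
# folds and predict only the held-out rows, so the meta-learner never
# sees in-fold (leaked) predictions.
make_oof <- function(x, y, k = 5, fit_fun, pred_fun) {
  folds <- sample(rep(1:k, length.out = length(y)))
  oof <- numeric(length(y))
  for (f in 1:k) {
    fit <- fit_fun(x[folds != f, , drop = FALSE], y[folds != f])
    oof[folds == f] <- pred_fun(fit, x[folds == f, , drop = FALSE])
  }
  oof
}

# Usage with a plain linear base model on synthetic data
set.seed(7)
x <- matrix(rnorm(200 * 3), ncol = 3)
y <- as.vector(x %*% c(1, -2, 0.5)) + rnorm(200)
oof_lm <- make_oof(x, y, k = 5,
                   fit_fun  = function(x, y) lm.fit(cbind(1, x), y),
                   pred_fun = function(fit, x)
                     as.vector(cbind(1, x) %*% fit$coefficients))
```

Stacking several such OOF columns side by side gives the meta-learner's training matrix.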
Blending Strategies

- Simple Average: (1/n) × Σ predictionᵢ. Best when: equal model quality.
- Weighted Average: Σ wᵢ × predictionᵢ. Best when: known model quality.
- Rank Average: average of per-model ranks. Best when: predictions are on different scales.
- Geometric Mean: (∏ predictionᵢ)^(1/n). Best when: multiplicative effects.
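The four blends above fit in a few lines of R. A sketch assuming `preds` is a matrix with one column per model, values on a common probability scale (function names are illustrative):

```r
simple_avg   <- function(preds) rowMeans(preds)
weighted_avg <- function(preds, w) as.vector(preds %*% (w / sum(w)))
# Rank each model's predictions, then average the ranks (scale-free)
rank_avg     <- function(preds) rowMeans(apply(preds, 2, rank)) / nrow(preds)
# Geometric mean via logs: exp(mean(log p)) = (prod p)^(1/n)
geo_mean     <- function(preds) exp(rowMeans(log(preds)))
```

The geometric mean requires strictly positive predictions; clip probabilities away from 0 before using it.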
Practical Tips for Sports Models

Model Selection

- Include different algorithm families (trees, linear, neural)
- Use different feature subsets per model
- Vary hyperparameters for diversity
- Remove highly correlated models (correlation > 0.9)
Weight Optimization

- Use cross-validation for weight selection
- Constrain weights to sum to 1 (a convex combination)
- Consider time-weighted ensembles to handle drift
- A simple average often beats complex optimization
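Convex weight selection can be sketched with base R's `optim` and a softmax reparameterisation, so the weights are non-negative and sum to 1 by construction (the function name and synthetic data are illustrative; in practice `oof_preds` would be held-out OOF predictions):

```r
# Choose convex blend weights that minimise log loss on OOF predictions.
optimise_weights <- function(oof_preds, y) {
  obj <- function(z) {
    w <- exp(z) / sum(exp(z))  # softmax -> non-negative, sums to 1
    p <- pmin(pmax(as.vector(oof_preds %*% w), 1e-15), 1 - 1e-15)
    -mean(y * log(p) + (1 - y) * log(1 - p))  # log loss
  }
  z <- optim(rep(0, ncol(oof_preds)), obj)$par
  exp(z) / sum(exp(z))
}

# Usage: one informative model, one pure-noise model
set.seed(3)
n <- 1000
y <- rbinom(n, 1, 0.5)
good  <- pmin(pmax(ifelse(y == 1, 0.75, 0.25) + rnorm(n, 0, 0.1), 0.01), 0.99)
noise <- runif(n)
w <- optimise_weights(cbind(good, noise), y)  # most weight on `good`
```

Even here, compare the optimised blend against the simple average on a holdout before trusting it.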
R Code Equivalent

# Model stacking with caret
library(caret)
library(caretEnsemble)

# Define base models on shared CV folds
model_list <- caretList(
  outcome ~ .,
  data = train_data,
  trControl = trainControl(
    method = "cv",
    number = 5,
    savePredictions = "final"
  ),
  methodList = c("xgbTree", "rf", "glmnet")
)

# Check model correlations (consider dropping pairs above 0.9)
modelCor(resamples(model_list))

# Stack with a meta-learner
stack <- caretStack(
  model_list,
  method = "glm",  # simple meta-learner
  trControl = trainControl(method = "cv", number = 5)
)

# Predict on held-out data
ensemble_pred <- predict(stack, test_data)

# Simple weighted ensemble (defaults to an equal-weight average)
blend_predictions <- function(predictions, weights = NULL) {
  if (is.null(weights)) {
    weights <- rep(1 / ncol(predictions), ncol(predictions))
  }
  as.vector(as.matrix(predictions) %*% weights)
}

Key Takeaways
- Diversity is more important than individual accuracy
- Stacking: train a meta-learner on out-of-fold (OOF) predictions
- Simple averaging is a strong baseline
- More models bring diminishing returns beyond 5-7
- Validate the ensemble on a holdout set, not only in CV
- Remove correlated models to improve diversity