
Log Loss (Cross-Entropy)

Measures how well probability predictions match actual outcomes. Heavily penalizes confident wrong predictions. The standard loss for classification.

📊 The Log Loss Formula

LogLoss = -1/N × Σ[y·log(p) + (1-y)·log(1-p)]
  • y = Actual outcome (0 or 1)
  • p = Predicted probability
  • N = Number of predictions

Key Properties

  • 0 = Perfect predictions
  • 0.693 = Random (50/50 guessing)
  • ∞ = 100% confident and wrong

Single Prediction

Worked example: predicted probability p = 0.70, actual outcome y = 1.

📊 Result: LogLoss = -log(0.70) = 0.3567

Moderate loss. The prediction was uncertain.

Log Loss Penalty Curves

When the actual outcome is 1, low-probability predictions are heavily penalized: the loss -log(p) rises steeply as p approaches 0. When the actual outcome is 0, the mirror holds: -log(1-p) rises steeply as p approaches 1.
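
A minimal base-R sketch that redraws the two penalty curves (colors and axis labels are choices made here, not taken from the original figure):

# Penalty curves: log loss as a function of the predicted probability p
curve(-log(x), from = 0.01, to = 0.99, col = "darkgreen",
      xlab = "Predicted probability p", ylab = "Log loss")          # actual = 1
curve(-log(1 - x), from = 0.01, to = 0.99, col = "red", add = TRUE) # actual = 0
legend("top", legend = c("actual = 1", "actual = 0"),
       col = c("darkgreen", "red"), lty = 1)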

Model Comparison

  • Well-Calibrated Model: 0.507
  • Overconfident Model: 1.636
  • Random Baseline: 0.693

Insight: Overconfident models get punished hard by log loss when they're wrong, even if their accuracy is similar.
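
The specific numbers above come from the page's example data, but the effect is easy to reproduce. A hypothetical simulation sketch (all values invented here): two models that always pick the same side of games with a true 60% win probability, so their accuracy is identical while their log losses diverge.

# Hypothetical simulation: every game has a true win probability of 60%
set.seed(42)
actual <- rbinom(1000, 1, 0.6)

calibrated    <- rep(0.60, 1000)  # predicts the true probability
overconfident <- rep(0.95, 1000)  # same pick (both > 0.5), inflated confidence

mean_log_loss <- function(p, y) mean(-(y * log(p) + (1 - y) * log(1 - p)))
mean_log_loss(calibrated, actual)     # ~0.67: close to the best achievable here
mean_log_loss(overconfident, actual)  # ~1.23: each losing game costs -log(0.05) ≈ 3.0

Both models grade identically on accuracy (they always pick the favorite), yet log loss separates them sharply.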

📊 Log Loss vs Accuracy

Accuracy

Only cares about correct/incorrect. Ignores confidence.

  • 51% prediction, actual=1 → ✓ Correct
  • 99% prediction, actual=1 → ✓ Correct (same credit)
  • Doesn't reward calibrated probabilities

Log Loss

Measures confidence AND correctness. Penalizes overconfidence.

  • 51% prediction, actual=1 → Loss: 0.67
  • 99% prediction, actual=1 → Loss: 0.01 (much better)
  • Rewards well-calibrated probabilities
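
The contrast takes two lines of R, using the same two predictions as above:

preds <- c(0.51, 0.99)     # two predictions, both with actual outcome = 1
mean((preds > 0.5) == 1)   # accuracy: 1.0, identical credit for both
-log(preds)                # log loss per prediction: 0.673 vs 0.010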

๐Ÿ€ Sports Betting Applications

When to Use Log Loss

  → Evaluating probability predictions (not just picks)
  → Training classification models
  → Comparing model calibration

Practical Tips

  → Lower is better (unlike accuracy)
  → 0.693 = random baseline (beat this!)
  → Clip probabilities to [0.01, 0.99] for stability

R Code Equivalent

# Calculate Log Loss
log_loss <- function(predicted, actual, eps = 1e-15) { 
  # Clip predictions to avoid log(0)
  predicted <- pmax(eps, pmin(1 - eps, predicted))
  
  # Calculate loss
  loss <- -(actual * log(predicted) + (1 - actual) * log(1 - predicted))
  
  return(mean(loss))
}

# Example
pred <- c(0.7)
actual <- c(1)
ll <- log_loss(pred, actual)
cat(sprintf("Log Loss: %.4f\n", ll))

# Compare to baseline
baseline_loss <- log(2)  # Random 50/50
cat(sprintf("Baseline (random): %.4f\n", baseline_loss))
cat(sprintf("Improvement: %.1f%%\n", (1 - ll/baseline_loss) * 100))

✅ Key Takeaways

  • Log loss: lower is better (0 = perfect)
  • Heavily penalizes confident wrong predictions
  • 0.693 = random baseline (50/50 guessing)
  • Better than accuracy for probability evaluation
  • Use for training classification models
  • Clip predictions to avoid infinity
