A/B Testing Frameworks
Design and analyze experiments to test pricing changes, payout structures, and features with statistical rigor.
A/B Testing Fundamentals
- Hypothesis: the treatment will improve the metric by X%
- Randomization: random assignment of users to control and treatment
- Sample Size: enough data to achieve statistical power
- Significance: the p-value indicates whether the observed effect is real
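Randomization is straightforward in code. A minimal sketch in base R (the user IDs and the 50/50 split are illustrative assumptions, not from the dashboard):

# Randomly assign users 50/50 to control or treatment
set.seed(42)                      # reproducible assignment (illustrative)
user_ids <- 1:10000               # illustrative user population
assignment <- sample(c("control", "treatment"),
                     length(user_ids), replace = TRUE)
table(assignment)                 # groups should come out roughly equal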
Experiment Settings
[Interactive controls: sliders for the baseline rate, expected lift, and sample size per group, plus a confidence level setting. Higher confidence means fewer false positives but requires more samples.]
Required Sample Size: 31,199 per group for 80% power at 95% confidence
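The confidence trade-off is easy to check with base R's power.prop.test. This sketch reuses the dashboard's 5% baseline and 10% expected lift and varies only the significance level:

# Required n per group at 90%, 95%, and 99% confidence (80% power)
sapply(c(0.10, 0.05, 0.01), function(alpha)
  power.prop.test(p1 = 0.05, p2 = 0.055,
                  power = 0.8, sig.level = alpha)$n)
# Required sample size grows as the significance level shrinks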
Experiment Results
- Control: 5.0% (50 / 1000)
- Treatment: 5.5% (55 / 1000)
- Observed Lift: +10.0%
- Z-Score: 0.50
- p-value: 0.616
- Verdict: ✗ Not Significant
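These figures can be cross-checked in base R: prop.test with correct = FALSE runs the same pooled two-proportion test (its chi-square statistic is the z-score squared):

# Cross-check the dashboard result with base R
prop.test(x = c(55, 50), n = c(1000, 1000), correct = FALSE)
# Reports p-value ~ 0.616, matching z = 0.50 above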
[Chart: Conversion Rate Over Time. Rates converge as sample size increases; early reads are noisy.]
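The convergence the chart shows is easy to simulate. This sketch assumes a true conversion rate of 5% and tracks the running estimate as data accumulates:

# Early reads are noisy: running conversion rate as data accumulates
set.seed(1)
conversions <- rbinom(10000, size = 1, prob = 0.05)  # assumed true rate: 5%
running_rate <- cumsum(conversions) / seq_along(conversions)
running_rate[c(10, 100, 1000, 10000)]  # estimate settles toward 0.05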
Common Pricing Experiments
- Payout Structure: A = 2x multiplier, B = 1.9x + bonus. Metric: Retention
- Hold Rate: A = 5% hold, B = 4% hold. Metric: Handle Volume
- UI Design: A = current flow, B = new parlay flow. Metric: Entries/User
- Promo Type: A = $10 free bet, B = 50% deposit match. Metric: Deposits
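One caveat: metrics like handle volume, entries per user, and deposits are continuous, so the two-proportion z-test used elsewhere in this section does not apply to them. A sketch of the usual alternative, a two-sample t-test, on simulated entry counts (the Poisson means are illustrative assumptions):

# Continuous metric (entries per user): two-sample t-test
set.seed(7)
entries_a <- rpois(1000, lambda = 3.0)  # control (illustrative)
entries_b <- rpois(1000, lambda = 3.2)  # treatment (illustrative)
t.test(entries_b, entries_a)            # Welch two-sample t-test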
A/B Testing Best Practices
Do
- ✓ Pre-register hypothesis and sample size
- ✓ Run until the predetermined sample is reached
- ✓ Use guardrail metrics (retention, complaints)
- ✓ Segment results (new vs. returning users)
Don't
- ✗ Peek at results and stop early once significant
- ✗ Run multiple tests without correction (see the p.adjust sketch after this list)
- ✗ Ignore long-term effects (measure for weeks)
- ✗ Test on a small segment, then assume the full rollout will behave the same
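When several variants or metrics are tested at once, the per-test p-values need a multiple-comparison correction. Base R's p.adjust covers the standard methods (the p-values below are made up for illustration):

# Adjust p-values from several simultaneous tests
raw_p <- c(0.04, 0.03, 0.20, 0.01)      # illustrative raw p-values
p.adjust(raw_p, method = "bonferroni")  # conservative family-wise control
p.adjust(raw_p, method = "BH")          # Benjamini-Hochberg FDR control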
R Code Equivalent
# A/B test analysis: pooled two-proportion z-test
ab_test <- function(control_conv, treatment_conv, n_per_group) {
  # Observed proportions
  p1 <- control_conv / n_per_group
  p2 <- treatment_conv / n_per_group

  # Pooled proportion and standard error
  p_pooled <- (control_conv + treatment_conv) / (2 * n_per_group)
  se <- sqrt(p_pooled * (1 - p_pooled) * 2 / n_per_group)

  # Two-sided z-test
  z <- (p2 - p1) / se
  p_value <- 2 * (1 - pnorm(abs(z)))

  return(list(lift = (p2 / p1 - 1) * 100, z = z, p_value = p_value))
}

# Sample size calculation: users per group for 80% power
required_n <- power.prop.test(
  p1 = 0.05,
  p2 = 0.055,
  power = 0.8,
  sig.level = 0.05
)$n

# Run the test on the dashboard's numbers
result <- ab_test(50, 55, 1000)
cat(sprintf("Lift: %.1f%%\np-value: %.4f\n", result$lift, result$p_value))
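Running this script reproduces the dashboard's result: a +10.0% observed lift with p ≈ 0.616, not significant at 1,000 users per group, while required_n shows that on the order of 31,000 users per group are needed to reliably detect a 10% lift off a 5% baseline.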
Key Takeaways
- Calculate the required sample size BEFORE starting
- Don't peek; wait for the full sample
- p-value < 0.05 = statistically significant
- Small effects need large samples to detect
- Consider practical significance, not just statistical significance
- Use guardrail metrics to catch negative effects