Detecting Unusual Results in High-Volatility Table Games
A Statistical Framework for Interpreting Drop-to-Hold Discrepancies in VIP Baccarat and High-Limit Blackjack
Abstract
Casino executives evaluate table-game performance through drop, actual win, theoretical win, and hold percentage. These measures are operationally convenient but statistically incomplete, particularly in high-volatility environments such as VIP baccarat, premium blackjack, and short-duration private salon play. Drop does not measure total wagering exposure, and actual hold is not a stable estimator of game quality when outcomes are dominated by a small number of large wagers.
Distinguishing ordinary adverse variance from statistically meaningful anomalies requires tools that are absent from standard casino reporting. The model developed here combines conditional heteroskedastic variance estimation, peaks-over-threshold Extreme Value Theory, and Bayesian posterior updating, yielding an executive decision system that classifies negative hold events into three operational categories: mathematically expected variance, statistically rare but plausible variance, and anomaly-prioritized events warranting surveillance, game-protection, or procedural review.
Management should not ask whether a shift result is above or below theoretical hold. The productive question is whether the result is abnormal after conditioning on wager concentration, game pace, player composition, dealer-player pairings, rule variations, and historical tail behavior.
Keywords: casino analytics, baccarat variance, table games hold, Extreme Value Theory, Generalized Pareto Distribution, Bayesian inference, anomaly detection, heteroskedasticity, game protection, VIP gaming.
1. Introduction: The Drop-to-Hold Fallacy
The hold percentage is one of the most familiar ratios in casino management:
[ \text{Hold Percentage} = \frac{\text{Actual Win}}{\text{Drop}}. ]
A table drops $1,000,000 and wins $150,000; management sees a 15% hold. A table drops $1,000,000 and loses $250,000; management sees a negative 25% hold and asks whether the loss was “normal variance.” In low-limit mass-market games, this intuition may be serviceable — many small wagers diversify outcome risk. In VIP play, however, the same ratio becomes a dangerous simplification.
Drop is a cash-flow measure, not a complete exposure measure. It records the amount exchanged into chips, credit markers, or other table-game value instruments, but it does not capture total amount wagered, number of decisions, bet dispersion, side-bet composition, shoe penetration, player concentration, or the clustering of large wagers around specific dealers, tables, shoes, or time windows. Two shifts may show identical drop while having radically different mathematical risk. A grind-pit shift may contain thousands of low-denomination decisions. A VIP baccarat shift may contain a small number of decisions in which a single patron’s maximum wager dominates the entire risk profile.
Casino losses are not merely deviations from theoretical hold. They are realizations from a conditional distribution whose variance changes with the underlying structure of play. In statistical terms, high-limit table games are heteroskedastic: the variance of actual win is not constant but depends on the size, concentration, and sequence of wagers.
The right question is not whether actual hold is above or below a historical average. The relevant object is the loss residual against conditional expectation:
[ L_t = E(W_t \mid \mathcal{F}_t) - W_t, ]
where (W_t) is actual casino win for period (t), (E(W_t \mid \mathcal{F}_t)) is expected win conditional on observed game information, and (\mathcal{F}_t) represents the operational information available at the time: game type, rules, decisions per hour, bet sizes, player identities, dealer assignments, table conditions, fills and credits, marker activity, and known procedural exceptions.
A large loss is not automatically suspicious. A small loss is not automatically innocent. The relevant question is:
[ P(L_t > x \mid \mathcal{F}_t, \text{fair game conditions}). ]
If that probability is ordinary, the loss should be treated as variance. If it is extreme but plausible, it should be documented and monitored. If it is too rare after conditioning on the real exposure structure, the event should be elevated to an anomaly investigation.
2. Operational Definitions
Before the statistical analysis can proceed, several foundational concepts need to be distinguished from one another.
2.1 Drop
Drop, denoted (D_t) for period (t), is the value placed into the table drop box or otherwise recorded as table-game statistical drop under the casino’s accounting rules. It is an accounting and control measure, not a measure of total wagering volume.
2.2 Handle or Total Wagering Exposure
Let (H_t) denote total wagering exposure during period (t):
[ H_t = \sum_{j=1}^{n_t} b_{jt}, ]
where (b_{jt}) is the amount wagered on decision (j), and (n_t) is the number of wagering decisions during period (t).
For baccarat and blackjack, (H_t) is generally unobserved unless the casino has accurate rating, electronic bet tracking, RFID chips, or robust manual recording. In practice, (H_t) must often be estimated from average bet, hands per hour, player time on game, and table utilization.
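Where only rating data exists, a first-pass estimate of (H_t) multiplies average bet, decisions per hour, and time on game for each rated player. The sketch below is a minimal illustration under that assumption; the `Rating` record and its field names are hypothetical, and table utilization is ignored.

```python
from dataclasses import dataclass

@dataclass
class Rating:
    """One player rating record (hypothetical field names)."""
    avg_bet: float          # estimated average wager, in dollars
    hands_per_hour: float   # estimated decisions per hour
    hours_played: float     # time on game, in hours

def estimate_handle(ratings):
    """Approximate H_t as the sum of avg_bet * hands_per_hour * hours_played."""
    return sum(r.avg_bet * r.hands_per_hour * r.hours_played for r in ratings)

# Illustrative ratings only; not data from any property.
ratings = [Rating(500.0, 60.0, 2.0), Rating(10_000.0, 40.0, 0.5)]
handle = estimate_handle(ratings)  # 60,000 + 200,000
```

An estimate built this way inherits every error in the rating inputs, so it should be treated as a rough exposure figure rather than a measured handle.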
2.3 Actual Win
Actual win (W_t) is the realized casino result:
[ W_t = \sum_{j=1}^{n_t} Y_{jt}, ]
where (Y_{jt}) is the casino’s profit or loss on decision (j).
2.4 Theoretical Win
Theoretical win is the expected casino win under fair rules and accurate play assumptions:
[ TW_t = E(W_t \mid \mathcal{F}_t). ]
For a single wager (j),
[ E(Y_{jt} \mid \mathcal{F}_t) = h_{g(j)} b_{jt}, ]
where (h_{g(j)}) is the house advantage for the game and wager type (g(j)). Thus,
[ TW_t = \sum_{j=1}^{n_t} h_{g(j)} b_{jt}. ]
2.5 Drop-to-Hold Discrepancy
The discrepancy of interest is not merely:
[ \frac{W_t}{D_t} - \text{Historical Hold}. ]
The more meaningful object is the conditional residual:
[ R_t = W_t - TW_t. ]
For loss analysis, define:
[ L_t = -R_t = TW_t - W_t. ]
A large (L_t) indicates that the casino underperformed theoretical expectation.
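The definitions of (TW_t), (R_t), and (L_t) can be sketched in a few lines. The house-advantage values below are commonly quoted approximations for eight-deck baccarat and are assumptions for illustration; a property would substitute its own game mathematics. The function names are hypothetical.

```python
# Assumed house advantages per wager type (illustrative approximations only).
HOUSE_EDGE = {"banker": 0.0106, "player": 0.0124}

def theoretical_win(wagers):
    """TW_t = sum of h_g(j) * b_jt over (wager_type, amount) pairs."""
    return sum(HOUSE_EDGE[g] * b for g, b in wagers)

def loss_residual(actual_win, wagers):
    """L_t = TW_t - W_t; positive when the casino underperforms theory."""
    return theoretical_win(wagers) - actual_win

# Three illustrative wagers and a realized casino loss of $8,000.
wagers = [("banker", 10_000.0), ("player", 5_000.0), ("banker", 20_000.0)]
tw = theoretical_win(wagers)
loss = loss_residual(-8_000.0, wagers)
```

With these inputs the theoretical win is about $380, so an $8,000 realized loss corresponds to a loss residual of about $8,380.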
3. Why Standard Deviation Models Fail in VIP Gaming
Many casino reports implicitly assume a stable standard deviation around theoretical win. This assumption is frequently false. The conditional variance of actual win depends on the square of wager size.
Let (X_{jt}) be the unit outcome of wager (j), so that:
[ Y_{jt} = b_{jt}X_{jt}. ]
Assume:
[ E(X_{jt}) = h_{g(j)} ]
and
[ Var(X_{jt}) = v_{g(j)}. ]
Then:
[ Var(Y_{jt}) = b_{jt}^{2}v_{g(j)}. ]
If individual decisions are approximately independent conditional on game state, then:
[ Var(W_t \mid \mathcal{F}_t) \approx \sum_{j=1}^{n_t} b_{jt}^{2}v_{g(j)}. ]
This is the structural problem with drop-based interpretation: variance is driven by (\sum b_{jt}^{2}), not by (\sum b_{jt}) alone and not by drop (D_t).
A mass-market table with 1,000 wagers of $100 has total exposure:
[ H = 1000 \times 100 = 100{,}000, ]
and squared exposure:
[ \sum b_j^2 = 1000 \times 100^2 = 10{,}000{,}000. ]
A VIP table with 10 wagers of $10,000 has the same total exposure:
[ H = 10 \times 10{,}000 = 100{,}000, ]
but squared exposure:
[ \sum b_j^2 = 10 \times 10{,}000^2 = 1{,}000{,}000{,}000. ]
The VIP table has the same handle but one hundred times the variance exposure. Its standard deviation is ten times larger. A loss that appears catastrophic under a drop-based model may be unremarkable under an exposure-weighted one.
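The comparison above can be reproduced directly. This short sketch recomputes handle and squared exposure for the two tables described in the text; the function name is hypothetical.

```python
import math

def exposure_summary(bets):
    """Return (handle, squared exposure) for a list of wager amounts."""
    handle = sum(bets)
    squared = sum(b * b for b in bets)
    return handle, squared

mass = [100.0] * 1000      # 1,000 wagers of $100
vip = [10_000.0] * 10      # 10 wagers of $10,000

h_mass, sq_mass = exposure_summary(mass)
h_vip, sq_vip = exposure_summary(vip)

variance_ratio = sq_vip / sq_mass       # same per-unit variance v assumed
sd_ratio = math.sqrt(variance_ratio)
```

Both tables show a handle of 100,000, while the variance ratio is 100 and the standard-deviation ratio is 10, matching the figures in the text.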
4. Concentration Risk and Effective Number of Decisions
Wager concentration can be summarized by a Herfindahl-style exposure index:
[ C_t = \frac{\sum_{j=1}^{n_t} b_{jt}^{2}}{\left(\sum_{j=1}^{n_t} b_{jt}\right)^2}. ]
The effective number of equal-sized decisions is:
[ N_{\text{eff},t} = \frac{1}{C_t} = \frac{\left(\sum_{j=1}^{n_t} b_{jt}\right)^2}{\sum_{j=1}^{n_t} b_{jt}^{2}}. ]
If all bets are equal, (N_{\text{eff}}) equals the actual number of decisions. If one player or one wager dominates the exposure, (N_{\text{eff}}) collapses.
This has direct implications for VIP management. A shift may contain 500 recorded decisions, but if the economic outcome is dominated by 20 maximum bets, the effective sample size may be closer to 20 than 500. The Law of Large Numbers has not failed; management has simply misidentified the relevant sample size.
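A minimal sketch of the concentration index and effective number of decisions follows. The bet list is an invented illustration of the situation described above (500 recorded decisions, 20 of them very large), not data from any property.

```python
def concentration_index(bets):
    """C_t = sum(b^2) / (sum(b))^2 for a list of wager amounts."""
    total = sum(bets)
    if total <= 0:
        raise ValueError("total wagering exposure must be positive")
    return sum(b * b for b in bets) / total ** 2

def effective_decisions(bets):
    """N_eff = 1 / C_t, the equivalent number of equal-sized decisions."""
    return 1.0 / concentration_index(bets)

equal = [100.0] * 500                          # 500 equal wagers
mixed = [100.0] * 480 + [25_000.0] * 20        # 480 small wagers, 20 large ones

n_equal = effective_decisions(equal)
n_mixed = effective_decisions(mixed)
```

For the equal-bet list (N_{\text{eff}}) is 500; for the mixed list it falls to roughly 24, even though 500 decisions were recorded.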
5. Conditional Heteroskedastic Model
Let (W_t) represent actual casino win during a shift, day, shoe, table period, or player session. Define:
[ W_t = TW_t + \varepsilon_t, ]
where:
[ E(\varepsilon_t \mid \mathcal{F}_t) = 0 ]
under fair conditions, and:
[ Var(\varepsilon_t \mid \mathcal{F}_t) = \sigma_t^2. ]
A basic conditional variance model is:
[ \sigma_t^2 = \sum_{j=1}^{n_t} b_{jt}^{2}v_{g(j)} + 2\sum_{j<k} Cov(Y_{jt},Y_{kt} \mid \mathcal{F}_t). ]
The covariance term matters when results are not independent. In blackjack, card depletion, count-sensitive betting, splitting, doubling, insurance, and shoe conditions can introduce dependence. In baccarat, serial dependence is weaker from the player-decision perspective because drawing rules are fixed, but shoe composition, commission rules, side bets, player bet selection, and bet clustering still affect realized volatility.
A practical regression form is:
[ \sigma_t^2 = \alpha_0 + \alpha_1 H_t + \alpha_2 \sum b_{jt}^{2} + \alpha_3 C_t + \alpha_4 P_t + \alpha_5 S_t + \alpha_6 M_t + \alpha_7 R_t^{rule} + u_t, ]
where:
- (H_t) = estimated total handle,
- (\sum b_{jt}^{2}) = squared wager exposure,
- (C_t) = concentration index,
- (P_t) = pace of play,
- (S_t) = side-bet or proposition-bet exposure,
- (M_t) = marker or credit intensity,
- (R_t^{rule}) = rule or procedure variation indicators.
This model does not replace game mathematics. It calibrates game mathematics to the casino’s actual operating environment.
6. Proposition: Why Drop-Based Z-Scores Misclassify VIP Risk
Proposition
Suppose two shifts have equal total handle (H), equal game mix, and equal theoretical win rate, but different wager concentration. Let Shift A consist of (n) equal wagers of size (b), and Shift B consist of (m) equal wagers of size (c), where (nb = mc = H), and (m < n). Then the conditional variance of Shift B exceeds that of Shift A by the ratio:
[ \frac{Var(W_B)}{Var(W_A)} = \frac{n}{m}. ]
Proof
For equal game type and approximately independent decisions:
[ Var(W_A) = nb^2v, ]
[ Var(W_B) = mc^2v. ]
Because (nb = mc = H), we have (c = H/m) and (b = H/n). Therefore:
[ \frac{Var(W_B)}{Var(W_A)} = \frac{m(H/m)^2v}{n(H/n)^2v} = \frac{H^2v/m}{H^2v/n} = \frac{n}{m}. ]
Fewer, larger wagers create greater variance even when total handle is identical. A drop-based Z-score that ignores this structure will overstate abnormality in concentrated VIP play and understate abnormality in dispersed low-limit play.
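The proposition can also be checked numerically. The simulation below is a rough sketch under simplifying assumptions that are not in the text: every wager pays even money, the house wins each decision with probability 0.505, and decisions are independent. It compares a shift of 400 wagers with a shift of 16 wagers at the same handle, where theory gives a variance ratio of n/m = 25.

```python
import random
import statistics

def simulate_wins(n_bets, bet_size, trials, rng, p_house=0.505):
    """Simulate casino win for many shifts of n_bets equal even-money wagers."""
    wins = []
    for _ in range(trials):
        # +1 when the house wins a decision, -1 when it loses (assumed payoff).
        net = sum(1 if rng.random() < p_house else -1 for _ in range(n_bets))
        wins.append(bet_size * net)
    return wins

rng = random.Random(7)      # fixed seed so the run is repeatable
H = 160_000                 # common handle for both shifts
n, m = 400, 16              # Shift A: many small wagers; Shift B: few large ones

var_a = statistics.variance(simulate_wins(n, H / n, 2000, rng))
var_b = statistics.variance(simulate_wins(m, H / m, 2000, rng))
ratio = var_b / var_a       # should land near n / m = 25, up to sampling noise
```

The simulated ratio will not equal 25 exactly, since both variances are estimated from a finite number of trials, but it should sit close to the theoretical value.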
7. Extreme Value Theory for Catastrophic Loss Events
The heteroskedastic model estimates ordinary conditional variance. Extreme Value Theory addresses the tail — the rare losses that exceed normal operating thresholds.
Let (L_t = TW_t - W_t) denote the casino’s underperformance against theoretical win. Define a high threshold (u), such as the 95th or 97.5th percentile of historical loss residuals after conditioning on exposure and game mix.
The exceedance is:
[ Z_t = L_t - u \quad \text{given} \quad L_t > u. ]
Under the peaks-over-threshold framework, exceedances over a sufficiently high threshold can often be approximated by the Generalized Pareto Distribution:
[ P(Z_t \leq z \mid Z_t > 0) = 1 - \left( 1 + \frac{\xi z}{\beta} \right)^{-1/\xi}, ]
where:
- (\xi) is the shape parameter,
- (\beta > 0) is the scale parameter,
- (1 + \xi z/\beta > 0).
The survival function is:
[ P(Z_t > z) = \left( 1 + \frac{\xi z}{\beta} \right)^{-1/\xi}. ]
The shape parameter (\xi) determines tail heaviness. If (\xi > 0), the tail is heavy, meaning the probability of extreme losses decays slowly, as a power law. In the limit (\xi \to 0), the GPD reduces to the exponential distribution. If (\xi < 0), the tail has a finite upper endpoint at (z = -\beta/\xi).
For casino operations, (\xi) is not merely a statistical parameter. A high positive (\xi) indicates that the operation is exposed to rare but very large adverse results. This may be natural in VIP baccarat, but it may also signal poor game spread control, excessive maximum bets relative to bankroll tolerance, weak table-level exposure monitoring, or unrecognized procedural risk.
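The survival function above is simple to evaluate once (\xi) and (\beta) have been fitted. The sketch below implements it with the standard library only; the parameter values in the example are hypothetical and chosen purely to show the effect of the shape parameter.

```python
import math

def gpd_survival(z, xi, beta):
    """P(Z > z) for an exceedance z >= 0 under a GPD with shape xi, scale beta."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    if z < 0:
        raise ValueError("z must be non-negative")
    if abs(xi) < 1e-12:
        return math.exp(-z / beta)       # exponential limit as xi -> 0
    base = 1.0 + xi * z / beta
    if base <= 0:
        return 0.0                       # beyond the finite endpoint (xi < 0)
    return base ** (-1.0 / xi)

# Hypothetical fit: scale of $40,000, exceedance of $50,000 over the threshold.
heavy = gpd_survival(50_000, xi=0.2, beta=40_000)
expo = gpd_survival(50_000, xi=0.0, beta=40_000)
bounded = gpd_survival(100_000, xi=-0.5, beta=40_000)
```

With these inputs the heavy-tailed case gives about 0.33 against about 0.29 for the exponential case, and the bounded case returns zero because the exceedance lies past the upper endpoint. Multiplying by (P(L_t > u)) converts the conditional value into an unconditional tail probability.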
8. Return Levels and Executive Risk Language
EVT can translate technical tail probabilities into management language.
Let (p_u = P(L_t > u)), the empirical probability that the loss exceeds threshold (u). For a return period (T), the return level (q_T) is the loss magnitude expected to be exceeded once every (T) comparable periods.
A GPD-based approximation is:
[ q_T = u + \frac{\beta}{\xi} \left[ (Tp_u)^{\xi} - 1 \right], \quad \xi \neq 0. ]
If (\xi = 0):
[ q_T = u + \beta \log(Tp_u). ]
Rather than telling executives, “The shift was 3.2 standard deviations below theoretical,” the analyst can say:
“Under the current VIP exposure profile, a loss of this size is expected about once in every 38 comparable shifts.”
That language is more useful. It connects mathematics to operational expectation, budgeting, surveillance escalation, and executive risk appetite.
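The return-level formula and its inverse can be sketched as follows. The fitted values in the example ((u), (\beta), (\xi), (p_u)) are hypothetical, and the helper names are illustrative rather than part of any established library.

```python
import math

def return_level(T, u, beta, xi, p_u):
    """Loss level q_T exceeded on average once every T comparable periods."""
    if beta <= 0 or not 0 < p_u <= 1:
        raise ValueError("need beta > 0 and 0 < p_u <= 1")
    m = T * p_u                  # expected threshold exceedances in T periods
    if m < 1:
        raise ValueError("T * p_u must be at least 1 for q_T to reach u")
    if abs(xi) < 1e-12:
        return u + beta * math.log(m)
    return u + (beta / xi) * (m ** xi - 1.0)

def return_period(x, u, beta, xi, p_u):
    """Inverse mapping: average number of periods per exceedance of loss x >= u."""
    z = x - u
    if z < 0:
        raise ValueError("x must be at or above the threshold u")
    if abs(xi) < 1e-12:
        tail = math.exp(-z / beta)
    else:
        base = 1.0 + xi * z / beta
        tail = base ** (-1.0 / xi) if base > 0 else 0.0
    return math.inf if tail == 0.0 else 1.0 / (p_u * tail)

# Hypothetical fit: u = $100,000, beta = $40,000, xi = 0.2, p_u = 5%.
q_100 = return_level(T=100, u=100_000, beta=40_000, xi=0.2, p_u=0.05)
T_back = return_period(q_100, u=100_000, beta=40_000, xi=0.2, p_u=0.05)
```

Under these assumed parameters the 100-period return level is roughly $176,000, and feeding that loss back through `return_period` recovers a period of 100. The inverse function is what produces statements of the form "once in N comparable shifts" for an observed loss.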
9. Bayesian Inference: From Rare Loss to Anomaly Probability
EVT estimates how rare a loss is under fair conditions. Rare does not mean suspicious by itself — a casino still needs a method for updating the probability that an anomaly is present.
Let:
- (A) = an active anomaly exists,
- (F) = fair game conditions,
- (E_t) = observed evidence at time (t), including loss magnitude, player behavior, dealer pairing, pace anomalies, marker patterns, shoe irregularities, procedural exceptions, and surveillance notes.
The Bayesian posterior probability is:
[ P(A \mid E_t) = \frac{P(E_t \mid A)P(A)} {P(E_t \mid A)P(A) + P(E_t \mid F)P(F)}. ]
Here:
- (P(A)) is the prior probability of anomaly,
- (P(E_t \mid A)) is the likelihood of observing the evidence if an anomaly exists,
- (P(E_t \mid F)) is the likelihood of observing the evidence under fair conditions,
- (P(A \mid E_t)) is the posterior probability after observing the evidence.
EVT contributes to (P(E_t \mid F)). If the loss is extreme but still plausible under the calibrated tail model, then (P(E_t \mid F)) is not negligible. If the loss is nearly impossible under the exposure-adjusted model, then (P(E_t \mid F)) becomes very small, causing the posterior probability of anomaly to rise sharply.
Financial loss alone is insufficient evidence. The signal becomes most meaningful when it coincides with operational irregularities:
[ E_t = \{L_t, C_t, DLR_t, PPR_t, SPR_t, MCR_t, PER_t\}, ]
where:
- (L_t) = loss residual,
- (C_t) = wager concentration,
- (DLR_t) = dealer-loss residual,
- (PPR_t) = player-pairing residual,
- (SPR_t) = shoe/procedure residual,
- (MCR_t) = marker-credit residual,
- (PER_t) = procedural exception residual.
The posterior should rise fastest when financial abnormality and operational abnormality coincide.
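The posterior formula can be illustrated with a small function. The prior and likelihood values below are invented for illustration; in practice (P(E_t \mid F)) would come from the calibrated tail model and (P(E_t \mid A)) from game-protection experience.

```python
def posterior_anomaly(prior, lik_anomaly, lik_fair):
    """P(A | E) from prior P(A) and the two likelihoods, with P(F) = 1 - P(A)."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    numerator = lik_anomaly * prior
    denominator = numerator + lik_fair * (1.0 - prior)
    if denominator <= 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return numerator / denominator

# Assumed 1% prior; the same evidence is taken to be 50% likely under an anomaly.
p_plausible = posterior_anomaly(0.01, lik_anomaly=0.5, lik_fair=0.1)
p_extreme = posterior_anomaly(0.01, lik_anomaly=0.5, lik_fair=0.001)
```

When the evidence remains plausible under fair play, the posterior stays below 5%. When the fair-play likelihood drops to 0.1%, the same prior produces a posterior of about 83%.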
10. Sequential Updating During the Shift
Waiting until month-end to apply Bayesian analysis is too slow. Posterior probabilities can be updated continuously during the shift as new evidence accumulates.
Let (E_{1:t}) represent evidence accumulated from the start of the shift through time (t). If the evidence items are conditionally independent given the true state (anomaly or fair), then:
[ P(A \mid E_{1:t}) \propto P(A) \prod_{s=1}^{t} P(E_s \mid A). ]
For fair conditions:
[ P(F \mid E_{1:t}) \propto P(F) \prod_{s=1}^{t} P(E_s \mid F). ]
The posterior odds form is especially useful:
[ \frac{P(A \mid E_{1:t})}{P(F \mid E_{1:t})} = \frac{P(A)}{P(F)} \prod_{s=1}^{t} \frac{P(E_s \mid A)}{P(E_s \mid F)}. ]
Each new signal contributes a likelihood ratio:
[ LR_s = \frac{P(E_s \mid A)}{P(E_s \mid F)}. ]
If (LR_s > 1), the evidence favors anomaly. If (LR_s < 1), the evidence favors fair variance. This structure allows management to design escalation thresholds.
Example:
| Posterior Probability | Operational Classification | Recommended Action |
|---|---|---|
| (P(A \mid E) < 5%) | Normal variance | Document only |
| 5%–15% | Watch condition | Supervisor review |
| 15%–30% | Elevated anomaly risk | Surveillance bookmark and pit review |
| 30%–50% | High anomaly risk | Immediate game-protection review |
| >50% | Critical anomaly probability | Executive notification and formal investigation |
The exact thresholds should be determined by the casino’s risk appetite, regulatory environment, game protection policy, staffing capacity, and historical false-positive cost.
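The odds-form update and the example escalation table can be combined in a short sketch. The likelihood ratios below are invented, the update assumes conditionally independent signals, and the handling of exact boundary values (for instance, treating exactly 50% as critical) is an assumption, since the table does not specify it.

```python
import math

def update_posterior(prior, likelihood_ratios):
    """Posterior P(A | E_1..t) from prior P(A) and a sequence of LR_s values."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    log_odds = math.log(prior / (1.0 - prior))
    for lr in likelihood_ratios:
        if lr <= 0:
            raise ValueError("likelihood ratios must be positive")
        log_odds += math.log(lr)     # each signal shifts the log-odds additively
    return 1.0 / (1.0 + math.exp(-log_odds))

# Upper bounds and labels taken from the example table in the text.
LEVELS = [
    (0.05, "Normal variance"),
    (0.15, "Watch condition"),
    (0.30, "Elevated anomaly risk"),
    (0.50, "High anomaly risk"),
]

def classify_posterior(p):
    """Map a posterior probability to the example operational classification."""
    for upper, label in LEVELS:
        if p < upper:
            return label
    return "Critical anomaly probability"

# Assumed 1% prior and four illustrative signals observed during a shift.
p = update_posterior(0.01, [3.0, 2.0, 0.8, 4.0])
label = classify_posterior(p)
```

Here three signals favor anomaly and one favors fair variance; the combined likelihood ratio of 19.2 moves a 1% prior to a posterior of about 16%, which falls in the elevated band. Working in log-odds keeps the update numerically stable when many signals accumulate.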
11. A Practical Classification System
Classification matters as much as detection. An alert system that outputs a single binary flag leaves too much analytical work to the reviewer. Negative hold events need three distinct categories.
Class I: Variance-Consistent Loss
A loss is variance-consistent if:
[ P(L_t > x \mid \mathcal{F}_t, F) \geq \alpha_1, ]
where (\alpha_1) may be set at 5% or 10%.
Operational meaning: the result is uncomfortable but mathematically ordinary for the exposure profile.
Class II: Tail-Expected Loss
A loss is tail-expected if:
[ \alpha_2 \leq P(L_t > x \mid \mathcal{F}_t, F) < \alpha_1, ]
where (\alpha_2) may be 0.5% or 1%.
Operational meaning: the result is rare but still explainable by calibrated tail risk. Document it, but do not overreact.
Class III: Anomaly-Prioritized Loss
A loss is anomaly-prioritized if:
[ P(L_t > x \mid \mathcal{F}_t, F) < \alpha_2 ]
and at least one operational risk factor is present:
[ OR_t = 1. ]
Examples of operational risk factors include unusual dealer-player concentration, unexplained pace changes, abnormal credit behavior, repeated losses under the same procedural configuration, unresolved fill/credit discrepancies, or repeated threshold-adjacent losses.
Operational meaning: the financial result and operational context jointly justify investigation.
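The three classes translate into a short decision function. This is a sketch under stated assumptions: the default cut-offs use 5% and 1% from the ranges given above, and the text does not assign a class to a loss below (\alpha_2) with no operational risk factor present, so the function returns a separate label for that case.

```python
def classify_loss(tail_prob, operational_flag, alpha1=0.05, alpha2=0.01):
    """Classify a loss from its fair-game tail probability and the OR_t flag."""
    if not 0.0 < alpha2 < alpha1 < 1.0:
        raise ValueError("need 0 < alpha2 < alpha1 < 1")
    if tail_prob >= alpha1:
        return "Class I: variance-consistent"
    if tail_prob >= alpha2:
        return "Class II: tail-expected"
    if operational_flag:
        return "Class III: anomaly-prioritized"
    # Assumption: this case is not defined in the text, so it is reported separately.
    return "Unassigned: extreme tail, no operational factor"

examples = [
    classify_loss(0.12, operational_flag=False),
    classify_loss(0.02, operational_flag=True),
    classify_loss(0.002, operational_flag=True),
    classify_loss(0.002, operational_flag=False),
]
```

How the unassigned case is treated is a policy decision; a property might fold it into Class II with mandatory documentation or route it to a lighter review.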
12. Dealer-Player Pairing Analysis
One of the most important applications is the detection of repeated abnormality under specific dealer-player combinations.
Let (W_{d,p,t}) be the casino result when dealer (d) and player (p) are jointly present during period (t). Define the residual:
[ R_{d,p,t} = W_{d,p,t} - TW_{d,p,t}. ]
The cumulative standardized pairing residual is:
[ Z_{d,p} = \frac{ \sum_t R_{d,p,t} }{ \sqrt{\sum_t \sigma_{d,p,t}^{2}} }. ]
A single negative result may be variance. Repeated negative residuals under the same pairing are more informative. A Bayesian model can assign each dealer-player pair a latent risk score (\theta_{d,p}):
[ R_{d,p,t} \sim N(\theta_{d,p}, \sigma_{d,p,t}^{2}). ]
Under fair conditions:
[ \theta_{d,p} = 0. ]
Persistent negative (\theta_{d,p}) suggests that the pairing produces results worse than expectation after adjusting for exposure.
A hierarchical prior prevents overreaction to small sample sizes:
[ \theta_{d,p} \sim N(0, \tau^2). ]
The posterior estimate shrinks noisy pairings toward zero while allowing repeated abnormal pairings to emerge.
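The shrinkage effect can be shown with the conjugate normal update. The sketch below treats (\tau) as a known constant, which is a simplification: a full hierarchical model would estimate it from all pairings. The residuals and standard deviations are invented for illustration.

```python
import math

def pairing_z(residuals, sigmas):
    """Cumulative standardized residual: sum(R) / sqrt(sum(sigma^2))."""
    return sum(residuals) / math.sqrt(sum(s * s for s in sigmas))

def shrunk_theta(residuals, sigmas, tau):
    """Posterior mean and sd of theta for R ~ N(theta, sigma^2), theta ~ N(0, tau^2)."""
    precision = 1.0 / tau ** 2 + sum(1.0 / s ** 2 for s in sigmas)
    weighted = sum(r / s ** 2 for r, s in zip(residuals, sigmas))
    return weighted / precision, math.sqrt(1.0 / precision)

# Three sessions for one dealer-player pair; R = W - TW, so negative values
# mean the casino underperformed theory. Values are illustrative.
residuals = [-50_000.0, -80_000.0, -60_000.0]
sigmas = [100_000.0] * 3

z = pairing_z(residuals, sigmas)
theta_hat, theta_sd = shrunk_theta(residuals, sigmas, tau=50_000.0)
```

The raw average residual is about -$63,000 per session, but the posterior mean is pulled to roughly -$27,000 with a posterior standard deviation near $38,000. Three noisy sessions are not enough to separate this pairing from zero, and more sessions with the same pattern would move the estimate further from the prior.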
13. Implementation Architecture
Implementation does not require a perfect historical dataset. What it requires is a disciplined decision about what must be recorded going forward.
13.1 Required Data Fields
Minimum viable fields:
- gaming date,
- shift,
- pit,
- table,
- game type,
- rule variation,
- dealer ID,
- supervisor ID,
- player ID where available,
- time in and time out,
- estimated average bet,
- maximum bet,
- decisions or hands per hour,
- buy-in/drop,
- fills,
- credits,
- markers issued,
- markers redeemed,
- chips in,
- chips out,
- actual win/loss,
- side-bet exposure,
- procedural exceptions,
- surveillance notes.
13.2 Derived Metrics
The system should calculate:
[ TW_t = \sum h_g b_j, ]
[ R_t = W_t - TW_t, ]
[ \sigma_t^2 = \sum b_j^2 v_g, ]
[ Z_t = \frac{R_t}{\sigma_t}, ]
[ C_t = \frac{\sum b_j^2}{(\sum b_j)^2}, ]
[ N_{\text{eff},t} = \frac{1}{C_t}. ]
For thresholds:
[ Z_t^{loss} = \frac{TW_t - W_t}{\sigma_t}. ]
13.3 Model Governance
The system should be maintained across three distinct layers. The mathematical layer covers game rules, house edge, variance constants, and exposure estimates. The statistical layer handles heteroskedastic variance, EVT tail fitting, and Bayesian posterior updating. The operational layer translates outputs into escalation policy: surveillance review, dealer rotation, shoe audit, documentation, and compliance reporting.
No alert should be treated as proof. The system identifies where professional review has the highest expected value.
14. Case Illustration
Two baccarat shifts each report $1,000,000 drop and a $200,000 casino loss.
Shift A: Grind-Like Premium Play
- 2,000 decisions,
- average wager $500,
- low player concentration,
- no unusual dealer-player pairing,
- ordinary pace.
Estimated standard deviation:
[ \sigma_A = \$32{,}000. ]
Loss against theoretical:
[ L_A = \$210{,}000. ]
Standardized loss:
[ Z_A^{loss} = \frac{210{,}000}{32{,}000} = 6.56. ]
A result at 6.56 sigma is statistically extraordinary. Here the accounting report and the risk model agree: the result warrants review.
Shift B: Concentrated VIP Play
- 80 decisions,
- average wager $12,500,
- one player dominates exposure,
- several maximum bets,
- ordinary procedure.
Estimated standard deviation:
[ \sigma_B = \$180{,}000. ]
Loss against theoretical:
[ L_B = \$210{,}000. ]
Standardized loss:
[ Z_B^{loss} = \frac{210{,}000}{180{,}000} = 1.17. ]
A result at 1.17 sigma is unremarkable. The accounting report shows a $200,000 loss and a negative 20% hold. The risk model shows a result that, given the actual exposure structure, could easily recur on any ordinary VIP shift without any operational cause.
The accounting report sees both shifts as “$1,000,000 drop, negative 20% hold.” The risk model sees two entirely different statistical realities.
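The two standardized losses can be converted into rough tail probabilities. The sketch below uses a normal approximation, which is an assumption made only for illustration; it is likely to understate tail probabilities for concentrated play, which is the reason the tail is modeled with EVT elsewhere in this paper.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

loss = 210_000.0             # loss against theoretical, same for both shifts
z_a = loss / 32_000.0        # Shift A: grind-like premium play
z_b = loss / 180_000.0       # Shift B: concentrated VIP play

p_a = normal_upper_tail(z_a)
p_b = normal_upper_tail(z_b)
```

Under this approximation the Shift B loss has an upper-tail probability of roughly 12%, on the order of one comparable shift in eight, while the Shift A probability is far below one in a billion.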
15. Managerial Implications
Several practical consequences follow from this analysis.
Surveillance and game-protection teams should not spend hours reviewing mathematically ordinary VIP losses simply because the accounting hold looks dramatic. False positives consume investigative capacity, create staff distrust, and erode the credibility of the review process itself.
Escalation should be reserved for events that are genuinely abnormal after conditioning on exposure — and especially for events where financial abnormality and operational anomaly coincide. A large loss with clean procedural records and expected statistical behavior is a different event from a smaller loss accompanied by unusual dealer concentration, pace irregularities, or unexplained credit activity.
Dealer and pit manager evaluations should not be built on raw hold in high-volatility environments. A shift crew should be assessed against exposure-adjusted expected results and procedure compliance, not against the realized variance of a single maximum-bet player who happened to run well against the house.
Finally, if the casino offers high maximum bets to a small number of players, management must understand the tail risk that policy accepts. A negative shift may not indicate operational failure; it may mean the maximum-bet structure allows large downside swings by design. That is a risk-appetite question, and it belongs in the boardroom rather than on the shift supervisor’s performance review.
16. Limitations
The model depends on data quality in ways that matter operationally. Inaccurate average bet estimates, poorly recorded player time, or incomplete dealer assignments will produce unreliable inferences regardless of statistical sophistication.
EVT requires careful threshold selection. A threshold that is too low contaminates the tail model with ordinary observations; too high, and there are not enough exceedances for stable parameter estimation. In practice, threshold selection involves iterative diagnostics and expert judgment about the operating environment.
Bayesian priors must be actively governed and periodically reviewed. They should be calibrated from historical incident rates, surveillance findings, known vulnerabilities, and game-protection expertise. Miscalibration in either direction produces problems: too high a prior and the system over-alerts; too low, and genuine patterns go undetected. This is not a one-time calibration task — it requires ongoing adjustment as the property’s operating profile evolves.
Most importantly, an anomaly score is a triage instrument. It identifies where professional review has the highest expected value. It does not establish misconduct, and it should never be applied as though it does.
17. Conclusion
High-limit table-game volatility cannot be managed by static hold percentages or unconditioned standard deviation rules. The core error is treating drop as a proxy for risk exposure. Drop measures cash flow, not the distribution of wagered amounts that drives variance.
Variance is proportional to squared wager size. Player concentration, pace of play, and procedural context shape the conditional distribution of outcomes in ways that are invisible to any metric that looks only at chips in and chips out. EVT provides a principled approach to modeling catastrophic losses beyond ordinary variance. Bayesian inference provides a principled approach to updating anomaly probability as financial and operational evidence accumulates.
Together, they allow casino executives to distinguish three realities that standard reporting conflates: normal variance, rare but expected tail variance, and statistically meaningful anomaly risk. This distinction determines where investigative resources go, how staff are evaluated, and what risk the casino has agreed to accept when it sets its maximum-bet policy.
Negative hold is a symptom, not a diagnosis. Whether a result reflects ordinary variance, extreme but plausible bad luck, or something warranting investigation can only be determined after conditioning on the actual structure of play. For that, a model is required.
References
Adams, R. P., & MacKay, D. J. C. (2007). Bayesian Online Changepoint Detection. arXiv:0710.3742.
Balkema, A. A., & de Haan, L. (1974). Residual life time at great age. The Annals of Probability, 2(5), 792–804. doi:10.1214/aop/1176996548.
Coles, S. (2001). An Introduction to Statistical Modeling of Extreme Values. Springer.
Davison, A. C., & Smith, R. L. (1990). Models for exceedances over high thresholds. Journal of the Royal Statistical Society: Series B, 52(3), 393–442. doi:10.1111/j.2517-6161.1990.tb01796.x.
Embrechts, P., Klüppelberg, C., & Mikosch, T. (1997). Modelling Extremal Events: For Insurance and Finance. Springer.
Ethier, S. N., & Stefanello, L. (2025). Long-term behavior of casino games. arXiv:2512.20818.
Pickands, J. (1975). Statistical inference using extreme order statistics. The Annals of Statistics, 3(1), 119–131. doi:10.1214/aos/1176343003.
White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48(4), 817–838.
Nevada Gaming Control Board. Minimum Internal Control Standards: Table Games. Relevant sections on statistical drop, statistical win, statistical hold percentage, and investigation of statistical fluctuations.
Wizard of Odds. Baccarat Basics: Eight-Deck Analysis. House-edge estimates for banker, player, and tie wagers.