Edges — Soccer Analytics

124 non-obvious advantages that separate elite practitioners from everyone else.

Conventional Wisdom Is Wrong (32)

Conventional Wisdom Is Wrong

A 2-Person Analytics Department Beats a 10-Person One If They're Solving the Right Problems

Analytics department size is not correlated with impact. A 2-person team embedded in tactical meetings, solving the coach's actual pain points, will have more influence on match outcomes than a 10-person team building sophisticated models in isolation. The constraint on analytics impact in football is almost never technical capability — it's organizational integration and problem selection.

What most people do
Scale analytics departments by adding more analysts and more sophisticated tools, assuming capability equals impact.
What the best do
Keep the team small and focused. One analyst embedded with the coaching staff, one analyst embedded with recruitment. Both attend meetings, identify the real decisions being made, and build tools that directly support those decisions. Add headcount only when the existing team's impact bottleneck is capacity, not integration.
Why it's an edge: Football clubs that over-invest in analytical headcount often produce MORE noise, not more signal. Each additional analyst feels pressure to produce output, leading to reports and dashboards that nobody asked for. A lean team with strong organizational integration avoids this trap.
How to exploit: Before hiring any analytics position, define the specific decision it will improve and the stakeholder who needs it. If you can't name the decision and the stakeholder, the hire will be wasted.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18; Sam Gregory, Inter Miami, StatsBomb Conference, 2022-09-29.
Conventional Wisdom Is Wrong

Most Football Models Are Never Validated Against Known Football Facts — And Many Fail Basic Sanity Checks

A model can have excellent log-loss on held-out data but produce results that violate basic football knowledge. If your event valuation model says a penalty area shot is worth less than a midfield pass, or that headers are more valuable than shots taken with the foot from the same location, the model has learned a statistical artifact, not football. Behavioral assertion tests, which verify that model outputs match known football truths, catch the failures that standard ML metrics miss.

What most people do
Validate models using statistical metrics (AUC, log-loss, calibration) on held-out data and declare them ready.
What the best do
After standard validation, run a suite of behavioral assertion tests: penalty area shots > midfield shots, free kicks > open play from same distance, 1v1s > congested shots. Any failure indicates the model has learned a spurious pattern. These tests are written like unit tests and run on every model update.
Why it's an edge: A model that passes statistical validation but fails behavioral assertions will produce recommendations that coaches immediately reject as nonsensical. This destroys trust in analytics — a single "the model says midfield passes are more valuable than shots" finding can undermine years of credibility-building.
How to exploit: Build a football-specific behavioral test suite (20-30 assertions). Run it on every model before deploying. Treat any failure as a blocker. This catches problems before they reach stakeholders.
StatsBomb CTO, StatsBomb Innovation in Football Conference, 2019-10-25. Behavioral assertion testing for SARSA model validation.
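The assertions described in this card can be written like ordinary unit tests. What follows is a minimal sketch, not the speaker's actual suite: `toy_model`, the event fields (`zone`, `one_v_one`) and the two checks are hypothetical stand-ins for a real valuation model and a fuller list of 20-30 assertions.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]

def toy_model(event: Event) -> float:
    """Stand-in for a trained valuation model (hypothetical, illustration only)."""
    base = {"penalty_area": 0.12, "midfield": 0.01}[event["zone"]]
    if event.get("one_v_one"):
        base *= 2.0
    return base

# Each assertion: (description, event expected to score higher, event expected to score lower).
ASSERTIONS: List[Tuple[str, Event, Event]] = [
    ("penalty area shot > midfield shot",
     {"type": "shot", "zone": "penalty_area"},
     {"type": "shot", "zone": "midfield"}),
    ("1v1 shot > congested shot",
     {"type": "shot", "zone": "penalty_area", "one_v_one": True},
     {"type": "shot", "zone": "penalty_area", "one_v_one": False}),
]

def run_behavioral_suite(predict_value: Callable[[Event], float]) -> List[str]:
    """Return the descriptions of every assertion the model violates."""
    failures = []
    for description, higher, lower in ASSERTIONS:
        if not predict_value(higher) > predict_value(lower):
            failures.append(description)
    return failures

failures = run_behavioral_suite(toy_model)
if failures:
    print("BLOCK DEPLOY:", failures)
```

Any non-empty `failures` list would be treated as a deployment blocker, and the same suite can be rerun on every model update.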
Conventional Wisdom Is Wrong

Positional Averages Hide 3-4 Distinct Archetypes Within Every Position

"Average center-back" or "average right-back" doesn't exist as a meaningful concept. Within each nominal position, 3-4 distinct archetypes exist with fundamentally different statistical profiles (e.g., progressive ball-playing CB vs. aerial-dominant CB vs. covering sweeper CB). Percentile rankings against "all center-backs" penalize specialists by diluting their elite dimensions with irrelevant comparisons. A ball-playing CB in the 40th percentile for aerial duels isn't bad — they're being measured against aerially-dominant CBs who play a different game.

What most people do
Normalize metrics within the full position group. Compare all CBs to all CBs.
What the best do
First cluster the position into archetypes. Then normalize within archetype. Evaluate a ball-playing CB against other ball-playing CBs, not against aerial monsters. The comparison population determines the conclusion.
Why it's an edge: Archetype-blind evaluation produces systematic errors: it underrates specialists and overrates generalists. The best player at a specific archetype may look average when measured against the full position group.
How to exploit: Define 3-4 archetypes per position from cluster analysis. For each recruitment target, identify their archetype first, then rank within that archetype. A player in the 95th percentile within their archetype is more valuable than one in the 70th percentile across the whole position group.
Ted Knutson, multiple StatsBomb presentations. Archetype-based profiling as a core scouting methodology.
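To see how the comparison population changes the conclusion, here is a small sketch that ranks the same metric within archetype and within the full position group. It assumes archetype labels already exist (for example from a prior clustering step); the players, the `aerial_wins_p90` field and the mid-rank tie convention are illustrative assumptions.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

def percentile_rank(value, population):
    """Percentile of value within population (mid-rank convention for ties)."""
    ordered = sorted(population)
    below = bisect_left(ordered, value)
    ties = bisect_right(ordered, value) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)

# Hypothetical centre-backs with pre-assigned archetype labels.
players = [
    {"name": "A", "archetype": "ball_playing", "aerial_wins_p90": 2.0},
    {"name": "B", "archetype": "ball_playing", "aerial_wins_p90": 1.5},
    {"name": "C", "archetype": "ball_playing", "aerial_wins_p90": 1.0},
    {"name": "D", "archetype": "aerial", "aerial_wins_p90": 5.0},
    {"name": "E", "archetype": "aerial", "aerial_wins_p90": 4.5},
    {"name": "F", "archetype": "aerial", "aerial_wins_p90": 4.0},
]

def rank_within_archetype(players, metric):
    """Percentile of each player against others with the same archetype."""
    groups = defaultdict(list)
    for p in players:
        groups[p["archetype"]].append(p[metric])
    return {p["name"]: percentile_rank(p[metric], groups[p["archetype"]])
            for p in players}

def rank_within_position(players, metric):
    """Percentile of each player against the whole position group."""
    everyone = [p[metric] for p in players]
    return {p["name"]: percentile_rank(p[metric], everyone) for p in players}

by_archetype = rank_within_archetype(players, "aerial_wins_p90")
by_position = rank_within_position(players, "aerial_wins_p90")
```

In this toy data, player A is the strongest aerial presence among ball-playing CBs but falls below the median once the aerial specialists are included in the comparison group.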
Conventional Wisdom Is Wrong

Hard Reset Dominates Mild Reset — More Reward AND Less Risk

tactical-analysis attack-reset-decision

Taking the ball all the way back to your goalkeeper ("hard reset") produces ~2x goal lift AND lower concession risk than a mild reset to the halfway line. Hard reset forces the opponent's press line up, creating exploitable space behind it.

What most people do
Reset to the halfway line, believing proximity to the opponent's goal is safer.
What the best do
Commit to full-depth hard resets, then exploit the space created when the press line advances.
Why it's an edge: Intuition says proximity = advantage. Data shows the opposite: mild resets don't disturb the defensive shape.
How to exploit: Track hard vs. mild reset frequency. Coach deliberate hard resets with a specific re-entry trigger when probing against a set defense for >15 seconds.
Cross-domain parallel
In poker, folding to wait for a better position is often more +EV than playing a marginal hand.
Perdomo & Zarrella, 23 Sports, StatsBomb Conference, 2019-10-28
Conventional Wisdom Is Wrong

Intelligent Backward Pass Rate Predicts Possession Quality Better Than Completion Rate

The rate at which a player's backward passes produce positive ΔEPV (creating better forward options) is a stronger predictor of possession quality than total pass completion rate. Intelligent backward passes are attacking tools, not retreats.

What most people do
Use pass completion rate as a possession quality proxy. Backward passes count against "progressive" metrics.
What the best do
Compute "intelligent backward pass rate" = backward passes with positive ΔEPV / total backward passes. Use as a possession quality signal.
Why it's an edge: Players with high intelligent backward pass rate are systematically undervalued because their backward passes look passive. But their 10-second ball path consistently shows forward progression.
How to exploit: In recruitment, identify midfielders with high intelligent backward pass rate — they see the whole field and use backward passes to manipulate defensive shape.
Javier Fernandez, FC Barcelona, 2019-10-22
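The ratio defined in this card is simple to compute once each pass carries an EPV change. A minimal sketch, assuming hypothetical per-pass fields `dx` (change in x toward the opponent's goal, negative meaning backward) and `delta_epv` (EPV after minus EPV before):

```python
def intelligent_backward_pass_rate(passes):
    """Share of backward passes whose EPV change is positive.

    Returns None when the player has no backward passes.
    """
    backward = [p for p in passes if p["dx"] < 0]
    if not backward:
        return None
    positive = sum(1 for p in backward if p["delta_epv"] > 0)
    return positive / len(backward)

# Hypothetical sample: four backward passes, one forward pass.
passes = [
    {"dx": -8.0, "delta_epv": 0.012},
    {"dx": -5.0, "delta_epv": -0.004},
    {"dx": 12.0, "delta_epv": 0.030},   # forward pass, ignored
    {"dx": -3.0, "delta_epv": 0.006},
    {"dx": -10.0, "delta_epv": 0.001},
]
rate = intelligent_backward_pass_rate(passes)
```

In practice a minimum number of backward passes per player would probably be needed before the rate is stable enough to compare.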
Conventional Wisdom Is Wrong

Some Players Generate MORE Threat Under Pressure — Not Less

The default assumption is that pressure degrades performance. But data shows some players generate higher xT per combination UNDER pressure than when unpressed. The mechanism: bypassing pressure opens space that doesn't exist in settled possession. Busquets, Kroos, De Jong, and Modric show less than 5% degradation in combination success rate under pressure — their profiles barely change. Some even improve.

What most people do
Assume pressure always degrades performance and evaluate all players negatively when pressed.
What the best do
Measure the full 5-metric receiving profile (combination success, direct xT, facilitated xT, predictability, pressure degradation) and identify players whose threat INCREASES under pressure. These players should be given the ball in pressured situations deliberately.
Why it's an edge: Players with pressure-positive profiles are systematically undervalued because the market reads "high pressure faced" as a negative signal. A midfielder who receives under pressure and beats it consistently is creating space for the entire team — but standard metrics penalize them for being in pressured situations.
How to exploit: Build the 5-metric receiving profile for all midfield targets. Specifically recruit players with <10% degradation AND stable/increasing xT under pressure. Route the ball to them when pressed — the press becomes a tactical weapon for your team, not the opponent's.
Shou & Manus, StatsBomb Conference, 2021-11-04. Busquets archetype: low direct xT (plays backward) but highest threat facilitation (next player's action is dangerous).
Conventional Wisdom Is Wrong

Completed Long Balls Show Zero Effectiveness Advantage Over Short Combinations

Even when long balls are completed, buildups using them show no statistical advantage in shot or goal probability. The speed advantage is entirely offset by loss of team shape. This isn't about interceptions: even the long balls that reach their target don't produce better outcomes.

What most people do
Treat a completed long ball as "successful" and "he skipped the midfield" as a positive.
What the best do
Evaluate long balls by what happens after completion. Show coaches that even completed long balls produce no better downstream outcomes.
Why it's an edge: Don't pay a premium for "excellent long passing range" in buildup contexts where it provides no outcome advantage.
How to exploit: When scouting opponents relying on long-ball buildup, don't fear the completed long ball. Invest in short-combination speed (tempo) rather than long-ball range.
Benjamin (physicist), StatsBomb Conference, 2019-10-25
Conventional Wisdom Is Wrong

Pressure's Main Effect on Pass Completion Is Tiny — The Real Impact Is in Interactions

When building a contextual pass completion model, the pressure coefficient as a main effect is tiny — less than 1% raw completion difference. The model is correct: pressure alone barely changes completion rate. But pressure INTERACTS with distance and direction: pressure on a long forward pass degrades completion far more than pressure on a short lateral pass. The main effect is nearly zero while the conditional effects are substantial.

What most people do
Either overweight pressure as a blanket degradation factor or, upon finding the small main effect, dismiss pressure as unimportant for passing.
What the best do
Add interaction terms (pressure x distance, pressure x forward direction) to capture the conditional effects. Pressure is a modifier that amplifies the difficulty of already-hard passes while barely affecting easy ones. The analytical value of pressure data is in the interactions, not the main effect.
Why it's an edge: Analysts who find a near-zero pressure main effect and conclude "pressure doesn't matter for passing" will build inferior models. The interaction terms reveal that pressure is the MOST important feature for the hardest, most valuable passes (long forward passes through the lines) while being irrelevant for recycling passes. This distinction is critical for evaluating progressive passers.
How to exploit: Always include pressure x distance and pressure x direction interaction terms in pass completion models. When evaluating players, compute completion above expectation separately for pressured long forward passes vs. all passes — the former is a much stronger skill signal for midfield recruitment.
Thom Lawrence, StatsBomb Data Launch, 2018-05-23. Described the near-zero main effect and the interaction-dependent conditional effects.
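The difference between a near-zero main effect and a large conditional effect is easier to see with numbers. The sketch below is a logistic completion model with interaction terms; every coefficient is invented purely to illustrate the shape of the effect and is not taken from any fitted model.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
COEF = {
    "intercept": 2.5,
    "pressure": -0.05,             # near-zero main effect
    "distance": -0.04,             # per unit of pass length
    "forward": -0.30,
    "pressure_x_distance": -0.03,  # pressure hurts more as distance grows
    "pressure_x_forward": -0.40,   # and more again on forward passes
}

def completion_probability(pressure, distance, forward):
    """Logistic model with pressure x distance and pressure x forward terms."""
    z = (COEF["intercept"]
         + COEF["pressure"] * pressure
         + COEF["distance"] * distance
         + COEF["forward"] * forward
         + COEF["pressure_x_distance"] * pressure * distance
         + COEF["pressure_x_forward"] * pressure * forward)
    return 1.0 / (1.0 + math.exp(-z))

def pressure_effect(distance, forward):
    """Change in completion probability when pressure is switched on."""
    return (completion_probability(1, distance, forward)
            - completion_probability(0, distance, forward))

short_lateral = pressure_effect(distance=5, forward=0)
long_forward = pressure_effect(distance=35, forward=1)
```

With these made-up coefficients, pressure barely moves the short lateral pass but removes a large share of the completion probability on the long forward pass, which is the pattern the card describes.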
Conventional Wisdom Is Wrong

The Best Passers in the League May Have Lower Completion Rates Than Average Ones

Pass completion rate is inversely correlated with pass ambition. The very best passers attempt harder passes — longer, more progressive, under more pressure — which mechanically lowers their raw completion rate. A midfielder with 78% completion who is +6% above expected on every pass is objectively better than a midfielder with 91% completion who is +1% above expected but only attempts safe passes. The market rewards the 91% player because the number looks better.

What most people do
Sort by raw completion rate. Sign the player with the highest percentage.
What the best do
Compute context-adjusted completion (xPass residual) alongside raw completion. The gap between the two is the ambition signal. High raw + low residual = safe passer. Lower raw + high residual = elite passer taking on harder passes.
Why it's an edge: This is the most persistent market inefficiency in football analytics. Despite xPass models being available for years, most clubs still use raw completion rate as a first-pass filter, eliminating the most ambitious passers from consideration.
How to exploit: Invert the filter. Sort by xPass residual, not raw completion. The players who surface at the top of this list but are filtered out by raw completion rate are the market inefficiency.
Will Morgan, StatsBomb Conference, 2022-10-03; consistent finding across multiple StatsBomb presentations.
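Inverting the filter amounts to finding players who rank near the top on xPass residual yet would be cut by a raw-completion threshold. A small sketch with invented players; the field names, the 85% cutoff and `top_n` are assumptions for illustration.

```python
# Hypothetical shortlist with raw completion and xPass residual per player.
players = [
    {"name": "P1", "raw_completion": 0.91, "xpass_residual": 0.01},
    {"name": "P2", "raw_completion": 0.78, "xpass_residual": 0.06},
    {"name": "P3", "raw_completion": 0.85, "xpass_residual": -0.02},
    {"name": "P4", "raw_completion": 0.80, "xpass_residual": 0.04},
]

def overlooked_passers(players, raw_cutoff, top_n):
    """Top-N players by xPass residual that a raw-completion filter would drop."""
    by_residual = sorted(players, key=lambda p: p["xpass_residual"], reverse=True)
    return [p["name"] for p in by_residual[:top_n]
            if p["raw_completion"] < raw_cutoff]

overlooked = overlooked_passers(players, raw_cutoff=0.85, top_n=2)
```

The returned names are the candidates a raw-completion first pass would never have shown to a scout.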
Conventional Wisdom Is Wrong

Different Data Providers Count the Same Match Differently — And Most Analysts Don't Know Which One They're Using

data-infrastructure counting-scheme-bias

Event data providers use fundamentally different counting conventions. One provider's "pressure" event requires physical proximity; another's includes distant angle-blocking. One counts a failed dribble as a "dribble attempted + failed"; another doesn't record the attempt at all unless the dribble succeeded. These aren't measurement errors — they're different definitions of the same concept. Cross-provider comparisons without accounting for counting-scheme differences produce meaningless results.

What most people do
Treat event data as ground truth regardless of provider. Compare metrics built on Provider A's data against benchmarks built on Provider B's data.
What the best do
Document every counting convention used in their analytical pipeline. When comparing to external benchmarks or published research, verify the counting scheme matches. Build conversion factors between providers where possible.
Why it's an edge: Most published "norms" and "benchmarks" in football analytics are provider-specific. If your data comes from a different provider, those benchmarks may be systematically off. A team that looks "average" on Provider A's metrics might look "elite" on Provider B's — or vice versa.
How to exploit: When building any metric, document the provider and counting conventions used. When comparing to external research, check the source provider. When switching providers, rebuild all benchmarks rather than assuming continuity.
Thom Lawrence, StatsBomb Data Launch, 2018-05-23. Explicit discussion of StatsBomb's unique pressure event definition and counting conventions.
Conventional Wisdom Is Wrong

Optimal Defense Sometimes Says Leave the Man, Protect the Space

A position optimizer minimizing spatial xT sometimes recommends leaving a marked player to protect central space. The math says the space is more valuable than the man.

What most people do
Default to man-marking. Evaluate defenders by whether they tracked their man.
What the best do
Run counterfactual position analysis. Accept that optimal positions sometimes contradict man-marking. Use as a coaching conversation, not a directive.
Why it's an edge: Man-marking is deeply ingrained. Specific situations exist where zonal coverage produces better outcomes.
How to exploit: For recurring high-TAx situations, run the optimizer post-match. Build training scenarios around the 2-3 per match where space > man.
Gregory Everett, StatsBomb Conference, 2022-10-03
Conventional Wisdom Is Wrong

Burnley Is Actually the 4th Most Pressing Team (Opponent-Strength Confound)

After Bayesian decomposition removes opponent quality and home/away effects, Burnley is the 4th most aggressive pressing team. Their low-block reputation comes from playing mostly against much stronger opponents. Man City's pressing intensity is partly an artifact of their talent advantage.

What most people do
Use raw pressing metrics (PPDA, pressures/90) to classify teams.
What the best do
Decompose into baseline + opponent strength + home/away using Bayesian hierarchical models.
Why it's an edge: Preparing for "low-block Burnley" when they press against similar-ranked opponents means your game plan is wrong 60% of the time.
How to exploit: Build opponent-strength-adjusted pressing profiles. Condition on YOUR team's xGD relative to theirs.
StatsBomb Conference 2021, 2021-11-04
Conventional Wisdom Is Wrong

Ball Progression Persists Through Brief Turnovers — The "Teleportation Fallacy"

EPV treats turnovers as value-destroying. But after most turnovers, the ball stays in roughly the same zone. A headed clearance from a cross keeps the ball in the attacking third. The "high-water mark" persists through brief possession losses.

What most people do
Use EPV, which drops sharply on any turnover. Penalize players for turnovers even when the ball stays in the attacking zone.
What the best do
Track ball position persistence through turnovers. Distinguish "ball stays in zone" turnovers from "ball exits zone" turnovers.
Why it's an edge: Teams that play aggressively in the final third generate many "turnovers" that don't actually lose territorial advantage. They're systematically undervalued by EPV.
How to exploit: Build a "possession-zone persistence" metric tracking where the ball is 5 seconds after turnover. Don't penalize turnovers that keep the ball in the attacking zone.
StatsBomb CTO, 2019-10-25
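One way to sketch the possession-zone persistence metric: for each turnover in the attacking third, look at where the ball is 5 seconds later. The coordinate system (120-unit pitch length, attacking toward x = 120, attacking third starting at x = 80) and the `track` structure are assumptions made for this example.

```python
ATTACKING_THIRD_X = 80.0  # assumed: 120-unit pitch, attacking toward x = 120

def zone_persistence_rate(turnovers, window_s=5.0):
    """Share of attacking-third turnovers where the ball is still in the
    attacking third at the end of the window.

    Each turnover is a dict with hypothetical keys:
      x_at_loss - ball x when possession was lost
      track     - list of (seconds_after_loss, ball_x), sorted by time
    """
    kept = total = 0
    for t in turnovers:
        if t["x_at_loss"] < ATTACKING_THIRD_X:
            continue  # only turnovers that start in the attacking third
        # Ball positions observed up to the end of the window.
        samples = [x for s, x in t["track"] if s <= window_s]
        if not samples:
            continue
        total += 1
        if samples[-1] >= ATTACKING_THIRD_X:
            kept += 1
    return kept / total if total else None

turnovers = [
    {"x_at_loss": 100.0, "track": [(1, 98.0), (3, 95.0), (5, 90.0)]},    # stays
    {"x_at_loss": 95.0, "track": [(2, 70.0), (5, 40.0)]},                # exits
    {"x_at_loss": 60.0, "track": [(1, 55.0), (5, 50.0)]},                # not counted
    {"x_at_loss": 110.0, "track": [(1, 108.0), (4, 102.0), (6, 60.0)]},  # stays at 5s
]
rate = zone_persistence_rate(turnovers)
```

Turnovers counted in the numerator here are the ones that would not be penalized under the approach the card proposes.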
Conventional Wisdom Is Wrong

Pass Completion Rate Is Almost Entirely Explained by Pass Difficulty — Completion Above Expected Is the Real Metric

Raw pass completion rate is dominated by the difficulty distribution of passes attempted. A player whose attempts are 90% short passes might show 88% completion. A player whose attempts are 50% progressive passes might show 72% completion. They look 16 percentage points apart, but the difference is almost entirely explained by pass selection, not execution quality. Expected pass completion (xPass) models, which predict completion probability from pass features, reveal that the residual (completion above expected) is the actual skill signal, and it's much smaller than raw completion suggests.

What most people do
Compare raw pass completion rates across players and conclude higher = better.
What the best do
Build or use an xPass model. Compute completion above expected per player. The player with 72% raw but +4% above expected is actually a better passer than the player with 88% raw but +1% above expected.
Why it's an edge: Raw pass completion is the single most misleading metric in football analytics. It penalizes ambitious passers and rewards conservative ones. Using it for recruitment or evaluation leads to systematically selecting the wrong players.
How to exploit: Never present raw pass completion. Always present completion above expected. For recruitment: sort by completion above expected, not raw completion. A player with -2% completion above expected despite 85% raw completion is actually a poor passer making easy passes.
Will Morgan, StatsBomb Conference, 2022-10-03. xPass model with gender-aware calibration. Ted Knutson, multiple presentations, consistently emphasizes this point.
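Completion above expected is raw completion minus the mean xPass probability of the passes attempted. A minimal sketch, assuming each pass already has an xPass value from some model; the two pass samples are invented.

```python
def completion_above_expected(passes):
    """passes: list of (completed: bool, xpass: float) tuples."""
    if not passes:
        raise ValueError("need at least one pass")
    n = len(passes)
    raw = sum(1 for completed, _ in passes if completed) / n
    expected = sum(xpass for _, xpass in passes) / n
    return {"raw": raw, "expected": expected, "above_expected": raw - expected}

# Hypothetical samples: a safe passer and an ambitious passer.
safe = [(True, 0.95)] * 9 + [(False, 0.95)]
ambitious = [(True, 0.65)] * 7 + [(False, 0.65)] * 3

safe_result = completion_above_expected(safe)
ambitious_result = completion_above_expected(ambitious)
```

The safe passer has the higher raw rate but completes fewer passes than the model expects, while the ambitious passer does the opposite.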
Conventional Wisdom Is Wrong

Intentional Negative-EPV Plays Can Be Strategically Positive

Kicking the ball out for an opponent throw-in near their corner to set up a press is EPV-negative but strategically correct. EPV drops to zero at dead-ball moments, but the team expects to win the ball back from the restart in favorable shape.

What most people do
Treat all negative-EPV actions as mistakes. Penalize players who register them.
What the best do
Catalog "strategically negative-EPV" play types. Exclude from player penalization. Separately measure post-dead-ball value.
Why it's an edge: Analysts who naively use EPV undervalue tactically sophisticated teams. Coaching staff will distrust analytics that penalizes actions they know are correct.
How to exploit: Build a "post-dead-ball value" model estimating expected EPV from the restart. A conceded throw-in deep in the opponent's half may have net-positive value once the restart is modeled.
StatsBomb CTO, 2019-10-25
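The net value of a deliberately conceded restart can be framed as simple expected-value arithmetic. The function below is a sketch under assumed inputs: the probabilities and EPV figures are invented, and a real post-dead-ball model would estimate them from restart data.

```python
def net_restart_value(epv_given_up, p_regain, epv_after_regain,
                      p_opponent_escapes, opponent_epv_after_escape):
    """Expected net value of conceding a restart on purpose.

    All inputs are hypothetical estimates:
      epv_given_up              - EPV the team held before kicking the ball out
      p_regain                  - probability of winning the ball back from the restart
      epv_after_regain          - team EPV in the situations where it is won back
      p_opponent_escapes        - probability the opponent plays out of the press
      opponent_epv_after_escape - opponent EPV once they have escaped
    """
    gained = p_regain * epv_after_regain
    conceded = p_opponent_escapes * opponent_epv_after_escape
    return gained - conceded - epv_given_up

# Invented numbers for a throw-in conceded deep in the opponent's half.
value = net_restart_value(
    epv_given_up=0.010,
    p_regain=0.5,
    epv_after_regain=0.040,
    p_opponent_escapes=0.5,
    opponent_epv_after_escape=0.010,
)
```

With these inputs the play that registers as EPV-negative at the moment of the kick comes out slightly positive once the restart is included.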
Conventional Wisdom Is Wrong

High-Tackle Defenders Are Usually the Worst Positional Defenders

The Maldini principle ("if you have to make a tackle, you've already made a mistake") is quantifiable via xT denied. Defenders who make the most tackles and interceptions are typically doing so in high-xT zones near the box, meaning they allowed the opponent to penetrate that far. The best defenders rarely need to tackle because opponents rarely reach dangerous zones.

What most people do
Evaluate defenders by tackle + interception counts, rewarding high-volume tacklers as "aggressive" or "committed."
What the best do
Evaluate defenders by opponent xT potential when they're on the pitch — the best defenders minimize threat by positioning, not by intervention.
Why it's an edge: Clubs consistently overpay for flashy tacklers while undervaluing boring positional defenders who suppress opponent xT without visible actions. This creates a systematic market inefficiency in defensive recruitment.
How to exploit: Build a "defensive suppression" metric: opponent xT potential when defender is on-pitch vs. off-pitch. Recruit defenders who score high on suppression but low on tackle volume — they'll be cheaper because the market rewards action counts.
Cross-domain parallel
In basketball, "good defense doesn't show up in the box score" is the same principle — defensive plus-minus captures what blocks/steals miss.
PhD student, StatsBomb Innovation in Football Conference, 2019-10-30. xT denied framework shows proactive defenders minimize opponent access to high-threat zones.
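A first pass at the on-pitch vs. off-pitch suppression metric could look like the sketch below. The `(minutes, opponent_xt)` segments are invented, and the comparison ignores confounders such as opponent strength and teammates, which a real version would need to handle.

```python
def xt_per_90(segments):
    """segments: list of (minutes, opponent_xt) spells."""
    minutes = sum(m for m, _ in segments)
    xt = sum(x for _, x in segments)
    return 90.0 * xt / minutes

def defensive_suppression(on_pitch, off_pitch):
    """Positive value: opponents generate less xT per 90 with the defender on."""
    return xt_per_90(off_pitch) - xt_per_90(on_pitch)

# Hypothetical spells for one defender.
on_pitch = [(90, 1.0), (90, 0.75), (60, 0.5)]
off_pitch = [(90, 1.5), (30, 0.5)]
suppression = defensive_suppression(on_pitch, off_pitch)
```

Pairing this number with tackle volume per 90 would surface the high-suppression, low-tackle profiles the card recommends targeting.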
Conventional Wisdom Is Wrong

Lineup Optimization Is a Constraint Satisfaction Problem, Not a Best-11-Players Problem

The best 11 individuals do not form the best team. Lineup optimization is a mixed-integer programming problem where player-role fit, pair synergy, positional coverage constraints, and game-model compliance interact. A mathematically optimal lineup may exclude the team's highest-rated individual player because their inclusion creates a worse collective configuration.

What most people do
Select the best player at each position independently, then hope the combination works.
What the best do
Model lineup selection as a constraint-satisfaction optimization: define the game model's positional requirements, include pair synergy scores as interaction terms, add hard constraints (minimum defensive coverage, minimum distribution quality from the back), and solve for the lineup that maximizes expected team output — not individual ratings.
Why it's an edge: A team with a 95th-percentile player who doesn't fit the system will underperform a team of 75th-percentile players who fit together. The interaction effects dominate the individual effects for most positions.
How to exploit: Build a lineup optimizer that takes individual ratings, pair synergy scores, and game model constraints as inputs. Run it before every match. Compare its output to the coach's selection — the divergences are the analytical insight.
James, University of Southampton, StatsBomb Innovation in Football Conference, 2019-10-30. Pair synergy scoring shows interaction effects.
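A toy version of the optimizer shows how pair synergy can push the highest-rated individual out of the best lineup. This sketch swaps the mixed-integer program for an exhaustive search, which only works for tiny squads; the ratings, synergy scores and the `constraint` hook are all invented for illustration.

```python
from itertools import combinations

def best_lineup(ratings, synergy, size, constraint=lambda lineup: True):
    """Exhaustively search lineups of the given size.

    score = sum of individual ratings + sum of pair synergy scores.
    constraint is a hypothetical hook for hard rules (e.g. minimum coverage).
    """
    best, best_score = None, float("-inf")
    for lineup in combinations(sorted(ratings), size):
        if not constraint(lineup):
            continue
        score = sum(ratings[p] for p in lineup)
        score += sum(synergy.get(frozenset(pair), 0.0)
                     for pair in combinations(lineup, 2))
        if score > best_score:
            best, best_score = lineup, score
    return best, best_score

# Invented squad: "Star" is the best individual but pairs badly with everyone.
ratings = {"Star": 9.0, "A": 7.0, "B": 7.0, "C": 6.5}
synergy = {
    frozenset(("A", "B")): 1.5,
    frozenset(("A", "C")): 1.0,
    frozenset(("B", "C")): 1.0,
    frozenset(("Star", "A")): -1.5,
    frozenset(("Star", "B")): -1.5,
    frozenset(("Star", "C")): -1.0,
}
lineup, score = best_lineup(ratings, synergy, size=3)
```

For a real 11-from-25 selection with positional constraints, the search space is far too large for brute force, which is where a proper integer-programming solver would come in.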
Conventional Wisdom Is Wrong

Pass-Only Pressure Analysis Misclassifies the Best Press-Beaters

Dembele at Tottenham is the canonical example: his pass radar collapses under pressure (looks like a liability), but his carry/dribble response drives the ball 20 yards forward. Analyzing only passing behavior under pressure produces false negatives for players whose primary pressure escape mechanism is ball-carrying. Most pressure analysis is pass-only and systematically misclassifies elite dribblers as pressure-negative.

What most people do
Evaluate pressure response by looking at pass direction changes under pressure — forward pass spike = good, backward shift = bad.
What the best do
Compute separate directional profiles for passes, carries, AND dribbles under pressure, then combine into a weighted composite. The action type the player switches to under pressure is the actual response mechanism.
Why it's an edge: Players who are pressure-positive via carries are cheaper than they should be because standard pass-based pressure analysis flags them as liabilities. Any club using pass-only pressure metrics will systematically avoid exactly the players who are best at beating the press.
How to exploit: For every player, compute the forward-gain metric across ALL action types (pass + carry + dribble) under pressure. Specifically target players where carry forward gain compensates for pass conservatism. These players will be underpriced by pass-only evaluation systems.
Thom Lawrence, StatsBomb Data Launch, 2018-05-23. Dembele pass radar collapse vs. carry/dribble forward drive as the defining example.
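The forward-gain comparison across action types can be sketched as follows. The action records and field names are hypothetical, and the composite here is a plain mean over all pressured actions (so each action type is weighted by how often the player uses it), which is one possible weighting rather than a prescribed one.

```python
from collections import defaultdict

def forward_gain_by_action(actions):
    """Mean forward gain per action type, pressured actions only."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for a in actions:
        if not a["under_pressure"]:
            continue
        totals[a["type"]] += a["dx"]
        counts[a["type"]] += 1
    return {t: totals[t] / counts[t] for t in totals}

def composite_forward_gain(actions):
    """Mean forward gain over all pressured actions (frequency-weighted)."""
    pressured = [a["dx"] for a in actions if a["under_pressure"]]
    return sum(pressured) / len(pressured) if pressured else None

# Invented actions: conservative passing, aggressive carrying under pressure.
actions = [
    {"type": "pass", "dx": -4.0, "under_pressure": True},
    {"type": "pass", "dx": -2.0, "under_pressure": True},
    {"type": "carry", "dx": 15.0, "under_pressure": True},
    {"type": "carry", "dx": 9.0, "under_pressure": True},
    {"type": "dribble", "dx": 6.0, "under_pressure": True},
    {"type": "pass", "dx": 20.0, "under_pressure": False},  # unpressured, ignored
]
by_action = forward_gain_by_action(actions)
composite = composite_forward_gain(actions)
```

A pass-only view of this player shows the ball going backward under pressure; the composite shows it going forward.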
Conventional Wisdom Is Wrong

Shot Probability Goes Up With Time in the Zone — Goal Probability Goes Down

After entering the final quarter of the pitch, goal probability initially rises but then DECLINES after ~20 seconds as the defense organizes. Teams that dwell too long shoot more but convert less. For buildups from own half, possessions taking >10 seconds to reach the offensive zone show NO edge over average because the defense has had time to set. This creates a clear decision framework: attack within the window or reset.

What most people do
Equate "more time in the offensive zone" with "better attacking" and encourage patient probing. Measure shot volume as the attacking quality indicator.
What the best do
Track dwell time vs. goal probability, not just shot probability. Implement a "regroup trigger" at ~15-20 seconds: if no clear chance has emerged, recycle possession and re-enter fresh. Treat buildup speed (under 10 seconds from own box to opponent box) as a key performance indicator.
Why it's an edge: The 20-second zone dwell threshold and 10-second transit threshold are concrete, actionable numbers that most teams don't track. Teams that implement the regroup trigger avoid the "late, low-quality shot" trap that inflates shot counts without improving goal conversion.
How to exploit: Build dwell-time tracking into real-time match analysis. Alert coaching staff when average dwell time exceeds 20 seconds (defense is set, shots will be low quality). Compare your team's dwell-to-goal curve against league average — if your curve decays faster, your opponents organize quickly and you need even faster decisions.
Benjamin (physicist), StatsBomb Innovation in Football Conference, 2019-10-25. Shot probability vs. goal probability divergence over dwell time demonstrated.
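The regroup trigger and the average-dwell alert reduce to two small checks. This is a sketch with assumed inputs: timestamps in seconds, a boolean for whether a clear chance has emerged, and the 20-second threshold taken from the upper end of the range quoted in this card.

```python
REGROUP_AFTER_S = 20.0  # upper end of the ~15-20 second range quoted above

def should_regroup(zone_entry_s, now_s, clear_chance, threshold_s=REGROUP_AFTER_S):
    """True when the team has dwelt past the threshold without a clear chance."""
    return (now_s - zone_entry_s) >= threshold_s and not clear_chance

def average_dwell(possessions):
    """possessions: list of (zone_entry_s, zone_exit_s) pairs."""
    dwell = [exit_s - entry_s for entry_s, exit_s in possessions]
    return sum(dwell) / len(dwell) if dwell else 0.0

# Invented zone visits from one match.
possessions = [(0.0, 12.0), (300.0, 326.0), (900.0, 931.0)]
alert = average_dwell(possessions) > REGROUP_AFTER_S
```

How "clear chance" is detected in real time is left open here; it would depend on the tracking or event feed available on the bench.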
Conventional Wisdom Is Wrong

Raw EPV Delta Systematically Undervalues Defenders and Overvalues Attackers

A defender whose possession-maintenance play produces zero EPV delta is NOT performing at average level — in contexts where 90% of historical outcomes were negative (turnovers, backward passes), breaking even is 90th percentile performance. Raw EPV delta doesn't account for opportunity: what was achievable given the situation? Context-relative scoring (measuring what a player achieved versus the distribution of what was historically achievable in similar situations) eliminates position bias. Z-scores produce extreme outliers; percentile ranks bounded to [-1, +1] are more robust.

What most people do
Rank players by raw EPV delta, which creates a leaderboard dominated by attackers in high-reward zones and penalizes defenders who competently navigate low-reward zones.
What the best do
Use LSTM autoencoders to encode possession sequences, find similar historical sequences via nearest-neighbor matching, and compute percentile rank of the player's action within that distribution. A defender's break-even play in a terrible context maps to a high percentile.
Why it's an edge: Clubs using raw EPV for player evaluation will never identify their best defenders or deep midfielders because those players are structurally capped by the reward landscape of their pitch zones. Opportunity-normalized scoring puts defenders and attackers on a comparable scale for the first time.
How to exploit: Build the opportunity-normalization pipeline. Produce player rankings where the top 20 includes defenders and midfielders alongside attackers. When these rankings surface a DM or CB as elite, cross-reference with traditional metrics — if traditional metrics rank them average but opportunity-normalized metrics rank them elite, you've found an undervalued player.
StatsBomb CTO, StatsBomb Innovation in Football Conference, 2019-10-25. Full pipeline described: LSTM autoencoder, K-NN matching, percentile scoring.
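The scoring step at the end of that pipeline can be illustrated on its own. The sketch assumes the similar historical situations have already been retrieved (the encoding and nearest-neighbor stages are not shown), and the linear mapping from percentile to [-1, +1] is an assumption about how the bounding might be done.

```python
def opportunity_score(observed_delta, neighbour_deltas):
    """Map an action's EPV delta to [-1, +1] via its percentile among the
    EPV deltas seen in similar historical situations."""
    n = len(neighbour_deltas)
    if n == 0:
        raise ValueError("need at least one comparable situation")
    below = sum(1 for d in neighbour_deltas if d < observed_delta)
    ties = sum(1 for d in neighbour_deltas if d == observed_delta)
    percentile = (below + 0.5 * ties) / n  # in [0, 1]
    return 2.0 * percentile - 1.0

# Invented context for a defender: 9 of 10 comparable outcomes were negative.
defensive_context = [-0.05, -0.04, -0.03, -0.03, -0.02,
                     -0.02, -0.01, -0.01, -0.01, 0.02]
# Invented context for an attacker: most comparable outcomes were positive.
attacking_context = [0.01, 0.02, 0.03, 0.04, 0.05,
                     0.02, 0.03, -0.01, 0.06, 0.01]

defender_score = opportunity_score(0.0, defensive_context)
attacker_score = opportunity_score(0.0, attacking_context)
```

The same zero EPV delta scores well in the hard defensive context and poorly in the favorable attacking one, which is the position-bias correction the card describes.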
Conventional Wisdom Is Wrong

Elite Players Appear to Fail Easy Short Passes at Impossible Rates — It's a Data Artifact

When a pass is blocked or intercepted, it registers in the data as a failed completion with a very short distance (the distance the ball actually traveled before interception, not the intended distance). Standard xPass models see these as "failed 2-yard passes" — which should be 99% completion — and conclude the player can't complete trivial passes. The real story: the player attempted a 20-yard progressive pass that was intercepted after 2 yards. Without imputing the intended target, xPass models systematically penalize the best progressive passers.

What most people do
Build xPass models on observed pass distance, which conflates blocked progressive passes with failed short passes.
What the best do
Model xPass as a two-stage problem: first predict the intended target location for passes with unknown recipients (blocked, intercepted), then compute completion probability against the imputed target. This eliminates the "short failed pass" artifact.
Why it's an edge: Any club using a standard xPass model without intent imputation is systematically undervaluing their most progressive passers and overvaluing conservative recyclers. The model literally cannot distinguish a blocked 20-yard through ball from a botched 2-yard layoff.
How to exploit: If your data provider supports pass intent (StatsBomb does for ~5% of passes), use it. For the rest, build the two-stage imputation model. When evaluating progressive passers, check if their "failed short pass" rate is anomalously high — if so, the model is likely penalizing blocked progressive passes.
Dr. Will Morgan, StatsBomb Conference, 2022-10-03. Two-step xPass approach eliminating the short-distance completion paradox.
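The diagnostic check in "How to exploit" can be sketched as a comparison of a player's failure rate on short recorded passes against a baseline. The 5-unit definition of "short", the 3x ratio and the baseline value are all hypothetical thresholds; this flags the artifact but does not perform the two-stage imputation itself.

```python
def short_pass_failure_rate(passes, short_max=5.0):
    """Failure rate among passes whose RECORDED distance is short."""
    short = [p for p in passes if p["recorded_distance"] <= short_max]
    if not short:
        return None
    return sum(1 for p in short if not p["completed"]) / len(short)

def looks_like_blocked_pass_artifact(player_rate, baseline_rate, ratio=3.0):
    """Flag players whose short-pass failure rate is far above the baseline."""
    return player_rate is not None and player_rate > ratio * baseline_rate

# Invented passes: several "2-unit failures" that were likely blocked long passes.
passes = (
    [{"recorded_distance": 3.0, "completed": True}] * 7
    + [{"recorded_distance": 2.0, "completed": False}] * 3
    + [{"recorded_distance": 25.0, "completed": True}] * 5
)
rate = short_pass_failure_rate(passes)
flagged = looks_like_blocked_pass_artifact(rate, baseline_rate=0.02)
```

A flagged player is a prompt to inspect the underlying events, not proof that the failures were blocked progressive passes.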
Conventional Wisdom Is Wrong

Recruitment Analytics Is Now Commoditized — The Edge Has Moved to Tactical and Development

Sports analytics adoption follows a consistent three-phase sequence: (1) Recruitment (Moneyball), (2) Tactical, (3) Player Development. Football is in the Phase 1 to 2 transition. Phase 1 investment is now table stakes in top leagues — "Moneyball stopped working when everyone read Moneyball." The competitive advantage follows the least-efficient frontier, which has moved to Phase 2 (tactical optimization) and Phase 3 (biomechanical player development). Clubs still consolidating Phase 1 are investing in a market that has already adjusted.

What most people do
Continue investing heavily in Phase 1 recruitment analytics, which has diminishing returns as competitors have caught up.
What the best do
Maintain Phase 1 as infrastructure while pushing resources into Phase 2 (tactical analytics — press compliance, set-piece optimization, game model measurement) and Phase 3 (player development — biomechanical feedback, training data integration). These phases are where competitor investment is thinnest.
Why it's an edge: The expected return on marginal analytical investment is highest where the fewest competitors have invested. Most clubs' analytics departments are still recruitment-focused. Shifting resources toward tactical and development analytics buys competitive advantage for years before the market catches up.
How to exploit: Diagnose your club's current phase. If Phase 1 is mature (recruitment decisions are data-driven), invest in Phase 2 capabilities: real-time tactical analytics, press compliance modeling, set-piece optimization, and halftime data-to-coach pipelines. Begin scoping Phase 3: training event data capture, biomechanical feedback systems.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18. Three-phase model using baseball velocity revolution as Phase 3 example.
Conventional Wisdom Is Wrong

Most Analytics Research Over-Indexes on Finishing and Ignores Where 80% of Possessions Actually Break Down

Football analytics disproportionately focuses on the finishing phase (xG, shot quality, conversion rates) while the majority of possession breakdowns occur in the build-up and progression phases. A team that never reaches the finishing phase has no use for shot quality analysis. The bottleneck is almost always progression (moving past the organized defensive block), not finishing — but progression analytics is dramatically underinvested.

What most people do
Measure attacking quality via xG and shot-based metrics, which only capture the 20-30% of possessions that reach the final third.
What the best do
Decompose every possession into phases (build-up, progression, finishing) and measure transition rates between phases. The diagnostic question becomes "what % of possessions successfully transition from progression to finishing?" — which isolates the bottleneck more precisely than any shot-based metric.
Why it's an edge: A team with poor finishing but excellent progression can improve by acquiring a finisher. A team with excellent finishing but poor progression won't improve by acquiring ANOTHER finisher — they need a progression-phase solution. Phase decomposition makes the right investment obvious.
How to exploit: Build phase-transition rates as your primary possession quality metric. Filter by phase before analyzing any tactical question. When a team "can't score," first check whether the breakdown is in progression or finishing — the prescription is completely different for each.
Javier Fernandez, FC Barcelona, StatsBomb Innovation in Football Conference 2019, 2019-10-22. "If you only analyze finishing, you're like a chess player who only analyzes their last 10 moves."
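The phase-transition rates described in this edge can be sketched as follows. This is a minimal illustration, assuming each possession has already been labelled with the furthest phase it reached; the phase names, the `phase_transition_rates` helper, and the sample counts are hypothetical and not taken from the talk.

```python
from collections import Counter

# Hypothetical phase labels, ordered from earliest to latest.
PHASES = ["build_up", "progression", "finishing"]

def phase_transition_rates(furthest_phase_per_possession):
    """Return P(reach next phase | reached current phase) for adjacent phases."""
    counts = Counter(furthest_phase_per_possession)
    # A possession that got as far as a later phase also passed through
    # every earlier phase, so accumulate counts from the last phase backwards.
    reached = {}
    running = 0
    for phase in reversed(PHASES):
        running += counts[phase]
        reached[phase] = running
    rates = {}
    for current, nxt in zip(PHASES, PHASES[1:]):
        key = f"{current}->{nxt}"
        rates[key] = reached[nxt] / reached[current] if reached[current] else float("nan")
    return rates

# Made-up sample: 100 possessions labelled by the furthest phase reached.
sample = ["build_up"] * 50 + ["progression"] * 30 + ["finishing"] * 20
rates = phase_transition_rates(sample)
print(rates)
```

A low `progression->finishing` rate next to a healthy `build_up->progression` rate would point at the progression bottleneck rather than at finishing.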
Conventional Wisdom Is Wrong

High-Pressing Teams Pay a Measurable, Unavoidable Shot-Concession Tax

Pressing teams create shorter opponent possessions but also concede more shots when beaten — the defense is committed forward, leaving space behind. Leeds is the extreme outlier: highest pressing intensity AND highest shot concession in the Premier League. Liverpool and Man City also show elevated shot concession relative to non-pressing teams. This is a fundamental tradeoff, not a fixable problem. The question isn't "how do we press without conceding shots" — it's "how much shot-concession risk do we accept for pressing intensity?"

What most people do
Treat high pressing as purely beneficial and try to minimize the shot concession "problem" without recognizing it as a structural tradeoff inherent to the style.
What the best do
Quantify the press-shot tradeoff explicitly: plot adjusted short-possession probability (pressing success) vs. adjusted shot-possession probability (pressing failure cost). Accept the tradeoff as structural and optimize within it — e.g., invest in a GK who excels at the specific shot types your pressing style concedes.
Why it's an edge: Understanding this tradeoff means you can build a complete system around it: press high, accept you'll concede shots, recruit a GK who excels at those shot types (see goalkeeper-shot-type-matching), and accept the structural variance. Teams that try to press AND not concede shots are fighting physics.
How to exploit: If you press high, profile the shot types you concede when beaten and match your GK to that profile. If you're evaluating a pressing team, don't penalize them for high shot concession — check whether their GK compensates. For opponents: if they press high, plan to bypass the press and exploit the space behind. The shots will be there.
StatsBomb internal research, StatsBomb Conference, 2021-11-04. Leeds as extreme outlier. Bayesian decomposition revealing the tradeoff.
Conventional Wisdom Is Wrong

Liverpool's Press and Burnley's Press Are So Different They Shouldn't Share a Name

"High press" is used to describe radically different defensive strategies. Liverpool's press is a coordinated trap (force to one side, collapse passing lanes, win the ball in the opponent's half). Burnley's "high press" under Dyche was individual effort-based aggression without coordinated lane-cutting. The taxonomy needs at least 4-5 distinct pressing styles, each with different spatial signatures, or the term "press" becomes analytically meaningless. Coaching "press higher" without specifying which style produces chaotic defending.

What most people do
Classify pressing as "high" or "low" based on average defensive line height, treating all high presses as equivalent.
What the best do
Build a multi-dimensional pressing taxonomy: (1) trigger mechanism (ball arrival in zone vs. opponent body orientation vs. specific player), (2) coordinated vs. individual, (3) forcing direction (left, right, back), (4) intended outcome (win ball vs. force error vs. slow progression). Map each team to their specific style combination.
Why it's an edge: Preparing for "a team that presses high" without understanding their specific pressing style is like preparing for "a team that attacks" — the statement is too vague to be actionable. The specific pressing style determines the escape route.
How to exploit: Build pressing style profiles for each opponent from goal kick data + open play pressure maps. Identify their specific forcing direction, trigger mechanism, and coordination level. Prepare press-escape routes specific to their style, not generic anti-press training.
Ted Knutson & Siqur Arshad, WFS 2019; Nicole Kuzlova, StatsBomb Conference, 2021-11-04. Multiple distinct pressing styles identified from spatial analysis.
Conventional Wisdom Is Wrong

33% Cross Completion Rate Isn't Bad — The Question Is Where They Land

Crosses have a league-average completion rate of ~33%, which sounds wasteful. But a 33% cross to the far post that generates 0.12 xG when completed is better expected value than a 60% cross to the near post that generates 0.02 xG when completed: 0.33 x 0.12 = 0.040 vs 0.60 x 0.02 = 0.012. Raw completion rate is the wrong metric for evaluating crossing — expected value per cross (completion probability x reward if completed) is the correct one. Additionally, crosses that lead to shots two actions later (second-ball conversions) are undervalued by immediate-shot-chain metrics.

What most people do
Evaluate crossing by completion rate. "Low completion = bad crosser."
What the best do
Compute expected value per cross = P(completion) x E(xG|completion) for each target zone. Optimize target zone by game state: when protecting a lead, target high-completion zones; when chasing, target high-xG zones despite lower completion.
Why it's an edge: Teams that eliminate crossing because of "low completion rate" are removing one of their highest expected-value actions. The math says: keep crossing, but cross to the right zone for the situation.
How to exploit: Build a cross target-zone optimizer with a tunable risk tolerance parameter. Conservative (protecting lead) = near post, high completion. Aggressive (chasing) = far post/cutback zone, higher xG. Present to coaches with game-state-specific recommendations.
Caitlan Krasinski, StatsBomb Conference, 2022-10-03. Risk-tolerance cross optimization model.
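The expected-value comparison in this edge can be reproduced in a few lines. The two zones use the illustrative numbers from the text above; the `pick_zone` helper and its linear blend between completion probability and scaled expected value are assumptions made for this sketch, not the optimizer presented at the conference.

```python
def cross_expected_value(p_completion, xg_if_completed):
    """Expected xG per cross = P(completion) x E(xG | completion)."""
    return p_completion * xg_if_completed

# Illustrative numbers from the text above.
zones = {
    "far_post": {"p_completion": 0.33, "xg_if_completed": 0.12},
    "near_post": {"p_completion": 0.60, "xg_if_completed": 0.02},
}

def pick_zone(zones, risk_tolerance):
    """Pick a target zone for a risk tolerance in [0, 1].

    0 favors completion probability (protecting a lead), 1 favors expected
    xG (chasing). The linear blend is a hypothetical scoring rule.
    """
    max_ev = max(cross_expected_value(**z) for z in zones.values())
    if max_ev == 0:
        return max(zones, key=lambda name: zones[name]["p_completion"])

    def score(z):
        ev_scaled = cross_expected_value(**z) / max_ev
        return (1 - risk_tolerance) * z["p_completion"] + risk_tolerance * ev_scaled

    return max(zones, key=lambda name: score(zones[name]))

ev_far = cross_expected_value(**zones["far_post"])
ev_near = cross_expected_value(**zones["near_post"])
protecting_lead = pick_zone(zones, risk_tolerance=0.0)
chasing = pick_zone(zones, risk_tolerance=1.0)
print(ev_far, ev_near, protecting_lead, chasing)
```

In practice the per-zone completion probability and xG would come from fitted models rather than constants.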
Conventional Wisdom Is Wrong

Q-Learning Is Conceptually Invalid for Football — Only SARSA Works

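Q-learning is off-policy: its update bootstraps from the best available next action, so it estimates the value of an optimal strategy, as if the analyst could choose the players' actions. You cannot control football players retrospectively. SARSA is on-policy: it bootstraps from the action that was actually taken next, so it evaluates the strategy that already exists, which makes it the only valid approach for historical match data.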

What most people do
Apply off-policy RL from game AI literature.
What the best do
Use SARSA with LSTM temporal context and three-outcome probability output, treating the match as one continuous sequence.
Why it's an edge: Q-learning's agent-control assumption is violated in observational sports data. SARSA also eliminates the possession-boundary problem.
How to exploit: Implement SARSA with 10-event LSTM sequences. Validate with behavioral assertion tests (penalty area shots > outside-box shots).
StatsBomb CTO, StatsBomb Conference, 2019-10-25
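The difference between the two methods comes down to the bootstrap target. The sketch below shows the standard one-step targets in tabular form; the action names and values are made up, and the LSTM and three-outcome output described above are left out.

```python
def sarsa_target(reward, gamma, q_next, next_action_taken):
    """On-policy target: bootstrap from the action actually observed next."""
    return reward + gamma * q_next[next_action_taken]

def q_learning_target(reward, gamma, q_next):
    """Off-policy target: bootstrap from the best next action,
    whether or not the player actually chose it."""
    return reward + gamma * max(q_next.values())

# Made-up action values in the next state.
q_next = {"recycle_pass": 0.02, "through_ball": 0.08}
observed_next_action = "recycle_pass"  # what the player really did

sarsa = sarsa_target(0.0, 1.0, q_next, observed_next_action)
q_learning = q_learning_target(0.0, 1.0, q_next)
print(sarsa, q_learning)
```

With observational data the SARSA target credits the state with what the team actually did next, while the Q-learning target credits it with a through ball that was never played.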
Conventional Wisdom Is Wrong

Speed Without Pressure Hurts Against Set Defenses

Speed WITHOUT pressure decreases goal probability against organized blocks, because it produces unnecessary errors while the defense stays set. Speed only helps UNDER pressure. Optimal unpressed speed is ~6 m/s.

What most people do
"Play fast to break down low blocks."
What the best do
Patient circulation when unpressed. Accelerate only when the opponent commits to pressing.
Why it's an edge: This conditional interaction is invisible to raw speed metrics.
How to exploit: Compute ball speed against set defenses segmented by pressure state. If speed is high when unpressed, coach patience.
Perdomo & Zarrella, 23 Sports, StatsBomb Conference, 2019-10-28
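The segmentation step in this edge can be sketched as a simple group-by. The event fields (`under_pressure`, `ball_speed_mps`) and the sample values are hypothetical; the 6 m/s reference is the approximate figure quoted above, used here only as a flagging threshold.

```python
from collections import defaultdict
from statistics import mean

UNPRESSED_SPEED_REFERENCE = 6.0  # m/s, the approximate optimum quoted above

def mean_speed_by_pressure(events):
    """Average ball speed, split by whether the ball carrier was pressed."""
    groups = defaultdict(list)
    for event in events:
        groups[event["under_pressure"]].append(event["ball_speed_mps"])
    return {pressed: mean(speeds) for pressed, speeds in groups.items()}

# Made-up events, already filtered to possessions against a set defense.
events_vs_set_defense = [
    {"under_pressure": False, "ball_speed_mps": 7.0},
    {"under_pressure": False, "ball_speed_mps": 8.0},
    {"under_pressure": False, "ball_speed_mps": 9.0},
    {"under_pressure": True, "ball_speed_mps": 10.0},
    {"under_pressure": True, "ball_speed_mps": 12.0},
]

by_state = mean_speed_by_pressure(events_vs_set_defense)
coach_patience = by_state[False] > UNPRESSED_SPEED_REFERENCE
print(by_state, coach_patience)
```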
Conventional Wisdom Is Wrong

Man City's Metrics Overrate Their Players — Team Structure Is the Cause

Man City players cluster at the top of positional metrics because the system creates structural spacing that inflates everyone. A low within-team standard deviation (SD) is the signal that this is a team effect. Transfer valuations based on raw metrics overrate players leaving elite possession systems.

What most people do
Rank by league-wide metrics and conclude top = most talented.
What the best do
Compute within-team z-scores. Use transfer natural experiments to separate team and individual contributions.
Why it's an edge: Every season, clubs overpay for players leaving possession-dominant systems, and those players' metrics then collapse.
How to exploit: Before signing from a top possession team: check within-team SD, check historical transfer metric declines, check national team context.
Thom Lawrence, StatsBomb Data Launch, 2018-05-23
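Within-team z-scores can be computed with the standard library. This is a sketch on made-up per-90 values for three hypothetical teammates; a real version would group by team and position first.

```python
from statistics import mean, pstdev

def within_team_z(values_by_player):
    """Z-score each player against their own teammates, not the league."""
    values = list(values_by_player.values())
    mu = mean(values)
    sd = pstdev(values)
    # If every teammate has the same value, there is no within-team spread.
    return {
        player: (value - mu) / sd if sd else 0.0
        for player, value in values_by_player.items()
    }

# Made-up metric values for three teammates at the same position.
team_values = {"A": 8.0, "B": 10.0, "C": 12.0}
z = within_team_z(team_values)
within_team_sd = pstdev(team_values.values())
print(z, within_team_sd)
```

A player who tops the league table but sits near zero here is mostly showing the team effect.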
Conventional Wisdom Is Wrong

The Teams That Top Raw Ball Speed Rankings Are NOT Playing at High Tempo

tactical-analysis, tempo-quantification

Tempo defined as raw ball speed produces absurd rankings — Stoke City under Pulis topped raw speed because of long clearances. True tempo is actual speed minus EXPECTED speed for each pass's context. Barcelona's short combinations in tight spaces register as "high tempo" because expected speed is low but they execute quickly. Stoke's long clearances register as average because expected speed for those situations is already high.

What most people do
Measure tempo via raw ball speed, possession speed, or passes per minute — all of which conflate meaningless speed (long clearances) with meaningful speed (quick combinations through pressure).
What the best do
Build a context-adjusted tempo metric: mean(actual_speed - expected_speed) across all passes. This separates meaningful tempo (faster than the context demands) from meaningless speed (long balls that happen to travel fast). Slice by pitch zone to distinguish tempo in buildup (less important) from tempo in the final third (more important).
Why it's an edge: Teams with genuinely high tempo in the final third create chances before defenses organize — the 20-second zone dwell principle connects directly. But this can't be measured without the expected-speed model. Raw speed metrics rank the wrong teams at the top.
How to exploit: Build the expected ball speed model, compute tempo scores, and identify which players and teams play faster than context demands in the zones that matter (final third, progression phase). Recruit players with high tempo in those specific zones. For opponents: identify where their tempo drops (likely against organized low blocks) and prepare to force them into that context.
Cross-domain parallel
In chess, tempo is "accomplishing an objective with fewer moves than expected" — the same relative-to-expectation concept.
Devin Pleuler, Toronto FC, StatsBomb Conference, 2021-11-04. Stoke vs. Barcelona example.
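The context-adjusted tempo metric, mean(actual_speed - expected_speed), can be sketched directly. The expected speeds would come from a fitted model; here they are made-up constants, and the two pass lists are hypothetical stand-ins for short combinations and long clearances.

```python
from statistics import mean

def tempo_score(passes):
    """Mean of (actual speed - expected speed) across a set of passes."""
    return mean([p["actual_speed"] - p["expected_speed"] for p in passes])

# Made-up values in m/s. Expected speed is low for short passes in tight
# spaces and already high for long clearances.
short_combinations = [
    {"actual_speed": 12.0, "expected_speed": 9.0},
    {"actual_speed": 14.0, "expected_speed": 10.0},
]
long_clearances = [
    {"actual_speed": 25.0, "expected_speed": 25.0},
    {"actual_speed": 27.0, "expected_speed": 27.0},
]

combo_tempo = tempo_score(short_combinations)
clearance_tempo = tempo_score(long_clearances)
print(combo_tempo, clearance_tempo)
```

The clearances are faster in raw terms but score zero, which is the ranking reversal the edge describes.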

🔑Hidden Causal Lever(62)

🔑 Hidden Causal Lever

Naming a Metric Poorly Guarantees Rejection Regardless of Analytical Quality

data-infrastructure, analyst-translator-role

The linguistics of metric naming directly determines adoption. "Expected Goals" and "Walks plus Hits per Inning Pitched" (WHIP) tell you what they measure, and they succeed. Corsi, Fenwick, and PDO (hockey metrics named after people or meaningless acronyms) block adoption regardless of quality because nobody knows what they mean. Proprietary "roll-up grades" (single-number player grades) are the "bane of the professional analyst's existence": they undermine trust and prevent deeper engagement. The name IS the adoption strategy.

What most people do
Name metrics after their inventor, use opaque acronyms, or create proprietary composite grades — then wonder why coaches don't adopt them.
What the best do
Put the calculation in the name whenever possible. Avoid names that imply more evaluation than is happening. Avoid proprietary roll-up grades that promise to solve everything. Accept that naming well doesn't ensure adoption, but naming poorly ensures rejection.
Why it's an edge: This is the cheapest, highest-leverage intervention in analytics adoption. Renaming an existing metric costs nothing and can transform its uptake. Every metric name should pass the test: "Can someone who's never seen this metric guess roughly what it measures from the name alone?"
How to exploit: Audit every metric name in your analytics stack against this principle. Rename opaque metrics. Kill proprietary roll-up grades. When developing new metrics, spend as much time on the name as on the model. Run the naming test with non-analysts before launch.
Seth Partnow, StatsBomb Innovation in Football Conference, 2019-10-28. Applied linguistics framework to analytics naming. Corsi/Fenwick/PDO as failure cases.
🔑 Hidden Causal Lever

Descriptive Statistics Are Incredibly Underrated — Skip Them and Your Models Will Be Ignored

Analytics adoption follows a linguistic framework: metrics must be culturally transmitted, discrete (clearly bounded), and productive (combinable). Skipping straight to complex models before establishing shared vocabulary for basic descriptive stats is the #1 reason analytics departments fail. When a coach and analyst argue about a finding, they're usually arguing about definitions, not data. Building the shared vocabulary takes 3+ years.

What most people do
Jump to complex models (WAR-equivalents, EPV, xG-based composite grades) because they're analytically interesting, expecting coaches to adopt them based on accuracy.
What the best do
Start with simple, well-named counting stats that coaches recognize from watching the game. Build incrementally: once basic vocabulary is shared, combine building blocks into models. The apparent complexity decreases because stakeholders understand the components. Budget 3+ years for this process.
Why it's an edge: Clubs that rush to sophisticated models waste analytical investment because coaches can't use what they don't understand. The clubs that dominate are the ones that built vocabulary first — their coaches trust the data because they understand the ingredients. This is a competitive advantage measured in years of organizational learning.
How to exploit: Before deploying any model, ensure the component metrics are understood by all stakeholders. Name metrics so the calculation is in the name (like "Expected Goals"). Avoid opaque acronyms and proprietary roll-up grades. Test adoption: "Can the coach explain what this metric means in football terms?" If not, step back to simpler building blocks.
Seth Partnow, StatsBomb Innovation in Football Conference, 2019-10-28. Applied linguistics framework to analytics adoption. Cited Bill James: "statistics have acquired the power of language."
🔑 Hidden Causal Lever

Analytics Departments Fail Because They Solve Problems Nobody Asked About

The most common failure mode for analytics departments is not analytical quality but problem selection. Analysts build sophisticated models that answer questions coaches never had, then are surprised when their work is ignored. The root cause: analysts optimize for analytical impressiveness rather than coach/decision-maker pain points. The fix is not better models but better problem discovery — sit in tactical meetings, listen to what coaches argue about, and solve THOSE problems.

What most people do
Build the most analytically sophisticated work they can and present it, hoping the coach will see the value.
What the best do
Spend the first month in a new role attending every tactical meeting without presenting anything. Catalog the recurring arguments, the information gaps that coaches work around, and the decisions made with insufficient data. Then build tools that close THOSE gaps — not the gaps the analyst finds interesting.
Why it's an edge: A simple tool that answers a question the coach asks every week has 100x the impact of a sophisticated model that answers a question nobody has. The analyst's skill ceiling is bounded by their ability to identify the right problems, not their modeling capability.
How to exploit: Before building any new analytical tool, verify that a specific decision-maker has the problem you're solving. If you can't name the person and the decision, the tool will be ignored.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18; Sam Gregory, Inter Miami, StatsBomb Conference, 2022-09-29. Both emphasize problem selection over model sophistication.
🔑 Hidden Causal Lever

Static-Destination EPV Systematically Undervalues Through-Balls

Most EPV computes destination value using the receiver's position at pass release. For through-balls, the receiver is running toward the destination — the model sees an "empty space" pass and assigns low value. Through-balls, runs in behind, and diagonal balls into space are systematically undervalued.

What most people do
Use static-destination EPV. Accept that through-balls look low-value.
What the best do
Implement velocity-based trajectory prediction. Compute destination EPV at estimated arrival time, not at pass release time.
Why it's an edge: Classic #10s and inside forwards who specialize in through-balls are undervalued. Their most creative passes register as low-value because the model doesn't see where the receiver WILL be.
How to exploit: Build anticipatory pass value as a separate metric. Use it to identify players invisible in standard EPV.
Javier Fernandez, FC Barcelona, 2019-10-22
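A minimal sketch of the anticipatory idea, assuming the receiver keeps a constant velocity while the ball travels. The `toy_epv` surface (value rising linearly toward the opponent's goal line) is a placeholder for a real EPV model, and all coordinates and speeds are made up.

```python
def arrival_time(pass_distance_m, ball_speed_mps):
    """Seconds the ball needs to cover the pass distance at constant speed."""
    return pass_distance_m / ball_speed_mps

def predicted_position(position, velocity, seconds):
    """Constant-velocity extrapolation of the receiver's position."""
    x, y = position
    vx, vy = velocity
    return (x + vx * seconds, y + vy * seconds)

def toy_epv(position, pitch_length=105.0):
    """Placeholder value surface: closer to the opponent's goal is better."""
    return position[0] / pitch_length

receiver_at_release = (60.0, 30.0)   # metres
receiver_velocity = (6.0, 0.0)       # m/s, sprinting toward goal
t = arrival_time(pass_distance_m=20.0, ball_speed_mps=10.0)
receiver_at_arrival = predicted_position(receiver_at_release, receiver_velocity, t)

static_value = toy_epv(receiver_at_release)
anticipatory_value = toy_epv(receiver_at_arrival)
print(receiver_at_arrival, static_value, anticipatory_value)
```

The gap between `anticipatory_value` and `static_value` is the amount a static-destination model would leave off a through-ball.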
🔑 Hidden Causal Lever

The Team's Collective Response to Pressure Is More Important Than the Individual's — And It's Visible in the 10-Second Ball Path

The 10-second ball path after a pressure event captures not just the pressured player's decision but the entire team's collective response — where teammates move, who offers support, how the second and third passes route the ball. A team with consistently forward ball paths after pressure has a collective press-beating system, not just press-resistant individuals. Signing a press-resistant player into a team without collective press-beating movement won't change the ball path.

What most people do
Evaluate individual players' press resistance in isolation and assume team press resistance follows.
What the best do
Compute team-level ball paths after pressure by zone. If the team's collective ball path is consistently forward regardless of which individual is pressed, the system is the cause, not the individual. If only specific players' pressure events lead to forward paths, the system depends on those players.
Why it's an edge: Teams that rely on 1-2 press-resistant individuals are fragile — injure or press those players specifically and the system collapses. Teams with collective press-beating movement are robust — the ball path is forward regardless of who receives the pressure.
How to exploit: Compute player-specific vs. team-average ball paths after pressure. If the variance across players is low (everyone's ball path is similar), the team has a systemic solution. If variance is high (only 2-3 players have forward paths), the team is fragile — target those players with specific pressing.
Thom Lawrence, StatsBomb Data Launch, 2018-05-23. 10-second ball path averaging technique across team-level contexts.
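The player-versus-team comparison can be sketched as a variance check. Each pressure event here carries a hypothetical `forward_m_10s` field (metres the ball advanced toward goal in the 10 seconds after the pressure); the field name and the two sample teams are made up.

```python
from collections import defaultdict
from statistics import mean, pvariance

def forward_path_by_player(pressure_events):
    """Mean forward ball progress in the 10 s after each player's pressure events."""
    paths = defaultdict(list)
    for event in pressure_events:
        paths[event["player"]].append(event["forward_m_10s"])
    return {player: mean(values) for player, values in paths.items()}

def cross_player_variance(pressure_events):
    """Low variance suggests a systemic solution; high variance suggests fragility."""
    return pvariance(list(forward_path_by_player(pressure_events).values()))

def make_events(values_by_player):
    return [
        {"player": player, "forward_m_10s": v}
        for player, values in values_by_player.items()
        for v in values
    ]

# Made-up teams: one where everyone's path is similar, one reliant on player A.
systemic_team = make_events({"A": [5.0, 7.0], "B": [6.0, 6.0], "C": [5.0, 7.0]})
fragile_team = make_events({"A": [12.0, 10.0], "B": [-2.0, 0.0], "C": [-1.0, -1.0]})

systemic_var = cross_player_variance(systemic_team)
fragile_var = cross_player_variance(fragile_team)
print(systemic_var, fragile_var)
```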
🔑 Hidden Causal Lever

xT Misses the Best Deep Playmakers — Threat Facilitation Is the Missing Metric

Busquets consistently plays backward passes that generate low direct xT. But the NEXT action after his pass is consistently dangerous — high "threat facilitated." Standard xT evaluation ranks him poorly because it only measures the delta of HIS action, not what his action enables. This pattern applies to all deep-lying playmakers who set up the next progressive action rather than executing it themselves.

What most people do
Use xT or EPV delta per action to rank players, which systematically undervalues facilitators who play backward-but-enabling passes.
What the best do
Add "threat facilitated" (xT of the NEXT action after the player's outgoing pass) as a separate metric. Facilitators like Busquets score low on direct xT but elite on facilitated xT.
Why it's an edge: The market systematically undervalues players whose contribution is in the setup, not the execution. Clubs seeking deep playmakers who evaluate only xT will miss the best candidates.
How to exploit: Compute threat facilitated per 100 combinations alongside direct xT. Players with low direct xT but high facilitated xT are deep playmaker archetypes — recruit them before clubs using direct xT only identify them.
Shou & Manus, StatsBomb Conference, 2021-11-04. Busquets explicitly cited as the archetype facilitator missed by standard xT evaluation.
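"Threat facilitated" as described here is the xT of the action that follows a player's pass. The sketch below assumes an ordered event list with hypothetical `player`, `team`, `type`, and `xt_delta` fields, and counts the next action only if the same team still has the ball; the numbers are made up.

```python
def direct_threat(actions, player):
    """Sum of xT deltas of the player's own actions."""
    return sum(a["xt_delta"] for a in actions if a["player"] == player)

def threat_facilitated(actions, player):
    """Sum of xT deltas of the action immediately after each of the player's passes."""
    total = 0.0
    for current, nxt in zip(actions, actions[1:]):
        is_players_pass = current["player"] == player and current["type"] == "pass"
        same_team_next = nxt["team"] == current["team"]
        if is_players_pass and same_team_next:
            total += nxt["xt_delta"]
    return total

# Made-up sequence: the deep midfielder (DM) plays low-value passes that
# set up dangerous next actions by teammates.
actions = [
    {"player": "DM", "team": "home", "type": "pass", "xt_delta": -0.01},
    {"player": "CM", "team": "home", "type": "pass", "xt_delta": 0.06},
    {"player": "DM", "team": "home", "type": "pass", "xt_delta": -0.02},
    {"player": "W", "team": "home", "type": "carry", "xt_delta": 0.05},
]

dm_direct = direct_threat(actions, "DM")
dm_facilitated = threat_facilitated(actions, "DM")
print(dm_direct, dm_facilitated)
```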
🔑 Hidden Causal Lever

Carries Into the Center Are Twice as Dangerous as Passes to the Wing — But You Can't Tell if It's Causal

Carries account for only 27% of offensive zone entries but produce shots 36% of the time. Passes are 73% of entries but only 26% produce shots. The critical distinction: carries compress toward the center (sideline to center), while passes expand toward the sidelines. Central carries ending in the middle 20m produce shots ~50% of the time. But the causal ambiguity is unresolved: do central carries CAUSE better outcomes, or do they only HAPPEN when the defense is already disorganized?

What most people do
Treat zone entries as binary (entered or didn't) without distinguishing method, direction, or length.
What the best do
Segment entries by method (carry vs. pass), starting position, ending position, and length. Recognize that the 50% shot rate for central carries may be partially selection bias (only possible when defense is already broken) and flag this limitation rather than treating the number as a causal prescription.
Why it's an edge: Knowing this distinction means you can diagnose WHY a team has high zone entry rates but low shot rates (answer: too many wide pass entries, not enough central carries). The fix isn't "do more carries" — it's "create the defensive disorganization that ALLOWS central carries."
How to exploit: Track carry vs. pass entry ratio and entry ending position for your team and opponents. If your team over-indexes on wide pass entries, investigate whether the issue is personnel (no dribbler to carry centrally) or tactical (not creating the preconditions for central carries). For opponent analysis: if they rely on central carries, force them wide — their shot creation drops dramatically.
Benjamin (physicist), StatsBomb Innovation in Football Conference, 2019-10-25. Conditional probability analysis across 5 major leagues.
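The entry shares and shot rates quoted above combine through the law of total probability. The short calculation below uses only the four figures from the text and shows what they imply for the overall shot rate per entry and for the share of shot-producing entries that come from carries.

```python
# Figures quoted in the text above.
carry_share, carry_shot_rate = 0.27, 0.36
pass_share, pass_shot_rate = 0.73, 0.26

# P(shot) = P(carry) * P(shot | carry) + P(pass) * P(shot | pass)
overall_shot_rate = carry_share * carry_shot_rate + pass_share * pass_shot_rate

# P(carry | shot) by Bayes' rule: how much of the shot output carries account for.
carry_share_of_shot_entries = carry_share * carry_shot_rate / overall_shot_rate

print(round(overall_shot_rate, 3), round(carry_share_of_shot_entries, 3))
```

Carries are 27% of entries but about a third of the entries that end in a shot. This is descriptive only and says nothing about the causal question raised above.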
🔑 Hidden Causal Lever

Central Carries Are the Most Dangerous Action in Football — 50% Shot Rate

Carries ending in the middle 20m of the pitch produce shots 50% of the time, roughly double the 26% shot rate of pass entries overall. The spatial pattern is the mechanism: carries start on the sideline and cut to center, arriving with momentum and face-on orientation that passes cannot replicate. The carrier has already committed defenders laterally, creating the shot opportunity through the movement itself.

What most people do
Treat carries and passes as interchangeable zone-entry methods. Value progressive passes more than progressive carries in player evaluation because pass data is richer.
What the best do
Separately track central carry entries and compute their shot conversion rate. Recruit players who can execute sideline-to-center carries under pressure. Design tactical plans that create 1v1 carry opportunities in wide zones with inside-cutting angles.
Why it's an edge: Carries are underrepresented in most metrics (only 27% of entries) but disproportionately productive. Per entry, a central carry is roughly twice as likely to end in a shot as a pass entry (50% vs. 26%), so counting entries alone understates what central carriers contribute.
How to exploit: Build a "central carry entry rate" metric. Scout for wingers and fullbacks with high central carry frequency AND high shot-generation rate from carries. Prioritize tactical schemes that create isolation carry opportunities over crossing schemes.
Benjamin (physicist), StatsBomb Innovation in Football Conference, 2019-10-25. Carries ending in middle 20m showed 50% shot rate vs 26% for pass entries overall.
🔑 Hidden Causal Lever

Emery's Third-Season Curse Is Predictive, Not Coincidental

betting-intelligence, coach-tendency-profiling

Unai Emery has a documented pattern across Sevilla, PSG, Arsenal, and Villarreal where performance peaks in season 1-2 then collapses in season 3. This isn't random variance — it reflects a tactical approach that opponents decode and a motivational style that has diminishing returns. By season 3, the press patterns are scouted, the in-game adjustments are anticipated, and the dressing room dynamic shifts.

What most people do
Evaluate Emery based on his most recent season's results.
What the best do
Overlay his tenure-by-tenure trajectory at every club. The pattern is consistent enough to bet against season-3 Emery at any club.
Bet The Process / The Chaps on Film podcast analysis, 2024-2025.
🔑 Hidden Causal Lever

Moyes Is a Competence Floor Raiser, Not a Ceiling Raiser — Bet Accordingly

betting-intelligence, coach-tendency-profiling

David Moyes consistently overperforms expectations at lower-quality squads (Everton, first West Ham stint) and underperforms at higher-quality ones (Manchester United, second West Ham stint after spending). His value is as a "competence amplifier" — he brings organization and defensive solidity to chaotic squads but lacks the tactical sophistication to maximize elite talent. Market odds tend to treat him as a single-quality manager regardless of squad level.

What most people do
Price Moyes as "solid mid-table manager" everywhere.
What the best do
Back Moyes heavily when he's at a team punching below its organizational weight; fade him when he's given elite resources and expected to compete at the top.
Bet The Process podcast analysis, multiple episodes 2024-2025.
🔑 Hidden Causal Lever

Your Counting Scheme Determines Who Looks Valuable — From Identical Events

data-infrastructure, counting-scheme-bias

Re-categorizing the same play-by-play data flipped player value rankings entirely. The NBA's "contested shot" definition classified 90% of 3-pointers as "open." How you define "progressive pass" determines the leaderboard. All statistics are representative abstractions.

What most people do
Treat published metrics as objective. Accept one definition without testing alternatives.
What the best do
Test alternative counting schemes. If small definitional changes shuffle rankings by >30%, the metric is fragile.
Why it's an edge: Organizations build strategies around untested definitions. Once counted, people assume it matters.
How to exploit: For any decision-driving metric, build at least one alternative definition. If the top-10 changes dramatically, the decision is unreliable.
Cross-domain parallel
Backtesting results are sensitive to index construction — same stocks, different weights, different "best" strategies.
Seth Partnow, StatsBomb Conference, 2019-10-28
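One way to run the alternative-definition test is to compare the top-N lists produced by two counting schemes. The sketch below is an assumption about how to operationalize "rankings shuffle by >30%": it treats top-N turnover above 0.30 as fragile. Player names and counts are made up.

```python
def top_n(scores, n):
    """Names of the n highest-scoring players."""
    return set(sorted(scores, key=scores.get, reverse=True)[:n])

def top_n_overlap(scores_a, scores_b, n):
    """Share of the top n under definition A that is also top n under definition B."""
    return len(top_n(scores_a, n) & top_n(scores_b, n)) / n

# Made-up "progressive pass" counts under two hypothetical definitions.
definition_a = {"P1": 50, "P2": 45, "P3": 40, "P4": 20, "P5": 10}
definition_b = {"P1": 30, "P2": 12, "P3": 35, "P4": 33, "P5": 11}

overlap = top_n_overlap(definition_a, definition_b, n=3)
turnover = 1 - overlap
metric_is_fragile = turnover > 0.30  # threshold borrowed from the text above
print(overlap, turnover, metric_is_fragile)
```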
🔑 Hidden Causal Lever

Percentile Rank Profiles Transfer Across Leagues Better Than Absolute Values

When a player transfers between leagues, their percentile-rank profile within position peer groups tends to be more stable than absolute values. Ronaldo's Juventus season was statistically near-identical to his Real Madrid season across all key metrics. Guendouzi from Ligue 2 to Premier League showed nearly identical per-90 rate profiles. But some metrics transfer better than others: technical/passing profiles are more stable than finishing rates; physical metrics are less stable.

What most people do
Compare absolute values across leagues (10 progressive carries in Ligue 2 = 10 in the Premier League) or apply a crude blanket league discount.
What the best do
Compare percentile ranks within position groups pre- and post-transfer. Build metric-specific league adjustment coefficients based on historical transfer evidence. Track which metrics are most and least stable across league moves. Use stability data to set confidence levels on pre-transfer predictions.
Why it's an edge: Building metric-specific league adjustment factors — not a single blanket discount — dramatically improves post-transfer performance prediction. A player who is 95th percentile in Ligue 2 progressive passing might translate to 80th percentile in the Premier League, while a 95th percentile finisher in Ligue 2 might only translate to 50th percentile. The metric matters as much as the league.
How to exploit: Build a historical transfer database tracking pre/post-transfer percentile profiles by metric. Compute per-metric, per-league-pair adjustment factors. Apply these when evaluating cross-league transfer targets instead of a single discount. Weight more stable metrics (passing profile, pressing rate) higher than less stable ones (finishing rate) in cross-league evaluation.
Ted Knutson & Siqur Arshad, WFS 2019. Ronaldo Real→Juventus near-identical; Guendouzi Ligue 2→PL stable; Jovic Frankfurt→Real validation.
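Percentile rank within a position peer group is straightforward to compute. The sketch uses the "share of peers at or below this value" convention, which is one of several reasonable definitions; the peer groups and the player's values are made up and do not describe any real transfer.

```python
def percentile_rank(value, peer_values):
    """Percent of peers (including the player) whose value is <= this one."""
    return 100 * sum(v <= value for v in peer_values) / len(peer_values)

# Made-up progressive passes per 90 for midfielders in two leagues.
origin_league_peers = [3.0, 4.0, 5.0, 6.0, 8.0]
destination_league_peers = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0]

pre_transfer = percentile_rank(8.0, origin_league_peers)
post_transfer = percentile_rank(7.5, destination_league_peers)

# A per-metric, per-league-pair adjustment factor would be estimated from
# many such pairs; a single pair is shown only to make the idea concrete.
percentile_shift = post_transfer - pre_transfer
print(pre_transfer, post_transfer, percentile_shift)
```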
🔑 Hidden Causal Lever

Some Skills Transfer Across Leagues and Some Don't — The Transfer Risk Is In the Skill Profile, Not the League Gap

Cross-league transfers fail not because of a blanket "league quality gap" but because specific skills have different transferability. Technical skills (pass completion above expected, dribble success rate) transfer well across leagues. Tactical positioning skills transfer moderately (some adaptation needed). Physical-dependent skills (aerial duel win rate, sprint-based pressing) transfer poorly because the physical baseline shifts. A player whose value comes primarily from physical-dependent skills is a high-risk cross-league transfer. A player whose value comes from technical-tactical skills is a low-risk one.

What most people do
Apply a flat "league discount" to cross-league transfers (e.g., "Eredivisie to Premier League means -20%").
What the best do
Decompose the player's value into skill categories with known transferability coefficients. A technically-driven midfielder from the Eredivisie is a much lower-risk transfer than a physically-driven striker from the same league.
Why it's an edge: The flat league discount is applied uniformly, which means technically-driven players from lower leagues are over-discounted (opportunity) and physically-driven players are under-discounted (risk). Skill-specific transferability analysis reveals which players are safe bets and which are dangerous ones.
How to exploit: Build a skill-category transferability matrix from historical cross-league transfer data. For each target, compute a skill-weighted transfer risk score rather than applying a blanket league discount. Prioritize technically-driven players from lower leagues — they're systematically over-discounted.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18. Cross-league stability analysis for recruitment risk assessment.
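A skill-weighted transfer risk score could look roughly like this. The three categories follow the text, but the transferability coefficients and the player value shares are placeholder numbers; in practice they would be estimated from historical cross-league transfers.

```python
# Assumed transferability coefficients per skill category (1.0 = fully portable).
# Only the ordering (technical > tactical > physical) comes from the text.
TRANSFERABILITY = {"technical": 0.9, "tactical": 0.7, "physical": 0.4}

def transfer_risk(value_shares):
    """Risk = 1 - share-weighted transferability; shares must sum to 1."""
    total = sum(value_shares.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("value shares must sum to 1")
    retained = sum(TRANSFERABILITY[k] * share for k, share in value_shares.items())
    return 1.0 - retained

# Hypothetical decomposition of where each player's value comes from.
midfielder = {"technical": 0.6, "tactical": 0.3, "physical": 0.1}
striker = {"technical": 0.2, "tactical": 0.2, "physical": 0.6}

risk_mid = transfer_risk(midfielder)
risk_st = transfer_risk(striker)
```

Two players from the same league get different risk scores here, whereas a flat league discount would treat them identically.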
🔑 Hidden Causal Lever

Good Decisions That Produce Bad Outcomes Get Punished — Destroying Future Decision Quality

Without systematic decision process logging, organizations evaluate on outcomes, not process quality. A signing that works out masks a flawed process that will fail next time. A signing that fails despite a sound process gets punished, discouraging the correct approach. The compounding effect of consistently good processes produces structural optionality, but only if the organization evaluates process separately from outcome.

What most people do
Run post-mortems on outcomes (did the signing work?) rather than process (did we consider the right options with the right data?). Good outcomes validate the process; bad outcomes condemn it.
What the best do
Log every major decision with: options considered, criteria used, rejected alternatives and why, expected outcome. Retrospectively evaluate whether the process was sound regardless of outcome. Build institutional memory of decision quality that compounds over time.
Why it's an edge: This is the "decision scientist" role from big tech (Meta/Google) applied to football. Clubs that separate process from outcome make better decisions over time because they're not punishing correct-but-unlucky choices or rewarding incorrect-but-lucky ones.
How to exploit: Implement a decision log for all recruitment, tactical, and lineup decisions. Review quarterly: was the process sound? Track how often sound-process decisions produce good outcomes, and how often unsound ones do. Over 3+ years, outcomes should track process quality more closely, because bad processes get identified before they produce catastrophic outcomes.
Ravi Ramineni, Source Football, StatsBomb Conference, 2023-10-26. Decision scientist role from big tech applied to football operations.
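One possible shape for such a decision log, as a sketch only. The field names mirror the items listed above (options, criteria, rejected alternatives, expected outcome); the class, the review flags, and the sample entries are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

@dataclass
class Decision:
    title: str
    options_considered: list
    criteria: list
    rejected_and_why: dict
    expected_outcome: str
    process_sound: Optional[bool] = None  # set at the quarterly review
    outcome_good: Optional[bool] = None   # set once the result is known

def process_outcome_table(log):
    """Count reviewed decisions in each (process_sound, outcome_good) cell."""
    cells = Counter()
    for d in log:
        if d.process_sound is None or d.outcome_good is None:
            continue  # not yet reviewed, or outcome not yet known
        cells[(d.process_sound, d.outcome_good)] += 1
    return cells

log = [
    Decision("Sign LB", ["A", "B"], ["age", "fit"], {"B": "injury record"},
             "starter within a season", process_sound=True, outcome_good=False),
    Decision("Sign ST", ["C"], ["agent tip"], {},
             "15 goals", process_sound=False, outcome_good=True),
    Decision("Loan CM", ["D", "E"], ["minutes", "fit"], {"E": "wage demand"},
             "30 appearances", process_sound=True, outcome_good=True),
]

table = process_outcome_table(log)
unlucky = table[(True, False)]  # sound process, bad outcome: do not punish
lucky = table[(False, True)]    # flawed process, good outcome: do not reward
```

The two off-diagonal cells are the ones an outcome-only post-mortem misjudges, so they are the ones worth reading out at each review.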
🔑 Hidden Causal Lever

Defensive Value Is 3x Harder to Measure Than Attacking Value — And Clubs Overpay for Attack as a Result

Defensive contributions are structurally harder to quantify because great defense often means nothing happens — no shot, no chance, no event to record. A center-back who positions perfectly so the opposition never attempts the through ball creates enormous value that generates zero data points. Attacking contributions (goals, assists, chances created) are directly observable and quantifiable. This measurement asymmetry causes a systematic market pricing error: clubs overpay for attackers (whose value is fully captured in data) and underpay for defenders (whose value is mostly invisible).

What most people do
Evaluate defenders using event-based metrics (tackles, interceptions, clearances) which only capture reactive defending — the things that happen AFTER the positioning has already failed.
What the best do
Use tracking data to measure proactive defensive value: how much opposing xT was suppressed by defender positioning alone (no tackle or interception needed)? This requires 360 data and spatial threat models but captures the majority of defensive value that event data misses.
Why it's an edge: The most valuable defenders are the ones whose metrics look least impressive because they prevent events from happening. A CB who makes 0 tackles per game because nobody dribbles at them is likely better positioned than a CB who makes 4 tackles per game.
How to exploit: Build "threat suppressed" metrics from 360/tracking data. Identify defenders whose presence reduces opposing xT more than their event-based metrics suggest. These players are systematically underpriced.
Gregory Everett, StatsBomb Conference, 2022-10-03. TAx (Threat Above Expected) as a measure of positioning quality.
🔑 Hidden Causal Lever

Against Set Defenses, the Byline Is the Exploit — Not the Center

The threat landscape differs fundamentally by game state. Against set (organized) defenses, locations near the halfway line have almost zero threat (no space to attack into), but threat increases sharply near the byline because cutbacks penetrate organized blocks. Against counters, the pattern inverts: high threat near the halfway line (space to run into), declining toward the byline. Near the byline, the threat surfaces CONVERGE across game states — cutbacks are dangerous regardless of defensive organization.

What most people do
Use a single, unconditional expected threat map for all game states, treating every zone as equally valuable regardless of whether the team is countering or probing a set defense.
What the best do
Compute separate threat surfaces per game state. Against set defenses, prioritize byline penetration and cutbacks. Against disorganized defenses, prioritize direct central progression. The tactical prescription is game-state-dependent.
Why it's an edge: Man City's half-space-to-byline cutback strategy is specifically designed to exploit the set-defense threat landscape. Teams that understand this can replicate the principle: against organized blocks, the valuable zone is the byline, not the center of the box. Most teams waste possession probing the center against set defenses when the exploit is at the edges.
How to exploit: When facing a set defense (>20 seconds in possession, opponent organized), route attacks to the byline rather than trying to penetrate centrally. Measure byline entry rate against set defenses as a tactical KPI. For opponent analysis: check if they use cutbacks disproportionately against set defenses — if so, defend the byline, not the center.
Perdomo & Zarrella, 23 Sports, StatsBomb Innovation in Football Conference, 2019-10-28. Man City's pattern explicitly identified.
🔑 Hidden Causal Lever

Expected Ball Speed Reveals Intent Independently of Outcome

A pass played at 20 m/s when the expected speed for that context is 12 m/s reveals urgent intent — the player was trying to execute quickly, regardless of whether the pass completed. Conversely, a pass at 8 m/s when expected was 12 m/s suggests hesitation or a deliberate tempo change. The actual-minus-expected speed delta is an intent signal that event data doesn't capture, because event data only records what happened, not how urgently the player tried to make it happen.

What most people do
Analyze pass outcomes (complete/incomplete) and destinations without considering execution speed relative to context.
What the best do
Use ball speed data (from tracking or estimated from timestamps) to compute speed delta per pass. Aggregate by player and situation to reveal who plays with urgency in key moments and who hesitates.
Why it's an edge: A player who consistently exceeds expected speed in the progression phase is a tempo-setter — they're forcing the defense to react faster. A player who consistently underperforms expected speed is a tempo-breaker. Neither shows up in standard pass metrics.
How to exploit: Build expected ball speed models from tracking data. Identify tempo-setters (consistently positive speed delta in progression/final third) and recruit them for pressing systems that demand fast ball circulation.
Devin Pleuler, Toronto FC, StatsBomb Conference, 2021-11-04. Expected ball speed as the foundation for tempo quantification.
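The speed-delta aggregation can be sketched as below. The expected speed is taken as a given input here (it would come from a separate model that is not shown), and the pass records and phase labels are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical passes: (player, phase, actual_speed_mps, expected_speed_mps).
passes = [
    ("A", "progression", 20.0, 12.0),
    ("A", "progression", 15.0, 13.0),
    ("B", "progression", 8.0, 12.0),
    ("B", "build_up", 11.0, 11.0),
]

def mean_speed_delta(rows, phase):
    """Average actual-minus-expected ball speed per player within one phase."""
    by_player = defaultdict(list)
    for player, ph, actual, expected in rows:
        if ph == phase:
            by_player[player].append(actual - expected)
    return {p: mean(d) for p, d in by_player.items()}

deltas = mean_speed_delta(passes, "progression")
tempo_setters = [p for p, d in deltas.items() if d > 0]
```

A real version would need a minimum pass count per player before labelling anyone a tempo-setter or tempo-breaker.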
🔑 Hidden Causal Lever

EPV Horizon Blindness to Slow-Developing Plays

EPV with a fixed horizon (for example, 10 events or 10 seconds) assigns zero credit to the initiating actions of slow-developing plays. A midfield pass that triggers a goal 30 events later gets no credit. Corners that lead to second-phase goals are invisible.

What most people do
Accept the fixed horizon and treat EPV as a complete valuation.
What the best do
Track "beyond-horizon goal contribution" alongside EPV. Build extended attribution models that trace full causal chains regardless of possession boundaries.
Why it's an edge: Teams excelling at slow buildup or second-phase set pieces are systematically undervalued by any standard EPV implementation.
How to exploit: Build complementary "extended attribution" tracing back from goals to the full causal chain. Use as a second lens for players with mediocre EPV but whose teams consistently score from slow sequences.
StatsBomb CTO, 2019-10-25
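A simple way to count "beyond-horizon goal contribution" is sketched below. It assumes each event already carries a `chain_id` linking it to the attacking sequence it belongs to, possibly across possession boundaries; how that chain is built is the hard part and is not shown. The event layout and the 10-event horizon are illustrative.

```python
from collections import Counter

HORIZON = 10  # events; the fixed horizon used as an example in the text

def beyond_horizon_contributions(events, horizon=HORIZON):
    """events: dicts with 'player', 'chain_id', 'is_goal', in match order.
    Credits every action in a goal's chain that sits more than `horizon`
    events before the goal, i.e. the actions a fixed-horizon model ignores."""
    credit = Counter()
    for i, ev in enumerate(events):
        if not ev["is_goal"]:
            continue
        for j in range(i):
            earlier = events[j]
            if earlier["chain_id"] == ev["chain_id"] and i - j > horizon:
                credit[earlier["player"]] += 1
    return credit

# Hypothetical slow build-up: one DM action, eleven CM actions, then a goal.
events = (
    [{"player": "DM", "chain_id": 1, "is_goal": False}]
    + [{"player": "CM", "chain_id": 1, "is_goal": False} for _ in range(11)]
    + [{"player": "ST", "chain_id": 1, "is_goal": True}]
)

credit = beyond_horizon_contributions(events)
```

Only the two earliest actions fall outside the horizon in this example, so only they receive beyond-horizon credit.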
🔑 Hidden Causal Lever

EPV Surfaces Value in Actions That Generate Zero xG

EPV values every action by its impact on goal probability within the full possession, not just at the shot. This means a backward pass that opens a channel, a decoy run that pulls a defender, or a ball receipt that draws pressure all have measurable EPV deltas — even though they generate zero xG and zero xT. The majority of valuable actions in football produce no shots and no zone progression, making them invisible to xG and xT frameworks.

What most people do
Evaluate midfielders and deep-lying players using xG contribution or xT delta, which only captures value at the extremes (shots or zone changes).
What the best do
Use EPV to measure the value of maintaining or creating options — the "option value" of holding the ball in a threatening position even without advancing it. A player who maintains 0.15 EPV for 8 seconds while teammates reposition is providing value that neither xG nor xT records.
Why it's an edge: The transfer market systematically underprices players whose primary contribution is creating and maintaining possession value without direct goal involvement. These are the players who make everyone else better but lack impressive xG or xA numbers.
How to exploit: Compute per-player EPV maintenance (average EPV during possessions they're involved in) separately from EPV progression (delta per action). Players with high maintenance but low delta are system enablers — cheap to acquire, expensive to replace.
Javier Fernandez, FC Barcelona, StatsBomb Innovation in Football Conference, 2019-10-22. "Even backward passes can have positive EPV if they create better forward options."
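The maintenance-versus-progression split might be computed along these lines. As a simplifying assumption, "maintenance" is approximated by the average EPV level at the moment of each of the player's actions, and "progression" by the average EPV change per action; the EPV values themselves are made up.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical actions: (player, epv_before, epv_after)
actions = [
    ("enabler", 0.15, 0.15),
    ("enabler", 0.14, 0.15),
    ("runner", 0.03, 0.08),
    ("runner", 0.02, 0.06),
]

def epv_profile(rows):
    """Return {player: (maintenance, progression)}."""
    levels, deltas = defaultdict(list), defaultdict(list)
    for player, before, after in rows:
        levels[player].append(before)         # EPV level the player operates at
        deltas[player].append(after - before)  # EPV added by the action
    return {p: (mean(levels[p]), mean(deltas[p])) for p in levels}

profiles = epv_profile(actions)
enabler_maint, enabler_prog = profiles["enabler"]
runner_maint, runner_prog = profiles["runner"]
```

The "enabler" here has high maintenance and low delta, which is the system-enabler pattern the text describes.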
🔑 Hidden Causal Lever

The December Cliff Is the Most Predictable Market Inefficiency in Premier League Betting

Teams with thin squads entering European competition show a systematic performance decline in December-February that the betting market consistently underestimates. Newcastle 2023-24 entered the Champions League with essentially a 13-player squad and collapsed domestically from December. Villa 2024-25 showed the same pattern. The market adjusts slowly because early-season results look strong (before congestion bites).

What most people do
Price teams at their early-season form level through the entire season.
What the best do
Identify thin-squad European teams in August. Wait until December. Bet against them systematically in domestic matches after midweek European fixtures.
Bet The Process podcast, 2024-2025. Newcastle and Villa case studies.
🔑 Hidden Causal Lever

Travel Distance After European Away Matches Predicts Weekend Domestic Underperformance

Not all European fixtures are equal. A Tuesday home match in the Champions League has minimal impact on Saturday domestic performance. A Thursday away match in Eastern Europe in the Conference League — with longer travel, later time zone, and less recovery time — has a massive impact. The combination of Thursday kickoff + long travel + Sunday domestic fixture is the worst-case scenario.

What most people do
Apply a generic "European hangover" penalty to all post-European domestic matches equally.
What the best do
Weight the congestion penalty by: (a) day of European match (Thursday worse than Tuesday), (b) travel distance, (c) time zone difference, (d) days until next domestic fixture. Thursday away in Eastern Europe + Sunday 2pm domestic kickoff = maximum congestion penalty.
Bet The Process podcast, fixture analysis 2024-2025.
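The four weighting factors above can be combined into a single score as in the sketch below. Everything numeric is an assumption: the equal weights, the 3000 km and 3-hour caps, and the rest-day scaling are placeholders that would need fitting against historical results.

```python
DAY_TERM = {"Tue": 0.0, "Wed": 0.5, "Thu": 1.0}  # assumed: Thursday is worst

def congestion_penalty(european_day, away, travel_km, tz_shift_h, days_to_domestic):
    """Score in [0, 1]; higher means a heavier expected domestic hangover."""
    day = DAY_TERM[european_day]
    travel = min(travel_km / 3000.0, 1.0) if away else 0.0
    tz = min(abs(tz_shift_h) / 3.0, 1.0) if away else 0.0
    # 3 days or fewer until the domestic match counts as the maximum rest penalty.
    rest = max(0.0, min((5 - days_to_domestic) / 2.0, 1.0))
    return 0.25 * (day + travel + tz + rest)

# Thursday away with long travel, then a Sunday domestic fixture.
worst = congestion_penalty("Thu", True, 3000, 3, 3)
# Tuesday home match, then a Saturday domestic fixture.
mild = congestion_penalty("Tue", False, 0, 0, 4)
```

The score is only a ranking device; converting it into a price adjustment is a separate modelling step.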
🔑 Hidden Causal Lever

Build Game Models Bottom-Up from Video Clips, Not Top-Down from Interviews

Coaches can't articulate their game model verbally but can instantly identify what they want from video. Asking "describe your model" produces platitudes. Showing clips and asking "is this what you want?" produces precise descriptions.

What most people do
Interview the coach, build analysis around the stated model. When the coach rejects it, blame "coach resistance."
What the best do
Collect 20-30 clips of desired and undesired behaviors. Build the model from the coach's reactions to clips, not their verbal descriptions.
Why it's an edge: The gap between the stated model and actual model is where analytics departments waste 80% of effort.
How to exploit: Before building any tactical analysis for a new coach, spend two weeks on clip-based model extraction. This front-loaded investment saves months of rejected work.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18
🔑 Hidden Causal Lever

Goal Kicks Are the Cleanest Window Into How a Team Actually Wants to Press

Open-play pressing analysis is contaminated by chaotic, constantly changing game states. Goal kicks provide a standardized starting state: both teams have time to set their shape, the ball is in a known location, and pressing decisions are deliberate rather than reactive. This controlled environment makes pressing style differences most visible. Additionally, the 2019 rule change (the ball is now in play as soon as the goal kick is taken, so teammates can receive it inside the penalty area) fundamentally changed pressing dynamics: pressing teams like Liverpool can now contest short goal kicks inside the box.

What most people do
Analyze pressing from all possessions indiscriminately, mixing chaotic open-play situations with standardized restart situations.
What the best do
Use goal kicks as the primary analysis context for pressing style identification. Over a season, this provides hundreds of comparable observations with a consistent starting state. Reserve open-play analysis for specific game-state questions.
Why it's an edge: Goal kick pressing data is higher signal-to-noise than open play pressing data because both teams have chosen their positions deliberately. The patterns you find from goal kick analysis are the team's INTENDED press, not their reactive one. This is the difference between analyzing strategy and analyzing improvisation.
How to exploit: Filter all pressing analysis to goal kick sequences first. Build per-team pressing profiles from goal kick data (pressure initiation locations + post-pressure pass destinations). Use this as the "ground truth" pressing style, then check whether open-play pressing is consistent or varies. Aggregate over 10+ matches before drawing conclusions.
Nicole Kuzlova, StatsBomb Conference, 2021-11-04. Goal kick standardization approach for pressing analysis.
🔑 Hidden Causal Lever

Shot Concession Profile Is R-squared 0.7 Season-Over-Season — You Can Predict What Shots Your GK Will Face

A team's shot concession profile (% of xG from 1v1s, headers, long-range, etc.) is highly repeatable season after season (R-squared = 0.7 in the Premier League). Liverpool consistently conceded ~42% of xG via 1v1 across multiple seasons. Burnley consistently had the highest long-range shot percentage. This means the TEAM's defensive style — not opponent randomness — determines what shots the GK faces. You can recruit a GK optimized for your specific concession profile and know the profile will persist.

What most people do
Recruit GKs on aggregate save percentage or GSAA without considering which shot types they'll actually face.
What the best do
Compute the team's shot concession profile over 2+ seasons. Decompose GK candidates' save quality by shot type. Match the GK whose strengths align with the shots your system produces.
Why it's an edge: A GK elite at saving 1v1s but poor on long-range shots is perfect for Liverpool's system (42% 1v1 xG) but wrong for Burnley's system (highest long-range %). Because the concession profile is highly repeatable (R-squared = 0.7), the match can be planned in advance rather than left to chance, although some season-to-season variation remains.
How to exploit: Build a "GK-system fit score" = correlation between team shot concession profile and GK shot-type save quality. Rank GK targets by fit score, not aggregate GSAA.
Max Odenheim & John Harrison, LAFC, StatsBomb Conference, 2021-11-04. R-squared = 0.7 demonstrated for Premier League shot concession profiles.
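The "GK-system fit score" as a correlation can be sketched like this. The four shot-type bins, the team shares, and the goalkeeper save-quality numbers are invented; with so few bins a correlation is noisy, so treat the output as a ranking aid rather than a precise measure.

```python
from math import sqrt

SHOT_TYPES = ["one_v_one", "header", "long_range", "other"]

def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def fit_score(team_profile, gk_quality):
    return pearson([team_profile[t] for t in SHOT_TYPES],
                   [gk_quality[t] for t in SHOT_TYPES])

# Share of xG conceded by shot type (hypothetical, 1v1-heavy team).
team = {"one_v_one": 0.42, "header": 0.18, "long_range": 0.15, "other": 0.25}
# Goals saved above average per unit of xG faced, by shot type (hypothetical).
gk_a = {"one_v_one": 0.10, "header": 0.00, "long_range": -0.05, "other": 0.02}
gk_b = {"one_v_one": -0.04, "header": 0.02, "long_range": 0.12, "other": 0.01}

fit_a = fit_score(team, gk_a)
fit_b = fit_score(team, gk_b)
```

The 1v1 specialist scores a positive fit for this team and the long-range specialist a negative one, regardless of how their aggregate numbers compare.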
🔑 Hidden Causal Lever

Goalkeepers' Positioning Habits Are Correctable Within Weeks — But Most Clubs Take Months to Identify Them

GK positioning biases (near-post hugging, standing too deep, consistent lateral offset) are detectable within 10-15 matches of tracking data but typically take coaching staff 2+ seasons to identify through video alone. The positioning deviation is sub-meter — invisible to the naked eye in real-time but clearly visible in aggregate tracking data plots. Once identified and shown to the GK with data, correction is fast (4-8 weeks of targeted training) because it's a positioning habit, not a physical limitation.

What most people do
Rely on GK coaches' subjective assessment of positioning, which takes hundreds of observed shots to form a reliable opinion.
What the best do
Compute positioning deviation vectors from tracking data after 10-15 matches. Present the GK with their deviation density plot and clip-linked outliers. Target the 2-3 worst-case deviations first. Reassess after 4-8 weeks.
Why it's an edge: The speed difference between data-identified and subjectively-identified positioning correction is the edge. Clubs using tracking data fix positioning biases 6-12 months earlier than clubs relying on video review alone. Over a season, this is worth several goals prevented.
How to exploit: Implement GK positioning tracking from day 1 of any new GK signing. Run the deviation analysis at the 10-match mark. Present to GK coach with clip links. This front-loaded investment pays off immediately.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18. Positioning deviation density plots for GKs.
🔑 Hidden Causal Lever

GSAA Conflates Two Independent Skills That Require Opposite Training

Standard GSAA (Goals Saved Above Average) lumps positioning quality and shot-stopping reflexes into one number. A goalkeeper with elite reflexes and poor positioning can produce the same GSAA as one with elite positioning and average reflexes — but the training prescriptions are opposite, the sustainability is different (reflexes decline with age, positioning improves), and the team-building implications are different.

What most people do
Evaluate goalkeepers on aggregate GSAA and assume high GSAA means "good goalkeeper" without decomposing the source.
What the best do
Train two post-shot xG models — one with GK position, one without — and use the difference to isolate positioning value from shot-stopping value. Then match the GK's strength profile to the team's needs.
Why it's an edge: A GK whose value comes from positioning is more sustainable (positioning improves with coaching and age) than one whose value comes from reflexes (which decline after ~30). Recruiting a 28-year-old reflex-dependent GK is buying a depreciating asset. Recruiting a 28-year-old positioning-elite GK is buying an appreciating one.
How to exploit: Decompose every GK target's GSAA into positioning and shot-stopping components. Favor positioning-dominant GKs for long-term contracts. Use reflex-dominant GKs as short-term solutions only.
Dr. Dinesh Vatvani, StatsBomb Conference, 2022-10-04. Two-model approach enables counterfactual "what if the keeper stood here" analysis.
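Given per-shot outputs from the two post-shot xG models, the decomposition could be computed as sketched here. The sign convention is an assumption: positioning value is the xG the keeper removes by where they stand (model without GK position minus model with it), and shot-stopping value is the remaining xG minus goals conceded. The shot values are invented.

```python
# Hypothetical shots: (xg_without_gk_position, xg_with_gk_position, goal)
shots = [
    (0.30, 0.20, 0),
    (0.10, 0.12, 0),  # here the keeper's position made the shot slightly easier
    (0.50, 0.35, 1),
    (0.05, 0.05, 0),
]

def decompose_gsaa(rows):
    """Split goals saved above average into positioning and shot-stopping parts."""
    positioning = sum(no_pos - with_pos for no_pos, with_pos, _ in rows)
    shot_stopping = sum(with_pos - goal for _, with_pos, goal in rows)
    return positioning, shot_stopping, positioning + shot_stopping

positioning, shot_stopping, total = decompose_gsaa(shots)
```

The two parts sum to the position-agnostic total (xG without GK position minus goals), so nothing is double counted. This keeper is roughly average overall but gets there through good positioning and below-average shot-stopping.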
🔑 Hidden Causal Lever

Overall GSAA Is Misleading — Team Shot Concession Profile Is the Recruitment Key

A team's shot concession profile — what percentage of xG comes from each shot type (1v1, headers, long-range, etc.) — is highly repeatable season-over-season (R-squared = 0.7). This means the defensive style produces a structural distribution of shot types that persists regardless of opponent. Liverpool consistently concedes ~42% via 1v1s; Burnley leads in long-range shot percentage. A goalkeeper's value is therefore determined by their performance in the specific bins the team needs, not their overall GSAA.

What most people do
Recruit GKs by overall GSAA ranking, treating all shot types as fungible.
What the best do
Decompose GSAA into 7 shot-type bins, profile the team's concession distribution, and match GK strengths to team needs. Alisson at Liverpool is the archetype: #1 in 1v1 GSAA matched to a team that concedes 42% from 1v1s.
Why it's an edge: Non-obvious GK targets become available. A GK ranked 15th overall in GSAA but 2nd in the specific bin your team needs is dramatically underpriced because the market uses aggregate rankings.
How to exploit: Profile your team's shot concession by type over 2-3 seasons. For each GK candidate, compute per-bin GSAA. Rank by fit-weighted GSAA, not overall. Also use for training: focus drill time on the bin that dominates your concession profile.
Max Odenheim & John Harrison, LAFC, StatsBomb Conference, 2021-11-04. R-squared = 0.7 season-over-season. Matty Ryan and Dubravka identified as non-obvious Alisson alternatives for 1v1-heavy teams.
🔑 Hidden Causal Lever

A GK's GSAA Can Swing 2+ Goals Per Season Just By Changing the Team's Defensive System

GSAA (Goals Saved Above Average) is confounded by the team's shot concession profile. A GK who is elite at saving 1v1s but average at everything else will show different GSAA depending on whether their team concedes 20% or 50% of xG from 1v1s. By decomposing GSAA into shot-type components, you can predict how a GK's GSAA would change under a different defensive system — and the swing can be 2+ goals per season, which is often the difference between relegation and safety.

What most people do
Treat GSAA as a stable individual metric, independent of the team's defensive style.
What the best do
Decompose GSAA by shot type. Compute counterfactual GSAA: "If this GK faced Team B's shot concession profile instead of Team A's, their GSAA would change by X." Use this for recruitment: a GK who looks average in one system may be elite in yours.
Why it's an edge: A GK transfer from a team with a different defensive system will show a GSAA change that has nothing to do with their ability — only the system changed. Clubs that don't account for this make systematic evaluation errors.
How to exploit: Before signing a GK, recompute their GSAA under your team's shot concession profile. If the counterfactual GSAA is significantly better (or worse) than their current GSAA, the system difference — not the GK — explains the gap.
Max Odenheim & John Harrison, LAFC, StatsBomb Conference, 2021-11-04. Shot-type GSAA decomposition with R-squared = 0.7 profile repeatability.
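The counterfactual recomputation can be sketched as a reweighting of per-shot-type save quality by a different team's concession profile. The save-quality rates, the two team profiles, and the 45 xG faced per season are illustrative numbers chosen only to show the mechanics.

```python
def counterfactual_gsaa(gk_rate_by_type, team_xg_share_by_type, season_xg_faced):
    """Expected GSAA if the GK faced a given team's shot mix.
    gk_rate_by_type: goals saved above average per 1.0 xG faced, per shot type."""
    return season_xg_faced * sum(
        share * gk_rate_by_type[shot_type]
        for shot_type, share in team_xg_share_by_type.items()
    )

# Hypothetical GK: elite on 1v1s, exactly average elsewhere.
gk = {"one_v_one": 0.15, "header": 0.0, "long_range": 0.0, "other": 0.0}
team_a = {"one_v_one": 0.20, "header": 0.25, "long_range": 0.25, "other": 0.30}
team_b = {"one_v_one": 0.50, "header": 0.15, "long_range": 0.15, "other": 0.20}

gsaa_a = counterfactual_gsaa(gk, team_a, season_xg_faced=45.0)
gsaa_b = counterfactual_gsaa(gk, team_b, season_xg_faced=45.0)
swing = gsaa_b - gsaa_a
```

The keeper's ability is identical in both calls; only the shot mix changes, and the aggregate number moves with it.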
🔑 Hidden Causal Lever

Failed Passes With Known Intent Are More Analytically Valuable Than Completed Safe Passes

When the intended recipient of a failed pass is known (from the ball receipt event on incomplete passes), the pass reveals the player's decision quality even though execution failed. A progressive pass into the correct pocket that was slightly overhit tells you the player SAW the opportunity — execution is more coachable than vision. Filtering to only completed passes for decision analysis discards the most diagnostic data: the ambitious attempts that didn't quite work.

What most people do
Filter to completed passes for analysis, discarding failed passes as "errors."
What the best do
Use intent-tracked failed passes to evaluate decision quality separately from execution quality. A player who attempts 10 progressive passes and completes 6 may have better vision than a player who attempts 6 and completes 6 — they see more opportunities.
Why it's an edge: Decision quality and execution quality are separate skills with different development trajectories. A young player with elite decision quality (attempts the right passes) but poor execution (only completes 60% of them) will improve as execution develops. A player with poor decision quality but good execution will not.
How to exploit: For development players, compute "intent quality" (xT or EPV of the intended pass, regardless of completion). Prioritize developing players whose intent quality is high — execution improves with practice, vision doesn't.
Thom Lawrence, StatsBomb Data Launch, 2018-05-23. Receipt events on incomplete passes for intent tracking.
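Separating intent quality from execution quality could look like the following. It assumes each pass, completed or not, already has a value for the intended pass (xT here), with the intended target for failed passes taken from the receipt event; the records are hypothetical.

```python
from collections import defaultdict

# Hypothetical passes: (player, xt_of_intended_pass, completed)
passes = [
    ("prospect", 0.06, True), ("prospect", 0.08, False),
    ("prospect", 0.07, True), ("prospect", 0.09, False),
    ("safe", 0.01, True), ("safe", 0.02, True),
]

def intent_and_execution(rows):
    """Return {player: (mean intended xT over all attempts, completion rate)}."""
    grouped = defaultdict(list)
    for player, xt, completed in rows:
        grouped[player].append((xt, completed))
    out = {}
    for player, items in grouped.items():
        intent = sum(xt for xt, _ in items) / len(items)
        execution = sum(1 for _, done in items if done) / len(items)
        out[player] = (intent, execution)
    return out

profiles = intent_and_execution(passes)
```

Filtering to completed passes would hide half of the "prospect" attempts, including the two most ambitious ones.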
🔑 Hidden Causal Lever

Manager Impact Is Visible in xG Within 5 Matches — Not 15

When a manager changes, the team's xG creation and concession profiles shift measurably within 3-5 matches, not the 15-20 match "settling in" period that conventional wisdom assumes. The reason: the new manager immediately changes pressing triggers, defensive line height, and buildup routing — all of which show up in spatial xG patterns well before results stabilize. The results lag because variance is high in small samples, but the PROCESS shift is immediate.

What most people do
Wait 10-15 matches before evaluating a new manager's impact, using results as the signal.
What the best do
Track xG creation and concession by shot zone starting from match 1. The spatial pattern of chances (not the total xG) reveals the tactical shift immediately. Compare the shot-zone distributions pre and post change rather than total xG.
Why it's an edge: Early detection of whether a new manager's tactical changes are working (in process terms) gives the sporting director a 10-match head start on personnel decisions. If the xG spatial pattern hasn't shifted by match 5, the new manager is not implementing meaningful tactical changes — regardless of results.
How to exploit: Build an automated manager-change xG spatial comparison. Trigger it at match 3 and match 5 of any new appointment. Flag to the sporting director whether the process has changed.
Ted Knutson & Siqur Arshad, WFS 2019. Real Madrid xG trend lines showing xG shifts correlated with manager changes.
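A manager-change comparison of shot-zone distributions can be sketched with a simple distance between the two distributions. Total variation distance is used here as one reasonable choice, not necessarily what the speakers used; the zone names, xG totals, and flag threshold are assumptions.

```python
def shares(xg_by_zone):
    """Convert xG totals per zone into shares that sum to 1."""
    total = sum(xg_by_zone.values())
    return {zone: xg / total for zone, xg in xg_by_zone.items()}

def distribution_shift(pre, post):
    """Total variation distance between two shot-zone xG distributions (0 to 1)."""
    p, q = shares(pre), shares(post)
    zones = set(p) | set(q)
    return 0.5 * sum(abs(p.get(z, 0.0) - q.get(z, 0.0)) for z in zones)

# Hypothetical xG created by zone: last matches under the old manager,
# then the first five under the new one.
pre = {"central_box": 4.0, "wide_box": 2.0, "outside_box": 4.0}
post = {"central_box": 3.0, "wide_box": 1.0, "outside_box": 1.0}

shift = distribution_shift(pre, post)
FLAG_THRESHOLD = 0.10  # assumed cut-off for "the process has changed"
process_changed = shift >= FLAG_THRESHOLD
```

Because shares are compared rather than totals, the measure responds to where chances come from, not how many there are. With only 3-5 matches the shares are noisy, so the threshold should be calibrated against typical match-to-match variation.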
🔑 Hidden Causal Lever

Off-Ball Value Shows Up in Teammates' Metrics, Not the Contributor's Own

A player whose primary contribution is off-ball (drawing defenders, occupying space) creates value that appears in teammates' on-ball metrics. When that player is removed from the lineup, teammates' ΔEPV, completion, and xG creation all decline. The attribution is systematically misplaced.

What most people do
Evaluate players on their own touch-based metrics. Low-touch players are considered dispensable.
What the best do
Compute with/without splits for teammates' metrics when a specific player is present vs. absent. Build off-ball contribution scores.
Why it's an edge: Off-ball contributors are the most underpriced players in football. Their market value is set by touch-based metrics that structurally can't see their contribution.
How to exploit: Before selling a "low stats" player, run with/without analysis on teammates' metrics. Before buying, identify candidates whose teams' metrics drop disproportionately when absent.
Javier Fernandez, FC Barcelona, 2019-10-22
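A with/without split is straightforward to compute; the sketch below uses invented match-level values of teammates' xG creation. It deliberately ignores confounders such as opponent strength and who replaced the player, which a real analysis would need to control for.

```python
from statistics import mean

# Hypothetical match rows: (player_in_lineup, teammates_xg_created)
matches = [(True, 1.8), (True, 2.2), (False, 1.2), (False, 1.4)]

def with_without_split(rows):
    """Mean of the teammates' metric with the player present vs. absent."""
    with_player = [v for present, v in rows if present]
    without_player = [v for present, v in rows if not present]
    return mean(with_player), mean(without_player)

avg_with, avg_without = with_without_split(matches)
off_ball_lift = avg_with - avg_without
```

The same function can be run once per teammate metric (ΔEPV, completion, xG creation) to build a per-player off-ball contribution score.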
🔑 Hidden Causal Lever

80% of Football Is Off-Ball But 95% of Metrics Measure On-Ball Actions

A typical outfield player has the ball for 60-90 seconds per match out of 90+ minutes. Their off-ball movement — creating space, drawing defenders, maintaining positional structure — constitutes the vast majority of their contribution but is almost entirely invisible to event-data metrics. 360 data and tracking data enable measuring off-ball advantage (how much space a player creates for teammates by their movement and positioning), but most analysis still defaults to on-ball actions because the data is easier to work with.

What most people do
Evaluate players using on-ball metrics (passes, shots, tackles, interceptions), which cover only the small share of match time, a few percent at most, in which they actually have the ball.
What the best do
Use 360/tracking data to compute off-ball space creation: how often does a player's movement create an open passing lane for a teammate? How much does their positioning reduce the pressure on teammates' receipts? How effectively do their runs pull defenders out of position?
Why it's an edge: The market prices players on on-ball production because that's what's measurable. Players whose primary value is off-ball (elite movement, intelligent positioning, defensive organization) are systematically underpriced because their contribution doesn't show up in standard metrics.
How to exploit: Build off-ball contribution metrics from 360 data. Identify players with high off-ball space creation but low on-ball metrics — they're being undervalued. In recruitment, weight off-ball metrics alongside on-ball metrics for positions where movement is the primary contribution (strikers, pressing forwards).
Javier Fernandez, FC Barcelona, 2019-10-22. Off-ball positional advantage as a key EPV component.
🔑 Hidden Causal Lever

Spending More Than 20 Seconds in the Offensive Zone Against a Set Defense Has Diminishing Returns — The Defense Outpaces You

After approximately 20 seconds of continuous possession in the offensive zone, goal-scoring probability plateaus because the defense has had time to fully organize. The first 5-10 seconds of zone occupation are the highest-value window — defenders are still adjusting their shape, gaps exist, and the pressing response hasn't fully formed. Attacking teams that fail to create a chance within the first 15-20 seconds of zone entry should consider resetting the possession rather than continuing to probe a fully set defense.

What most people do
Maintain possession in the offensive zone as long as possible, believing that sustained pressure eventually creates chances.
What the best do
Track "time in zone before chance creation." If no chance emerges within 15-20 seconds, reset the possession (backward pass to midfield) to create a new entry and reset the defensive structure. Sustained probing against a fully set defense has lower expected value than a reset-and-re-enter approach.
Why it's an edge: The intuition that "more time in the zone = more chances" is wrong once the defense is set. Resetting feels like giving up pressure, but it actually creates higher-value re-entry opportunities because the defense must re-expand from their compact shape.
How to exploit: Compute zone dwell time before chance creation. If your team's average exceeds 20 seconds with low xG production, coach the reset trigger at 15-18 seconds. Compare xG per offensive zone possession for resets vs. extended probes.
Perdomo & Zarrella, 23 Sports, StatsBomb Conference, 2019-10-28. 20-second duration threshold for set defense detection derived from goal-scoring rate plateau.
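The reset-versus-extended-probe comparison could be set up as below. The possession records are hypothetical, and for reset possessions the xG is assumed to be that of the re-entry that follows the reset; how possessions are segmented and labelled is not shown.

```python
from statistics import mean

DWELL_THRESHOLD_S = 20.0  # the approximate plateau point given in the text

# Hypothetical offensive-zone possessions: (dwell_seconds, was_reset, xg)
possessions = [
    (8.0, False, 0.12), (12.0, False, 0.10),
    (25.0, False, 0.03), (30.0, False, 0.05),
    (18.0, True, 0.09), (16.0, True, 0.07),
]

def xg_by_approach(rows, threshold=DWELL_THRESHOLD_S):
    """Mean xG for reset possessions vs. un-reset probes longer than threshold."""
    resets = [xg for dwell, reset, xg in rows if reset]
    probes = [xg for dwell, reset, xg in rows if not reset and dwell > threshold]
    return mean(resets), mean(probes)

xg_reset, xg_probe = xg_by_approach(possessions)
```

If the reset figure is not clearly higher on real data, the reset trigger is not worth coaching for that team.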
🔑 Hidden Causal Lever

OBV Sub-Metric Decomposition Reveals Defensive Midfielders' Hidden Value

DMs often show low total OBV because their contribution is reducing conceding probability (defensive OBV), not increasing scoring probability. A DM with -0.05 attacking OBV but +0.15 defensive OBV is net positive but looks negative on attacking sub-metrics alone.

What most people do
Report total OBV or attacking sub-metrics. Evaluate all midfielders on attacking-biased scales.
What the best do
Build position-specific OBV dashboards weighting sub-metrics by positional role. For DMs: defensive OBV is primary.
Why it's an edge: DMs are systematically undervalued because popular models emphasize attacking contribution. Correct defensive OBV valuation creates transfer arbitrage.
How to exploit: Build a "defensive value" leaderboard using defensive OBV. Cross-reference with market value. The biggest gaps are the arbitrage opportunities.
Hudl StatsBomb, 2025-01-27
🔑 Hidden Causal Lever

The Same Team Looks Like Two Different Teams Against High vs. Low Blocks

Context filtering by opponent block type reveals that a single team's statistical profile bifurcates dramatically. A team averaging 60% possession may have 72% against low blocks and 48% against high presses — and the tactical problems in each context are fundamentally different. Analyzing aggregate possession quality without conditioning on block type produces conclusions that apply to neither situation specifically.

What most people do
Analyze team statistics across all matches, producing averages that represent no specific context.
What the best do
Filter every analysis by opponent block type before drawing conclusions. Accept that small samples per context combination are expected and correct. A team's "true" attacking profile is 3-4 separate profiles, not one average.
Why it's an edge: Scouting reports and pre-match preparation that use aggregate statistics prepare for a team that doesn't exist. The team you'll face Saturday depends on how YOU set up — your block type determines which version of them appears.
How to exploit: Build opponent profiles conditioned on defensive context. Before each match, select the profile that matches your planned defensive setup. If you press high, you face their counter-attack profile, not their organized possession profile.
Javier Fernandez, FC Barcelona, 2019-10-22; Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18. Context filtering as foundational analytical principle.
🔑 Hidden Causal Lever

Second-Half Decision Profiles Shift Toward Risk-Seeking — This Is Exploitable

Across a full season of Barcelona data, the risk decision parameter decreases and the gain decision parameter increases in the second half. Players systematically take more chances as the game progresses, even when controlling for scoreline. This temporal shift is measurable and exploitable.

What most people do
Analyze decisions as static across the match. Use full-match aggregates.
What the best do
Split decision analysis by half. Identify which players shift most toward risk-seeking. Time defensive intensity accordingly.
Why it's an edge: Pressing harder when opponents are taking more chances (and making more errors) is more efficient than constant pressing.
How to exploit: Build per-player temporal decision profiles. Time your pressing intensity to coincide with opponents' decision quality decline. Monitor your own team's second-half risk escalation.
Complex Systems Group, Madrid, StatsBomb Conference, 2021-11-04
🔑 Hidden Causal Lever

P3% Separates Vision from Team-Created Opportunity

player-evaluation · p3-percentage-kpi

A player leading in raw penetrative passes may be a poor converter. They get 100 opportunities (team-created) and convert 30 (30%). A player with 50 opportunities converting 25 (50%) has better vision. Raw counts confuse team-created opportunity with individual skill.

What most people do
Rank by raw penetrative pass count. Pay a premium for high-volume passers from dominant teams.
What the best do
Use P3% to separate opportunity from conversion. Identify high-P3% players on weaker teams as undervalued targets.
Why it's an edge: A player moving from Barcelona (100 P3 opportunities) to a mid-table team (40 opportunities) will see raw count collapse — but P3% predicts performance with fewer chances.
How to exploit: When scouting from dominant teams, use P3% not raw count. The market prices volume; P3% corrects for opportunity.
Hadi Sotude, StatsBomb Conference, 2021-11-04
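The arithmetic in this card can be sketched as follows. The function name `p3_pct` and the player labels are illustrative; the opportunity counts are the ones used in the card, and in practice they would come from a P3 opportunity model.

```python
def p3_pct(completed, opportunities):
    """Share of model-flagged penetrative opportunities the player converted."""
    if opportunities <= 0:
        raise ValueError("opportunities must be positive")
    return completed / opportunities

# (completed penetrative passes, opportunities) from the example in the card
players = {"high_volume": (30, 100), "low_volume": (25, 50)}
ranked = sorted(players, key=lambda name: p3_pct(*players[name]), reverse=True)

# Naive projection: the high-volume player's raw count at 40 opportunities,
# assuming the conversion rate carries over unchanged
projected_count = p3_pct(30, 100) * 40
```

Ranking by `p3_pct` puts the low-volume player first, and the projection shows the raw count falling from 30 to about 12 with no change in underlying skill.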
🔑 Hidden Causal Lever

Penetrative Pass Percentage Is a Leading Indicator of Creative Decline Before xA Shows It

player-evaluation · p3-percentage-kpi

P3% (ratio of actual penetrative passes to P3-model-predicted opportunities) declines 4-6 weeks before expected assists (xA) shows a drop, because the player stops seeing or attempting the penetrative pass before their overall assist output reflects it. Fatigue, confidence loss, or tactical adjustment first manifests as reduced penetrative ATTEMPTS before it manifests as reduced assists. P3% is a leading indicator of creative performance change.

What most people do
Monitor xA or actual assists as the primary creative output metric, reacting to declines after they've persisted for several matches.
What the best do
Monitor P3% on a rolling 5-match window. A downward trend signals creative fatigue or tactical adjustment before assists decline. This provides a 4-6 week early warning for rotation decisions or tactical intervention.
Why it's an edge: By the time assists decline, the creative drought has already cost goals. Detecting the decline at the P3% stage allows intervention (rest, rotation, tactical adjustment) before the team-level output is affected.
How to exploit: Build a rolling P3% dashboard for key creative players. Set alert thresholds for decline. When P3% drops 10%+ from baseline over 5 matches, flag to coaching staff for rotation or intervention.
Derived from Hadi Sotude, StatsBomb Conference, 2021-11-04. P3 model as the foundation for measuring penetrative pass opportunity exploitation.
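One way the rolling alert in this card could be wired up, as a sketch. It assumes the "10%+" drop is relative to baseline (not percentage points) and that per-match P3% values are averaged unweighted; both are interpretations, and the sample series is invented.

```python
from collections import deque

def p3_decline_alerts(match_p3, baseline, window=5, drop=0.10):
    """Indices of matches where the rolling mean is at least `drop` (relative) below baseline."""
    recent = deque(maxlen=window)
    alerts = []
    for i, value in enumerate(match_p3):
        recent.append(value)
        if len(recent) == window:
            rolling = sum(recent) / window
            if rolling <= baseline * (1 - drop):
                alerts.append(i)
    return alerts

# Hypothetical per-match P3% values for one player, baseline 0.50
series = [0.52, 0.50, 0.49, 0.48, 0.47, 0.44, 0.42, 0.40, 0.38]
alerts = p3_decline_alerts(series, baseline=0.50)
```

Weighting each match by its number of opportunities would be a reasonable refinement when opportunity counts vary widely between matches.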
🔑 Hidden Causal Lever

Pressure Barely Changes Completion (<1%) — It Changes What Players Attempt

Raw pass completion drops less than 1% under pressure because players self-select shorter, safer passes. Pressure doesn't reduce execution quality — it changes attempt type. The real signal is the shift in what players attempt, not whether they complete it.

What most people do
Compare completion under pressure vs. not and conclude "pressure doesn't matter much."
What the best do
Analyze the shift in attempt type (distance, direction, zone) separately from completion. Build difficulty-adjusted models where pressure interacts with pass type.
Why it's an edge: Selection effects are invisible in aggregate stats. A player who "maintains completion under pressure" may be going conservative, while another maintaining completion on through-balls is genuinely elite.
How to exploit: Add interaction terms (pressure × distance, pressure × forward direction) to passing models. Scout for players whose attempt PROFILE doesn't change under pressure.
Thom Lawrence, StatsBomb Data Launch, 2018-05-23
🔑 Hidden Causal Lever

Pass Execution Quality Is Separate From Pass Decision Quality — And They Require Different Interventions

player-evaluation · pass-execution-rating

A player's pass execution quality (did the ball go where they intended, at the right speed and weight?) is a separate skill from pass decision quality (was that the right pass to attempt?). A player who makes perfect decisions but executes at 70% accuracy needs technical coaching. A player who executes at 95% accuracy but makes poor decisions needs tactical coaching. Conflating these into "pass completion rate" makes both problems invisible.

What most people do
Use pass completion rate as a single metric, which blends decision quality and execution quality into one number.
What the best do
Decompose passing into decision quality (was the intended pass the optimal choice given available options?) and execution quality (given the intended pass, how well was it executed?). Use xPass residual for execution and EPV optimality gap for decision.
Why it's an edge: The coaching intervention for poor decisions is completely different from the intervention for poor execution. Treating a decision problem as an execution problem (or vice versa) wastes coaching time and frustrates the player.
How to exploit: For each player, compute both metrics. Present to coaching staff as a 2x2: Good Decision/Good Execution (keep), Good Decision/Poor Execution (technical training), Poor Decision/Good Execution (tactical training), Poor Decision/Poor Execution (development priority or recruitment replacement).
Derived from the intersection of option-aware pass decision evaluation and xPass models. Javier Fernandez, FC Barcelona, 2019-10-22; Will Morgan, StatsBomb Conference, 2022-10-03.
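The 2x2 from this card can be sketched as a small classifier. It assumes both inputs are centered so that 0 is average: `execution_score` as a mean xPass residual and `decision_score` as the negated EPV optimality gap (higher is better). The cut-offs and score definitions are assumptions for illustration.

```python
def passing_quadrant(decision_score, execution_score, decision_cut=0.0, execution_cut=0.0):
    """Map a player's decision and execution scores to the coaching prescription."""
    good_decision = decision_score >= decision_cut
    good_execution = execution_score >= execution_cut
    if good_decision and good_execution:
        return "keep"
    if good_decision:
        return "technical training"
    if good_execution:
        return "tactical training"
    return "development priority or recruitment replacement"

# Hypothetical players: (decision_score, execution_score)
squad = {"P1": (0.02, 0.03), "P2": (0.02, -0.04), "P3": (-0.03, 0.05), "P4": (-0.03, -0.02)}
plan = {name: passing_quadrant(d, e) for name, (d, e) in squad.items()}
```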
🔑 Hidden Causal Lever

The "Empty Quadrant" Proves No Safe High-Gain Passes Exist

In risk × gain 2D space, the low-risk/high-gain quadrant is nearly empty. Defenses are organized specifically to ensure safe passes don't produce high gain. There is no free lunch: high gain requires high risk. Any model claiming "safe and dangerous" passes is miscalibrated.

What most people do
Search for "safe, high-value" patterns. Criticize players for not finding the "easy" dangerous pass.
What the best do
Accept the tradeoff as structural. Focus on execution quality at the risk frontier rather than trying to move it. Build the team around players who can complete at the frontier (Messi: highest risk AND highest completion).
Why it's an edge: Accepting the empty quadrant changes the framework from "find better passes" to "execute harder passes better."
How to exploit: Profile each player's execution in the high-risk/high-gain quadrant. Build the attacking plan around getting the ball to whoever can complete there.
Complex Systems Group, Madrid, StatsBomb Conference, 2021-11-04
🔑 Hidden Causal Lever

Back Injuries Are Career Threats That Models Underweight — They Affect Movement Quality Before Output

Back injuries (disc herniations, stress fractures, chronic lower back pain) are systematically more dangerous to a footballer's career than knee injuries because they affect core stability and explosive movement quality gradually rather than catastrophically. A player returning from a back injury may pass fitness tests and play 90 minutes, but their ability to sprint, change direction, and jump is permanently compromised. This shows up in movement data 6-12 months before it shows up in goals or assists.

What most people do
Treat back injuries as equivalent to other muscle injuries — a few weeks out, then full recovery.
What the best do
Flag back injuries in recruitment as a major red flag, especially for pace-dependent positions. When evaluating a player with a back injury history, demand movement tracking data (sprints, accelerations, deceleration profiles) rather than just match fitness clearance.
Bet The Process podcast, injury impact analysis, 2024-2025.
🔑 Hidden Causal Lever

The COVID Generation Has a Hidden Workload Time Bomb

Players who were 17-19 during COVID-affected seasons (2020-2022) accumulated senior minutes earlier than normal because squads were depleted and fixture schedules were compressed. These players are now 22-24 and showing elevated soft-tissue injury rates compared to historical cohorts at the same age. The mechanism is cumulative workload: their bodies weren't physically mature enough for the minutes they played, and the damage is emerging now. This cohort is a recruitment risk that historical aging models don't capture because the input conditions were unprecedented.

What most people do
Apply standard aging curves to COVID-generation players without adjusting for early workload.
What the best do
Track total senior minutes accumulated before age 20 as an injury risk predictor. Players with >5000 senior minutes before 20 from the COVID era should have a durability discount applied in recruitment valuations.
Bet The Process podcast, COVID generation workload analysis, 2024-2025.
🔑 Hidden Causal Lever

Robertson-Mane Was the Best Partnership at Liverpool — But Neither Was the Best Individual

Pair synergy scoring reveals partnerships that outperform the sum of their parts. Robertson-Mane at Liverpool had elite pair synergy despite neither player dominating individual metrics. Conversely, two individually elite players can have negative synergy if they occupy the same tactical space. The pair metric captures emergent value that individual metrics structurally cannot.

What most people do
Evaluate players individually and assume combining two top-rated individuals produces a top-rated pair.
What the best do
Compute pair co-appearance frequencies across outcome types (goals, shots, turnovers). Identify which pairs drive positive outcomes together that neither drives individually. When replacing an injured player, prioritize pair synergy with the remaining players, not individual quality.
Why it's an edge: The transfer market prices individual quality. A player whose value is primarily in pair synergies (like Fernandinho at Man City) looks replaceable based on individual metrics but is irreplaceable based on pair metrics. Clubs that understand pair synergy can predict performance drops from specific injuries and find the right replacement.
How to exploit: Before any transfer, compute pair synergy of the target with every likely co-starter. A player with 3+ top-ranked pair synergies is more valuable than one with higher individual metrics but weak pair connections. When a key player is injured, search for replacements that maximize pair synergy with the injured player's existing partners, not just individual similarity.
James, University of Southampton, StatsBomb Innovation in Football Conference, 2019-10-30. Robertson-Mane pair and Fernandinho centrality examples.
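The cited work's exact synergy formula is not given here, so the sketch below swaps in a simple co-appearance lift as a stand-in: how often two players appear together in positive-outcome sequences, relative to what their individual involvement rates would predict. Player labels and data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def pair_lift(sequences):
    """sequences: list of sets of players involved in a positive-outcome sequence.

    Returns {(a, b): lift}; lift > 1 means the pair co-appears more often than
    independent involvement would suggest.
    """
    n = len(sequences)
    single = Counter()
    pair = Counter()
    for players in sequences:
        single.update(players)
        pair.update(combinations(sorted(players), 2))
    return {
        (a, b): (count / n) / ((single[a] / n) * (single[b] / n))
        for (a, b), count in pair.items()
    }

# Hypothetical sequences that ended in shots
sequences = [{"A", "B"}, {"A", "B"}, {"A", "C"}, {"C", "D"}]
lift = pair_lift(sequences)
```

Lift is noisy for pairs with few shared sequences (the C-D pair above rests on a single one), so a minimum co-appearance count is worth adding before ranking.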
🔑 Hidden Causal Lever

The Same Data With Different Reference Frames Recommends Opposite Actions

Two perfectly valid visualizations of identical data can lead to opposite conclusions depending on the reference frame. A player who shoots above average from mid-range RELATIVE TO THAT LOCATION looks good on a player-vs-location chart (useful for scouting where they're dangerous). But the same shots are below average RELATIVE TO LEAGUE-AVERAGE EFFICIENCY because mid-range shots are inherently less efficient than other shot types (useful for deciding whether to allow those shots).

What most people do
Choose one reference frame and present it as the truth, unaware that a different baseline would tell a different story.
What the best do
Always ask "What question am I answering?" before choosing the reference frame. Explicitly state the baseline in every visualization. When the stakes are high, show both reference frames and explain why they disagree.
Why it's an edge: Analytical disagreements that seem data-driven are often reference-frame disagreements. A scouting department and a coaching staff can look at the same player's shot chart and reach opposite conclusions — both correctly — because they're answering different questions. Making the reference frame explicit eliminates this source of organizational friction.
How to exploit: For every visualization you build, document the reference frame. When presenting to stakeholders, show the alternative frame briefly: "Against location average, he's elite from here. Against league-average efficiency, this zone is still low-value. Both are true — which question are we answering?"
Seth Partnow, StatsBomb Innovation in Football Conference, 2019-10-28. D'Angelo Russell shot chart example showing opposite recommendations from identical data.
🔑 Hidden Causal Lever

Player Tendencies Are More Exploitable Than Player Weaknesses

A weakness is something a player does poorly. A tendency is something they do predictably. Tendencies are MORE exploitable than weaknesses because you know WHAT will happen, even if the execution is competent. Alexander-Arnold's post-pressure ball path consistently goes infield — this isn't a weakness (the passes are often accurate) but a tendency that can be trapped. A player who always turns left under pressure can be funneled into a defensive trap even if they execute the turn well.

What most people do
Focus opposition preparation on weaknesses (poor under pressure, slow in transition, bad in the air). These are important but often overweighted.
What the best do
Map tendencies separately from weaknesses. A tendency is high-probability directional behavior that is correct for the player but exploitable if the opponent knows about it. Build "tendency traps" — defensive setups that exploit the opponent's most predictable behavior, even when that behavior is executed well.
Why it's an edge: Weaknesses require the opponent to make an error. Tendencies only require the opponent to be predictable. You can trap a tendency-based behavior without the opponent making any mistake — they simply do what they always do, into the space you've prepared.
How to exploit: For each opponent player, compute directional consistency under pressure. High-consistency players are tendency targets. Design defensive positioning that covers their most predictable path, then press them to trigger the tendency.
Thom Lawrence, StatsBomb Data Launch, 2018-05-23. Alexander-Arnold infield tendency as the canonical example.
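"Directional consistency under pressure" is not defined in the card. One plausible measure, used here as an assumption, is the mean resultant length from circular statistics: 1 when every pressured action goes the same way, near 0 when directions are spread evenly. Angles are in radians relative to any fixed pitch axis.

```python
import math

def directional_consistency(angles_rad):
    """Mean resultant length in [0, 1] of a list of action directions."""
    if not angles_rad:
        raise ValueError("need at least one angle")
    c = sum(math.cos(a) for a in angles_rad) / len(angles_rad)
    s = sum(math.sin(a) for a in angles_rad) / len(angles_rad)
    return math.hypot(c, s)

# Hypothetical players: one always plays the same way, one is unpredictable
predictable = directional_consistency([0.5] * 10)
scattered = directional_consistency([0.0, math.pi / 2, math.pi, 3 * math.pi / 2])
```

Players with values close to 1 are the tendency targets; the mean direction (`math.atan2(s, c)`) would tell you which path to cover.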
🔑 Hidden Causal Lever

Unpressured Final-Third Action Rate Measures Intelligence, Not Laziness

Özil, Messi, De Bruyne, and Silva top unpressured attacking action rates. Özil's "walking" was him being in position before defenders arrived. High unpressured rate = elite spatial anticipation, not disengagement.

What most people do
Judge low-activity players as lazy. Use distance covered as an effort metric.
What the best do
Measure unpressured final-third action rate as a positional intelligence proxy.
Why it's an edge: "Lazy" players who find unpressured pockets demonstrate elite spatial anticipation. Selling them for "not running" replaces a space-finder with a runner who arrives late.
How to exploit: Add unpressured attacking action rate to scouting profiles for #10s and forwards. Cross-reference with team context.
Cross-domain parallel
Buffett's "I sit here all day and read" looks lazy. The edge is positioning before the opportunity.
Thom Lawrence, StatsBomb Data Launch, 2018-05-23
🔑 Hidden Causal Lever

Every RL Player Leaderboard Is a Zone Heatmap in Disguise

Value delta sums rank strikers highest because they play in high-reward zones, not because they're most valuable. A hypothetical identical player scores differently at striker vs. DM. Without opportunity normalization, every RL-based valuation is just zone access ranking.

What most people do
Sum value deltas per player and rank.
What the best do
Apply opportunity-normalized action value — compare output to the historical distribution in similar contexts.
Why it's an edge: Without normalization, you'll overpay for strikers and undervalue midfielders who create conditions for those zones.
How to exploit: Compute positional baseline expected value. Player value = actual minus baseline. Validate that top-10 includes multiple positions.
StatsBomb CTO, StatsBomb Conference, 2019-10-25
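A minimal sketch of "actual minus baseline", assuming the baseline is simply the mean value per 90 of all players at the same position in the sample. A production version would condition on richer context and exclude the player from their own baseline. All names and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def value_above_positional_baseline(players):
    """players: list of dicts with 'name', 'position', 'value_per90'.

    Returns (name, position, value minus positional mean), best first.
    """
    by_pos = defaultdict(list)
    for p in players:
        by_pos[p["position"]].append(p["value_per90"])
    baseline = {pos: mean(vals) for pos, vals in by_pos.items()}
    rows = [(p["name"], p["position"], p["value_per90"] - baseline[p["position"]])
            for p in players]
    return sorted(rows, key=lambda row: row[2], reverse=True)

players = [
    {"name": "ST1", "position": "ST", "value_per90": 0.60},
    {"name": "ST2", "position": "ST", "value_per90": 0.50},
    {"name": "DM1", "position": "DM", "value_per90": 0.20},
    {"name": "DM2", "position": "DM", "value_per90": 0.05},
]
ranked = value_above_positional_baseline(players)
positions_in_top2 = {pos for _, pos, _ in ranked[:2]}
```

On raw sums both strikers outrank both DMs; after normalization the top of the list mixes positions, which is the validation check the card recommends.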
🔑 Hidden Causal Lever

A Player's Location Heatmap and Their Value Map Are Almost Always Different

Where a player spends time (location heatmap) and where they create value (positional value map — position filtered by off-ball advantage moments) are almost always different. A midfielder may inhabit low-value central congestion 80% of the time but create all their value during brief forays into half-spaces. The location heatmap describes habit; the positional value map describes contribution. Coaching interventions should target the gap between the two.

What most people do
Use location heatmaps as the primary positional analysis tool, concluding that a player who occupies the right zones is well-positioned.
What the best do
Build both maps and compare. The gap between "where they are" and "where they create value" is the coaching point. A player with a great location heatmap but poor value map is in the right areas at the wrong times. A player with a poor heatmap but concentrated value map knows when to make runs into valuable spaces.
Why it's an edge: Location heatmaps are the most common visualization in football analytics and they're misleading for positional evaluation. "Heat" (frequency) doesn't equal "value" (contribution). Cold zones can have more value if the player arrives at the right moments. Coaching from heatmaps alone often produces players who occupy correct zones passively rather than timing their movements to create value.
How to exploit: For every player you're evaluating or developing, generate both maps. When the location map and value map diverge, the coaching prescription is timing, not positioning. "You're in the right place — but not at the right time."
Javier Fernandez, FC Barcelona, StatsBomb Innovation in Football Conference 2019, 2019-10-22. Explicit side-by-side comparison shown as the primary coaching insight.
🔑 Hidden Causal Lever

The Most Valuable Possession Actions Are Often Not the Final Pass or the Shot

Possession value decomposition reveals that the highest-EPV-delta action in a goal-scoring possession is often the 3rd or 4th action from the end, not the assist or the shot. A press-breaking carry that advances 30 yards often has a higher EPV delta than the final through ball, because it shifted the entire possession from low-value to high-value territory. Credit assignment that weights only the terminal actions (goal, assist, key pass) misses the player who actually created the scoring opportunity.

What most people do
Assign credit using goals, assists, and key passes — the terminal actions in the sequence.
What the best do
Decompose EPV across the entire possession chain. Identify the action with the highest EPV delta (the "value creation point") regardless of where it falls in the sequence. Credit the player who created the value, not just the player who finished it.
Why it's an edge: The transfer market prices goals and assists because they're visible. The player who consistently provides the 3rd-from-last action with the highest EPV delta is creating the goals but getting none of the credit. These players are systematically underpriced.
How to exploit: Compute per-player EPV delta rankings. Cross-reference with assists and key passes. Players who rank high on EPV delta but low on assists are the underpriced value creators — their contribution is in the setup, not the finish.
Javier Fernandez, FC Barcelona, 2019-10-22. Possession value decomposition showing highest-EPV actions often occur mid-sequence.
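As a sketch of the decomposition, assume each action in a possession already carries EPV before and after from some EPV model. The tuple layout and the sample values are hypothetical.

```python
def value_creation_point(chain):
    """chain: ordered list of (player, action, epv_before, epv_after).

    Returns the action with the largest EPV gain, wherever it sits in the sequence.
    """
    return max(chain, key=lambda a: a[3] - a[2])

def epv_delta_by_player(chains):
    """Sum EPV deltas per player across many possession chains."""
    totals = {}
    for chain in chains:
        for player, _action, before, after in chain:
            totals[player] = totals.get(player, 0.0) + (after - before)
    return totals

# Hypothetical goal-scoring possession
chain = [
    ("CB", "pass", 0.010, 0.015),
    ("CM", "carry", 0.015, 0.080),
    ("AM", "through ball", 0.080, 0.120),
    ("ST", "shot", 0.120, 0.150),
]
creator = value_creation_point(chain)
totals = epv_delta_by_player([chain])
```

Here the carry, three actions from the end, is the value creation point even though the through ball and shot would take the conventional credit.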
🔑 Hidden Causal Lever

The Action Before the Shot Determines xG More Than Shot Location

Two shots from the same location can have wildly different xG depending on the preceding action. A shot after a cutback from the byline (defender disorganized, GK out of position) has 2-3x the xG of a shot from the same location after receiving a sideways pass (defense set, GK positioned). The pre-shot action — cutback, through ball, dribble past last defender, set piece delivery — is a stronger predictor of goal probability than shot location alone, but most analysis focuses on where the shot was taken, not how the player got there.

What most people do
Analyze shot quality primarily by location (distance and angle from goal).
What the best do
Classify shots by the preceding action type and compute action-specific xG adjustments. A "cutback shot" from 12 yards is a fundamentally different proposition than a "cross header" from 12 yards. Optimize the team's attacking system to maximize the proportion of high-value pre-shot action types.
Why it's an edge: If your team takes 15 shots per game but 12 are from set defense after lateral passes, your xG will be low despite high volume. If your team takes 10 shots per game but 7 are after cutbacks or through balls, your xG will be higher despite lower volume. The pre-shot action mix is the lever.
How to exploit: Classify all shots by pre-shot action type. Compute team pre-shot action mix. Compare to optimal mix (what proportion of cutbacks, through balls, set pieces produces the highest aggregate xG per shot?). Coach toward higher-value pre-shot action types.
Perdomo & Zarrella, 23 Sports, StatsBomb Conference, 2019-10-28. Cutbacks from the byline as disproportionately effective against set defenses.
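The pre-shot action mix can be computed with a simple grouping, sketched below. It assumes each shot has already been labeled with a pre-shot action type and an xG value; the labels and numbers are made up for illustration.

```python
from collections import defaultdict

def pre_shot_mix(shots):
    """shots: list of (pre_shot_action, xg).

    Returns {action: (share_of_shots, xg_per_shot)}.
    """
    grouped = defaultdict(list)
    for action, xg in shots:
        grouped[action].append(xg)
    total = len(shots)
    return {a: (len(v) / total, sum(v) / len(v)) for a, v in grouped.items()}

shots = [
    ("cutback", 0.30),
    ("cutback", 0.20),
    ("lateral pass", 0.05),
    ("lateral pass", 0.07),
]
mix = pre_shot_mix(shots)
```

Tracking the share column over time shows whether coaching is actually shifting the team toward the higher-value action types.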
🔑 Hidden Causal Lever

Bournemouth's Ceiling Change Was Permanent — Models Kept Expecting Regression That Never Came

betting-intelligence · promoted-team-evaluation

After promotion, Bournemouth under Andoni Iraola established a permanently higher performance ceiling that models kept expecting to regress. Each season, models projected them as relegation candidates based on squad value and promoted-team priors. Each season, they performed as a solid mid-table team. The ceiling change was structural: Iraola's tactical system extracted performance above the squad's market value, and the club's recruitment was well-targeted. When a promoted team's overperformance persists for 2+ seasons, update the prior permanently.

What most people do
Keep applying a "promoted team penalty" long after the team has established itself.
What the best do
Recognize structural ceiling changes within 1-2 seasons and permanently update the team's baseline. Back these teams when the market still treats them as relegation candidates.
Bet The Process podcast, Bournemouth sustained performance analysis, 2024-2025.
🔑 Hidden Causal Lever

Long Throws Are the Last Undefended Frontier in Elite Football

betting-intelligence · set-piece-economic-value

Long throws into the box are functionally equivalent to corners, but teams do not drill to defend them. A team with a long-throw specialist (Rory Delap at Stoke is the classic example; Ipswich Town is a more recent one) gains an additional 15-20 set-piece delivery opportunities per match that the opposition has not specifically prepared for. The xG per long throw delivery is comparable to corners, but the defensive preparation against them is near zero at most clubs.

What most people do
Dismiss long throws as a lower-division tactic unworthy of elite football.
What the best do
Identify or develop a long-throw specialist and build specific second-phase routines around long-throw clearances. Treat every throw-in in the final third as a set-piece opportunity.
Bet The Process podcast, Ipswich Town set-piece analysis, 2024-2025.
🔑 Hidden Causal Lever

A Team With 1.0 xG From One Shot Has Completely Different Win Probability Than 1.0 xG From Ten Shots

Total xG comparison hides the shape of the goal distribution. A single penalty (roughly 0.76 xG) can only produce 0 or 1 goals. Ten 0.10 xG shots sum to 1.0 xG but spread over many outcomes: about 35% of the time the team scores nothing, about 39% of the time exactly one goal, and about 26% of the time two or more. Two chance profiles with similar totals can therefore imply very different win probabilities. Shot-by-shot simulation captures this distribution difference that aggregate xG comparison misses.

What most people do
Compare total xG between teams and conclude "Team A deserved to win because they had higher xG."
What the best do
Run 10,000 simulations treating each shot as an independent Bernoulli trial with its individual xG. Report win/draw/loss probability distributions. "This match would be won by Team A 58% of the time, drawn 22%, lost 20%." This distinguishes "dominated" (one team wins 90%+ of simulations) from "competitive" (both teams win 40-50% of simulations) — even when total xG looks similar.
Why it's an edge: Shot distribution matters enormously for understanding match quality. Two teams with 1.5 xG each look equal on aggregate. But if Team A's xG comes from two 0.75 shots and Team B's from fifteen 0.10 shots, Team B is the higher-variance team: it can score three or more, which Team A cannot, but it also fails to score about 21% of the time versus about 6% for Team A. The simulation reveals which team had the more robust attacking process.
How to exploit: Automate shot-by-shot simulation for every match. Present simulation results (not just total xG) to coaches. Use the distribution to distinguish genuine process advantages from variance — a team that consistently wins 60%+ of their match simulations is genuinely good, regardless of actual results.
Ted Knutson & Siqur Arshad, WFS 2019. El Clasico simulation: Real Madrid 58% win probability but actual result was 2-2 draw.
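A minimal sketch of the shot-by-shot simulation described in this card, using only the standard library. Each shot is an independent Bernoulli trial with its xG as the success probability. The function name, the fixed seed, and the two example shot lists are assumptions for illustration.

```python
import random

def simulate_match(xg_a, xg_b, n_sims=10_000, seed=0):
    """Return (win_a, draw, win_b) shares from simulating each shot independently."""
    rng = random.Random(seed)
    wins_a = draws = wins_b = 0
    for _ in range(n_sims):
        goals_a = sum(rng.random() < p for p in xg_a)
        goals_b = sum(rng.random() < p for p in xg_b)
        if goals_a > goals_b:
            wins_a += 1
        elif goals_a < goals_b:
            wins_b += 1
        else:
            draws += 1
    return wins_a / n_sims, draws / n_sims, wins_b / n_sims

few_big = [0.75, 0.75]      # 1.5 xG from two shots
many_small = [0.10] * 15    # 1.5 xG from fifteen shots
win_a, draw, win_b = simulate_match(few_big, many_small)
```

With equal totals, the two-shot side still wins more often under these assumptions (an exact calculation gives roughly 39% / 29% / 32% for win / draw / loss), because it rarely fails to score. Independence between shots is itself a simplification; rebounds and other clustered chances violate it.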
🔑 Hidden Causal Lever

Chelsea's Turnover Problem — Squad Spending Without Compounding Destroys Value

Chelsea's post-Boehly spending spree (~1B+ in 3 windows) produced worse results than the pre-spending baseline because constant player turnover prevents tactical compounding. A team's performance compounds when the same players repeat tactical patterns together over multiple seasons. When you replace 8-10 players every summer, you reset the compounding clock to zero. The depth looks great on paper but the players can't execute rehearsed patterns because they haven't rehearsed together long enough.

What most people do
Assume squad spending translates linearly to squad quality.
What the best do
Value squad continuity as a multiplier on quality. A 70M squad with 3 years of tactical compounding can outperform a 200M squad in its first year together.
Bet The Process podcast, Chelsea rebuild analysis, 2024-2025.
🔑 Hidden Causal Lever

PSR Amortization Games Create Phantom Transfer Fees — Learn to Read the Real Price

Under Premier League PSR rules, transfer fees are amortized over the contract length. A 50M fee on a 5-year contract costs 10M/year on the books. Clubs exploit this by structuring deals with inflated headline fees and add-ons that will never trigger. The reported fee is not the economic fee. When evaluating a league's transfer spending or a competitor's recruitment budget, strip out the amortization games to see the real prices being paid.

What most people do
Take reported transfer fees at face value and assume they reflect true market valuations.
What the best do
Adjust reported fees for: (a) contract length (shorter contracts mean less amortization benefit, so the buying club paid a "premium" in PSR terms), (b) add-on likelihood (most add-ons never trigger), (c) sell-on clauses. The economic value of a transfer is often 20-30% below the headline number.
Bet The Process podcast, PSR analysis, 2024-2025.
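The adjustment described in this card reduces to two small calculations, sketched here. Straight-line amortization matches the 50M over 5 years example in the card; the add-on amounts and trigger probabilities are hypothetical, and sell-on clauses and any regulatory cap on the amortization period are not modeled.

```python
def annual_amortization(fee, contract_years):
    """Straight-line book cost per year of a transfer fee."""
    if contract_years <= 0:
        raise ValueError("contract_years must be positive")
    return fee / contract_years

def expected_fee(guaranteed, add_ons):
    """Guaranteed fee plus probability-weighted add-ons.

    add_ons: list of (amount, probability_of_triggering).
    """
    return guaranteed + sum(amount * prob for amount, prob in add_ons)

book_cost = annual_amortization(50.0, 5)        # the example from the card
headline = 40.0 + 10.0                          # reported as "50M"
economic = expected_fee(40.0, [(10.0, 0.2)])    # assumed 20% trigger chance
discount = 1 - economic / headline
```

In this illustration the economic fee is 42M against a 50M headline, a 16% gap driven entirely by the assumed add-on probability.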
🔑 Hidden Causal Lever

Salah's Statistical Halving at 33 Is the Template for Pace-Dependent Forward Decline

Mohamed Salah's per-90 output at age 33 was approximately half his peak-season numbers across xG, successful dribbles, and sprint frequency. This isn't unique — it's the standard aging curve for pace-dependent forwards. The market consistently overvalues pace-dependent forwards in their early 30s because of name recognition and recent memory. The statistical template: at 33, expect ~50% of peak output from pace-reliant attackers.

What most people do
Assume elite forwards maintain their level until a sudden cliff.
What the best do
Apply position-specific aging curves and sell/don't renew when the curve predicts the decline, not when the decline is visible in results.
Bet The Process podcast, Salah aging curve analysis, 2025.
🔑 Hidden Causal Lever

Standard xT Says Passing Across Your Own Box Is Neutral. It's Extremely Dangerous.

Standard xT assigns near-zero to lateral passes across your penalty area. Risk-adjusted xT makes these strongly negative — correctly reflecting extreme danger if intercepted.

What most people do
Use standard xT, blind to defensive risk.
What the best do
Implement turnover penalty and risk adjustment to reveal which defenders put the team at risk.
Why it's an edge: Defender recruitment on standard xT misses the risk dimension. A CB who progresses well but creates dangerous turnovers looks good on standard xT.
How to exploit: Rank CBs by net xT (progression minus risk). Target defenders who progress WITHOUT negative-xT turnovers.
Cross-domain parallel
Risk-adjusted returns in finance — raw return without volatility adjustment overvalues high-variance strategies.
PhD student, StatsBomb Conference, 2019-10-30
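The talk's exact risk adjustment is not reproduced here. The sketch below assumes one simple form of it: a completed action earns its xT gain, and a turnover is charged the xT the opponent would hold at the spot where the ball was lost. Field layout and numbers are hypothetical.

```python
def standard_xt(actions):
    """Sum xT gains of completed actions only (blind to turnover risk)."""
    return sum(gain for gain, completed, _ in actions if completed)

def net_xt(actions):
    """actions: list of (xt_gain_if_completed, completed, opponent_xt_if_lost).

    Completed actions add their gain; turnovers subtract the opponent's threat.
    """
    total = 0.0
    for gain, completed, opp_threat in actions:
        total += gain if completed else -opp_threat
    return total

# Hypothetical centre-back: two progressive passes, one loss across his own box
cb_actions = [(0.02, True, 0.0), (0.03, True, 0.0), (0.0, False, 0.15)]
raw = standard_xt(cb_actions)
net = net_xt(cb_actions)
```

The same player is positive on standard xT and clearly negative on net xT, which is the gap the card says defender recruitment misses.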
🔑 Hidden Causal Lever

xG at Extreme Angles Is Contaminated by Non-Shot Events

At very low goal-mouth angles, xG increases counterintuitively because events from extreme angles only enter training data as "shots" when they accidentally result in goals. Selection bias toward goals contaminates the peripheral-angle training data.

What most people do
Accept xG at face value at all angles.
What the best do
Plot xG vs. angle, check for inversion, extrapolate the monotonic curve, downsample contaminated data.
Why it's an edge: Uncorrected models overvalue shots from extreme angles, distorting evaluation for cutback-heavy play styles.
How to exploit: Run the angle-xG diagnostic. If inversion exists, correct and revalidate.
Dr. Dinesh Vatvani, StatsBomb Conference, 2022-10-04
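A sketch of the angle diagnostic, using empirical goal rate per angle bin as a stand-in for model xG. It assumes goal probability should not fall as the goal-mouth angle widens, so any bin that outscores its wider neighbor is flagged. Bin edges and sample counts are invented.

```python
def goal_rate_by_angle(shots, bin_edges):
    """shots: list of (angle_deg, is_goal). Goal rate per [lo, hi) bin, None if empty."""
    rates = []
    for lo, hi in zip(bin_edges, bin_edges[1:]):
        outcomes = [g for a, g in shots if lo <= a < hi]
        rates.append(sum(outcomes) / len(outcomes) if outcomes else None)
    return rates

def inverted_bins(rates):
    """Indices of bins whose goal rate exceeds the next, wider-angle bin."""
    return [
        i for i in range(len(rates) - 1)
        if rates[i] is not None and rates[i + 1] is not None and rates[i] > rates[i + 1]
    ]

# Hypothetical data: the tightest bin is inflated by goal-only selection
shots = (
    [(2.0, True)] * 2 + [(2.0, False)] * 8
    + [(10.0, True)] * 1 + [(10.0, False)] * 19
    + [(20.0, True)] * 3 + [(20.0, False)] * 17
)
rates = goal_rate_by_angle(shots, [0, 5, 15, 30])
flagged = inverted_bins(rates)
```

A flagged bin is a prompt to inspect how those events were tagged as shots, not proof of contamination on its own; small bins will also invert by chance.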
🔑 Hidden Causal Lever

Shot Differential Beats xG as an Early-Season Predictor Because It Has Less Measurement Noise

In the first 6 matches of a season, raw shot differential (shots for minus shots against per match) is a more reliable predictor of end-of-season finishing position than xG differential. This is because xG models add noise through shot quality estimation in small samples — a team might have 5 high-xG shots that were actually well-defended, or 15 low-xG shots that were genuinely dangerous. Shot differential strips out the quality estimation and measures the more stable underlying driver: territorial dominance.

What most people do
Use xG from matchday 1 and trust it as the superior metric immediately.
What the best do
Use shot differential for the first 6-8 matches, then transition to xG differential once sample sizes make the quality estimation reliable (10+ matches).
Bet The Process podcast, early-season prediction methodology, 2024-2025.
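A minimal sketch of the metric switch, assuming per-match lists of shots and xG for and against. The function names and the `switch_at=10` cut-over are assumptions; the text leaves the 8-to-10-match window open, so the threshold is a parameter.

```python
def per_match_diff(for_values, against_values):
    """Average per-match differential (for minus against)."""
    return (sum(for_values) - sum(against_values)) / len(for_values)


def early_season_signal(shots_for, shots_against, xg_for, xg_against, switch_at=10):
    """Return (metric_name, value): shot differential early, xG differential later."""
    played = len(shots_for)
    if played < switch_at:
        return "shot_diff", per_match_diff(shots_for, shots_against)
    return "xg_diff", per_match_diff(xg_for, xg_against)


# Hypothetical six-match sample.
signal = early_season_signal(
    shots_for=[14, 12, 16, 10, 15, 13],
    shots_against=[9, 11, 8, 12, 10, 6],
    xg_for=[1.4, 1.1, 1.9, 0.8, 1.6, 1.2],
    xg_against=[0.9, 1.3, 0.7, 1.5, 1.0, 0.5],
)
```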
🔑 Hidden Causal Lever

Teams That Overperform xG Through Set Pieces Don't Regress — But the Market Doesn't Distinguish

Generic "regression to xG" advice treats all overperformance as luck. But overperformance driven by elite set-piece coaching is structural and persistent — it doesn't regress because it's a genuine repeatable skill advantage. The market applies a blanket regression adjustment, creating value on teams whose overperformance is set-piece driven.

What most people do
Apply uniform regression expectations to all teams overperforming xG.
What the best do
Decompose overperformance into open-play vs. set-piece components. Open-play finishing overperformance regresses aggressively. Set-piece overperformance persists. Bet accordingly.
Bet The Process podcast, set-piece persistence analysis, 2024-2025.
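The decomposition can be sketched as goals minus xG, summed separately per phase. The shot tuple layout and the two phase labels are assumptions for the example; real event data would need a mapping from its own play-pattern field.

```python
def overperformance_split(shots):
    """shots: iterable of (phase, xg, is_goal), phase in {'open_play', 'set_piece'}.

    Returns goals minus xG per phase.
    """
    totals = {"open_play": 0.0, "set_piece": 0.0}
    for phase, xg, is_goal in shots:
        totals[phase] += (1.0 if is_goal else 0.0) - xg
    return totals


# Hypothetical shots for illustration only.
shots = [
    ("open_play", 0.10, True),
    ("open_play", 0.30, False),
    ("set_piece", 0.05, True),
    ("set_piece", 0.08, True),
    ("set_piece", 0.07, False),
]
split = overperformance_split(shots)
```

A team whose surplus sits mostly in the `set_piece` bucket is the case the text argues should not be regressed as aggressively.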

💎Elite-Only Behavior(30)

💎 Elite-Only Behavior

"Tools Not Answers" — Users Trust Their Own Conclusions More Than Analyst-Delivered Findings

Analytics departments that deliver answers create bottlenecks and distrust. Departments that deliver self-service tools let decision-makers explore data themselves and draw their own conclusions. The psychological mechanism: people trust conclusions they reached themselves more than conclusions handed to them, even if the underlying data is identical.

What most people do
Build sophisticated models and deliver reports — "our analysis says you should sign Player X."
What the best do
Build queryable tools that let coaches and sporting directors explore the data space themselves. "Here's the tool — filter by what you care about and see what comes out." When the user discovers the answer themselves, adoption is dramatically higher.
Why it's an edge: This shifts the analytics department from a bottleneck (everything goes through the analyst) to a multiplier (every decision-maker has data access). It also solves the trust problem: coaches who discover patterns in their own exploration trust the data more than any presentation could achieve.
How to exploit: Invest in self-service dashboards and exploration tools rather than report generation. Build "tools not answers." Measure adoption by how many queries non-analysts run per week, not by how many reports the analytics team produces.
Sam Gregory, Inter Miami, StatsBomb Conference, 2022-09-29. "When in doubt, database" principle.
💎 Elite-Only Behavior

Every "Stupid Number" a Coach Reports Should Become a Permanent Test Case

The most valuable model validation isn't top-line accuracy metrics — it's ordered assertions that verify the model respects known football truths. "A penalty-area shot should always have higher xG than an identical shot from 30 yards." These behavioral tests catch failures that aggregate accuracy misses, automate the "eye test," and create a permanent contract between the model and domain expertise. The key practice: every time an analyst or coach says "that number is stupid," the fix goes in as a permanent test.

What most people do
Validate models on aggregate accuracy (log-loss, AUC) and spot-check a few outputs visually.
What the best do
Build a growing library of ordered assertions ("all else equal, A > B") written by domain experts and run on every retrain. Treat every "stupid number" report as a test case to be codified permanently. Never train on the test cases. The library only grows — it never shrinks.
Why it's an edge: Models degrade silently during retraining. A well-maintained assertion suite catches regressions that aggregate metrics miss because the regression might be confined to a specific scenario (like direct corners) that barely moves the aggregate number but produces absurd individual outputs. The assertion library is institutional knowledge that prevents the same mistake twice.
How to exploit: Start an assertion library today. Ask coaches and analysts: "What should always be true about this model's output?" Write each answer as a test. Run the full suite on every model update. Track which tests fail most often — those reveal systematic model weaknesses.
StatsBomb CTO, StatsBomb Innovation in Football Conference, 2019-10-25. "Keep your hand up if you always write unit tests" — nobody does, and that's the problem.
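An assertion library of this kind can be as simple as a list of ordered pairs run against the model on every retrain. In the sketch below, `toy_xg` is a stand-in model invented for illustration, and the feature names are hypothetical; in practice the callable would wrap the real model's prediction function.

```python
def toy_xg(distance_m, is_header=False):
    """Stand-in model for illustration only; replace with the real model's predict call."""
    base = max(0.01, 0.5 - 0.015 * distance_m)
    return base * (0.7 if is_header else 1.0)


# Each entry: (description, inputs that should score HIGHER, inputs that should score LOWER)
ASSERTIONS = [
    ("penalty-area shot beats the same shot from 30 yards",
     {"distance_m": 11.0}, {"distance_m": 27.4}),
    ("foot shot beats header from the same spot",
     {"distance_m": 8.0, "is_header": False}, {"distance_m": 8.0, "is_header": True}),
]


def run_assertions(model, assertions):
    """Return the descriptions of every ordered assertion the model violates."""
    return [desc for desc, hi, lo in assertions if not model(**hi) > model(**lo)]


failures = run_assertions(toy_xg, ASSERTIONS)


def broken_model(distance_m, is_header=False):
    """A degenerate model that ignores its inputs, to show what a failure looks like."""
    return 0.1


broken_failures = run_assertions(broken_model, ASSERTIONS)
```

Each new "stupid number" report becomes one more tuple in `ASSERTIONS`. Keeping the list as data rather than as scattered test functions makes it easy to count which assertions fail most often.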
💎 Elite-Only Behavior

Slot's Defensive Transformation Is Immediate and Portable

Arne Slot's defensive metrics improve within 5 matches at every club he manages. At Feyenoord, he inherited a leaky defense and immediately turned it into one of the best defensive units in the Eredivisie. At Liverpool, the same pattern emerged. This is a genuine coaching signal, not squad quality: the players are the same, but the defensive output changes immediately.

What most people do
Attribute defensive improvement to new signings or pre-season preparation.
What the best do
When Slot is appointed, immediately bet unders on his team's goals conceded markets. The defensive transformation precedes the market's adjustment by 8-12 matches.
Bet The Process podcast, Slot at Feyenoord and Liverpool analysis, 2024-2025.
💎 Elite-Only Behavior

Creativity = Novel + Useful — Separate Conception from Execution

A player with high creative decision rating but low completion has high ambition and low execution — these are different coaching problems. Penalizing completion suppresses the creative decisions themselves.

What most people do
Use assists or chance creation counts. Penalize creative passers with low completion.
What the best do
Separate decision quality ("they saw it") from execution quality ("they couldn't deliver it"). Train execution without suppressing decisions.
Why it's an edge: Creative players with low completion are systematically undervalued. The coaching intervention is technical, not tactical.
How to exploit: Target high creative decision rating in recruitment regardless of raw completion. Diagnose conception vs. execution before intervening.
Pieter Robberechts, KU Leuven, StatsBomb Conference, 2022-10-05
💎 Elite-Only Behavior

The Gap Between What a Player Chose and What Was Available Is the Purest Skill Signal

When EPV of the best available action is known, comparing it to the EPV of the player's chosen action reveals decision quality independent of execution. A player who consistently chooses actions within 5% of the optimal available action has elite vision — even if their execution sometimes fails. A player who chooses actions 30% below optimal is making poor decisions regardless of their completion rate. This "decision gap" metric is the closest available proxy for football intelligence.

What most people do
Evaluate decisions by outcomes (did the pass complete? did the shot score?).
What the best do
Evaluate decisions by optimality gap: how close was the chosen action to the best available action? A bad decision that succeeds is still a bad decision. A good decision that fails is still a good decision.
Why it's an edge: Decision quality is the most stable and least coachable component of player performance. A player with a small optimality gap will perform well across systems and contexts because they consistently identify the highest-value action. This is the closest data gets to measuring "football IQ."
How to exploit: Compute per-player optimality gap from EPV option analysis. Use as a primary filter in recruitment — it's system-independent and predicts adaptation to new teams better than any on-ball metric.
Javier Fernandez, FC Barcelona, 2019-10-22. EPV-based decision quality as the core of the option-aware evaluation framework.
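A minimal sketch of the optimality gap, assuming an EPV model already supplies a value for the chosen action and for every available option (the chosen action included). Expressing the gap as a fraction of the best option mirrors the 5% and 30% figures above; treating non-positive best values as a zero gap is an assumption of this sketch.

```python
def optimality_gap(chosen_epv, available_epvs):
    """Relative shortfall of the chosen action versus the best available one."""
    best = max(available_epvs)
    if best <= 0:
        return 0.0  # assumption: no positive option existed, so no gap is charged
    return (best - chosen_epv) / best


def mean_gap(decisions):
    """decisions: iterable of (chosen_epv, [epv of each available option])."""
    gaps = [optimality_gap(chosen, options) for chosen, options in decisions]
    return sum(gaps) / len(gaps)


# Hypothetical decisions: one optimal choice, one 20% below optimal.
decisions = [
    (0.10, [0.10, 0.06]),
    (0.08, [0.10, 0.08, 0.02]),
]
player_gap = mean_gap(decisions)
```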
💎 Elite-Only Behavior

DDI Captures Coaching Philosophy — Same Players, 4x Different DDI Under Different Managers

Bielsa-era Leeds had DDI values 4x higher than their successors with the same core players. DDI changed because coaching changed, not players. DDI measures the coach's defensive system, not individual quality.

What most people do
Attribute defensive improvements to player recruitment. Use pressing metrics that measure individual actions.
What the best do
Use DDI to isolate coaching contribution. Compare before/after manager changes. Use DDI in manager recruitment.
Why it's an edge: Manager evaluation based on results conflates squad quality with coaching quality. DDI provides a coaching-specific defensive metric.
How to exploit: Build a manager DDI database. A manager with consistently high DDI across different squads is a genuine defensive tactician.
Ricardo Furbino, StatsBomb Conference, 2022-10-05
💎 Elite-Only Behavior

The Cheapest Elite Players Are the Ones With One Correctable Flaw

Players who fit a game model profile in 7 of 8 key metrics but have one identifiable, correctable weakness are systematically underpriced because the market evaluates current performance, not development potential. The analytical edge is distinguishing correctable weaknesses (positional habits, specific technical adjustments, decision speed in certain zones) from structural ones (physical limitations, psychological traits, age-related decline).

What most people do
Recruit players who fit all criteria and pay premium prices, or settle for players who fit poorly across multiple dimensions.
What the best do
Specifically search for players who are roughly 90% profile fit (7 of 8 key metrics) with one data-identifiable weakness, then classify whether that weakness is habitual (fixable) or structural (permanent). Require an explicit development hypothesis and coaching plan before acquisition. Track correction at 3-month intervals.
Why it's an edge: The market discount for one visible weakness is disproportionate to the actual cost of fixing it. A positional tendency that takes 6 months of coaching to correct might reduce a player's transfer fee by 40-60%. The ROI on development is dramatically higher than the premium for a "complete" player.
How to exploit: Build a "fixable weakness playbook" calibrated to your club's development capability. For each transfer window, run the search for 7-of-8 profile matches with one fixable weakness alongside the standard search. Compare transfer fee savings against development investment cost.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18. Positioned as Phase 3 bridge between recruitment and development analytics.
💎 Elite-Only Behavior

Ederson's Champions League Failures Are a Measurable, Repeatable Decision Error

Ederson consistently rushes out on long-range 1v1s, turning 80-90% save-probability situations into 50/50s. This specific decision error has cost his teams in multiple high-stakes moments (CL final vs. Chelsea, QF vs. Spurs, Copa America final). The error is identifiable in the data: his engagement rate on long-range 1v1s is far above the optimal threshold.

What most people do
Accept rushing out as "aggressive goalkeeping" and treat the goals conceded as individual bad luck.
What the best do
Map save probability by distance for the specific GK, identify the distance threshold where rushing out becomes net-negative, and train the GK to hold position beyond that threshold.
Why it's an edge: This is a correctable habit, not a talent limitation. Showing the GK their own save-probability curve by distance reveals the exact threshold where their behavior becomes suboptimal. Most GK coaches work from intuition, not from decision-boundary data.
How to exploit: For every GK, compute save probability by engagement distance for 1v1 situations. Identify the crossover point where holding beats rushing. Put pitch markings in training at that threshold. If the GK is an opponent, exploit by manufacturing long-range 1v1s (chip passes over the midfield into the channel).
John Harrison, via Max Odenheim, StatsBomb Conference, 2021-11-04. Ederson CL final, QF, Copa America examples cited.
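The crossover point can be estimated by comparing save rates for rushing and holding within distance bins. The sketch assumes each 1v1 is a hypothetical `(distance_m, action, saved)` record, where `distance_m` is the attacker's distance from goal at the keeper's decision and `action` is `"rush"` or `"hold"`; the 5 m bin width is also an assumption.

```python
from collections import defaultdict


def save_rates(situations, bin_m=5):
    """Returns {bin_start: {action: save_rate}} from (distance_m, action, saved) records."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # bin -> action -> [saves, n]
    for distance_m, action, saved in situations:
        cell = counts[int(distance_m // bin_m) * bin_m][action]
        cell[0] += int(saved)
        cell[1] += 1
    return {
        b: {action: saves / n for action, (saves, n) in actions.items()}
        for b, actions in counts.items()
    }


def crossover_distance(rates):
    """Smallest distance bin where holding position saves more often than rushing."""
    for b in sorted(rates):
        r = rates[b]
        if "hold" in r and "rush" in r and r["hold"] > r["rush"]:
            return b
    return None


# Toy data: rushing works at 12 m, holding works at 27 m.
situations = (
    [(12, "rush", True)] * 3 + [(12, "rush", False)]
    + [(12, "hold", True)] + [(12, "hold", False)] * 3
    + [(27, "rush", True)] * 2 + [(27, "rush", False)] * 2
    + [(27, "hold", True)] * 4 + [(27, "hold", False)]
)
rates = save_rates(situations)
threshold = crossover_distance(rates)
```

With real data, each bin needs enough 1v1s per action for the rates to be meaningful, which for a single goalkeeper usually means pooling several seasons.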
💎 Elite-Only Behavior

360 Data Makes Player Positioning Visible — But Only 500 Manual Labels Are Needed to Classify All Phases of Play

The bottleneck in tactical phase classification isn't model complexity; it's labeled training data. With GCN (graph convolutional network) embeddings from 360 data, only ~500 manually labeled actions (about 30 minutes of analyst time) are sufficient to train a simple classifier that accurately labels ALL remaining actions into phases of play. The embedding captures the spatial structure; the classifier just needs a few examples of each phase. This 500-label approach is 100x more efficient than traditional manual video coding.

What most people do
Manually code phases of play from video, requiring hundreds of hours of work per season.
What the best do
Train GCN embeddings on 360 data (unsupervised), then label ~500 actions across all phase types, train a lightweight classifier, and auto-classify the entire season. The time investment is 30 minutes of labeling + a few hours of compute, vs. hundreds of hours of manual coding.
Why it's an edge: The speed difference makes tactical phase analysis feasible for every match rather than a select few. Clubs with this capability can run phase-specific analysis for every opponent, while competitors manually code only priority matches.
How to exploit: Build the GCN embedding pipeline. Have your analyst label 500 actions at the start of each season. Auto-classify all matches. Run phase-specific analysis for every opponent preparation, not just the top-6 matches.
StatsBomb Conference presentations on GCN action embeddings with 360 data.
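The "lightweight classifier" step can be illustrated with a nearest-centroid classifier over precomputed embeddings. This is a stand-in chosen for brevity, not the method from the presentations: the embedding pipeline itself is not shown, and the two-dimensional vectors and phase labels below are invented for the example.

```python
import math


def fit_centroids(embeddings, labels):
    """Average the labeled embedding vectors per phase label."""
    sums, counts = {}, {}
    for vec, lab in zip(embeddings, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, vec):
    """Assign the phase whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], vec))


# Tiny hypothetical labeled set standing in for the ~500 analyst labels.
labeled = [[0.0, 0.0], [0.2, 0.1], [5.0, 5.0], [5.2, 4.9]]
phases = ["buildup", "buildup", "counter", "counter"]
centroids = fit_centroids(labeled, phases)

auto_labels = [predict(centroids, v) for v in [[0.1, 0.3], [4.8, 5.1]]]
```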
💎 Elite-Only Behavior

The Alpha Is No Longer in Having a Better xG Model — It's in Knowing Where ALL Models Are Blind

As xG-based models have proliferated, the betting market has priced in most model-derived edges. The remaining alpha comes not from building a more accurate model, but from identifying systematic gaps where ALL widely-used models share the same blind spots — features like shot power, ball-striking quality, and situational factors outside the training data.

What most people do
Build incrementally better xG models and assume the improved accuracy translates to betting or analytical edge.
What the best do
Map the feature space that dominant models DON'T capture (shot velocity, player technique quality, crowd effects, specific situational factors). Find situations where these missing features systematically bias predictions. The edge is in the limitations, not the predictions.
Why it's an edge: Model improvement is incremental and converges across competitors. Model gap exploitation is structural — it identifies situations where the consensus is wrong in the same direction.
How to exploit: Catalog the features your models and the market's models cannot capture. For each missing feature, identify the situations where its absence creates the largest systematic bias. Focus analytical resources on measuring those missing features, even imperfectly — any information the market doesn't have is edge.
Ted Knutson, Bet The Process, 2026-02-26. "Everyone has a similar xG model now. The edge was in having a model; now the edge is in finding where models are blind."
💎 Elite-Only Behavior

Your Competitors' Model Gaps Are Your Biggest Edge — Reverse-Engineer What They Can't See

Every analytics model has blind spots (xG ignores pre-shot movement quality, xT ignores defensive positioning, xPass ignores pass intent). If you know your competitors' model stack, you can identify what their evaluation systematically misses and exploit those gaps in the transfer market. A player undervalued by xT-based evaluation because xT doesn't capture defensive risk is a buying opportunity if you have risk-adjusted xT. The model gap is the market gap.

What most people do
Build the best model they can and assume competitors are doing the same.
What the best do
Map the known limitations of each publicly available model (xG: no pre-shot context; xT: no defensive risk; GSAA, goals saved above average: no shot-type decomposition). Identify which player attributes each model systematically undervalues. Search for players who are undervalued SPECIFICALLY because of those model limitations.
Why it's an edge: If 80% of clubs use basic xG and xT, then attributes invisible to those models (defensive spatial control, press-breaking carries, threat facilitation) are systematically underpriced. The edge isn't just having a better model — it's knowing what your competitors' models miss.
How to exploit: Catalog your competitors' likely model stack (from hiring patterns, published research, conference presentations). Identify the gaps. Build metrics that specifically measure what their models miss. Recruit players who score highly on your gap-specific metrics.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18. Model gap exploitation as a meta-strategy for competitive advantage.
💎 Elite-Only Behavior

The Best Recruitment Filter Is the Game Model — Not the Position

Most recruitment pipelines filter candidates by position first, then evaluate within position. Elite recruitment pipelines filter by game-model skill requirements first, which sometimes surfaces candidates from unexpected positions. A wide midfielder who profiles identically to your game model's fullback requirements — but has never played fullback — is a legitimate candidate that position-first filtering would eliminate. The skill profile is the constraint, not the positional label.

What most people do
Search for "right-backs" and evaluate all right-backs against the requirements.
What the best do
Define the skill requirements from the game model (progressive carrying, cross-field distribution, pressing intensity, defensive 1v1 rate) without specifying position. Search across all positions for players matching those requirements. Then check positional feasibility as a secondary filter.
Why it's an edge: Position labels are historical artifacts of where a player was deployed, not necessarily a description of their capabilities. Players are regularly "discovered" in new positions (Kimmich at right-back, Alaba at left-back, Walker at center-back) — data-first recruitment finds these transitions before they happen.
How to exploit: For each position to fill, define the skill vector from the game model. Run similarity search across ALL positions, not just the target position. Flag any cross-position candidates whose skill profile matches >80% of requirements.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18. Game model as the anchor for all recruitment decisions.
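A position-agnostic search can be sketched as a requirements check run over every player, with the positional label used only to highlight cross-position matches. Metric names, minimum values, and the player records are hypothetical; expressing requirements as per-metric minimums is one simple reading of "matches >80% of requirements".

```python
def requirement_match(profile, requirements):
    """Share of game-model requirements (metric -> minimum value) that the player meets."""
    met = sum(
        1 for metric, minimum in requirements.items()
        if profile.get(metric, 0.0) >= minimum
    )
    return met / len(requirements)


def cross_position_candidates(players, requirements, target_position, threshold=0.8):
    """Players from OTHER positions whose profile clears the match threshold."""
    return [
        p["name"] for p in players
        if p["position"] != target_position
        and requirement_match(p["metrics"], requirements) > threshold
    ]


# Hypothetical fullback requirements derived from a game model.
requirements = {
    "progressive_carries": 5.0,
    "switches": 2.0,
    "pressures": 18.0,
    "defensive_1v1_win_rate": 0.55,
}
players = [
    {"name": "Wide midfielder X", "position": "RM",
     "metrics": {"progressive_carries": 6.1, "switches": 2.4,
                 "pressures": 19.0, "defensive_1v1_win_rate": 0.58}},
    {"name": "Centre-back Y", "position": "CB",
     "metrics": {"progressive_carries": 1.2, "switches": 3.0,
                 "pressures": 9.0, "defensive_1v1_win_rate": 0.66}},
    {"name": "Right-back Z", "position": "RB",
     "metrics": {"progressive_carries": 5.5, "switches": 2.2,
                 "pressures": 18.5, "defensive_1v1_win_rate": 0.60}},
]
flagged = cross_position_candidates(players, requirements, target_position="RB")
```

Positional feasibility (height, speed, defensive exposure) would then be checked by hand on the flagged names, as the secondary filter described above.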
💎 Elite-Only Behavior

In-Game Weaknesses Emerge Within 15 Minutes But Most Clubs Wait Until Halftime

Statistical patterns of opponent weakness — a defender being consistently beaten on one side, a pressing trigger being bypassed, a specific passing lane being available — typically become detectable from event data within 10-15 minutes if you know what to look for. The conventional halftime analysis delay means 30+ minutes of missed exploitation opportunity. Real-time weakness detection from the bench, communicated to players during natural stoppages (throw-ins, goal kicks), can shift the match before the opponent adjusts.

What most people do
Collect data during the first half, analyze at halftime, adjust for the second half.
What the best do
Run real-time pattern detection during the match. By minute 15, flag emerging weaknesses to the coaching staff. Communicate specific exploits ("their left CB turns slowly — overload his left shoulder on carries") via substitutes' warm-up patterns, set-piece positioning, or direct sideline communication during stoppages.
Why it's an edge: A 30-minute information advantage in a 90-minute game is enormous. If you detect and exploit a weakness from minute 15 rather than minute 45, you get 30 additional minutes of targeted attacking against a specific vulnerability.
How to exploit: Build a real-time dashboard showing rolling pressure response, direction of first touch, and defensive recovery time by opponent player. Flag anomalies (significantly worse than their season average) within 15 minutes.
Multiple StatsBomb Conference presentations on real-time tactical analytics capability.
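The anomaly flag can be sketched as a z-score of each opponent player's in-match value against their season baseline. The metric (a pressure-response success rate), the player keys, and the threshold of two standard deviations are assumptions for illustration.

```python
def flag_anomalies(live, season, z_threshold=-2.0):
    """live: {player: value so far this match}; season: {player: (mean, std)}.

    Higher values are assumed to be better, so a strongly negative z-score
    marks a player performing well below their season norm.
    """
    flags = []
    for player, value in live.items():
        mean, std = season[player]
        if std > 0 and (value - mean) / std <= z_threshold:
            flags.append(player)
    return flags


# Hypothetical pressure-response success rates after 15 minutes.
live = {"LCB": 0.35, "RCB": 0.62}
season = {"LCB": (0.60, 0.10), "RCB": (0.64, 0.08)}
flags = flag_anomalies(live, season)
```

Fifteen minutes yields very few events per player, so a flag like this is a prompt for the analyst to check video, not a conclusion on its own.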
💎 Elite-Only Behavior

Some Teams Require Different Defensive Strategies Under High Block vs. Low Block

Using probabilistic verification on buildup MDPs (Markov decision processes), the optimal defensive disruption strategy (which side to force, which block height) differs per team AND per block structure. 3 of the 20 top La Liga and Bundesliga teams analyzed required switching forcing sides between high and low blocks. A single "force them right" instruction may be correct under one block but wrong under the other. Barcelona are the exception: their buildup is symmetrical, so the forcing side makes little difference, and the exploit is blocking their central players' passing options instead.

What most people do
Give one defensive instruction for the whole match: "force them to the right side."
What the best do
Prepare block-dependent defensive plans. Under high block, force direction X; under low block, force direction Y. For symmetrical teams like Barcelona, focus on teammate-blocking (isolating specific central players' passing options) rather than directional forcing. Have the second-best strategy ready for when the opponent adapts.
Why it's an edge: The nuance of block-dependent strategy switching is invisible to standard scouting. Most opposition analysis identifies one weakness and builds a single plan. The best teams have plans that adapt to their own tactical state, not just the opponent's.
How to exploit: Build buildup MDPs per opponent per defensive setup. Use probabilistic verification to find the optimal forcing direction under each block height. Prepare two defensive briefings: one for high block, one for low block. If the opponent adapts mid-match, switch to the pre-prepared alternative strategy.
Micah, KU Leuven, StatsBomb Conference, 2021-11-04. 3/20 teams requiring block-dependent strategy switching. Barcelona's center-first, symmetrical buildup.
💎 Elite-Only Behavior

The Optimal Defensive Strategy Differs Per Team AND Per Your Block Structure — A Single "Force Them Right" Instruction Is Dangerously Oversimplified

Analysis of 20 teams showed that 3 of them require switching forcing direction between high block and low block. A team that's best forced right under your high block may need to be forced LEFT under your low block — the optimal disruption strategy depends on the interaction between the opponent's buildup patterns AND your defensive structure, not just the opponent alone. A single "force them right" instruction that doesn't account for your own block structure is wrong for at least some configurations.

What most people do
Identify the opponent's weaker buildup side and instruct "force them right" regardless of own defensive setup.
What the best do
Compute disruption probabilities per opponent per defensive setup (high block/low block x force left/force right). For 15% of teams analyzed, the optimal forcing direction SWITCHES between high and low block — meaning a single instruction is correct for one setup but wrong for the other.
Why it's an edge: The interaction between your block and their buildup is invisible without the MDP analysis. Coaches who give blanket forcing instructions based only on the opponent's tendencies will be suboptimal for every game state where they change their own block height.
How to exploit: For each opponent, compute disruption efficiency under all combinations of your block structure and their buildup. If the optimal strategy differs between your high and low block, prepare TWO forcing instructions and communicate the switch trigger.
Micah, KU Leuven, StatsBomb Conference, 2021-11-04. 3 of 20 teams analyzed required different forcing directions under different block structures.
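Once disruption probabilities exist for each combination of block and forcing side, picking the plan is a small lookup. The sketch assumes those probabilities have already been produced by the MDP analysis (not shown) and uses invented numbers.

```python
def best_forcing_by_block(disruption):
    """disruption: {(block, side): probability of disrupting the buildup}.

    Returns {block: side with the highest disruption probability}.
    """
    best = {}
    for (block, side), p in disruption.items():
        if block not in best or p > disruption[(block, best[block])]:
            best[block] = side
    return best


def needs_switch(best):
    """True when the optimal forcing side differs between block heights."""
    return len(set(best.values())) > 1


# Hypothetical opponent: force right in a high block, left in a low block.
disruption = {
    ("high", "left"): 0.31,
    ("high", "right"): 0.42,
    ("low", "left"): 0.38,
    ("low", "right"): 0.29,
}
plan = best_forcing_by_block(disruption)
two_briefings = needs_switch(plan)
```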
💎 Elite-Only Behavior

The Added Value of a Decision Is Measured Against What Was Available — Not Against Zero

EPV of a chosen action minus EPV of the best alternative gives the "added value" of the decision. If a player's best alternative was a 0.10 EPV pass and they chose a 0.12 EPV pass, their added value is 0.02 — not 0.12. A player who consistently finds the 0.02 improvement over the obvious option is elite. A player whose choices match the best obvious option has adequate but not exceptional decision-making. A player who consistently chooses below the best alternative is costing the team.

What most people do
Measure action value as the EPV delta of what happened — ignoring what COULD have happened.
What the best do
Model all available options at each decision point. Compute added value = chosen action EPV - best alternative EPV. This reveals whether the player's contribution was truly creative or merely adequate.
Why it's an edge: A player on a great team may have high EPV per action because the team creates good options — the best available option is already high, and the player just has to take it. Their added value may be near zero. A player on a weak team who consistently finds 0.03 above the best alternative in difficult situations is genuinely creative — but their raw EPV will be low because the team context is weak.
How to exploit: Compute added value per player. Use it as a team-context-independent creativity metric. Identify players on weak teams with high added value — they're demonstrating individual quality in adverse conditions.
Javier Fernandez, FC Barcelona, 2019-10-22. Option-aware evaluation as the pinnacle of the EPV framework.
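The added-value calculation is a single subtraction once the alternatives are modeled. A minimal sketch, assuming the EPV of the chosen action and of each alternative (excluding the chosen one) is already available; function names are hypothetical.

```python
def added_value(chosen_epv, alternative_epvs):
    """Chosen-action EPV minus the best alternative that was available."""
    return chosen_epv - max(alternative_epvs)


def mean_added_value(decisions):
    """decisions: iterable of (chosen_epv, [epv of each alternative])."""
    values = [added_value(chosen, alts) for chosen, alts in decisions]
    return sum(values) / len(values)


# The example from the text: a 0.12 pass chosen over a 0.10 best alternative.
example = added_value(0.12, [0.10, 0.04])
```

Averaging `added_value` per player, rather than raw EPV, is what removes the team-context effect described above.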
💎 Elite-Only Behavior

Pass Originality Is Measurable — And It's the Strongest Signal of Creative Quality

By computing how surprising each pass is relative to the expected pass distribution at that moment, you get a pass originality score. Players who consistently make high-originality passes that succeed — passes that the model wouldn't predict but that work — are demonstrating creative vision that no standard metric captures. This is distinct from pass completion, progressive passes, or xA, all of which can be generated by volume.

What most people do
Use xA (expected assists) or progressive pass volume as proxies for creativity. These conflate opportunity (team context, position) with quality.
What the best do
Train a model to predict pass destination from context (field position, player positions, pressure state). Compute surprise = -log(P(actual_destination)). Players with high mean surprise AND high completion in high-surprise passes are genuinely creative.
Why it's an edge: Creativity is the hardest attribute to quantify, so the market prices it via subjective scouting consensus. A data-driven originality metric finds creative players that scouting misses — especially those on weaker teams whose creativity doesn't translate to assists because teammates don't finish.
How to exploit: Build the pass prediction model, compute surprise scores, filter to high-surprise + high-completion. These players are creative AND precise — the rarest combination. Cross-reference with xA: players with high originality but low xA are being wasted by their teammates.
Derived from option-aware pass decision evaluation framework, StatsBomb Conference presentations on pass intent and decision quality.
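Given a pass-destination model, the surprise score follows directly from the formula above. The sketch assumes the model's probability for the actual destination is already computed per pass; the cut-off of 3.0 nats for a "high-surprise" pass (roughly a 5% destination probability) is an assumption.

```python
import math


def surprise(p_actual_destination):
    """Surprise in nats: -log of the model's probability for where the pass went."""
    return -math.log(p_actual_destination)


def originality_summary(passes, high_surprise=3.0):
    """passes: iterable of (probability of actual destination, completed)."""
    passes = list(passes)
    scores = [surprise(p) for p, _ in passes]
    high = [done for (_, done), s in zip(passes, scores) if s >= high_surprise]
    return {
        "mean_surprise": sum(scores) / len(scores),
        "high_surprise_completion": (sum(high) / len(high)) if high else None,
    }


# Hypothetical passes: one routine, three unexpected (two of them completed).
summary = originality_summary([(0.5, True), (0.02, True), (0.04, False), (0.01, True)])
```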
💎 Elite-Only Behavior

A 0.4% Probability Pass That Succeeds Is Your Best Recruitment Signal

The Penetrative Pass Probability (P3) model predicts whether a penetrative pass is AVAILABLE at any moment — regardless of whether the player actually plays it. The gap between probability and execution is the analytical insight: players who convert low-probability penetrative moments (0.4% probability passes that succeed) demonstrate exceptional passing ability visible nowhere else in the data. These are the signatures of genuinely special creative players.

What most people do
Evaluate passers by completion rate, progressive pass volume, or xA — all of which are dominated by volume and opportunity rather than quality of vision.
What the best do
Identify moments where the P3 model says no penetrative pass should be possible (compact defensive hull, controlled space) but the player finds one anyway. These rare moments (maybe 2-3 per match for elite players) are the strongest signal of creative passing quality available.
Why it's an edge: The market evaluates passers on volume metrics that correlate with team quality and playing time. P3-based evaluation isolates individual vision from team context — a player on a mediocre team who consistently finds penetrative passes that the model says shouldn't exist is a hidden gem.
How to exploit: For each recruitment target, compute the ratio of actual penetrative passes to P3-predicted opportunities. Players with high actual/predicted ratios in low-probability situations are elite creators. This metric is team-independent and identifies talent on weaker teams that volume-based metrics miss.
Hadi Sotude, StatsBomb Conference, 2021-11-04. 0.4% probability successful pass cited as recruitment signal.
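One way to read the actual-to-predicted ratio is actual penetrative passes divided by the sum of P3 probabilities over the same moments, restricted to low-probability situations. That reading, the 5% cut-off, and the record layout are assumptions of this sketch; the P3 model itself is not shown.

```python
def p3_ratio(moments, max_probability=0.05):
    """moments: iterable of (p3_probability, played_penetrative_pass).

    Returns actual penetrative passes divided by the expected count
    in moments where the model gave at most `max_probability`.
    """
    low = [(p, played) for p, played in moments if p <= max_probability]
    expected = sum(p for p, _ in low)
    actual = sum(1 for _, played in low if played)
    return actual / expected if expected > 0 else None


# Hypothetical moments, including one 0.4% pass that was played.
moments = [(0.004, True), (0.02, False), (0.03, False), (0.046, True), (0.4, True)]
ratio = p3_ratio(moments)
```

Because the expected count in low-probability moments is tiny, the ratio is noisy on small samples and is best compared across players with a similar number of qualifying moments.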
💎 Elite-Only Behavior

Risk Preference Is a Stable Trait — If You Want a Different One, You Have to Recruit Differently

Using EPV, each player's risk preference in decision-making can be quantified: do they consistently choose high-variance/high-reward actions or low-variance/low-reward ones? The key finding: risk preference is a stable individual trait, not a coaching-adjustable behavior. Messi, Arthur, and Puig have distinct, measurable risk profiles that persist across game states. If your game model requires risk-seeking midfield play and your current midfielders are all risk-averse, coaching won't fix it — you need different players.

What most people do
Try to coach risk tolerance ("be braver," "take more risks") without measuring it or recognizing it as a stable trait.
What the best do
Quantify each player's risk preference from EPV action choice data. Match risk preference to game model role requirements. Accept that mismatches between player risk preference and role requirements are structural, not fixable through coaching.
Why it's an edge: Risk preference profiling separates "the decision was right but failed" from "the decision was wrong but succeeded." A coach who penalizes all failed risky passes teaches players to avoid risk, destroying the team's ability to create. Understanding that some failures are correct decisions that happened to fail preserves the positive-expected-value behaviors.
How to exploit: Profile every player's risk preference. Map your game model's risk requirements per position. Recruit specifically for risk-preference fit, not just skill fit. When evaluating in-game decisions, separate decision quality (was the EPV of the chosen action positive?) from execution quality (did it succeed?). Coach execution, not decision preference.
Javier Fernandez, FC Barcelona, StatsBomb Innovation in Football Conference, 2019-10-22. Messi/Arthur/Puig distinct risk profiles demonstrated.
💎 Elite-Only Behavior

Elite Strikers Have Spatial Sweet Spots That Persist Across Seasons

Individual player shot maps show persistent spatial patterns — zones where a specific striker converts at 2-3x the average rate for that zone, and other zones where they underperform. These sweet spots reflect biomechanical preference (dominant foot, body shape, preferred shot type) and are remarkably stable across seasons. A striker's conversion rate at their sweet spot is a genuine repeatable skill, not variance.

What most people do
Evaluate finishing quality using aggregate xG overperformance, which conflates shot selection, sweet-spot frequency, and luck.
What the best do
Map individual sweet spots over multi-season data. Evaluate a striker's value partly by how often the team's system delivers the ball to their sweet spot. A system-striker mismatch (the system creates chances in zones that aren't the striker's sweet spots) depresses conversion rates that look like poor finishing.
Why it's an edge: A striker who "can't finish" in one system may be elite in another that delivers to their spatial preference zones. This is invisible in aggregate xG analysis.
How to exploit: Map each striker's sweet spot from 3+ seasons of data. Before signing, check whether your team's chance creation zones overlap with the striker's sweet spots. If not, either adjust the system or look elsewhere.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18. Coutinho long-range example; multi-season data required for sweet-spot reliability.
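
One way to sketch the sweet-spot check, assuming shots are already binned into named zones and a league-average conversion rate per zone is available. The zone names, the `min_shots` threshold, and the 2x cut-off are illustrative assumptions. A real version would pool 3+ seasons and use a proper shrinkage estimate rather than a hard sample-size cut.

```python
from collections import defaultdict


def sweet_spots(shots, league_rate, min_shots=30, min_ratio=2.0):
    """Return zones where the striker converts at >= min_ratio times the league rate.

    shots: iterable of (zone, scored) pairs pooled across seasons.
    league_rate: dict mapping zone -> league-average conversion rate.
    Zones with fewer than min_shots attempts are skipped as too noisy.
    """
    taken = defaultdict(int)
    goals = defaultdict(int)
    for zone, scored in shots:
        taken[zone] += 1
        goals[zone] += int(scored)
    result = {}
    for zone, n in taken.items():
        base = league_rate.get(zone, 0.0)
        if n < min_shots or base <= 0:
            continue
        ratio = (goals[zone] / n) / base
        if ratio >= min_ratio:
            result[zone] = round(ratio, 2)
    return result


def delivery_overlap(team_chance_share, spots):
    """Share of the team's created chances that fall in the striker's sweet spots."""
    return sum(share for zone, share in team_chance_share.items() if zone in spots)


# Illustrative data only.
shots = (
    [("left_half_space", True)] * 12 + [("left_half_space", False)] * 28
    + [("central_box", True)] * 5 + [("central_box", False)] * 35
    + [("right_wide", True)] * 2 + [("right_wide", False)] * 3
)
league_rate = {"left_half_space": 0.10, "central_box": 0.15, "right_wide": 0.05}
spots = sweet_spots(shots, league_rate)
overlap = delivery_overlap(
    {"left_half_space": 0.15, "central_box": 0.60, "right_wide": 0.25}, spots
)
print(spots, overlap)
```

A low overlap for a prospective signing is the system-striker mismatch signal: the striker has a sweet spot, but the buying team rarely creates chances there.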
💎 Elite-Only Behavior

Player Similarity Search Surfaces Candidates That Scouts Would Never Identify

Similarity search across multi-dimensional player profiles (20+ metrics, position-adjusted, weighted by game-model importance) systematically identifies candidates from leagues and clubs that scouts don't cover. The most valuable output of similarity search isn't confirming the scouting shortlist — it's surfacing players from lower leagues, smaller clubs, or unfashionable positions who profile identically to the target but are invisible to the scouting network. These are consistently the highest-ROI transfers.

What most people do
Use similarity scores to validate an existing shortlist built by scouts.
What the best do
Run similarity search FIRST, present the results alongside the scouting shortlist, and specifically highlight candidates the scouts hadn't identified. These unknown candidates are where the informational edge exists — by definition, if the scout already knows about them, competitors likely do too.
Why it's an edge: Scouts are geographically and reputationally biased. They watch the leagues they know and follow the players they've heard of. Data-driven similarity search has no such bias within the leagues it has data for: it finds the best statistical match regardless of league, nationality, or reputation.
How to exploit: When filling a position, run similarity search on the departing player's profile across ALL leagues with available data. The candidates who appear in the similarity results but are absent from the scout's shortlist are your highest-value targets.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18; referenced as Phase 1 recruitment analytics core capability.
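
A rough sketch of the "similarity first, then diff against the scouting shortlist" workflow. It assumes per-90 metrics are already position-adjusted, z-scores them across the candidate pool, and ranks by weighted Euclidean distance. The metric names, weights, and player names are hypothetical, and league-strength adjustment is deliberately left out.

```python
import math
from statistics import mean, pstdev


def zscore_profiles(players):
    """Standardise each metric across the pool so no single scale dominates."""
    metrics = sorted(next(iter(players.values())))
    stats = {}
    for m in metrics:
        values = [p[m] for p in players.values()]
        stats[m] = (mean(values), pstdev(values) or 1.0)  # guard zero spread
    z = {
        name: [(p[m] - stats[m][0]) / stats[m][1] for m in metrics]
        for name, p in players.items()
    }
    return z, metrics


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two standardised profiles."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def unscouted_matches(players, target, weights, scout_shortlist, top_n=3):
    """Top-N nearest profiles to the target, minus anyone scouts already listed."""
    z, metrics = zscore_profiles(players)
    w = [weights.get(m, 1.0) for m in metrics]
    ranked = sorted(
        (name for name in z if name != target),
        key=lambda name: weighted_distance(z[target], z[name], w),
    )
    return [name for name in ranked[:top_n] if name not in scout_shortlist]


# Illustrative data only.
players = {
    "Departing": {"prog_passes": 8.0, "pressures": 20.0, "xa": 0.25},
    "KnownStar": {"prog_passes": 7.5, "pressures": 18.0, "xa": 0.22},
    "LowerLeagueTwin": {"prog_passes": 8.1, "pressures": 20.5, "xa": 0.24},
    "Destroyer": {"prog_passes": 3.0, "pressures": 28.0, "xa": 0.05},
    "Luxury10": {"prog_passes": 9.0, "pressures": 9.0, "xa": 0.40},
}
weights = {"prog_passes": 2.0, "pressures": 1.0, "xa": 1.0}  # game-model importance
hidden = unscouted_matches(players, "Departing", weights, {"KnownStar"})
print(hidden)
```

The filter at the end is the point of the exercise: the output is the set of close matches the scouting network did not already surface.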
💎 Elite-Only Behavior

The Breakdown Point in a Possession Chain Is More Diagnostic Than the Outcome

A possession that ends with a turnover in zone X tells you the outcome. The breakdown POINT — where in the chain the possession deviated from the game model's intended sequence — tells you the cause. These are often different: the turnover may happen in the final third, but the breakdown was a missed pressing trigger in midfield that forced a long ball, which was won but led to a rushed attack. Diagnosing at the breakdown point rather than the failure point changes the coaching intervention entirely.

What most people do
Analyze possessions from their terminal event (shot, turnover, etc.) and work backward.
What the best do
Define the game model's expected sequence for each possession type, then identify the FIRST deviation from that sequence. That deviation is the breakdown point — not the eventual failure.
Why it's an edge: Coaching the terminal event (e.g., "don't lose the ball in the final third") is treating the symptom. Coaching the breakdown point (e.g., "the #8 didn't trigger the press, which forced the long ball sequence that eventually failed") treats the cause. The same terminal failure may have 5 different root causes at different breakdown points.
How to exploit: For each possession type in the game model, define the expected sequence. Build automated detection of the first deviation point. Present to coaches as: "The possession failed in zone X, but the breakdown was in zone Y at event Z."
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18. Zone progression analysis with player attribution at breakdown nodes.
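
First-deviation detection can be sketched as a simple sequence comparison, assuming the game model's intended sequence is encoded as an ordered list of action labels and each observed event carries an `action`, `zone`, and `player`. That encoding is an assumption for illustration. Real possessions would need fuzzier matching than exact label equality.

```python
def first_deviation(expected, actual):
    """Return the first step where the possession departs from the intended sequence.

    expected: ordered list of action labels from the game model.
    actual: ordered list of event dicts with at least an "action" key.
    Returns None when every expected step was matched.
    """
    for i, expected_action in enumerate(expected):
        if i >= len(actual):
            return {"index": i, "expected": expected_action, "event": None}
        if actual[i]["action"] != expected_action:
            return {"index": i, "expected": expected_action, "event": actual[i]}
    return None


def coach_summary(expected, actual):
    """Phrase the result as failure zone vs. breakdown zone."""
    breakdown = first_deviation(expected, actual)
    if breakdown is None:
        return "The possession followed the intended sequence."
    if breakdown["event"] is None:
        return f"The possession ended before step {breakdown['index']}."
    event = breakdown["event"]
    return (
        f"The possession failed in zone {actual[-1]['zone']}, but the breakdown "
        f"was in zone {event['zone']} at event {breakdown['index']} "
        f"({event['player']}: {event['action']} instead of {breakdown['expected']})."
    )


# Illustrative data only.
expected = ["cb_to_pivot", "pivot_to_8", "8_to_winger", "cutback"]
actual = [
    {"action": "cb_to_pivot", "zone": 4, "player": "CB5"},
    {"action": "long_ball", "zone": 5, "player": "DM6"},
    {"action": "aerial_won", "zone": 14, "player": "ST9"},
    {"action": "turnover", "zone": 17, "player": "ST9"},
]
breakdown = first_deviation(expected, actual)
summary = coach_summary(expected, actual)
print(summary)
```

Aggregating `breakdown["index"]` and the responsible player across many possessions of the same type gives the attribution at breakdown nodes, rather than a count of terminal turnovers.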
💎 Elite-Only Behavior

Outcome-Masked Bad Decisions Are Regression Time Bombs

A player making consistently poor decisions but getting lucky outcomes will be rated well by any outcome-based metric. When luck regresses, performance collapses "suddenly" — but decision quality was always poor. Decomposing value into decision, execution, and outcome quality catches this before regression.

What most people do
Evaluate on outcomes. Good xG contribution = good player. Decision quality only questioned after bad results.
What the best do
Decompose action value into decision, execution, and outcome quality separately. Flag players where outcomes >> decisions as regression candidates.
Why it's an edge: Identifying "luck-masked bad decisions" before regression gives 3-6 months lead time. Avoid buying in lucky phases; coach decisions proactively.
How to exploit: Build a "decision quality gap" = outcome rank - decision rank, with both expressed as percentile ranks where higher is better. Large positive gaps = regression candidates. Large negative gaps = breakout candidates.
Javier Fernandez, FC Barcelona, 2019-10-22
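
The gap calculation can be sketched as follows. This version assumes each player already has an outcome score and a decision score from separate models, and it treats ranks as percentiles where higher is better, so that a positive gap means outcomes are running ahead of decisions. The score values are made up and ties are not handled.

```python
def percentile_ranks(scores):
    """Map each name to a percentile in [0, 100]; the highest score gets 100."""
    ordered = sorted(scores, key=scores.get)
    n = len(ordered)
    if n == 1:
        return {ordered[0]: 50.0}
    return {name: 100.0 * i / (n - 1) for i, name in enumerate(ordered)}


def decision_quality_gap(outcome_scores, decision_scores):
    """Outcome percentile minus decision percentile for each player.

    Both dicts are assumed to cover the same players.
    """
    outcome = percentile_ranks(outcome_scores)
    decision = percentile_ranks(decision_scores)
    return {player: outcome[player] - decision[player] for player in outcome}


# Illustrative data only.
outcome_scores = {"A": 0.9, "B": 0.5, "C": 0.2}   # e.g. realised value added
decision_scores = {"A": 0.1, "B": 0.5, "C": 0.8}  # e.g. expected value of choices
gaps = decision_quality_gap(outcome_scores, decision_scores)
regression_candidates = [p for p, g in gaps.items() if g >= 50]
breakout_candidates = [p for p, g in gaps.items() if g <= -50]
print(gaps, regression_candidates, breakout_candidates)
```

The 50-point cut-offs are arbitrary here. In practice the threshold would be tuned against how past gaps actually regressed.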
💎 Elite-Only Behavior

Pass-Only Pressure Analysis Misses the Best Press-Breakers

Players like Dembélé look like pressure liabilities on passing radars but their carry/dribble response is dramatically forward-positive. They draw the press and drive 20 yards upfield. Pass-only pressure analysis gives systematic false negatives for carry-positive press-breakers.

What most people do
Classify pressure response from passing data alone.
What the best do
Compute separate directional distributions for passes, carries, AND dribbles under pressure. Identify action-type substitution patterns.
Why it's an edge: The most common player evaluation error in pressure analysis. Carry-positive press-breakers are systematically undervalued by pass-only models.
How to exploit: Build "pressure action-type substitution" profiles. Scout specifically for carry-positive pressure responders — they're underpriced.
Thom Lawrence, StatsBomb Data Launch, 2018-05-23
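
A small sketch of a pressure action-type profile, assuming each event has an `action_type`, an `under_pressure` flag, and a signed `forward_gain` in yards. These field names are hypothetical and not a specific provider's schema. It reports, per action type, how often the player chooses it under pressure and how far forward it moves the ball on average.

```python
from collections import defaultdict


def pressure_profile(events):
    """Summarise a player's pressured actions by type.

    Returns {action_type: {"share": fraction of pressured actions,
                           "mean_forward_gain": average signed forward distance}}.
    Unpressured events are ignored.
    """
    totals = defaultdict(lambda: [0, 0.0])  # action_type -> [count, summed gain]
    for event in events:
        if not event["under_pressure"]:
            continue
        bucket = totals[event["action_type"]]
        bucket[0] += 1
        bucket[1] += event["forward_gain"]
    n = sum(count for count, _ in totals.values())
    return {
        action: {"share": count / n, "mean_forward_gain": gain / count}
        for action, (count, gain) in totals.items()
    }


# Illustrative data only: a winger who passes backward but carries forward.
events = [
    {"action_type": "pass", "under_pressure": True, "forward_gain": -5.0},
    {"action_type": "pass", "under_pressure": True, "forward_gain": -3.0},
    {"action_type": "carry", "under_pressure": True, "forward_gain": 20.0},
    {"action_type": "carry", "under_pressure": True, "forward_gain": 16.0},
    {"action_type": "pass", "under_pressure": False, "forward_gain": 10.0},
]
profile = pressure_profile(events)
print(profile)
```

A pass-only model would see only the negative `pass` row for this player. The carry row is where the press-breaking value shows up.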
💎 Elite-Only Behavior

"Dead Teams" Can Be Identified by Match 3 — Don't Wait for the Model to Catch Up

betting-intelligence, promoted-team-evaluation

Southampton 2024-25 were identifiable as a dead team from their opening matches. The xG against was catastrophic, the defensive structure was non-existent, and the manager showed no ability to adapt. Yet the betting market and most models still gave them reasonable survival odds for weeks. The information was available immediately; the market was slow to incorporate it because models rely on accumulated data rather than pattern-matching against the obvious.

What most people do
Wait for 10+ matches of data before drawing strong conclusions about promoted teams.
What the best do
Apply rapid classification after matches 1-3 using a combination of xG metrics and tactical observation. Dead teams show a distinctive signature: they can't hold their defensive shape for more than 2-3 phases of opponent possession. This is visible immediately.
Bet The Process podcast, Southampton 2024-25 analysis.
💎 Elite-Only Behavior

Second-Phase Set-Piece Planning Is Where the Real Goals Are

betting-intelligence, set-piece-economic-value

Most set-piece goals don't come from the initial delivery — they come from the second phase (the scramble after the initial header/clearance). Elite set-piece coaches like Nicolas Jover (Arsenal) plan not just the delivery but the positioning for the clearance, the second ball, and the recycled cross. Teams that only rehearse first-phase deliveries leave 60% of set-piece value on the table.

What most people do
Rehearse corner deliveries and free kick routines. Treat the clearance as the end of the set piece.
What the best do
Design specific second-phase positioning — where should players be when the ball is cleared? Who attacks the second ball? Where does the recycled cross go? The second phase is more rehearsable and less defended than the first.
Bet The Process podcast, Arsenal set-piece analysis under Jover, 2024-2025.
💎 Elite-Only Behavior

Same Zone Swings 30+ Points in Threat Based on Game Situation

Man City's zone-16 effectiveness comes from line-breaking passes (63% of value in one situation cluster), not generic zone dominance. Two situations in the same zone swing 30+ percentage points based on defensive distance and teammates ahead.

What most people do
Use zone-based xT and coach to zones.
What the best do
Discover situation clusters via representation learning. Compute per-situation xT. Coach to the situation, not the zone.
Why it's an edge: Zone-level analysis averages across fundamentally different contexts. A team that looks average in a zone overall may be elite in one situation and poor in another.
How to exploit: Train a multi-task autoencoder on StatsBomb 360 data. Identify which situations your team excels or struggles with per zone.
Zitian Tang, Tsinghua/Brown, StatsBomb Conference 2023
💎 Elite-Only Behavior

Carbon-Copy Recruitment Protects the Depth Gap Better Than "Adding Competition"

The most effective squad-building strategy for maintaining depth is "carbon-copy recruitment": signing players who are stylistically near-identical to the starters, not just positionally compatible. When the backup plays the same way as the starter, the tactical system doesn't degrade. When the backup has a different profile (e.g., replacing a ball-playing CB with an aggressive, header-first CB), the team's tactical structure changes, which compounds the quality drop.

What most people do
Sign backups based on general positional need and available budget, often getting a "different option" rather than a like-for-like replacement.
What the best do
Profile starters precisely and sign backups who replicate the same tactical behaviors — same passing profile, same defensive positioning style, same movement patterns. The tactical system stays constant; only the execution quality drops slightly.
Bet The Process podcast, recruitment strategy analysis, 2024-2025.
💎 Elite-Only Behavior

The Same Data Needs Three Different Framings for Three Different Stakeholders

Each club stakeholder has a different time-horizon discount factor: head coach (this week), sporting director (this season/next), academy director (3-5 years). The same underlying data and analysis supports all three framings, but presenting a long-term recruitment insight to a coach in long-term terms guarantees it will be ignored. The framing must match the stakeholder's discount factor, not the analyst's time horizon.

What most people do
Present analysis in one framing (usually the analyst's preferred time horizon) to all stakeholders.
What the best do
Map each stakeholder to their discount factor and reframe the same finding accordingly. For the coach: "this player helps us win Saturday." For the sporting director: "this player's value increases over 2 years." For the academy director: "this player profiles as first-team by age 22." Same data, three presentations.
Why it's an edge: Most analytics departments lose influence not because their analysis is wrong but because their communication doesn't match the audience's time horizon. Mastering the framing multiplies the impact of existing analytical capability without any new models or data.
How to exploit: Before every presentation, identify the stakeholder's discount factor. Rewrite findings in that time horizon. The data doesn't change — only the first sentence and the call to action change.
Sam Gregory, Inter Miami, StatsBomb Conference, 2022-09-29.
💎 Elite-Only Behavior

Scout by Passing Signature Cluster, Not Position

Fullbacks split into two passing-signature clusters: progressive (wing-to-box, high xT) and conservative (backline recycling, low xT). Signing a conservative recycler to replace a progressive wing-back creates a system mismatch that aggregate metrics don't predict.

What most people do
Scout by nominal position and aggregate metrics.
What the best do
Cluster by zone-to-zone xT-weighted passing vectors. Rank within the departing player's cluster.
Why it's an edge: Nominal position is a poor proxy for tactical role. Two "right backs" can have opposite passing functions.
How to exploit: Build passing signature clusters. Identify the departing player's cluster. Rank within-cluster by xT per action.
PhD student, StatsBomb Conference, 2019-10-30
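
A toy sketch of passing-signature clustering, under several assumptions that are not from the talk: each pass is reduced to a (from_zone, to_zone) pair, each pair is weighted by its frequency times the xT of the destination zone, and a small k-means with hand-picked seed players stands in for whatever clustering method was actually used. Zone names and xT values are invented.

```python
import math
from collections import Counter


def signature(passes, xt):
    """Zone-to-zone vector: pass-share of each (from, to) pair times destination xT."""
    counts = Counter(passes)
    total = sum(counts.values())
    return {pair: (n / total) * xt[pair[1]] for pair, n in counts.items()}


def to_unit_vector(sig, keys):
    """Lay a signature out over a fixed key order and scale it to unit length."""
    v = [sig.get(k, 0.0) for k in keys]
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def kmeans(vectors, seeds, iters=10):
    """Minimal k-means; centroids start at the named seed players' vectors."""
    centroids = [list(vectors[s]) for s in seeds]
    labels = {}
    for _ in range(iters):
        labels = {
            name: min(range(len(centroids)), key=lambda c: math.dist(v, centroids[c]))
            for name, v in vectors.items()
        }
        for c in range(len(centroids)):
            members = [vectors[n] for n, lab in labels.items() if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Illustrative data only.
xt = {"def_third": 0.01, "mid_wide": 0.03, "final_wide": 0.08, "box": 0.30}
passes_by_player = {
    "Departing RB": [("mid_wide", "final_wide")] * 6 + [("final_wide", "box")] * 4,
    "Target A": [("mid_wide", "final_wide")] * 5 + [("final_wide", "box")] * 5,
    "Target B": [("def_third", "def_third")] * 8 + [("def_third", "mid_wide")] * 2,
    "Target C": [("def_third", "def_third")] * 7 + [("def_third", "mid_wide")] * 3,
}
sigs = {name: signature(p, xt) for name, p in passes_by_player.items()}
keys = sorted({pair for s in sigs.values() for pair in s})
vectors = {name: to_unit_vector(s, keys) for name, s in sigs.items()}
labels = kmeans(vectors, seeds=["Departing RB", "Target B"])

# Rank same-cluster candidates by mean destination xT per pass.
same_cluster = [
    n for n in labels
    if n != "Departing RB" and labels[n] == labels["Departing RB"]
]
shortlist = sorted(same_cluster, key=lambda n: sum(sigs[n].values()), reverse=True)
print(labels, shortlist)
```

Scaling to unit length means players are grouped by the shape of their passing, not by volume, which is why two nominal right backs can land in different clusters.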