Expected Ball Progression Model

Expected Value Models | Level 3: Advanced

What It Is

An alternative to goal-based and shot-based models that measures how far the ball advances toward goal during each possession. Every possession has a "high-water mark": the closest point to goal that the ball reaches. With roughly 200 possessions per game, this yields hundreds of training signals on a continuous scale, compared with the 2-3 goals a typical game provides. The insight: "when a team gets the ball two-thirds of the way up the field, that means something; they might not score, but it means something." The model therefore captures buildup quality independently of whether the buildup converts into shots.

Correct Execution

For each possession, record the maximum ball penetration, i.e. the smallest distance to goal the ball reaches. Every event in the possession is then labeled with this high-water mark as its training target, so early events in a possession that later reaches deep into the opponent's third receive a higher-value label than events in possessions that stall at midfield. Two-phase possessions (build up deep, recycle, build up again) get a separate high-water mark per phase: events in the first phase train on that phase's deepest point, and events in the recycled phase train on the recycled phase's deepest point.
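The labeling step can be sketched in a few lines of Python. This is a minimal illustration, not the model described in the talk: the Event fields, the phase_id column, the 105 x 68 metre pitch, and the choice of straight-line distance to the goal centre are all assumptions made for the example.

```python
import math
from dataclasses import dataclass

# Assumed pitch geometry (metres): attacking goal centred at (105, 34).
PITCH_LENGTH = 105.0
GOAL_Y = 34.0


@dataclass
class Event:
    possession_id: int
    phase_id: int  # hypothetical field: increments each time the ball is recycled
    x: float       # metres from the attacking team's own goal line
    y: float       # metres from the left touchline


def distance_to_goal(x: float, y: float) -> float:
    """Straight-line distance from (x, y) to the centre of the attacking goal."""
    return math.hypot(PITCH_LENGTH - x, GOAL_Y - y)


def label_high_water_marks(events: list[Event]) -> list[float]:
    """Return one label per event: the smallest distance to goal reached
    within that event's (possession, phase). Smaller means deeper penetration."""
    best: dict[tuple[int, int], float] = {}
    for e in events:
        key = (e.possession_id, e.phase_id)
        d = distance_to_goal(e.x, e.y)
        best[key] = min(d, best.get(key, d))
    return [best[(e.possession_id, e.phase_id)] for e in events]


events = [
    Event(1, 0, 30.0, 34.0),  # first phase starts deep in own half
    Event(1, 0, 80.0, 34.0),  # first phase reaches 25 m from goal
    Event(1, 1, 50.0, 34.0),  # ball recycled: second phase begins
    Event(1, 1, 95.0, 34.0),  # second phase reaches 10 m from goal
    Event(2, 0, 40.0, 34.0),
    Event(2, 0, 55.0, 34.0),  # possession stalls around midfield
]
labels = label_high_water_marks(events)
```

Here every event in the first phase of possession 1 shares one label, the recycled phase gets its own, and the stalled possession 2 receives a larger distance (a lower-value label). A real pipeline would also need a rule for deciding when a phase ends, which this sketch simply takes as given.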

Key advantage over expected possession value (EPV): turnovers don't necessarily destroy value. If a team chips the ball into the box and a defender heads it out, the ball is usually still in the attacking third, so the high-water mark persists. The ball rarely teleports back to halfway on a turnover. This makes ball progression less sensitive than EPV to the possession-definition problem.

Coaching Cues

"Did the ball get closer to goal? Over the whole possession, not just on the shot." — StatsBomb CTO, 2019
"200 possessions per game, each with a high-water mark — that's so much richer than 2-3 goals."
"Some turnovers don't move the needle. The ball comes straight back."

Common Errors

Treating high-water mark as a substitute for goal probability: Ball progression measures buildup quality, not scoring likelihood. A team can consistently advance to the final third and still not score. Use it as a complementary signal.
Ignoring the pitfall case: "Getting close but no closer is rubbish." Against a side like Burnley with 11 players behind the ball, the ball can reach the final third without any real penetration. Pure distance doesn't capture this. Consider weighting by defensive density at the high-water point.
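One way the density weighting could look is sketched below. The source only suggests the idea, so everything here is an assumption for illustration: the 10 metre radius, the penalty factor, the hyperbolic discount, and the function names are hypothetical, and the progression score is taken to be any higher-is-better number.

```python
import math


def defenders_near(ball_xy, defender_xys, radius=10.0):
    """Count defenders within `radius` metres of the ball at the high-water point."""
    return sum(1 for d in defender_xys if math.dist(ball_xy, d) <= radius)


def density_weighted_value(progression, n_defenders, penalty=0.1):
    """Shrink a higher-is-better progression score as more defenders crowd the ball.

    With zero nearby defenders the score is unchanged; each extra defender
    reduces it further. The functional form is an arbitrary choice.
    """
    return progression / (1.0 + penalty * n_defenders)


# Ball reaches the edge of the box, but two defenders are within 10 m of it.
ball = (90.0, 34.0)
defenders = [(88.0, 34.0), (95.0, 30.0), (60.0, 34.0)]
crowding = defenders_near(ball, defenders)
weighted = density_weighted_value(0.8, crowding)
```

The radius and penalty would need tuning against real tracking data; the point is only that the same high-water mark is worth less when it is reached in front of a packed defence.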

Edges

⚡ Conventional Wisdom Is Wrong

Ball Progression Persists Through Brief Turnovers — The "Teleportation Fallacy"

EPV treats turnovers as value-destroying. But after most turnovers, the ball stays in roughly the same zone. A headed clearance from a cross keeps the ball in the attacking third. The "high-water mark" persists through brief possession losses.

What most people do

Use EPV, which drops sharply on any turnover. Penalize players for turnovers even when the ball stays in the attacking zone.

What the best do

Track ball position persistence through turnovers. Distinguish "ball stays in zone" turnovers from "ball exits zone" turnovers.

Why it's an edge: Teams that play aggressively in the final third generate many "turnovers" that don't actually lose territorial advantage. They're systematically undervalued by EPV.

How to exploit: Build a "possession-zone persistence" metric that tracks where the ball is 5 seconds after each turnover. Don't penalize turnovers that keep the ball in the attacking zone.
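A minimal sketch of such a persistence metric follows. It assumes a ball track given as (time in seconds, x in metres) samples sorted by time, with x measured toward the goal attacked by the team that lost the ball, a 105 metre pitch, and "attacking zone" read as the final third. The function names and the nearest-later-sample lookup are illustrative choices, not something specified in the source.

```python
from bisect import bisect_left

PITCH_LENGTH = 105.0                      # assumed pitch length in metres
ATTACKING_THIRD_X = PITCH_LENGTH * 2 / 3  # x >= 70 m counts as the attacking third


def ball_x_at(track, t):
    """Return the x of the first sample at or after time t (or the last sample)."""
    times = [sample[0] for sample in track]
    i = bisect_left(times, t)
    return track[min(i, len(track) - 1)][1]


def zone_persistence(track, turnover_times, window_s=5.0):
    """Share of attacking-third turnovers after which the ball is still in the
    attacking third `window_s` seconds later. Returns None if there are none."""
    kept = total = 0
    for t in turnover_times:
        if ball_x_at(track, t) < ATTACKING_THIRD_X:
            continue  # turnover happened outside the attacking third
        total += 1
        if ball_x_at(track, t + window_s) >= ATTACKING_THIRD_X:
            kept += 1
    return kept / total if total else None


track = [(0, 80.0), (5, 78.0), (10, 40.0), (15, 85.0), (20, 90.0), (25, 30.0)]
turnovers = [0, 10, 15, 20]
persistence = zone_persistence(track, turnovers)
```

In this toy track, three turnovers occur in the attacking third and the ball is still there five seconds later for two of them. Turnovers counted as "kept" are the ones the text suggests not penalizing.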

StatsBomb CTO, 2019-10-25

Sources

StatsBomb CTO, StatsBomb Innovation in Football Conference, YouTube, 2019-10-25 — described expected ball progression as an experimental alternative to EPV, using possession high-water marks as training labels; emphasized the signal richness (hundreds of data points per game) and resilience to the possession-definition problem