Counting Scheme Measurement Bias

Data Infrastructure | Level 2: Intermediate

What It Is

The categories you choose to count determine which players, teams, and actions look valuable, even when the underlying events are identical. An "alternative box score" exercise demonstrates this: take the exact same game events, categorize them differently, and run the same regression model on each version. The two counting schemes produce different player value rankings. In basketball, traditional counting (rebounds, assists, steals, blocks) favors small forwards, while an alternative scheme built from the same play-by-play data favors centers. Nothing about the underlying performance changed; only the categorization did. The lesson transfers directly to football: how you define "progressive pass," "chance creation," or "defensive action" determines who looks good.
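To make the football version concrete, here is a minimal sketch that applies two definitions of "progressive pass" to the same set of passes. Everything in it is an assumption for illustration: the pass coordinates are invented, and the two rules (`fixed_distance` and `relative_gain`) and their thresholds are hypothetical, not any provider's actual definition.

```python
from collections import Counter

# Hypothetical passes: (player, start_x, end_x) in metres on a 105 m pitch,
# attacking left to right. The values are made up for illustration.
PASSES = [
    ("A", 20, 32), ("A", 25, 37), ("A", 30, 41), ("A", 40, 45),
    ("B", 70, 79), ("B", 75, 84), ("B", 80, 88), ("B", 60, 64),
    ("C", 50, 62), ("C", 72, 81),
]

def fixed_distance(start_x, end_x):
    """Assumed scheme 1: the ball moves at least 10 m toward goal."""
    return end_x - start_x >= 10

def relative_gain(start_x, end_x, pitch_length=105):
    """Assumed scheme 2: the pass covers at least 25% of the distance left to the goal line."""
    remaining = pitch_length - start_x
    return (end_x - start_x) / remaining >= 0.25

def count_progressive(passes, is_progressive):
    """Count progressive passes per player under the given definition."""
    counts = Counter()
    for player, start_x, end_x in passes:
        if is_progressive(start_x, end_x):
            counts[player] += 1
    return counts

by_distance = count_progressive(PASSES, fixed_distance)
by_relative = count_progressive(PASSES, relative_gain)
print(by_distance.most_common())  # [('A', 3), ('C', 1)]
print(by_relative.most_common())  # [('B', 3), ('C', 1)]
```

With identical events, the deep-lying passer A leads under the fixed-distance rule and the advanced passer B leads under the relative rule. Neither answer is the "true" one; each follows from the definition.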

Correct Execution

When designing or evaluating a counting system:

  1. Recognize that all statistics are arbitrary: they are representative abstractions, not discovered truths. The only non-arbitrary numbers are the score and the result.
  2. Test whether your counting scheme produces different rankings than plausible alternatives. If small definitional changes dramatically shuffle the rankings, the scheme is fragile.
  3. Ask whether the counting scheme describes the game in a way a knowledgeable observer would recognize. If the "best" players by your scheme don't match expert intuition at all, the scheme may be measuring an artifact.
  4. When defining a metric, be precise about boundaries. The NBA's "contested shot" definition, based on body-to-body distance, classified 90% of three-pointers as "open," which is obviously wrong. Hand position and closing speed matter more than center-of-mass distance.
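The ranking comparison in point 2 can be sketched with two simple stability measures: Spearman rank correlation and top-k overlap. This is one possible way to quantify "shuffle," not a method taken from the source. The player scores are invented, and the helper names (`ranks`, `spearman`, `top_k_overlap`) are hypothetical.

```python
def ranks(scores):
    """Map each player to a rank (1 = best); ties are broken by name for determinism."""
    ordered = sorted(scores, key=lambda p: (-scores[p], p))
    return {player: i + 1 for i, player in enumerate(ordered)}

def spearman(scores_a, scores_b):
    """Spearman rank correlation for two score dicts over the same players (no tie correction)."""
    ra, rb = ranks(scores_a), ranks(scores_b)
    n = len(ra)
    d2 = sum((ra[p] - rb[p]) ** 2 for p in ra)
    return 1 - 6 * d2 / (n * (n * n - 1))

def top_k_overlap(scores_a, scores_b, k):
    """Share of the top-k players under scheme A who are also top-k under scheme B."""
    ra, rb = ranks(scores_a), ranks(scores_b)
    top_a = {p for p, r in ra.items() if r <= k}
    top_b = {p for p, r in rb.items() if r <= k}
    return len(top_a & top_b) / k

# Made-up player values under two counting schemes built from the same events.
scheme_a = {"P1": 9, "P2": 7, "P3": 5, "P4": 3, "P5": 1}
scheme_b = {"P1": 2, "P2": 8, "P3": 6, "P4": 9, "P5": 1}

print(round(spearman(scheme_a, scheme_b), 2))  # 0.1
print(top_k_overlap(scheme_a, scheme_b, k=2))  # 0.5
```

A correlation near 1 and a high overlap suggest the ranking survives the definitional change. Values like these suggest it does not. Where to draw the line between robust and fragile is a judgment call for the analyst.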

Coaching Cues

  • "If your stat says he's the best but every coach says he's not, check your definition before you check the coaches."
  • "If 90% are 'open,' your definition of open is wrong."
  • "Once you count something, people assume it matters. Be careful what you count."

Common Errors

  1. Assuming counted stats are objective: All counting stats are definitional choices. Different choices produce different conclusions from the same events.
  2. Not testing alternative schemes: If you've only tried one definition, you don't know how fragile your conclusions are.
  3. Confusing precise measurement with accurate measurement: A metric can be computed to many decimal places and still measure the wrong thing.

Edges

🔑 Hidden Causal Lever

Your Counting Scheme Determines Who Looks Valuable — From Identical Events

data-infrastructure, counting-scheme-bias

In a basketball example, re-categorizing the same play-by-play data changed which players ranked as most valuable. The NBA's "contested shot" definition classified 90% of three-pointers as "open." In football, how you define "progressive pass" determines the leaderboard. All statistics are representative abstractions.

What most people do: Treat published metrics as objective. Accept one definition without testing alternatives.
What the best do: Test alternative counting schemes. If small definitional changes shuffle rankings by more than 30%, the metric is fragile.
Why it's an edge: Organizations build strategies around untested definitions. Once something is counted, people assume it matters.
How to exploit: For any decision-driving metric, build at least one alternative definition. If the top 10 changes dramatically, the decision is unreliable.
Cross-domain parallel: Backtesting results are sensitive to index construction: same stocks, different weights, different "best" strategies.
Seth Partnow, StatsBomb Conference, 2019-10-28

Conventional Wisdom Is Wrong

Different Data Providers Count the Same Match Differently — And Most Analysts Don't Know Which One They're Using

data-infrastructure, counting-scheme-bias

Event data providers use fundamentally different counting conventions. One provider's "pressure" event requires physical proximity; another's includes distant angle-blocking. One records a failed dribble as "dribble attempted + failed"; another records no attempt at all unless the dribble succeeded. These are not measurement errors. They are different definitions of the same concept. Cross-provider comparisons that ignore counting-scheme differences produce meaningless results.
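The dribble example can be worked through in a few lines. The sketch below feeds one invented sequence of dribbles through two hypothetical recording conventions and computes the same "success rate" from each feed. The function names and event format are assumptions for illustration and do not reproduce any real provider's schema.

```python
# Ground truth for one hypothetical player: True means the dribble beat the defender.
DRIBBLES = [True, False, True, True, False, True, False, True, False, True]

def record_all_attempts(dribbles):
    """Assumed convention 1: log every attempt together with its outcome."""
    return [{"type": "dribble", "success": outcome} for outcome in dribbles]

def record_successes_only(dribbles):
    """Assumed convention 2: log a dribble event only when the dribble succeeds."""
    return [{"type": "dribble", "success": True} for outcome in dribbles if outcome]

def success_rate(events):
    """Successful dribbles divided by recorded dribble events."""
    attempts = [e for e in events if e["type"] == "dribble"]
    return sum(e["success"] for e in attempts) / len(attempts)

feed_1 = record_all_attempts(DRIBBLES)
feed_2 = record_successes_only(DRIBBLES)

print(len(feed_1), success_rate(feed_1))  # 10 0.6
print(len(feed_2), success_rate(feed_2))  # 6 1.0
```

The same player attempts 10 dribbles with a 60% success rate in one feed and 6 dribbles with a 100% success rate in the other, so a benchmark built on one feed says nothing useful about a player measured on the other.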

What most people do: Treat event data as ground truth regardless of provider. Compare metrics built on Provider A's data against benchmarks built on Provider B's data.
What the best do: Document every counting convention used in their analytical pipeline. When comparing to external benchmarks or published research, verify the counting scheme matches. Build conversion factors between providers where possible.
Why it's an edge: Most published "norms" and "benchmarks" in football analytics are provider-specific. If your data comes from a different provider, those benchmarks may be systematically off. A team that looks "average" on Provider A's metrics might look "elite" on Provider B's — or vice versa.
How to exploit: When building any metric, document the provider and counting conventions used. When comparing to external research, check the source provider. When switching providers, rebuild all benchmarks rather than assuming continuity.
Thom Lawrence, StatsBomb Data Launch, 2018-05-23. Explicit discussion of StatsBomb's unique pressure event definition and counting conventions.
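One lightweight way to act on the advice above is to store the provider and counting conventions next to each metric and compare them before using an external benchmark. The sketch below assumes a hypothetical `MetricSpec` record and `convention_mismatches` helper. The provider names and convention strings are placeholders, not real provider documentation.

```python
from dataclasses import dataclass, field

@dataclass
class MetricSpec:
    """A metric plus the provider and counting conventions it was built on."""
    name: str
    provider: str
    conventions: dict = field(default_factory=dict)

def convention_mismatches(ours, benchmark):
    """List every documented difference between our metric and a benchmark."""
    issues = []
    if ours.provider != benchmark.provider:
        issues.append(f"provider: {ours.provider} vs {benchmark.provider}")
    for key in sorted(set(ours.conventions) | set(benchmark.conventions)):
        a = ours.conventions.get(key, "<undocumented>")
        b = benchmark.conventions.get(key, "<undocumented>")
        if a != b:
            issues.append(f"{key}: {a} vs {b}")
    return issues

ours = MetricSpec(
    name="pressures_per_90",
    provider="Provider A",
    conventions={"pressure": "requires physical proximity"},
)
published = MetricSpec(
    name="pressures_per_90",
    provider="Provider B",
    conventions={"pressure": "includes distant angle-blocking"},
)

for issue in convention_mismatches(ours, published):
    print(issue)
```

An empty list means the documented conventions match. Any entry is a reason to rebuild the benchmark on your own provider's data before comparing against it.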

Sources

  • Seth Partnow, StatsBomb Innovation in Football Conference, YouTube, 2019-10-28 — presented alternative box score exercise showing how counting scheme changes player value rankings; demonstrated NBA's flawed "contested shot" definition; argued that all statistics are representative abstractions with real adoption consequences