The categories you choose to count determine which players, teams, and actions look valuable, even when the underlying events are identical. An "alternative box score" exercise demonstrates this: take the exact same game events, categorize them differently, then run the same regression model on each set of counts. Different counting schemes produce different player value rankings. In basketball, traditional counting (rebounds, assists, steals, blocks) favors small forwards; an alternative counting scheme using the same play-by-play data favors centers. The underlying performance did not change; only the way events were categorized did. The same applies to football: how you define "progressive pass," "chance creation," or "defensive action" determines who looks good.
When designing or evaluating a counting system: (1) Recognize that all statistics are arbitrary; they are representative abstractions, not discovered truths. The only non-arbitrary numbers are the score and the result. (2) Test whether your counting scheme produces different rankings than plausible alternatives. If small definitional changes dramatically shuffle the rankings, the scheme is fragile. (3) Ask whether the counting scheme describes the game in a way a knowledgeable observer would recognize. If the "best" players by your scheme don't match expert intuition at all, the scheme may be measuring an artifact. (4) When defining a metric, be precise about boundaries. The NBA's "contested shot" definition, based on body-to-body distance, classified 90% of three-pointers as "open," which is obviously wrong. Hand position and closing speed matter more than center-of-mass distance.
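The fragility check in point (2) can be sketched in a few lines. This is a minimal illustration, not a prescribed method. The pass data, the player labels, and the two distance thresholds are all hypothetical, and "progressive pass" is reduced here to "gains at least N metres toward goal" purely for the sake of the example.

```python
# Hypothetical pass events: (player, metres gained toward goal).
# The data and both thresholds are illustrative assumptions.
PASSES = [
    ("A", 12.0), ("A", 11.0), ("A", 10.5), ("A", 10.2), ("A", 3.0),
    ("B", 25.0), ("B", 22.0), ("B", 18.0), ("B", 4.0),
    ("C", 16.0), ("C", 15.5), ("C", 9.0), ("C", 8.0),
]


def progressive_counts(passes, min_gain):
    """Count, per player, the passes that gain at least `min_gain` metres."""
    counts = {player: 0 for player, _ in passes}
    for player, gain in passes:
        if gain >= min_gain:
            counts[player] += 1
    return counts


def ranking(counts):
    """Players ordered best to worst; ties broken alphabetically."""
    return sorted(counts, key=lambda p: (-counts[p], p))


def spearman(rank_a, rank_b):
    """Spearman rank correlation of two orderings of the same players.

    Uses the simple no-ties formula, so it assumes each ordering is strict.
    """
    n = len(rank_a)
    pos_b = {player: i for i, player in enumerate(rank_b)}
    d2 = sum((i - pos_b[player]) ** 2 for i, player in enumerate(rank_a))
    return 1 - 6 * d2 / (n * (n**2 - 1))


rank_10 = ranking(progressive_counts(PASSES, 10.0))
rank_15 = ranking(progressive_counts(PASSES, 15.0))
print(rank_10, rank_15, spearman(rank_10, rank_15))
```

In this toy data, moving the threshold from 10 to 15 metres drops player A from first to last, and the rank correlation between the two leaderboards is -0.5. A correlation close to 1 across plausible thresholds would suggest the definition is robust; a low or negative one suggests the leaderboard is largely a product of where the boundary was drawn.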
Re-categorizing the same play-by-play data flipped player value rankings entirely. The NBA's "contested shot" definition classified 90% of three-pointers as "open." How you define "progressive pass" determines the leaderboard. All statistics are representative abstractions.
Event data providers use fundamentally different counting conventions. One provider's "pressure" event requires physical proximity; another's includes distant angle-blocking. One counts a failed dribble as a "dribble attempted + failed"; another doesn't record the attempt at all unless the dribble succeeded. These aren't measurement errors — they're different definitions of the same concept. Cross-provider comparisons without accounting for counting-scheme differences produce meaningless results.
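One way to account for such differences is to reduce both feeds to a definition that each of them can actually express before comparing. The sketch below uses the dribble example: both feeds are hypothetical, the field names (`type`, `success`, `event`) are invented for illustration, and neither mirrors any real provider's schema.

```python
# Hypothetical feeds; field names are assumptions, not a real provider schema.
# Provider X logs every dribble attempt with an outcome flag.
PROVIDER_X = [
    {"player": "A", "type": "dribble", "success": True},
    {"player": "A", "type": "dribble", "success": False},
    {"player": "A", "type": "dribble", "success": False},
    {"player": "B", "type": "dribble", "success": True},
]
# Provider Y logs a dribble only when it succeeds.
PROVIDER_Y = [
    {"player": "A", "event": "dribble_won"},
    {"player": "B", "event": "dribble_won"},
]


def tally(players):
    """Count occurrences of each player in an iterable of player names."""
    counts = {}
    for player in players:
        counts[player] = counts.get(player, 0) + 1
    return counts


def naive_dribbles_x(events):
    """Every dribble row in feed X, regardless of outcome."""
    return tally(e["player"] for e in events if e["type"] == "dribble")


def successful_dribbles_x(events):
    """Feed X restricted to the only concept feed Y can express."""
    return tally(
        e["player"] for e in events if e["type"] == "dribble" and e["success"]
    )


def successful_dribbles_y(events):
    """Feed Y already records successful dribbles only."""
    return tally(e["player"] for e in events if e["event"] == "dribble_won")


# Naive comparison: "dribbles" means different things in each feed.
print(naive_dribbles_x(PROVIDER_X), successful_dribbles_y(PROVIDER_Y))
# Harmonised comparison: both sides now count successful dribbles.
print(successful_dribbles_x(PROVIDER_X), successful_dribbles_y(PROVIDER_Y))
```

Counting rows naively credits player A with three dribbles in one feed and one in the other for the same hypothetical match; after harmonising, both feeds agree. The cost is that information only one provider records, here the failed attempts, cannot be used in the cross-provider comparison.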