In any action-value model (EPV, SARSA, expected threat), the reward landscape is spatially non-uniform: strikers play in zones where the value surface rises steeply toward goal, while defenders play in zones where it is nearly flat. As a result, a striker completing an ordinary 5-yard forward pass earns a large positive ΔEPV, while a defender completing an identical 5-yard forward pass earns almost none. Naively summing EPV deltas per player will therefore tend to rank attackers highest, not because they are better players but because they have access to high-reward zones. This is a systematic measurement artifact, not a finding about player quality.
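To make the gradient argument concrete, here is a minimal sketch with a toy value surface. The exponential shape, the 100-yard pitch, the constants, and the names `toy_value` and `pass_delta` are all assumptions chosen for illustration; they do not come from any fitted EPV model. The only point is that the same 5-yard pass yields very different deltas depending on where it starts.

```python
import math

PITCH_LENGTH = 100.0  # yards; hypothetical pitch length for this toy example


def toy_value(x: float) -> float:
    """Hypothetical possession value at x yards from a team's own goal line.

    Assumed shape: nearly flat deep in the own half, steep near the
    opponent's goal. This is an illustration, not a fitted surface.
    """
    return 0.3 * math.exp((x - PITCH_LENGTH) / 15.0)


def pass_delta(x_start: float, yards_forward: float = 5.0) -> float:
    """Value change from completing a forward pass of the given length."""
    return toy_value(x_start + yards_forward) - toy_value(x_start)


# The same 5-yard forward pass, started from two different zones.
defender_delta = pass_delta(20.0)  # deep in own half (flat region)
striker_delta = pass_delta(85.0)   # near the opponent's goal (steep region)

print(f"defender 5-yard pass delta: {defender_delta:.5f}")
print(f"striker  5-yard pass delta: {striker_delta:.5f}")
print(f"ratio striker/defender:     {striker_delta / defender_delta:.1f}")
```

Under these assumed numbers the two passes are identical in execution, yet the striker's delta is many times larger. Summing such deltas over a season would reproduce the positional ranking described above.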
The "team of Messis" thought experiment exposes the problem: 11 identical players deployed at different positions would produce wildly different EPV delta totals. Striker-Messi collects high-reward, low-risk deltas. Defender-Messi collects low-reward, high-risk deltas. Midfield-Messi collects next to nothing, stuck in the "trough of meh." Before trusting any EPV leaderboard, verify that the ranking is not simply a position-access artifact. The fix is opportunity normalization: compare each player's output to the distribution of outcomes available in their specific context rather than comparing raw deltas.
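One possible way to sketch opportunity normalization is a per-context z-score: each delta is measured against the mean and spread of deltas observed in the same context. The version below is a simplified illustration, not a prescribed method. It assumes the context is just a coarse pitch zone, and the `baseline` data, the `actions` records, and the function name `normalized_scores` are all hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical league-wide EPV deltas, grouped by context (here: pitch zone).
baseline = {
    "defensive": [0.000, 0.002, 0.004],
    "attacking": [0.00, 0.02, 0.04],
}

# Hypothetical player actions as (player, zone, delta). Both players sit
# at the top of what their own zone makes available.
actions = [
    ("defender", "defensive", 0.004),
    ("defender", "defensive", 0.004),
    ("striker", "attacking", 0.04),
    ("striker", "attacking", 0.04),
]


def normalized_scores(actions, baseline):
    """Mean per-player z-score of each delta against its zone's baseline.

    Assumes every zone's baseline has non-zero spread.
    """
    stats = {zone: (mean(d), pstdev(d)) for zone, d in baseline.items()}
    z_scores = defaultdict(list)
    for player, zone, delta in actions:
        mu, sigma = stats[zone]
        z_scores[player].append((delta - mu) / sigma)
    return {player: mean(zs) for player, zs in z_scores.items()}


# Raw totals, for comparison with the normalized view.
raw_totals = defaultdict(float)
for player, _zone, delta in actions:
    raw_totals[player] += delta

scores = normalized_scores(actions, baseline)
print("raw totals:       ", dict(raw_totals))
print("normalized scores:", scores)
```

In this toy data the striker's raw total is ten times the defender's, while their normalized scores coincide, because each player did equally well relative to what the zone offered. A realistic context would likely include more than the zone (pressure, game state, available passing options), which is left out here for brevity.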
Summed value deltas rank strikers highest because they play in high-reward zones, not because they are the most valuable players. A hypothetical identical player scores differently at striker than at defensive midfield (DM). Without opportunity normalization, an RL-based valuation largely reduces to a ranking of zone access.