xG Intent Ambiguity Correction

Expected Value Models | Level 3: Advanced

What It Is

This technique identifies and corrects the goal-mouth-angle inversion artifact in xG models. At very low angles (near the byline), apparent xG rises as the angle shrinks, because unintended events (clearances, long passes) are labeled as "shots" only when they accidentally result in goals. The failed attempts never enter the data, so the training set at extreme angles is contaminated by non-shot events and biased toward goals. The fix: extrapolate the expected monotonic angle-xG relationship into the affected region and downsample the over-represented "goal" observations in peripheral regions.
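
To make the selection bias concrete, here is a minimal simulation sketch. Every number in it is a made-up illustration value, not an estimate from real data: the linear "true" xG curve, the 0.2% chance of an unintended event going in, and the event counts are all assumptions, and the function names are hypothetical.

```python
import random


def simulate_logged_shots(seed=0, n_genuine=20000, n_genuine_low=1000,
                          n_unintended=100000):
    """Simulate a log where unintended byline events count as shots only if they score.

    All rates and counts are illustration values (assumptions), not real data.
    Returns (angles_deg, is_goal, is_intended) for every event logged as a shot.
    """
    rng = random.Random(seed)
    angles, goals, intended = [], [], []

    def true_xg(angle_deg):
        # Assumed ground truth: scoring probability grows linearly with angle.
        return min(0.01 * angle_deg, 1.0)

    # Genuine shots are always logged, whether or not they score.
    for lo, hi, n in ((5.0, 60.0, n_genuine), (0.0, 5.0, n_genuine_low)):
        for _ in range(n):
            angle = rng.uniform(lo, hi)
            angles.append(angle)
            goals.append(rng.random() < true_xg(angle))
            intended.append(True)

    # Clearances and long passes from near the byline: logged as "shots"
    # only when they happen to go in, so every logged one is a goal.
    for _ in range(n_unintended):
        if rng.random() < 0.002:
            angles.append(rng.uniform(0.0, 5.0))
            goals.append(True)
            intended.append(False)

    return angles, goals, intended


def goal_rate(angles, goals, lo, hi, keep=None):
    """Share of logged shots with lo <= angle < hi that were goals.

    keep is an optional list of booleans used to filter events.
    """
    picked = [g for i, (a, g) in enumerate(zip(angles, goals))
              if lo <= a < hi and (keep is None or keep[i])]
    return sum(picked) / len(picked)
```

Comparing `goal_rate(angles, goals, 0, 5)` with `goal_rate(angles, goals, 5, 10)` on the simulated log shows the inversion; passing `keep=intended` for the 0 to 5 degree bin removes it, which is only possible here because the simulation knows intent and real event data usually does not.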

Correct Execution

Plot xG against goal-mouth angle. xG should fall monotonically as the angle shrinks toward zero; if it instead rises at very low angles, the model has learned from contaminated data. Fix by: (1) identifying the angle threshold below which the inversion begins, (2) extrapolating the monotonic curve from the clean region into the affected region, (3) downsampling or reweighting the peripheral-angle goals so they match the extrapolated curve.
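
The diagnostic and step (1) can be sketched as follows. This is a minimal illustration, not the method from the talk: the function names, the 5-degree bin width, and the simple "walk up from the narrowest bin" rule are all assumptions, and noisy real data would likely need smoothing or a minimum shot count per bin before the rule is trustworthy.

```python
from collections import defaultdict


def binned_goal_rate(angles_deg, is_goal, bin_width=5.0):
    """Empirical goal rate per goal-mouth-angle bin.

    Returns a list of (bin_lower_edge_deg, goal_rate, n_shots), sorted by angle.
    """
    tallies = defaultdict(lambda: [0, 0])  # bin index -> [goals, shots]
    for angle, goal in zip(angles_deg, is_goal):
        idx = int(angle // bin_width)
        tallies[idx][0] += int(bool(goal))
        tallies[idx][1] += 1
    return [(idx * bin_width, g / n, n)
            for idx, (g, n) in sorted(tallies.items())]


def find_inversion_threshold(bins):
    """Lower edge of the trough bin, or None if there is no low-angle inversion.

    Walks upward from the narrowest angle for as long as the goal rate keeps
    falling. Bins below the returned angle are the ones where the rate rises
    as the angle shrinks, i.e. the suspected contaminated region.
    """
    rates = [rate for _, rate, _ in bins]
    i = 0
    while i + 1 < len(rates) and rates[i] > rates[i + 1]:
        i += 1
    return bins[i][0] if i > 0 else None
```

The `(angle, rate)` pairs returned by `binned_goal_rate` are what one would plot; the same check can be run on a fitted model's predicted xG instead of the empirical rate.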

Edges

🔑 Hidden Causal Lever

xG at Extreme Angles Is Contaminated by Non-Shot Events

At very low goal-mouth angles, xG rises counterintuitively as the angle narrows: events from extreme angles enter the training data as "shots" only when they accidentally result in goals. This selection bias toward goals contaminates the peripheral-angle training data.

What most people do: Accept xG at face value at all angles.
What the best do: Plot xG vs. angle, check for inversion, extrapolate the monotonic curve, downsample contaminated data.
Why it's an edge: Uncorrected models overvalue shots from extreme angles, distorting evaluation for cutback-heavy play styles.
How to exploit: Run the angle-xG diagnostic. If inversion exists, correct and revalidate.
Dr. Dinesh Vatvani, StatsBomb Conference, 2022-10-04
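
The "correct" step could look roughly like the sketch below. The source does not say how to extrapolate, so the linear-through-zero curve here is purely an assumption for illustration, as are the function names. The weight formula is plain algebra: with non-goals at weight 1, it is the goal weight that makes the weighted goal rate in a bin equal the target rate.

```python
def extrapolated_rate(angle_deg, anchor_angle_deg, anchor_rate):
    """Assumed monotonic extrapolation below the inversion threshold.

    Assumption: the goal rate scales linearly with angle and is 0 at angle 0.
    The anchor is a clean bin just above the threshold (centre angle, rate).
    """
    return anchor_rate * angle_deg / anchor_angle_deg


def goal_weight(observed_rate, target_rate):
    """Weight for goals (non-goals keep weight 1) in a contaminated bin.

    Solves w * G / (w * G + N) = target_rate for w, where
    observed_rate = G / (G + N).
    """
    if not 0.0 < observed_rate < 1.0 or not 0.0 < target_rate < 1.0:
        raise ValueError("rates must lie strictly between 0 and 1")
    return (target_rate * (1.0 - observed_rate)) / (
        observed_rate * (1.0 - target_rate)
    )
```

A weight below 1 can be used either as a sample weight when refitting the model or as the probability of keeping each goal when downsampling. Revalidation then means rerunning the angle-xG plot on the refitted model and checking that the inversion is gone.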

Sources

  • Dr. Dinesh Vatvani, StatsBomb Conference 2022, YouTube, 2022-10-04