Representation Learning for Game States

Data Infrastructure · Level 4: Expert

What It Is

An autoencoder is trained simultaneously on soft diagrams (pitch control differentials), pass success probability surfaces, and an action-prediction task, producing a low-dimensional bottleneck representation of any pitch frame. Clustering these representations groups game situations by their geometry: the source reports 95 discovered situations across 30 pitch zones, each defined by a combination of defensive distance, number of teammates ahead, and ball position relative to the defensive lines.
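
To make the multi-task idea concrete, here is a minimal NumPy sketch of a single-hidden-layer autoencoder with two heads that share one bottleneck: a reconstruction head and a next-action classifier. Everything specific in it is an assumption for illustration, not the architecture from the talk: the dense tanh encoder (a real model would more plausibly be convolutional), the 8-dimensional bottleneck, the three action classes, the `action_weight` loss weighting, and the random arrays standing in for real surfaces.

```python
import numpy as np

def init_params(rng, n_in, n_latent, n_actions, scale=0.1):
    """Small random weights for the encoder, decoder, and action head."""
    return {
        "W_enc": rng.normal(0.0, scale, (n_in, n_latent)),
        "W_dec": rng.normal(0.0, scale, (n_latent, n_in)),
        "W_act": rng.normal(0.0, scale, (n_latent, n_actions)),
    }

def encode(params, x):
    """Bottleneck representation: one row per pitch frame."""
    return np.tanh(x @ params["W_enc"])

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def loss_and_grads(params, x, actions, action_weight=1.0):
    """Joint loss = reconstruction MSE + weighted next-action cross-entropy."""
    n = x.shape[0]
    rows = np.arange(n)
    z = encode(params, x)
    x_hat = z @ params["W_dec"]              # reconstruction head
    probs = softmax(z @ params["W_act"])     # action-prediction head

    recon = np.mean(np.sum((x_hat - x) ** 2, axis=1))
    cross_entropy = -np.mean(np.log(probs[rows, actions] + 1e-12))
    loss = recon + action_weight * cross_entropy

    # Backpropagate both heads into the shared bottleneck.
    d_x_hat = 2.0 * (x_hat - x) / n
    d_logits = probs.copy()
    d_logits[rows, actions] -= 1.0
    d_logits *= action_weight / n
    d_z = d_x_hat @ params["W_dec"].T + d_logits @ params["W_act"].T
    d_pre = d_z * (1.0 - z ** 2)             # derivative of tanh
    grads = {
        "W_enc": x.T @ d_pre,
        "W_dec": z.T @ d_x_hat,
        "W_act": z.T @ d_logits,
    }
    return loss, grads

def train(x, actions, n_latent, n_actions, steps=100, lr=0.01, seed=0):
    """Plain full-batch gradient descent on the joint loss."""
    rng = np.random.default_rng(seed)
    params = init_params(rng, x.shape[1], n_latent, n_actions)
    history = []
    for _ in range(steps):
        loss, grads = loss_and_grads(params, x, actions)
        for name in params:
            params[name] -= lr * grads[name]
        history.append(loss)
    return params, history

# Synthetic stand-in data (assumption): 64 frames, 2 surfaces on a 6x4 grid.
rng = np.random.default_rng(0)
frames = rng.uniform(-0.5, 0.5, size=(64, 2, 6, 4))
x = frames.reshape(len(frames), -1)          # flatten surfaces per frame
actions = rng.integers(0, 3, size=len(frames))

params, history = train(x, actions, n_latent=8, n_actions=3)
z = encode(params, x)                        # situation representations
print(z.shape, history[0], history[-1])
```

Because both heads backpropagate into the same `W_enc`, the bottleneck has to retain the geometry needed to rebuild the surfaces as well as the information that predicts what happens next, which is the point of training the tasks together.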

Correct Execution

Train a multi-task autoencoder whose input is the set of spatial surfaces derived from 360 data and whose outputs are (a) a reconstruction of those inputs and (b) a prediction of the next action. The bottleneck layer serves as the situation representation. Cluster the bottleneck vectors to discover distinct game situations, then compute metrics (xT, pass success, etc.) per situation so that teams can be evaluated situation by situation rather than in aggregate.
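
The clustering and per-situation aggregation steps can be sketched as follows, assuming the bottleneck vectors are already available as an array with one row per frame. The source does not say which clustering algorithm was used; k-means (Lloyd's algorithm) is swapped in here purely for illustration, and the function names, the two-cluster toy data, and the random pass-success values are all hypothetical.

```python
import numpy as np

def kmeans(z, k, n_iter=50, seed=0):
    """Lloyd's k-means on bottleneck vectors z with shape (n_frames, n_latent)."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct frames.
    centers = z[rng.choice(len(z), size=k, replace=False)].copy()
    labels = np.zeros(len(z), dtype=int)
    for it in range(n_iter):
        # Squared Euclidean distance from every frame to every center.
        dist = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if it > 0 and np.array_equal(new_labels, labels):
            break                            # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = z[labels == j]
            if len(members) > 0:             # keep the old center if empty
                centers[j] = members.mean(axis=0)
    return labels, centers

def per_situation_means(labels, values, k):
    """Mean of a per-frame metric within each situation; NaN if unobserved."""
    counts = np.bincount(labels, minlength=k)
    sums = np.bincount(labels, weights=values, minlength=k)
    return np.divide(sums, counts, out=np.full(k, np.nan), where=counts > 0)

# Toy embeddings (assumption): two well-separated groups of 20 frames each.
rng = np.random.default_rng(0)
z = np.vstack([
    rng.normal(0.0, 0.1, size=(20, 2)),
    rng.normal(10.0, 0.1, size=(20, 2)),
])
pass_success = rng.uniform(0.0, 1.0, size=len(z))   # placeholder metric

labels, centers = kmeans(z, k=2)
print(per_situation_means(labels, pass_success, k=2))
```

To disaggregate by team, one option is to filter `labels` and the metric array to a single team's frames before calling `per_situation_means`, which yields one metric profile per team across the same set of situations.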

Sources

  • Zitian Tang, Tsinghua/Brown University, StatsBomb Conference 2023, YouTube, 2023-11-01