Zone-to-Zone Passing Signature Clustering

Scouting | Level 3: Advanced

What It Is

This technique clusters players by their zone-to-zone passing xT (expected threat) signature: a vector recording how much xT a player generates from passes between each pair of pitch zones. Players with similar passing signatures play similar roles in buildup (e.g., fullbacks who progress down the wing vs. fullbacks who recycle across the backline). Within each cluster, ranking by total xT generated identifies the best performers at that specific passing style. This is more useful for scouting replacements than generic player similarity scores because it groups players by HOW they create threat, not just how much.

Correct Execution

  1. Divide the pitch into N zones (15 macro-zones for computational feasibility, or 150 micro-zones for higher fidelity).
  2. For each player, compute a zone-to-zone passing matrix: how much xT they generate from passes originating in zone X and ending in zone Y. Flatten this into a feature vector.
  3. Cluster players using K-means or similar.
  4. Interpret clusters by which positions are overrepresented and by the spatial patterns of their passing.
  5. Within each cluster, rank by total xT generated to find the best performers at that passing style.
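Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the 105 x 68 m pitch, the 5 x 3 grid layout, the pass tuple format, and the names `zone_of` and `passing_signature` are all assumptions made for the example, and each pass is assumed to arrive with its xT value already computed.

```python
PITCH_LENGTH, PITCH_WIDTH = 105.0, 68.0  # assumed pitch size in metres
N_COLS, N_ROWS = 5, 3                    # assumed 5 x 3 grid = 15 macro-zones
N_ZONES = N_COLS * N_ROWS


def zone_of(x, y):
    """Map a pitch coordinate to a zone index in 0..N_ZONES-1."""
    # Clamp so that points exactly on the far touchline or byline stay in range.
    col = min(int(x / PITCH_LENGTH * N_COLS), N_COLS - 1)
    row = min(int(y / PITCH_WIDTH * N_ROWS), N_ROWS - 1)
    return row * N_COLS + col


def passing_signature(passes):
    """Flatten one player's zone-to-zone xT matrix into a feature vector.

    `passes` is an iterable of (x_start, y_start, x_end, y_end, xt) tuples,
    where `xt` is the xT already credited to that pass.
    """
    vector = [0.0] * (N_ZONES * N_ZONES)
    for x0, y0, x1, y1, xt in passes:
        # Row = origin zone, column = destination zone, stored row-major.
        vector[zone_of(x0, y0) * N_ZONES + zone_of(x1, y1)] += xt
    return vector


# Hypothetical passes for a single player (values invented for illustration).
passes = [
    (10.0, 60.0, 40.0, 62.0, 0.012),
    (12.0, 58.0, 38.0, 64.0, 0.010),
    (70.0, 60.0, 95.0, 40.0, 0.080),
]
signature = passing_signature(passes)
print(len(signature))  # 225 features for 15 macro-zones
```

The first two passes share the same origin and destination zones, so their xT accumulates in a single feature, while the third lands in a separate one. Every other entry stays at zero, which is the sparsity that grows quickly as the zone count rises.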

Key distinction: two fullback clusters might emerge — one dominated by direct wing-to-box progression (high xT, high variance), another by conservative backline distribution (low xT, low variance). When scouting a replacement for a progressive fullback, search within the progressive cluster, not across all fullbacks.
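The split into a progressive and a conservative group can be illustrated with a toy K-means run. This is a sketch under stated assumptions: the four signatures are invented and truncated to three features for readability, the `kmeans` helper is a plain-Python stand-in for a library implementation, and the PCA step mentioned in the source talk is skipped.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(vectors, k, n_iter=50, seed=0):
    """Basic Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: each player goes to the nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical, truncated signatures:
# [wing-to-box xT, backline-to-backline xT, wing-to-wing xT]
fullbacks = {
    "Fullback A": [0.90, 0.10, 0.30],
    "Fullback B": [0.80, 0.20, 0.40],
    "Fullback C": [0.10, 0.30, 0.10],
    "Fullback D": [0.05, 0.25, 0.15],
}
labels, centroids = kmeans(list(fullbacks.values()), k=2)
for name, label in zip(fullbacks, labels):
    print(name, "-> cluster", label)
```

On this toy data, A and B end up together and C and D end up together. Cluster numbers themselves are arbitrary, so interpretation still relies on inspecting which zone pairs dominate each centroid.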

Coaching Cues

  • "Same style, different quality. The cluster tells you what they try; the xT tells you how well they do it."
  • "Two fullback clusters: one progresses, one recycles. When your progressive fullback leaves, don't scout from the recyclers."
  • "Andrea Conti was the highest pass xT fullback in his cluster last season. That's your replacement."

Common Errors

  1. Using too many zones for the passing matrix: A 150×150 zone matrix per player (22,500 features) is computationally expensive and sparse, because most players never attempt passes between most zone pairs. 15 macro-zones (15×15 = 225 features) is more practical.
  2. Clustering without xT weighting: Clustering on raw pass counts puts volume passers together. Using xT-weighted pass vectors groups players by the VALUE of their passes, not just frequency.
  3. Assuming all cluster members are interchangeable: Cluster membership means similar style; xT ranking within the cluster measures quality.
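The difference between count-based and xT-weighted signatures (error 2) shows up even for a single player. In the rough sketch below, the zone indices, the pass volumes, and the per-pass xT values are all invented for illustration, and passes are assumed to be pre-binned into (origin zone, destination zone, xT) tuples.

```python
from collections import defaultdict


def count_signature(passes):
    """Raw number of passes per (origin zone, destination zone) pair."""
    sig = defaultdict(int)
    for src, dst, _xt in passes:
        sig[(src, dst)] += 1
    return dict(sig)


def xt_signature(passes):
    """Total xT per (origin zone, destination zone) pair."""
    sig = defaultdict(float)
    for src, dst, xt in passes:
        sig[(src, dst)] += xt
    return dict(sig)


# Hypothetical player: many low-value backline passes, few high-value
# wing-to-box passes.
passes = [(1, 2, 0.001)] * 40 + [(13, 9, 0.080)] * 5

counts = count_signature(passes)
weighted = xt_signature(passes)

dominant_by_count = max(counts, key=counts.get)
dominant_by_xt = max(weighted, key=weighted.get)
print("dominant by count:", dominant_by_count)
print("dominant by xT:   ", dominant_by_xt)
```

By count, the backline pair dominates the vector; by xT, the wing-to-box pair does. A clustering run on the count vectors would therefore tend to group this player with other high-volume recyclers.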

Edges

💎 Elite-Only Behavior

Scout by Passing Signature Cluster, Not Position

Two fullback clusters: progressive (wing-to-box, high xT) and conservative (backline recycling, low xT). Signing a conservative recycler to replace a progressive wing-back creates a system mismatch that aggregate metrics don't predict.

What most people do: Scout by nominal position and aggregate metrics.
What the best do: Cluster by zone-to-zone xT-weighted passing vectors. Rank within the departing player's cluster.
Why it's an edge: Nominal position is a poor proxy for tactical role. Two "right backs" can have opposite passing functions.
How to exploit: Build passing signature clusters. Identify the departing player's cluster. Rank within-cluster by xT per action.
PhD student, StatsBomb Conference, 2019-10-30
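The within-cluster shortlist can be sketched as a filter followed by a sort. The player names, cluster labels, and xT totals below are hypothetical, as is the `rank_within_cluster` helper. The sketch ranks by total xT; a per-action rate could be swapped in as the sort key.

```python
def rank_within_cluster(players, departing_name, metric="total_xt"):
    """Return candidates from the departing player's cluster, best first."""
    by_name = {p["name"]: p for p in players}
    target_cluster = by_name[departing_name]["cluster"]
    candidates = [
        p for p in players
        if p["cluster"] == target_cluster and p["name"] != departing_name
    ]
    return sorted(candidates, key=lambda p: p[metric], reverse=True)


# Hypothetical cluster assignments and season totals.
players = [
    {"name": "Departing RB", "cluster": "progressive", "total_xt": 4.1},
    {"name": "Fullback A", "cluster": "progressive", "total_xt": 3.2},
    {"name": "Fullback B", "cluster": "conservative", "total_xt": 3.9},
    {"name": "Fullback C", "cluster": "progressive", "total_xt": 4.6},
]
shortlist = [p["name"] for p in rank_within_cluster(players, "Departing RB")]
print(shortlist)  # ['Fullback C', 'Fullback A']
```

Fullback B has more total xT than Fullback A but is excluded because of the cluster mismatch, which is the case a position-wide ranking on aggregate xT would get wrong.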

Sources

  • PhD student, StatsBomb Innovation in Football Conference, YouTube, 2019-10-30 — presented zone-to-zone passing signature clustering with PCA and K-means; identified 2 distinct fullback cluster types (progressive vs. conservative); ranked within clusters by xT generation for scouting applications