Press Outcome Profiling & PCA Clustering

Tactical Analysis · Level 3 (Advanced)

What It Is

Press outcome profiling measures how a team's pressing resolves: each outcome type (tackles, fouls, duels, interceptions, loose balls, miscontrols, ball going out of play) is expressed as a proportion of the team's total defensive events. Principal component analysis (PCA) then reduces these profiles to a few dimensions in which teams with similar pressing styles sit close together. Different pressing styles produce systematically different outcome distributions: man-marking styles produce high duel and foul rates, while space-oriented styles produce high interception and loose-ball rates. The outcome profile is the fingerprint of the pressing style.

Correct Execution

  1. For each team, collect all defensive events that occur after a pressure event.
  2. Classify each outcome: tackle, foul, duel, interception, loose ball, miscontrol, or ball out of play.
  3. Compute each outcome as a proportion of that team's total defensive events, not as an absolute count. High-possession teams have fewer defensive events, which biases raw counts.
  4. Build a feature vector for each team: [tackle %, foul %, duel %, interception %, loose ball %, ...].
  5. Add spatial features: pressure initiation zone, post-pressure pass direction, average pressure height.
  6. Run PCA on the combined feature matrix to reduce dimensions and visualize how teams cluster.
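
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method from the cited talk: the event tuples, team names, outcome labels, and pressure-height values are all made up, the proportion denominator is taken to be the collected post-pressure events, and the column standardization before PCA is an added choice so that the metre-scale spatial column does not swamp the proportions. PCA is done with a plain NumPy SVD rather than a dedicated library.

```python
from collections import Counter

import numpy as np

OUTCOMES = ["tackle", "foul", "duel", "interception",
            "loose_ball", "miscontrol", "out_of_play"]

# Hypothetical post-pressure defensive events as (team, outcome) pairs.
events = [
    ("Team A", "duel"), ("Team A", "foul"), ("Team A", "tackle"),
    ("Team A", "duel"), ("Team B", "interception"), ("Team B", "loose_ball"),
    ("Team B", "interception"), ("Team B", "out_of_play"),
    ("Team C", "tackle"), ("Team C", "interception"), ("Team C", "miscontrol"),
    ("Team C", "duel"),
]


def outcome_proportions(events):
    """Steps 1-4: per-team outcome shares of that team's collected events."""
    counts = {}
    for team, outcome in events:
        counts.setdefault(team, Counter())[outcome] += 1
    teams = sorted(counts)
    matrix = np.array([[counts[t][o] for o in OUTCOMES] for t in teams],
                      dtype=float)
    matrix /= matrix.sum(axis=1, keepdims=True)  # each row now sums to 1
    return teams, matrix


def pca_project(features, n_components=2):
    """Step 6: standardize columns, then project onto the top components."""
    centered = features - features.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0  # constant columns stay at zero instead of dividing by 0
    scaled = centered / std
    _, _, vt = np.linalg.svd(scaled, full_matrices=False)
    return scaled @ vt[:n_components].T


teams, props = outcome_proportions(events)

# Step 5 (assumed encoding): one spatial feature appended as an extra column,
# here a made-up average pressure height in metres from the team's own goal.
pressure_height = np.array([[58.0], [44.0], [51.0]])
features = np.hstack([props, pressure_height])

coords = pca_project(features)
for team, (pc1, pc2) in zip(teams, coords):
    print(f"{team}: PC1={pc1:.2f}, PC2={pc2:.2f}")
```

Plotting `coords` as a scatter with team labels gives the usual PCA map. With real data, the spatial features would need their own encoding (for example, shares of pressures per zone), which is left open here.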

Key normalization: use proportions of each team's total defensive events, NOT per-possession rates. High-pressing teams dominate possession and so have fewer defensive possessions; dividing by that smaller denominator inflates their pressing metrics.
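
A small arithmetic sketch shows the bias. The two teams and all their totals are invented for illustration, and only two outcome types are used to keep the numbers easy to follow. Both teams have the same outcome mix, but one has far fewer defensive possessions.

```python
# Made-up season totals for two hypothetical teams with the same outcome mix.
teams = {
    "High possession": {"tackle": 40, "interception": 60, "def_possessions": 50},
    "Low possession": {"tackle": 80, "interception": 120, "def_possessions": 150},
}


def tackle_metrics(row):
    """Return the tackle rate under both normalizations."""
    total_events = row["tackle"] + row["interception"]
    return {
        "per_possession": row["tackle"] / row["def_possessions"],
        "proportion": row["tackle"] / total_events,
    }


metrics = {name: tackle_metrics(row) for name, row in teams.items()}
for name, m in metrics.items():
    print(f"{name}: {m['per_possession']:.2f} tackles per defensive possession, "
          f"{m['proportion']:.0%} of defensive events")
```

Per defensive possession, the high-possession team looks more tackle-heavy (0.80 against 0.53), even though tackles make up 40% of defensive events for both teams. The proportion view recovers the identical profile; the per-possession view mostly ranks by possession.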

Coaching Cues

  • "Same philosophy, different execution. The PCA caught the difference — now explain it."
  • "If your pressing metric just ranks by possession, it's not measuring pressing."
  • "The outcome profile is the fingerprint of the pressing style."

Common Errors

  1. Using absolute counts instead of proportions: A team with 200 tackles isn't necessarily more aggressive than one with 100 — they may just have more defensive possessions.
  2. Too few features in PCA: Using only outcome types misses the spatial dimension. Include pressure initiation zone, post-pressure pass direction (where the press forces play), and average pressure height for richer clustering.
  3. Over-interpreting PCA clusters: PCA shows proximity in feature space, not identical playing style. Two teams that sit close together on the PCA plot may press similarly in aggregate but for completely different tactical reasons.
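
One way to guard against over-reading a PCA plot is to check how much variance the plotted components explain and which features drive them. The sketch below is an illustration only: the feature names and the five-team matrix are made up, and PCA is computed with a NumPy SVD on standardized columns (an assumed preprocessing choice).

```python
import numpy as np

FEATURES = ["tackle_pct", "foul_pct", "duel_pct",
            "interception_pct", "pressure_height"]

# Hypothetical team-by-feature matrix (rows are teams, values are made up).
X = np.array([
    [0.20, 0.15, 0.35, 0.30, 55.0],
    [0.22, 0.14, 0.34, 0.30, 42.0],
    [0.10, 0.05, 0.15, 0.70, 48.0],
    [0.12, 0.06, 0.17, 0.65, 60.0],
    [0.18, 0.10, 0.27, 0.45, 50.0],
])

# Standardize so the metre-scale column does not dominate the proportions.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA via SVD: rows of vt are the component loadings,
# and the squared singular values are proportional to explained variance.
_, s, vt = np.linalg.svd(Z, full_matrices=False)
explained = s**2 / np.sum(s**2)

for i in range(2):
    top = np.argsort(-np.abs(vt[i]))[:2]  # two largest loadings by magnitude
    drivers = ", ".join(f"{FEATURES[j]} ({vt[i, j]:+.2f})" for j in top)
    print(f"PC{i + 1}: {explained[i]:.0%} of variance, driven by {drivers}")
```

If the first two components explain only a modest share of the variance, or a feature of interest has small loadings on both, then two teams can sit next to each other on the plot while differing on exactly that feature. Going back to the original feature vectors for any pair of "similar" teams is the safer reading.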

Sources

  • Nicole Kuzlova, StatsBomb Conference 2021, YouTube, 2021-11-04 — presented press outcome proportion analysis and PCA clustering for 5 pressing styles; identified normalization bias from per-possession rates; showed Leeds as PCA outlier due to unique all-field man-marking