Home/Soccer Analytics/Player Similarity Scores for Recruitment

Player Similarity Scores for Recruitment

RecruitmentLevel 3 — Advanced

What It Is

A player similarity score computes how statistically close a candidate player is to a target archetype, using a weighted distance metric across the relevant profile features. Given a departing player (or archetype), the model surfaces the most similar players in the broader dataset — including players no scout has currently flagged. The primary value: surfacing unknown candidates who match the profile but aren't on anyone's radar.

Correct Execution

Build a similarity score by: (1) selecting the metrics that define the role profile; (2) normalizing each metric to a common scale (z-score or percentile within position-peer group); (3) computing weighted Euclidean distance from the target player's profile vector. Lower distance = more similar. Filter by age range, league level, and contract status before presenting results. Always present the full profile comparison side-by-side, not just the score — the score is a shortlist tool, not a decision.

Progression Levels

Diagnostic Tree

Coaching Cues

  • "Similarity scores find the players you weren't looking for." — Ted Knutson, 2018
  • "The top of the list is the starting point, not the answer."

Common Errors

  1. Equal-weighting all features: Some metrics matter much more than others for a specific role. Weight by coach priority.
  2. Not normalizing within position group: A winger with 50 progressive carries looks very different from a striker with 50 — normalize within role.
  3. Presenting the score without the profile breakdown: The score is a shortlist filter. The decision should be made from the full profile comparison.

Edges

💎 Elite-Only Behavior

Player Similarity Search Surfaces Candidates That Scouts Would Never Identify

Similarity search across multi-dimensional player profiles (20+ metrics, position-adjusted, weighted by game-model importance) systematically identifies candidates from leagues and clubs that scouts don't cover. The most valuable output of similarity search isn't confirming the scouting shortlist — it's surfacing players from lower leagues, smaller clubs, or unfashionable positions who profile identically to the target but are invisible to the scouting network. These are consistently the highest-ROI transfers.

What most people do
Use similarity scores to validate an existing shortlist built by scouts.
What the best do
Run similarity search FIRST, present the results alongside the scouting shortlist, and specifically highlight candidates the scouts hadn't identified. These unknown candidates are where the informational edge exists — by definition, if the scout already knows about them, competitors likely do too.
Why it's an edge: Scouts are geographically and reputationally biased. They watch the leagues they know, follow the players they've heard of. Data-driven similarity search has no geographic bias — it finds the best statistical match regardless of league, nationality, or reputation.
How to exploit: When filling a position, run similarity search on the departing player's profile across ALL leagues with available data. The candidates who appear in the similarity results but are absent from the scout's shortlist are your highest-value targets.
Ted Knutson, Barcelona Coach Analytics Summit, 2018-11-18; referenced as Phase 1 recruitment analytics core capability.

Sources

  • Ted Knutson, Barcelona Coach Analytics Summit, YouTube, 2018-11-18 — demonstrated similarity scores for Jorginho replacement at Napoli; surfaced Frenkie de Jong, Maxime Lopez, Rodrigo Bentancur as unknown/underrated candidates
  • Ted Knutson & Siqur Arshad, WFS 2019 StatsBomb presentation, YouTube, 2019-10-02 — added transfer validation use case: Jovic's Frankfurt statistics resembled Ronaldo's profile, providing pre-transfer analytical validation for Real Madrid's acquisition; also showed Ronaldo's Juventus vs. Real Madrid profile as nearly identical, confirming cross-league statistical stability as a feature of genuine player quality