This analysis uses gradient boosting models with cluster-based interpretation to identify the optimal ball circulation parameters when attacking a set defense. Three key parameters emerge: speed of ball circulation, width of play (lateral pitch coverage), and time between consecutive passes. The analysis reveals sweet spots for each parameter and a critical context interaction: speed under pressure increases goal probability (it pulls defenders out of position), but speed without pressure decreases it (unnecessary urgency reduces accuracy). Additionally, central play is more rewarding but riskier than wing play, a fundamental tradeoff that requires game-theoretic thinking.
(1) Label events as attacking against a set defense using the proxy framework. (2) Compute rolling features for each event: rolling speed (m/s), rolling width (lateral meters covered), rolling time between consecutive passes. (3) Train a gradient boosting model (XGBoost) to predict P(goal within 5 moves) using these features plus pressure state and distance from goal. (4) Interpret the black box: cluster events by distance-from-goal and feature values, then vary one feature at a time across the cluster centroid, predicting with and without pressure. This reveals local optima in each parameter.
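Steps (2) and (4) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the pipeline used in the analysis: the function names (rolling_features, sweep_feature, toy_predict), the feature keys (under_pressure, distance_from_goal), the 5-pass window, and the exact feature definitions are all hypothetical. The trained XGBoost model is replaced by a hand-written stand-in, and one event's features stand in for a real cluster centroid.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Pass = Tuple[float, float, float]  # (time in s, x in m, y in m); y assumed lateral
Features = Dict[str, float]


def rolling_features(passes: Sequence[Pass], window: int = 5) -> List[Features]:
    """Rolling speed, width, and mean pass gap over the last `window` passes."""
    out = []
    for i in range(len(passes)):
        recent = passes[max(0, i - window + 1): i + 1]
        elapsed = recent[-1][0] - recent[0][0]
        # Ball travel: summed straight-line distance between consecutive passes.
        dist = sum(math.hypot(b[1] - a[1], b[2] - a[2])
                   for a, b in zip(recent, recent[1:]))
        ys = [p[2] for p in recent]
        gaps = len(recent) - 1
        out.append({
            "rolling_speed": dist / elapsed if elapsed > 0 else 0.0,
            "rolling_width": max(ys) - min(ys),
            "rolling_pass_gap": elapsed / gaps if gaps else 0.0,
        })
    return out


def sweep_feature(predict: Callable[[Features], float], centroid: Features,
                  feature: str, grid: Sequence[float]
                  ) -> List[Tuple[float, float, float]]:
    """Vary one feature over `grid`, holding the rest at the centroid.

    Returns (value, p_unpressed, p_pressed) for each grid value.
    """
    rows = []
    for value in grid:
        probs = []
        for pressed in (0.0, 1.0):
            x = dict(centroid, **{feature: value, "under_pressure": pressed})
            probs.append(predict(x))
        rows.append((value, probs[0], probs[1]))
    return rows


def toy_predict(x: Features) -> float:
    """Hypothetical stand-in for the trained model; NOT fitted to any data."""
    speed = x["rolling_speed"]
    if x["under_pressure"]:
        return min(0.02 + 0.004 * speed, 1.0)
    return max(0.05 - 0.002 * (speed - 6.0) ** 2, 0.0)


feats = rolling_features([(0.0, 0.0, 0.0), (2.0, 6.0, 8.0), (4.0, 6.0, 20.0)])
centroid = dict(feats[-1], distance_from_goal=25.0)
rows = sweep_feature(toy_predict, centroid, "rolling_speed",
                     [2.0, 4.0, 6.0, 8.0, 10.0])
best_unpressed = max(rows, key=lambda r: r[1])[0]
best_pressed = max(rows, key=lambda r: r[2])[0]
```

With a real model, predict would wrap the fitted classifier's probability output for a single feature row. The local optimum for each pressure state is then the grid value that maximizes the corresponding column of rows. The stand-in above merely encodes a peaked unpressed curve and a rising pressed curve so the sweep has something to find.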
Key findings:
Speed WITHOUT pressure decreases goal probability against organized blocks: faster circulation produces unnecessary errors while the defense stays set. Speed only helps UNDER pressure. The optimal unpressed speed is ~6 m/s.