Restarting fast and slow: Analyzing restart behavior in 2014 J-League

Last year I wrote a series of posts on effective playing time in the top two J-League divisions at the mid-season and end-of-season marks. Further analysis of the numbers using an admittedly crude regression model indicated that some clubs have a significantly greater effect on the amount of playing time than others, for better or […]

Paper Discussions to return in 2015

One of the features of Soccermetrics in the early days of the website has been “Paper Discussions”, in which I discuss research articles that investigate quantitative analysis problems related to soccer. Some of them were classic papers in the field, while others were recently published. There were even some researchers who went out of their […]

Europa League seedings do not separate best and worst

Because the seedings for the Europa League draws are based on six matches within uneven groupings created by the UEFA coefficient system, the seedings for today’s Europa League draw do not actually separate the best and worst remaining …

Weighted Shots v Unweighted Shots As A Predictor of Future Goal Difference in the EPL.

Tom Tango has recently presented an alternative to Corsi in hockey that weights shots differently depending on whether they resulted in goals, saves, misses or blocks.

One of the logical tests of the new metric is to see how well it correlates to useful team information, such as future goal difference, compared to projecting from previously used metrics, such as unweighted shot differential or ratios.

The expectation voiced in many hockey circles was that because the “Tango” correlated almost perfectly to the traditional Corsi metric, the added information hoped for by weighting different types of shots would be negligible, at best.

In a typically concise and insightful post, Tango addresses the issue of the virtually perfect correlation between the two metrics. He points out that when basic shot data from identical samples were correlated to out-of-sample data, such as future goal difference, the coefficients of correlation differed depending on whether Corsi or Tango was used.

In short, weighted shots showed higher r values, despite the strong correlation between the two metrics.

r Values for Weighted & Unweighted Shot Differential and Ratios when Correlating to Future Premiership Goal Difference.

After X Games   r for TSR   r for Shot Differential   r for Weighted Shot Differential
2               0.49        0.51                      0.57
6               0.70        0.71                      0.77
10              0.70        0.71                      0.76
15              0.74        0.74                      0.80
18              0.73        0.74                      0.80
20              0.73        0.74                      0.79
24              0.72        0.73                      0.77
30              0.65        0.66                      0.69
34              0.55        0.55                      0.56

Tango’s defence of his new metric can be summed up in this extract from the linked post.

“But more amazing is that even though the correlation of Corsi to Tango (both based on the same samples) was close to r=1, when we correlate each to out-of-sample data (in this case, goal differential from OTHER games), Tango correlated at r=.50, while Corsi was r=.44.  Or if you prefer r-squared, it’s .25 to .19, respectively.”
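
The r-squared figures in the quote are simply the squares of the quoted r values. In a simple linear fit, r-squared is the share of variance in the out-of-sample goal differential accounted for by the metric. A minimal check, where the function name `r_squared` is only an illustrative helper:

```python
def r_squared(r: float) -> float:
    """Square a correlation coefficient to get the share of variance explained."""
    return r * r


# r values as quoted from Tango's post
for name, r in (("Tango", 0.50), ("Corsi", 0.44)):
    print(f"{name}: r = {r:.2f}, r-squared = {r_squared(r):.2f}")
```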

I have therefore repeated the exercise for the Premiership, calculating three flavours of shot-based metric for each team over one part of the season and testing how well each correlates to that team's goal difference over the remainder of the season.
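
The exercise described above might be sketched roughly as follows. This is not the code behind the table: the shot weights are hypothetical (the weights actually used are not stated here), the four teams and their numbers are made up purely to make the sketch runnable, and TSR is taken in its usual sense of shots for divided by total shots in a team's matches.

```python
from math import sqrt

# HYPOTHETICAL weights per shot outcome; the post does not state the real ones.
WEIGHTS = {"goal": 1.0, "save": 0.2, "miss": 0.2, "block": 0.2}


def weighted_shots(counts, weights=WEIGHTS):
    """Sum shot counts by outcome, scaling each outcome by its weight."""
    return sum(w * counts.get(outcome, 0) for outcome, w in weights.items())


def team_metrics(shots_for, shots_against):
    """Return the three shot-based metrics for one team over a run of games."""
    total_for = sum(shots_for.values())
    total_against = sum(shots_against.values())
    return {
        "tsr": total_for / (total_for + total_against),
        "shot_diff": total_for - total_against,
        "weighted_diff": weighted_shots(shots_for) - weighted_shots(shots_against),
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant data")
    return sxy / sqrt(sxx * syy)


# MADE-UP data: (shots for, shots against) in the first part of the season,
# and goal difference in the remaining games. Not real Premiership figures.
teams = [
    ({"goal": 30, "save": 70, "miss": 80, "block": 40},
     {"goal": 15, "save": 50, "miss": 60, "block": 30}, 18),
    ({"goal": 22, "save": 60, "miss": 70, "block": 35},
     {"goal": 20, "save": 55, "miss": 65, "block": 30}, 4),
    ({"goal": 18, "save": 50, "miss": 60, "block": 30},
     {"goal": 24, "save": 60, "miss": 70, "block": 40}, -6),
    ({"goal": 12, "save": 45, "miss": 55, "block": 25},
     {"goal": 28, "save": 65, "miss": 75, "block": 45}, -15),
]

metrics = [team_metrics(sf, sa) for sf, sa, _ in teams]
future_gd = [gd for _, _, gd in teams]

for name in ("tsr", "shot_diff", "weighted_diff"):
    r = pearson_r([m[name] for m in metrics], future_gd)
    print(f"{name}: r = {r:.2f}")
```

Repeating this with the split point moved through the season (after 2 games, 6 games, and so on) would give one row of r values per split.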

Weighting shots appears to make a difference in soccer as well as in hockey. Correlation peaks around mid-season, but at every stage the weighted metric correlated more strongly to goal difference in the remainder of the season than either unweighted metric.

It also makes intuitive sense to reflect the extra information present in a goal compared to just a shot.

OptaPro makes off-ball data available for Analytics Forum

Interesting news from OptaPro, Opta Sports’ professional services arm, that player tracking data from English Premier League matches will be made available to successful proposals to their Analytics Forum next year. The data comes from Opta’s partner Tracab, which uses its image-tracking technology to generate 3D sports data for media customers (they are […]

Guest post: Steve Lawrence assesses the Premier League predictions

The Scoreboard Journalism challenge for points and place predictions from prominent media, stats modellers, fans and online publishers for both the Eredivisie and the Premier League has attracted considerable interest and @JamesWGrayson has been publishing regular assessments of how the …