What the hell happened to Wigan? Part I – scorer bias at the DW?

Rather than reporting TSRs here I’m simply going to record the percentage of shots a team takes – it looks cleaner. I’m going to start by laying out the basics and building up from there. Firstly, shots aren’t equally distributed between home and away teams. There’s a 56.6/43.4 split – i.e., of every 7 shots … Continue reading »

315 down, 65 to go. The Premiership so far

Same as after 261 games, with the important races at the top, and a data dump at the bottom. The European spots Last time round qualcompTSR projected the race would finish like this. The same four teams are still in contention although, as I said at the time, Everton were long shots to get in, … Continue reading »

261 down, 119 to go. The Premiership so far

A look at the interesting races and team summaries at the top; for the data dump check out the bottom of the post. The European spots Let’s start with Everton. They’ve been simply fantastic; the work David Moyes does at that club year after year is nothing short of remarkable. That being said there’s only … Continue reading »

The Premier League battles laid bare

As a companion piece to the “Race to 38/68/87” article which blogger James Grayson published on the Premier League yesterday, I have looked at the current Euro Club Index assessment of the competition. The title race As I touched on … Continue reading

149 down, 231 to go. The Premiership so far

This post comprises a big data dump of six tables followed by a summary for each team. For an explanation of any of the metrics, click the (link) following their initial mention. First, how the Premiership teams have performed in terms of TSR (link). The higher the number, the better the team controls the … Continue reading »

The Dynamics of Relegation in the Premier League: Early Warning Signs and Seeing the Forest for the Trees

(c) 2011

In hindsight, relegation often seems inevitable. If you had asked the pundits, Blackpool’s demotion to the Championship last year was all but a done deal in August. But do the data agree? And what can they tell us about the inevitability and predictability of relegation ahead of time, rather than after the fact?

It’s not an easy question to answer. The trick to avoiding what psychologists call hindsight bias is to spot trends before they become facts. But that’s a hard thing to do in the middle of a season when the weekly performance of teams varies for all kinds of reasons and the hoopla and grind of the season make it difficult to see the forest – the real performance of a club – for the trees (some examples are here). Moreover, there are so many different and variable data points to consider – match outcomes, individual player form, injuries, you name it – that normal data analysis techniques aren’t always ideal for assessing what is really going on. And finally, to avoid seeing relegation as inevitable requires analysts to be on the lookout for early warning signs – but how would we know what those signs might be and when they might show up?

To explore how these challenges can be dealt with, let’s look at what happens to relegated clubs over the course of an entire season, using data from 2010-11. Some obvious questions you might ask of the data are these:

  • How did relegated clubs perform?
  • Were there obvious trends in performance early in the season – did relegated clubs get better or worse over the course of the season?
  • Were the trends in performance radically different between relegated and non-relegated clubs?

Answering these questions means looking at data over time – trends in performance. It also means cutting through the noise inherent in any performance data that vary across teams, and especially from week to week. Analytically, this means we are interested in both the long-term (season-long) and short-term (week-to-week) trends in performance. A nifty technique called lowess smoothing – also known as locally weighted polynomial regression – can provide some answers. While it may sound fancy, it’s actually quite simple: instead of fitting one straight line through the data for, say, a whole season, lowess takes so-called localized subsets of the data (weeks) and runs many small regressions – in our case, literally hundreds – to weed out outliers and identify the shared trends in both the short and the longer run. That makes it sensitive to short-term fluctuations while still allowing for curvilinear relationships.
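To make the “many small regressions” idea concrete, here is a minimal, illustrative lowess sketch in Python – local linear fits with tricube weights, applied to made-up weekly goals numbers. This is a toy under stated assumptions (the data, the bandwidth fraction, and the function name are all invented for illustration), not the actual model behind the graphs that follow.

```python
import numpy as np

def lowess_fit(x, y, frac=0.5):
    """Minimal lowess: for each point, fit a weighted straight line to its
    nearest neighbours (tricube weights) and keep the fitted value.
    Over a whole season this amounts to running many tiny regressions."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    n = len(x)
    k = max(3, int(np.ceil(frac * n)))         # size of each localized subset
    fitted = np.empty(n)
    for i in range(n):
        d = np.abs(x - x[i])
        idx = np.argsort(d)[:k]                # the k nearest weeks
        h = d[idx].max()                       # local bandwidth
        w = (1.0 - (d[idx] / h) ** 3) ** 3     # tricube weights, zero at the edge
        sw = np.sqrt(w)
        X = np.column_stack([np.ones(k), x[idx]])
        # Weighted least squares via lstsq on the sqrt-weighted system
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y[idx] * sw, rcond=None)
        fitted[i] = beta[0] + beta[1] * x[i]
    return fitted

# Illustrative use on synthetic weekly goals (not the article's data):
weeks = np.arange(1, 39)
trend = 1.2 - 0.01 * weeks + 0.003 * (weeks - 19) ** 2   # a curvilinear trend
noisy = trend + np.random.default_rng(0).normal(0, 0.3, 38)
smooth = lowess_fit(weeks, noisy)   # recovers the underlying curve from the noise
```

The `frac` parameter controls how local each fit is: a small fraction tracks week-to-week wiggles, a large one approaches a single season-long line.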

But enough of the econometrics – how does it look in practice?

To understand how clubs’ fortunes differ, we can, as a first step, take match data for the clubs that were relegated and those that were not, and compare their performance over time. This establishes their trends – and the key advantage of the lowess analysis is that it picks out the underlying signal in the data around which clubs’ performance profiles vary.

Offensive Production
So first, here are the trends in goal production among relegated and non-relegated clubs over the course of the 2010-11 season, as described by the best-fitting lowess regression lines.

As the graphs show, in terms of goal production in 2010-11, all clubs declined somewhat over the first 10 weeks of the season and recovered to early-season levels by Week 20. After Week 20, however, the clubs that were relegated declined steadily in offensive performance (by about 10-15%), while the clubs that were not relegated, taken as a whole, improved offensively from around Week 12 onward. The numbers show that relegated clubs started at a lower level of performance and then deteriorated after passing the season’s halfway point, while clubs that stayed up got better. By the end of the year, the gap between the relegated clubs and the rest of the league was sizable (compare the performance levels in the early weeks to those in the last few weeks).

Defensive Production
What about defensive production? Here’s what the picture looks like (recall that on this graph, more goals conceded is a bad thing).

Among the clubs that stayed up, defensive performance was very consistent over the course of the season (and better than that of the relegated clubs). In contrast, after about Week 12, the clubs that were relegated declined rapidly in defensive performance (by a whopping 60% or so). The strong U-shaped curve in the trend tells us that the relegated clubs defended poorly early in the season, improved to levels roughly equal to the rest of the league about a third of the way through, but then saw a massive and steady deterioration as the year wore on – to the tune of about a goal per match by the end of the year.
One objection that could be raised is that these analyses compare apples and oranges: the three relegated clubs are being set against every club that stayed up – strugglers and eventual champions alike. Perhaps the picture would look different if we compared like with like. To see if that is the case, the next graph shows the defensive production of the three relegated clubs alongside the three clubs immediately above them that barely beat the drop (Blackburn, Wigan, and Wolves).

As we saw already, the left side of the graph shows that the clubs that went down got worse defensively after Week 12. In stark contrast, the clubs that barely beat the drop actually improved defensively as a group in the second half of the season – their goals-conceded trends point down, not up. So while the relegated clubs and their most direct competitors looked fairly similar at the season’s halfway point, their performance profiles diverged radically in the second half of the year.
So what do we make of these results? As always, a major caveat is that these are data for only one season, and for the sake of simplicity we are looking only at goals scored and conceded rather than more finely grained data. But if we take the results at face value, they tell us that clubs’ performance profiles undergo significant change over the course of a season – and that playing well and being safe in December may have little to do with a club’s fortunes come May. In fact, some trends start taking shape about a third of the way into the season. Just as importantly, the data clearly show that there is significant room for improvement (and deterioration) in the second half of the season, so improvement over the course of the year – and especially after the January transfer window – is likely to be key to a successful drive to stave off relegation. Finally, looking carefully at the data for the clubs relegated in 2010-11, the deterioration in their offensive performance was much smaller than the deterioration in their defensive performance – it surely looks as though poor defence is what has West Ham, Blackpool, and Birmingham fighting for a chance to rejoin the Premiership this year.
