A Statistical 2010 Premier League Season Preview, Part 5
Update: Table was showing Arsenal's and Manchester City's names swapped, for some reason. I've fixed it.
I think we're just about coming to the end of our little sojourn, and so we're ready to publish some results. Let's first go back over what we've learned.
- We can predict goals scored and goals allowed based on the previous season's results.
- The amount of money a club spends in the transfer market affects their expected goal differential in the season following.
- The goal differential for newly promoted sides can also be projected using their Championship statistics, but a new variable must be introduced to take into account the ease of the competition. In our case, it's -43 from the expected differential.
- We can convert from expected goal differential to expected points relatively easily.
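As a rough illustration of how those four steps chain together, here is a minimal Python sketch. Only the -43 promoted-side offset comes from this series; the intercept and slope of the points conversion, the function names, and the `spending_effect` parameter are hypothetical placeholders, not the fitted values.

```python
# Sketch of the projection chain. Only PROMOTED_ADJUSTMENT is taken from the
# article; the two points coefficients are assumed placeholders.
PROMOTED_ADJUSTMENT = -43   # offset applied to Championship-based projections
POINTS_INTERCEPT = 52.0     # assumed: points for a side with zero goal differential
POINTS_PER_GOAL = 0.65      # assumed: extra points per goal of differential


def expected_goal_differential(base_gd, spending_effect=0.0, promoted=False):
    """Combine last season's projection, transfer spending, and promotion."""
    gd = base_gd + spending_effect
    if promoted:
        gd += PROMOTED_ADJUSTMENT
    return gd


def expected_points(gd):
    """Linear conversion from expected goal differential to expected points."""
    return POINTS_INTERCEPT + POINTS_PER_GOAL * gd


# A promoted side projected at +30 from its Championship numbers.
promoted_gd = expected_goal_differential(30, promoted=True)
print(promoted_gd, expected_points(promoted_gd))
```

The conversion is written as a straight line purely for simplicity; any monotonic fit of points against goal differential would slot into `expected_points` the same way.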
This leads us towards a table that lists expected goals scored, goals conceded, goal differential, and points. Since we dealt with the goal differential directly during steps 2 and 3, I had to manually adjust the expected goals scored and conceded by one or two goals to get things to match up correctly. If I had thought about this properly earlier, it wouldn't have been necessary, but such is life. Bear in mind, of course, that the table below is full of projections - I've already discussed the sort of errors we get when back-checking - and there are a lot of things that can change over the course of the season. But it's pretty informative anyway (I hope!).
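The adjustment of goals scored and conceded was done by hand, but one mechanical way to do it might look like the sketch below. The even split of the gap between the two columns is an assumption, and `reconcile_goals` is a hypothetical helper name; the input numbers are made up for illustration.

```python
def reconcile_goals(scored, conceded, target_gd):
    """Shift scored and conceded so their difference equals target_gd.

    The gap is split evenly between the two columns (an assumed rule,
    not necessarily how the manual adjustment was made).
    """
    gap = target_gd - (scored - conceded)
    return scored + gap / 2, conceded - gap / 2


# Illustrative numbers: 70 scored and 40 conceded, but an adjusted
# differential of +33 rather than +30.
print(reconcile_goals(70, 40, 33))
```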

Figure 1: Projected 2010/2011 Premier League Table.
Note that, try as I might, there's no way to project total wins, draws, and losses from the data - a given points total can be reached through several different records, the projection gives no basis for picking one over another, and I didn't feel like simply making numbers up at this stage.
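To illustrate why a points total alone doesn't pin down a record, this short Python sketch lists every win/draw/loss combination that produces a given total over a 38-match season (3 points for a win, 1 for a draw). The function name is just for illustration.

```python
def records_for_points(points, matches=38):
    """Return every (wins, draws, losses) that yields `points` over `matches`."""
    records = []
    for wins in range(matches + 1):
        draws = points - 3 * wins          # points not covered by wins must be draws
        losses = matches - wins - draws    # whatever matches remain are losses
        if draws >= 0 and losses >= 0:
            records.append((wins, draws, losses))
    return records


# Every record consistent with Chelsea's projected 85 points.
for record in records_for_points(85):
    print(record)
```

For 85 points this gives five candidate records, from 24 wins, 13 draws and 1 loss up to 28 wins, 1 draw and 9 losses, and nothing in the goal differential projection says which of them to prefer.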
We have Chelsea winning the league with 85 points, a four-point margin over arch-rivals Manchester United. This gap may well be even closer due to the age of Chelsea's squad, which is a question to be tackled at some later date. The big four looks to have been well and truly broken up, with Manchester City rising to third and keeping Liverpool down all the way at 6th place. Fulham are the biggest risers, going from 12th last year tonight, but at the other end of the table long-time Premier League clubs Wigan Athletic and Bolton face the drop, with Wigan in particular looking to be in a precarious position. If you're a fan of one of the clubs shown relegated here, don't worry - this doesn't condemn your team any more than it guarantees another title for Stamford Bridge. Every year teams randomly (yes, randomly!) collapse or do inexplicably well, and no amount of statistical work will ever be able to take the place of the long slog of the actual Premier League season.
Comments
Nice job
I think the work you put into this should be commended. Nice job on the lead in posts and I like this post. I was kind of shocked how high Newcastle was but it is definitely possible.
Thanks!
I wasn’t really sure what to expect coming in, but everything looks plausible enough.
by Graham MacAree on Aug 14, 2010 12:35 AM BST
I tried to copy your brilliance
Except analyzing a player instead. I didn’t do you justice, but you might as well see my watered-down, incorrect analysis. http://leicester.theoffside.com/football-101/season-preview-our-goalkeepers.html
Chime in too if you have any suggestions for me (I was really only working with one variable so my work was nothing)?
um, please visit my soccer (football) blog. it's interesting, I promise. por favor? (filbertway.com)
Sunshine will come to Nats Park, I promise. (visit por favor? my website)
I thought the spider graphs were a really nice way of doing things
Obviously you’re measuring team defence in games Weale played in rather than Weale himself, or so it seemed to me. Do you have data for Leicester’s backup goalie in league play? A comment on graph labelling too: what does the first graph with Leicester and the other teams represent? Is it the score?
I also think that not playing football manager gives me a distinct disadvantage in wording. Took me a while to pick up some of your terminology.
by Graham MacAree on Aug 14, 2010 4:00 AM BST
He featured in essentially every match except one league match and a Carling Cup affair or two
I put this together hastily so I have to clean up the wording.
The first graph was just supposed to show his +/- (i.e. goal differential while he was on the pitch, like hockey) compared to a baseline league-average +/-. I had a really hard time trying to find advanced keeper stats for the Championship level (I’m not even sure that they are readily available for the Premiership), so I could only really weight things by team defense.
I figure I’ll take a look at the defenders individually with their +/-s and maybe that’ll reveal something. Some of the thoughts on Weale could probably be improved with further analysis on the other parts of the defense, while some of it could be helped with sample size. (Somewhat problematic though b/c last season was really his first full season of play with one team.)
I guess I’ll re-read it and work on it tomorrow. Thanks for your thoughts, thought I’d give the soccer stats thing a try.
I liked the post a lot
Was a good first foray into stats. I’m kind of stumbling around trying to figure things out too, so your guesses are as good as mine. I’ve never been big on looking at the big picture and trying to isolate in baseball, so hey, new ground for everyone.
Looking forward to reading the rest of what you have to say.
by Graham MacAree on Aug 14, 2010 4:12 AM BST
I cleaned it up a bit, haha
It’s an interesting endeavour to look at, to say the least, especially considering the lack of relevant counting stats out there (although the passes completed in certain action zones could be something, just a thought I had right now) even in the EPL.
I really really enjoy your articles though, keep it up and you’ll end up being a pioneer in football statistics too! And the match tomorrow will be interesting.
West Brom has some good pieces here, and I wonder how they put it together.
Yeah
I mean, I know how to measure game state, win expectancy, etc – how to do things properly instead of this weird roundabout method I’m suddenly stuck on. I just have nowhere near the data to be able to see it all come to life. I assume it’ll come along eventually.
by Graham MacAree on Aug 14, 2010 4:43 AM BST
Loved this series
How might we add in an age factor? Wouldn’t have to be complicated – something akin to what Tango does with Marcel. I’m guessing it would take a ton of work on age/career arcs that hasn’t been done.
by marc w on Aug 14, 2010 5:28 AM BST via mobile
I'd do it in a similar fashion to the way I worked transfer spending in
See what sort of relationship I can grab between team age and the error between GD’ and GD.
by Graham MacAree on Aug 14, 2010 2:54 PM BST
Mildly
We see an increase in accuracy, meaning that we get closer to the average actual results, but we also see a small decrease in precision, with standard deviation on projected goal differential going up by one goal. The tradeoff works for me.
by Graham MacAree on Aug 15, 2010 5:44 PM BST
Way late, but I wonder if it'd be better to do this by position
or if that’d just make it way too noisy to be of use.
I thought about this already
We really want to be separating out goalkeepers, at least
by Graham MacAree on Aug 17, 2010 2:59 AM BST
















