analytics - Recent posts
Viewing all posts which authors have tagged ‘analytics’.
I'm pleased to announce that I've joined StatDNA as Vice President of Analytics and Software
Development. This is a super exciting opportunity for me as I'll be combining my loves of
software development, data analysis and soccer. What could be better? I'll hopefully have some
blog posts up for StatDNA over at their blog soon using their best-in-breed data.
There was an interesting article this morning on Soccernet about Robin Van Persie being in the
"injury red zone". Hyperbole aside, it raises the point that Arsenal have had the luxury of
playing Van Persie in every league match so far (starting 12 of 13) but will have to manage his
workload a little more conservatively or risk a decrease in performance or potential injury.
I'm always on the lookout for new ways to visualize data in the hopes that it might lead to a
better understanding of the data. In the first leg of the tie between Real Salt Lake and Seattle
Sounders FC, the Sounders midfield was completely MIA for large portions of the game while RSL
enjoyed large periods of maintaining possession.
There has been lots of talk about the goal glut that is happening in the Premier League right
now. Are pricey strikers to blame or is it the death of quality defense? Decision Technology's
Ian Graham has already taken a look at debunking the Guardian's piece on the "goal glut". I
thought I'd add my two cents.
It was rough being a Sounders fan last night. Amidst discussions of a CONCACAF Champions
League curse, playing at altitude and missing one of their best players of the season in Mauro
Rosales, the Sounders had a tough playoff matchup against Real Salt Lake. While most fans would
have been surprised if the Sounders had come away with a first leg lead, going down 3-0 was a bit
of a shock.
During tonight's MLS Playoff match between the New York Red Bulls and FC Dallas, the "Curse of
CONCACAF Champions League" was brought up. FC Dallas has had to play more matches than NYRB this
season and came into the match looking a bit fatigued. Since the CONCACAF version isn't as
lucrative as the European version, it is getting a reputation for being a drain on teams.
Youth soccer in Southern California continues to evolve, lead, pioneer and, at least in my
opinion, host the greatest density of talent in the country. It's like a mecca for the youth
game.
Now we too hope to contribute to this landscape on a macro scale and push the envelope even
further!
Previously I've written about examining conversion rates and shots as a way of examining which
areas an offense or defense excels at or is struggling with. Shots can be a crude estimate of
opportunities, and conversion rate an estimate of how well a team executes on those
opportunities. I had looked at offense and defense separately in the past, but decided to combine
the two to see if any interesting patterns emerged.
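As a rough illustration of the two measures discussed above, here is a minimal Python sketch. It assumes conversion rate means goals divided by shots, which the excerpt does not spell out. The `TeamTotals` type, the `profile` helper and all of the numbers are made up for the example and are not taken from the original post.

```python
from dataclasses import dataclass


@dataclass
class TeamTotals:
    """Season totals for one team (hypothetical structure)."""
    name: str
    shots_for: int
    goals_for: int
    shots_against: int
    goals_against: int


def conversion_rate(goals: int, shots: int) -> float:
    """Goals per shot; returns 0.0 when no shots were taken."""
    return goals / shots if shots else 0.0


def profile(team: TeamTotals) -> dict:
    """Combine the offensive and defensive views of a team in one record."""
    return {
        "team": team.name,
        # Shots as a crude estimate of opportunities created and conceded.
        "shots_for": team.shots_for,
        "shots_against": team.shots_against,
        # Conversion rate as an estimate of execution on those opportunities.
        "off_conversion": conversion_rate(team.goals_for, team.shots_for),
        "def_conversion": conversion_rate(team.goals_against, team.shots_against),
    }


# Made-up totals, for illustration only.
example = TeamTotals("Example FC", shots_for=120, goals_for=15,
                     shots_against=100, goals_against=10)
example_profile = profile(example)
```

A team that looks good here takes many shots and converts them at a high rate while allowing few shots and a low conversion rate against.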
As I've previously posted, I had the chance to speak at the New England Symposium on Statistics
in Sports. They've now posted the videos and slides from all the presentations. I've posted my
video below as well as the slides and original blog post so that all the content is in one place.
Originally I wanted to title my talk "Cool Shit You Can Do With Markov Chains in Soccer" but toned
it down a bit to "A framework for tactical analysis and individual offensive production assessment
in soccer using Markov chains".
This weekend I had the privilege of speaking at the New England Symposium on Statistics in
Sports. It is a much more technical conference than the Sloan Sports Analytics Conference so I
felt a bit like a duck out of water given my background in computer science and not hardcore
statistical methods (and these guys were hardcore!)
Earlier this week I had the pleasure of sitting down with Ravi and Elisa from Forza Futbol to
record a podcast about Sports Analytics. We discuss a whole host of topics including some of the
work I haven't had a chance to publish yet involving looking at the quality of cantera products
(using the Castrol Index).
I am thrilled to announce that I will be speaking at this year's New England Symposium on
Statistics in Sports (NESSIS) on September 24th. Earlier this year, StatDNA announced a Soccer
Analytics research competition and my paper was selected as the winning entry. I'll be giving a
talk titled "A framework for tactical analysis and individual offensive production assessment in
soccer using Markov chains".
Map of birthplaces for players currently in La Liga (excluding Spain)
This is the second post in our series on the origins of players; previously we looked at the
Premier League. La Liga is a little different from the Premier League, both in terms of
infrastructure ("B" squads can compete in lower divisions) and culturally (Athletic Bilbao has a
policy to only sign players from the Basque region).
Going global: Birthplaces of Premier League players (excluding UK).
Another summer is on its way out with Arsenal barely making a splash in the transfer market.
Once again it looks like Arsenal will be relying on youth this season. It got me thinking: are
Arsenal really good at producing players from their youth academy who are capable of playing in the
Premier League?
When I took this blog semi-pro last October, I planned to be transparent about the site's
finances. I view it as a way to facilitate a dialog with readers and encourage their support. I
figure if my readers think I have my financial house in order, they'll be more likely to purchase
subscriptions.
Soccer Quants Finding New Numbers To Judge Quality
Much is still hidden by the clubs, as they don't want to give up advantages, but correlations
between sprints and points in the table have been discovered. The importance of tackles has been
reduced.
All in all, you can find many nuggets, but not much meat, in this large post on the developments
in soccer analytics over the past decade.
It's transfer season which means loads of rumors flying around the press and loads of bored fans
searching the internet for info on their club's possible new signings. Since Google Labs recently
released Google Correlate, I figured now was a good time to see if any insights could be gained
using it to investigate transfer rumors.
Major League Soccer has a reputation for being a tough, physical league. What it lacks in
talent and technique it makes up for with speed and strength. Often heard in discussions of the
league is the belief that teams need a target forward to win: someone to launch crosses at, to get
physical with center backs and hopefully to score off set pieces.
There's been discussion on some of the Sounders FC boards lately about whether or not Sigi
Schmid should be fired. One of the common defenses that comes up is that he's won 2 MLS Cups and
therefore is a good manager. The question is, in a league based on parity with a playoff system
that almost everyone qualifies for, how hard is it to win an MLS Cup?
Earlier this week the Guardian Data Blog published financial data for the English Premier
League. They chose to go with the obvious story about how in debt the clubs are (and it's true, it
doesn't look good) while others have decided to look at the relationship between wins and wages
(also true, there is a correlation between the two).
Soccer By the Numbers has a great post on the value of corners which raised an interesting point
about the importance of different statistical measures. One of the problems with trying to build
regression models for soccer is that few of the variables are independent. For example, if you
want to look at the effect corners and shots have on wins, it gets a bit tricky.
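To make the independence problem concrete, one simple check is to measure how strongly two candidate predictors move together before putting both into a regression. The sketch below computes a Pearson correlation by hand. The per-match corner and shot counts are invented for illustration and are not real data from either blog.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up per-match counts for a single team, for illustration only.
corners = [3, 5, 8, 4, 10, 6]
shots = [8, 11, 16, 10, 19, 14]

r = pearson(corners, shots)
```

When `r` is close to 1, as it is for these invented numbers, corners and shots carry much of the same information, so a regression on wins cannot cleanly attribute an effect to either one.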
Similar to the graph showing the offensive production of teams, this graph shows their defensive
production. I've reversed the direction of the axes so that the upper right quadrant in both
graphs is the preferred location for teams. A key difference between the two is that I haven't
generated a model to determine estimated points based on defensive production, so both color and
bubble size indicate a team's actual points.
Most teams have now played 7 matches and should be settling into what their offensive production
will look like for the rest of the season. What is odd is that there are very few teams in the
magic quadrant. Previously we had seen the best teams in the upper right quadrant, but currently
only Sporting KC is firmly established there, with Chicago, Vancouver and the NY Red Bulls on the
border.
Last week local soccer stats blogger Zach Slaton wrote an article on some work he did that
showed that single-game shot differential in the EPL is actually negatively correlated with match
outcomes. Given the work we've done here related to evaluating teams based on shot differential,
this is obviously of some interest.
This weekend saw MLS lose two of its most talented players to broken legs that were the results
of two awful challenges. As someone who watches a lot of Arsenal matches, this is an all too
familiar story. My initial reaction was that the refs let too much go both in the EPL and MLS.
In MLS I've seen players get away with some awful tackles without a booking and oftentimes without
a foul even being called.
Continuing the theme of looking at the breakdown of Castrol Performance Index Contributions
based on nationalities, I wanted to see how Spain and England compare to each other, particularly
when it comes to domestic players.
Percentage of Castrol Performance Index Contributions by Nationality for English Premier
League
Percentage of Castrol Performance Index Contributions by Nationality for La Liga
In England, Birmingham has the highest percentage of CPI contributions by domestic players with
58.
Previously I posted the number of games played based on nationality for the top 4 clubs in the
EPL. Since the Castrol Performance Index was updated today, I thought it would be interesting to
see how the players from each country contribute to a team's performance.
What's interesting is that while Manchester City led the way in games played by Englishmen,
Manchester United leads the way in terms of Castrol Performance impact.
Michael Moritz wrote a piece for the Wall Street Journal (subscription required) earlier this
week drawing a parallel between open immigration in the English Premier League and how it could
be beneficial to the American tech start-up scene. What Moritz failed to point out (but other
readers noticed) is that the EPL isn't completely open.
Previously we've looked at which teams tend to do well in the MLS Superdraft so I thought it
would be fun to look at the relation between managers and the universities that are developing the
players. I'm using percentage minutes played as a proxy for the quality of the draft pick. I've
restricted the data to draft picks from the first two rounds from the last 5 years.
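The filtering and the proxy described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the excerpt does not define "percentage minutes played", so the code assumes it is minutes played divided by the minutes the player could have played. The `DraftPick` type, its field names and the sample picks are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DraftPick:
    """One draft pick (hypothetical structure and field names)."""
    player: str
    university: str
    draft_year: int
    draft_round: int
    minutes_played: int
    minutes_available: int  # assumed denominator for the percentage


def pct_minutes(pick: DraftPick) -> float:
    """Percentage of available minutes played, used as a quality proxy."""
    if pick.minutes_available == 0:
        return 0.0
    return 100.0 * pick.minutes_played / pick.minutes_available


def eligible(picks, latest_year, years=5, max_round=2):
    """Keep picks from the first `max_round` rounds of the last `years` drafts."""
    first_year = latest_year - years + 1
    return [p for p in picks
            if p.draft_round <= max_round
            and first_year <= p.draft_year <= latest_year]


def average_by_university(picks):
    """Mean percentage of minutes played per university."""
    grouped = {}
    for p in picks:
        grouped.setdefault(p.university, []).append(pct_minutes(p))
    return {uni: sum(vals) / len(vals) for uni, vals in grouped.items()}


# Made-up picks, for illustration only.
sample = [
    DraftPick("Player A", "University X", 2010, 1, 1350, 2700),
    DraftPick("Player B", "University X", 2009, 2, 2700, 2700),
    DraftPick("Player C", "University Y", 2011, 3, 900, 2700),   # round 3: excluded
    DraftPick("Player D", "University Y", 2004, 1, 2000, 2700),  # too old: excluded
]

kept = eligible(sample, latest_year=2011)
averages = average_by_university(kept)
```

Grouping by manager instead of university would only require swapping the grouping key, assuming a manager field were added to the record.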
I've gotten lots of reactions, of many different kinds, to my guest post on the New York
Times Goal soccer blog. So I thought I'd say a few things about the issues raised by people who
care enough to comment.
First of all, thanks to everyone for reading and going to the trouble to write in, either on the
Times comments section or to me personally.