Data analytics in sports is not a new concept. In fact, it was almost 17 years ago that the 2002 Oakland A’s made history by winning 20 consecutive games, in what is still one of the most compelling stories in sports analytics. When the 2011 movie about this feat came out, ‘Moneyball’ became a household phrase, and these days you’d be hard pressed to find a sports fan who doesn’t know the batting averages or 3rd down percentages of their favourite hitters and QBs.
Professional sports, especially in America, have taken to using data to influence almost every decision made on and off the field. The use of data in this context is always about marginal gains: how can teams eke out any semblance of an advantage over the next team? Alex Cora did just this in 2018 with the Red Sox. Cora is known for combining people skills with an acute aptitude for analytics, a blend that some have dubbed ‘Coralytics’. His analytics-based approach helped the Red Sox win the 2018 World Series title, and made Cora just one of 5 rookie managers to achieve such a feat.
But recently concerns have been raised that analytics are changing the nature of sport completely, and many worry that we are losing the essence of what sport used to be: a competition based on athletic capabilities, not tactical ones. It’s a weak argument, as tactics have always been involved in sport. The only difference is that analytics allows a numeric value to be applied to intuition. That being said, there is no doubt that sports are changing. The NBA is a perfect example of this. Teams now employ more data analysts than ever before, and their contribution has given rise to the evolution of the three point game. Despite a three point attempt having only a 35% chance of going in, basic math showed managers and players that the attempt alone was worth it because the payout was better: three points instead of two. This has led to the average number of 3 point attempts increasing rapidly over the last number of years, and has changed the tactics employed from the scouting process right through to the on-court performances.
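The ‘basic math’ behind the three point shift can be sketched in a few lines. This is only an illustration: the 35% figure is the one quoted above, while the function name and the break-even calculation are my own additions rather than anything a team has published.

```python
def expected_points(make_probability: float, points: int) -> float:
    """Average points scored per attempt for a given shot type."""
    return make_probability * points


# 35% is the three point success rate quoted in the article.
three_point_value = expected_points(0.35, 3)

# A two point shot pays out 2, so it has to go in this often
# just to match the value of the three point attempt.
break_even_two_point_rate = three_point_value / 2

print(f"Three point attempt: {three_point_value:.2f} points per shot")
print(f"Two point success rate needed to match it: {break_even_two_point_rate:.1%}")
```

In other words, a 35% three point shooter averages 1.05 points per attempt, and a two point shot has to drop 52.5% of the time just to keep pace.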
But is data really good for sports? Does the analysis so fundamentally change the sport that it is almost unrecognisable? Do we even care if our favourite sport morphs over time?
Well, in the equestrian sport of eventing, the introduction of data analytics is having the opposite effect. It is being used to protect the sport in its current form, and this is being done by turning the traditional application of analytics on its head.
In society and sport we are always focused on winning. We want to know who will be the first, the best, the gold medalist. However, analytics in eventing is showing us how valuable it is to know who will be at the bottom of the pack.
Eventing is a three phase competition comprising dressage, cross country and showjumping. It is an all round test of the horse and rider, and when everything goes right it is a dream to watch.
But it is a sport, and as in every sport, it doesn’t always go to plan…
So what’s the problem? Well, in eventing we have a unique risk called a horse fall. Inherent in this is the possibility that the horse could fall on top of the rider, and when a 1,000 lb+ horse falls on top of a human, serious injury, or worse, is never too far away.
Because of this risk, eventing is often classified as one of the most dangerous sports in the world, and a 2005-2015 report by the FEI (the world equestrian governing body) showed just how accurate this classification can be. The study showed that eventing had a fall rate of 1 in every 18 starters across all 4 levels*. It showed that the rate of horse falls was roughly 1 in 60 (a 1.68% chance), and that 1 in every 505 starters suffered a serious or fatal injury as a result of their fall. That is to say that, in theory, every competitor starts a competition with a 0.2% chance of having a serious or fatal injury.
Now a 0.2% chance might not seem like a high number. However, just to put it in context, in Major League Baseball there are 2,430 regular season games. If we take 9 players on each team per game, a 0.2% chance would mean that around 87 players (starters) would leave the league every year due to a serious, career ending injury or fatality. That’s almost one third of the starting nine of each team, and in reality this would be higher if you took substitutions into account.
Now obviously the two sports are very different and comparing apples to oranges is not exactly fair, but I think the example helps you understand how extreme the level of risk is when compared to other sports. And while the FEI have introduced a number of safety measures to reduce the risks, the broader question of how to manage the risk without inherently changing the nature of the sport has remained a challenge for them.
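For anyone who wants to check the baseball comparison, here is the arithmetic as a short sketch. The 2,430 games, the 9 starters and the 0.2% rate come from the figures above. The 30 team count is my addition (it is the current size of the league), and the variable names are purely illustrative.

```python
REGULAR_SEASON_GAMES = 2430   # figure quoted in the article
STARTERS_PER_TEAM = 9         # figure quoted in the article
TEAMS_PER_GAME = 2
MLB_TEAMS = 30                # not in the article: current number of MLB teams
SERIOUS_INJURY_RATE = 0.002   # the 0.2% chance, rounded from 1 in 505


def one_in_n_to_percent(n: float) -> float:
    """Convert a '1 in n' rate into a percentage."""
    return 100 / n


# Every starter in every game counts as one 'start', like an eventing starter.
player_starts = REGULAR_SEASON_GAMES * STARTERS_PER_TEAM * TEAMS_PER_GAME
expected_serious_injuries = player_starts * SERIOUS_INJURY_RATE
injuries_per_team = expected_serious_injuries / MLB_TEAMS
share_of_starting_nine = injuries_per_team / STARTERS_PER_TEAM

print(f"1 in 505 as a percentage: {one_in_n_to_percent(505):.2f}%")
print(f"Player starts per season: {player_starts}")
print(f"Expected serious injuries: {expected_serious_injuries:.0f}")
print(f"Share of each starting nine: {share_of_starting_nine:.0%}")
```

That gives 43,740 player starts and roughly 87 serious injuries a season, or just under three starters per team.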
One Irish company, EquiRatings, have found one way to solve this problem, and they have done so, almost accidentally, using data analytics.
The company started off as a personal project for Sam Watson, an Irish event rider, who wanted to use data analytics to assess his own performance. This initial analysis morphed into an analysis of the sport as a whole, again originally so that he and his co-founder could figure out who the winners were likely to be. As it turns out, figuring out who was likely to win an event wasn’t too difficult, but every spectator on the ground could also figure it out. It was something that intuition and data could do to a fairly equal level. What their data could tell them that intuition could not was who was likely to be at the bottom of the pack, and who was likely to have a fall.
From their data they found that 40% of the horse falls in the top levels of eventing came from just 10% of the population. This finding was what sparked their idea for the EquiRatings Quality Index, a measurement of your probability of succeeding at a given event. The ERQI uses your past performance to measure the risk that you bring into each event, and works as a traffic light system. Green means that you are comfortable at a given level, amber means that you have had a bad result or two but you are not a high risk yet, and red means that you must move down to the next level because you are in the high risk zone.
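To make the traffic light idea concrete, here is a rough sketch of how a rating like this could be turned into a colour. To be clear, this is not the real ERQI formula, which I haven’t described here. The score (the share of past rounds at a level completed without trouble), the two cut-offs and the function names are all hypothetical and chosen only for illustration.

```python
from typing import Sequence

# Hypothetical cut-offs, picked only for this illustration.
GREEN_THRESHOLD = 0.8
RED_THRESHOLD = 0.5


def illustrative_score(past_rounds_clear: Sequence[bool]) -> float:
    """Share of past rounds at this level that went to plan.

    A stand-in for 'past performance', not the actual ERQI calculation.
    """
    if not past_rounds_clear:
        raise ValueError("need at least one past result")
    return sum(past_rounds_clear) / len(past_rounds_clear)


def traffic_light(score: float) -> str:
    """Map a 0-1 score onto the green / amber / red bands."""
    if score >= GREEN_THRESHOLD:
        return "green"   # comfortable at this level
    if score >= RED_THRESHOLD:
        return "amber"   # a bad result or two, not yet high risk
    return "red"         # high risk zone: move down a level


# Hypothetical rider with one bad round out of four at this level.
recent_rounds = [True, True, False, True]
print(traffic_light(illustrative_score(recent_rounds)))
```

The point of the bands is that a rider sees a single colour rather than a raw number, which makes the message hard to ignore.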
The ERQI was introduced as an experiment with Eventing Ireland, the Irish governing body for eventing, in 2015. In its beta format riders were shown their ERQI but it was not enforced (that is, riders on red for a given level could still enter a competition at that same level). In just one season at 2* level (the highest risk level in Ireland), only 1.5% of people were given a red ERQI, yet there was a 66% reduction in the number of falls.
So why was that?
The athlete in all of us wants to push the boundaries and believe that we are the underdog, but the analyst in us knows better. Prior to the ERQI, athletes were unaware of the risk that they were bringing into each competition. The ERQI cast a mindset of accountability upon the athletes. Riders became conscious of the fact that their performance was being tracked and so they took personal responsibility to move down a level if it looked like they might be close to getting a red rating. Riders didn’t want to be that one athlete that was on red. EquiRatings used data to apply a numeric value to intuition, an intuition that athletes had tended to ignore in favour of positive thinking beforehand.
Since 2015, the ERQI has been enforced in Ireland as a standard with extremely positive results, and a number of countries, including the US, have begun to introduce the standard across their national competitions.
Now we all know that accidents can happen, and so no matter what, eventing will always be a high risk sport. However, the risk that each individual brings into competitions is being lowered, and as such the overall risk is reduced.
So, is data really good for sports?
Yes, data changes sports, but evolution is natural, and I would argue that sports would have morphed with or without the introduction of analytics. At this point analytics is so ingrained in the on and off court performances of every team, that it’s probably not going anywhere anytime soon, and I think that applications like those used in eventing show us that in certain sports data analytics can not only be good for the sport, but also vital to its protection in the long term.
Tell me what you think. Are you for or against data in sports? Do you have any concerns for the future of sport?
*There are four levels of international competition in eventing, ranging from 1 star through to 4 star. 4 star events contain cross country fences that are up to 1.20 meters in height with a spread of 2 meters.