Is data analytics in sport a good thing?

Data analytics in sports is not a new concept. In fact, it was almost 17 years ago that the 2002 Oakland A’s made history by winning 20 consecutive games, in what is still one of the most compelling stories in sports analytics. When the 2011 movie about this feat came out, ‘Moneyball’ became a household phrase, and these days you’d be hard pressed to find a sports fan who doesn’t know the batting averages or 3rd down percentages of their favourite hitters and QBs.

Professional sports, especially in America, have taken to using data to influence almost every decision made on and off the field. The use of data in this context is always about marginal gains. How can teams eke out any semblance of an advantage over the next team? Alex Cora did just this in 2018 with the Red Sox. Cora is known for combining people skills with an acute aptitude for analytics in what some have dubbed ‘Coralytics’. His analytics-based approach helped bring the Red Sox the 2018 World Series title, and made Cora one of just 5 rookie managers to achieve such a feat.

But recently concerns have been raised that analytics are changing the nature of sport completely, and many worry that we are losing the essence of what sports used to be: a competition based on athletic capabilities, not tactical ones. It’s a weak argument, as tactics have always been involved in sport. The only difference is that analytics allows a numeric value to be applied to intuition. That being said, there is no doubt that sports are changing. The NBA is a perfect example of this. Teams now employ more data analysts than ever before, and their contribution has given rise to the evolution of the three-point game. Even though a three-point shot only has about a 35% chance of going in, basic math showed managers and players that the attempt was worth it because the expected payout was better. This has led to the average number of 3-point attempts increasing rapidly over the last number of years, and has changed the tactics employed from the scouting process right through to the on-court performances.
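The ‘basic math’ here is an expected-value comparison. As a rough sketch (the function and variable names are my own, and the 35% make rate is the only figure taken from the text), the points you can expect per attempt are simply the chance of making the shot multiplied by what the shot is worth:

```python
def expected_points(make_probability: float, shot_value: int) -> float:
    """Average points scored per attempt for a given type of shot."""
    return make_probability * shot_value


# 35% is the three-point make rate quoted in the post.
three_point_ev = expected_points(0.35, 3)

# The make rate a two-point shot would need just to match that return.
break_even_two_point_rate = three_point_ev / 2

print(f"Expected points per 3-point attempt: {three_point_ev:.2f}")
print(f"Break-even 2-point make rate: {break_even_two_point_rate:.1%}")
```

In other words, a 35% three-point shot returns about 1.05 points per attempt, so a two-point shot has to go in more than roughly 52.5% of the time to be the better option. This ignores free throws, rebounds and turnovers, so treat it as an illustration rather than a full model.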


But is data really good for sports? Does the analysis so fundamentally change the sport that it is almost unrecognisable? Do we even care if our favourite sport morphs over time?

Well, in the equestrian sport of eventing, the introduction of data analytics is having the opposite effect. It is being used to protect the sport in its current form, and this is being done by turning the traditional application of analytics on its head.

In society and sport we are always focused on winning. We want to know who will be the first, the best, the gold medalist. However, analytics in eventing is showing us how valuable it is to know who will be at the bottom of the pack.

Eventing is a three-phase competition comprising dressage, cross country and showjumping. It is an all-round test of the horse and rider, and when everything goes right it is a dream to watch.

But it is a sport, and as in every sport, it doesn’t always go to plan…

So what’s the problem? Well, in eventing we have a unique risk called a horse fall. Inherent in this is the possibility that the horse could fall on top of the rider, and when a 1,000+ lb horse falls on top of a human, serious injury, or worse, is never too far away.

Because of this risk, eventing is often classified as one of the most dangerous sports in the world, and a 2005-2015 report by the FEI (the world governing body for equestrian sport) showed just how accurate this classification can be. The study showed that eventing had a fall rate of 1 in every 18 starters across all 4 levels*. It showed that the rate of horse falls was roughly 1 in 60 (a 1.68% chance), and that 1 in every 505 starters suffered a serious or fatal injury caused by their fall. That is to say that, in theory, every competitor starts a competition with a 0.2% chance of having a serious or fatal injury.

Now a 0.2% chance might not seem like a high number. To put it in context, Major League Baseball plays 2,430 regular season games. If we take 9 starters on each team per game, a 0.2% chance would mean that around 87 players would leave the league every year due to a serious, career-ending injury or fatality. That’s almost one third of the starting nine of each team, and in reality this would be higher if you took substitutions into account.
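For anyone who wants to check the arithmetic, here is a minimal sketch of that back-of-the-envelope calculation. The constant names are my own, and the count of 30 MLB teams is something I have added to get the per-team figure; everything else comes from the numbers above.

```python
REGULAR_SEASON_GAMES = 2_430   # MLB regular season games, from the post
TEAMS_PER_GAME = 2
STARTERS_PER_TEAM = 9          # simplification: ignores substitutes
SERIOUS_INJURY_RATE = 0.002    # 0.2%, rounded from 1 in 505 starters
MLB_TEAMS = 30                 # assumption added here, not stated in the post

# Every game produces 18 individual "starts".
player_starts = REGULAR_SEASON_GAMES * TEAMS_PER_GAME * STARTERS_PER_TEAM

# Serious or fatal injuries expected if each start carried eventing's risk.
expected_injuries = player_starts * SERIOUS_INJURY_RATE

# Spread evenly across the league, then compared with a starting nine.
injuries_per_team = expected_injuries / MLB_TEAMS
share_of_starting_nine = injuries_per_team / STARTERS_PER_TEAM

print(f"Player starts per season: {player_starts:,}")
print(f"Expected serious injuries: {expected_injuries:.1f}")
print(f"Share of each starting nine: {share_of_starting_nine:.0%}")
```

That works out to 43,740 starts, a little over 87 expected injuries, and just under a third of each team’s starting nine.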

Now obviously the two sports are very different, and comparing apples to oranges is not exactly fair, but I think the example helps you understand how extreme the level of risk is when compared to other sports. And while the FEI have introduced a number of safety measures to reduce the risks, the broader question of how to manage the risk without inherently changing the nature of the sport has remained a challenge for them.

One Irish company, EquiRatings, have found one way to solve this problem, and they have done so, almost accidentally, using data analytics.

The company started off as a personal project for Sam Watson, an Irish event rider, who wanted to use data analytics to assess his own performance. This initial analysis morphed into analysis of the sport as a whole, again originally so that he and his co-founder could figure out who the winners were likely to be. As it turns out, figuring out who was likely to win an event wasn’t too difficult, but every spectator on the ground could figure it out too. It was something that intuition and data could do about equally well. What their data could tell them that intuition could not was who was likely to be at the bottom of the pack, and who was likely to have a fall.

From their data they found that 40% of the horse falls in the top levels of eventing came from just 10% of the population. This finding was what sparked their idea for the EquiRatings Quality Index, a measurement of your probability of succeeding at a given event. The ERQI uses your past performance to measure the risk that you bring into each event, and works as a traffic light system. Green means that you are comfortable at a given level, amber means that you have had a bad result or two but are not yet a high risk, and red means that you must move down to the next level because you are in the high risk zone.
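To get a feel for how concentrated that risk is, here is a rough sketch of the comparison. The function name is my own, and it assumes everyone in the population competes about equally often, which the figures above do not confirm.

```python
def relative_fall_rate(share_of_falls: float, share_of_population: float) -> float:
    """Fall rate of a group expressed as a multiple of the overall average."""
    return share_of_falls / share_of_population


# 10% of the population accounts for 40% of the horse falls (from the post).
high_risk_vs_average = relative_fall_rate(0.40, 0.10)

# The remaining 90% of the population accounts for the other 60% of falls.
everyone_else_vs_average = relative_fall_rate(0.60, 0.90)

# How much more often the high-risk group falls than everyone else.
high_risk_vs_rest = high_risk_vs_average / everyone_else_vs_average

print(f"High-risk group vs average: {high_risk_vs_average:.1f}x")
print(f"Everyone else vs average: {everyone_else_vs_average:.2f}x")
print(f"High-risk group vs everyone else: {high_risk_vs_rest:.1f}x")
```

Under that assumption, the high-risk 10% fall at about four times the average rate and about six times the rate of everyone else, which is why identifying that small group matters so much.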

The ERQI was introduced as an experiment with Eventing Ireland, the Irish governing body for eventing, in 2015. In its beta format riders were shown their ERQI but it was not enforced (that is, riders on red for a given level could still enter a competition at that same level). In just one season at 2* level (the highest risk level in Ireland), only 1.5% of people were given a red ERQI, yet there was a 66% reduction in the number of falls.

So why was that?

The athlete in all of us wants to push the boundaries and believe that we are the underdog, but the analyst in us knows better. Prior to the ERQI, athletes were unaware of the risk that they were bringing into each competition. The ERQI instilled a mindset of accountability in the athletes. Riders became conscious of the fact that their performance was being tracked, and so they took personal responsibility for moving down a level if it looked like they might be close to getting a red rating. Riders didn’t want to be that one athlete who was on red. EquiRatings used data to apply a numeric value to intuition, an intuition that athletes had previously tended to ignore in favour of positive thinking.

Since 2015, the ERQI has been enforced in Ireland as a standard with extremely positive results, and a number of countries, including the US, have begun to introduce the standard across their national competitions.

Now we all know that accidents can happen, so no matter what, eventing will always be a high-risk sport. However, the risk that each individual brings into competitions is being lowered, and as such the overall risk is reduced.

So, is data really good for sports?

Yes, data changes sports, but evolution is natural, and I would argue that sports would have morphed with or without the introduction of analytics. At this point analytics is so ingrained in the on- and off-court performances of every team that it’s probably not going anywhere anytime soon. Applications like those used in eventing show us that, in certain sports, data analytics can be not only good for the sport but also vital to its protection in the long term.

Tell me what you think. Are you for or against data in sports? Do you have any concerns for the future of sport?

*There are four levels of international competition in eventing, ranging from 1 star through to 4 star. 4 star events contain cross country fences that are up to 1.20 meters in height with a spread of 2 meters.


  1. Nice post. Looking forward to the presentation.

  2. Olivia Crowley

    Before reading this post, I had obviously heard of ‘Moneyball’, but I am not particularly familiar with equestrian sport, so I had no idea about the rest. I find it incredibly interesting. In this context, I agree that the use of data analytics is not only positive but crucial to the sport. When it comes to other sports, however, I’m not sure what to think. On one hand, the idea of certain NBA and MLB teams winning games due to anything but skill or effort is quite aggravating. On the other hand, as a Red Sox fan, I can’t help but feel thankful for ‘Coralytics’. I guess as long as any given team can theoretically have access to the same insights with the right expertise, this is a trend that we all must come to accept.

    1. I know it was called out in the beginning of this post but my mind immediately went to Moneyball as soon as I read the title. It was a fascinating look at how the goals of sports have stayed the same, but the methods and tools used to achieve those goals have totally transitioned. However, I’m not sure that data will ever overtake certain aspects of sports. For example, while the 3-point game has increased in basketball year over year, everybody still throws a free-throw overhand when it’s been proven (with data) that an underhand (grandma) throw is more accurate and results in higher free-throw percentages. Of course, nobody wants to look like a grandma!

      1. matturally

        If Shaq shot underhanded he would have been the GOAT

  3. Such an interesting topic! It seems like whenever anyone talks about using data analytics, they are trying to apply it in a “transformative” way, i.e., find a way to almost game the system, or the game. This is a great way to show analytics can be used to improve sports – I wonder if we could see data applied in similar ways in other sports. Imagine the changes that could be made in football if there was a way to predict CTE, or the baseball injuries that could be avoided? The Mets might even have a shot at winning (something)!

  4. I think at its core sport will always be based on athletic ability, so I don’t find myself being highly concerned about the future. My one worry is that analytics will become such a large part of how the game is played that it will become too predictable and less exciting to watch. However, I believe that as analytics evolves, so will game strategy. New ideas and concepts (maybe even new sports) will continue to emerge, keeping sports interesting and competitive.

  5. Having worked in the sports world for the better part of the last six years and heard analytics come up many times during that span, I found this post instantly caught my attention. It’s been interesting to see how analytics have been used to help teams win games or decide how to make the best of a salary cap, and this is something that I think will always play a role in sports from here on out. That being said, I’ve never really thought of it being used to help assess risk in a sport, and definitely not in eventing. It was really interesting to see how they were able to use analytics to actually make the sport safer, and I agree with Kate that it will be interesting to see whether or not other sports adopt this usage of analytics.

  6. masonpeterman

    I really enjoyed this post. Not only was I unfamiliar with what horse riding competitions consisted of, but I also had no idea of the very real danger of the sport. When it comes to the overall theme, analytics in sports, I see no issue with it at all. While the analytics are allowing us to collect actionable data on players and influence strategy, the beauty of sports is that when it comes down to it, it’s humans competing, and regardless of what the numbers say, anything can happen. I think your way of putting it, that data allows us to put numbers to our intuition, is perfect. The use of data and analytics in sports is really just giving us the opportunity to quantify the things we care about. While it may influence strategy and action, I don’t think that’s an issue because it really comes down to individual performance, and that’s something that analytics can’t influence. It’s reassuring to see these techniques used not only for strategic purposes; you show that analytics can also help make sports safer. It takes some of the intuition out of sports certainly, but that can help foster safer competition.

  7. matturally

    I loved this post! It has everything you could ever hope for: sports, data, risk of death, that second fail picture with the equestrian just sitting on the ground (totally lost it at that), but what I liked, even more, was that it took something most of us have heard of and turned it on its head. I think the point about athletes wanting to push their limits is spot on; the parallel I would give is weightlifting. Sure, you want to push yourself, but getting injured will set you back so much that it’s not worth the risk. Great post!

    I for one welcome the increased number of 3-pointers and anxiously await the 4-point line.

  8. Great post, and fascinating presentation on this tonight!

    As someone studying data analytics, I’m tempted to say that data is good for sports, but I just don’t know. I lament the way it’s changing basketball with regard to shooting more 3-pointers. I’m a big Denver Nuggets fan and I was absolutely stoked for the Nuggets/Warriors game when they were the top two teams in the West. I was quickly disappointed when the game just turned into a chuck-it-up fest. Literally, the Warriors blew us out just because they were able to shoot 54% from the field. They literally would just run down the court and net a 35-footer. There was almost nothing the Nuggets could do to stop the deluge and they ended up losing by 31 points. It was a sucky game to watch.

    But the more I think about it, sports adapt to changes and will continue to do so. This includes the 3-point line. If the NBA suddenly turns into a back-and-forth of who can drain more threes, I think the powers that be would step in and do something about that. Whether that means extending the three-point line or a similar change, I’m confident that the governing bodies will adapt to sports. So if analytics changes the game, the game will change to keep its fans happy (which means keeping things interesting, and 3-point fests are not interesting) since that’s their source of revenue.

  9. Great Post! I personally fall into the camp of wondering if Data Analytics should be relied upon so heavily. I remember in high school one of my closest friends only made Varsity because we needed an extra player (the coach could’ve chosen any JV player). She could have easily been overlooked because her scoring percentage was so low, but as luck would have it, making the Varsity team gave her the confidence to be better and she was our highest scoring member that year. In cases like this, where players are analyzed by a system and numbers, I think it can have an adverse effect on many players’ performance.
    However, I do completely agree that it is useful and sometimes life-saving when used appropriately. For example, with the horse showing you discussed, I think it’s great that there is a tool out there to allow people to know their chances of getting hurt. I think data should be used in these ways when it is meant to keep athletes safe and better their knowledge of how they will do. I just hope it doesn’t hold back any athletes from making a team or playing in a game because data says we shouldn’t take a chance.
