Statistics in Sport: Too Much of a Good Thing?

Football Manager fans, look away now

Since the Oakland Athletics used maths to outwit money on the way to their unprecedented 2002 baseball success, the techniques of statistical analysis have spread throughout professional sport. In football, the tenets of Moneyball are followed devoutly; players’ data is devoured by analysts, pundits, and the public alike. As data replaces the traditional scouting system, players are increasingly scrutinised by numbers: numbers which, while no doubt useful, can lead to overreliance on limited and misread models at the expense of the human factor. Is there such a thing as applying too many stats to sport?

Data analytics are primarily used in football to gauge performance and recruit new players. Analysts pore over players’ stats: shots on target, key passes, chances created, interceptions and key tackles are all under the microscope. Of these, the most notorious is the expected goals (xG) model, created by OptaJoe’s Sam Green in 2012. This measures the probability of a shot being scored from a certain position on the field, accounting for factors such as distance from the goal and the nature of the shot. Its main flaw is that it reflects the probability of the average player scoring from the chance, in essence discounting the innate ability of striker and goalkeeper. From identical positions, Ronaldo and the reader have identical xG figures, despite a slight but sure talent gap.

If its scope is limited, its interpretation is worse. Many people understand xG as a score-line predictor, and are thus understandably confused when the model suggests Liverpool 2.66 – 1.42 Arsenal. In fact, the concept derives from expected value in probability theory: it is a long-run average, the figure that results tend towards over many matches, not a forecast of any single one. The model is therefore unsuitable for individual fixtures. Misreading it mainly harms punters, since the error is a variation of the “Gambler’s Fallacy”, where previous outcomes are erroneously believed to influence future ones.
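
To see why an xG total is not a score-line, consider a minimal Python sketch. The shot probabilities are invented for illustration, the function names are hypothetical, and treating shots as independent of one another is a simplifying assumption, not something the xG model itself guarantees.

```python
from typing import List


def expected_goals(shot_probs: List[float]) -> float:
    """Match xG: the sum of each shot's scoring probability."""
    return sum(shot_probs)


def goal_distribution(shot_probs: List[float]) -> List[float]:
    """Probability of scoring exactly 0, 1, 2, ... goals.

    Assumes every shot is independent of the others (a simplification).
    """
    dist = [1.0]  # before any shot is taken, 0 goals is certain
    for p in shot_probs:
        updated = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            updated[goals] += prob * (1 - p)  # this shot is missed
            updated[goals + 1] += prob * p    # this shot is scored
        dist = updated
    return dist


# Hypothetical chances, made up for illustration: three 50/50 shots.
shots = [0.5, 0.5, 0.5]
print(expected_goals(shots))     # 1.5
print(goal_distribution(shots))  # [0.125, 0.375, 0.375, 0.125]
```

Three coin-flip chances give an xG of 1.5, yet nobody can score one and a half goals. Under these assumptions one goal and two goals are equally likely at 37.5% each, and a blank or a hat-trick's worth turns up a quarter of the time between them. The 1.5 only emerges as an average over many such matches.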

Have data analytics impacted player behaviour on the pitch? Thankfully not, apart from the average shot distance falling every year since xG’s inception. What is worrying, however, is how analysts run the risk of overlooking certain talented individuals in favour of more stat-friendly ones. To illustrate this, consider pass completion rate. Is a player with 100% pass success brilliant or timid? If all passes are sideways or backwards, this rate is easy to achieve and dreadful to watch. A player with low pass completion, however, may attempt risky forward passes with the aim of causing danger. These will not always come off, but when successful, they are likelier to lead to goals. Statistics demand context to be relevant.

The game is undoubtedly better with data than without. It maps the likelihood of injuries, it collates information for scouts across continents, and it condenses the complex into the manageable. Yet the problem – often encountered in economics – is how best to capture people within mathematical models. 

Sport is performance; sport is art; sport is human. A spark of passion, emotion or inspiration can decide the finest of margins. We are perhaps in danger of intellectualising the inherently unpredictable in our quest for perfect stats. When overdone, spreadsheet revisionism can even make masters seem mediocre. On paper, Eric Cantona was average. To get his genius, you simply had to watch.
