Goldman Sachs World Cup Analytics Show Limits of Big Data

If you’re a fan of the World Cup, you probably had your sights set on a winner before the tournament kicked off. Maybe you really liked how Spain’s team was shaping up (despite the coaching shifts), or you wanted to root for an underdog such as Japan or Croatia.

Goldman Sachs, which knows a little something about probability and risk, built a sophisticated data model to predict the World Cup’s eventual winner. This model leveraged machine learning to simulate 1 million possible evolutions of the tournament, and was updated as the tournament progressed, according to Bloomberg. With that kind of setup, you’d think that the algorithms would get at least a few match outcomes right.
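To give a rough sense of what “simulating possible evolutions” of a tournament can mean, here is a toy Monte Carlo sketch in Python. It is not Goldman Sachs’ model, whose details aren’t described here. The four teams, their strength ratings, the ratio-based win formula, and the function names are all assumptions invented for illustration.

```python
import random
from collections import Counter

# Hypothetical teams and strength ratings, invented for illustration only.
RATINGS = {"A": 1.8, "B": 1.5, "C": 1.2, "D": 1.0}


def win_probability(rating_a, rating_b):
    """Chance the first team beats the second (a simple ratio model, assumed)."""
    return rating_a / (rating_a + rating_b)


def play_knockout(teams, ratings, rng):
    """Play single-elimination rounds until one team remains."""
    remaining = list(teams)
    while len(remaining) > 1:
        winners = []
        # Adjacent teams in the list meet each other in this round.
        for i in range(0, len(remaining), 2):
            a, b = remaining[i], remaining[i + 1]
            p = win_probability(ratings[a], ratings[b])
            winners.append(a if rng.random() < p else b)
        remaining = winners
    return remaining[0]


def simulate(ratings, n_runs=100_000, seed=0):
    """Replay the bracket n_runs times; return each team's share of titles."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    teams = list(ratings)
    counts = Counter(play_knockout(teams, ratings, rng) for _ in range(n_runs))
    return {team: counts[team] / n_runs for team in teams}


if __name__ == "__main__":
    results = simulate(RATINGS)
    for team, share in sorted(results.items(), key=lambda kv: -kv[1]):
        print(f"{team}: {share:.1%}")
```

Even in this toy setup, the strongest team wins the title in well under half of the simulated tournaments. A model of this kind produces probabilities, not guarantees, so naming a single favorite will often turn out to be wrong.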

But the Goldman Sachs model, despite its massive dataset, despite the data experts working on it, got its predictions wrong.

And therein lies the issue: Although pundits and experts have made bold predictions about how artificial intelligence (A.I.) and machine learning will change our lives in the years and decades to come, these technologies remain relatively nascent, and often unproven in terms of results. It’s not just a matter of predictive analytics such as soccer scores; practical applications of A.I. such as autonomous driving have hit their own roadblocks (pun intended).

In March, for example, one of Uber’s autonomous vehicles crashed into a pedestrian, killing her; Tesla vehicles have likewise collided with obstacles while driving in “autopilot” mode. No wonder some 73 percent of Americans said they were “too afraid” to ride in a car entirely under software control, according to an AAA survey.

If enough companies (and potential consumers) begin to think that A.I. is overhyped, they’ll spend less on the technology, leading to an A.I. “winter.” The tech industry has gone through such periods before, notably in the 1970s and ’80s; agencies and universities would pull funding from promising projects, draining energy from the field.

At the moment, A.I. and machine learning are making a bit too much money to “freeze up” quickly: look no further than Amazon Alexa and Google Home, which are spreading into millions of households. Within corporations, more and more enterprise applications are incorporating predictive analytics and other facets of machine learning. But as the Goldman Sachs soccer model demonstrates, these technologies still have quite some distance to cover in order to meet the rosiest predictions, and some companies may well give up on their efforts.

But if Goldman Sachs sticks with it, maybe their algorithms will get the World Cup right in… 2026.

6 Responses to “Goldman Sachs World Cup Analytics Show Limits of Big Data”

    • There are so many current and future possibilities that I don’t see how AI can get that right, unless AI can travel in time :-) Like the comment made before: too much data can be a bad thing.

      • AI is still in the initial phase of being applied across all fields to determine the full probability of an event occurring, e.g. the recent soccer World Cup prediction. The only question is what factors are taken into consideration in deriving a conclusion, e.g. percentage of ball possession. This time, though, teams that had the most possession but were unable to score lost out to their opponents.

  1. There is a science factor and then there is an art factor which plays a major role.
    To predict anything, one has to know the business side of it and then get experts in those fields to work on the predictions. Some geeks churning out algorithms in a back office will not cut it all the time.
    For World Cup soccer, besides rankings, they have to have someone on the field watching the players play and seeing the coaches’ strategies in each game against each opponent. They should notice whether there are star players on the team, and how dedicated those players are to playing for their country versus worrying about getting injured, being dropped, and eventually losing the millions they make playing for European clubs.
    All these factors add up.

  2. This is an addendum to my previous comment.
    There was a good movie in which Clint Eastwood played a baseball scout. He knew the business well. It’s a must-watch movie for analytics.
    Movie name: “Trouble with the Curve” (2012).

  3. Rodrigo

    This is such a waste of a read. The author needs to do more research on A.I./machine learning before trying to trash current efforts in the field. The models used for any sort of prediction are heavily dependent on the given parameters and classification methodology. Maybe Goldman’s approach wasn’t a good one, but we’ll never be able to tell due to a lack of details. Bad writing with no significant thesis. The author doesn’t mention the predicted winner, or the actual metrics the algorithm uses to decide on the winner, or even the percentage of matches it accurately predicted. It’s sad to see tons of articles detailing prediction models for NCAA tournaments but nothing on the same level for the World Cup, which attracts more fans by orders of magnitude.