Data Science Is Dead

Fun fact: nothing on this blackboard makes any sense.

Data Science is dead.

Science creates knowledge via controlled experiments, and a data query isn’t an experiment. An experiment implies controlled conditions; data scientists stare at data that someone else collected, complete with any and all sample biases.

Now, before you drag out the pitchforks: I’m not a query hater. You won’t see me standing outside the Oracle Open World conference with a sign that says “NO SQL” on it. Queries are fine. Smart people don’t always have the right answer, but they need to ask the right questions. Yes, building a query is like “forming a hypothesis,” but at that point we enter the realm of observational or “soft” science. Yes, by this standard, Astronomy and Social Sciences are also not sciences. I have no idea what Computer Science is, but no, it’s not a science either.

Oh, what’s that? Your kind of “Data Science” includes things such as A/B testing, and your “experiments” actually involve executing designs that affect the world? Allow me to retort: that’s not Data Science, that’s actually doing a job. You might have a job title like Product Management or Marketing. But if your job title is “Data Scientist,” you are effectively removing yourself from the actual creation of data.

I do sympathize. I appreciate that it’s no longer sexy to be a Database Administrator, and I guess the term “Business Analyst” is a bit too 1980’s. Slapping “Data Warehousing” on a resume is probably not going to land you a job, and it’s way down there with “Systems Analyst” on the cool-factor scale. If you’re going to make up a cool-sounding job title for yourself, “Data Scientist” seems to fit the bill. You can go buy a lab coat from a medical-supply surplus store and maybe some thick glasses from a costume shop. And it works! When you put “Data Scientist” on your LinkedIn profile, recruiters perk up, don’t they? Go to the Strata conference and look on the jobs board—every company wants to hire Data Scientists.

OK, so we want to be “Data Scientists” when we grow up, right? Wrong. Not only is Data Science not a science, it’s not even a good job prospect. In the immortal words of Admiral Ackbar: “It’s a trap.”

These companies expect data scientists to (from a real job posting): “develop and investigate hypotheses, structure experiments, and build mathematical models to identify… optimization points.” Those scientists will help build “a unique technology platform dedicated to… operation and real-time optimization.”

Well, that sounds like a reasonable—albeit buzzword-filled—job description, no? There is going to be a ton of data in the future, certainly. And interpreting that data will determine the fate of many a business empire. And those empires will need people who can formulate key questions, in order to help surface the insights needed to manage the daily chaos. Unfortunately, the winners who will be doing this kind of work will have job titles like CEO or CMO or Founder, not “Data Scientist.” Mark my words, after the “Big Data” buzz cools a bit it will be clear to everyone that “Data Science” is dead and the job function of “Data Scientist” will have jumped the shark.

Yes, more and more companies are hoarding every single piece of data that flows through their infrastructure. As Google Chairman Eric Schmidt pointed out, we now create as much data every two days as humanity did from the dawn of civilization up until 2003.

Unfortunately, unless this is structured data, you will be subjected to the data equivalent of dumpster diving. Surfacing insight from a rotting pile of enterprise data is a ghastly process at best. Sure, you might find the data equivalent of a flat-screen television, but you’ll need to clean off the rotting banana peels. If you’re lucky you can take it home, and oh man, it works! Despite that unappetizing prospect, companies continue to burn millions of dollars to collect and gamely pick through the data under their respective roofs. What’s the time-to-value of the average “Big Data” project? How about “Never”?

If the data does happen to be structured data, you will probably be given a job title like Database Administrator, or Data Warehouse Analyst.

When it comes to sorting data, true salvation may lie in automation and other next-generation processes, such as machine learning and evolutionary algorithms; converging transactional and analytic systems also looks promising, because such systems deliver real-time analytic insight while it’s still actionable (the longer data sits in your store, the less interesting it becomes). These systems will require a lot of new architecture, but they will eventually produce actionable results; you can’t say the same of “data dumpster diving.” That doesn’t give “Data Scientists” a lot of job security: as in many industries, you will be replaced by a placid and friendly automaton.

So go ahead: put “Data Scientist” on your resume. It may get you additional calls from recruiters, and maybe even a spiffy new job, where you’ll be the King or Queen of a rotting whale-carcass of data. And when you talk to Master Data Management and Data Integration vendors about ways to, er, dispose of that corpse, you’ll realize that the “Big Data” vendors have filled your executives’ heads with sky-high expectations (and filled their inboxes with invoices worth significant amounts of money). Don’t be the data scientist tasked with the crime-scene cleanup of most companies’ “Big Data”—be the developer, programmer, or entrepreneur who can think, code, and create the future.


Miko Matsumura is a Vice President at Hazelcast, an open source in-memory data grid company. He is a 20-year veteran of Silicon Valley.

Image: Sergey Nivens/

16 Responses to “Data Science Is Dead”

  1. Ranko Mosic

    By ‘unstructured data’ the author probably means data not residing in relational databases. While it is true that most analysis (mining, advanced analytics, or whatever name you want to put here) is done on structured data, how does he explain the successes of web companies in analyzing unstructured data?
    Also, data analysis has been around for decades now in insurance, banks, telcos, and marketing. It is a very old and necessary profession, which is now becoming more common under new buzzword terminology. These analysts actually analyze old, archived data and have some success in doing so.
    I agree that “scientist” is a bad name; the same has been observed in many other fields trying to get themselves to the next level.

  2. “the winners who will be doing this kind of work will have job titles like CEO or CMO or Founder, not “Data Scientist.” ”

    Well, yes.

    But the first two are arrival points rather than career paths. Maybe “data scientist” is a poor name, but as a career path, I would defend the opportunity, satisfaction and power to effect change versus any other.

  3. Abbas Shojaee

    The respected author argues that data science/scientist is just a sexy title for a range of already known job disciplines and definitions. Let’s think in reverse order:

    As part of the known definition, is a Data Warehouse Analyst concerned with using ontologies, data representations other than relational (e.g. graph), and so on to achieve new knowledge? Does a Database Administrator usually know how to correctly discretize numerical values? Can a classic Statistician deal with ambiguous measurements or fuzzy units of measurement based on probability theory? Does structure exist only in traditional relational schemas or the more recent RDF triples? These are some of a data scientist’s tasks.

    In contrast with idealistic, clean, table-formed, perfectly manageable domestic datasets, a new kind of wild, real-life, messy, large and less traditionally structured data is out there. The need for interdisciplinary data workers who are able to combine different fields of expertise to explore this wild data in new ways is what coined the title of data science/data scientist. The work is extremely interdisciplinary and exploratory, and mostly dependent on specifically designed experiments on data, which is why it is called a science. This is a growing need and a real emerging field, not a fad.

    I totally disagree with the respected author.

    • Steven Knudsen

      Based on my interactions with recruiters, I agree with the author. Your expertise is probably worth “north of” $100K salary, whereas the data scientist label these days is associated with $100K on a 1099, which is nowhere near as lucrative.

  4. balakrishna

    I definitely agree with the author to a considerable extent about the recruiters having sky-high expectations under the radar of Data Scientist title.

  5. Well, the problem is not the job title… but the recruiters, who are not able to detect talent unless you put some market keywords in your resume.

    So I would not worry about the prospects of the data scientists in the future. Smart talent will know which new keywords they will have to use, in 5 to 10 years from now.

  6. This article is utter rubbish. Troll bait.

    Statistics, language theory, control and instrumentation, signal processing, software engineering are all tools of the scientist. Data Scientists ARE performing “controlled experiments” using data every day.

    • Steven Knudsen

      Based on my interactions with recruiters, I agree with the author. Your expertise is probably worth “north of” $100K salary and indicative of REAL SCIENCE, whereas the data scientist label these days is associated with $100K on a 1099, which is nowhere near as lucrative. For now, I am keeping my scientist job and not falling for the data scientist lure.

  7. Dear author,

    Let’s talk about who a data scientist is in the first place.
    He is a programmer/developer and a mathematician/statistician rolled into one. He possesses commendable analytical skills and a strong business sense. Any one is enough to land a job, let alone all of them.

    Who will make a CEO?
    A certified jack of all trades with knowledge about all areas of the company, or a specialist programmer who does the same thing day in and day out?

    DATA DRIVEN DECISION MAKING is here to stay. That’s because numbers never lie even if u can Mr. Author-without-basic-common-sense.

    Good day 🙂

  8. dhrumel shah

    You simply need to go back to middle school to understand science. I believe you attended middle school in the 80s, but you are expected to update yourself with today’s discoveries. Computer science is a science, and it is a degree offered by every major university in the world. Your definition of science pertains only to physical sciences. Data scientists are involved in gathering, cleansing, enriching, transforming, mining and storing data. Their discoveries lead to the development of machines that exhibit artificial intelligence, which is also a science. Patterns and models developed can PREDICT trends in finance, healthcare, marketing, and every possible industry. The demand for data scientists stems from the fact that 90% of the world’s data today was collected in the last 2 years. Data for most companies can no longer be held on 1 machine, which further increases the demand for data scientists who can offer big data solutions. Basically, companies cannot just hire statisticians and business analysts who cannot create models and program to automate processes. They need data scientists who are experts in quantitative analysis, visualization, data mining, machine learning and programming.

  9. Thomas Nichols

    This kind of article, while inflammatory to some, is interesting nonetheless. Let’s start by defining “Data Scientist” as someone who uses generated data to run queries, and publishes insights. This seems to fit the author’s depiction of a data scientist. As the author stated, this person is not a scientist in the traditional sense of the word. This person is not truly experimenting, but rather analyzing. The person you have described is not a “Data Scientist,” but rather a “Data Analyst,” and analysis is valuable.

    Now, if we shift our definition of “Data Scientist” to mean someone who hypothesizes statistical models, directs the implementation of the necessary engineering of these models on real data, and allows analysts to go to town on these data sets, then yes, that person is a scientist. I think the flaw in this article is a lack of differentiation between the more entry-level analyst positions and the higher-level engineering positions.

    I think what should be said is that if you are really a data analyst, but you call yourself a data scientist, you should be cast off the island to live with the sophomores who call themselves juniors because of AP courses. Data Analysts serve a valuable function, and offer a path to becoming an Engineer or Scientist, but analysis is not science, it’s analysis.