If you want to work in data science in finance, there are a few things you should probably know. First, that strategic advantages are increasingly conferred in real time by huge unstructured data sets derived from social media. Second, that while the building blocks of data strategy (data storage and administration) are increasingly commoditized, there are still interesting roles for data scientists at the top of the “data stack.”
“Data engineering teams are really shifting their focus from low-level database storage and administration and are instead focusing much more on the high value-add part of the data chain,” said Tom Taylor, head of alpha technology at investment business Man Numeric (part of Man Group), speaking at the recent AI & Data Science in Trading conference.
Alternative data streams like credit card transactions and brand sentiment tracking have now become the norm in finance, said Taylor. There is so much data around that, to be successful, funds need not just a team of researchers analyzing its meaning but an “industrial scale data onboarding and data science capability.”
Some funds, like Two Sigma, have outsourced this process of cleaning data so that it can be onboarded into their systems easily, and companies like Crux Informatics (used by Two Sigma) are emerging as specialists in so-called “data wrangling”: ingesting, cleaning and structuring data sets.
Important as it is, however, data wrangling isn’t where the most appealing data jobs are. If you want to work in some of the most interesting and highest value-adding data jobs in finance, Taylor suggests you should position yourself towards the top of the “data stack.”
At Man Numeric, Taylor said the data stack looks like the chart below. The highest value positions are at the top, the lowest are at the bottom:
While you might be able to get a data job in finance if you’re an expert in SQL, Kafka or Kubernetes, you won’t get the best data jobs just by knowing about data storage and computation packages. Nor will knowledge of machine learning or open source Python libraries (in the front office, Man is a Python house) be the deciding factor.
The best data science jobs on the buy-side now go to people who can make data available to the rest of the organization, said Taylor. The future is about increasing the “velocity of data” in firms, and this means enabling “self-service data,” he added. While data ingestion and storage are important, they’re at the commoditized end of the stack. The real focus is now on allowing “users from across the organization to manipulate and analyze data.”
This means that the most valuable data scientists are those who can build custom dashboards or visualizations that allow colleagues to access data directly. Data science teams have an enabling role. Taylor suggested that data fluency across organizations is increasing, both as existing staff add to their skills and as new data-confident graduates are hired.
If you want a long-term, remunerative career in data science in finance, you therefore need to be pitching yourself at the top of the chart above. “If you can build self-service tools, you will be ready for the next few years,” Taylor said.
A modified version of this article originally appeared in eFinancialCareers.