If you want to work in a data engineering role at an investment bank, it’s not enough to be an expert in Python or R: you also need experience with the platforms and tools that allow banks to handle data in real time. One platform is particularly popular: Apache Kafka, which enables data professionals to work with real-time, high-volume data feeds.
Kafka is also prized for its low latency, which is essential when data must be processed in near real time. As banks’ demand for data scientists grows, so does their need for technologists skilled with this software.
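To see why the platform suits real-time feeds, it helps to understand its core abstraction: an append-only log that many consumers read independently, each tracking its own offset. The sketch below is a toy in-memory analogue of a single topic partition, not the real Kafka client API, and the market-data ticks are invented for illustration.

```python
# Toy illustration of Kafka's core abstraction: an append-only log that
# multiple consumers read independently by tracking their own offsets.
# This is NOT the Kafka client API -- just a sketch of the model.

class TopicLog:
    """An in-memory stand-in for a single Kafka topic partition."""

    def __init__(self):
        self._records = []

    def append(self, record):
        """Producers append; the log assigns a monotonically increasing offset."""
        self._records.append(record)
        return len(self._records) - 1  # the new record's offset

    def read_from(self, offset):
        """Consumers poll from their own offset; reading never removes data."""
        return self._records[offset:]


log = TopicLog()
for tick in ("EURUSD 1.0841", "EURUSD 1.0843", "EURUSD 1.0840"):
    log.append(tick)

# Two consumers at different offsets see different slices of the same log:
# one only needs the latest tick, the other replays the full history.
latest_for_risk = log.read_from(2)
full_replay_for_audit = log.read_from(0)
```

Because consumption is just a read at an offset, slow consumers never block fast producers, which is one reason the model keeps latency low under heavy load.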
Kafka’s use in financial services is not new. Bloomberg presented on its use in its derivative market data group at a 2018 summit. Dutch bank ING presented the same year. British bank Nationwide has been boasting about its ‘Kafka speed layer’ since 2019.
Developed at LinkedIn and open-sourced in 2011, Kafka has long been embraced by huge firms with equally huge data needs, including Netflix and IBM. As you might expect, use of the platform has only increased among banks and financial-services firms, which have a similarly pressing need to process huge data sets.
Bloomberg has cited the software’s super-scalability, zero data loss, fault tolerance and large storage capacity. All of those features helped spark strong demand; Goldman Sachs and JPMorgan alone are currently advertising around 100 and 500 jobs (respectively) that list Kafka as a prerequisite.
Goldman is using the platform with everything from its data lake to strats roles, and is in the process of building a brand-new inventory management system, written in Java, that leverages Kafka for event sequencing.
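Event sequencing in Kafka rests on a simple guarantee: messages with the same key are routed to the same partition, and each partition preserves the order in which messages were produced. The sketch below simulates that keyed-partitioning behavior in plain Python; the instrument names, events, and hash function are invented stand-ins (Kafka's default partitioner actually uses murmur2 on the key), not anyone's production code.

```python
# Sketch of how keyed partitioning yields per-key event ordering -- the
# property an inventory system relies on for event sequencing. The hash
# below is a deterministic stand-in for Kafka's real key partitioner.

from collections import defaultdict

NUM_PARTITIONS = 4

def partition_for(key: str) -> int:
    # Stand-in for Kafka's default (murmur2) partitioner: same key,
    # same partition, every time.
    return sum(key.encode()) % NUM_PARTITIONS

partitions = defaultdict(list)
events = [
    ("AAPL", "reserve 100"),
    ("MSFT", "reserve 50"),
    ("AAPL", "release 40"),
    ("AAPL", "reserve 10"),
]
for key, event in events:
    partitions[partition_for(key)].append((key, event))

# All AAPL events land on one partition in production order, so a single
# consumer of that partition replays them in the sequence they occurred.
aapl_events = [e for k, e in partitions[partition_for("AAPL")] if k == "AAPL"]
```

The design trade-off is that ordering holds only within a partition: choosing the key (here, the instrument) decides which events must be sequenced relative to each other and which can be processed in parallel.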
As banks’ use of data and need for data scientists expands, their need for Kafka expertise can be expected to grow correspondingly. Now might be the time to get ahead of the trend. Here’s a handy breakdown of courses that teach Kafka from the ground up.
A modified version of this article originally appeared in eFinancialCareers.