Data Science Interviews are About the Audience, Not the Math

I have been on both sides of the data science interview process, so I know both how stressful it is for the applicant and how important it is for the company to find the right candidate.

Over the years, I have observed that anticipating audience needs is the most important factor at each interview stage; yet data science candidates often over-index on technical acumen, and neglect the fact that every evaluator is reviewing different attributes. In such cases, data scientists will showcase foundational technical knowledge such as the difference between sensitivity and specificity, or modeling evaluation metrics such as log loss or Gini norm. Worse than that, data science candidates tend to go down rabbit holes such as Bayesian parameter estimation.

Going this deep into a highly technical niche subject has two potential risks: the interviewer’s eyes may glaze over because they are not the right audience or, worse, the interviewer may understand that particular subject more deeply than you do and trip you up in your answer!

In my 11 years building a data science career, I have found there to be four main stages to an interview process, each with distinct audiences: the initial phone screen, the technical evaluation, the “take home” assignment, and a behavioral assessment.

May my missteps at each stage aid you in your own quest to become, or move up to, a senior data scientist.

Phone Screen

To start, let's dissect the phone screen. During a phone screen with a leading search company, I sought to impress the recruiter with my technical acumen, dazzle them with my professional experience, and win them over with my charm, with the goal of having the recruiter become an advocate for my candidacy at the next stage.

But the recruiter had different incentives: I was no doubt among dozens of candidates to be evaluated that week, and the recruiter had been given specific questions with corresponding answers. While being able to articulate what feature engineering and cross-validation are solidified my qualifications, satisfying this stage of the process meant being brief and showing interest in the role.

Technical Interviews

The second stage of the interview process is often a series of technical interviews performed by people currently in the role—your future teammates. Many times, this is the stage where you get questioned on whether you have solved problems similar to ones they are facing.

This type of audience has two motivations. First, they want to determine your intellectual horsepower in or adjacent to their field so they can be sure you will bring something valuable to the team. Second, many data scientists are eager to demonstrate the value and importance of their work. I have found that, in these instances, it’s best to let the interviewers speak, and contribute in a manner that shows you have something to share, but not that you will replace them as the brightest unicorn on the team.

For example, don’t ask: “Exactly how many models have you pushed to production?” If the answer is zero, you pose a threat. At this stage of the process, it's important to show your technical skillsets without overselling yourself.

Coding Exercise

The third stage is typically a coding exercise. Companies will say it should take you three hours to complete (for example), when in reality it will take you ten times as long because you will not want to be lazy about it. Furthermore, practical data science is a team sport and we all live on Stack Overflow anyway. To excel at this stage, give your audience what it is looking for: concise, easily understood code.

It’s also important to remember that your code is being reviewed against other candidates’ submissions, so limit your design choices to methods you can justify and that you are certain your audience will comprehend. Complex ensemble models with exotic feature engineering may not be as impressive as a random forest with typical variable treatments. Keep it simple and concise.
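As a rough illustration of what "a random forest with typical variable treatments" might look like in a take-home submission, here is a minimal sketch using pandas and scikit-learn. The dataset is synthetic and the column names (`tenure_months`, `monthly_spend`, `plan`) are hypothetical, invented only for this example. The choice of median imputation and one-hot encoding is likewise an assumption about what counts as "typical", not a prescription.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder


def make_toy_data(n_rows=200, seed=0):
    """Build a small synthetic dataset (hypothetical columns, not real data)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "tenure_months": rng.integers(1, 60, n_rows).astype(float),
        "monthly_spend": rng.normal(50.0, 15.0, n_rows),
        "plan": rng.choice(["basic", "plus", "pro"], n_rows),
    })
    # Label loosely tied to the features so there is some signal to learn.
    y = ((df["monthly_spend"] > 50) & (df["plan"] != "basic")).astype(int)
    # Blank out a few values to mimic the missing data a real exercise would have.
    df.loc[rng.choice(n_rows, 10, replace=False), "monthly_spend"] = np.nan
    return df, y


def build_model(numeric_cols, categorical_cols, seed=0):
    """Median-impute numerics, one-hot encode categoricals, fit a random forest."""
    preprocess = ColumnTransformer([
        ("num", SimpleImputer(strategy="median"), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("forest", RandomForestClassifier(n_estimators=100, random_state=seed)),
    ])


X, y = make_toy_data()
model = build_model(["tenure_months", "monthly_spend"], ["plan"])
# Five-fold cross-validation gives a reviewer one number they can sanity-check.
scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
print(f"Mean 5-fold accuracy: {scores.mean():.3f}")
```

Every step here is named, short, and easy to defend in a follow-up conversation, which is the point: a reviewer comparing several submissions can read it in one pass.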

Cultural Fit

In the last stage, the hiring manager is simply looking for a cultural fit on the team. There are three facts supporting your candidacy at this point: (1) there is a huge shortage of data scientists that is expected to last more than 10 years, (2) the company has invested a lot of time in your candidacy already, and (3) managers may not be technically fluent themselves, and so rely on your previous interview results.

I once had nine interviews with a company and made it to the last round with two hiring managers, but ultimately failed because my management philosophy did not coincide with theirs. Rather than coming across as adaptable, my “lean” approach to meetings did not instill confidence that I would be easy to manage or a good coworker. Most of us have seen the most brilliant, yet most difficult, teammate let go while the less-intelligent but easily managed teammate remains. At this stage, you want to position yourself as easy to manage and adaptable above all else.

For those of you struggling to find your spot, persevere! These are real examples of my interview failures that I hope encourage you in your journey.

Ted Kwartler is VP of Trusted AI at DataRobot and a Harvard Adjunct Professor who can offer insight into the ways we can better prepare students to become the future data science and A.I. talent.