The art of hiring a data scientist
By Angela, November 17, 2015
Nearly four years ago, I joined the Insight Data Science team and we launched an intensive seven-week post-doctoral training fellowship bridging the gap between academia and data science. Since then, over 400 Insight alumni have been hired as data scientists or data engineers at top-tier companies like Facebook, LinkedIn, Twitter, Airbnb, and Google.
Although I formally left the company in March 2013 (but continue to have an advisory role there), I still field countless questions from entrepreneurs who are looking to hire a data scientist. Because, let’s face it: data scientists aren’t necessarily easy to find. When we launched Insight, the term “data science” was still new. Most founders don’t know where to begin the search or whom to look for, not to mention how to bring a data scientist on board so he or she can make valuable contributions to the company.
To that end, I’ve decided to summarize some of what I learned at Insight about recruiting and training data scientists. Keep in mind, I won’t share all of Insight’s secret sauce, but I hope to provide enough high-level lessons and principles that you can apply to your own hiring process.
Finding a data scientist is hard. Let’s look in a different haystack!
When looking for a data scientist, the default approach for many companies is to go after software engineers with a strong interest in statistics, data mining and machine learning. Understandably, there’s a preference for people with strong programming skills. However, drawing math-loving engineers from the tech talent pool isn’t a sustainable strategy. This simply reallocates scarce resources.
So, what’s the alternative? Many PhDs in engineering and the sciences work with “big data” and code their own algorithms. Did you know that particle physicists analyze 5 TB of data every day? Or that mechanical engineers design and code models for computational fluid dynamics? It’s a huge opportunity: PhDs can be the new untapped supply of data scientists!
Hiring PhDs has several advantages: you don’t have to teach them the fundamentals of math, and they already have the working knowledge of coding needed to be functional data scientists. In addition, you earn their loyalty for taking a chance on them straight out of school, and the cost of hiring is lower.
Can everyone make the leap from academia to industry?
Academia and business are two different worlds, and there are several kinks to work out when transitioning a grad student from their academic environment to industry. Here are three pointers learned from my experience with Insight.
- Teach the language of industry. Just because most academics work in Matlab or R doesn’t mean that they aren’t capable of translating their skills to tools like Python, SQL or MongoDB. Learning a new programming language only takes a week or two, so don’t let specific programming languages deter you from interviewing someone who knows their math. In the meantime, you can communicate and evaluate a candidate’s abilities with pseudocode.
- Guide workflow from rigor to iteration. The culture of academia is such that feedback from professors is infrequent and we only defend our thesis when it is absolutely perfect. As a result, PhDs are trained to work towards rigor as opposed to embracing iteration. This is obviously contrary to a startup’s culture. As you engage with a data scientist candidate or new hire, find ways to test their ability to ask for, receive and respond to critical feedback. Perhaps send them home with a 3-day data project and have them check in every so often.
- Find actionable insights and not just interesting facts. In the spirit of rigor, academics tend to report anything and everything as opposed to thinking about KPIs. This is arguably the hardest mindset to shift because PhDs have spent years in an environment where research is not pushed for commercialization or productization. They aren’t necessarily going to be thinking about the business implications of the data right out of the gate. During the interview process, provide a data set and ask candidates an open-ended question like “How can we use this data on product usage to increase our revenue / engagement?” Then, post-hire, you need to provide the right context to help a new data scientist find the knobs you can turn and the levers you can pull to grow your company.
Three ways to hire a data scientist
If you want to hire a data scientist, there are three approaches to take:
- Tap your network for existing data scientists and other people who deal with lots of data (quants at banks, etc.). Remember, there are many people who do the same work as a data scientist but their roles might not be labelled as such.
- Go directly to universities and begin a recruiting and training process, following the principles above. This is a time-consuming process, especially when you need to create the framework from scratch. If you do choose this route, I’d suggest building a small team of data science advisors to help you.
- Partner with companies like Insight. They are currently in the SF Bay Area, NYC and Boston, and have started an online program for those outside these geographies. If you’re a founder or company who would like to work with them on your search for a data scientist, feel free to ping me for a connection or email them directly at firstname.lastname@example.org.