This was originally posted on the author’s LinkedIn.
You will find lots of advice on how to get a job as a data scientist. I want to add some signal to the noise.
I happen to be hiring a Data Scientist or two as of this writing, and was thinking of writing a blog post with advice for those looking. But then I thought of something better and more insightful: a cut-and-paste of an email that I sent to our Director of Talent Acquisition last week when he asked me, “Can you share with me what you’re looking for in regards to your Data Scientist candidates?”
Here it is:
Date: Mon, 26 Jan 2015 14:39:10 -0800
Delivered-To: [email protected]
Subject: Re: Sorry for the mix up regarding Data Science candidates
From: Andrew Parker
To: Ron Miguel
Cc: Nicole Prause
Content-Type: text/plain; charset=UTF-8
Thanks for all your help on this. Nikky, please add your thoughts to this.
In general, for this Data Science position, the question is: Can this
person teach us something, and do they bring to the table some core
competency that is missing from the team? Are they senior enough to teach
others when we hire more junior people in the near future?
*If they are new PhDs:*
The thinking is that they’re probably smart, and really it’s a question of
seeing if their interests and expectations are aligned with the actual
projects we have to work on, and to gauge their willingness to learn the
problem domain (SaaS business issues: LTV, Churn / Retention, etc.) and
tools. Seeing extensive use of Python and R is most important. I want to
get a sense that they were the “hacker” in their lab.
The nature of the data matters: specifically I like to see them deal with
messy, noisy, incomplete, unstructured, human inputted or driven data.
Examples are NLP and text processing, linguistic modeling, survey and
clinical data, economic and sociology data, web scraped (or API) data from
Twitter, Facebook, etc. Less ideal is data that is highly structured, and
precise or all theoretical.
*If they are new Masters or BS graduates:*
Many of these resumes are over-inflated and key-word stuffed, and I really
have to read between the lines to figure out what they’ve actually done. I
find these the most difficult to read.
*If they have industry experience:*
Anything applying analytical and data processing skills to problems
relevant to ZipRecruiter where they were the lead developer is what we’re
looking for. They have to be able to deal with raw and mostly unstructured
data. E.g. “Go look in this Apache access log file for what you need.” I am
yet unable to accommodate candidates that are accustomed to an IT team
doing all the cleaning and curation of data.
I’d like to have them present something and do a deep-dive on the data,
analysis, code, and insights. I think recent graduates and academics will
have a huge advantage since they are likely working with data and code they
can share publicly. So the on-site visit will be different from the usual
software engineering visit. Maybe for those that don’t have data handy that
they can talk about, Nikky and I will have to identify some data for them
to do a sort of “take home project.” We haven’t gotten to that point yet.
We’ll wait for your process guidelines, but in the meantime we’ll score
candidates and make more detailed internal notes.
And that’s it.
Keep in mind that this is for a senior position and describes my team’s particular needs at this point in time. I sent this email well before I contemplated writing a blog post. I hope some of you find this helpful.
If you’re curious, we still have that open Data Scientist position here: