Last week, I had a great opportunity to give a talk on data science applications in manufacturing at Acharya Institute of Technology (AIT), Bangalore. I am an alumnus, so AIT has a special place in my heart. The many curious young minds who attended my session had great questions. Here are some highlights from the Q&A session.
What is the difference between a data scientist and a data analyst?
A data analyst combines data from different sources, performs data discovery, creates schemas, verifies and validates data consistency, and provides data reports. They also perform visualization tasks using BI tools. A data scientist often performs the job of a data analyst and, in addition, uses mathematical and statistical principles to build models that solve a specific problem.
What is the difference between AI and deep learning?
I recently read an answer to this question that goes something along these lines: “AI is usually built using PowerPoint, and deep learning models are built using R and Python.” Most folks who use the term AI are in sales and use it to market a product. To call a deep learning model AI would be specious. Currently, all publicly available deep learning models are still in their infancy. We are not too far off from reaching cognitive ability in these models.
If I want to be a data scientist, where do I start?
Data science is an amalgamation of different fields. Most commonly, one needs a strong background in probability theory and statistics. If you can master these two fields, you are halfway there. Then you can figure out what domain you want to get into and work on your skills from there. For example, if you want to work on a social media platform, then you can learn A/B testing and so on.
Which models from the different types of machine learning, like supervised and unsupervised, should I know if I am asked in an interview?
Learn the basics first, starting with simpler ones like linear regression and k-means clustering. Learn the math and assumptions underneath each model and how it works. Once you’ve mastered that, fitting a model in R is as simple as calling the “lm” or “kmeans” function. The most basic models are also the simplest to explain in an interview or to a non-subject matter expert.
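To give a feel for the math underneath these two models, here is a minimal sketch in Python (used here instead of R just to spell out each step by hand). The function names fit_line and k_means_1d and the small data sets are made up for illustration. This is not how R's lm or kmeans are implemented internally; it only shows the textbook idea: least squares for a straight line, and the assign-then-average loop for k-means on one-dimensional data.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def k_means_1d(points, k, iterations=20, seed=0):
    """Basic k-means on 1-D data: assign each point to its nearest
    center, then move each center to the mean of its points."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centers)


# Made-up data, roughly following y = 1 + 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
intercept, slope = fit_line(xs, ys)
print(f"intercept={intercept:.2f}, slope={slope:.2f}")  # intercept=1.09, slope=1.97

# Made-up data with two obvious groups, around 1 and around 9.
points = [1.0, 1.2, 0.8, 9.0, 9.3, 8.7]
centers = k_means_1d(points, k=2)
print([round(c, 2) for c in centers])  # [1.0, 9.0]
```

If you can explain why the slope is a ratio of covariance to variance, and why k-means can give different answers from different starting centers on less clean data, you are in good shape for that interview question.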
As a recent graduate, I find that companies ask for experience in data science. How can I go about it?
Data science is a very new field. Even data science executives at various companies come from different backgrounds like R&D, deployment, IT etc. There is nothing wrong with companies asking for experience. They do so mainly to reduce their training time and risk and to monetize much faster. Now, coming to the question: you need to build a portfolio that shows your knowledge, for example by writing LinkedIn articles, posting some of your work like tutorials and packages on GitHub, getting certifications, being active in data science community discussions and volunteering at data science conferences. There are numerous ways to build your portfolio.
Should I go for a graduate program or get certifications to advance my career as a data scientist?
I would recommend getting a degree. Not many schools offer a graduate program in data science, and where one exists, it is usually an extension of a statistics or business school program. Some alternatives are a degree in business information systems, statistics or computer science. The main reason I recommend a degree is that it helps you climb the career ladder. During promotion cycles, employers ask you to have at least a master’s degree. Most companies prefer a Ph.D. There is a reason for this: data science is not just knowing how to use R/Python and fit a model with functions. It’s more than that. It’s like any other research job, where you apply qualitative and quantitative reasoning to set up trials and build models in different scenarios, adding, removing and creating data through experimentation to build effective models. It also involves reading a lot of new research and testing those techniques in your work. Coding and fitting models are just a small part of data science. Most certifications teach you only coding, fitting models and using a set of tools. So, my recommendation is to get a degree as the primary objective and add a few certifications. There is a nice report by Burtch Works on this here.
If you read through all of the above, thanks for sticking around. Comment below to share your thoughts on this. If you like, check out the slides from my presentation.
If you like this post, look at my other posts as well.