How much data science do you actually remember?

[This article was first published on r-bloggers – SHARP SIGHT LABS, and kindly contributed to R-bloggers].

How many data science books have you read? 5? 10? A few dozen?

How many free online courses have you taken? A few?

How many blog posts have you read? (I’d be willing to bet you’ve read dozens.)

If you’re like most budding data scientists, you’ve probably consumed a lot of material. You probably even learned some of it.

The problem, though, is that the vast majority of people learn something and then quickly forget it.

There’s a difference between learning and remembering

Here’s an example:

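library(ggplot2)  # provides ggplot() and the diamonds dataset

# Scatterplot of diamond price against carat weight
ggplot(data = diamonds, aes(x = carat, y = price)) +
    geom_point(color = "dark red") +
    labs(title = "Diamond weight vs price") +
    theme(plot.title = element_text(family = "Verdana", size = 20))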

If you’re like most aspiring data scientists, you’ll try to learn this code by using the copy-and-paste method. You’ll take this code from a blog post like this one, copy it into RStudio, and run it.

Most aspiring data scientists do the exact same thing with online courses. They’ll watch a few videos, open the course’s sample code, and then copy-and-paste the code.

Watching videos, reading books, and copy-and-pasting code do help you learn, at least a little. If you watch a video about ggplot2, you’ll probably learn how it works pretty quickly. And if you copy-and-paste some ggplot2 code, you’ll probably learn a little bit about how the code works.

Here’s the problem: if you learn code like this, you’ll probably forget it within a day or two.

This is critical: there’s a big difference between learning and remembering. Learning is actually the easy part.

The hard part is remembering.

To master data science, you need to remember

True mastery requires remembering.

It’s not enough to just “learn.” If you learn some R syntax (or a data science concept) today, it’s effectively worthless if you forget it by next week.

Then what? If you forget what you learn, what will happen in an interview? What will happen in a fast-paced job?

You need to remember data science to get a job

Let’s say that you’re trying to get a data science job. You take a few online courses, buy a few books, and copy-and-paste some code.

Maybe you get a few certificates. You proudly put them on your resume.

And you diligently send out resumes for data science jobs.

Eventually, you get an interview.

When you walk into that interview, you had better know your stuff. You need to really, really know it. Remember it.

Any good employer is going to test you. Great companies will grill you in an interview. They’ll ask you questions. They’ll ask you to write sample code.

You had better remember …

What happens if you forget? What if, when they ask you to make a simple scatterplot in R, you forget? What if you can’t do it?

Will you tell them, “I don’t remember how to write the code, but I have a data science certificate from an online learning course”?

If you can’t write the code fluently, on command, no good company will hire you.

I can’t emphasize this enough: if you want to get a good data science job, you need to really know your stuff. You need to remember how to write the code, from memory, on command. That is how you ace an interview. You show them, in person, that you can write R code with your eyes closed. You show them that you can do the work. That you can get things done.

As long as you’re a decent person and don’t have any major character flaws, your mastery of R and data science concepts will help you nail the interview and get the job.

You need to remember data science to get things done

Ok. Let’s say that you do get a data job.

The company that hired you is “fast-paced.” I’ll point out here that “fast-paced” is a code word for an environment where there is a lot of work, and a lot of pressure to perform. Keep in mind that these days, almost all companies call themselves “fast-paced.” Almost all companies expect people to “do more with less.” That’s just the way that things are right now … most good companies operate lean, and they expect you to perform.

What happens when you get thrown into a demanding data science job? Will you be able to handle it? Will you be able to perform?

I know a few people who, either through luck or guile, have obtained good data jobs at good companies, even though they aren’t good data scientists. These are the people who haven’t even mastered the basics. You give them a task, and they don’t really know how to do it. You ask them to write some code, and they have to search for how to do it on Google.

Now, I want to be clear: every data scientist needs to look things up once in a while. If you’re working on a hard project and using advanced procedures, you’ll need to consult reference materials sometimes. That’s pretty normal.

I’m talking about people who have to look up the basics. They are “copy and paste” coders.

These people are always the underperformers of the team. They struggle to get their work done on time and at a high level of quality. They struggle to remember how to get things done: how to write the code; how to apply the data science concept.

The problem is that they simply don’t remember the syntax. They don’t remember the concepts and principles. They don’t remember the critical information.

Because they can’t remember the things they need to know, these people are always struggling to perform, especially under pressure. They produce code that’s low quality. They work slowly. They ultimately struggle to pull their own weight, and other people need to do more work because of them.

If you get a job as a data scientist, will you be an underperformer? The guy who doesn’t pull his own weight? One of the data scientists who struggles to get the work done? Who’s always stressed?

Or do you want to be an elite performer? The go-to person who always gets the job done, quickly and effectively (and is rewarded accordingly)?

When learning data science, focus on long-term memory

It’s not enough to just learn. You need to learn and remember in the long run. You need to learn and remember in order to master the material, get the job, and become a top performer.

So, it doesn’t matter how many data science courses you take. It doesn’t matter how many data science books you buy and read. What matters, at the end of the day, is how much you remember. How much you can do “on command.”

You need to “learn how to learn”

To get to a level where you learn and remember data science, you need to “learn how to learn.”

Let me say that again: you need to learn how to learn.

Learning is a skill. If you want to master data science, you need to know how to learn, so that you can learn data science efficiently and effectively. You need to be able to learn and not forget.

Ironically, while people in the tech industry are ecstatic about machine learning, hardly anyone is paying attention to human learning. I won’t explain my complete thoughts on human learning in this blog post (I’ll eventually write a separate one), but I think that human learning is one of the most important subjects of this century.

In any case, if you want to master data science quickly, you need to learn how to learn.

You need to learn “how to practice”

Part of learning “how to learn” is understanding “how to practice.”

Data science has multiple facets. It has a knowledge base that you need to know (a set of concepts that you need to understand). But data science is also a set of skills.

As I’ve already emphasized, to be a great data scientist, you need to be able to write code.

Learning to write code is quite a bit like learning to play piano or guitar. It’s a skill. And to master it, you need to practice.

This is one of the biggest gaps in data science education today. Almost no one talks about how to practice.

The best guitar players practice relentlessly. When they first start out, they have drills that they run through in order to learn and master basic techniques. As they progress, they move on to other drills in order to learn and master new techniques. As they continue to develop, many of them supplement their practical, hands-on skill (i.e., playing the notes) with music theory. This further enhances their understanding and their skill.

Data science should be extremely similar. There is certainly a theoretical component (similar to music theory), but in practical terms, writing data science code is a hands-on skill. It’s something that you do, not just something that you know.

To really master data science, you need to learn and remember. But not just learn and remember in your head. Data science is something that you do. You need to “remember with your hands.”

To do that, you need to practice.

And to practice efficiently and effectively, you need to know how to practice.

Discover how to learn, practice, and remember data science

If you want to discover how to learn, practice, and ultimately remember data science, sign up for our email list.

In the coming weeks and months, Sharp Sight will be writing extensively on data science learning.

If you sign up for the email newsletter, you’ll learn the tips, hacks, and strategies for rapidly learning data science (so that you never forget).

The post How much data science do you actually remember? appeared first on SHARP SIGHT LABS.
