Blog Archives

The ‘knight on an infinite chessboard’ puzzle: efficient simulation in R

December 10, 2018

Previously in this series: The “lost boarding pass” puzzle; The “deadly board game” puzzle.

I’ve recently been enjoying The Riddler: Fantastic Puzzles from FiveThirtyEight, a wonderful book from 538’s Oliver Roeder. Many of the probab...

Exploring college major and income: a live data analysis in R

October 16, 2018

I recently came up with the idea for a series of screencasts:

“I've thought about recording a screencast of an example data analysis in #rstats. I'd do it on a dataset I'm unfamiliar with so that I can show and narrate my live thought process. Any suggestions for interesting datasets to use?”
David Robinson (@drob), October 6, 2018

Hadley Wickham had the...

Who wrote the anti-Trump New York Times op-ed? Using tidytext to find document similarity

September 6, 2018

Like a lot of people, I was intrigued by “I Am Part of the Resistance Inside the Trump Administration”, an anonymous New York Times op-ed written by a “senior official in the Trump administration”. And like many data scientists, I was curious about what role text mining could play. Ok NLP people, now’s your chance to shine. Just spitballing here...

Scientific debt

May 10, 2018

A very useful concept in software engineering is technical debt. Technical debt occurs when engineers choose a quick but suboptimal solution to a problem, or don’t spend time to build sustainable infrastructure. Maybe they’re using an approach that doesn’t scale well as the team and codebase expand (such as hardcoding “magic numbers”), or using a tool for reasons of convenience...

Data science at DataCamp

April 10, 2018

In January, I was excited to make an announcement about a shift in my career:

“I have some exciting news: today I'm joining @DataCamp as their Chief Data Scientist 🎉📊📈 pic.twitter.com/wiN9J4qSjx”
David Robinson (@drob), January 29, 2018

When I first discussed the role with the DataCamp CEO, I described my goal as to “Make DataCamp as good at doing data science...

What digits should you bet on in Super Bowl squares?

February 4, 2018

My new office introduced me to a betting game I wasn’t previously familiar with: Super Bowl squares. It’s played with a ten-by-ten grid, like this one from printyourbrackets.com: Each row and column gets an assortment of digits from 0-9 represen...

Exploring handwritten digit classification: a tidy analysis of the MNIST dataset

January 22, 2018

In a recent post, I offered a definition of the distinction between data science and machine learning: that data science is focused on extracting insights, while machine learning is interested in making predictions. I also noted that the two fields greatly overlap: I use both machine learning and data science in my work: I might fit a model...

What’s the difference between data science, machine learning, and artificial intelligence?

January 9, 2018

When I introduce myself as a data scientist, I often get questions like “What’s the difference between that and machine learning?” or “Does that mean you work on artificial intelligence?” I’ve responded enough times that my answer easily qualifies for my “rule of three”:

When you’ve written the same code 3 times, write a function
When you’ve given the same in-person...

Advice to aspiring data scientists: start a blog

November 14, 2017

Last week I shared a thought on Twitter:

“When you’ve written the same code 3 times, write a function
When you’ve given the same in-person advice 3 times, write a blog post”
David Robinson (@drob), November 9, 2017

Ironically, this tweet hints at a piece of advice I’ve given at least 3 dozen times, but haven’t yet written a post about. I’ve...

Announcing “Introduction to the Tidyverse”, my new DataCamp course

November 9, 2017

For the last few years I’ve been encouraging a particular approach to R education, particularly teaching the dplyr and ggplot2 packages first and introducing real datasets early on. This week I’m excited to announce the next step: the release of Introduction to the Tidyverse, my new interactive course on the DataCamp platform. The course is an introduction to the dplyr...
