Running the Same Task in Python and R

October 8, 2018

(This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers)

According to a KDD poll, fewer respondents (by rate) used only R in 2017 than in 2016. At the same time, more respondents (by rate) used only Python in 2017 than in 2016.

Let’s take this as an excuse to take a quick look at what happens when we try a task in both systems.

For our task we picked the painful exercise of directly reading a 50,000,000-row by 50-column data set into memory on a machine with only 8 GB of RAM.
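To get a rough feel for why this is painful, here is a back-of-the-envelope estimate in Python. It assumes every value is stored as an 8-byte number; the post does not state the actual column types, so treat the figure as an illustration only, not a measurement.

```python
def estimated_bytes(n_rows, n_cols, bytes_per_value=8):
    """Rough size of a dense table, ignoring any per-object overhead."""
    return n_rows * n_cols * bytes_per_value


GIB = 1024 ** 3

rows, cols = 50_000_000, 50          # dimensions quoted in the article
size = estimated_bytes(rows, cols)   # assumption: 8 bytes per value
ram = 8 * GIB                        # the machine's memory

print(f"estimated table size: {size / GIB:.1f} GiB, RAM: {ram / GIB:.0f} GiB")
```

Under that assumption the table alone is larger than the machine's memory, so any reader that makes extra copies while parsing is in trouble.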

In Python the Pandas package takes around 6 minutes to read the data, and then one is ready to work.

[Image: reading the data in Python]
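The original code appears only as an image, so the following is a minimal sketch of what a timed Pandas read might look like, not the author's script. The helper name `timed_read_csv` and the small stand-in file are assumptions made for illustration; the article's file is far larger.

```python
import os
import tempfile
import time

import pandas as pd


def timed_read_csv(path):
    """Read a CSV file fully into memory and return (frame, elapsed seconds)."""
    start = time.perf_counter()
    frame = pd.read_csv(path)  # loads the whole file in one call
    elapsed = time.perf_counter() - start
    return frame, elapsed


# Write a tiny stand-in CSV so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")  # hypothetical file name
    pd.DataFrame({"x": range(1000), "y": range(1000)}).to_csv(demo_path, index=False)
    frame, seconds = timed_read_csv(demo_path)

print(frame.shape, f"read in {seconds:.3f} s")
```

Pointing `timed_read_csv` at the real file would reproduce the kind of timing quoted above, though the result will depend on the machine and its disk.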

In R, both utils::read.csv() and readr::read_csv() fail with out-of-memory messages. So if your view of R is “base R only”, or “base R plus tidyverse only”, or “tidyverse only”, then reading this file is a “hard task.”

[Image: reading the data in R, first attempt]

With the above narrow view one would have no choice but to move to Python if one wants to get the job done.

Or, we could remember data.table. While data.table is obviously not part of the tidyverse, it has been a best practice in R for around 12 years. It can read the data and be ready to work in R in under a minute.

[Image: reading the data in R, second attempt]

In conclusion, to get things done in a pinch: learn Python or learn data.table. And, in my opinion, “tidyverse first teaching” (commonly code for “tidyverse only teaching”) may not serve the R community well in the long run.
