Blog Archives

Tutorial: Scalable R on Spark with SparkR, sparklyr and RevoScaleR

October 12, 2016

If you'd like to manipulate and analyze very large data sets with the R language, one option is to use R and Apache Spark together. R provides the simple, data-oriented language for specifying transformations and models; Spark provides the storage and computation engine to handle data much larger than R alone can handle. At the KDD 2016 conference last...

Read more »

Watch the world warm with this animated globe, created with R

October 11, 2016

Due to anthropogenic climate change, the average global temperature has increased steadily over the past century or so. While we're all familiar with the hockey-stick line chart of rising temperature, the change is even more dramatic on this animated globe showing the local effects of climate change. The first half of the animation shows the monthly local change compared...

Read more »

Make ggplot2 graphics interactive with ggiraph

October 10, 2016

R's ggplot2 package is a well-known tool for producing beautiful static data visualizations that you can include in a printed report. But what if you want to include a ggplot2 graphic on a webpage and let the user interact with the data? The ggiraph package by David Gohel (available for installation via CRAN) makes this possible. With ggiraph,...

Read more »

In case you missed it: September 2016 roundup

October 7, 2016

In case you missed them, here are some articles from September of particular interest to R users. The R-Ladies meetups and the Women in R Taskforce support gender diversity in the R community. Highlights from the Microsoft Data Science Summit include recordings of many presentations about R, and the keynote "The Future of Data Analysis" by Edward Tufte. An...

Read more »

Import data to R from SAS, SPSS and Stata with Haven

October 6, 2016

Regardless of the tool you use to analyse data, you'll often have to access data living in file formats generated by other tools. The "haven" package from RStudio allows you to import and export data in SAS, SPSS and Stata formats. Version 1.0 was released on October 4, and is now available on CRAN. Haven is also installed as...

Read more »

Statcheck: an R package to check statistical results in psychology papers

October 5, 2016

The results of many scientific papers are wrong. There are many reasons for this, including p-hacking, publication bias, and the general inability to replicate results. But there's another, more mundane cause: incorrect calculation of p-values in statistical tests. This could be caused by simple transcription errors when plugging numbers into a statistical tool, incorrect rounding, or misapplication of the...

Read more »

Homer, not Bart, is the star of the Simpsons

October 3, 2016

It's been a long time since I watched The Simpsons, but I was always under the impression that Bart was the primary character. Perhaps it was all the "Do the Bartman" and "Cowabunga!" nonsense from the 90s. Anyway, data scientist Todd W Schneider used R to analyze the scripts of the first 26 seasons and found that Homer...

Read more »

All the R Ladies

September 30, 2016

Two groups are making an impact in improving the gender diversity of R users worldwide. The R-Ladies organization is creating chapters worldwide to help female R programmers meet and work together, and the Taskforce on Women in the R Community is working to improve the participation and experience of women in the R community. R has more participation by...

Read more »

Watch: Highlights of the Microsoft Data Science Summit

September 29, 2016

I just got back from Atlanta, the host of the Microsoft Machine Learning and Data Science Summit. This was the first year for this new conference, and it was a blast: the energy from the 1,000 attendees was palpable. I covered Joseph Sirosh's keynote presentation yesterday, but today I wanted to highlight a few other talks from the program...

Read more »

Using R to detect fraud at 1 million transactions per second

September 28, 2016

In Joseph Sirosh's keynote presentation at the Data Science Summit on Monday, Wee Hyong Tok demonstrated using R in SQL Server 2016 to detect fraud in real-time credit card transactions at a rate of 1 million transactions per second. The demo (which starts at the 17:00 mark) used a gradient-boosted tree model to predict the probability of a...

Read more »
