Blog Archives

Edward Tufte Keynote Presenter at Data Science Summit, Sep 26-27

August 23, 2016

I'm excited to share that one of my data science heroes will be a presenter at the Microsoft Data Science Summit in Atlanta, September 26-27. Edward Tufte, the data visualization pioneer, will deliver a keynote address on the future of data analysis and how to draw more credible conclusions from data. If you're not familiar with Tufte,...

Read more »

Five great charts in 5 lines of R code each

August 22, 2016

Sharon Machlis is a journalist with Computerworld, and to show other journalists how great R is for data visualization, she shows them these five data visualizations, each of which can be created in 5 lines of R code or fewer. I've reproduced Sharon's code and charts below. I did make a couple of tweaks to the code, though. I...

Read more »

Five problems (and one solution) with dual-axis time series plots

August 19, 2016

If you need to present two time series spanning the same period, but on wildly different scales, it's tempting to use a time series chart with two separate vertical axes, one for each series, like this one from the Reserve Bank of New Zealand. Charts like this typically have one or more crossover points, and that crossing imparts meaning...

Read more »

Sentiment analysis of Trump’s tweets with R

August 18, 2016

Data Scientist David Robinson caused a bit of a stir in the media when he analyzed Donald Trump's tweets and revealed that those sent from an Android device were likely sent by the candidate himself, while those sent from an iPhone were likely sent by campaign staffers. The difference? As seen in the chart below, Android-based tweets used angrier,...

Read more »

Extract tables from messy spreadsheets with jailbreakr

August 17, 2016

R has some good tools for importing data from spreadsheets, among them the readxl package for Excel and the googlesheets package for Google Sheets. But these only work well when the data in the spreadsheet are arranged as a rectangular table, and not overly encumbered with formatting or generated with formulas. As Jenny Bryan pointed out in her recent...

Read more »

The inexorable growth of student debt, charted with R

August 15, 2016

Len Kiefer, Deputy Chief Economist at Freddie Mac, recently published the following chart on his personal blog showing household debt in the United States (excluding mortgage debt). As you can see, student loan debt has steadily increased over the last 13 years and has now eclipsed all other forms of non-mortgage debt. He also created this animated chart showing...

Read more »

Tuning Apache Spark for faster analysis with Microsoft R Server

August 12, 2016

My colleagues Max Kaznady, Jason Zhang, Arijit Tarafdar and Miguel Fierro recently posted a really useful guide with lots of tips to speed up prototyping models with Microsoft R Server on Apache Spark. These tips apply when using Spark on Azure HDInsight, where you can spin up a Spark cluster in the cloud with Microsoft R installed on the head...

Read more »

In case you missed it: July 2016 roundup

August 10, 2016

In case you missed them, here are some articles from July of particular interest to R users. R moves up to 5th place in the annual IEEE Spectrum programming language rankings. A guide to R-related presentations at the JSM 2016 conference. FiveThirtyEight uses R extensively for data journalism, as explained in a presentation at useR!2016. An in-depth look at...

Read more »

New cheat-sheet for the dplyrXdf package

August 8, 2016

Hadley Wickham's dplyr package is an amazing tool for restructuring, filtering, and aggregating data sets using its elegant grammar of data manipulation. By default, it works on in-memory data frames, which means you're limited to the amount of data you can fit into R's memory. Hadley also provided an extension mechanism to make dplyr work with external data sources,...

Read more »

Interactive, Illustrator-quality graphics with R

August 5, 2016

While many media properties including the New York Times, FiveThirtyEight and FlowingData use the R language to prepare graphics for publication, they often use Adobe Illustrator or similar graphics tools to touch up the last 5% or so of the graphics. Not so for Switzerland's news site swissinfo.ch, whose data journalist Duc-Quang Nguyen created these gorgeous bar charts showing...

Read more »
