The R Journal, Volume 8/1, August 2016 – is online!
[Read more...]
cricketr sizes up legendary All-rounders of yesteryear
Introduction: This is a post I have been wanting to write for several months, but had to put off for one reason or another. In this post I use my R package cricketr to analyze the performance of the all-rounder greats, namely Kapil Dev, Ian Botham, Imran Khan and Richard ...

R with Parallel Computing from User Perspectives
This article was originally published in Chinese in Capital of Statistics [link], and I would like to thank He Tong for lots of great suggestions. All code in this post can be found on GitHub [link]. Data scientists are already very familiar with statistical software like R, SAS, ... [Read more...]
RProtoBuf 0.4.6: bugfix update
Relatively quickly after version 0.4.5 of RProtoBuf was released, we have a new version 0.4.6 to announce which appeared on CRAN today.
RProtoBuf provides R bindings for the Google Protocol Buffers ("Protobuf") data encoding and serialization library used and released by Google, and deployed as a language and operating-system agnostic protocol by ... [Read more...]
Turning keywords into a co-occurrence network
This post is addressed to the GLM Fall 2016 students who are currently taking my Statistical Reasoning and Quantitative Methods course at Sciences Po in Paris.
Dear students
Since you are going to learn a lot of statistical computing/programming this semester, I thought it would be a good idea to ... [Read more...]
Human in A.I. loop
A practical example of how to leverage the strengths of both humans and artificial intelligence at a fashion company.

In case you missed it: August 2016 roundup
In case you missed them, here are some articles from August of particular interest to R users. An amusing short video extols the benefits of reproducible research with R. A guide to implementing a churn model for mobile phone customers with Microsoft R Services. Computerworld's Sharon Machlis presents 5 data visualizations ... [Read more...]
BOOK REVIEW: Financial Analytics with R
There’s a new source in town for those who want to learn R and it’s a good, old-fashioned book called Financial Analytics with R: Building a Laptop Laboratory for Data Science. Written by Mark Bennett and Dirk Hugen, it hits the shelves in the U.K. in September ... [Read more...]
Spreadsheet Errors (discussed in The Economist)
Five years ago I wrote a post titled, "Beware of Econometricians Bearing Spreadsheets". The take-away message from that post was simple: there's considerable, well-documented evidence that spreadsheets are very, very dangerous when it comes to statistical calculations. That is, if you care about getting the right answers! Read that post, ... [Read more...]
R package forecast v7.2 now on CRAN
I’ve pushed a minor update to the forecast package to CRAN. Some highlights are listed here.
Plotting time series with ggplot2
You can now facet a time series plot like this:
library(forecast) library(ggplot2) lungDeaths [Read more...]
GoodReads: Webscraping and Text Analysis with R (Part 1)
Inspired by this article about sentiment analysis and this guide to webscraping, I have decided to get my hands dirty by scraping and analyzing a sample of reviews on the website Goodreads. The goal of this project is to demonstrate a complete example, going from data collection to machine learning ... [Read more...]
mlr loves OpenML
OpenML stands for Open Machine Learning and is an
online platform that aims to support collaborative machine learning.
It is an Open Science project that allows its users to share data, code
and machine learning experiments.
At the time of writing this blog I am in Eindhoven at an ...

The elements of scaling R-based applications with DeployR
If you want to build an application using R that serves many users simultaneously, you're going to need to be able to run a lot of R sessions simultaneously. If you want R to run in the cloud, you can publish R functions as a Web service (and you can ... [Read more...]
Efficient Processing With Apply() Exercises
The apply() function is an alternative to writing loops: it applies a function to the rows, columns, or individual values of an array or matrix. The structure of the apply() function is: apply(X, MARGIN, FUN, ...) The matrix variable used for the exercises is: dataset1 [Read more...]
Make Easy Heatmaps to Visualize your Turnaround Times
The Problem
In two previous posts, I discussed visualizing your turnaround times (TATs). These posts are here and here. One other nice way to visualize your TAT is by means of a heatmap. In particular, we would like to look at the TAT for every hour of the week in ... [Read more...]
The start of satRdays
This post was originally shared on the R Consortium blog. Almost 200 people from 19 countries registered for the first satRday conference, which was held last Saturday, September 3rd, in Budapest. The final count showed that nearly 170 R users spent 12 hours at the conference venue attending workshops, regular and lightning talks, social ... [Read more...]
Application possibilities of data science in laser technology
Speaker of the [R] Kenntnis-Tage 2016: Julia Gleixner | TRUMPF Laser GmbH
The solid-state lasers of TRUMPF Laser are used for machining materials in various areas – from welding car bodies to cutting stents to drilling minute holes for the production of solar cells. These are just some of the diverse application possibilities. ...

Effect-Size Calculation for Meta-Analysis in R #rstats
When conducting a meta-analysis, you most likely have to calculate effect sizes or convert them into a common measure. There are various tools to do this; one easy-to-use tool is the Practical Meta-Analysis Effect Size Calculator from David B. Wilson. This online tool is now implemented as an ... [Read more...]
Benford’s Law in R (cont.): Actual Data
This is the second post based on Sara Silverstein's blog on Benford’s Law. Previously we duplicated the comparison of the proportion of first digits from a series of randomly generated numbers, and successive arithmetic operations on those numbers, and saw that the more complicated the operation, the closer ...

New in Magick 0.3
A new version of the rOpenSci magick package has been released to CRAN. Magick is a package for Advanced Image-Processing in R. It wraps the ImageMagick STL, which is perhaps the most comprehensive open-source image processing library available today. Our original announcement has more details.
New features
This new version ... [Read more...]
