Speed test of sequence generation for unbalanced simulation

March 30, 2015

I maintain a simulation package, simReg, for simulating regression models, including models with nested data structures. The package is available on GitHub. Over the weekend I updated it to allow for the simulation of unbalanced designs. ...

Improved memory usage and RJSONIO compatibility in jsonlite 0.9.15

March 30, 2015

The jsonlite package implements a robust, high-performance JSON parser and generator for R, optimized for statistical data and the web. Last week version 0.9.15 appeared on CRAN, which improves memory usage and compatibility with other packages.

Migrating to jsonlite

The upcoming release of shiny will switch from RJSONIO to jsonlite. To...

MCMskv, Lenzerheide, Jan. 5-7, 2016

March 30, 2015

Following the highly successful MCMski IV in Chamonix last year, the BayesComp section of ISBA has decided in favour of a two-year period, which means the great news that next year we will meet again for MCMski V, this time on the snowy slopes

Fastest Growing Software for Scholarly Analytics: Python, R, KNIME…

March 30, 2015

In my ongoing quest to “analyze the world of analytics”, I’ve added the following section below to The Popularity of Data Analysis Software: It would be useful to have growth trend graphs for each of the analytics packages I track, …

rud.is » R 2015-03-30 13:32:08

March 30, 2015

Over on The DO Loop, @RickWicklin does a nice job visualizing the causes of airline crashes in SAS using a mosaic plot. More often than not, I find mosaic plots can be a bit difficult to grok, but Rick’s use was spot on and I believe it shows the data pretty well, but I also

The most common R error messages

March 30, 2015

R has something of a reputation for generating, shall we say, obscure error messages like this:

Error in model.frame.default(formula = y ~ female + DNC + SE_region + : could not find function "function (object, ...) nobject"

One tip for dealing with error messages is to ignore everything between "Error in" and the colon: unless you are running a...
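
The tip about reading past the "Error in ...:" prefix can be tried directly in R. The following is a minimal sketch, not taken from the original post: `fit_model()` is a hypothetical function invented here only to trigger an error. It captures the error and separates the call (the part R prints between "Error in" and the colon) from the message (the part after the colon), using base R's `conditionCall()` and `conditionMessage()`.

```r
# Hypothetical function, used only to trigger an error for illustration.
fit_model <- function(x) {
  if (!is.numeric(x)) stop("'x' must be numeric")
  mean(x)
}

# Capture the error condition instead of letting it halt the script.
err <- tryCatch(fit_model("a"), error = function(e) e)

# The call is what R prints between "Error in" and the colon.
err_call <- deparse(conditionCall(err))

# The message is the part after the colon, usually the informative bit.
err_msg <- conditionMessage(err)

print(err_call)
print(err_msg)
```

Printing the two pieces separately makes it easier to see that the message, rather than the echoed call, usually carries the information needed to diagnose the problem.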

Mapping Flows in R

March 30, 2015

Last year I published the above graphic, which then got

Data Visualization cheatsheet, plus Spanish translations

March 30, 2015

We’ve added a new cheatsheet to our collection. Data Visualization with ggplot2 describes how to build a plot with ggplot2 and the grammar of graphics. You will find helpful reminders of how to use: geoms, stats, scales, coordinate systems, facets, position adjustments, legends, and themes. The cheatsheet also documents tips on zooming. Download the cheatsheet

Need for Processing Speed: data.table

March 30, 2015

The first time I discovered data.table it felt like magic. I was waiting on a process that was projected to take the better part of an afternoon. In the meantime, I followed the data.table tutorial, rewrote my code using the data.table structure, and fully executed...

R & Google Maps and R & Robotics (ROS)

The next RBelgium meetup will be about R & Google Maps and R & Robotics (ROS). BNOSAC will be hosting the event this time. This is the schedule:
• 17h30-18h: open questions
• 18h-19h: R and Google Maps
• 19h-20h: R and...

Sampling Distributions and Central Limit Theorem in R

March 29, 2015

The Central Limit Theorem (CLT), and the concept of the sampling distribution, are critical for understanding why statistical inference works. There are at least a handful of problems that require you to
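
As a rough illustration of the sampling distribution idea, here is a minimal simulation sketch in base R. The choice of an exponential population, the sample size, and the number of replications are assumptions made for this example and do not come from the original post. It draws many samples, records each sample mean, and compares the spread of those means with the value the CLT suggests, sigma / sqrt(n).

```r
set.seed(42)

n    <- 50     # size of each sample (assumed for illustration)
reps <- 5000   # number of repeated samples (assumed for illustration)

# Population: Exponential(rate = 1), which is skewed, with mean 1 and sd 1.
sample_means <- replicate(reps, mean(rexp(n, rate = 1)))

# Centre and spread of the simulated sampling distribution of the mean.
observed_mean <- mean(sample_means)
observed_sd   <- sd(sample_means)

# Standard error suggested by the CLT: sigma / sqrt(n).
expected_sd <- 1 / sqrt(n)

print(c(observed_mean = observed_mean,
        observed_sd   = observed_sd,
        expected_sd   = expected_sd))
```

Calling `hist(sample_means)` afterwards shows a roughly bell-shaped histogram even though the underlying exponential population is strongly skewed.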

Autoregressive Conditional Poisson Model – I

March 29, 2015

Modeling time series of count outcomes is of interest in operational risk when forecasting the frequency of losses. Below is an example showing how to estimate a simple ACP(1, 1) model, i.e. an Autoregressive Conditional Poisson model, without covariates, using the ACP package.

R Stats + Digital Analytics: 8 Blogs you should Follow

March 29, 2015

Are you interested in using R for your digital analytics projects? Do you need to perform prediction modelling and visualizations on your digital data, and Excel just can’t do the job the way you want? Or do you simply have no idea how R could help you with your digital analytics problems, and you would like to see some real working examples...

intuition beyond a Beta property

March 29, 2015

A self-study question on X validated exposed an interesting property of the Beta distribution: if x is B(n,m) and y is B(n+½,m), then √(xy) is B(2n,2m). While this can presumably be established by a mere change of variables, I could not carry the derivation till the end and used instead the moment generating function E
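
The Beta property can be checked numerically. The following is a minimal simulation sketch, not the derivation from the original post. It assumes that x and y are independent (the excerpt does not say so explicitly) and uses arbitrary illustrative values n = 2 and m = 3. It compares empirical quantiles of √(xy) with the quantiles of a B(2n, 2m) distribution.

```r
set.seed(1)

n <- 2       # first shape parameter (arbitrary choice for illustration)
m <- 3       # second shape parameter (arbitrary choice for illustration)
N <- 1e5     # number of simulated draws

# Independent draws; independence is an assumption of this sketch.
x <- rbeta(N, n, m)
y <- rbeta(N, n + 0.5, m)
z <- sqrt(x * y)

# Compare simulated quantiles of z with those of Beta(2n, 2m).
probs       <- c(0.1, 0.25, 0.5, 0.75, 0.9)
empirical   <- unname(quantile(z, probs))
theoretical <- qbeta(probs, 2 * n, 2 * m)

print(round(cbind(probs, empirical, theoretical), 3))
```

Close agreement between the two quantile columns is consistent with the stated property, although a simulation is of course not a proof.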

Iteratively Populating Templated Sentences With Inline R in knitr/Rmd

March 29, 2015

As part of the Wrangling F1 Data With R project, I want to be able to generate sentences iteratively from a templated base. The following recipe works for sentences included in an external file: What I’d really like to be able to do is put the Rmd template into a chunk something like this…: and

Segmenting F1 Qualifying Session Laptimes

March 29, 2015

I’ve started scraping some FIA timing sheets again, including practice and qualifying session laptimes. One of the things I’d like to do is explore various ways of looking at the qualifying session laptimes, which means identifying which qualifying session each laptime falls into: For looking at session utilisation charts I’ve been making use of accumulated

Space Launch Sites over Time

March 29, 2015

Continuing from last week’s post, I am now looking at space launch sites.

Data

Data are from the main table. In addition, this sites table was manually browsed for interpretation of abbreviations.

List of most important sites

Just by running counts the most...

Makefiles and RMarkdown

March 28, 2015

Quite some time ago (October 2013, according to Amazon), I bought a copy of “Reproducible Research with R and RStudio” by Christopher Gandrud. And it was awesome. Since then, I’ve been using knitr and RMarkdown quite a lot. However, until recently, I never bothered with a makefile. At the time, I had assumed that it

ggExtra: R package for adding marginal histograms to ggplot2

March 28, 2015

My first CRAN package, ggExtra, contains several functions to enhance ggplot2, the most important being ggExtra::ggMarginal(), a function that finally makes it easy to add marginal density plots or histograms to scatterplots.

Availability

You can read the full README describing the functionality in detail or browse the source code on GitHub. The package is available through both...

Parallel R with BatchJobs

March 28, 2015

Parallelizing R with BatchJobs – An example using k-means

Gord Sissons, Feng Li

Many simulations in R are long running. Analysis of statistical algorithms can generate workloads that run for hours, if not days, tying up a single computer. Given the amount of time R programmers can spend waiting for results, getting acquainted with parallelism makes

Using the R MatchIt package for propensity score analysis

March 28, 2015

Descriptive analysis between treatment and control groups can reveal interesting patterns or relationships, but we cannot always take descriptive statistics at face value. Regression and matching methods allow us to make controlled comparisons to reduc...

A Deterministic Compartmental Model for Erythropoiesis with R

March 28, 2015

I recently finished developing a deterministic, compartmental model of erythropoiesis (how red blood cells are produced and destroyed) in R using the deSolve package. I have also made a Shiny application for the simulation. The model ca...

R style default plot for Pandas DataFrame

March 28, 2015

The default plot method for data frames in R is to show each numeric variable in a pair-wise scatter plot. I find this to be a really useful first look at a dataset, both to see correlations and joint distributions between variables, but also to quickly diagnose potential strangeness like bands of repeating values or outliers. From

Interactive Maps for John Snow’s Cholera Data

March 28, 2015

This week, in Istanbul, for the second training on data science, we’ve been discussing classification and regression models, but also visualisation, including maps. And we did have a brief introduction to the leaflet package:

    devtools::install_github("rstudio/leaflet")
    require(leaflet)

To see what can be done with that package, we will once more use John Snow’s cholera dataset, discussed in previous...

Using closures as objects in R

March 27, 2015

For more and more clients we have been using a nice coding pattern taught to us by Garrett Grolemund in his book Hands-On Programming with R: make a function that returns a list of functions. This turns out to be a classic functional programming technique: use closures to implement objects (terminology we will explain). It …
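
The pattern of a function that returns a list of functions can be sketched in a few lines of base R. This is a minimal illustration, not code from the original post or from the book; the names `make_counter`, `increment`, and `value` are hypothetical. The returned functions share the enclosing environment, so the variable `count` behaves like private object state.

```r
# Constructor: returns a list of functions that share one environment.
make_counter <- function(start = 0) {
  count <- start  # private state, visible only to the closures below

  list(
    # Modify the shared state with the superassignment operator.
    increment = function() {
      count <<- count + 1
      invisible(count)
    },
    # Read the shared state.
    value = function() count
  )
}

counter_a <- make_counter()
counter_b <- make_counter(start = 10)

counter_a$increment()
counter_a$increment()
counter_b$increment()

print(counter_a$value())
print(counter_b$value())
```

Each call to `make_counter()` creates a fresh environment, so `counter_a` and `counter_b` keep independent counts, much like two instances of a class.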

Replay: Reproducible data analysis with the checkpoint package

March 27, 2015

Thanks to all who attended my webinar earlier this week, Reproducibility with Revolution R Open and the Checkpoint Package. If you missed the live session, you can catch up with the slides and video replay which I've embedded below. If you just want to check out the demo of the checkpoint package, it starts at 18:30 in the video...

rClr 0.7-4 released

March 26, 2015

Version 0.7-4 of rClr (source code mirrored on GitHub), a package to access arbitrary .NET code seamlessly and in-process, has been released. This is a maintenance release with an important fix to memory management. In some circumstances, passing data ...

Google Scholar Finds Far More SPSS Articles; Analytics Forecast Updated

March 26, 2015

Only last August I wrote that among scholars, the use of R had probably exceeded that of SPSS to become their most widely used software for analytics. That forecast was based on Google Scholar searches focused on one year at a …

Review of "Hands-On Programming with R"

March 26, 2015

by Joseph Rickert

There have been well over a hundred books on R published within the last ten years. Most of these texts, with titles like “Introductory Statistics with R” or “Time Series with R”, offer the reader a way to jump right in and perform some concrete statistical analysis using R’s myriad built-in functions and extensive visualization features....
