choroplethr v3.1.0: Better Summary Demographic Data

May 5, 2015

Today I am happy to announce that choroplethr v3.1.0 is now on CRAN. You can get it by typing the following from an R console: install.packages("choroplethr"). This version adds better support for summary demographic data for each state and county in the US. The data is in two data.frames and two functions. The data.frames are:

Read more »

stringr 1.0.0

May 5, 2015

I’m very excited to announce the 1.0.0 release of the stringr package. If you haven’t heard of stringr before, it makes string manipulation easier by: Using consistent function and...

Read more »

Data Science in HR

May 5, 2015

by Joseph Rickert. Last year, in a post on interesting R topics presented at the JSM, I described how data scientists in Google's human resources department were using R...

Read more »

Predicting events, when they haven’t happened yet

May 5, 2015

Suppose you have to predict the probabilities of events which haven't happened yet. How do you do this? Here is an example from the 1950s when Longley-Cook, an actuary at...

Read more »

Clusters May Be Categorical but Cluster Membership Is Not All-or-None

May 4, 2015

Very early in the study of statistics and R, we learn that random variables can be either categorical or continuous. Regrettably, we are forced to relearn this distinction over...

Read more »

RcppAnnoy 0.0.6


A few days ago, Erik released a new version of his Annoy library -- a small, fast, and lightweight C++ template header library for approximate nearest neighbours --...

Read more »

take those hats off [from R]!

May 4, 2015

This is presumably obvious to most if not all R programmers, but I became aware today of a hugely (?) delaying tactic in my R codes. I was working...

Read more »

Working with “large” datasets, with dplyr and data.table

May 4, 2015

A few months ago, I was doing some training on data science for actuaries, and I started to get interesting, puzzling questions. For instance, Fleur was working on telematic...

Read more »

Call R and Python from base SAS

May 4, 2015

Since 2009, it has been possible to call R from SAS programs. However, this integration requires IML, an add-on matrix-object language for SAS which isn't available with all SAS...

Read more »

using GOSemSim to rank proteins obtained by co-IP

May 4, 2015

Co-IP is usually used to identify interactions among specific proteins. It is widely used in detecting protein complexes. Unfortunately, an identified protein may not be an interactor, and sometimes...

Read more »

Geomorph beta in development (2.1.5)

May 3, 2015

Dear geomorph users, We've been busy adding some new functions to the forthcoming v.2.1.5, currently in beta stage and available on GitHub (installed using: devtools::install_github("EmSherratt/geomorph",ref = "Develop")). Users be aware that ...

Read more »

dplyr Tutorial: verbs + split-apply

May 3, 2015

At a recent Saint Louis R users meeting I had the pleasure of giving a basic introduction to the awesome dplyr R package. For me, data analysis ubiquitously involves...

Read more »

Cohort Analysis with Heatmap


Previously I shared the data visualization approach for descriptive analysis of progress of cohorts with the “layer-cake” chart (part I and part II). In this post, I want to share...

Read more »

Introducing Radiant: A shiny interface for R

May 3, 2015

Radiant is a platform-independent browser-based interface for business analytics in R, based on the Shiny package. Key features. Explore: Quickly and easily summarize, visualize, and analyze your data ...

Read more »

Survival Analysis With Generalized Additive Models : Part IV (the survival function)

May 2, 2015

The ability of PGAMs to estimate the log-baseline hazard rate endows them with the capability to be used as smooth alternatives to the Kaplan-Meier curve. If we assume...

Read more »

Update to Introduction to programming econometrics with R

May 2, 2015

This semester I taught a course on applied econometrics with the R programming language. For this, I created a document that I gave to my students and shared online....

Read more »

Survival Analysis With Generalized Additive Models : Part III (the baseline hazard)

May 2, 2015

In the third part of the series on survival analysis with GAMs we will review the use of the baseline hazard estimates provided by this regression model. In contrast...

Read more »

Survival Analysis With Generalized Models: Part II (time discretization, hazard rate integration and calculation of hazard ratios)

May 2, 2015

In the second part of the series we will consider the time discretization that makes the Poisson GAM approach to survival analysis possible. Consider a set of s individual...

Read more »

Rcpp 0.11.6

The new release 0.11.6 of Rcpp arrived on the CRAN network for GNU R yesterday; the corresponding Debian package has also been uploaded. Rcpp has become the most popular...

Read more »

RcppArmadillo 0.5.100.1.0

A new minor release 5.100.1 of Armadillo was released by Conrad yesterday. Armadillo is a powerful and expressive C++ template library for linear algebra aiming towards a...

Read more »

Should I use premium Diesel? Result: No

May 2, 2015

A while ago I had a post: 'Should I use premium Diesel? Setup'. Since that time the data has been acquired. This post describes the results. Data: Data is registered by me...

Read more »

Introducing Radiant: A shiny interface for R

May 1, 2015

Radiant is a platform-independent browser-based interface for business analytics in R, based on the Shiny package. Key features. Explore: Quickly and easily summarize, visualize, and analyze your data ...

Read more »

Revolution R Open 8.0.3 now available

May 1, 2015

Revolution R Open 8.0.3 is now available for download for Windows, OS X, Red Hat, Ubuntu and OpenSUSE. This release includes several new features: it upgrades RRO to the...

Read more »

RStudio v0.99 Preview: Graphviz and DiagrammeR

May 1, 2015

Soon after the announcement of htmlwidgets, Rich Iannone released the DiagrammeR package, which makes it easy to generate graph and flowchart diagrams using text in a Markdown-like syntax. The package...

Read more »

Survival Analysis With Generalized Additive Models : Part I (background and rationale)

May 1, 2015

After a really long break, I will resume my blogging activity. It is actually a full circle for me, since one of the first posts that kick-started this blog,...

Read more »

Shiny: Officer Involved Shootings

May 1, 2015

US Officer Involved Shootings Mar-Apr 2015 with Shiny. Now everyone can be a data analyst with RStudio’s Shiny package. Fellow R programmer and Las Vegas import, Steve Wells, has created...

Read more »

rstanmulticore: A cross-platform R package to automatically run RStan MCMC chains in parallel

May 1, 2015

*** This work has been supported by a grant from the Spencer Foundation (#201400002). The views expressed are those of the author and do not necessarily reflect those of...

Read more »

How large vectors in R might be stored compactly

April 30, 2015

Vectors in R can currently have elements of two sizes — 8-byte double-precision floating-point elements for 'numeric' vectors, or 4-byte elements for 'integer' or 'logical' vectors. You can also have vectors whose...

Read more »

Upcoming talks about jsonlite and mongolite

April 30, 2015

This summer I will be giving an invited talk at the annual French R Meeting in Grenoble as well as a shorter talk...

Read more »