## Graph-based circle packing

July 27, 2015
The previous two posts showed examples of a simple circle packing algorithm using the packcircles package (available from CRAN and GitHub). The algorithm involved iterative pair-repulsion to jiggle the circles until (hopefully) a non-overlapping arrangement emerged. In this post we'll look at an alternative approach. An algorithm to find an arrangement of circles satisfying a prior specification of...

## Statistical Models of Judgment and Choice: Deciding What Matters Guided by Attention and Intention

July 27, 2015
Preference begins with attention, a form of intention-guided perception. You enter the store thirsty on a hot summer day, and all you can see is the beverage cooler at the far end of the aisle with your focus drawn toward the cold beverages that you im...

## Egyptian fractions [Le Monde puzzle #922]

July 27, 2015
For its summer edition, Le Monde mathematical puzzle switched to a lighter version with immediate solution. This #922 considers Egyptian fractions, that is, sums of unit fractions (the numerator is always 1) in which each denominator appears only once. This means 3/4 is represented as ½+¼. As I discovered when looking online,
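As a small illustration of the representation described above (not the puzzle's actual question, which the excerpt does not give), here is a minimal sketch of the classical greedy method for writing a fraction as a sum of distinct unit fractions. The function name `egyptian` is hypothetical, and the greedy choice is only one of several possible decompositions.

```python
from fractions import Fraction


def egyptian(frac):
    """Greedy Egyptian-fraction decomposition (illustrative sketch).

    Returns the denominators d1 < d2 < ... such that
    frac == 1/d1 + 1/d2 + ...; only defined here for 0 < frac <= 1.
    """
    frac = Fraction(frac)
    if not 0 < frac <= 1:
        raise ValueError("expected a fraction in (0, 1]")
    denominators = []
    while frac > 0:
        # Smallest d with 1/d <= frac, i.e. ceil(denominator / numerator).
        d = -(-frac.denominator // frac.numerator)
        denominators.append(d)
        frac -= Fraction(1, d)
    return denominators


if __name__ == "__main__":
    # 3/4 = 1/2 + 1/4, matching the example in the text.
    print(egyptian(Fraction(3, 4)))
```

Using `Fraction` keeps the arithmetic exact, so the loop terminates as soon as the remainder reaches zero.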

## Hadley Wickham on why he created all those R packages

July 27, 2015
Priceonomics published on Friday an in-depth profile of Hadley Wickham, author of many of the most popular R packages including ggplot2, dplyr and devtools. In the article, he reveals that his motivation for creating these packages was primarily to provide better ways of accomplishing routine tasks in R, an immensely useful contribution that sadly wasn't recognized in an academic...

## Announcing: Mastering RStudio

July 27, 2015
Learn the holistic use of RStudio to communicate your R code effectively and persuasively. Max (@nierhoff) and I are both absolute R enthusiasts. We both strongly believe in the power of R for statistical computing. And we are both also fascinated by the nearly endless possibilities of the R programming language enabling users to build robust...

## Efficient accumulation in R

July 27, 2015
R has a number of very good packages for manipulating and aggregating data (plyr, sqldf, ScaleR, data.table, and more), but when it comes to accumulating results the beginning R user is often at sea. The R execution model is a bit exotic, so many R users are very uncertain which methods of accumulating results are...

## RBerkeley Was Just Pining For The Fjords

July 27, 2015
If you made it to Chapter 8 of Data-Driven Security after ~October 2014 and tried to run the BerkeleyDB R example, you were greeted with: Warning in install.packages : package ‘RBerkeley’ is not available (for R version ) That’s due to the fact that it was removed from CRAN at the end of September, 2014 because the package author &...

## Evading the “Hadley tax”: Faster Travis tests for R

July 26, 2015
Hadley is a popular figure, and rightly so as he successfully introduced many newcomers to the wonders offered by R. His approach strikes some of us old greybeards as wrong; I particularly take exception to some of his writing, which frequently portrays a particular approach as both the best and only one. Real programming, I think, is...

## Turning your R (or Python) models into APIs

July 26, 2015
More and more real-world systems are relying on data science and analytical models to deliver sophisticated functionality or improved user experiences. For example, Microsoft combined the power of advanced predictive models and web services to develop the real-time voice translation feature in Skype. Facebook and Google continuously improve their deep learning models for...

## Tiny Data, Approximate Bayesian Computation and the Socks of Karl Broman: The Movie

July 26, 2015
This is a screencast of my UseR! 2015 presentation: Tiny Data, Approximate Bayesian Computation and the Socks of Karl Broman. Based on the original blog post it is a quick’n’dirty introduction to approximate Bayesian computation (and is also, in a sense, an introduction to Bayesian statistics in general). Here it is, if you have 15 minutes to...

## Making Static/Interactive Voronoi Map Layers In ggplot/leaflet

July 26, 2015
Despite having shown various ways to overcome D3 cartographic envy, there are always more examples that can cause the green monster to rear its ugly head. Take the Voronoi Arc Map example. For those in need of a primer, a Voronoi tessellation/diagram is: …a partitioning of a plane into regions based on distance to points

## RcppZiggurat 0.1.3: Faster Random Normal Draws

July 26, 2015
After a slight hiatus since the last release in early 2014, we are delighted to announce a new release of RcppZiggurat which is now on the CRAN network for R. The RcppZiggurat package updates the code for the Ziggurat generator which provides very f...

## Installing and Starting SparkR Locally on Windows OS and RStudio

July 26, 2015
With the recent release of Apache Spark 1.4.1 on July 15th, 2015, I wanted to write a step-by-step guide to help new users get up and running with SparkR locally on a Windows machine using command shell and RStudio. SparkR provides an R frontend to Apache Spark and using Spark’s distributed computation engine allows

## Where do letters occur in words

July 26, 2015
A while back I encountered an interesting graphic showing where letters were located in English words (http://www.prooffreader.com/2014/05/graphing-distribution-of-english.html). The other day I decided to do a similar one for letters in Danish words, and for this I used R. I downloaded all abstracts from the Danish Wikipedia and made my own version as you can see...

## Predicting Titanic deaths on Kaggle II: gbm

July 26, 2015
Following my previous post I have decided to try and use a different method: generalized boosted regression models (gbm). I have read the background in Elements of Statistical Learning and Arthur Charpentier's nice post on it. This data ...

## Rcpp 0.12.0: Now with more Big Data!

July 25, 2015
A new release 0.12.0 of Rcpp arrived on the CRAN network for GNU R this morning, and I also pushed a Debian package upload. Rcpp has become the most popular way of enhancing GNU R with C++ code. As of today, 423 packages on CRAN depend on Rcpp...

## Roll Your Own Gist Comments Notifier in R

July 25, 2015
As I was putting together the coord_proj ggplot2 extension I had posted a gist (https://gist.github.com/hrbrmstr/363e33f74e2972c93ca7) that I shared on Twitter. Said gist received a comment (several, in fact) and a bunch of us were painfully reminded of the fact that there is no built-in way to receive notifications from said comment activity. @jennybryan posited that it

## IEEE Spectrum Puts R in 6th Place

July 25, 2015
R has moved up three positions to 6th place in the IEEE Spectrum ranking. How long will it be before Julia is on the list? The post IEEE Spectrum Puts R in 6th Place appeared first on Exegetic Analytics.

## Logistic Growth, S Curves, Bifurcations, and Lyapunov Exponents in R

July 24, 2015
If you’ve ever wondered how logistic population growth (the Verhulst model), S curves, the logistic map, bifurcation diagrams, sensitive dependence on initial conditions, “orbits”, deterministic chaos, and Lyapunov exponents are related to

## New quantmod and TTR on CRAN

July 24, 2015
I just sent quantmod_0.4-5 to CRAN, and TTR_0.23-0 has been there for a couple of weeks. I'd like to thank Ivan Popivanov for many useful reports and patches to TTR. He provided patches to add HMA (Hull MA), ALMA, and ultimateOscillator functions. Jam...

## A Path Towards Easier Map Projection Machinations with ggplot2

July 24, 2015
The \$DAYJOB doesn’t afford much opportunity to work with cartographic datasets, but I really like maps and tinker with shapefiles and geo-data when I can, plus answer a ton of geo-questions on StackOverflow. R makes it easy—one might even say too easy—to work with maps. All it takes to make a map of the continental

## {Long Vs. Wide} Data Frames

July 24, 2015
This is an excellent resource for understanding the two data frame formats: long and wide. Just take a look at figure 1 inside the article. 1) Long format: ggplot2 needs this kind of format to work in certain scenarios (generally grouped...

## R #6 in IEEE 2015 Top Programming Languages, Rising 3 Places

July 24, 2015
IEEE Spectrum has published its 2015 list of Top Programming Languages, and R ranks in 6th place, jumping 3 places from its 2014 ranking. Here's what the IEEE has to say about the top 10 from the table above: The big five—Java, C, C++, Python, and C#—remain on top, with their ranking undisturbed, but C has edged to within...

## Why I use Panel/Multilevel Methods

July 24, 2015
I don’t understand why any researcher would choose not to use panel/multilevel methods on panel/hierarchical data. Let’s take the following linear regression as an example: , where is a random effect for the i-th group. A pooled OLS regression model for the above is unbiased and consistent. However, it will be inefficient, unless for all
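The equation in the excerpt above did not survive extraction. As a hedged sketch only, assuming the post refers to a standard random-intercept specification (the symbols below are illustrative assumptions, not the author's notation), such a model can be written as:

$$
y_{ij} = \beta_0 + \beta_1 x_{ij} + u_i + \varepsilon_{ij},
\qquad \operatorname{Var}(u_i) = \sigma_u^2,
\quad \operatorname{Var}(\varepsilon_{ij}) = \sigma_\varepsilon^2
$$

Here $u_i$ plays the role of the random effect for the i-th group. Under the usual assumption that $u_i$ is uncorrelated with $x_{ij}$, observations in the same group share $u_i$ and are therefore correlated, with intraclass correlation $\sigma_u^2 / (\sigma_u^2 + \sigma_\varepsilon^2)$. Pooled OLS ignores this correlation, which is where the loss of efficiency comes from whenever $\sigma_u^2 > 0$.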

## mapView: basic interactive viewing of spatial data in R

July 24, 2015
When working with spatial data in R, I quite often need to quickly check visually whether a certain analysis has produced reasonable results. There are two ways I usually do this. Either I: (sp)plot the data in...

## CACM Highlights R

July 23, 2015
The Association for Computing Machinery is the main professional organization for computer science, largely for academia but still with a broad membership. ACM publishes a number of journals, most of them for research, but its flagship publication is a magazine, the Communications of the ACM. The current issue of the CACM includes an article, “Bringing...

## A 15-Week Intro Statistics Course Featuring R

July 23, 2015
Do you teach introductory statistics or data science? Need some help planning your fall class? I apply the 10 Principles of Burning Man in the design and conduct of all my undergraduate

## An alternative presentation of the ProPublica Surgeon Scorecard

July 23, 2015
ProPublica, an independent investigative journalism organisation, have published surgeon-level complication rates based on Medicare data. I have already highlighted problems with the reporting of the data: surgeons are described as having a “high adjusted rate of complications” if they fall in the red-zone, despite there being too little data to say whether this has happened