2014 in Review: Docker Rising

January 2, 2015

When looking back on 2014 from an infrastructure perspective, it's hard not to have one word on the lips: Docker. (Or, as we are wont to do in Silicon Valley when a technology is particularly hot, have the same word on the lips three times over à la Gabbo: "Docker, Docker, DOCKER!") While Docker …

Read more »

Custom Gridlines and Line Guides in R/ggplot Charts

January 2, 2015

In the last quarter of last year, I started paying more attention to the use of custom grid lines and line guides in charts I’ve been developing for the Wrangling F1 Data With R book. The use of line guides was in part inspired by canopy views from within the cockpit of one of the

Read more »

Adjustment for Multiple Comparison Tests with R: Resources on the web

January 2, 2015

1. Bonferroni correction
   p.adjust(p, method = "bonferroni")
   Read: http://en.wikipedia.org/wiki/
2. Sidak (Dunn-Sidak) correction
   Read: http://en.wikipedia.org/wiki/
3. Holm-Bonferroni correction
   p.adjust(p, method = "holm")
   Read: http://en.wikipedia.org/wiki/
4. Hochberg correction
   p.adjust(p, method = "hochberg")
   Read: http://stats.stackexchange.com/questions/
   Read: http://onbiostatistics.blogspot.it/
5. Hommel correction
   p.adjust(p, method = "hommel")
   Read: http://stats.stackexchange.com/questions
6. Benjamini-Hochberg correction
   p.adjust(p, method = "BH")
   or equivalently
   p.adjust(p, method = "fdr")
   Read: http://nebc.nerc.ac.uk/courses/
   Read: http://en.wikipedia.org/wiki/
7. Benjamini–Yekutieli (Benjamini–Hochberg–Yekutieli) correction
   p.adjust(p, method = "BY")
   Read: http://en.wikipedia.org/wiki/
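
The p.adjust() calls listed above can be compared side by side. The following is a minimal sketch in base R, not code from the original post: the p-values are invented for illustration, and the Sidak line uses the standard single-step formula 1 - (1 - p)^m, which is an assumption here because the post gives no code for that correction and p.adjust() has no "sidak" method.

```r
# Hypothetical p-values, invented purely for illustration
p <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205)

# The methods from the list above that p.adjust() supports directly
methods <- c("bonferroni", "holm", "hochberg", "hommel", "BH", "BY")

# One column of adjusted p-values per method, one row per test
adjusted <- sapply(methods, function(m) p.adjust(p, method = m))

# Sidak is not built into p.adjust(); computed by hand with the
# single-step formula (an assumption, see the note above)
sidak <- 1 - (1 - p)^length(p)

print(round(cbind(raw = p, sidak = sidak, adjusted), 3))

# "fdr" is an alias for "BH", so both calls give the same result
identical(p.adjust(p, method = "BH"), p.adjust(p, method = "fdr"))
```

Reading across a row shows how conservative each correction is for the same raw p-value: Bonferroni never gives a smaller adjusted value than Holm, and the single-step Sidak value never exceeds the Bonferroni one.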

Read more »

An experience of EARL

January 2, 2015

Coordinates: 2014 September 15-17 in the London borough of #rstats. 15th, evening I had just the right number of R bugs so that I could walk to the drinks and arrive fashionably late.  On the way, I realized that I hadn’t been near the Tower of London since the first year I moved to London even The post

Read more »

Video: H2O Talks by Trevor Hastie and John Chambers

January 2, 2015

by Joseph Rickert

In a recent post, where I presented some R-related highlights of November's H2O World conference, I singled out and described talks by Trevor Hastie and John Chambers and remarked that it would be nice if the videos were made available. Well, thanks to the generosity of the folks at H2O I got my wish....

Read more »

Querying the Bitcoin blockchain with R

January 2, 2015

The crypto-currency Bitcoin and the way it generates “trustless trust” is one of the hottest topics when it comes to technological innovations right now. The way Bitcoin transactions always backtrace the whole transaction list since the first discovered block (the Genesis block) does not only work for finance. The first startups such as Blockstream

Read more »

A weird and unintended consequence of Barr et al’s Keep It Maximal paper

January 2, 2015

Barr et al's well-intentioned paper is starting to lead to some seriously weird behavior in psycholinguistics! As a reviewer, I'm seeing submissions where people take the following approach:
1. Try to fit a "maximal" linear mixed model. If you get...

Read more »

cpumemlog: Monitor CPU and RAM usage of a process (and its children)

January 1, 2015

Long time no see ... Today I pushed the cpumemlog script to GitHub: https://github.com/gregorgorjanc/cpumemlog. Read more about this useful utility at the GitHub site.

Read more »

2014 highlight: Statistical Learning course by Hastie & Tibshirani

January 1, 2015

What I like most about the R and Python developer and user communities is their incredible openness and generosity. One of the finest examples in the past year was the online course “Statistical Learning” taught by Stanford professors Trevor Hastie and Rob Tibshirani. In this MOOC they explain very understandably (even

Read more »

Change Point Detection in Time Series with R and Tableau

January 1, 2015

Introduction

Happy new year to all of you. Even if you are still fighting the aftereffects of your New Year’s party, the following is something that may help in getting you...

Read more »

Germans used to have more Sex in Summer!

January 1, 2015

Wow – what a headline … okay, I admit it’s phrased quite sensationally, given that it anticipates just one possible interpretation of increasingly more births around summer / autumn compared to spring … but I guess I just get …

Read more »

Happy New Year! A look at the top posts from 2014.

January 1, 2015

Happy New Year everyone! Another year has come and gone, and this blog has just entered its seventh year of publication. (Once again, I missed the anniversary back on December 9.) Thanks to everyone who has supported this blog over the past 6 years by reading, sharing and commenting on our posts. And an extra-special thanks to all...

Read more »

Why Backtesting On Individual Legs In A Spread Is A BAD Idea

December 31, 2014

So after reading the last post, the author of quantstrat had mostly critical feedback, mostly of the philosophy that prompted …

Read more »

R in Nature, Mashable

December 31, 2014

R was recently the subject of a feature article in the prestigious science magazine Nature: Programming tools: Adventures with R. Besides being free, R is popular partly because it presents different faces to different users. It is, first and foremost, a programming language — requiring input through a command line, which may seem forbidding to non-coders. But beginners can...

Read more »

Interactive Simple Networks

December 31, 2014

This post isn't anything new in terms of analysis, but just a cooler look at a previous post.  I looked at board members of large companies in a previous blog post and showed via a simple network how they share board members and provided some...

Read more »

digest 0.6.8

December 31, 2014

Release 0.6.8 of the digest package is now on CRAN and will get to Debian shortly. This release opens the door to also providing the digest functionality at the C level to other R packages. Wush Wu is going to use the murmurHash C implementation in his r...

Read more »

DataVis with Plot.ly (@plotlygraphs) – Meetup Summary

December 30, 2014

It pains me to admit it, but even though I had visited their site, created...

Read more »

SAS is #1…In Plans to Discontinue Use

December 30, 2014

I’ve been tracking The Popularity of Data Analysis Software for many years now, and a clear trend is the decline of the market share of the bigger analytics firms, notably SAS and SPSS. Many people have interpreted my comments as implying …

Read more »

Plot with ggplot2 and plotly within knitr reports

December 30, 2014

Plotly is a platform for making, editing, and sharing graphs. If you are used to making plots with ggplot2, you can call ggplotly() to make your plots interactive, web-based, and collaborative. For example, see plot.ly/~marianne2/166, shown below. Notice the hover text! The “plotly” R package lets you use plotly with R. Want to...

Read more »

Widgets For Christmas

December 30, 2014

For Christmas, I generally want electronic widgets, but after six months of development, all I wanted this Christmas was htmlwidgets, and Santa RStudio/jj,joe,yihui and Santa Ramnath delivered early with this RStudio tweet on December 17th. htmlwidget...

Read more »

The 6th Spanish R Users Conference

December 30, 2014

by Emilio L. Cano

The VI Spanish R Users Conference took place on October 23 and 24 in Santiago de Compostela (Spain). It was a two-day event with a variety of talks and workshops about the R statistical software and programming language and its applications. First of all, let me thank all the local organizers, the melisa association and...

Read more »

Mapping San Francisco crime

December 30, 2014

When I was working as a data scientist at Apple in Silicon Valley, I’d drive up to San Francisco on nights and weekends to meet a girl for dinner or go to a meetup. I sort of fell in love with the city, and ... The post Mapping San Francisco crime appeared first on SHARP SIGHT LABS.

Read more »

Cluster Analysis of the NFL’s Top Wide Receivers

December 29, 2014

“The time has come to get deeply into football. It is the only thing we have left that ain't fixed.”
Hunter S. Thompson, Hey Rube Column, November 9, 2004

I have to confess that I haven’t been following the NFL this year as much as planned or hoped. On only 3 or 4 occasions this year have I been able to...

Read more »

OpenCPU release 1.4.6: gzip and systemd

December 29, 2014

OpenCPU server version 1.4.6 has been released to launchpad, OBS, and dockerhub (more about docker in a future blog post). I also updated the instructions to install the server or build from source for rpm or deb. If you have a running deployme...

Read more »

top posts for 2014

December 29, 2014

Here are the most popular entries for 2014:
17 equations that changed the World (#2) 995
Le Monde puzzle 992
“simply start over and build something better” 991
accelerating MCMC via parallel predictive prefetching 990
Bayesian p-values 960
posterior predictive p-values 849
Bayesian Data Analysis 846
Bayesian programming 834
Feller’s shoes

Read more »

WrightMap and TAM – Example continued…

December 29, 2014

As a follow-up to the previous post about integrating the TAM and WrightMap packages, we received a message from one of the TAM developers, Alexander Robitzsch, suggesting that it is possible to generate the Wright Map directly from the MML estimated distribution (instead of using the WLE estimates used in the previous post). Let’s start with the same setup: library(TAM) library(WrightMap) data( sim.rasch...

Read more »

First Day of the Month, Using R

December 29, 2014

Future-proofing is an important concept when designing automated reports. One thing that can get out of hand over time is when you accumulate so many periods of data that your charts start to look overcrowded. You can solve for this by limiting the num...
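
As a rough illustration of the task named in the post's title, here is a minimal base R sketch for snapping a date to the first day of its month. The excerpt is cut off before showing the author's own approach, so both helper names (first_of_month, recent_months) are hypothetical, and using the result to keep only the most recent periods is an assumption about where the post is heading.

```r
# Hypothetical helper (not from the post): snap a date to the first
# day of its month by rewriting the day component as "01"
first_of_month <- function(d) {
  as.Date(format(as.Date(d), "%Y-%m-01"))
}

# Hypothetical helper: first-of-month dates for the n most recent
# periods ending in the month of d, in chronological order
recent_months <- function(d, n) {
  rev(seq(first_of_month(d), by = "-1 month", length.out = n))
}

first_of_month("2014-12-29")
recent_months("2014-12-29", 3)
```

A report could then filter its data to rows dated on or after the earliest value returned by recent_months(), so the chart window moves forward automatically as new periods arrive.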

Read more »

Multivariate Medians

December 29, 2014

I'll bet that in the very first "descriptive statistics" course you ever took, you learned about measures of "central tendency" for samples or populations, and these measures included the median. You no doubt learned that one useful feature of the median is that, unlike the (arithmetic, geometric, harmonic) mean, it is relatively "robust" to outliers in the data. (You...
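
The robustness claim for the ordinary (univariate) median is easy to check numerically. The sketch below uses made-up data and base R only; it illustrates the arithmetic mean versus the median and does not attempt the multivariate case the post goes on to discuss.

```r
# Made-up sample, for illustration only
x <- c(2, 3, 4, 5, 6)

# The same sample with one gross outlier appended
x_out <- c(x, 1000)

mean(x)        # 4
median(x)      # 4

mean(x_out)    # 170: dragged far from the bulk of the data
median(x_out)  # 4.5: barely moves
```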

Read more »

R wins a 2014 Bossie Award

December 29, 2014

I missed this when it was announced back on September 29, but R won a 2014 Bossie Award for best open-source big-data tools from InfoWorld (see entry number 5): A specialized computer language for statistical analysis, R continues to evolve to meet new challenges. Since displacing lisp-stat in the early 2000s, R is the de-facto statistical processing language, with...

Read more »