## ABC model choice by random forests [guest post]

August 10, 2014
This paper proposes a new approach to likelihood-free model choice based on random forest classifiers. These are fit to

This post is the first in a series that looks at the inferences you can make using social media data.

## Fracking in your neighborhood? Shale Oil and Gas Economic Impact Map

August 10, 2014
I have been working on a visualisation of the results from my paper Fracking Growth, to highlight the local economic impact of the recent oil and gas boom in the US. There are two key insights. First, there are strong spillover effects from the oil and gas sector. I estimate that every oil and

## To cooperate or defect (besides coding): Prisoner's dilemma, a game theory example in R

August 10, 2014
Hello Computer Science and/or R enthusiasts. This week I had the opportunity to try something that had been on my to-do list for a while. The idea came almost instantly after reading Dr. Richard Dawkins' book, The Selfish Gene (which was a birthday gift, thanks Andy). I felt obliged to program my own implementation of the prisoner's dilemma and...

## Plotting SEMs in R using semPlot

August 10, 2014
This is a short post presenting the great package semPlot, developed by Sacha Epskamp (check out his nice website: http://sachaepskamp.com/), for making nice plots from your SEMs. SEMs are a modelling tool that allows the researcher to investigate complex relationships between variables; you can find many links to free tutorials here: http://www.structuralequations.org/. Here I

## Guns are cool – Regions

August 10, 2014
This was supposed to be a post in which General Social Surveys (GSS) data were used to understand a bit more about the causes of differences between states. It was thus meant to give additional insight beyond my previous post, Guns are Cool - Differenc...

## An Idiot Learns Bayesian Analysis: Part 3

August 9, 2014
A week or so ago, the grand Magus over at lamages.blogspot.com/ published a great, quick thought exercise taken from Daniel Kahneman’s book Thinking, Fast and Slow. Here are the particulars of the problem: you’re in a community with two different color vehicles; 85% are green and 15% are blue. A vehicle was involved in a

## Book Review: Bioinformatics with R Cookbook

August 9, 2014
Bioinformatics with R Cookbook is a 340-page book published by Packt Publishing last June. The book is intended for individuals working in the areas of biology and genetics. Most of the techniques and types of analysis (i.e. sequence, protein structure,...

## Visualizing Geo-Referenced Data With R

August 9, 2014

## Commitments of Traders: Moves in the Last Week

August 9, 2014
In my previous post I gave some background information on the Commitments of Traders report along with a selection of summary plots. One of the more interesting pieces of information that one can glean from these reports is the shift in trading sentiment from week to week. Below is a plot reflecting the relative change

## Bioinformatics with R Cookbook – Book Review

I just finished reading Bioinformatics with R Cookbook... so here's my review :-) I was excited to read this book, because it's been a while since I read any R book... but... I gotta admit... this is not my kind of book... as I discovered that obviously...

## PubChem 446220 = Yeyo

August 8, 2014
I just updated my R package, CTSgetR, for biological database translation using the Chemical Translation Service (CTS). While making code examples I came across some humorous chemical name synonyms for the molecule referenced in PubChem as CID = 446220. Below are a few examples; can you guess what this is? Badrock, Bazooka, Bernice, Bernies, Blast, Blizzard, Bouncing Powder, Bump, Burese,

## The Open Source R Programming Language is Becoming Pervasive

August 8, 2014
So says CIO.com, in a recent article, 11 Market Trends in Advanced Analytics. R, an open source programming language for computational statistics, visualization and data, is becoming a ubiquitous tool in advanced analytics offerings. Kirsch says nearly every top vendor of advanced analytics has integrated R into their offering so that they can now import R models. This...

## Community conversations and a new package for full text

August 8, 2014
Community is at the heart of rOpenSci. We couldn't have accomplished most of our work without help from various contributors and users. Most of our discussions with the broader community over the past year have been through Twitter or one-on-one conversations. However, we would like to foster more open-ended and deeper discussions with our community. To this end,...

## San Leandro and Hayward Housing Prices

I've done a previous post about the salaries of data scientists, but now I'm going to look at one of the negative sides of all the high salaries generated by the tech field in the Bay Area – real estate prices.

## Vtreat: designing a package for variable treatment

August 7, 2014
When you apply machine learning algorithms on a regular basis, on a wide variety of data sets, you find that certain data issues come up again and again: missing values (NA or blanks); problematic numerical values (Inf, NaN, sentinel values like 999999999 or -1); valid categorical levels that don’t appear in the training data (especially

## A Simple Shiny App for Monitoring Trading Strategies – Part II

August 7, 2014
This is a follow-up to my previous post “A Simple Shiny App for Monitoring Trading Strategies”. I added a few improvements that make the app a bit better (at least for me!). Below is the list of new features: a sample .csv file (the one that contains the raw data); an “EndDate” drop

## Incidental R

August 7, 2014
by Joseph Rickert

Last week, I posted a list of sessions at the Joint Statistical Meetings related to R. As it turned out, that list was only the tip of the iceberg. In some areas of statistics, such as graphics, simulation and computational statistics, the use of R is so prevalent that people working in the field often don't...

## Rcpp now used by 250 CRAN packages

August 7, 2014
Rcpp reached a nice round milestone yesterday: 250 packages on CRAN now depend on it ...

## why clusterProfiler fails

August 6, 2014
Recently, some comments have said that clusterProfiler sometimes fails in KEGG enrichment analysis. kaji331 compared clusterProfiler with GeneAnswers and found that clusterProfiler gives larger p values. This result prompted me to test it.

## John Chambers useR! 2014 Keynote

August 6, 2014
At useR! 2014, John Chambers was generous enough to provide us with insight into the...

## Making Maps with a Punchline

August 6, 2014
I’ve had a lifelong fascination with maps, and working with R definitely enables my map...

## Data science goes to college with DataFest

August 6, 2014
Below is the first of several exciting data science developments for the younger generation, happening...

## A New Use for Pipes in R: Forkbombs

August 6, 2014
Almost 3 years ago, I wrote about how to forkbomb with R. A quick recap is that a forkbomb is a low-tier, malicious misuse of a system; sort of a "baby's first denial of service". The idea is that you write a program that will start an entirely new copy of itself each time it is executed. Executing it...

## In case you missed it: July 2014 Roundup

August 6, 2014
In case you missed them, here are some articles from July of particular interest to R users: The deadline for our contest to visualize the location of R user groups has been extended to August 16. Previews of R-related sessions at this year's JSM conference in Boston. Coding errors in R graphics scripts serendipitously create some interesting art.

## Predicting Monthly Car Sales for Brands in US: First Step

August 6, 2014
I've set out to produce monthly forecasts of car sales by brand in the US. So far I've made a SUTSE dynamic linear model (code on GitHub) and created a Shiny app (http://sweiss.shinyapps.io/carvis/) as a prototype (no predictions...

## NCEAS Codefest

August 6, 2014
We're delighted to be sponsoring the upcoming Open Science Codefest in Santa Barbara, California, alongside RENCI, NCEAS, NSF, DataONE, and Mozilla Science Lab. The Open Science Codefest's goal is to gather researchers from across ecology, biodiversity science, and other earth and environmental sciences with programmer types to collaborate on coding projects. The ideas...

## Results of the Readers’ Survey

August 5, 2014
First of all, let me say “Thank You” to all of the 357 people who completed the survey. I was hoping for 100, so needless to say the response blew away my expectations. This endeavor seems like a worthwhile effort to do once a year. Next year I will refine the...