Fama/French Factors in 1 line of code

August 11, 2014

In the past, getting Fama/French factors from the Kenneth French dataset involved a convoluted procedure to download the zip file, unzip the file, clean the data, and convert to xts.  Now with Quandl, we can do it simply in one line of code.  Note: ...

A Conversation with Max Kuhn – The useR! 2014 Interview

August 11, 2014

The Interview: In the video above, Max provides some amazing insights into the why and...

Exploiting Heterogeneity to Reveal Consumer Preference: Data Matrix Factorization

August 11, 2014

We begin with a data matrix, a set of numbers arrayed so that each row contains information from a different consumer. Marketing research focuses on the consumer, but the columns are permitted more freedom, although they ought to tell us something abou...

Master R Developer Workshop – San Francisco, January 19-20

August 11, 2014

RStudio is planning a new Master R Developer Workshop to be taught by Hadley Wickham in the San Francisco Bay Area on January 19-20. This will be the same workshop that Hadley is teaching in September in New York City to a sold-out audience. If you did not get a chance to register for the

Opening Up Access to Data: Why APIs May Not Be Enough…

August 11, 2014

Last week, a post on the ONS (Office for National Statistics) Digital Publishing blog caught my eye: Introducing the New Improved ONS API, which apparently “makes things much easier to work with”. Ooh… exciting… maybe I can use this to start hacking together some notebooks? :-) It was followed a few days later by this one

A John Ehlers oscillator — Cycle RSI(2)

August 11, 2014

Since I’ve hit a rut in trend following (how do you quantify rising/falling/flat? What even defines those three terms in …

A Conversation with Hadley Wickham – the useR! 2014 interview

August 11, 2014

Hadley Wickham is famous. He’s not Kardashian famous, but walking around useR! and seeing the community’s reaction to him, there’s no question, he’s ‘R famous’. If you have the good fortune to see his talks, tutorials, or sessions in person, you owe it to yourself to do so.

EARL Conference, London, 15-17 September 2014

August 11, 2014

A final reminder that the EARL (Effective Applications of the R Language) Conference is only a month away; the conference will be held in London from 15-17 September. EARL is the first conference dedicated to the widespread and growing use of R commercially and features presentations from R gurus such as Hadley Wickham and R business experts from companies...

Announcing our ambassadors program

August 11, 2014

In the last 12 months we traveled all over the world delivering talks and hands-on workshops at various conferences and universities. This was a great opportunity for us to raise awareness for the project and get more of you involved as contributors and collaborators. As we scale the project to the next level, we need your help in...

Minimal reproducible examples

August 10, 2014

I occasionally get emails from people who think they have found a bug in one of my R packages, and I usually have to reply asking them to provide a minimal reproducible example (MRE). This post provides instructions on how to create an MRE. Bug reports on GitHub, not email: First, if you think there is a bug,...

ABC model choice by random forests [guest post]

August 10, 2014

This paper proposes a new approach to likelihood-free model choice based on random forest classifiers. These are fit to

What your twitter friends say about you and your interests

August 10, 2014

This post is the first in a series that looks at the inferences you can make using social media data. The …

Fracking in your neighborhood? Shale- Oil and Gas Economic Impact Map

August 10, 2014

I have been working on a visualisation of the results from my paper Fracking Growth, highlighting the local economic impact of the recent oil and gas boom in the US. There are two key insights. First, there are strong spillover effects from the oil and gas sector. I estimate that every oil and

To cooperate or defect (besides coding): Prisoner's dilemma, a game theory example in R

August 10, 2014

Hello Computer Science and/or R enthusiasts. This week I had the opportunity to try something that had been on my to-do list for a while. The idea came almost instantly after reading Dr. Richard Dawkins' book, The Selfish Gene (which was a birthday gift, thanks Andy). I felt obliged to program my own implementation of the prisoner's dilemma and...

Plotting SEMs in R using semPlot

August 10, 2014

This is a short post presenting the great package semPlot, developed by Sacha Epskamp (check out his nice website: http://sachaepskamp.com/), for making nice plots from your SEMs. SEMs are a modelling tool that allows the researcher to investigate complex relationships between variables; many links to free tutorials are available here: http://www.structuralequations.org/. Here I

Guns are cool – Regions

August 10, 2014

This was supposed to be a post in which General Social Survey (GSS) data were used to understand a bit more about the causation of differences between states. Thus it was to give more insight than my previous post, Guns are Cool - Differenc...

An Idiot Learns Bayesian Analysis: Part 3

August 9, 2014

A week or so ago, the grand Magus over at lamages.blogspot.com/ published a great, quick thought exercise taken from Daniel Kahneman’s book Thinking, Fast and Slow. Here are the particulars of the problem: you’re in a community with two different color vehicles; 85% are green and 15% are blue. A vehicle was involved in a

Book Review: Bioinformatics with R Cookbook

August 9, 2014

(This article was first published on Data Analysis and Visualization in R, and kindly contributed to R-bloggers) The book Bioinformatics with R Cookbook is a 340-page book published by Packt Publishing last June. The book is intended for individuals working in the areas of biology and genetics. Most of the techniques and types of analysis (i.e. sequence, protein structure,...

Visualizing Geo-Referenced Data With R

August 9, 2014

(This article was first published on Turning numbers into stories, and kindly contributed to R-bloggers)

Commitments of Traders: Moves in the Last Week

August 9, 2014

In my previous post I gave some background information on the Commitments of Traders report along with a selection of summary plots. One of the more interesting pieces of information that one can glean from these reports is the shift in trading sentiment from week to week. Below is a plot reflecting the relative change

Bioinformatics with R Cookbook – Book Review

I just finished reading Bioinformatics with R Cookbook... so here's my review :-) I was excited to read this book, because it's been a while since I read any R book... but... I gotta admit... this is not my kind of book... as I discovered that obviously...

PubChem 446220 = Yeyo

August 8, 2014

I just updated my R package, CTSgetR, for biological database translation using the Chemical Translation Service (CTS). While making code examples I came across some humorous chemical name synonyms for the molecule referenced in PubChem as CID = 446220. Below are a few examples; can you guess what this is? Badrock, Bazooka, Bernice, Bernies, Blast, Blizzard, Bouncing Powder, Bump, Burese,

The Open Source R Programming Language is Becoming Pervasive

August 8, 2014

So says CIO.com, in a recent article, 11 Market Trends in Advanced Analytics. R, an open source programming language for computational statistics, visualization and data, is becoming a ubiquitous tool in advanced analytics offerings. Kirsch says nearly every top vendor of advanced analytics has integrated R into their offering so that they can now import R models. This...

Community conversations and a new package for full text

August 8, 2014

Community: Community is at the heart of rOpenSci. We couldn't have accomplished most of our work without help from various contributors and users. Most of our discussions with the broader community over the past year have been through Twitter or one-on-one conversations. However, we would like to foster more open-ended and deeper discussions with our community. To this end,...

San Leandro and Hayward Housing Prices

I’ve done a previous post about the salaries of data scientists, but now I’m going to look at one of the negative sides of all the high salaries generated by the tech field in the Bay Area – real estate prices. A …

Vtreat: designing a package for variable treatment

August 7, 2014

When you apply machine learning algorithms on a regular basis, on a wide variety of data sets, you find that certain data issues come up again and again: missing values (NA or blanks); problematic numerical values (Inf, NaN, sentinel values like 999999999 or -1); valid categorical levels that don’t appear in the training data (especially ...

A Simple Shiny App for Monitoring Trading Strategies – Part II

August 7, 2014

This is a follow-up to my previous post, “A Simple Shiny App for Monitoring Trading Strategies”. I added a few improvements that make the app a bit better (at least for me!). Below is the list of new features: a sample .csv file (the one that contains the raw data); an “EndDate” drop

Incidental R

August 7, 2014

by Joseph Rickert. Last week, I posted a list of sessions at the Joint Statistical Meetings related to R. As it turned out, that list was only the tip of the iceberg. In some areas of statistics, such as graphics, simulation and computational statistics, the use of R is so prevalent that people working in the field often don't...

Rcpp now used by 250 CRAN packages

August 7, 2014

Rcpp reached a nice round milestone yesterday: 250 packages on CRAN now depend on it ...
