Lively R

June 27, 2014

Next week, the UseR conference comes to UCLA.  And in anticipation, I thought a little foreshadowing would be nice.  Amelia McNamara, UCLA Stats grad student and rising stats ed star, shared with me a new tool that has the potential …

Scraping Fantasy Football Projections from the Web

June 27, 2014

In this post, I show how to download fantasy football projections from the web using R. In prior posts, I showed how to scrape projections from ESPN, CBS, NFL.com, and … The post Scraping Fantasy Football Projections from the Web appeared first on Fantasy Football Analytics.

FRAMA Part II: Replicating A Simple Strategy

June 27, 2014

This post will begin the investigation into FRAMA strategies, with the aim of ultimately finding a FRAMA trading strategy with …

Multivariate Data Analysis and Visualization Through Network Mapping

June 27, 2014

Recently I had the pleasure of speaking about one of my favorite topics, Network Mapping. This is a continuation of a general theme I’ve previously discussed and involves the merger of statistical and multivariate data analysis results with a network. Over the past year I’ve been working on two major tools, DeviumWeb and MetaMapR, which

Squeezing more speed from R for nothing, Rcpp style

In a previous post we explored how you can greatly speed up certain types of long-running computations in R by parallelizing your code using the multicore package*. I also mentioned that there were a few other ways to speed up R code; the one I will be exploring in this post is using Rcpp to replace …

Comment of the week

June 27, 2014

This one, from DominikM: Really great, the simple random intercept – random slope mixed model I did yesterday now runs at least an order of magnitude faster after installing RStan 2.3 this morning. You are doing an awesome job, thanks a lot! The post Comment of the week appeared first on Statistical Modeling, Causal Inference, and...

How data-driven companies use R to compete

June 27, 2014

The editors at DataInformed invited me to write an article about how R is used in business, and I was pleased to oblige. The article, How Companies use R to Compete in a Data-Driven World, is now live and describes how Facebook, The New York Times, X+1, ANZ Bank and many others successfully use R to analyze their data....

The future of R on the web at #user2014

June 27, 2014

The schedule and abstracts for useR! 2014 have been posted on the conference website. Session 2 (Tuesday, 1pm) of the Kaleidoscope track will feature a fantastic set of talks about R and the web, including RCloud (Gordon Woodhull, AT&T), Ope...

Bayesian First Aid: Test of Proportions

June 26, 2014

Does pill A or pill B save the most lives? Which web design results in the most clicks? Which in vitro fertilization technique results in the largest number of happy babies? A lot of questions out there involve estimating the proportion or relative frequency of success of two or more groups (where success could be a saved life, a...

(Py, R, Cmd) Stan 2.3 Released

June 26, 2014

We’re happy to announce RStan, PyStan and CmdStan 2.3. Instructions on how to install at: http://mc-stan.org/ As always, let us know if you’re having problems or have comments or suggestions. We’re hoping to roll out the next release a bit quicker this time, because we have lots of good new features that are almost ready …

Updates to R package raincpc: Global Daily Rainfall for over 35 years

June 26, 2014

The Climate Prediction Center's (CPC) global rainfall data, 1979 - present, 50 km resolution, is one of the few high-quality, long-term, observation-based, daily rainfall products available for free. Although raw data is available at ...

Review of Applied Predictive Modeling by Kuhn and Johnson

June 26, 2014

by Joseph Rickert Predictive Modeling or “Predictive Analytics”, the term that appears to be gaining traction in the business world, is driving the new “Big Data” information economy. Predictably, there is no shortage of material to be found on this subject. Some discussion of predictive modeling is sure to be found in any reasonably technical presentation of business decision...

Maybe I Don’t Really Know R After All

June 26, 2014

Lately, I’ve been feeling that I’m spreading myself too thin in terms of programming languages. At work, I spend most of my time in Hive/SQL, with the occasional Python for my smaller data. I really prefer Julia, but I’m alone at work on that one. And since I maintain a package on CRAN (RSiteCatalyst), I frequently spend …

Jun 26-27, 2014 – Introduction to Data Science with R in NYC

June 26, 2014

You can either register from eventbrite or our school site NYC Data Science Academy. Date: Thursday/Friday, June 26th and 27th, 2014 Time: 9:00am to 5:00pm Location: 500 7th Ave, 17th Floor, glass door classroom, New York, NY 10018 NYC Data Science Academy, training subbrand of SupStat (Official Training partner with RStudio Inc) is hosting our...

Tailoring univariate probability distributions

June 26, 2014

This post shows how to build a custom univariate distribution in R from scratch, so that you end up with the essential functions: a probability density function, cumulative distribution function, quantile function and random number generator. In the beginning all you need is an equation of the probability density function, …
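
As a rough illustration of the four functions named above (not the post's own code), here is a minimal sketch that starts from a density equation and derives the rest numerically. The density f(x) = 2x on [0, 1] and the names `dmy`, `pmy`, `qmy` and `rmy` are hypothetical choices made only for this example.

```r
# Hypothetical example density: f(x) = 2x on [0, 1], zero elsewhere.
dmy <- function(x) ifelse(x >= 0 & x <= 1, 2 * x, 0)

# Cumulative distribution function: numerically integrate the density.
pmy <- function(q) {
  sapply(q, function(qk) {
    if (qk <= 0) return(0)
    if (qk >= 1) return(1)
    integrate(dmy, lower = 0, upper = qk)$value
  })
}

# Quantile function: numerically invert the CDF on the support [0, 1].
qmy <- function(p) {
  sapply(p, function(pk) {
    uniroot(function(x) pmy(x) - pk, lower = 0, upper = 1, tol = 1e-9)$root
  })
}

# Random number generator: inverse transform sampling from uniform draws.
rmy <- function(n) qmy(runif(n))

set.seed(1)
draws <- rmy(5)
```

For this particular density the CDF is x^2 in closed form, so `pmy(0.5)` should be close to 0.25; the numerical route is shown because it carries over to densities without a closed-form CDF, at the cost of speed.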

Be Careful with Using Model Design in R

June 25, 2014

In R, useful functions for making design matrices are model.frame and model.matrix. I will discuss some of the differences in behavior across and within the two functions. I also have an example where I ran into this problem and it caused me to lose time. Using model.frame for a design matrix Whenever I
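
One well-known difference between the two functions, which may or may not be the one the post goes on to discuss, is how they treat factors. The sketch below uses a made-up data frame `df`: `model.frame` keeps a factor as a single column, while `model.matrix` adds an intercept column and expands the factor into dummy columns.

```r
# Made-up data for illustration only.
df <- data.frame(
  y = c(1, 2, 3, 4),
  g = factor(c("a", "b", "c", "a")),
  x = c(0.5, 1.5, 2.5, 3.5)
)

# model.frame returns a data frame of the variables in the formula;
# the factor g stays a single column.
mf <- model.frame(y ~ g + x, data = df)

# model.matrix returns a numeric design matrix: an intercept column,
# dummy columns for the non-reference levels of g, and x.
mm <- model.matrix(y ~ g + x, data = df)

names(mf)
colnames(mm)
```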

A Simple Shiny App for Monitoring Trading Strategies

June 25, 2014

In a previous post I showed how to use R, Knitr and LaTeX to build a template strategy report. This post goes a step further by making the analysis interactive. Besides the interactivity, the Shiny App also solves two problems: I can now access all my trading strategies from a single point regardless of the instrument traded.

Boolean 3 (finally) on CRAN

June 25, 2014

I have finally managed to get boolean3 accepted to CRAN. You can find it here: boolean3 on CRAN. To summarize: boolean3 provides a means of estimating partial-observability binary response models following boolean logic. boolean3 was developed by Jason W. Morgan under the …

What Would Cohen Have Titled “The Earth is Round (p < .05)” in 2014?

June 25, 2014

Bibliometrics is not my area of expertise, but it still interests me as a researcher. I sometimes think about how Google has impacted the way we title articles. Gone are the days of witty, snappy titles. Title …

R Scrabble: Part 2

June 25, 2014

Ivan Nazarov and Bartek Chroł left very interesting comments on my last post about counting the number of subwords in NGSL words. In particular, they proposed large speedups of my code, so I thought I would try a larger data set. Today I will work with TWL2006 - the official word authority for tournament Scrabble...

Interactive, web-ready ggplot2-style graphics with ggvis

June 25, 2014

Hadley Wickham's been working on the next-generation update to ggplot2 for a while, and now it's available on CRAN. The ggvis package is completely new, and combines a chaining syntax reminiscent of dplyr with the grammar of graphics concepts of ggplot2. The resulting charts are web-ready in scalable SVG format, and can easily be made interactive thanks to RStudio's...

Cleaning up oversized github repositories for R and beyond

June 25, 2014

The version control system Git is an amazing piece of software for tracking every change that you make to a project and saving its entire history. It is incredibly useful for users of R and other programming languages, leading it to shoot from 0 market share in 2005 (when it was first released) to market domination in one short decade. However, Git can cause confusion....

Identify Sleepers in Fantasy Football using Statistics and Wisdom of the Crowd

June 24, 2014

In this post, I demonstrate how to statistically identify sleepers in fantasy football using the wisdom of the crowd. The R Scripts The R Script for the “Wisdom of the … The post Identify Sleepers in Fantasy Football using Statistics and Wisdom of the Crowd appeared first on Fantasy Football Analytics.

Do you believe in World Cup superstition?

June 24, 2014

If you believe in supernatural causality, you will love what the numbers of the World Cup have to say about which team is going to win this Cup in Brazil. According to this numerology approach, neither Brazil, Germany, nor the Netherlands will be the winner, but Uruguay. The table below shows my reasoning. This was produced

ABC model choice by random forests

June 24, 2014

After more than a year of collaboration, meetings, simulations, delays, switches,  visits, more delays, more simulations, discussions, and a final marathon wrapping day last Friday, Jean-Michel Marin, Pierre Pudlo,  and I at last completed our latest collaboration on ABC, with the central arguments that (a) using random forests is a good tool for choosing the

Bedtools tutorial from 2013 CSHL course

June 24, 2014

A couple of months ago I posted about how to visualize exome coverage with bedtools and R. But if you're looking to get a basic handle on genome arithmetic, take a look at Aaron Quinlan's bedtools tutorials from the 2013 CSHL course. The tutorial uses ...

Open Source Playing with ggvis using rCharts, angular, uikit, ace-editor

June 24, 2014

Open source allows us to do amazing things.  Take me for example.  I am a below average coder, and I can hack together these incredibly powerful tools to do something like below. Some might naively say that rCharts and ggvis are competitors.  Howeve...

Statistics and R at the Intel ISEF Science Fair

June 24, 2014

by Wayne Smith, Ph.D. California State University, Northridge Editor's note: This post was abstracted from the monthly newsletter of the Southern California Chapter of the ASA. On May 13th and 14th, the Intel International Science and Engineering Fair (Intel ISEF), the world’s largest international pre-college competition, was held at the Los Angeles Convention Center. I was blessed with the...

Example 2014.7: Simulate logistic regression with an interaction

June 24, 2014

Reader Annisa Mike asked in a comment on an early post about power calculation for logistic regression with an interaction. This is a topic that has come up with increasing frequency in grant proposals and article submissions. We'll begin by showing how to simulate data with the interaction, and in our next post...
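
To give a flavor of what simulating such data can look like (a sketch under stated assumptions, not the post's actual example), the code below draws two binary predictors, builds a linear predictor that includes their product, and generates a binary outcome from it. The sample size, the coefficient values and all variable names are hypothetical.

```r
set.seed(42)
n <- 1000

# Two binary predictors, each equal to 1 with probability 0.5.
x1 <- rbinom(n, 1, 0.5)
x2 <- rbinom(n, 1, 0.5)

# Hypothetical coefficients on the log-odds scale.
b0 <- -1; b1 <- 0.5; b2 <- 0.5; b12 <- 1

# Linear predictor with the interaction term, then the binary outcome.
lp <- b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2
y <- rbinom(n, 1, plogis(lp))

# Fit the model that generated the data; x1 * x2 expands to
# both main effects plus the x1:x2 interaction.
fit <- glm(y ~ x1 * x2, family = binomial)
coef(fit)
```

Wrapping the simulate-and-fit steps in a function and repeating them many times, counting how often the interaction term is significant, is one common way to turn this into a power estimate.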
