Call for participation: DMApps 2013 – an International Workshop on Data Mining Applications in Industry and Government

March 10, 2013

Call for participation: DMApps 2013, an International Workshop on Data Mining Applications in Industry and Government, held in conjunction with PAKDD 2013, Gold Coast, Australia, April 14, 2013 (http://dmapps2013.rdatamining.com). To attend the workshop, you need to register for PAKDD 2013 …

Notes on my R / Git workflow

March 10, 2013

These are some notes on my current R / Git workflow, which is quite fluid, and Git has enough quirks that I usually forget part of it! Creating Projects: I've used both RStudio and Eclipse. RStudio seems easier to create a 'project' and add a loca...

Calculating Custom Fantasy Football Projections for Your League using R

March 9, 2013

In prior posts, I have shown how to download fantasy football projections from ESPN, CBS, and NFL.com.  In this post, I will demonstrate how to take the projected points from these sources and calculate the projected points for your custom league ...

Getting flexible with SAP HANA

Most of you might not be aware of a feature introduced in SAP HANA SPS5. This new feature is called "Flexible Tables", which means that you can define a table that will grow depending on your needs. Let's see an example... You define a table with ID, NA...

MCMSki IV, Jan. 6-8, 2014, Chamonix (news #4)

March 9, 2013

More news about MCMSki IV! Remember, the call is still open for contributed sessions for a few more weeks, until March 20 to be precise (make sure to contact me at [email protected] if you are considering putting one session together). To all those who already submitted a session, thanks a lot, please stay tuned, and

Analyzing Monthly Expenses with a Pareto Chart

March 9, 2013

This month, ASQ CEO Paul Borawski encourages us to share stories about “quality solutions in unexpected places.” This is such a fun question, because now I’ll be noticing these unexpected gems all

Downloading NFL.com Fantasy Football Projections using R

March 9, 2013

In this post, I will show how to download NFL.com fantasy football projections using R. The R Script: The R Script for downloading fantasy football projections from NFL.com is located at: https://github.com/dadrivr/FantasyFootballAnalyticsR...

The Gambling Machine Puzzle

March 9, 2013

This puzzle came up in the New York Times Number Play blog. It goes like this: An entrepreneur has devised a gambling machine that chooses two independent random variables x and y that are uniformly and independently distributed between 0 and 100. He plans to tell any customer the value of x and to ask him

GSOC 2013: IID Assumptions in Performance Measurement

March 9, 2013

Google Summer of Code for 2013 has been announced and organizations such as R are beginning to assemble ideas for student projects this summer. If you’re an interested student, there’s a list of project proposals on the R wiki. If you’re considering being a mentor, post a project idea on the site soon – project

Visualizing Risky Words — Part 2

March 9, 2013

This is a follow-up to my Visualizing Risky Words post. You’ll need to read that for context if you’re just jumping in now. Full R code for the generated images (which are pretty large) is at the end. Aesthetics are the primary reason for using a word cloud, though one can pretty quickly recognize what

Analyzing SimplyStatistics visits info

March 9, 2013

Recently we had to analyze the data of the number of visits per day to SimplyStatistics.org. There were two goals: (1) estimate the fraction of visitors retained after a spike in the number of visitors; (2) identify any factors that influence the fraction estimated in (1). For me it was a fun project in part because I like SimplyStatistics but also...

A bit more on sample size

March 8, 2013

In our article What is a large enough random sample? we pointed out that if you wanted to measure a proportion to an accuracy “a” with chance of being wrong of “d” then an idea was to guarantee you had a sample size of at least: [formula omitted]. This is the central question in designing opinion polls.

R vs. Perl/mySQL – an applied genomics showdown

March 8, 2013

Recently I was given an assignment for a class I'm taking that got me thinking about speed in R. This isn't something I'm usually concerned with, but the first time I tried to run my solution (using plyr's ddply()), it was going to take all night to compute. I consulted the professor that taught...

Quandl package released to CRAN

March 8, 2013

In a guest post here on February 20, Tammer Kamel introduced us to Quandl, a kind of "wikipedia" of time series data. In the post, Tammer (the founder of Quandl) noted that they were working on an R package to give R users access to Quandl as a data source. That package is now available. It includes the Quandl...

Comparing quantiles for two samples

March 8, 2013

Recently, for a research paper, I had some samples, and I wanted to compare them. Not to compare their means (by construction, all of them were centered) but their dispersion. And not their variance, but rather their quantiles. Consider the following boxplot type function, where everything here is quantile related (which is not the case for standard boxplot, see http://freakonometrics.hypotheses.org/4138,...

Data Visualization: Shiny Democratization

March 8, 2013

In organizing Data Visualization DC we focus on three themes: The Message, The Process, The Psychology. In other words, ideas and examples of what can be communicated, the tools and know-how to get it done, and how best to communicate. … The post Data Visualization: Shiny Democratization appeared first on Data Community DC.

Publishing Stats for Analytic Reuse – FAOStat Website and R Package

March 8, 2013

How can stats and data publishers, from NGOs and (inter)national statistics agencies to scientific researchers, publish their data in a way that supports its analysis directly, as well as in combination with other datasets? Here’s one approach I learned about from Michael Kao of the UN Food and Agriculture Organisation statistics division, FAOStat. At first

Cool GSS training video! And cumulative file 1972-2012!

March 8, 2013

Felipe Osorio made the above video to help people use the General Social Survey and R to answer research questions in social science. Go for it! Meanwhile, Tom Smith reports: The initial release of the General Social Survey (GSS), cumulative file for 1972-2012 is now on our website. Codebooks and copies of questionnaires will be The post Cool...

Visualizing rOpenSci collaboration

March 8, 2013

We (rOpenSci) have been writing code for R packages for a couple of years, so it is time to take a look back at the data. What data, you ask? The commits data from GitHub ~ data that records who did what and when. Using the GitHub commits API we can gather data on who committed code to a...

From OpenOffice noob to control freak: A love story with R, LaTeX and knitr

March 8, 2013

Lately I had to write a seminar paper for a class and I decided to overdo it. But let's start at the very beginning. Here is my evolution of how I used to write stuff and how I got from this: to that: School: OpenOffice - I guess everyone has some...

ddply in action

March 7, 2013

Top Batting Averages Over Time. Reference: http://www.baseball-databank.org/ Short: I'm going to use plyr and ggplot2 to look at how top batting averages have changed over time. First load the data:

    options(width = 100)
    library(ggplot2)
    ## Warning message: package 'ggplot2' was built under R version 2.14.2
    library(plyr)
    data(baseball)
    head(baseball)
    ## ...

geom_point Legend with Custom Colors in ggplot

March 7, 2013

Previously, I showed how to make line segments using ggplot. Working from that previous example, there are only a few things we need to change to add custom colors to our plot and legend in ggplot. First, we'll add the colors of our choice. I'll do th...

ggplot ggoldy

March 7, 2013

One of my graduate students worked some ggplot magic and created an almost Lite-Brite-esque plot of our very own Goldy Gopher. She also, thoughtfully, published a tutorial on her blog. Read and enjoy!

Downloading CBS Fantasy Football Projections in R

March 7, 2013

In this post, I will show how to download CBS fantasy football projections using R. The R Script: The R Script for downloading fantasy football projections from CBS is located ... The post Downloading CBS Fantasy Football Projections in R appeared first on Fantasy Football Analytics.

Veterinary Epidemiologic Research: Linear Regression Part 2 – Checking assumptions

March 6, 2013

We continue with the linear regression chapter of the book Veterinary Epidemiologic Research. Using the same data as the last post and running example 14.12: Now we can create some plots to assess the major assumptions of linear regression. First, let’s have a look at homoscedasticity, or constant variance of residuals. You can run a statistical test, the

Stan 1.2.0 and RStan 1.2.0

March 6, 2013

Stan 1.2.0 and RStan 1.2.0 are now available for download. See: http://mc-stan.org/ Here are the highlights. Full Mass Matrix Estimation during Warmup: Yuanjun Gao, a first-year grad student here at Columbia (!), built a regularized mass-matrix estimator. This helps for posteriors with high correlation among parameters and varying scales. We’re still testing this ourselves, so The post Stan...

Let’s Do Some Hierarchical Bayes Choice Modeling in R!

March 6, 2013

It can be difficult to work your way through hierarchical Bayes choice modeling. There is just too much that is new to learn. If nothing else, one gets lost in all the ways that choice data can be collected and analyzed. Then there is all this ou...

Lambda.r 1.1.1 released (and introducing the EMPTY keyword)

March 6, 2013

I’m pleased to announce that lambda.r 1.1.1 is now available on CRAN. This release is mostly a bug fix release, …

A volatility filter using historical vol

March 6, 2013

We have been looking at a way to improve risk-adjusted returns by using a volatility filter. Although we could use VIX or equivalent, it turns out that historical volatility will work just as well, if not a little better. You can see part 1 here: Digging into the VIX, and part 2 here: What can we use...
