Podcast #5: Coursera Debrief

November 19, 2012

Jeff and I talk with Brian Caffo about teaching MOOCs on Coursera.

Read more »

Gathering RealClearPolitics Polling Trends with XML

November 19, 2012

Now that the election is over, you may want to use polling data in a model of the campaign. Simon Jackman has thoughtfully made his daily state-by-state predictions available for download, but a commonly-used dataset is the RealClearPolitics polling a...

Read more »

The estimation of Value at Risk and Expected Shortfall

November 19, 2012

An introduction to estimating Value at Risk and Expected Shortfall, and some hints for doing it with R. Previously: “The basics of Value at Risk and Expected Shortfall” provides an introduction to the subject. Starting ingredients: Value at Risk (VaR) and Expected Shortfall (ES) are always about a portfolio. There are two basic ingredients that … Continue reading...

Read more »

The Heteroskedastic Probit Model

November 19, 2012

Specification testing is an important part of econometric practice. However, from what I can see, few researchers perform heteroskedasticity tests after estimating probit/logit models. This is not a trivial point. Heteroskedasticity in these models can represent a major violation of the probit/logit specification, both of which assume homoskedastic errors. Thankfully, tests for heteroskedasticity in these

Read more »

Italian bioR Day at PTP

November 19, 2012

On the 30th of November 2012, Parco Tecnologico Padano (PTP), Lodi, will host the event "Italian BioR Day". Italian BioR Day, promoted by Parco Tecnologico Padano (PTP) and Quantide srl, is linked to the events organized by MilanoR. It will … Continue reading →

Read more »

Momentum in R: Part 3

November 18, 2012

In the previous post, I demonstrated simple backtests for trading a number of assets ranked by their 3-, 6-, 9-, or 12-month simple returns (i.e., the lookback periods). While it was not an exhaustive backtest, the results showed that when trading the top 8 ranked assets, the ranking based 3, 6, 9, and 12 … Continue reading...

Read more »

Genome annotation with NCBI2R

November 18, 2012

It's very convenient to manage data with R: you can import your dataset, find many packages that meet your needs, and then plot your results. However, it can be very bothersome to retrieve the data from online databases. … Continue reading →

Read more »

R and SQLite: Part 1

November 18, 2012

Creating SQLite databases from R

Welcome to Simply Statistics 2.0

November 18, 2012

Welcome to the re-designed, re-hosted and re-platformed Simply Statistics blog. We have moved the blog over to the WordPress platform to give us some newer features that were lacking over at tumblr. So far the transition has gone okay but … Continue reading →

Read more »

Interactive Scenarios With Shiny – The Race to the F1 2012 Drivers’ Championship

November 18, 2012

In Paths to the F1 2012 Championship Based on How They Might Finish in the US Grand Prix I posted a quick hack to calculate the finishing positions that would determine the F1 2012 Drivers’ Championship in today’s United States Grand Prix, leaving a tease dangling around the possibility of working out what combinations would

Read more »

Secret Santa – again

November 18, 2012

Based on comments by cellocgw I decided to look at last week's Secret Santa again. This time, the moment a person, whoever that is, draws his/her own name, the drawing starts again at the first person. Introduction: A group of n persons draws sequentially...

Read more »

Sunday Data/Statistics Link Roundup (11/18/12)

November 18, 2012

An interview with Brad Efron about scientific writing. I haven’t watched the whole interview, but I do know that Efron is one of my favorite writers among statisticians. Slidify, another approach for making HTML5 slides directly from R.  I love … Continue reading →

Read more »

The new definitive guide for setting up Eclipse, StatET, and R on Windows 7

November 17, 2012

Quite a while back I wrote some tutorials on getting the StatET plugin for Eclipse running, so that you can write R code and run it within the Eclipse development environment. The developers of all of these pieces of software have kept marching on with...

Read more »

Datacentric product development and the rebirth of engineering

November 17, 2012

An old irony in New York is the ubiquity of the ‘gourmet deli’. It is hard to find a deli … Continue reading »

Read more »

More sense of random effects

November 17, 2012

I can’t remember exactly how I arrived at Making sense of random effects, a good post in the Distributed Ecology blog (go over there and read it). Incidentally, my working theory is that I follow Scott Chamberlain (@recology_), who follows … Continue reading →

Read more »

Get the exit polls from CNN using R and Python

November 17, 2012

Yesterday I posted an example of plotting 2012 U.S. presidential exit poll results using ggplot2. There I took for granted that a data.frame containing all we need resides in a file called "PresExitPolls2012.Rdata". Today I want to show how I scraped t...

Read more »

Visualizing Missing Data

November 17, 2012

There are several graphics available for visualizing missing data including the VIM package. However, I wanted a plot specifically for looking at the nature of missingness across variables and a clustering variable of interest to support data preparati...

Read more »

Using R — Packaging a C library in 15 minutes

November 16, 2012

This entry is part 14 of 12 in the series Using R. Yes, this post condenses 50+ hours of learning into a 15-minute tutorial. Read ‘em and weep. (That is, you read while I weep.) OK. For the last week … read more ...

Read more »

RcppArmadillo 0.3.4.4

November 16, 2012

A minor bug-fix release 3.4.4 of Armadillo came out upstream a few days ago. RcppArmadillo, our wrapper for R and Armadillo, is now on CRAN with its corresponding version 0.3.4.4. No R level or interface changes were made and the upstream changes are ...

Read more »

The Race to the F1 2012 Drivers’ Championship – Initial Sketches

November 16, 2012

In part inspired by the chart described in The electoral map sans the map, I thought I’d start mulling over a quick sketch showing the race to the 2012 Formula One Drivers’ Championship. The chart needs to show tension somehow, so in this first really quick and simple rough sketch, you really do have to

Read more »

Parallelized Back Testing

November 16, 2012

As mentioned earlier, currently I am playing with trading strategies based on Support Vector Machines. At a high level, the approach is quite similar to what I have implemented for my ARMA+GARCH strategy. Briefly, the simulation goes as follows: we step through the series one period (day, week, etc) at a time. For each period,

Read more »

Making sense of random effects

November 16, 2012

The other night in my office I got into a discussion with my office mate, the brilliant scientist / amazing skier Dr. Thor Veen about how to understand the random effect variance term in a mixed-effects model. Thor teaches the R statistics course here at UBC, and last night a student came to the office...

Read more »

VIDEO: Looking to the regression coefficients in R

November 16, 2012

(This article was first published on NIR-Quimiometria, and kindly contributed to R-bloggers)

Read more »

Which programming language is the most concise?

November 16, 2012

An expressive programming language allows developers to implement algorithms quickly, by using high-level concepts and leaving the details to the language implementation. The result is clearer, more maintainable code that can be created in less time. (Although shorter code isn't always better, especially when taken to extremes.) So which programming languages use the least code, when compared on an...

Read more »

Simulating Sudden Oak Death Dynamics

November 16, 2012

I am working on a project with the Rizzo Lab examining the dynamics of Sudden Oak Death (SOD). I really have to write more about this, but today I’m just going to post the results of an initial exercise. Here I attempt to replicate model results from Cobb et al. (2012). The model in that paper simulates...

Read more »

Revolution Newsletter: November 2012

November 16, 2012

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full November edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Now Available: Revolution R Enterprise 6.1 The latest release of Revolution Analytics' enterprise-ready data...

Read more »

Excel + Cytoscape + R = ExCytR

November 16, 2012

My new project is coming along nicely and should be released in early 2013. It builds on the structures developed in imDEV to link Excel, Cytoscape and R using RExcel, RCytoscape, and CytoscapeRPC. This trio can be used to rapidly generate beautiful and informative network representations of data. Here is an example of an undirected Gaussian graphical

Read more »

Logo Contest Winner

November 16, 2012

Congratulations to Bradley Saul, the winner of the Simply Statistics Logo contest! We had some great entries which made it difficult to choose between them. You can see the new logo to the right of our home page or the … Continue reading →

Read more »
