R script to calculate QIC for Generalized Estimating Equation (GEE) Model Selection

March 23, 2012

Generalized Estimating Equations (GEE) can be used to analyze longitudinal count data; that is, repeated counts taken from the same subject or site. This is often referred to as repeated measures data, but longitudinal data often has more repeated obse...

Read more »

Gini Efficient Frontier

March 23, 2012

David Varadi has recently written two posts about the Gini coefficient: I Dream of Gini, and Mean-Gini Optimization. I want to show how to use the Gini risk measure to construct an efficient frontier and compare it with the alternative risk measures I discussed previously. I will use the Gini mean difference risk measure – the mean of the difference

Read more »

Serious stats – free statistics resources

March 23, 2012

The companion web site for Serious Stats is now live: http://www.palgrave.com/psychology/baguley/
The web site includes:
- a free sample chapter (Chapter 15: Contrasts)
- data sets
- R scripts
- 5 online supplements (for meta-analysis, multiple imputation, r...

Read more »

Serious stats companion web site now live: sample chapter, data and R scripts

March 23, 2012

The companion web site for Serious stats is now live: http://www.palgrave.com/psychology/Baguley/ It includes a sample chapter (Chapter 15: Contrasts), data sets, R scripts for all the examples and supplementary material. Filed under: news, R code, ser...

Read more »

Dissimilarity Between Soil Profiles: A Closer Look

March 23, 2012

Continuing the previous discussion of pair-wise dissimilarity between soil profiles, the following demonstration (code, comments, and figures) further elaborates on the method. A more in-depth discussion of this example will be included as a vignette w...

Read more »

Launching iButton Thermochrons with the help of R

March 23, 2012

Maxim's iButton Thermochron temperature dataloggers are little silver doo-dads the size of a large watch battery that can record up to 2048 time-stamped temperature values. The internal battery is usually good for a few years of use. Maxim supplies a J...

Read more »

R in Google Summer of Code 2012

March 23, 2012

This post is a slightly revised (and "blogified") version of the message Brian Peterson has sent to various R mailing lists. Once again, R has been accepted as a mentoring organization for the Google Summer of Code (2012). We invite students interested in this program to learn more about it. A good starting point...

Read more »

RStudio Development Environment

March 23, 2012

Compared to many other languages of equal popularity, there are relatively few development environments for R. In fact, the total number of production-ready R IDEs could probably be counted on one hand. That deficiency is a small price to pay to use R and if you’re not already accustomed to using IDEs for other...

Read more »

R, Twitter and McDonald’s

March 23, 2012

Ed Chen is a data scientist at Twitter, so he's accustomed to working with big data and complex models. In an interview with MIT Technology Review, he describes his data science toolbox: A common pattern for me is that I'll code a MapReduce job in Scala, do some simple command-line munging on the results, pass the data into Python...

Read more »

This graph shows that President Obama’s proposed budget treats the NIH even worse than G.W. Bush – Sign the petition to increase NIH funding!

March 23, 2012

The NIH provides financial support for a large percentage of biological and medical research in the United States. This funding supports a large number of US jobs, creates new knowledge, and improves healthcare for everyone. So I am signing this petiti...

Read more »

Low (and high) volatility strategy effects

March 23, 2012

Does minimum variance act differently from low volatility? Does either of them act like low beta? What about high volatility versus high beta? Inspiration: Falkenblog had a post investigating differences in results when using different strategies for low volatility investing. Here we look not at a single portfolio of a given strategy over time, but …

Read more »

Forecasts and ggplot

March 22, 2012

The forecast package uses the base R graphics for all plots, but some people may prefer to use the nice graphics available using the ggplot2 package. In the following two posts, Frank Davenport shows how it can be done: Plotting forecast() objects in ...

Read more »

Project Euler: Problem 20

March 22, 2012

n! means n x (n - 1) x ... x 3 x 2 x 1. For example, 10! = 10 x 9 x ... x 3 x 2 x 1 = 3628800, and the sum of the digits in the number 10! is 3 + 6 + 2 + 8 + 8 + 0 + 0 = 27. Find the sum of the di...
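The digit-sum calculation in the worked example can be sketched in base R. This is only an illustration, not the code from the linked post: the function name `factorial_digit_sum` is made up here, and the digit-vector approach is one assumed way to avoid the precision loss that `factorial()` suffers once results exceed what a double can represent exactly.

```r
# Hypothetical helper: sum of the decimal digits of n!.
# The running product is kept as a vector of single digits
# (least-significant digit first), so no precision is lost for large n.
factorial_digit_sum <- function(n) {
  digits <- 1L
  for (k in seq_len(n)) {
    prod <- digits * k              # multiply every digit by k
    carry <- 0
    for (i in seq_along(prod)) {    # propagate carries left to right
      total <- prod[i] + carry
      prod[i] <- total %% 10
      carry <- total %/% 10
    }
    while (carry > 0) {             # append any remaining carry digits
      prod <- c(prod, carry %% 10)
      carry <- carry %/% 10
    }
    digits <- prod
  }
  sum(digits)
}

factorial_digit_sum(10)  # 27, matching the 10! example
```

The same function can be called with a larger argument; only the 10! case is checked against the text here.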

Read more »

Do we appreciate sunbathing in Spring ?

March 22, 2012

We are currently experiencing an extremely hot month in Montréal (and more generally in North America). Looking at people having a beer, and starting the first barbecue of the year, I was wondering: if we asked people if global warming was a good ...

Read more »

Using Ggplot2 to plot last.fm top 100 albums

March 22, 2012

I found out that last.fm had made data files available for their Best of 2011 artist list, and I thought it'd be a great opportunity to learn some more about data management in R and Ggplot2.

Read more »

Tracking SFO Airport’s Performance Using R, HANA and D3

March 22, 2012

This is my first introduction to D3 and I am simply blown away. Mike Bostock (@mbostock), you are a genius and thanks for creating D3! With HANA, R, D3, HTML5 and iPad, you've got yourself a KILLER combo! I have been burning my midnight oil o...

Read more »

Comparing Banzhaf and Shapley-Shubik power indices

March 22, 2012

Last week I analyzed the Shapley-Shubik power index in R. I got several requests to write code calculating the Banzhaf power index. Here is the proposed code. Again I use data from the Warsaw School of Economics rector elections (the details are in my last post)....

Read more »

Exponentiation of a matrix (including pseudoinverse)

March 22, 2012

The following function "exp.mat" allows for the exponentiation of a matrix (i.e. calculation of a matrix to a given power). The function follows three steps:
1) Singular Value Decomposition (SVD) of the matrix
2) Exponentiation of the singular values
3) Re-calculation of the matrix with the new singular values
The most common case where the method is applied...
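As a rough illustration of the three steps, here is a minimal sketch in base R. It is not the post's "exp.mat" function: the name `svd_power` and the `tol` cutoff are assumptions made for this example. The recombination V D^p U' is assumed to give a true matrix power only for symmetric positive semi-definite input; with p = -1 it gives the Moore-Penrose pseudoinverse of any matrix.

```r
# Hypothetical helper (not the post's exp.mat).
svd_power <- function(A, p, tol = 1e-10) {
  s <- svd(A)                          # step 1: A = U diag(d) V'
  keep <- s$d > tol                    # drop (near-)zero singular values
  d_p <- s$d[keep]^p                   # step 2: exponentiate singular values
  V <- s$v[, keep, drop = FALSE]
  U <- s$u[, keep, drop = FALSE]
  V %*% diag(d_p, nrow = length(d_p)) %*% t(U)   # step 3: recombine
}

# Square root of a symmetric positive definite matrix
A <- matrix(c(2, 1, 1, 2), 2, 2)
R <- svd_power(A, 0.5)
all.equal(R %*% R, A)

# Pseudoinverse of a non-square matrix (p = -1)
B <- matrix(c(1, 2, 3, 4, 5, 6), 3, 2)
P <- svd_power(B, -1)
all.equal(B %*% P %*% B, B)
```

Discarding singular values below `tol` is what keeps the p = -1 case from dividing by zero on rank-deficient input.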

Read more »

Example 9.24: Changing the parameterization for categorical predictors

March 22, 2012

In our book, we discuss the important question of how to assign different parameterizations to categorical variables when fitting models (section 3.1.3). We show code in R for use in the lm() function, as follows: lm(y ~ x, contrasts=list(x,"contr.trea...

Read more »

as.character() for rownames()

March 22, 2012

Rainer pointed out, in response to my post, Row names in data frames: Beware of 1:nrow, that if I’d used rownames(x) <- as.character(1:3) rather than rownames(x) <- 1:3, I wouldn’t have had the problem I’d seen. If you type rownames(x) you see the same result as rownames(z), and is.character(rownames(x)) and is.character(rownames(z)) both return TRUE, but

Read more »

Montreal R Workshop: Introduction to Bayesian Methods

March 22, 2012

Monday, March 26, 2012, 14h-16h, Stewart Biology N4/17. Corey Chivers, Department of Biology, McGill University. This is a meetup of the Montreal R User Group. Be sure to join the group and RSVP. More information about the workshop here. Topics: Why would we want to be Bayesian in the first place? In this workshop we

Read more »

Plotting forecast() objects in ggplot part 2: Visualize Observations, Fits, and Forecasts

March 21, 2012

In my last post I presented a function for extracting data from a forecast() object and formatting the data so that it can be plotted in ggplot.  The scenario is that you are fitting a model to a time series object with training data, then forecas...

Read more »

Metadata Dubuque and UHI

March 21, 2012

I’m in the process of remaking all the metadata from scratch and looking once again at the question of UHI. There are not any global conclusions we can draw from the data yet; I’m just in the process of checking out everything that is available that could be used to illuminate the problem. The problem,

Read more »

How to Generate Exponential Delays

March 21, 2012

This question arose while addressing comments on a previous blog post about exponentially distributed delays. One of my ongoing complaints is that many, if not most, popular load-test generation tools do not provide exponential variates as part of a library of time delays or think-time distributions. Not only is this situation bizarre, given that all load tests...
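One standard way to get exponential variates from a tool that only offers uniform random numbers is inverse-transform sampling. It may or may not be the approach the linked post goes on to describe. The sketch below is in base R, and the function name `exp_delay` and the parameter `mean_delay` are made up for this illustration.

```r
# Hypothetical helper: if U ~ Uniform(0, 1), then -mean_delay * log(U)
# is exponentially distributed with mean mean_delay.
exp_delay <- function(n, mean_delay) {
  u <- runif(n)                  # uniform variates, which most tools provide
  -mean_delay * log(u)
}

set.seed(42)                     # fixed seed for a reproducible illustration
delays <- exp_delay(5, mean_delay = 2)   # five think times with mean 2
```

In R itself the built-in `rexp(n, rate = 1 / mean_delay)` draws from the same distribution, so the hand-rolled version is mainly useful as a template for tools that lack such a function.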

Read more »

R/Finance 2012 program announced, registration open

March 21, 2012

Registration is now open for R/Finance 2012 in Chicago, the conference devoted to applications of R in the financial sector. The program has also been announced, with topics including: modelling insurance claim reserves; risk management in power markets; peer performance of hedge funds; hedging event risk; operational risk measurement with R and RevoScaleR; and many other applications of R....

Read more »

Row names in data frames: beware of 1:nrow

March 21, 2012

I spent some time puzzling over row names in data frames in R this morning. It seems that if you make the row names for a data frame, x, as 1:nrow(x), R will act as if you’d not assigned row names, and the names might get changed when you do rbind. Here’s an illustration: As

Read more »

knitr, github, and a new phase for the lab notebook

March 21, 2012

I have recently modified the basic workflow of my lab notebook since discovering knitr. Before, I would write code files which I could track on github, push figures created by the code to flickr, and then write a notebook entry on wordpress describing what I was doing. I’d embed each figure I wanted into the

Read more »