## R script to calculate QIC for Generalized Estimating Equation (GEE) Model Selection

March 23, 2012

Generalized Estimating Equations (GEE) can be used to analyze longitudinal count data; that is, repeated counts taken from the same subject or site. This is often referred to as repeated measures data, but longitudinal data often has more repeated obse...

## Gini Efficient Frontier

March 23, 2012

David Varadi has recently written two posts about the Gini coefficient: I Dream of Gini, and Mean-Gini Optimization. I want to show how to use the Gini risk measure to construct an efficient frontier and compare it with the alternative risk measures I discussed previously. I will use the Gini mean difference risk measure – the mean of the difference

## Serious stats – free statistics resources

March 23, 2012

The companion web site for Serious Stats is now live: http://www.palgrave.com/psychology/baguley/

The web site includes:

- a free sample chapter (Chapter 15: Contrasts)
- data sets
- R scripts
- 5 online supplements (for meta-analysis, multiple imputation, r...

## Serious stats companion web site now live: sample chapter, data and R scripts

March 23, 2012

The companion web site for Serious stats is now live: http://www.palgrave.com/psychology/Baguley/ It includes a sample chapter (Chapter 15: Contrasts), data sets, R scripts for all the examples and supplementary material. Filed under: news, R code, ser...

## Dissimilarity Between Soil Profiles: A Closer Look

March 23, 2012

Continuing the previous discussion of pair-wise dissimilarity between soil profiles, the following demonstration (code, comments, and figures) further elaborates on the method. A more in-depth discussion of this example will be included as a vignette w...

## Launching iButton Thermochrons with the help of R

March 23, 2012

Maxim's iButton Thermochron temperature dataloggers are little silver doo-dads the size of a large watch battery that can record up to 2048 time-stamped temperature values. The internal battery is usually good for a few years of use. Maxim supplies a J...

## R in Google Summer of Code 2012

March 23, 2012

This post is a slightly revised (and "blogified") version of the message Brian Peterson has sent to various R mailing lists. Once again, R has been accepted as a mentoring organization for the Google Summer of Code (2012). We invite students interested in this program to learn more about it. A good starting point...

## RStudio Development Environment

March 23, 2012

Compared to many other languages of equal popularity, there are relatively few development environments for R. In fact, the total number of production-ready R IDEs could probably be counted on one hand. That deficiency is a small price to pay to use R, and if you’re not already accustomed to using IDEs for other...

March 23, 2012

Ed Chen is a data scientist at Twitter, so he's accustomed to working with big data and complex models. In an interview with MIT Technology Review, he describes his data science toolbox: A common pattern for me is that I'll code a MapReduce job in Scala, do some simple command-line munging on the results, pass the data into Python...

## This graph shows that President Obama’s proposed budget treats the NIH even worse than G.W. Bush – Sign the petition to increase NIH funding!

March 23, 2012

The NIH provides financial support for a large percentage of biological and medical research in the United States. This funding supports a large number of US jobs, creates new knowledge, and improves healthcare for everyone. So I am signing this petiti...

## Low (and high) volatility strategy effects

March 23, 2012

Does minimum variance act differently from low volatility? Does either of them act like low beta? What about high volatility versus high beta? Inspiration: Falkenblog had a post investigating differences in results when using different strategies for low volatility investing. Here we look not at a single portfolio of a given strategy over time, but …

## Forecasts and ggplot

March 22, 2012

The forecast package uses the base R graphics for all plots, but some people may prefer to use the nice graphics available using the ggplot2 package. In the following two posts, Frank Davenport shows how it can be done: Plotting forecast() objects in ...

## Project Euler: Problem 20

March 22, 2012

n! means n x (n - 1) x ... x 3 x 2 x 1

For example, 10! = 10 x 9 x ... x 3 x 2 x 1 = 3628800, and the sum of the digits in the number 10! is 3 + 6 + 2 + 8 + 8 + 0 + 0 = 27.

Find the sum of the di...
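The digit-sum idea in the worked example can be sketched in base R. The function name `factorial_digit_sum` is made up for illustration and is not taken from the original post; it keeps the running product as a vector of decimal digits so that large factorials do not lose precision in floating point.

```r
# Sum of the decimal digits of n!, using schoolbook multiplication on a
# vector of digits (least-significant digit first).
# `factorial_digit_sum` is a hypothetical helper name, not from the post.
factorial_digit_sum <- function(n) {
  digits <- 1L                      # the number 1, i.e. 0!
  for (k in seq_len(n)) {
    scaled <- digits * k            # multiply every digit by k
    carry <- 0
    for (i in seq_along(scaled)) {  # propagate carries from low to high digits
      total <- scaled[i] + carry
      scaled[i] <- total %% 10
      carry <- total %/% 10
    }
    while (carry > 0) {             # append any remaining carry as new digits
      scaled <- c(scaled, carry %% 10)
      carry <- carry %/% 10
    }
    digits <- scaled
  }
  sum(digits)
}

factorial_digit_sum(10)  # 27, matching the worked example above
```

The same call with a larger argument answers the general question without relying on `factorial()`, which returns a double and is only exact for small n.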

## Do we appreciate sunbathing in Spring ?

March 22, 2012

We are currently experiencing an extremely hot month in Montréal (and more generally in North America). Looking at people having a beer, and starting the first barbecue of the year, I was wondering: if we asked people if global warming was a good ...

## Using Ggplot2 to plot last.fm top 100 albums

March 22, 2012

I found out that last.fm had made data files available for their Best of 2011 artist list, and I thought it'd be a great opportunity to learn some more about data management in R and ggplot2.

## Tracking SFO Airport’s Performance Using R, HANA and D3

March 22, 2012

This is my first introduction to D3 and I am simply blown away. Mike Bostock (@mbostock), you are a genius, and thanks for creating D3! With HANA, R, D3, HTML5 and iPad, you've got yourself a KILLER combo! I have been burning my midnight oil o...

## Comparing Banzhaf and Shapley-Shubik power indices

March 22, 2012

Last week I analyzed the Shapley-Shubik power index in R. I got several requests to write code calculating the Banzhaf power index. Here is the proposed code. Again I use data from the Warsaw School of Economics rector elections (the details are in my last post)....

## Exponentiation of a matrix (including pseudoinverse)

March 22, 2012

The following function "exp.mat" allows for the exponentiation of a matrix (i.e. calculation of a matrix to a given power). The function follows three steps:

1) Singular Value Decomposition (SVD) of the matrix
2) Exponentiation of the singular values
3) Re-calculation of the matrix with the new singular values

The most common case where the method is applied...
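The three steps can be sketched in base R as follows. This is not the post's `exp.mat` (its body is not shown in the excerpt); the name `svd_power` and the `tol` argument are assumptions made for illustration. The sketch also assumes a restricted setting: a non-negative exponent is only meaningful here for a symmetric positive semi-definite matrix, while an exponent of -1 gives the Moore-Penrose pseudoinverse of any matrix.

```r
# Hypothetical sketch of matrix exponentiation via SVD (not the post's exp.mat).
# Assumption: for p >= 0, A is symmetric positive semi-definite.
# For p = -1, A may be any matrix and the result is its pseudoinverse.
svd_power <- function(A, p, tol = 1e-10) {
  s <- svd(A)                               # step 1: A = U diag(d) V'
  keep <- s$d > tol * max(s$d)              # drop (near-)zero singular values
  d_new <- s$d[keep]^p                      # step 2: exponentiate singular values
  U <- s$u[, keep, drop = FALSE]
  V <- s$v[, keep, drop = FALSE]
  D <- diag(d_new, nrow = length(d_new))    # nrow guards the length-1 case
  if (p < 0) {
    V %*% D %*% t(U)                        # step 3, inverse direction: V D U'
  } else {
    U %*% D %*% t(V)                        # step 3: U D V'
  }
}

A <- matrix(c(2, 1, 1, 3), nrow = 2)        # symmetric positive definite
all.equal(svd_power(A, 2), A %*% A)         # squaring agrees with A %*% A
all.equal(svd_power(A, -1), solve(A))       # inverse agrees with solve()

B <- matrix(1:6, nrow = 3)                  # rectangular, so no ordinary inverse
B_pinv <- svd_power(B, -1)
all.equal(B %*% B_pinv %*% B, B + 0)        # defining property of a pseudoinverse
```

Dropping singular values below the tolerance is what makes the negative exponent safe for rank-deficient input; without it, step 2 would divide by values that are numerically zero.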

## Example 9.24: Changing the parameterization for categorical predictors

March 22, 2012

In our book, we discuss the important question of how to assign different parameterizations to categorical variables when fitting models (section 3.1.3). We show code in R for use in the lm() function, as follows: lm(y ~ x, contrasts=list(x,"contr.trea...

## as.character() for rownames()

March 22, 2012

Rainer pointed out, in response to my post, Row names in data frames: Beware of 1:nrow, that if I’d used rownames(x) <- as.character(1:3) rather than rownames(x) <- 1:3, I wouldn’t have had the problem I’d seen. If you type rownames(x) you see the same result as rownames(z), and is.character(rownames(x)) and is.character(rownames(z)) both return TRUE, but

## Montreal R Workshop: Introduction to Bayesian Methods

March 22, 2012

Monday, March 26, 2012, 14h-16h, Stewart Biology N4/17. Corey Chivers, Department of Biology, McGill University. This is a meetup of the Montreal R User Group. Be sure to join the group and RSVP. More information about the workshop here. Topics: Why would we want to be Bayesian in the first place? In this workshop we

## Plotting forecast() objects in ggplot part 2: Visualize Observations, Fits, and Forecasts

March 21, 2012

In my last post I presented a function for extracting data from a forecast() object and formatting the data so that it can be plotted in ggplot.  The scenario is that you are fitting a model to a time series object with training data, then forecas...

March 21, 2012

I’m in the process of remaking all the metadata from scratch and looking once again at the question of UHI. There are not any global conclusions we can draw from the data yet; I’m just in the process of checking out everything that is available that could be used to illuminate the problem. The problem,

## How to Generate Exponential Delays

March 21, 2012

This question arose while addressing comments on a previous blog post about exponentially distributed delays. One of my ongoing complaints is that many, if not most, popular load-test generation tools do not provide exponential variates as part of a library of time delays or think-time distributions. Not only is this situation bizarre, given that all load tests...
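For a tool that only offers uniform random numbers, exponential delays can be produced with the standard inverse-transform method. The sketch below is an illustration under that assumption, not the approach from the original post (which the excerpt does not show); `exp_delay` and `mean_delay` are hypothetical names. In R itself the built-in `rexp()` does the same job.

```r
# Inverse-transform sampling: if U is uniform on (0, 1), then
# -mean_delay * log(U) is exponentially distributed with that mean.
# `exp_delay` is a hypothetical helper name used for illustration.
exp_delay <- function(n, mean_delay) {
  u <- runif(n)            # uniform variates strictly inside (0, 1)
  -mean_delay * log(u)     # map them to exponential think times
}

set.seed(42)
delays <- exp_delay(100000, mean_delay = 2.5)
mean(delays)               # should be close to 2.5
min(delays) > 0            # delays are always positive
```

Comparing `mean(delays)` with `mean(rexp(100000, rate = 1 / 2.5))` is a quick sanity check that the hand-rolled generator and the built-in one agree.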

## R/Finance 2012 program announced, registration open

March 21, 2012

Registration is now open for R/Finance 2012 in Chicago, the conference devoted to applications of R in the financial sector. The program has also been announced, with topics including: modelling insurance claim reserves; risk management in power markets; peer performance of hedge funds; hedging event risk; operational risk measurement with R and RevoScaleR; and many other applications of R....

## Row names in data frames: beware of 1:nrow

March 21, 2012

I spent some time puzzling over row names in data frames in R this morning. It seems that if you make the row names for a data frame, x, as 1:nrow(x), R will act as if you’d not assigned row names, and the names might get changed when you do rbind. Here’s an illustration: As