## Down and Dirty Forecasting: Part 2

May 24, 2013
This is the second part of the forecasting exercise, where I am looking at a multiple regression. To keep it simple I chose the states that border WI and the US unemployment information for the regression. Again this is a down and dirty analysis, I wo...

## What is probabilistic truth? Part 2 – Everything is conditional

May 24, 2013
Read Part 1. When making a statement of the form “1/2 is the correct probability that this coin will land tails”, there are a few things which are left unsaid, but which are typically implied. The statement is one about the probability of an unknown event occurring, and it would seem reasonable to write this

## Down and Dirty Forecasting: Part 1

May 24, 2013
I wanted to see what I could do in a hurry using the commands found at Forecasting: Principles and Practice. I chose a simple enough data set of Wisconsin Unemployment from 1976 to the present (April 2013). I kept the last 12 months’ worth of...

## Compiling R from Source with OpenMP, Accelerate and MKL in OS X

May 24, 2013
I set out to find out whether I could speed up R by compiling it from source and: using Apple’s Accelerate Framework; enabling OpenMP (which is disabled under OS X and Windows by default, but enabled under Linux); using Intel’s Math Kernel Library. I also wanted to know how an implicit parallel library,...

## Shiny + Concerto = YES !!!

May 23, 2013
So I have finally gotten beta access to the two most powerful R-controlled web application makers in existence and produced very exciting experimental products. A few posts ago I posted a Visual Reasoning Test that I had made by hand and powered wit...

## Robert Hijmans on Spatial Data Analysis

May 23, 2013
Last week at the Davis R Users’ Group, Robert Hijmans gave a talk about spatial data analysis in R. Robert is a professor of biogeography at UC Davis and the author of the raster (analysis of gridded data), dismo (species distribution modeling), and geosphere (spherical trigonometry) packages. Robert’s presentation spanned topics including basic...

## Working with shapefiles, projections and world maps in ggplot

May 23, 2013
In this post I show some examples of how to work with map projections and how to plot the maps using ggplot. Many maps that use the default projection are shown in longlat format, which is far from optimal. Here I show how to use either the Robinson or Winkel Tripel projection.

## 7th R/Rmetrics workshop in Switzerland, June 30-July 4

May 23, 2013
The 7th annual R/Rmetrics Workshop on Computational Finance and Financial Engineering will take place June 30-July 4 in the beautiful alpine setting of Lake Thun, Switzerland. This is an intimate workshop limited to around 50 participants, and features tutorials from leading practitioners in finance with R, with a special focus on the Rmetrics suite of R packages. This year's...

## Highlights of the Milwaukee Workshop on R and Bioinformatics

May 23, 2013
by Joseph Rickert. On May 10th and 11th, in honor of this being the International Year of Statistics, the Milwaukee Chapter of the American Statistical Association (MILWASA) held a workshop on cutting-edge uses of R in Bioinformatics. One objective of the workshop was to show the "nuts and bolts" details of how R with C++ integration and the...

## Package MatchIt: Balancing experimental data

May 23, 2013
A balanced experimental design is one in which the distribution of the covariates is the same in both the control and treatment groups. However, although achievable in an experimental scenario, for observational data this ideal is seldom attained. The MatchIt package provides a means of pre-processing data so that the treated and control groups are as similar

## Veterinary Epidemiologic Research: Modelling Survival Data – Non-Parametric Analyses

May 23, 2013
Next topic from Veterinary Epidemiologic Research: chapter 19, modelling survival data. We start with non-parametric analyses where we make no assumptions about either the distribution of survival times or the functional form of the relationship between a predictor and survival. There are 3 non-parametric methods to describe time-to-event data: actuarial life tables, Kaplan-Meier method, and

## Generating a Markov chain vs. computing the transition matrix

May 23, 2013
A couple of days ago, we had a quick chat on Karl Broman’s blog about snakes and ladders (see http://kbroman.wordpress.com/…) with Karl and Corey (see http://bayesianbiologist.com/…), and the use of Markov chains. I do believe that this application is truly awesome: the example is understandable by anyone, and computations (almost any kind, from what we’ve tried) are easy to perform....

## The R-Podcast Episode 13: Interview with Yihui Xie

May 23, 2013
It’s an episode of firsts on the R-Podcast! In this episode, recorded on location, I had the honor and privilege of interviewing Yihui Xie, author of many innovative packages such as knitr and animation. Some of the topics we discussed include: Yihui’s motivation for creating knitr and some key new features; how markdown plays a

## xkcd Style Bubble Plot

May 23, 2013
A package was recently released to generate plots in the style of xkcd using R. Being a big fan of the cartoon, I could not resist trying it out. So I set out to produce something like one of Hans Rosling’s bubble plots. First I needed some data. Spoilt for choice. I scraped some population data broken

## Investment Portfolio Analysis with R Language

May 22, 2013
R is widely applied in financial analysis areas such as time series analysis, portfolio management, and risk management, thanks to its base functions and many specialized finance packages. In this article, we will demonstrate how to

## Vote in the KDnuggets poll on Analytics Software

May 22, 2013
The 14th annual KDnuggets poll measuring use of analytics software is open for voting. The poll asks, "What Predictive Analytics, Big Data, Data mining, Data Science software you used in the past 12 months for a real project?" and allows up to 20 choices from commercial software, open source software, and "big data" software. R was the leading choice...

## Big Data Analytics in R – the tORCH has been lit!

May 22, 2013
## Operating on files with R: copy and rename

Nowadays, routine operations on files, such as renaming or copying, are performed with a few mouse clicks. Sometimes it is useful to perform these operations in batch. Linux users perform these operations through the shell. Windows users can also use the shell, …

## Package-defined S4 generic covered by a base S3 generic in R packages

May 22, 2013
While developing our agop package I encountered some problems with calling S4 generic functions defined in the Matrix package, that were created from “base” S3 generics. I don’t know whether it’s an R bug (tested in R 2.15 and R…

## What happened to six million voters?

May 22, 2013
The recent elections in Pakistan on May 11 were a great success by any measure. In spite of the threats of violence by Al-Qaeda and its local franchises in Pakistan against those who would vote, millions of Pakistanis indeed stepped out to vote for an elected government. The Election Commission of Pakistan (ECP) claimed a voter turnout of 60%....

## My Prime Sieve – Homage to Yitang Zhang

May 22, 2013
As a homage to Yitang Zhang, who has proven a mind-bending property of prime pairs, I have written a prime sieve to detect all of the prime numbers from 1 to N. There might very well be a function in the base package that already does this. No...

## Video: R, ProjectTemplate, RStudio and GitHub: Automate the boring bits and get on with the fun stuff

May 22, 2013
This post shares the video from the talk presented on 15th May 2013 by Dr Kendra Vant on ProjectTemplate, GitHub and RStudio at Melbourne R Users. Overview: Want to minimise the drudge work of data prep? Get started with test …

## Analytical and simulation-based power analyses for mixed-design ANOVAs

May 21, 2013
In this post I show some R examples of how to perform power analyses for mixed-design ANOVAs. The first example is analytical and adapted from formulas used in G*Power (Faul et al., 2007); the second example is a Monte Carlo simulation.

May 21, 2013
The OpenData StackExchange site has just launched in beta, and looks to be a great resource for open data sources. Like StackOverflow for programming and CrossValidated for statistics, OpenData is a question and answer site for developers and researchers interested in open data. There's no R tag yet (though that would be nice for data sources specifically compatible...

## Getting to the point – an alternative to the bezier arrow

May 21, 2013
An alternative bezier arrow to the regular grid-bezier. Apart from a cool gradient it has the advantages of: exact width, exact start/end points and axis...

## Spatial correlograms in R: a mini overview

May 21, 2013
Spatial correlograms are great for examining patterns of spatial autocorrelation in your data or model residuals. They show how correlated pairs of spatial observations are as you increase the distance (lag) between them - they are plots of some index…

## Pivot Tables for R: Try sqldf

May 21, 2013
Pivot tables are a growing staple for analysis in Excel, yet they remain limited to the functionality which Microsoft has chosen to include. Typical operations are the inclusion of filters, choice over rows, columns, and maths operations. In R …

## R Quick Tip: Shutdown Windows after Script Has Finished

May 21, 2013
Quite often I have long procedures running and want to run them overnight. However, my computer would still be running all night after the script has finished. This is easily circumvented by the following lines that I put at the end of such a script: `# set working dir` `# setwd("C:/Users/Kay/Desktop")` `# long procedure:` `for(i in 1:1e+5) {cat(i); cat("\n..................\n")}` d # save...

## Package party: Conditional Inference Trees

May 21, 2013
I am going to be using the party package for one of my projects, so I spent some time today familiarising myself with it. The details of the package are described in Hothorn, T., Hornik, K., & Zeileis, A. (1999). “party: A Laboratory for Recursive Partytioning” which is available from CRAN. The main workhorse of