## PC-Axis with R: pxR

PC-Axis is a software family consisting of a number of programs for the Windows and Internet environments used to present statistical information. It is used by national and international institutions to publish statistical data. Programs in the PC-Axis family use a particular data file format (see the full PC-Axis data format description). Now the pxR

## BSE Bhavcopy with Delivery Quantity

July 24, 2011

One of my TI forum members, IV, had a requirement for BSE quotes along with delivery quantity. This led me to use R's "merge" function (thanks to the great work done by the people behind various packages and the guidance available on the R mailing lists)...
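As a rough illustration of the kind of join described here, the following is a minimal sketch using base R's `merge`. The data frames, the column names (`SC_CODE`, `CLOSE`, `DELIVERY_QTY`) and all values are made up for the example; they are not taken from the BSE files or from the author's code.

```r
# Hypothetical end-of-day quotes (made-up values)
quotes <- data.frame(
  SC_CODE = c(500010, 500112, 500325),
  CLOSE   = c(690.5, 2450.0, 880.2)
)

# Hypothetical delivery data for the same day (one code is missing)
delivery <- data.frame(
  SC_CODE      = c(500010, 500325),
  DELIVERY_QTY = c(120000, 98000)
)

# Left join: keep every quote, leave NA where no delivery quantity exists
bhav <- merge(quotes, delivery, by = "SC_CODE", all.x = TRUE)
print(bhav)
```

Using `all.x = TRUE` keeps scrips that have a quote but no delivery record; dropping it would give an inner join instead.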

## Parallel JAGS RNGs

July 23, 2011

As a matter of convention, we usually run 3 or 4 chains in JAGS. By default, this gives rise to chains that draw samples from 3 or 4 distinct pseudorandom number generators. I didn’t go and check whether it does things 111,222,333 or 123,123,123, but in any event the “parallel chains” in JAGS are samples
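The excerpt does not show how the author handles the generators, so the following is only a sketch of one way to pin them down explicitly: building one list of initial values per chain, each naming its RNG and seed. `.RNG.name` and `.RNG.seed` are the element names JAGS looks for in the initial values; the number of chains and the seeds are arbitrary choices for illustration.

```r
# One list of initial values per chain, each naming its RNG and seed.
# ".RNG.name" and ".RNG.seed" are the element names JAGS recognises;
# the seeds below are arbitrary.
n.chains <- 3
inits <- lapply(seq_len(n.chains), function(i) {
  list(.RNG.name = "base::Mersenne-Twister", .RNG.seed = 100 + i)
})
str(inits)

# With rjags installed, the list could be passed along the lines of
#   jags.model(file = "model.bug", data = dat, inits = inits,
#              n.chains = n.chains)
# where "model.bug" and dat are placeholders, not files from the post.
```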

July 23, 2011

Earlier this week, Conrad Sanderson issued a minor bug fix release 2.0.2 of his Armadillo library which provides templated C++ code for linear algebra. We wrapped that into a new RcppArmadillo release 0.2.26 and shipped it to CRAN. Due to it being su...

## Passing non-graphical parameters to graphical functions using …

July 23, 2011

Argument passing via ... is a great feature of the R language, allowing you to write wrappers around existing functions that do not need to list all the arguments of the wrapped function. ... is used extensively in S3 methods and in passing graphical parameters on to graphical functions. When writing your own plot methods, using ... allows the...
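To make the idea concrete, here is a minimal sketch of a wrapper that keeps one non-graphical argument for itself and forwards everything else to `plot()` through `...`. The function name `plot_scaled` and its `scale` argument are hypothetical, invented for this example rather than taken from the post.

```r
# Hypothetical wrapper: 'scale' is our own non-graphical argument;
# everything in ... is forwarded untouched to plot().
plot_scaled <- function(x, y, scale = 1, ...) {
  plot(x, y * scale, ...)
  invisible(y * scale)
}

# col, pch and main reach plot() through ...
pdf(NULL)   # null device so the example runs without a display
res <- plot_scaled(1:5, c(2, 4, 6, 8, 10), scale = 0.5,
                   col = "red", pch = 19, main = "Scaled values")
dev.off()
```

Because `scale` is a named formal argument, it is matched before `...` and never reaches `plot()`, which is what keeps it from being mistaken for a graphical parameter.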

## RomanM’s Method

July 22, 2011

I’ve succeeded in getting a version of RomanM and JeffId’s Thermal hammer working with version 1.3 of RghcnV3. This is going to be a long post because there is a lot of ground to cover. First, some errata, the “Globe” demo in V1.3 appears to have a missing line, looks like an editor bug, so

## Clustering U.S. Senators using roll call voting data

July 22, 2011

For our forthcoming book on machine learning for hackers, John Myles White and I will discuss clustering, and various methods for doing so. One common method for clustering observations

## IBM Netezza: Embrace open source analytics

July 22, 2011

Earlier this month Thomas Dinsmore, solutions architect for IBM Netezza’s Advanced Analytics team, had a great blog post on why companies should embrace R as an analytics platform. He says:

There are three main reasons R should be part of your enterprise analytics architecture:

- R has capabilities not available in commercial analytics software
- Usage of R by analysts is...

## A bit of fun with R

July 22, 2011

R isn't just about serious things like model inference and prediction intervals and big analytics. Sometimes, R lets its hair down and just does weird and wonderful things because ... well just because. For example, with a package from Paulo Sonego, it can display your favourite XKCD cartoon:

    > install.packages("RXKCD", repos="http://R-Forge.R-project.org",type="source")
    > searchXKCD("support")
      num title
    1   8 Red spiders...

## A Plot of 250 Random Walks

July 22, 2011

For some reason I feel like plotting some random walks. Nothing groundbreaking, but hopefully this post will be useful to someone. Here's my R code:

    # Generate k random walks across time {0, 1, ... , T}
    T <- 100
    k <- 250
    initial.value <- 10
    GetRa...
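Since the author's code is cut off in this excerpt, the following is an independent sketch of the same idea rather than a reconstruction of it. It assumes simple steps of +1 or -1 with equal probability, which the excerpt does not confirm, and uses `n.steps` instead of `T` to avoid masking R's shorthand for `TRUE`.

```r
set.seed(1)            # arbitrary seed, only for reproducibility

n.steps <- 100         # number of time steps
k <- 250               # number of walks
initial.value <- 10    # starting value of every walk

# Each column is one walk: the start value followed by cumulative +/-1 steps
steps <- matrix(sample(c(-1, 1), n.steps * k, replace = TRUE), nrow = n.steps)
walks <- rbind(initial.value, initial.value + apply(steps, 2, cumsum))

pdf(NULL)              # null device so the example runs without a display
matplot(0:n.steps, walks, type = "l", lty = 1, col = "grey40",
        xlab = "Time", ylab = "Value")
dev.off()
```

Storing the walks as columns of one matrix lets `matplot` draw all 250 lines in a single call.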

## Le Monde puzzle [#28]

July 22, 2011

The puzzle of last weekend in Le Monde was about finding the absolute rank of x9 when given the relative ranks of x1,….,x8 and the possibility to ask for relative ranks of three numbers at a time. In R terms, this means being able to use or yet being able to sort the first 8

## Uwe Ligges joins R-core

July 22, 2011

TU Dortmund professor Uwe Ligges is now a member of R-core, the group of 20 leading statisticians and computer scientists who oversee the R Project and develop and maintain the source code for the R engine and its core packages. Uwe has been very active in the R project for many years: he maintains the system that builds Windows...

## Parallel random forests using foreach

July 22, 2011

There's been some discussion on the kaggle forums and on a few blogs about various ways to parallelize random forests, so I thought I'd add my thoughts on the issue. Here's my version of the 'parRF' function, which is based on the elegant version in the...

## Prepping for useR! 2011 – tty connection update

July 22, 2011

I'm putting together my presentation for useR! 2011 titled "Experimenting with a tty connection for R". Hence, I've updated the tty connection patch to work with R versions 2.13.0 and 2.13.1. And, instead of re-listing the patch files and re-writing instructions on their application, I've devoted a small portion of my Code page for this

## A Quick Look At Unemployment

July 21, 2011

Labor market tightness is defined as the vacancies or job openings rate divided by the unemployment rate.  The theory goes that as job openings increase relative to the unemployment rate a tightness is created in that workers get the upper hand in...
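The definition above is a simple ratio, which can be sketched in R as follows. The rates are made-up numbers chosen only to show the arithmetic; they are not actual labor market data.

```r
# Made-up monthly rates, in percent (not real data)
job.openings.rate <- c(2.0, 2.2, 2.4)
unemployment.rate <- c(9.8, 9.4, 9.1)

# Tightness = job openings rate / unemployment rate
tightness <- job.openings.rate / unemployment.rate
print(round(tightness, 3))
```

In this toy series openings rise while unemployment falls, so the ratio increases from one month to the next.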

July 21, 2011

R reminds me a lot of English. It’s easy to get started, but very difficult to master. So for all those times I’ve spent… well, forever… trying to figure out the “R way” of doing something, I’m glad to share these quick wins. My recent R tutorial on mining Twitter for consumer sentiment wouldn’t have

## Smoothing temporally correlated data

July 21, 2011

Something I have been doing a lot of work with recently are time series data, to which I have been fitting additive models to describe trends and other features of the data. When modelling temporally dependent data, we often need to adjust our fitted models to account for the lack of independence in the model residuals. When smoothing such...

## Showcasing the latest phylogenetic methods: AUTEUR

July 20, 2011

While high-speed fish feeding videos may be the signature of the lab, dig a bit deeper and you’ll find a wealth of comparative phylogenetic methods sneaking in.  It’s a natural union — expert functional morphology is the key to good comparative methods, just as phylogenies hold the key to untangling the evolutionary origins of that

## Regional differences on what drives CO2 emissions

July 20, 2011

If you are investigating the change of CO2 emissions, then you might ask: Where do the changes occur? Well, here is the answer. The staircase plots show the contributing factors to CO2 emissions for each continent. population refers to population effects, gdp_pcap refers to income per capita, energy_intensity refers to energy used per dollar added value, and carbon intensity...

## Slides for Reproducible Research Talk at Interface 2011

July 20, 2011

I gave a talk at the Interface Symposium on reproducible research in practice. I went first in the session, so the slides have a bit more background and philosophy. It was a great session; one of Jon Claerbout's colleagues spoke, Sergey Fomel, a founding author of Madagascar; Sorin Mitran from UNC Chapel Hill talked about

July 20, 2011

A few days ago I heard a talk about Simpson's paradox, and I decided to write a little example in R:

    library(MASS) # For multivariate normals
    # List of (vectors of) means
    mu <- list(c(5, 175), c(6.25, 110))
    # List of covariance matrices
    sigma ...
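The author's example is truncated here, so what follows is a separate, deliberately simplified sketch of the paradox. Instead of simulated multivariate normals it uses small deterministic toy data (made up for this illustration) in which each group trends upward while the pooled data trend downward.

```r
# Deterministic toy data (made up): two groups with the same upward trend
x <- c(1:5, 6:10)
g <- factor(rep(c("A", "B"), each = 5))
y <- ifelse(g == "A", x + 10, x)

# Within each group the slope of y on x is positive
within.slopes <- sapply(split(data.frame(x, y), g),
                        function(d) coef(lm(y ~ x, data = d))[["x"]])

# Pooling the groups flips the sign of the slope
pooled.slope <- coef(lm(y ~ x))[["x"]]

print(within.slopes)
print(pooled.slope)
```

The reversal comes from the group offsets: group A sits at low x with high y, group B at high x with low y, so ignoring the grouping makes the overall relationship look negative.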

## Visualizing Kickstarter Projects with R

July 20, 2011

Kickstarter, a social funding platform where individuals can chip in cash to get a worthy project going, just celebrated its 10,000th kickstarted project. Kickstarter employee Fred Benenson recognized the achievement by visualizing the funding of music, design, art, game and many other kinds of projects using R and ggplot2. For example, here's a chart that shows the increasing rate...

## Shorting Mebane Faber

July 19, 2011

Although I do not personally know Mebane Faber, I know enough that I do not want to short him. However, I thought it would be insightful to see how the short side of his “A Quantitative Approach To Tactical Asset Allocation” might look.  Once ...

## The Road to Default: The Other Side of the Story

July 19, 2011

Okay so I was gliding through the articles of CNBC.com and stumbled upon one titled, "A Downgrade of U.S. Debt Won't Matter as Much as You Think." The argument laid down in this piece is that insurance companies and pension funds are required to hold h...

## Looking for NppToR beta testers.

July 19, 2011

NppToR 2.6 is coming with improved flexibility and speed. Testers needed before setting as default.