## Is this good or bad programming?

September 1, 2010
If I come across this kind of code when I'm checking (QCing) code, it makes me want to punch the programmer's face. I find that it's impossible to step through and check each dataset against the previous incarnation, which is how I check what h...

## apply() function and ABM in R

September 1, 2010
I know, I know... I've been away again... We (myself and Mark Lake) are presenting a paper at the CECD conference and we still have some stuff to finish... so I'm really, really busy... I'll post asap a much more detailed post on the conference and o...

## Monte Carlo testing of classification groups

September 1, 2010
This is another article on the theme of defining groups in a hierarchical classification. A previous article described homogeneity analysis to visualize how well any number of groups, defined at the same level, accounts for the variability in the dataset, as measured by within-group pairwise distances. Here we will look at testing whether splitting a particular group...

## Using XML package vs. BeautifulSoup

August 31, 2010
A while back I posted something about scraping a webpage using the BeautifulSoup module in Python. One of the comments to that post was by Larry — a blogger over at IEORTools — suggesting that I take a look at …

## Better than Average

August 31, 2010
NIST's Engineering Statistics Handbook includes an Introduction to Time Series Analysis, which provides a great way of demonstrating how R can be used to make such calculations. This post replicates the analys...

## apply functions in R

August 31, 2010
Getting to know the "apply"s in R is extremely handy for using the language efficiently and effectively. Unfortunately, the help files tend to be rather information-dense and are fairly overwhelming for newcomers. A recent blog post by Neil Saunders pr...

## Birds of a feather shop together

August 31, 2010
PREDICTING CONSUMER BEHAVIOR FROM SOCIAL NETWORKS. This week, Decision Science News is doing a special cross-posting with Messy Matters. The post below is by Sharad Goel and describes work that he and your Decision Science News editor Dan Goldstein are jointly undertaking at Yahoo! Do you know what the #\$*! your social media strategy is?

## R is indispensable, because it’s reproducible

August 31, 2010
Maria Wolters, self-styled "Science-Mum of two" and speech and language technology researcher, has a great blog post about the one tool she couldn't live without: R. Maria says R is her "favourite tool for analysing experimental results and modelling the resulting patterns of behaviour and preferences", and explains why: R is a programming language for everything statistical. It’s free,...

## Soil Properties Visualized on a 1km Grid

August 31, 2010
Fresno Area Urban Areas vs Irrigated LCC: grey regions are current urban areas. A couple of maps generated from a 1km gridded soil property database, derived from SSURGO data where available, with holes filled with STATSGO data. Soil properties visualize...

## RClimate Tools for Do It Yourself Climate Trend Analysis

August 31, 2010
In this post I introduce my RClimate functions, which allow R users to easily download and plot monthly temperature anomaly data for the 5 major global temperature anomaly data series: GISS, HAD, NOAA, RSS, UAH. Consolidated LOTA Data File In …

## Namespaces and name conflicts

August 31, 2010
The R packages ‘igraph’ and ‘network’ are good examples of two packages that provide similar but complementary functionality and have a lot of name conflicts. As of now, the ‘igraph’ package has a namespace while the ‘network’ package (version 1.4-1) does not. This became an issue when I was working on the ‘intergraph’ package.

August 31, 2010
A few weeks ago I suddenly reached the point that every graduate student once thought would never come - time to start writing my thesis. With a blank page and a blinking cursor staring me in the face it's time to compile all of my published and unpubl...

## Even Simpler Multivariate Correlated Simulations

August 31, 2010
So after yesterday’s post on Simple Simulation using Copulas I got a very nice email that basically begged the question, “Dude, why are you making this so hard?” The author pointed out that if what I really want is a Gaussian correlation structure for Gaussian distributions then I could simply use the mvrnorm() function from

August 31, 2010
## Map colors

August 31, 2010
Reader P was kind enough to make us a new color map, so I promptly played around with it and other parameters. I need to figure out how to drop the labels and ticks on the “map”; map.axes() is no help. In any case, I had a day-long struggle with my R setup, it's all

August 30, 2010
I’ve had a wonderful summer, very busy, but now I’ve finally had some time to sit down and program some things for NppToR that I’ve been wanting to get out. Thanks to Yihui Xie and his wonderful R script for generating auto-completion files, NppToR now has a dynamic Auto-Completion feature like the Dynamic Syntax generation

## Econometrics and R

August 30, 2010
Econometricians seem to be rather slow to adopt new methods and new technology (compared to other areas of statistics), but slowly the use of R is spreading. I’m now receiving requests for references showing how to use R in econometrics, and so I thought it might be helpful to post a few suggestions here. A

## Hyper-g priors

August 30, 2010
Earlier this month, Daniel Sabanés Bové and Leo Held posted a paper about g-priors on arXiv. While I glanced at it for a few minutes, I did not have the chance to get a proper look at it till last Sunday. The g-prior was first introduced by the late Arnold Zellner for (standard) linear models,

## The Chosen One

August 30, 2010
Toss one hundred different balls into your basket. Shuffle them up and select one with equal probability amongst the balls. That ball you just selected, it’s special. Before you put it back, increase its weight by 1/100th. Then put it back, mix up the balls and pick again. If you do this enough, at some
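
The process described above can be sketched in R as a small simulation. This is only an illustration under stated assumptions, none of which are confirmed by the post: every ball starts with weight 1, "increase its weight by 1/100th" is read as adding 0.01, each draw is made with probability proportional to the current weights, and `n_draws` is an arbitrary choice.

```r
# Sketch of the reinforced-selection process (assumptions noted inline).
set.seed(42)                      # arbitrary seed, for reproducibility only

n_balls <- 100
n_draws <- 10000                  # hypothetical number of repetitions
weights <- rep(1, n_balls)        # assumption: every ball starts with weight 1

for (i in seq_len(n_draws)) {
  # Draw one ball with probability proportional to its current weight.
  # On the first draw all weights are equal, so every ball is equally likely.
  picked <- sample.int(n_balls, size = 1, prob = weights)
  # Assumption: "increase its weight by 1/100th" means adding 0.01.
  weights[picked] <- weights[picked] + 1 / n_balls
}

# Share of the total weight held by the heaviest ball after all draws.
max(weights) / sum(weights)
```

Changing `n_draws` or the size of the increment is a quick way to explore how strongly early selections are reinforced.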

## Stochastic Simulation With Copulas in R

August 30, 2010
A friend of mine gave me a call last week and was wondering if I had a little R code that could illustrate how to do a Cholesky decomposition. He ultimately wanted to build a Monte Carlo model with correlated variables. I pointed him to a number of packages that do Cholesky decomp but then
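
As a rough illustration of the idea (not the code from the post), correlated normal draws can be produced in base R by multiplying independent standard normals by a Cholesky factor. The 2 x 2 target matrix `sigma`, the correlation of 0.7, and the sample size are hypothetical values chosen for this sketch.

```r
# Sketch: correlated Gaussian draws via a Cholesky factor (base R only).
set.seed(1)                       # arbitrary seed, for reproducibility only

# Hypothetical target correlation matrix.
sigma <- matrix(c(1.0, 0.7,
                  0.7, 1.0), nrow = 2, byrow = TRUE)

# chol() returns the upper-triangular factor U with t(U) %*% U == sigma.
U <- chol(sigma)

# Independent standard normal draws, one row per simulated observation.
n <- 10000
z <- matrix(rnorm(2 * n), ncol = 2)

# Each row of x has covariance t(U) %*% U = sigma (up to sampling noise).
x <- z %*% U

cor(x)                            # sample correlation, close to sigma
```

Multiplying on the right by `U` works because the rows of `z` have identity covariance, so the rows of `z %*% U` have covariance `t(U) %*% U`.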

August 30, 2010
Once you've downloaded PDQ with a view to solving your performance-related questions, the next step is getting started using it. Why not have some fun with blocks? Fun-ctional blocks, that is. Since all digital computers and network systems can be considered as a collection of functional blocks and these blocks often contain buffers, their performance can be modeled...

## Taking R to the Limit: Large Datasets; Predictive modeling with PMML and ADAPA

August 30, 2010
During the first part of our meeting, Ryan Rosario presented on the topic of large datasets in R. Video, slides and code of the talk “Taking R to the Limit: Large Datasets” by Ryan Rosario at the Los Angeles area …

## Sweet bar chart o’ mine

August 30, 2010
Last week I was asked to visualise some heart rate data from an experiment. ... The standard way of displaying a time series (that is, a numeric variable that changes over time) is with a line plot. ... The experimenters, however, wanted a bar chart. I hadn't considered this use of a bar chart before, so it was interesting...