R is indispensable, because it’s reproducible

August 31, 2010

Maria Wolters, self-styled "Science-Mum of two" and speech and language technology researcher, has a great blog post about the one tool she couldn't live without: R. Maria says R is her "favourite tool for analysing experimental results and modelling the resulting patterns of behaviour and preferences", and explains why: R is a programming language for everything statistical. It’s free,...

Read more »

Soil Properties Visualized on a 1km Grid

August 31, 2010

Fresno Area Urban Areas vs Irrigated LCC: grey regions are current urban areas.

A couple of maps generated from a 1km gridded soil property database, derived from SSURGO data where available, with holes filled with STATSGO data. Soil properties visualize...

Read more »

Namespaces and name conflicts

August 31, 2010

The R packages ‘igraph’ and ‘network’ are good examples of two packages that provide similar but complementary functionality and have a lot of name conflicts. As of now, the ‘igraph’ package has a namespace while the ‘network’ package (version 1.4-1) does not. This became an issue when I was working on the ‘intergraph’ package.

Read more »

Writing my Thesis – Follow me on Twitter

August 31, 2010

A few weeks ago I suddenly reached the point that every graduate student once thought would never come - time to start writing my thesis. With a blank page and a blinking cursor staring me in the face it's time to compile all of my published and unpubl...

Read more »

Even Simpler Multivariate Correlated Simulations

August 31, 2010

So after yesterday’s post on Simple Simulation using Copulas I got a very nice email that basically begged the question, “Dude, why are you making this so hard?” The author pointed out that if what I really want is a Gaussian correlation structure for Gaussian distributions then I could simply use the mvrnorm() function from

Read more »

Zurich 2010: R Course for Students

August 31, 2010

(This article was first published on Rmetrics blogs, and kindly contributed to R-bloggers.)

Read more »

Map colors

August 31, 2010

Reader P was kind enough to make us a new color map, so I promptly played around with it and other parameters. Need to figure out how to drop the labels and ticks on the “map”; map.axes() is no help. In any case, I had a day-long struggle with my R setup; it’s all

Read more »

NppToR 2.4.0 Adds Auto-Completion

August 30, 2010

I’ve had a wonderful summer, very busy, but now I’ve finally had some time to sit down and program some things for NppToR that I’ve been wanting to get out. Thanks to Yihui Xie and his wonderful R script for generating auto-completion files, NppToR now has a dynamic Auto-Completion feature like the Dynamic Syntax generation

Read more »

Econometrics and R

August 30, 2010

Econometricians seem to be rather slow to adopt new methods and new technology (compared to other areas of statistics), but slowly the use of R is spreading. I’m now receiving requests for references showing how to use R in econometrics, and so I thought it might be helpful to post a few suggestions here. A

Read more »

Hyper-g priors

August 30, 2010

Earlier this month, Daniel Sabanés Bové and Leo Held posted a paper about g-priors on arXiv. While I glanced at it for a few minutes, I did not have the chance to get a proper look at it till last Sunday. The g-prior was first introduced by the late Arnold Zellner for (standard) linear models,

Read more »

The Chosen One

August 30, 2010

Toss one hundred different balls into your basket. Shuffle them up and select one with equal probability amongst the balls. That ball you just selected, it’s special. Before you put it back, increase its weight by 1/100th. Then put it back, mix up the balls and pick again. If you do this enough, at some

Read more »

Stochastic Simulation With Copulas in R

August 30, 2010

A friend of mine gave me a call last week and was wondering if I had a little R code that could illustrate how to do a Cholesky decomposition. He ultimately wanted to build a Monte Carlo model with correlated variables. I pointed him to a number of packages that do Cholesky decomp but then

Read more »

Where to Start with PDQ?

August 30, 2010

Once you've downloaded PDQ with a view to solving your performance-related questions, the next step is getting started using it. Why not have some fun with blocks? Fun-ctional blocks, that is. Since all digital computers and network systems can be considered as a collection of functional blocks and these blocks often contain buffers, their performance can be modeled...

Read more »

Taking R to the Limit: Large Datasets; Predictive modeling with PMML and ADAPA

August 30, 2010

During the first part of our meeting, Ryan Rosario presented on the topic of large datasets in R. Video, slides and code of the talk “Taking R to the Limit: Large Datasets” by Ryan Rosario at the Los Angeles area …

Read more »

Sweet bar chart o’ mine

August 30, 2010

Last week I was asked to visualise some heart rate data from an experiment. ... The standard way of displaying a time series (that is, a numeric variable that changes over time) is with a line plot. ... The experimenters, however, wanted a bar chart. I hadn't considered this use of a bar chart before, so it was interesting...

Read more »

Example 8.3: pyramid plots

August 30, 2010

Pyramid plots are a common way to display the distribution of age groups in a human population. The percentages of people within a given age category are arranged in a barplot, often back to back. Such displays can be used to distinguish males vs. femal...

Read more »

Wanted: R Analysis of New Scientist Covers

August 30, 2010

Peter Aldhous and Jim Giles -- from New Scientist's San Francisco bureau -- are looking for a statistician and R user to take part in an interesting data analysis challenge, and also be part of a future article in the magazine. They were inspired by this rather tongue-in-cheek presentation where Sebastian Wernicke analyzed videos, transcripts and ratings of TED...

Read more »

US House Election Results Visualized Five Ways

August 30, 2010

The Democratic major-party vote share of US House elections 2002-2008 visualized 5 different ways.

Read more »

Graphing Highly Skewed Data

August 30, 2010

Graphing data with a few outliers is challenging, and some solutions are better than others. Here is a comparison of the alternatives.

Read more »

GEO database: curation lagging behind submission?

August 30, 2010

I was reading an old post that describes GEOmetadb, a downloadable database containing metadata from the GEO database. We had a brief discussion in the comments about the growth in GSE records (user-submitted) versus GDS records (curated datasets) over time. Below, some quick and dirty R code to examine the issue, using the Bioconductor GEOmetadb

Read more »

MCMC Diagnostics in R with the coda Package

August 29, 2010

This is a follow-up to my recent post introducing the use of JAGS in R through the rjags package. In the comments on that post, Bernd Weiss encouraged me to write a short addendum that describes diagnostic functions that you should use to assess the output from an MCMC sampler. I’ve only been using

Read more »

Beta translation done!

August 29, 2010

Once my team of four translators had handed back to me all the chapters of the French version of Introducing Monte Carlo Methods with R, I had to go over the book to ensure some minimal consistency between the chapters. I started the editing on the plane to Vancouver but did not get

Read more »

SST with Raster. Complete

August 29, 2010

Update: new zip, correcting a bug found by Steve McIntyre: if(!file.exists(HadSST2ncdf)) downloadHADSST2() becomes if(!file.exists(HadSST2ncdf)) downloadHadSST2(). Issue pending with another line as well. Checking raster versions. I’ve also added some code into “downloadHadSST2” that corrects for the “NA” problem with HadSST (currently commented out). There is an issue with “ncdf” handling CF standards, which has been addressed in

Read more »

Subset views in R

August 28, 2010

I don’t know how to do this in R. So let me just say why I can’t. I wanted something akin to Boost’s sub-matrix views, where you can have indexes map back to the original matrix, so you don’t create …

Read more »

Blegging for Data

August 28, 2010

I’m in the middle of a new project that involves analyzing the packages that are currently on CRAN. As part of my work, I could really benefit from information about which packages are installed on people’s computers. If you’re willing to part with a bit of your time and privacy, I’d very much appreciate you

Read more »

Patrick Burns is blogging

August 28, 2010

Patrick Burns is the author of several helpful R resources, including A Guide for the Unwilling S User, The R Inferno, and S Poetry. He also wrote one of my favorite critiques of Microsoft Excel: Spreadsheet Addiction. His writing is witty, entertain...

Read more »

Mike’s CNC 2010-08-27 18:36:00

August 27, 2010

Support the OpenGov idea to create a "Platform for number crunchers across (US Federal) government" HERE. A small team is building a small pilot and I'm happy to report that R appears on many of the posts. If you like the idea (or even if you don't),...

Read more »
