Ubuntu Developer Summit in Barcelona

Due to some things falling into place, I had an opportunity to attend the first two days of last week's Ubuntu Developer Summit in beautiful Barcelona. Somehow, I had never managed to attend a Debian conference either, so it was good to meet a few of the old Debian hands now moving Ubuntu along, as well as a few of the Ubuntu...

Nice Interview

May 31, 2009

Here you can read a nice interview with David Smith, REvolution Computing’s Director of Community, statistician and bloggeR.

R used by KDD 2009 cup winner of slow challenge

May 31, 2009

The results from the KDD Cup 2009 challenge (which we wrote about before) are in, and the winner of the slow challenge used the R statistical computing and analysis platform for their winning submission.

Emacs: AucTeX + Rubber + Sweave

May 30, 2009

I got rubber to work with AucTeX and Sweave (Rnw) files with the help of this. Basically, combined with my other stuff, I tweaked my .emacs file to look like:

    ;; following is AucTeX with Sweave -- works
    ;; http://andreas.kiermeier.googlepages.com/essmaterials
    (setq TeX-file-extensions
          '("Snw" "Rnw" "nw" "tex" "sty" "cls" "ltx" "texi" "texinfo"))
    (add-to-list 'auto-mode-alist '("\\.Rnw\\'" . Rnw-mode))
    (add-to-list ...

JPM Chase Corporate Challenge 2009

The 28th annual JP Morgan Chase Corporate Challenge race took place a couple of days ago, on May 21. At around 17,125 runners, participation was down from the record of 23,000 set last year. With splendid weather, it is always a nice way to start the Memorial Day weekend. We fielded a small but spirited team of nine runners. I finished with...

The R Journal, Issue 1 Volume 1

May 29, 2009

The R Journal just published its inaugural peer-reviewed issue. Aligned with the open-source mantra, the journal is free and openly accessible. It features short articles on topics focused on R, including notes about new add-on packages, hints for R newcomers, application reports detailing examples of data analysis with R, and other news items. The current...

R tips: Use read.table instead of strsplit to split a text column into multiple columns

May 29, 2009

Someone on the R-help mailing list had a data frame with a column containing IP addresses in dotted-quad format (e.g. 1.10.100.200). He wanted to sort by this column, and I proposed a solution involving strsplit. But Peter Dalgaard came up with a much nicer method using read.table on a textConnection object:
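The snippet from the original post is not part of this excerpt. As a rough sketch of the idea only, assuming a made-up data frame `hosts` with an `ip` column (both names are hypothetical), the column can be read as if it were a "."-separated file and the rows ordered by the four integer fields:

```r
# Hypothetical example data: a data frame with a dotted-quad IP column.
hosts <- data.frame(
  ip = c("10.2.3.4", "1.10.100.200", "1.9.0.1", "192.168.0.1"),
  stringsAsFactors = FALSE
)

# Treat the column as a "."-separated text file with four integer fields.
con <- textConnection(hosts$ip)
octets <- read.table(con, sep = ".", colClasses = "integer",
                     col.names = c("o1", "o2", "o3", "o4"))
close(con)  # textConnection() returns an open connection, so close it ourselves

# Order numerically by each octet in turn, then reorder the original rows.
sorted <- hosts[do.call(order, unname(as.list(octets))), , drop = FALSE]
print(sorted$ip)
```

A plain character sort would put 1.10.100.200 ahead of 1.9.0.1; sorting on the parsed integer columns avoids that.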

R Journal 1/1

May 29, 2009

R Journal 1/1 is out! Download it from here.

Accessing Soil Survey Data via Web-Services

May 28, 2009

Online Querying of NRCS Soil Survey Data: Sometimes you are only interested in soils data for a single map unit, component, or horizon. In these cases, downloading the entire survey from Soil Data Mart is not worth the effort. An...

Making Sense of Large Piles of Soils Information: Soil Taxonomy

May 27, 2009

Western Fresno Soil Hierarchy: partial view of the hierarchy within the US Soil Taxonomic system.

Soil...

Embedding fonts in figures produced by R

May 27, 2009

Some publishers insist that we embed (include) the fonts in each figure. Here is a set of links regarding this issue for figures produced by R:

http://tolstoy.newcastle.edu.au/R/help/05/01/10779.html
https://stat.ethz.ch/pipermail/r-help/2006-October/114...

R and data

May 26, 2009

My fellow bloggers John and Scott have posted recently about the free statistical programming language R. How does it compare to an expensive language like SAS? If you’ve done any statistical analysis, then you’ll know that getting and cleaning the data is a major step in any project. SAS does a pretty good job at

Free one-day R course at Vanderbilt

May 26, 2009

The Vanderbilt Kennedy Center is offering a free (repeat, free) one-day introductory course to the R statistical computing language on June 23, taught by Theresa Scott from the Department of Biostatistics. You can find contact/registration info at the link below.

Vanderbilt Kennedy Center - An Introduction to the Fundamentals & Functionality of the R Language

In case you missed it,...

More Recursion in R

May 26, 2009

I found another gem in R today. Earlier I commented about how R could do recursion, something that I love. I write some pretty complicated recursive functions in my research, but I also have a bad habit of compulsively reorganizing things. Now I've c...

Simple Approach to Converting GRASS DB-backends

May 23, 2009

Premise: The current default database back-end used by the GRASS vector model is DBF (as of GRASS 6.5); however, this is probably going to be changed (to SQLite) in GRASS 7. The DBF back-end works OK, but it tends to be very sensitive (i.e. breaks) when reserved words occur in column names or...

Temporary Debian mail outage

May 23, 2009

It would appear that debian.org rejected mail for maybe up to twelve hours from late yesterday afternoon (Central timezone) to some time shortly after I got up this morning. Things appear to be back to normal, so a big Thanks to the mail admins. If you happened to have sent me mail to my debian.org address during that time period, you may have...

Data.gov

May 21, 2009

I am always on the lookout for useful data sources for training in statistics, so I am excited that Data.gov has opened for business. The purpose of Data.gov is to increase public access to high value, machine readable datasets generated by the US Government.

Bootstrapping and the boot package in R

May 21, 2009

I was recently asked about options for bootstrapping. The following post sets out some applications of bootstrapping and strategies for implementing it in R. I've found bootstrapping useful in several settings: where the statistic I'm interested in is a ...
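The post's own examples are not included in this excerpt. As a minimal sketch of what using the boot package typically looks like, assuming made-up data and the sample median as the statistic of interest (both chosen here only for illustration):

```r
library(boot)  # recommended package distributed with R

set.seed(42)                 # arbitrary seed, only for reproducibility
x <- rexp(50, rate = 1)      # made-up sample data

# boot() calls the statistic with the data and a vector of resampled indices.
median_stat <- function(data, indices) median(data[indices])

b <- boot(data = x, statistic = median_stat, R = 999)
print(b)                     # original estimate, bias, and bootstrap standard error

ci <- boot.ci(b, type = "perc")   # percentile confidence interval
print(ci)
```

The statistic function has to accept the index vector as its second argument; that is how boot() hands over each resample.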

Baby steps with RSRuby in Rails

May 20, 2009

Plotting and charting libraries for Ruby (on Rails) abound. However, few are sophisticated enough for scientists and many are not actively maintained. Plotting in R, on the other hand, is about as sophisticated as it comes. Can we bridge Ruby and R? Yes we can, thanks to Alex Gutteridge’s RSRuby. The next

Create multiple graphics in R without multiple calls to pdf / postscript / jpeg / png

May 14, 2009

To save multiple graphics, e.g., Rplot001.pdf, Rplot002.pdf, …, Rplot050.pdf, we don’t have to call pdf() (or any similar function) 50 times. Use “Rplot%03d.pdf” as the file name in pdf(), with onefile = FALSE, and each new plot() page will be saved to a new PDF file. Call dev.off() once at the end to close the device. Check out ?sprintf for more information on the %03d format.
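A small sketch of the tip, assuming a temporary output directory and three throwaway plots (both are placeholders). For pdf() the example sets onefile = FALSE explicitly, since the default is to collect all pages in a single file:

```r
out_dir <- tempfile("plots_")   # hypothetical output directory
dir.create(out_dir)

# onefile = FALSE asks pdf() to start a new file for every page;
# %03d in the file name is replaced by the page number (001, 002, ...).
pdf(file.path(out_dir, "Rplot%03d.pdf"), onefile = FALSE)
for (i in 1:3) {
  plot(rnorm(20), main = paste("Plot", i))  # each high-level plot starts a new page
}
dev.off()                       # a single call closes the device

print(list.files(out_dir))
```

The same file name pattern works with the bitmap devices such as png() and jpeg(), which write one file per page anyway.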

Nonparametric High-Dimensional Time Series Analysis

Functional Gradient Descent (FGD) is a method of nonparametric time series analysis, useful in particular for estimating conditional means, variances, and covariances for very high-dimensional time series. FGD is a kind of hybrid of nonparametric statis...

Negative Scalability Coefficients in Excel

May 12, 2009

Recently, several performance engineers who have been applying my universal scalability law (USL) to their throughput measurements reported a problem whereby their Excel spreadsheet calculations produced a negative value for the coherency parameter (...
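For context, the USL is commonly written as C(N) = N / (1 + σ(N − 1) + κN(N − 1)), with σ the contention parameter and κ the coherency parameter. The sketch below is not from the post; the function names and the numeric values are illustrative assumptions. It shows why a negative coherency value is suspect: the capacity peak sits at sqrt((1 − σ)/κ), which only exists for a positive κ.

```r
# Universal scalability law: relative capacity at load N.
# sigma = contention parameter, kappa = coherency parameter (names assumed here).
usl <- function(N, sigma, kappa) {
  N / (1 + sigma * (N - 1) + kappa * N * (N - 1))
}

# Load at which capacity peaks; only defined for kappa > 0.
usl_peak <- function(sigma, kappa) {
  if (kappa <= 0) return(NA_real_)  # no finite maximum with a non-positive kappa
  sqrt((1 - sigma) / kappa)
}

N <- 1:64
cap_ok  <- usl(N, sigma = 0.05, kappa = 0.001)    # illustrative values
cap_neg <- usl(N, sigma = 0.05, kappa = -0.0001)  # a negative coherency term

print(usl_peak(0.05, 0.001))      # continuous-N location of the peak
print(which.max(cap_ok))          # integer load with the highest capacity
print(usl_peak(0.05, -0.0001))    # NA: no peak exists
print(all(diff(cap_neg) > 0))     # capacity never turns over in this range
```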

Packages featured with Inference for R

May 12, 2009

quantmod, TTR, and xts were (not so) recently featured on the Inference for R Blog. Inference for R is an Integrated Development Environment (IDE) designed specifically for R. The post gives an example of how to easily perform advanced financial stock a...

Analytic Infrastructure – Three Trends

May 11, 2009

This is a post about systems, applications, services and architectures for building and deploying analytics. Sometimes this is called analytic infrastructure. In this post, we look at several trends impacting analytic infrastructure. Trend 1. Open source analytics has reached Main Street. R, which was first released in 1996, is now