Create annotated GWAS manhattan plots using ggplot2 in R

March 18, 2010
By

A few months ago I showed you in this post how to use some code I wrote to produce manhattan plots in R using ggplot2. The qqman() function I described in the previous post actually calls another function, manhattan(), which has a few options you can s...

Read more »

Webinar: High-Performance Analytics with R and Microsoft HPC Server

March 18, 2010
By

On April 14 I'll be giving a new webinar in partnership with Microsoft on High-Performance Computing with R. I'll be focusing on the new parallel programming capabilities of REvolution R Enterprise 3.1 for Windows, and how to use the features of Microsoft HPC Server to enable computing on clusters. Here's the complete agenda, and you can register at the...

Read more »

Course in San Antonio, Texas

March 18, 2010
By

Yesterday, I gave my short (three-hour) introduction to computational Bayesian statistics to a group of 25-30 highly motivated students. I managed to cover “only” the first three chapters, as I included some material on Bayes factor approximation and only barely reached Metropolis-Hastings. Here are the slides, modified from the original Bayesian Core slides: (It

Read more »

O’Reilly at OSBC: The future’s in the data

March 17, 2010
By

Tim O'Reilly's keynote talk at OSBC this evening was thought-provoking to say the least. The title of the talk was "The Real Open Source Opportunity", and the surprise for me was that he wasn't talking about Open Source software. Tim's insight, and it's a profound one, is that the next frontier for freedom and openness -- and indeed, the...

Read more »

Tools

March 17, 2010
By

All the tools I am using at the moment are free of charge. The one that comes to mind first is R. It’s a language for statistical computing which comes with a decent GUI. R comes with some time series support out of the box, but there are plenty of packages (R extensions are called

Read more »

Vanilla Rao-Blackwellisation for revision

March 17, 2010
By

The vanilla Rao-Blackwellisation paper with Randal Douc that had been resubmitted to the Annals of Statistics is now back for a revision, with quite encouraging comments: The paper has been reviewed by two referees both of whom comment on the clear exposition and the novelty of the results. Both referees point to the empirical results

Read more »

OSBC blogging

March 17, 2010
By

I'm at the Open Source Business Conference in San Francisco today and tomorrow; I'll report in with updates after the talks. I'm particularly looking forward to the panel discussion on The Shifting Open Source Opportunity moderated by Ashlee Vance, the New York Times reporter who wrote the major story on R last year. (Interesting aside: I learned recently that...

Read more »

Measuring the length of time to run a function

March 17, 2010
By

This post describes how to measure the run time of an R function.

Read more »

Omegahat Statistical Computing » R 2010-03-16 19:28:40

March 16, 2010
By

Hin-Tak Leung mailed me about a problem with certain malformed XML documents from FlowJo. There are namespace prefixes (prfx:nodeName) with no corresponding namespace declarations (xmlns:prfx="uri"). How do we fix these? Well, the XML parser can read this but raises errors. We can do nice things to catch these errors and then post-process them. Then we

Read more »

Measuring the length of time to run a function

March 16, 2010
By

When writing R code it is useful to be able to assess the amount of time that a particular function takes to run. We might be interested in measuring the increase in time required by our function as the size of the data increases. To illustrate using the system.time function to calculate the time taken to

Read more »
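As a rough illustration of the idea in the teaser above, here is a minimal sketch that uses `system.time` to see how run time grows with input size. The function `slow_sum` and the chosen sizes are hypothetical stand-ins, not taken from the original post.

```r
# Hypothetical function to time: sums the square roots of 1..n in a loop
slow_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + sqrt(i)
  total
}

# Assumed input sizes, chosen only for illustration
sizes <- c(1e4, 1e5, 1e6)

# system.time() returns user, system and elapsed times; keep the elapsed (wall-clock) seconds
elapsed <- sapply(sizes, function(n) system.time(slow_sum(n))[["elapsed"]])

timings <- data.frame(n = sizes, elapsed = elapsed)
print(timings)
```

The elapsed values vary from machine to machine and from run to run, so treat them as relative rather than absolute measurements.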

Interrupting R processes in Ubuntu

March 16, 2010
By

It's funny how things happen. Yesterday I was working away on a project in R and the unenjoyable happened: the process hung for longer than desired. I operate R in the standard GNOME terminal in Ubuntu and the only way I knew was to close the entire a...

Read more »

Validating credit card numbers in SAS

March 16, 2010
By

Major credit card issuing networks (including Visa, MasterCard, Discover, and American Express) allow simple credit card number validation using the Luhn Algorithm (also called the “modulus 10” or “mod 10” algorithm). The following code demonstrates an implementation in SAS. The code also validates the credit card number by length and by checking against a short

Read more »
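The post's SAS code is not reproduced in this listing. To illustrate the Luhn check on its own, here is a minimal sketch in R. The function name `luhn_valid` is hypothetical, and the length and prefix checks mentioned in the teaser are deliberately left out.

```r
# Luhn ("mod 10") checksum only; this sketch does not check card length or issuer prefix
luhn_valid <- function(number) {
  # Keep digits only, so spaces or dashes in the input are ignored
  digits <- as.integer(strsplit(gsub("[^0-9]", "", number), "")[[1]])
  if (length(digits) == 0) return(FALSE)

  # Work from the rightmost digit
  digits <- rev(digits)

  # Double every second digit from the right; subtract 9 when the result exceeds 9
  idx <- seq_along(digits) %% 2 == 0
  doubled <- digits[idx] * 2
  digits[idx] <- ifelse(doubled > 9, doubled - 9, doubled)

  # The number passes when the total is a multiple of 10
  sum(digits) %% 10 == 0
}

luhn_valid("4111 1111 1111 1111")  # a widely used test number
luhn_valid("4111 1111 1111 1112")  # same number with the last digit changed
```

Passing the number as a character string avoids the loss of precision that can affect long numeric literals.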

In search of a random gamma variate…

March 16, 2010
By

One of the most common exercises given in Statistical Computing, Simulation, or related classes is the generation of random numbers from a gamma distribution. At first this might seem straightforward, thanks to the lifesaving relation that exponential and gamma random variables share. So, it’s easy to get a gamma random variate using the fact that

Read more »
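The teaser is cut off before it states the relation. A standard fact that fits the description is that the sum of k independent exponential variates with a common rate follows a gamma distribution with integer shape k and that rate. The sketch below assumes this is the relation meant; the function name `rgamma_int` is hypothetical.

```r
set.seed(42)  # fixed seed so the sketch is reproducible

# Gamma variates with integer shape, built as row sums of exponential draws
rgamma_int <- function(n, shape, rate = 1) {
  stopifnot(shape >= 1, shape == round(shape))
  # n rows, one column per exponential term in the sum
  draws <- matrix(rexp(n * shape, rate = rate), nrow = n)
  rowSums(draws)
}

x <- rgamma_int(10000, shape = 3, rate = 2)

# Compare the sample mean with the theoretical mean, shape / rate
c(sample_mean = mean(x), theoretical_mean = 3 / 2)
```

This construction only covers integer shapes. A non-integer shape needs a different method, such as rejection sampling.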

Nutritional supplements, ranked

March 16, 2010
By

One of my favourite shows on TV right now is The Big Bang Theory. For those who haven't seen it: it's like Friends, except instead of New York yuppies, it's PhD physicists and engineers at CalTech. It's nice to see geeks and smart people be the focus (rather than the comic relief) of a sitcom. Also, the equations on...

Read more »

DICOM-to-NIfTI Conversion

March 16, 2010
By

Now that the two packages oro.dicom and oro.nifti have been released, we can put them together and perform the much sought after conversion from DICOM format to NIfTI format (entirely in R).  Why?  Because DICOM is the international "standard" for medical imaging data coming off the scanners, but it's not the easiest thing to manipulate on...

Read more »

Rcpp 0.7.10

March 15, 2010
By

Versions 0.7.7 to 0.7.9 of Rcpp contained a bug: protecting paths with quotes was supposed to help with Windows builds, but did the opposite at least in 'backticks mode' for getting path and/or library information. Using the shQuote() function instead ...

Read more »

Solving the rectangle puzzle

March 15, 2010
By

Given the wrong solution provided in Le Monde and comments from readers, I went to look a bit further on the Web for generic solutions to the rectangle problem. The most satisfactory version I have found so far is Mendelsohn’s in Mathematics Magazine, which gives as the maximal number for a grid. His theorem is

Read more »

Robert Brown and Pollen Particles

March 15, 2010
By

In 1827, the botanist Robert Brown was studying pollen particles as they floated in water. When viewed through a microscope, he observed that the particles seemed to move around as if they were alive. Although he couldn’t have known at the time, the seemingly random motion was caused by the collision of water molecules

Read more »

Visualizing droughts with R

March 15, 2010
By

Physicist and weather scientist Joe Wheatley used R to design and create a useful visual representation of how drought affects a region over long time-scales. Instead of charting absolute rainfall (or lack thereof), he charts the Standardized Precipitation Index (SPI), where extreme values (above 2 or below -2) indicate extreme wetness or dryness compared to the usual precipitation...

Read more »

Weighting model fit with ctree in party

March 15, 2010
By

Conditional inference trees (ctree) in package party allow weighting, which is useful when one classification outcome is more important than another. Useful examples are not difficult to imagine: in a marketing direct mailing, a false positive (non-res...

Read more »

The Price of Calculation

March 15, 2010
By

In a world in which the price of calculation continues to decrease rapidly, but the price of theorem proving continues to hold steady or increase, elementary economics indicates that we ought to spend a larger and larger fraction of our time on calculation. Over the next ten years, I hope that more and more mathematically

Read more »

Example 7.27: probability question reconsidered

March 15, 2010
By

In Example 7.26, we considered a problem from the xkcd blog: Suppose I choose two (different) real numbers, by any process I choose. Then I select one at random (p= .5) to show Nick. Nick must guess whether the other is smaller or larger. Being righ...

Read more »

R Tutorial Series: R Beginner’s Guide and R Bloggers Updates

March 15, 2010
By

1/1/2011 Update: Tal Galili wrote an article that revisits the first year of R-Bloggers and this post was listed as one of the top 14. Therefore, I decided to make a small update to each section. I start by describing the initial series of tutorials th...

Read more »

t-walk on the banana side

March 14, 2010
By

Following my remarks on the t-walk algorithm, introduced in the recent paper “A General Purpose Sampling Algorithm for Continuous Distributions” by Christen and Fox in Bayesian Analysis and designed to act as a general-purpose MCMC algorithm, Darren Wraith tested it on the generic (10-dimensional) banana target we used in the cosmology paper. Here is an output

Read more »

π day!

March 14, 2010
By

It’s π day, so we are going to have a little fun with Buffon’s needle and, of course, R. A well-known way to approximate the value of $\pi$ is the experiment that Buffon performed using a needle of length $l$. All I do next is copy the function from the following file

Read more »
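The function the post refers to is not included in this listing. As a stand-in, here is a minimal sketch of a Buffon's needle simulation in R. It assumes parallel lines a distance `d` apart and a needle of length `l` with `l <= d`, so that a needle crosses a line with probability 2l/(πd). The name `buffon_pi` and its arguments are hypothetical.

```r
set.seed(314)  # fixed seed so the sketch is reproducible

# Estimate pi by dropping n needles of length l on lines spaced d apart (requires l <= d)
buffon_pi <- function(n, l = 1, d = 1) {
  stopifnot(l <= d)
  y <- runif(n, 0, d / 2)        # distance from the needle's centre to the nearest line
  theta <- runif(n, 0, pi / 2)   # acute angle between needle and lines (uses pi itself, for simplicity)
  hits <- sum(y <= (l / 2) * sin(theta))
  # P(hit) = 2 l / (pi d), so pi is approximately 2 l n / (d * hits)
  2 * l * n / (d * hits)
}

est <- buffon_pi(1e6)
est
```

Sampling the angle with R's built-in `pi` is circular as a way of computing π, but it keeps the sketch short and still shows how the hit frequency recovers the constant.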