about boxplot

June 15, 2012

From Wikipedia: "... the bottom and top of the box are always the 25th and 75th percentile (the lower and upper quartiles, respectively), and the band near the middle of the box is always the 50th percentile (the median). But th...

Project Euler — problem 10

June 15, 2012

Just finished my last assignment for this week. IT’S THE WEEKEND, officially. Let me take a break and have a look at the tenth problem, another prime problem. There is no doubt that primes are at the center of number theory and fundamental …

Using R in/for Governments

June 15, 2012

The British government (through the Office for National Statistics, ONS) recently published its own R manual for analysing government survey data. Links to the PDF and MS Word versions of the manual, including the R syntax, are below. Note: the R syntax link is not working at the moment. I am contacting the ONS, hope

How long does it take to get pregnant?

June 15, 2012

My girlfriend’s biological clock is ticking, and so we’ve started trying to spawn. Since I’m impatient, that has naturally led to questions like “how long will it take?”. If I were to believe everything on TV, the answer would be easy: have unprotected sex once and pregnancy is guaranteed. A more cynical me suggests that

Rounding in R

June 15, 2012

Forgive me if you are already aware of this, but I found it quite alarming. I know that computers work in binary while we enter numbers in decimal, so problems can arise in conversion and with floating point. But the example I have below is so simple that it really surprised me. I was converting...

More on birthday probabilities

June 15, 2012

Last week, Joe Rickert used R and four years of US Census data to create an image plot of the relative probabilities of being born on a given day of the year. Chris Mulligan also tackled this problem with R, but this time using 20 years of Census data from 1969 to 1988. Chris extracted the birthday frequencies using...

Standard, Robust, and Clustered Standard Errors Computed in R

June 15, 2012

Where do these come from? Since most statistical packages calculate these estimates automatically, it is not unreasonable to think that many researchers using applied econometrics are unfamiliar with the exact details of their computation. For the purposes of illustration, I am going to estimate different standard errors from a basic linear regression model: , using the

Update to Data on Github Post: Solution to an RCurl problem

June 14, 2012

A reader of my most recent post tried the R code I had written to download the data set of electoral disproportionality from the GitHub repository. However, it didn’t work for them. After entering disproportionality.data <- getURL(url) they go...

useR 2012: my materials

June 14, 2012

Just a quick note that I’ve posted the slides, code, and dataset from my useR 2012 talk. I’m having a great time here in Nashville and will write up a conference review soon, with links to the many excellent packages …

Pretty Correlation Map of PIMCO Funds

June 14, 2012

As PIMCO expands beyond fixed income, I thought it might be helpful to look at the correlation of PIMCO mutual funds to the S&P 500. Unfortunately, due to the large number of funds, I cannot use chart.Correlation from PerformanceAnalytics...

Mindoro Digital Elevation Map

June 14, 2012

Saw a map produced by a former student of mine using commercial GIS software. Using the R raster package and data from diva-gis.org, I produced the same map

#4 R: A powerful tool for the geochemist

June 14, 2012

R is an incredibly powerful programming tool used by a large community of people who require an easy-to-use, lightweight, FREE, and fundamentally awesome statistics package! I have recently discovered that R can be used by geologists, whose IT skills are often pitied by their more computer-literate geophysicist colleagues. Long story short

Revolution Newsletter: June 2012

June 14, 2012

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full June edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. Revolution R Enterprise 6 Now Available! The latest release of Revolution R Enterprise brings...

Some interesting ggplot2 tutorials for the social sciences @ Code à la Mode

June 14, 2012

Recent R packages for ecology and evolution

June 14, 2012

Many R packages/tools have come out recently for doing ecology and evolution. All of the below were described in Methods in Ecology and Evolution, except for spider, which came out in Molecular Ecology Resources. Here are some highlights. mvabund pap...

Performance with foreach, doSNOW, and snowfall

June 13, 2012

Is it just me, or does the performance of the foreach package with a doSNOW backend operating on a socket grid suck? Here at work, I am helping to set up a cluster of Windows machines for distributed R processing. We have lots of researchers runnin...

Making rApache load rJava

June 13, 2012

Here at work I've been in the business of developing webapps using R as the backend computational framework. The list of parts to get this running is pretty lightweight, just: R, Apache 2, and rApache. I'm not going to cover how to set these things up here...

Rook Tutorial at useR! 2012

June 13, 2012

I had such a blast presenting my tutorial on Rook yesterday. Thanks go out to all who attended! All the slides are online here and I’ll be updating my RookTutorial github project with all the great suggestions I got from the attendees. Also, check back soon as I’m planning more postings on Rook. Cheers!

Body Weight in the United States – Part 2, "Non Factors"

June 13, 2012

Sometimes the story isn't what is a trend, but rather what is not a trend. In this second installment about body weight in the U.S., listing what doesn't seem to be contributing factors will help narrow down what might actually be the p...

Estimation of hydraulic conductivity and its uncertainty from grain-size data using GLUE and artificial neural networks.

June 13, 2012

Abstract: Various approaches exist to relate saturated hydraulic conductivity (Ks) to grain-size data. Most methods use a single grain-size parameter and hence omit the information encompassed by the entire grain-size distribution. This study compares two data-driven modelling methods (multiple linear regression and artificial neural networks) that use the entire grain-size distribution data as input for Ks prediction. Besides the predictive capacity of the methods,...

Comparing continuous distributions with R

June 13, 2012

In R we’ll generate similar continuous distributions for two groups and give a brief overview of statistical tests and visualizations to compare the groups. Though the fake data are normally distributed, we use methods for various kinds of continuous distributions. …

In case you missed it: May 2012 Roundup

June 13, 2012

In case you missed them, here are some articles from May of particular interest to R users. R tops the annual KDNuggets Data Mining Software poll for the first time. R 2.15.1 is scheduled for June 22. (Revolution R Enterprise 6, released on June 5, is based on 2.14.2.) A tutorial uses R, Hadoop, and the RHadoop project to...

How to order bars in bar graph (Stack Overflow FAQ)

June 13, 2012

Why R is Hard to Learn

June 13, 2012

The open source R software for analytics has a reputation for being hard to learn. It certainly can be, especially for people who are already familiar with similar packages such as SAS, SPSS or Stata. Training and documentation that leverages …

Data on GitHub: The easy way to make your data available

June 13, 2012

GitHub is designed for collaborating on coding projects. Nonetheless, it is also a potentially great resource for researchers to make their data publicly available. Specifically, you can use it to: store data in the cloud for future use (for free), track ...

Twitter unfollowers with R and Rook

In my last blog post, "I'm following you in Twitter...are you following me back?", I showed how to use the Twitter APIs to get a list of the people you follow who don't follow you back. This time, I want to extend the tool as I installed the R...

Next R meeting in Paris INSEE: ggplot2 and parallel computing

June 12, 2012

Hi, our group of R users from INSEE, aka FLR, meets monthly in Paris. The next meeting is on Wed 13 (tomorrow), 1-2 pm, room 539 (an ID is needed to come in, map to access INSEE R), about ggplot2 and parallel computing. Since the first meeting in February, presentations have included hot topics like web scraping, C in R, RStudio, SQLite

Finding word use patterns in Wikileaks cables

June 12, 2012

6/18: A follow-up to this post is now available here. Recent Discoveries. When I was a diplomat, I was always interested in the Wikileaks cables and what could be done with them. Unfortunately, I never got a chance to look at the site in depth, due to security policies. Now that the ex- is firmly prepended to diplomat...
