Statistics and Computing and ABC

February 23, 2011

Statistics and Computing has received several papers on ABC and plans to make a special ABC issue out of these. All submissions received before June 2011 that are accepted will be published in this special issue. The special issue is identified as an article type on the on-line page. In case of questions or

Read more »

sab-R-metrics: Basic Applied Regression (OLS)

February 23, 2011

Today, I'll again be using a new data set that can be found here at my website (called 'leagueoutcomes.csv'). The data set includes the standings results of the 2009 season for MLB along with average game attendance by team. I'll use this to go over some basic regression techniques and tools in R. Hopefully this...

Read more »

Course: Machine Learning with R

February 22, 2011

Starting on March 5 at the Hacker Dojo in Mountain View (CA), Mike Bowles and Patricia Hoffmann will present a course on Machine Learning where R will be the "lingua franca" for looking at homework problems, discussing them and comparing different solution approaches. The class will begin at the level of elementary probability and statistics and from that background...

Read more »

My R setup with Mac OS X

February 22, 2011

The ecosystem of R is largely Ubuntu and SVN, so Mac users sometimes find themselves a bit out of place, shall we say. But let's not let bad high-school memories about not being in the in-crowd keep us from participating in the R world. With just a little...

Read more »

Stochastic approximation in mixtures

February 22, 2011

On Friday, a 2008 paper on Stochastic Approximation and Newton’s Estimate of a Mixing Distribution by Ryan Martin and J.K. Ghosh was posted on arXiv. (I do not really see why it took so long to post a 2008 Statistical Science paper on arXiv, but given that it is not available on Project Euclid, it

Read more »

Calling BEDtools from R

February 22, 2011

The BEDtools suite provides command-line functionality for genomic coordinate-based operations, such as overlapping BED files or getting the coverage of a BED file over a genome (similar, though not exactly the same, functionality in R is provided by IRange...

Read more »

Get all your Questions Answered

February 22, 2011

When I have a question I usually ask the internet before bugging my neighbor. Yet it seems like Google's search results have become increasingly irrelevant over the last few years, and this is especially true for searching anything related to R (and pr...

Read more »

September 2011 Arctic Sea Ice Extent Forecast

February 22, 2011

In this post, I use a quadratic regression model to forecast the September 2011 Arctic Sea Ice Extent. The model was developed with 1980 – 2010 data. Links to the R script, source data and how-to article on polynomial regression … Continue reading →

Read more »
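For readers unfamiliar with the technique named in this excerpt, here is a minimal sketch of a quadratic regression forecast in R. It is not the post's script: the series below is synthetic (generated from a made-up curve plus noise), and the names `t`, `extent`, `fit` and `forecast` are chosen only for illustration.

```r
# Synthetic illustration only: these values are generated, not observed sea ice extents.
year <- 1980:2010
t <- year - 1980                       # shift years so the predictor starts at 0
set.seed(1)
extent <- 7.5 - 0.02 * t - 0.002 * t^2 + rnorm(length(t), sd = 0.3)

# Quadratic (second-order polynomial) regression fitted by ordinary least squares
fit <- lm(extent ~ t + I(t^2))

# One-step-ahead forecast for 2011 with a 95% prediction interval
new_point <- data.frame(t = 2011 - 1980)
forecast <- predict(fit, newdata = new_point, interval = "prediction")
print(forecast)
```

The prediction interval is requested with `interval = "prediction"` because a forecast for a single future year should reflect the noise in an individual observation as well as the uncertainty in the fitted curve.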

Example 8.26: reading data with variable number of words in a field

February 22, 2011

A student came with a question about how to snag data from a PDF report for analysis. Once she'd copied things, her text file looked like:
1 Las Vegas, NV --- 53.3 --- --- 1
2 Sacramento, CA --- 42.3 --- --- 2
3 Miami, FL --- 41.8 --- --- 3
4 Tucson, AZ --...

Read more »

Multithreading in R (or other types of non-sequential programming)

February 21, 2011

Considering that forwarding tick data from MT4 to R requires less than 2 ms, but that charting 4 different time frames (1 min, 15 min, 30 min and 1 hour) at each tick update may require more than 250 ms (depending on the number of bars in history), I think it is f...

Read more »

Graphing – margins, titles, mtext, workspace

February 21, 2011

This is a great post, and very true: not enough of R’s graphics are displayed well online to really see how to achieve what the often ambiguous ‘help’ information suggests. http://research.stowers-institute.org/efg/R/Graphics/Basics/mar-oma/index.htm I find mtext("lol", outer=T) particularly useful (it requires oma=c(2,2,2,2) or similar). http://addictedtor.free.fr/graphiques/ This site is somewhat of the way there, but I’ve found

Read more »
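The outer-margin tip in this excerpt can be sketched as follows. This is only an illustration: the panel contents and title text are placeholders, and `pdf(NULL)` is used here so the sketch runs without opening a window (drop it to draw on screen).

```r
pdf(NULL)                                  # null device; remove to plot interactively

# Reserve two lines of outer margin on every side and lay out two panels
old_par <- par(mfrow = c(1, 2), oma = c(2, 2, 2, 2))
plot(1:10, main = "Panel A")
plot(10:1, main = "Panel B")

# outer = TRUE writes into the outer margin, so the text spans both panels
mtext("Shared outer title", side = 3, outer = TRUE)

oma_used <- par("oma")                     # record the setting that was in effect
par(old_par)                               # restore the previous graphical parameters
invisible(dev.off())
```

With the default `oma` of all zeros there is no outer margin to write into, which is why the excerpt notes that a non-zero `oma` is needed for the outer text to be visible.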

Thor vs. Uncanny X-Men vs. Fantastic Four

February 21, 2011

Three of Marvel’s longest-running comic book series are Thor, Uncanny X-Men, and Fantastic Four. Using data from 2010, I compare monthly comic book sales for each series. This data only pertains to monthly issues and not trade paperbacks. Furthermore, the series Amazing Spider-Man was not considered because it was released twice a month.

Read more »

Use R to view and manipulate the File System

February 21, 2011

One of the best ways to learn how to code in R is to view sample scripts that people share. I recently came across this post where Michael uses R to scrape twitter and collect all sorts of great data … Continue reading →

Read more »

Dataset: Wisconsin Union Protester Tweets #wiunion

February 21, 2011

I’ve been playing with Twitter data over the last week, archiving Algerian, Egyptian, Iranian, and Chinese tweets. I thought I’d bring the story a little closer to home this time by archiving tweets from Wisconsin Union protesters on the … Continue reading →

Read more »

Interest Rates’ Influence on 1987

February 21, 2011

One aspect of 1987 that does not get enough attention is interest rates. Higher interest rates constrain economic activity and compete with other investments. As seen in the chart below, the US 10-year Treasury rate climbed 40% from 7% t...

Read more »

Using R for Introductory Statistics, Chapter 5, hypergeometric distribution

February 21, 2011

This is a little digression from Chapter 5 of Using R for Introductory Statistics that led me to the hypergeometric distribution. Question 5.13 A sample of 100 people is drawn from a population of 600,000. If it is known that 40% of the population has a specific attribute, what is the probability that 35 or...

Read more »
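As a rough sketch of how a question like 5.13 can be set up in R: the excerpt is cut off after "35 or", so the code below assumes the question asks for the probability that 35 or fewer people in the sample have the attribute. That reading, and the variable names, are assumptions made for illustration.

```r
population  <- 600000
with_attr   <- 0.40 * population    # 240,000 people have the attribute
sample_size <- 100
threshold   <- 35                   # assumed reading: "35 or fewer"

# Exact answer: sampling without replacement follows a hypergeometric distribution.
# phyper(q, m, n, k): m = "successes" in the population, n = "failures", k = draws
p_hyper <- phyper(threshold, with_attr, population - with_attr, sample_size)

# Binomial approximation: the sample is tiny relative to the population,
# so drawing with or without replacement makes very little difference.
p_binom <- pbinom(threshold, sample_size, 0.40)

print(c(hypergeometric = p_hyper, binomial = p_binom))
```

The two probabilities should agree closely, which is the usual justification for using the binomial when the population is much larger than the sample.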

Who did HBGary contact the most?

February 21, 2011

Following on from Friday's post about the travails of internet security firm HBGary, R user Michael Bommarito has done an analysis of the leaked emails to find the top 20 most contacted email addresses and the top 20 most referenced internet domains. There are some interesting names on those lists, to be sure. Check them out at the link...

Read more »

New R User Groups in Canada, India

February 21, 2011

Three new local R user groups have just been added to the directory: In Québec, the group Plein-R is affiliated with the department of Forestry, Geography and Geomatics at Laval University. Although the group's website is in French, group organizer Etienne Racine says, "Our group is bilingual. Our meetings are in a mix of French and English: we call...

Read more »

Access all UCSC wiggle tracks from R and your terminal

February 21, 2011

The rtracklayer package allows you to access most of the UCSC wiggle tracks from R. However, there is another way which might be more practical in situations where you need to summarize the wig track scores over a given set of genomic coordinates. Although yo...

Read more »

Choropleth tutorial and regression coefficient plots

February 21, 2011

About two weeks ago, I gave a short talk at Duke, wherein I presented a brief tutorial on creating choropleth maps in R using ggplot2. Since the code is already written, and the data and shapefiles already hosted online, I thought I would share the tutorial more widely. A .ZIP file containing all the files necessary … Continue reading →

Read more »

R Tutorial Series: Two-Way Repeated Measures ANOVA

February 21, 2011

Repeated measures data require a different analysis procedure than our typical two-way ANOVA and consequently follow a different R process. This tutorial will demonstrate how to conduct two-way repeated measures ANOVA in R using the Anova() function fr...

Read more »

Presentation on Building R Packages

February 21, 2011

Last week I gave a presentation to the Melbourne R User Group on Building R Packages. The talk covered a simple package example, and an example of interfacing R with native code. The slides are here: RPackages.pdf. The R community in Melbourne (and Aus...

Read more »

Tracking the Frequency of Twitter Hashtags with R

February 21, 2011

I’ve posted three examples of Twitter hashtag datasets in the last week: one on China, one on Iran, and one on Algeria. In order to build these datasets, I needed to obtain older tweets; this is slightly more difficult than … Continue reading →

Read more »