Calling BEDtools from R

February 22, 2011

The BEDtools suite provides command-line tools for genomic-coordinate-based operations, such as overlapping BED files or computing the coverage of a BED file over a genome (similar, though not identical, functionality in R is provided by IRange...

Read more »

Get all your Questions Answered

February 22, 2011

When I have a question I usually ask the internet before bugging my neighbor. Yet it seems like Google's search results have become increasingly irrelevant over the last few years, and this is especially true for searching anything related to R (and pr...

Read more »

September 2011 Arctic Sea Ice Extent Forecast

February 22, 2011

In this post, I use a quadratic regression model to forecast the September 2011 Arctic Sea Ice Extent. The model was developed with 1980–2010 data. Links to the R script, source data and how-to article on polynomial regression … Continue reading →

Read more »

Example 8.26: reading data with variable number of words in a field

February 22, 2011

A student came with a question about how to snag data from a PDF report for analysis. Once she'd copied things, her text file looked like:
1 Las Vegas, NV --- 53.3 --- --- 1
2 Sacramento, CA --- 42.3 --- --- 2
3 Miami, FL --- 41.8 --- --- 3
4 Tucson, AZ --...

Read more »

Multithreading in R (or other types of non-sequential programming)

February 21, 2011

Considering that forwarding tick data from MT4 to R takes less than 2 ms, but that charting 4 different time frames (1 min, 15 min, 30 min and 1 hour) at each tick update may take more than 250 ms (depending on the number of bars in history), I think it is f...

Read more »

Graphing – margins, titles, mtext, workspace

February 21, 2011

This is a great post, and very true: not enough of R’s graphics are displayed well online to really see how to achieve what the often ambiguous ‘help’ information suggests. http://research.stowers-institute.org/efg/R/Graphics/Basics/mar-oma/index.htm I find mtext("lol", outer=T) particularly useful (it requires oma=c(2,2,2,2) or similar). http://addictedtor.free.fr/graphiques/ This site is some of the way there, but I’ve found

Read more »
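The mtext()/oma combination mentioned in this excerpt can be sketched roughly as follows. This is only an illustration, not code from the linked post: the panel contents, the label text and the temporary PDF device are assumptions chosen so the sketch runs without a display.

```r
# Draw to a temporary PDF so the sketch also runs on a machine without a screen
out_file <- tempfile(fileext = ".pdf")
pdf(out_file, width = 8, height = 4)

# Two panels side by side; oma reserves 2 lines of outer margin on every side
old_par <- par(mfrow = c(1, 2), oma = c(2, 2, 2, 2))

plot(1:10, main = "Panel A")   # placeholder data
plot(10:1, main = "Panel B")   # placeholder data

# outer = TRUE writes into the outer margin, so the label spans both panels
mtext("Shared title for both panels", side = 3, outer = TRUE)

par(old_par)
dev.off()
```

Without the oma setting the outer margin has zero width, so text placed with outer = TRUE would have no room to appear.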

Thor vs. Uncanny X-Men vs. Fantastic Four

February 21, 2011

Three of Marvel’s longest-running comic book series are Thor, Uncanny X-Men, and Fantastic Four. Using data from 2010, I compare monthly comic book sales for each series. This data pertains only to monthly issues and not trade paperbacks. Furthermore, the series Amazing Spider-Man was not considered because it was released twice a month.

Read more »

Use R to view and manipulate the File System

February 21, 2011

One of the best ways to learn how to code in R is to view sample scripts that people share. I recently came across this post where Michael uses R to scrape Twitter and collect all sorts of great data … Continue reading →

Read more »

Dataset: Wisconsin Union Protester Tweets #wiunion

February 21, 2011

I’ve been playing with Twitter data over the last week, archiving Algerian, Egyptian, Iranian, and Chinese tweets. I thought I’d bring the story a little closer to home this time by archiving tweets from Wisconsin Union protesters on the … Continue reading →

Read more »

Interest Rates’ Influence on 1987

February 21, 2011

One aspect of 1987 that does not receive enough attention is interest rates. Higher interest rates constrain economic activity and compete with other investments. As seen in the chart below, the US 10-year Treasury rate climbed 40% from 7% t...

Read more »

Using R for Introductory Statistics, Chapter 5, hypergeometric distribution

February 21, 2011

This is a little digression from Chapter 5 of Using R for Introductory Statistics that led me to the hypergeometric distribution. Question 5.13 A sample of 100 people is drawn from a population of 600,000. If it is known that 40% of the population has a specific attribute, what is the probability that 35 or...

Read more »
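The setup of Question 5.13 can be sketched with base R's phyper(). The excerpt is cut off after "35 or", so the direction of the inequality is an assumption here; both tails are computed, and the variable names are illustrative only.

```r
# Quantities taken from Question 5.13 as quoted in the excerpt
pop_size  <- 600000               # population size
with_attr <- 0.40 * pop_size      # members with the attribute (40%)
draws     <- 100                  # sample size

# ASSUMPTION: the truncated question is read as "35 or fewer".
# phyper(q, m, n, k): m = successes in population, n = failures, k = draws
p_le_35 <- phyper(35, m = with_attr, n = pop_size - with_attr, k = draws)

# Complementary tail, P(X > 35), in case the question asks for "more than 35"
p_gt_35 <- phyper(35, m = with_attr, n = pop_size - with_attr, k = draws,
                  lower.tail = FALSE)

# With a population this large, sampling without replacement behaves almost
# like sampling with replacement, so the binomial is a close approximation
p_binom <- pbinom(35, size = draws, prob = 0.40)

print(c(hyper_le_35 = p_le_35, hyper_gt_35 = p_gt_35, binom_le_35 = p_binom))
```

The binomial line is included only as a sanity check on the hypergeometric result.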

Who did HBGary contact the most?

February 21, 2011

Following on from Friday's post about the travails of internet security firm HBGary, R user Michael Bommarito has done an analysis of the leaked emails to find the top 20 most contacted email addresses and the top 20 most referenced internet domains. There are some interesting names on those lists, to be sure. Check them out at the link...

Read more »

New R User Groups in Canada, India

February 21, 2011

Three new local R user groups have just been added to the directory: In Québec, the group Plein-R is affiliated with the department of Forestry, Geography and Geomatics at Laval University. Although the group's website is in French, group organizer Etienne Racine says, "Our group is bilingual. Our meetings are in a mix of French and English: we call...

Read more »

Access all UCSC wiggle tracks from R and your terminal

February 21, 2011

The rtracklayer package allows you to access most of the UCSC wiggle tracks from R. However, there is another way, which might be more practical in situations where you need to summarize the wig track scores over a given set of genomic coordinates. Although yo...

Read more »

Choropleth tutorial and regression coefficient plots

February 21, 2011

About two weeks ago, I gave a short talk at Duke, wherein I presented a brief tutorial on creating choropleth maps in R using ggplot2. Since the code is already written, and the data and shapefiles already hosted online, I thought I would share the tutorial more widely. A .ZIP file containing all the files necessary … Continue reading →

Read more »

R Tutorial Series: Two-Way Repeated Measures ANOVA

February 21, 2011

Repeated measures data require a different analysis procedure than our typical two-way ANOVA and subsequently follow a different R process. This tutorial will demonstrate how to conduct two-way repeated measures ANOVA in R using the Anova() function fr...

Read more »

Presentation on Building R Packages

February 21, 2011

Last week I gave a presentation to the Melbourne R User Group on Building R Packages. The talk covered a simple package example, and an example of interfacing R with native code. The slides are here: RPackages.pdf. The R community in Melbourne (and Aus...

Read more »

Tracking the Frequency of Twitter Hashtags with R

February 21, 2011

I’ve posted three examples of Twitter hashtag datasets in the last week: one on China, one on Iran, and one on Algeria. In order to build these datasets, I needed to obtain older tweets; this is slightly more difficult than … Continue reading →

Read more »

Child health metrics

February 20, 2011

In the analysis of child health data, z-scores or percentile groupings are generally used because children's growth is not linear. The CDC (Centers for Disease Control and Prevention) has released tables of data for calculating these z-scores and percentiles, and here are some scripts for R to calculate them in your sample.

Read more »

R Optimisation Tips using Optim and Maximum Likelihood

February 20, 2011

This post summarises some R modelling tips I picked up at AMPC2011. I got some tips from a tutorial on parameter estimation put on by Scott Brown from the Newcastle Cognition Lab. The R code used in the tutorial is available directly here or from the confer...

Read more »
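As a rough illustration of the kind of optim()-based maximum likelihood fit this post is about, here is a minimal sketch. It is not the tutorial's code: the normal model, the simulated data, the starting values and the name negloglik are all assumptions made for the example.

```r
set.seed(1)
# Simulated data, used only for illustration (not from the tutorial)
x <- rnorm(200, mean = 5, sd = 2)

# Negative log-likelihood of a normal model.
# The standard deviation is optimised on the log scale so it stays positive.
negloglik <- function(par, data) {
  mu    <- par[1]
  sigma <- exp(par[2])
  -sum(dnorm(data, mean = mu, sd = sigma, log = TRUE))
}

# Rough data-based starting values; optim() minimises, using Nelder-Mead by default
start <- c(median(x), log(IQR(x)))
fit   <- optim(par = start, fn = negloglik, data = x)

# Back-transform the log-sd and report the estimates
est <- c(mean = fit$par[1], sd = exp(fit$par[2]))
print(est)
print(fit$convergence)   # 0 indicates that optim() reports convergence
```

For a normal model the estimates can be checked against the closed-form answers, mean(x) and the standard deviation computed with divisor n, which makes it a convenient test case before moving to models without closed forms.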

Does the Student based confidence interval have any interest in practice ?

February 20, 2011

On Friday, in the statistics course, we started the section on confidence intervals and, as always, I got a bit confused about the degrees of freedom of the Student distribution (should it be or ?) and about which empirical variance to use (should we consider the one wher...

Read more »
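The two points of confusion named in the excerpt can be checked numerically. The sketch below assumes the standard textbook pairing, n - 1 degrees of freedom together with the n - 1 variance divisor that sd() uses; the excerpt's own symbols did not survive, and the simulated sample is purely illustrative.

```r
set.seed(42)
x <- rnorm(15, mean = 10, sd = 3)   # simulated sample, for illustration only
n <- length(x)
s <- sd(x)                          # sd() divides by n - 1

# Student-based 95% interval: n - 1 degrees of freedom
ci_student <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * s / sqrt(n)

# Normal-based interval for comparison (ignores the uncertainty in s)
ci_normal <- mean(x) + c(-1, 1) * qnorm(0.975) * s / sqrt(n)

# Reference: the interval reported by t.test()
ci_ttest <- as.numeric(t.test(x)$conf.int)

print(rbind(student = ci_student, normal = ci_normal, t.test = ci_ttest))
```

The hand-computed Student interval matches t.test(), and it is slightly wider than the normal-based one; the gap shrinks as n grows.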

R versus Matlab in Mathematical Psychology

February 20, 2011

I recently attended the 2011 Australasian Mathematical Psychology Conference. This post summarises a few thoughts I had on the use of R, Matlab and other tools in mathematical psychology flowing from discussions with researchers at the conference. I wanted...

Read more »

Converting MATLAB and R date and time values

February 20, 2011

For some unknown reason, MATLAB codes its date/time values as the number of elapsed days starting from January 1 in the year 0000. R uses the equally arbitrary, but much more widespread POSIX/Unix epoch as a reference for time keeping, so that R’...

Read more »
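The conversion described in this excerpt can be sketched as below. The helper names matlab_to_posix and posix_to_matlab are hypothetical, and the sketch assumes plain MATLAB serial date numbers (fractional days) interpreted as UTC; it is not taken from the post itself.

```r
# 719529 is the MATLAB serial date number of 1970-01-01 00:00:00,
# i.e. the Unix epoch expressed in MATLAB's day count
matlab_epoch_offset <- 719529
seconds_per_day     <- 86400

# Hypothetical helper: MATLAB serial date number -> POSIXct
matlab_to_posix <- function(datenum, tz = "UTC") {
  secs <- (datenum - matlab_epoch_offset) * seconds_per_day
  as.POSIXct(secs, origin = "1970-01-01", tz = tz)
}

# Hypothetical helper: POSIXct -> MATLAB serial date number
posix_to_matlab <- function(t) {
  as.numeric(t) / seconds_per_day + matlab_epoch_offset
}

# Noon on 2011-02-20 as a MATLAB serial date number
t <- matlab_to_posix(734554.5)
print(format(t, "%Y-%m-%d %H:%M:%S"))
print(posix_to_matlab(t))
```

Time zones are the main thing to watch: MATLAB serial date numbers carry no zone information, so the tz argument has to be chosen to match however the values were produced.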