Thor vs. Uncanny X-Men vs. Fantastic Four

February 21, 2011

Three of Marvel’s longest-running comic book series are Thor, Uncanny X-Men, and Fantastic Four. Using data from 2010, I compare monthly comic book sales for each series. The data cover only monthly issues, not trade paperbacks. The series Amazing Spider-Man was not considered because it was released twice a month.

Use R to view and manipulate the File System

February 21, 2011

One of the best ways to learn how to code in R is to view sample scripts that people share. I recently came across this post where Michael uses R to scrape Twitter and collect all sorts of great data …
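
As a rough illustration of the title's theme, here is a minimal sketch of viewing and changing the file system with base R functions only. It is not taken from the linked post; the directory and file names (`fs_demo`, `notes.txt`) are made up, and everything happens inside R's temporary directory so that no real files are touched.

```r
# Work inside a throwaway directory so nothing on the real file system changes.
# "fs_demo" and "notes.txt" are hypothetical names chosen for this sketch.
sandbox <- file.path(tempdir(), "fs_demo")
dir.create(sandbox, showWarnings = FALSE)

# Create a small file, then look at what is there.
demo_file <- file.path(sandbox, "notes.txt")
writeLines(c("first line", "second line"), demo_file)
listed <- list.files(sandbox)             # names of the files in the directory
size_bytes <- file.info(demo_file)$size   # file size in bytes

# Rename the file, check that the old name is gone, then clean up.
renamed <- file.path(sandbox, "notes_old.txt")
file.rename(demo_file, renamed)
exists_after_rename <- file.exists(demo_file)
unlink(sandbox, recursive = TRUE)
```

Other base functions such as `file.copy()`, `basename()`, and `normalizePath()` follow the same pattern of taking path strings and returning plain vectors.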

Dataset: Wisconsin Union Protester Tweets #wiunion

February 21, 2011

I’ve been playing with Twitter data over the last week, archiving Algerian, Egyptian, Iranian, and Chinese tweets. I thought I’d bring the story a little closer to home this time by archiving tweets from Wisconsin Union protesters on the …

Interest Rates’ Influence on 1987

February 21, 2011

One aspect of 1987 that does not receive enough attention is interest rates. Higher interest rates constrain economic activity and compete with other investments. As seen in the chart below, the US 10-year Treasury rate climbed 40% from 7% t...

Using R for Introductory Statistics, Chapter 5, hypergeometric distribution

February 21, 2011

This is a little digression from Chapter 5 of Using R for Introductory Statistics that led me to the hypergeometric distribution. Question 5.13: A sample of 100 people is drawn from a population of 600,000. If it is known that 40% of the population has a specific attribute, what is the probability that 35 or...
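
The question is cut off in this excerpt, so the sketch below computes both readings as an illustration only: the probability that 35 or fewer, and the probability that 35 or more, of the 100 sampled people have the attribute. It uses base R's `phyper()` for sampling without replacement and `pbinom()` for the binomial approximation; the variable names are my own.

```r
# Numbers from the excerpt: population 600,000, 40% with the attribute, sample of 100.
N <- 600000
p <- 0.40
n <- 100
m <- N * p                      # people in the population with the attribute

# Hypergeometric: sampling without replacement from a finite population.
p_hyper_at_most  <- phyper(35, m = m, n = N - m, k = n)                      # P(X <= 35)
p_hyper_at_least <- phyper(34, m = m, n = N - m, k = n, lower.tail = FALSE)  # P(X >= 35)

# Binomial: treats each draw as independent with success probability p.
p_binom_at_most <- pbinom(35, size = n, prob = p)                            # P(X <= 35)

print(c(hypergeometric = p_hyper_at_most, binomial = p_binom_at_most))
```

Because the sample is tiny relative to the population, the two distributions give nearly the same answer here; the difference matters more when the sample is a sizeable fraction of the population.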

Who did HBGary contact the most?

February 21, 2011

Following on from Friday's post about the travails of internet security firm HBGary, R user Michael Bommarito has done an analysis of the leaked emails to find the top 20 most contacted email addresses and the top 20 most referenced internet domains. There are some interesting names on those lists, to be sure. Check them out at the link...

New R User Groups in Canada, India

February 21, 2011

Three new local R user groups have just been added to the directory: In Québec, the group Plein-R is affiliated with the department of Forestry, Geography and Geomatics at Laval University. Although the group's website is in French, group organizer Etienne Racine says, "Our group is bilingual. Our meetings are in a mix of French and English: we call...

Access all UCSC wiggle tracks from R and your terminal

February 21, 2011

The rtracklayer package allows you to access most of the UCSC wiggle tracks from R. However, there is another way which might be more practical in situations where you need to summarize the wig track scores over a given set of genomic coordinates. Although yo...

Choropleth tutorial and regression coefficient plots

February 21, 2011

About two weeks ago, I gave a short talk at Duke, wherein I presented a brief tutorial on creating choropleth maps in R using ggplot2. Since the code is already written, and the data and shapefiles already hosted online, I thought I would share the tutorial more widely. A .ZIP file containing all the files necessary …

R Tutorial Series: Two-Way Repeated Measures ANOVA

February 21, 2011

Repeated measures data require a different analysis procedure than our typical two-way ANOVA and consequently follow a different R process. This tutorial will demonstrate how to conduct two-way repeated measures ANOVA in R using the Anova() function fr...

Presentation on Building R Packages

February 21, 2011

Last week I gave a presentation to the Melbourne R User Group on Building R Packages. The talk covered a simple package example, and an example of interfacing R with native code. The slides are here: RPackages.pdf. The R community in Melbourne (and Aus...

Tracking the Frequency of Twitter Hashtags with R

February 21, 2011

I’ve posted three examples of Twitter hashtag datasets in the last week: one on China, one on Iran, and one on Algeria. In order to build these datasets, I needed to obtain older tweets; this is slightly more difficult than …

Child health metrics

February 20, 2011

In the analysis of child health data, z-scores or percentile groupings are generally used because children's growth is not linear. The CDC (Centers for Disease Control and Prevention) has released tables of data for calculating these z-scores and percentiles, and here are some scripts for R to calculate these in your sample.

R Optimisation Tips using Optim and Maximum Likelihood

February 20, 2011

This post summarises some R modelling tips I picked up at AMPC2011. I got some tips from a tutorial on parameter estimation put on by Scott Brown from the Newcastle Cognition Lab. The R code used in the tutorial is available directly here or from the confer...
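
To make the topic concrete, here is a minimal sketch of maximum likelihood estimation with `optim()`. It is not the tutorial's code: the simulated data, the normal model, and the names `neg_log_lik` and `est` are all assumptions chosen for illustration. The standard deviation is optimised on the log scale so that the optimiser can never propose a negative value.

```r
set.seed(42)
y <- rnorm(200, mean = 5, sd = 2)   # simulated data, for illustration only

# Negative log-likelihood of a normal model; par = c(mu, log(sigma)).
neg_log_lik <- function(par, y) {
  mu <- par[1]
  sigma <- exp(par[2])              # back-transform to keep sigma positive
  -sum(dnorm(y, mean = mu, sd = sigma, log = TRUE))
}

# optim() minimises, so we hand it the negative log-likelihood.
# The default method is Nelder-Mead; c(0, 0) is an arbitrary starting point.
fit <- optim(par = c(0, 0), fn = neg_log_lik, y = y)

est <- c(mu = fit$par[1], sigma = exp(fit$par[2]))
print(est)
print(fit$convergence)              # 0 indicates that optim() reports convergence
```

For this model the estimates can be checked against the closed-form answers, `mean(y)` and `sqrt(mean((y - mean(y))^2))`, which is a useful sanity check before moving to models without closed forms.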

Does the Student based confidence interval have any interest in practice ?

February 20, 2011

On Friday, in the statistics course, we started the section on confidence intervals, and as always, I got a bit confused with the degrees of freedom of the Student distribution (should it be n or n-1?) and which empirical variance (should we consider the one wher...
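
For reference, here is a small sketch of the usual Student-based interval for a mean, built by hand and compared with base R's `t.test()`. The simulated sample and the variable names are assumptions for illustration. The standard construction pairs the variance estimator that divides by n - 1 (which is what `sd()` computes) with n - 1 degrees of freedom.

```r
set.seed(1)
x <- rnorm(20, mean = 10, sd = 3)   # simulated sample, for illustration only
n <- length(x)

s <- sd(x)                          # sd() divides the sum of squares by n - 1
half_width <- qt(0.975, df = n - 1) * s / sqrt(n)
ci_manual <- mean(x) + c(-1, 1) * half_width

# t.test() reports the same 95% interval.
ci_ttest <- as.numeric(t.test(x)$conf.int)

print(rbind(manual = ci_manual, t.test = ci_ttest))
```

Replacing `qt(0.975, df = n - 1)` with `qnorm(0.975)` gives the Gaussian interval, which is slightly narrower; the gap shrinks as n grows.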

R versus Matlab in Mathematical Psychology

February 20, 2011

I recently attended the 2011 Australasian Mathematical Psychology Conference. This post summarises a few thoughts I had on the use of R, Matlab and other tools in mathematical psychology flowing from discussions with researchers at the conference. I wanted...

Converting MATLAB and R date and time values

February 20, 2011

For some unknown reason, MATLAB codes its date/time values as the number of elapsed days starting from January 1 in the year 0000. R uses the equally arbitrary, but much more widespread POSIX/Unix epoch as a reference for time keeping, so that R’...
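
The conversion between the two conventions can be sketched as follows. This is not the post's code: the function names are hypothetical, and the sketch assumes the MATLAB values are serial date numbers as returned by `datenum` and that they represent UTC times (MATLAB date numbers carry no time zone). The offset 719529 is the MATLAB date number of 1970-01-01.

```r
# MATLAB serial date number of the Unix epoch, i.e. datenum('1970-01-01').
MATLAB_UNIX_EPOCH <- 719529
SECONDS_PER_DAY <- 86400

# Hypothetical helper: MATLAB datenum (days) -> R POSIXct (seconds since 1970).
matlab_to_posix <- function(datenum, tz = "UTC") {
  as.POSIXct((datenum - MATLAB_UNIX_EPOCH) * SECONDS_PER_DAY,
             origin = "1970-01-01", tz = tz)
}

# Hypothetical helper: R POSIXct -> MATLAB datenum.
posix_to_matlab <- function(time) {
  as.numeric(time) / SECONDS_PER_DAY + MATLAB_UNIX_EPOCH
}

# The fractional part of a datenum is the time of day (0.5 is noon).
print(matlab_to_posix(734554.5))
```

Because date numbers are stored as doubles, times of day survive the round trip only up to floating-point precision, which is worth keeping in mind when comparing timestamps for equality.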

Vectorize!

February 20, 2011

Here is an email sent by one of my students a few days ago:

Do you know how to integrate a function with an “if”? For instance:

    > X=rnorm(100)
    > Femp=function(x){
    +   return(sum(X<x))
    + }
    > integrate(Femp,0,1)$value

does not work.

My reply was that the fundamental reason it does not work is that integrate (or curve for instance)
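
The point of the title can be sketched as follows: `integrate()` passes a whole vector of evaluation points to the function and expects one value back per point, whereas `Femp` collapses everything into a single number. Wrapping it with base R's `Vectorize()` fixes that. This is an illustration rather than the post's own code; to keep the result easy to check by hand, it replaces the student's `rnorm(100)` with three fixed points, and the names `Femp_vec`, `failed`, and `area` are my own.

```r
# Three fixed points instead of rnorm(100), so the integral is easy to verify.
X <- c(0.25, 0.5, 0.75)

# Same function as in the email: counts how many X fall below a single x.
Femp <- function(x) {
  return(sum(X < x))
}

# integrate() calls Femp with a vector of nodes but gets one number back,
# so it stops with an error about the result having the wrong length.
failed <- inherits(try(integrate(Femp, 0, 1), silent = TRUE), "try-error")

# Vectorize() builds a wrapper that applies Femp to each element in turn.
Femp_vec <- Vectorize(Femp)
per_point <- Femp_vec(c(0, 0.3, 0.6, 1))   # one count per input value

# The step function is 0, 1, 2, 3 on four intervals of width 0.25,
# so the integral over [0, 1] should be close to 1.5.
area <- integrate(Femp_vec, 0, 1)$value
print(area)
```

With many jump points, as in the original `rnorm(100)` example, `integrate()` may also need a larger `subdivisions` argument, since adaptive quadrature works hard around discontinuities.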

Dataset: Tweets from the Chinese Protests #cn220

February 20, 2011

Earlier this week, I posted a ~100k tweet dataset on the #25bahman protests in Iran. The corresponding figure of frequencies showed a strong presence on Twitter, with over 500 tweets per 5-minute period at peak. You can download the …
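
Counting tweets per 5-minute period, as in the frequency figure mentioned above, can be sketched with base R alone. The timestamps below are invented for illustration and are not taken from any of the datasets; the variable names are my own.

```r
# Hypothetical tweet timestamps (made up for this sketch, not real data).
start <- as.POSIXct("2011-02-20 12:00:00", tz = "UTC")
tweet_times <- start + c(10, 70, 290, 310, 320, 905)   # offsets in seconds

# cut() on date-times accepts interval specifications such as "5 min".
bins <- cut(tweet_times, breaks = "5 min")

# table() counts tweets per bin, including bins that received none.
counts <- table(bins)
print(counts)

# The busiest 5-minute period and its tweet count.
peak <- counts[which.max(counts)]
print(peak)
```

On a real archive, `tweet_times` would come from parsing the created-at field of each tweet with `as.POSIXct()` or `strptime()` using the format in which the data were saved.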

New R features in Bio7 1.5

February 20, 2011

Bio7 1.5 has been released and comes with new functionalities for R. For those who don’t know Bio7, here is a short description: Bio7 is an integrated development environment for ecological modelling based on the Rich-Client-Platform concept of the Java IDE Eclipse. The Bio7 platform contains several perspectives which arrange several views for a special

UseR! 2011 in Warwick

February 20, 2011

This year's useR! conference will take place in Warwick, on August 16-18. It is being organised by the department of Statistics and funded by CRiSM and Revolution Analytics (providers of the R tee-shirt!). I wish I could attend but mid-August is usually associated with genuine (post-JSM) family vacations.

Talking R through Java

February 20, 2011

Today I played a bit with JRI, part of rJava, a Java-R interface. Here you can learn how to set it up on Debian, Ubuntu, and similar systems.

Open-sourcing some of my automation code

February 19, 2011

To automate my trading I use a mix of scripts. Everything goes – R, Python, shell, C++, etc. For some time now I have been satisfied with the tools I have created. They run once a day, gather data from EODDATA, update the database, run some R magic to decide what needs to be done

Software tools for data analysis – an overview

February 19, 2011

By Szilard Pafka. Discussions on various software tools (C, C++, Perl, Python, Unix shell, R, Matlab, SAS, SPSS, Excel, databases, Hadoop etc.) used in data analysis. Szilard Pafka (founder and co-organizer of the Los Angeles R users group) presents an …
