## ttrTests This is a Test Test 3: Data Snoopy

September 30, 2011
THIS IS NOT INVESTMENT ADVICE.  IT IS JUST AN EXAMPLE AND WILL LIKELY LOSE LOTS OF MONEY IF YOU PURSUE WHAT IS DISCUSSED.  READER IS RESPONSIBLE FOR THEIR OWN GAINS OR LOSSES.  IF YOU ARE AN UNLIKELY WINNER, I WOULD LOVE TO HEAR YOUR STO...

## R tutorial on visualizations/graphics

September 30, 2011
Rolf Lohaus, a Huxley postdoctoral fellow here in the EEB dept at Rice University, gave our R course a talk on basic visualizations in R this morning. Enjoy!

## R 2.13.2 released

September 30, 2011
The R core team announced today that R 2.13.2 is now available: The byte pixies have rolled up R-2.13.2.tar.gz at 9:00 this morning. This is intended to be the final release of the 2.13 series, for the benefit of those apprehensive of putting 2.14.x into production use. This update fixes a number of minor bugs (for example, pch="." will...

## Rcpp 0.9.7

September 30, 2011
A fresh maintenance release version 0.9.7 of Rcpp went onto CRAN and into Debian earlier today. This release contains two contributed fixes. The first, suggested by Darren Cook via the rcpp-devel mailing list, corrects how we had set up excepti...

## Microfinance Map of India – another go…

September 30, 2011
I gave it another go, trying to get a map that looks a bit nicer. This time, I tried to compute something like a density or intensity in a certain area. On the previous map, this was not visible very well. I used ggplot2 and a bit of R code, together with RGoogleMaps to produce the

## Monitoring Productivity II – the Others

September 30, 2011
In the previous Monitoring Productivity Experiment post I looked into the hours I spent on the computer; now I will look into the hours others spend on the computer, which is far more interesting :) To find things like what day people spend more time on the computer, ho...

## Setting the initial view of a motion chart in R

September 30, 2011
Following on from my article about accessing and plotting World Bank data with R, I want to talk about how to change the initial view of a motion chart. Over the last couple of weeks I have been asked a few times how to do this. For instance Stephen O'G...

September 29, 2011
I searched around to see if there was a blog post somewhere describing how to customize one’s .Rprofile but was surprised to find just one outdated post. So here is a quick intro on the topic. If you are a power R user, you already know what it does. For those of you that don’t,

## Modelling with R: part 1

September 29, 2011
When I started work about 3 months ago, I didn't know much more than loading data and executing standard econometric commands in R. But now I feel much much much more confident in using R for work, for research, for puzzles, and sometimes just for fun....

## Obama 2012 campaigning with analytics

September 29, 2011
The campaign to re-elect US president Barack Obama is hiring -- and the RDataMining blog noticed that several of the open positions seek R skills. If you want to be a Communications Analyst, Digital Strategy Analyst, or Statistical Modeling Analyst and you know R, there may be a job opening for you. Just goes to show there's no corner...

## Googling Bayes’ pictures

September 29, 2011
I am writing way too many posts in a row on Google tools. I promise I will think about something else soon. I find it amusing that one can launch a search in Google Images by just dragging a picture into …

## A brief introduction to R for SAS and SPSS users

September 29, 2011
If you've used SAS or SPSS and want a jump-start into the basics of the popular R language, next week's webinar, Introduction to R for SAS and SPSS Users, will be of interest to you. While R, SAS and SPSS are all software systems for data analysis and graphics, the underlying concepts in R are quite different to...

## Connect JAVA to R part 2

September 29, 2011
To follow on from the earlier post on using R through Java, it is even easier to get jri up and running as a NetBeans module. Why is this useful? Well the platform that the NetBeans IDE is built on …

## Paired sample t-test in R

September 28, 2011
Let’s walk through using R and Student’s t-test to compare paired sample data. The book Statistics: The Exploration & Analysis of Data (6th edition, p. 505) presents the longitudinal study “Bone mass is recovered from lactation to postweaning in adolescent mothers …
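
As a rough illustration of what a paired-sample t-test computes, here is a minimal Python sketch. The `before` and `after` values are hypothetical placeholders, not the bone-mass data from the book or the study, and `paired_t_statistic` is a name chosen for this example. The test works on the per-subject differences: the statistic is the mean difference divided by its standard error, with n - 1 degrees of freedom.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for paired samples of equal length."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # The paired test reduces to a one-sample test on the differences.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # statistics.stdev uses the sample (n - 1) standard deviation.
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / std_err, n - 1


# Hypothetical measurements for four subjects (illustration only).
before = [10, 12, 9, 11]
after = [12, 14, 10, 13]
t_stat, dof = paired_t_statistic(before, after)
print(t_stat, dof)
```

The p-value would then come from a Student's t distribution with `dof` degrees of freedom, which a statistics package would normally supply.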

## ttrTests This is a Test–Test 1 and Test 2

September 28, 2011
Just to remind everyone, THIS IS NOT INVESTMENT ADVICE AND ANY ACTIONS TAKEN BASED ON THIS DISCUSSION WILL PROBABLY RESULT IN SIGNIFICANT LOSSES. We had fun with the ttrTests package in two previous posts ttrTests: Its Great Thesis and Incredible Poten...

## The R Graph Gallery goes social

September 28, 2011
The R Graph Gallery, the website from Romain François that showcases hundreds of examples of data visualization with R, has new social features. Now, when you find a graph or chart you find appealing or useful, you can "Like" it on Facebook or "+1" it on Google+. This should be a great way of highlighting the best charts and...

## Is the “Long Tail” a Useless Concept?

In response to my last post, “The Long Tail of the Pareto Distribution,” Neil Gunther had the following comment: “Unfortunately, you've fallen into the trap of using the ‘long tail’ misnomer. If you think about it, it can't possibly be the length of the tail that sets distributions like Pareto and Zipf apart; even the negative exponential and Gaussian...

## Data Science: a literature review

September 28, 2011
Just what is Data Science, anyway? Here's one take: Ever since the term "Data Scientist" was coined by DJ Patil and Jeff Hammerbacher in 2009, there's been a vigorous debate on what the term actually means. More than 80% of statisticians consider themselves data scientists, but Data Science is more than just Statistics. (My own take is that Data...

## Polyploidy in sugarcane

September 28, 2011
While reading UseR conference abstracts I came across this sentence: "Sugarcane is polypoid, i.e., has 8 to 14 copies of every chromosome, with individual alleles in varying numbers." Wow! This generates a really complex genotype system. Say we have a biallelic gene with alleles A and B. In diploids the possible genotypes are AA, AB, and BB. Given the...
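
To get a feel for how the number of genotypes grows with ploidy, here is a minimal Python sketch. It assumes a genotype is an unordered collection of allele copies, so AB and BA count as the same genotype. The helper name `genotypes` is invented for this illustration and does not come from the post.

```python
from itertools import combinations_with_replacement


def genotypes(alleles, ploidy):
    """List the unordered genotypes for the given alleles and ploidy level."""
    return ["".join(g) for g in combinations_with_replacement(alleles, ploidy)]


# Diploid case from the text: AA, AB, BB.
print(genotypes("AB", 2))

# Biallelic gene at the ploidy levels quoted for sugarcane (8 to 14 copies).
for ploidy in range(8, 15):
    print(ploidy, len(genotypes("AB", ploidy)))
```

For a biallelic gene the count is always the ploidy plus one, since a genotype is fully determined by how many copies of A it carries.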

## Bessel integral

September 28, 2011
Pierre Pudlo and I worked this morning on a distribution related to phylogenetic trees and got stuck on the following Bessel integral [equation image: Bessel integral], where $I_n$ is the modified Bessel function of the first kind. We could not find anything better than formula 6.611(4) in Gradshteyn and Ryzhik, which is for $a=0$… Anyone in for a closed form

## Using transparency for data count intuition

September 27, 2011
This is an illustration of representing point count in a graphic using transparency. This is easy to do in ggplot2 if you use one of the barchart type of geoms.  However I think there are other situations where it would be useful to apply aesthetics based on point count. Since Hadley did a lot of

## World Tourism Day, and Google Public Data Explorer

September 27, 2011
Today is World Tourism Day! So let’s speak about some tourism-related datasets – and others. Among other nice functions, Google offers a Public Data Explorer in a beta version, which provides a collection of datasets from OECD, IMF, Eurostat, …

## Five new local R user groups

September 27, 2011
Looks like there's been a lot of activity in the R user community in the Northern hemisphere now that the summer break is over. I've just added several new groups to the Local R User Group Directory: Tokyo, Japan: The Tokyo.R R study group has already had 17 meetings, but has just been added to the directory. Shanghai/East China:...

## Tikz Introduction

September 27, 2011
The pgf drawing package for LaTeX provides facilities for drawing simple or complicated pictures within a LaTeX document. There are many options available within the package and in this post we consider some of the basics to get up and running. As with all LaTeX documents we need to select a

## Basic line chart with ggplot2

September 27, 2011
ggplot2 is a package for R which easily draws plots that are easier on the eyes than R’s built-in plotting functions, though the grammar is different than what is commonly used in R. This code demonstrates how to prepare a …

## Ghastly R code

September 27, 2011
My R package, R/qtl, contains about 33k lines of R code (and 21k lines of C code). Some of it is quite good; some of it is terrible. Here’s another example of the terrible. I’ve long needed to revise the function scantwo, for performing a two-dimensional genome scan for pairs of loci. I was looking

## Project Euler: problem 6

September 27, 2011
The sum of the squares of the first ten natural numbers is $1^2 + 2^2 + \dots + 10^2 = 385$. The square of the sum of the first ten natural numbers is $(1 + 2 + \dots + 10)^2 = 55^2 = 3025$. Hence the difference between the sum of the squares o...
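
The arithmetic for the first ten natural numbers can be checked with a short Python sketch. The function names are chosen for this illustration and are not taken from the original post, which works in R.

```python
def sum_of_squares(n):
    """1^2 + 2^2 + ... + n^2."""
    return sum(k * k for k in range(1, n + 1))


def square_of_sum(n):
    """(1 + 2 + ... + n)^2."""
    return sum(range(1, n + 1)) ** 2


def difference(n):
    """Square of the sum minus the sum of the squares."""
    return square_of_sum(n) - sum_of_squares(n)


print(sum_of_squares(10))  # 385
print(square_of_sum(10))   # 3025
print(difference(10))      # 2640
```

The same functions work for any upper limit `n`, so a larger range only needs a different argument.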

## Example 9.7: New stuff in SAS 9.3– Frailty models

September 27, 2011
Shared frailty models are a way of allowing correlated observations into proportional hazards models. Briefly, instead of $l_i(t) = l_0(t)e^{x_i B}$, we allow $l_{ij}(t) = l_0(t)e^{x_{ij} B + g_i}$, where observations $j$ are in clusters $i$, $g_i$ is typically norma...