## opentick alternatives

November 5, 2009
I've been getting a bit of traffic from people searching for opentick (the defunct company), so I've started a list of similar (but non-free) data providers. I'm not affiliated with any of these vendors, and the list is in no particular order. I'll u...

## R has a JSON package

November 5, 2009
Named rjson, appropriately. It’s quite basic just now, but contains methods for interconversion between R objects and JSON. Something like this:

    > library(rjson)
    > data <- list(a=1,b=2,c=3)
    > json <- toJSON(data)
    > json
    "{\"a\":1,\"b\":2,\"c\":3}"
    > cat(json, file="data.json")

Use cases? I wonder if RApache could be used to build an API that serves R data in JSON format?

## Show me the mean(ing)…

November 5, 2009

Well, testing a bunch of samples for the largest population mean isn’t that common, yet a simple test is at hand. Under the obvious title “The rank sum maximum test for the largest K population means”, the test relies on the calculation of the sum of ranks under the combined sample of size , where

## SciViews-K got updated again

November 4, 2009
With the recent update, I was able to get it working properly. Interestingly, while it works on my 64-bit Vista, it does not work on my 64-bit Ubuntu. I have no idea what is going on.

## R’s xtabs for total weighted read coverage

November 4, 2009
Samtools and its BioPerl wrapper Bio::DB::Sam prefer to give read coverage on a depth-per-base-pair basis. This is typically an array of depths, one for every position that has at least one read aligned. OK, works for me. But how can we quickly see which targets (in my case transcripts) have the greatest total weighted read coverage...
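As a rough illustration of the idea, `xtabs` with a left-hand side sums that column within each level of the grouping factor, which gives a total depth per target. The data frame `cov` and its column names below are hypothetical stand-ins for per-base depths exported from Samtools, not output from the original post.

```r
# Hypothetical per-base depths: one row per covered position.
cov <- data.frame(
  transcript = c("tx1", "tx1", "tx2", "tx2", "tx2", "tx3"),
  position   = c(10, 11, 5, 6, 7, 42),
  depth      = c(3, 5, 2, 2, 1, 10)
)

# With a left-hand side, xtabs sums 'depth' within each transcript.
total <- xtabs(depth ~ transcript, data = cov)

# Rank the transcripts by total coverage, highest first.
ranked <- sort(total, decreasing = TRUE)
print(ranked)
```

Dividing `total` by transcript length would give a length-normalised figure, if that is closer to what you need.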

## Split, apply, and combine in R using PLYR

November 4, 2009
While flirting around with the previously mentioned ggplot2, I came across an incredibly useful set of functions in the plyr package, made by Hadley Wickham, the same guy behind ggplot2. If you've ever used MySQL before, think of "GROUP BY", but here you can arbitrarily apply any R function to splits of the data, or write one yourself. Imagine you have...
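To make the "GROUP BY" analogy concrete, here is a minimal sketch of the three steps written out in base R, followed by what should be the equivalent single `ddply` call. The `sales` data frame is made up for illustration, and the plyr part only runs if the package is installed.

```r
# Hypothetical toy data.
sales <- data.frame(
  region = c("north", "north", "south", "south", "south"),
  amount = c(10, 20, 5, 15, 10)
)

# 1. Split the data frame into one piece per region.
pieces <- split(sales, sales$region)

# 2. Apply a function to each piece.
summaries <- lapply(pieces, function(d) {
  data.frame(region = d$region[1], mean_amount = mean(d$amount), n = nrow(d))
})

# 3. Combine the per-piece results into one data frame.
result <- do.call(rbind, summaries)
print(result)

# plyr wraps the three steps in a single call (skipped if plyr is missing).
if (requireNamespace("plyr", quietly = TRUE)) {
  result_plyr <- plyr::ddply(sales, "region", plyr::summarise,
                             mean_amount = mean(amount), n = length(amount))
  print(result_plyr)
}
```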

## LondonR tomorrow night

November 2, 2009
LondonR Date: Tuesday 3rd November Time: 6pm – 9.30pm Venue: Shooting Star Public house, 129 City Rd London, EC1, United Kingdom +44 20 7929 6818 Introduction: Richard Pugh - mangosolutions 6.15pm: Richard Saldanha - R in the ...

## Welch test by permutations

November 1, 2009
WelshPerm <- function(response,variable,nperm=999,...){ base <- oneway.test(response~variable,...) base.p <- base$p.value base.W <- base$statistic count <- 1 # Permutation loop for(i in 1:nperm){ SAMPLE <- sample(response) we...
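The excerpt above is cut off, so the following is only a sketch of how a permutation version of the Welch test might look, not the original function. The name `welch_perm` and the returned list are my own choices; the only library call is `oneway.test` from the stats package, which performs the Welch test when `var.equal = FALSE`.

```r
# Permutation p-value for the Welch one-way test (illustrative sketch).
welch_perm <- function(response, group, nperm = 999) {
  group <- factor(group)
  base <- oneway.test(y ~ g, data = data.frame(y = response, g = group),
                      var.equal = FALSE)
  observed <- unname(base$statistic)

  # The observed statistic counts as one of the permutations.
  count <- 1
  for (i in seq_len(nperm)) {
    # Shuffle the responses while keeping the group labels fixed.
    shuffled <- data.frame(y = sample(response), g = group)
    stat <- oneway.test(y ~ g, data = shuffled, var.equal = FALSE)$statistic
    if (stat >= observed) count <- count + 1
  }

  list(statistic = observed,
       p.parametric = base$p.value,
       p.permutation = count / (nperm + 1))
}

# Simulated data with clearly different group means.
set.seed(1)
y <- c(rnorm(15, mean = 0), rnorm(15, mean = 3), rnorm(15, mean = 6))
g <- rep(c("a", "b", "c"), each = 15)
res <- welch_perm(y, g, nperm = 199)
print(res)
```

Counting the observed statistic among the permutations keeps the p-value strictly above zero, which is why `count` starts at 1.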

## R Tutorial Series: Summary and Descriptive Statistics

November 1, 2009
Summary (or descriptive) statistics are the first figures used to represent nearly every dataset. They also form the foundation for much more complicated computations and analyses. Thus, in spite of being composed of simple methods, they are essential ...

## Adventures with Comcast: Part ohnoesnotanotherone in an ongoing series

October 31, 2009
Regular readers of this blog (yes, both of you!) may remember the computer/broadband/ directory that this post appears in as the collection of my Comcastic (yeah right) experiences with my ISP. But I think this week may top everything. I'll just...

## Sorting a matrix/data.frame on a column

October 30, 2009
    # Sort the rows of a matrix or data.frame by column n.
    mat.sort <- function(mat, n) {
      mat[order(mat[, n]), , drop = FALSE]
    }

    a <- matrix(rnorm(100), ncol = 10)
    mat.sort(a, 1)

## useR! 2006

October 30, 2009
Last week, I attended the 2006 UseR! conference: here is a (long) summary of some of the talks that took place in Vienna -- since there were up to six simultaneous talks, I could not attend all of them... In this note: 0. General remarks 1. Tutorial:...

## R graphics

October 30, 2009
I just finished reading Paul Murrell's book, "R graphics". There are two graphical systems in R: the old ("classical" -- in the "graphics" package) one, and the new ("trellis", "lattice", "grid" -- in the "lattice" and "grid" packages) one. The first...

## The grammar of graphics (L. Wilkinson)

October 30, 2009
Though this book is supposed to be a description of the graphics infrastructure a statistical system could provide, you can and should also see it as a (huge, colourful) book of statistical plot examples. The author suggests describing a statistical...

## Statistics with R

October 30, 2009
I have just uploaded the new version of my "Statistics with R": http://zoonek2.free.fr/UNIX/48_R/all.html The previous version was one year and a half old, so in spite of the fact I have not had much time to devote to it in the past two years, it might...

## Use R 2009 Conference

October 30, 2009
I did not attend the conference this year, but just read through the presentations. There is some overlap with other R-related conferences, such as R in Finance or the Rmetrics workshop. http://www.agrocampus-ouest.fr/math/useR-2009/ http://www.rinfin...

## Decimal log scale on a plot

October 30, 2009
R only does natural (neperian) log scales by default, and this is lame. Here is some simple code to do a decimal log scale, pretty much a requirement for scientists… The force(x/y)lim options work for natural and log scales (for the latter case, you need to specify the powers of 10 that you want: c(-2,2)
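The post's own code is not shown in this excerpt, so the block below is only a sketch of one way to draw decimal log axes with base graphics. The function names and the `xpow`/`ypow` arguments are hypothetical; they play the role of the "powers of 10" limits mentioned above, e.g. `c(-2, 2)` for 0.01 to 100.

```r
# Tick positions (on the log10 scale) and labels (in original units).
# pow: range of powers of 10 to show, e.g. c(-2, 2).
log10_ticks <- function(pow) {
  p <- seq(floor(pow[1]), ceiling(pow[2]))
  list(at = p, labels = 10^p)
}

# Scatter plot on decimal log axes, one tick per power of 10.
log10_plot <- function(x, y,
                       xpow = range(log10(x)), ypow = range(log10(y)), ...) {
  tx <- log10_ticks(xpow)
  ty <- log10_ticks(ypow)
  plot(log10(x), log10(y), axes = FALSE,
       xlim = range(tx$at), ylim = range(ty$at), ...)
  axis(1, at = tx$at, labels = tx$labels)
  axis(2, at = ty$at, labels = ty$labels)
  box()
}

x <- 10^seq(-2, 2, length.out = 20)
y <- x^2
if (interactive()) {
  log10_plot(x, y, xpow = c(-2, 2), xlab = "x", ylab = "y")
}
```

Base graphics also accept `plot(x, y, log = "xy")`, which labels log axes in the original units and may be enough when you do not need to control the ticks yourself.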

## Long R, Short Excel

October 29, 2009
R is a very speedy statistical package that's like an F-18A Hornet, versus Excel, which is like a paper airplane. R is professional sports, Excel is Pop Warner. R is Mona Lisa, Excel is stick figures. R is ... okay, you get the idea. I'm long R, and short...

## Income inequality and partisan voting in the United States

October 29, 2009
Lane Kenworthy, Yu-Sung Su, and I write: Income inequality in the United States has risen during the past several decades. Has this produced an increase in partisan voting differences between rich and poor? We examine trends from the 1940s thr...

## Simple R figures

October 29, 2009
http://www.harding.edu/fmccown/R/ This comes in very handy.

## Kicking Ass with plyr

October 29, 2009
Tonight (October 29, 2009) at 5:30 PM is the Chicago R meetup at Jaks tap. Here’s more info.  I’ll be making a presentation based on my earlier blog post about plyr. The presentation will only be 8 minutes long so I’ve had to pick and choose my info carefully. OK, who am I kidding? I

## Go long on close and sell on open

October 29, 2009
I found a description of a supposedly profitable strategy on Bloomberg. The strategy is simple – buy the S&P 500 index at the close and sell it at the next day's open. So, I tested this claim and got a nice P/L curve: Yes, since 1993 this strategy has generated a profit of more than 300%. But neither commissions nor slippage are included:) Let’s
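The mechanics of such a test can be sketched in a few lines of R. The prices below are made up purely for illustration (they are not S&P 500 data), and the function names and the flat `cost` per round trip are assumptions of mine rather than anything from the original post.

```r
# Overnight return: buy at the close of day t, sell at the open of day t + 1.
overnight_returns <- function(open, close) {
  n <- length(close)
  stopifnot(length(open) == n, n >= 2)
  open[-1] / close[-n] - 1
}

# Compounded cumulative P/L, optionally charging a fixed cost per round trip.
cumulative_pl <- function(returns, cost = 0) {
  cumprod(1 + returns - cost) - 1
}

# Made-up daily prices, for illustration only.
open  <- c(100, 102, 101, 104)
close <- c(101, 100, 103, 105)

r     <- overnight_returns(open, close)
gross <- cumulative_pl(r)
net   <- cumulative_pl(r, cost = 0.001)  # 0.1% per round trip (assumed)
print(cbind(overnight = r, gross = gross, net = net))
```

Comparing `gross` with `net` shows how quickly a per-trade cost eats into a strategy that trades every single day.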

## Bioconductor 2.5 is out

October 29, 2009
For all bioinformaticians and R users out there: Bioconductor 2.5, the latest release of the project for the analysis and comprehension of genomic data, is out! A lot of interesting new stuff! See the full announcement here.

## Tips for Using StatET and Eclipse for Data Analysis in R

October 29, 2009
My favourite editor for conducting analysis in R is the StatET plug-in for Eclipse. This post discusses an assortment of tips and tricks that I've discovered to make this editing environment even better. Search (Control + H): I maintain projects i...

## Using the “foreign” package for data conversion

October 27, 2009
I was in a rush to convert an SPSS data file into Stata format. Somehow my Stattransfer v.8 for Linux was lost and I did not want to pause my work and go back to Windows just to get this one file converted. So I fired up Emacs+ESS+R, loaded the "foreign" package, did t...
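The rest of the post is cut off, but a conversion along these lines can be sketched with `read.spss` and `write.dta` from the foreign package. The helper name `spss_to_stata` is hypothetical, and the example assumes the sample file `electric.sav` that ships with foreign is present in your installation; substitute your own paths in practice.

```r
library(foreign)

# Read an SPSS file and write it back out in Stata format.
spss_to_stata <- function(sav_path, dta_path) {
  dat <- read.spss(sav_path, to.data.frame = TRUE)
  write.dta(dat, dta_path)
  invisible(dat)
}

# Try it on the sample SPSS file bundled with 'foreign', if it is available.
sav <- system.file("files", "electric.sav", package = "foreign")
if (nzchar(sav)) {
  dta  <- tempfile(fileext = ".dta")
  dat  <- spss_to_stata(sav, dta)
  back <- read.dta(dta)  # read the Stata file back as a sanity check
  print(dim(back))
}
```

`write.dta` may warn about variable names or factor labels it has to adjust, so it is worth checking the result before relying on it.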