## (more on) Pattern Matching for Transcription Factor Binding Sites

February 2, 2011

I published some initial script scribblings on this task about a week ago. After another week I'm posting some better formed and annotated code. The Biostrings and BSgenome packages are new to me and I've gone through many, many iterations and ex...

## A legitimate use for the stupidest variable name ever

February 2, 2011

The help page for make.names describes how to make a valid variable name in R: a syntactically valid name consists of letters, numbers and the dot or underline characters and starts with a letter or the dot not followed by a number. Names such as ".2way" are not valid, and neither are the reserved words.
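The rule quoted above can be checked directly in R. The following is only a minimal sketch, not code from the original post: the candidate names are illustrative, and the check relies on the assumption that make.names() leaves an already valid name unchanged.

```r
# Illustrative candidates (not taken from the original post).
candidates <- c(".2way", "2way", "if", ".way2", "my_var")

# make.names() returns its input unchanged only when the input is already
# a syntactically valid name, so comparing input and output tests validity.
is_valid <- make.names(candidates) == candidates
names(is_valid) <- candidates

print(is_valid)
```

Under that assumption, the names starting with a digit or with a dot followed by a digit, as well as the reserved word, are flagged as invalid, while ".way2" and "my_var" pass.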

## Plotting images on a grid using R or Python

February 2, 2011

A thread describing how to insert a PNG image into a plot, courtesy of Stack Overflow: plotting-images-on-a-grid. A very basic tip, but still useful to someone.

## Charting For Fun

February 2, 2011

Interesting Charts Making Lemonade If you are working on the FREE eMetrics pass, and you really should if you need a free pass, I created some charts based on the sample data. These data are limited in terms of the … Charting For Fun is a post from MichaelDHealy.com.

## Annotated source code

February 1, 2011

We programmers are told that reading code is a good idea. It may be good for you, but it's hard work. Jeremy Ashkenas has come up with a simple tool that makes it easier: docco. Ashkenas is also behind underscore.js and coffeescript, a dialect of ja...

## Teach Yourself How to Create Functions in R

February 1, 2011

As you can tell from my previous posts, I am diving in head first into learning how to program (and simplify) my analytical life using R.  I have always learned by example and have never really prospered from the “learn from scratch” school of thought.  As I follow along with some other fellow R programmers,

## Atmospheric Temperature Structure : 2 – Stratospheric Cooling

February 1, 2011

In this post I review the temperature structure of the atmosphere and lower stratosphere temperature (TLS) anomaly trends. Temperature Structure in the Atmosphere: In post 1 of this series, I developed this RClimate chart of temperature soundings which I update …

## Revolution R Enterprise 4.2 now available

February 1, 2011

Today we're pleased to announce the availability of the latest update to the Revolution R family, Revolution R Enterprise 4.2. This release includes all of the capabilities of the most powerful statistical software available, open-source R (version 2.11.1), plus additional components for big data analysis, integration, user experience and more. Version 4.2 includes a number of new features, including:...

## Introductory R Books

January 31, 2011

Here's a link to another blog compiling information and recommendations on introductory books on R (not statistics books that use R). I thought this might be useful for people. http://csgillespie.wordpress.com/2011/01/28/r-programming-books-updated/

## Tricks to manage memory in an R session

January 31, 2011

Unless you're using an out-of-memory solution to manage large data objects (such as the RevoScaleR package in Revolution R Enterprise), R always allocates memory for every object in your working session. If you're working with many objects (or even just a few large objects) then you'll need to take care to manage R's memory usage to avoid the...

## sab-R-metrics: Some Extra Visualization Customization

January 31, 2011

In my last post, I described a number of ways to show your data on a scatter plot. Ricky Zanker at THT has a similar post today for those looking to get some extra exposure and another take on R programming. Today, I plan to expand on this with a little more customization. First, if you've missed...

## Tick data retrieval

January 31, 2011

I just published Java-based code to pull tick data from Interactive Brokers. There are thousands of tools to get tick data from IB, but I had one feature in mind. You can get a maximum of 50 quotes per second from Interactive Brokers (it's an IB limitation for the TWS API). Imagine a situation, when there is a

## DataMarket

January 31, 2011

I have just discovered yet another public data site, www.datamarket.com. Most of the data are time series. It brings together sources such as the World Bank, Eurostat and Gapminder in one place. It also allows you to download data as csv files or to creat...

## R Tutorial Series: Two-Way ANOVA with Pairwise Comparisons

January 31, 2011

By extending our one-way ANOVA procedure, we can test the pairwise comparisons between the levels of several independent variables. This tutorial will demonstrate how to conduct pairwise comparisons in a two-way ANOVA. Tutorial Files: Before we begin, yo...

## Example 8.23: Expanding latent class model results

January 31, 2011

In Example 8.21 we described how to fit a latent class model to data from the HELP dataset using SAS and R (using poLCA()), and then followed up in Example 8.22 using randomLCA(). In both entries, we classified subjects based on their observed (manifes...

## A gentle introduction to R

January 31, 2011

Whenever a post on this blog requires some data analysis and perhaps a chart or two, my tool of choice is the versatile statistical programming package R. Developed as an open-source implementation of an engine for the S programming language, R is therefore free. Since commercial mathematical packages can cost thousands of dollars, this alone

## Good riddance to Excel pivot tables

January 30, 2011

Excel pivot tables have been how I have reorganized data... up until now. These are just a couple of examples of why R is superior to Excel for reorganizing data: ################ Good riddance to pivot tables ############ library(reshape2) library(plyr) ...

## ABC model choice not to be trusted [3]

January 30, 2011

On Friday, I received a nice but embarrassing email from Xavier Didelot. He reminded me that I attended the talk he gave at the model choice workshop in Warwick last May, as, unfortunately but rather unsurprisingly given my short memory span, I had forgotten about it! Looking at the slides he attached to his

## Code: parsing Slovenian exchange rate data

January 30, 2011

Some time ago I found myself in need of daily exchange rates for the Slovenian Tolar (though I can’t now remember why). Unfortunately, I wasn’t able to find the data in a readily usable format at the Bank of Slovenia …

## Data Mining with WEKA

January 30, 2011

There are a number of good open source projects for statistics and data mining, for example the software WEKA developed at the University of Waikato. The description on their website states that: Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or

## Statistical Computing and Graphics Newsletter

January 30, 2011

The new issue (Vol. 21, No. 2) is out now. Featured articles are:

- barNest: Illustrating nested summary measures, by Jim Lemon and Ofir Levy
- You say “graph invariant,” I say “test statistic”, by Carey E. Priebe, Glen A. Coppersmith and Andrey Rukhin
- Computation in Large-Scale Scientific and Internet Data Applications is a Focus of MMDS 2010

## Tab completion

January 30, 2011

Let's say your hands are aching from typing too many variable names. What to do? Get a keyboard tray and learn proper ergonomics, of course. But what if you just want to reduce the amount of variable typing you do for reasons of laziness...err...

## R exam

January 30, 2011

I spent most of my Saturday perusing R code to check the answers written by my students to the R exam I gave two weeks ago… The outcome is mostly poor, even though some managed to solve a fair part of the long problem. Except for the few hopeless cases who visibly never wrote a

## Boxplots and Beyond – Part I

Boxplots are a simple and reasonably popular way of summarizing the range of variation of a real-valued variable across different subsets of data.  Typical examples might include diastolic blood pressure across a group of patients, broken dow...