## Plots in R and the ImageJ visualization

April 1, 2010

If you plot data in R and would like to display the same data in the ImageJ view, you need to transfer the data matrix to ImageJ. The first thing you will notice is that the image data is displayed rotated because of the Bio7 approach to transferring data back and forth...

## abbreviating personality measures in R: a tutorial

March 31, 2010

A while back I blogged about a paper I wrote that uses genetic algorithms to abbreviate personality measures with minimal human intervention. In the paper, I promised to put the R code I used online, so that other people could download and use it. I put off doing that for a long time, because the

## Social Media Analytics Research Toolkit (SMART@znmeb) Is Moving Into Private Beta

March 31, 2010

Download "Getting Started with the Social Media Analytics Research Toolkit" (pdf, 1.25 megabytes). Download the Social Media Analytics Research Toolkit. My Social Media Analytics Research Toolkit is about to move into private beta. What's in the release?...

March 31, 2010

Adam Bonica, a grad student in political science at NYU, recently published a ranking of the political slant of various professions, based on the amount and recipient (Republican or Democratic) of political donations by lawyers, lobbyists, physicians and many other occupations. This paper (PDF) gives the complete analysis, but the chart below (created using the ggplot2 graphics package in...

## Why isn’t my 2X Ultra ETF keeping pace with the market and what is path asymmetry (R ex)?

March 31, 2010

I've been reading a few articles lately, lambasting ultra ETFs for not keeping up with markets and ascribing the problem to weird unexplainable reasons such as portfolio derivative re-balancing and negative drift. I thought it would be nice to revisit...
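
As a minimal sketch of the compounding effect behind this question (not the post's own code), the R snippet below takes two hypothetical daily index returns, +10% followed by -10%, and compares a daily-rebalanced 2x fund with twice the index's two-day return. The variable names and the numbers are illustrative assumptions.

```r
# Hypothetical daily index returns: up 10%, then down 10%
index_returns <- c(0.10, -0.10)
leverage <- 2

# Cumulative return of the index over the two days
index_total <- prod(1 + index_returns) - 1

# A daily-rebalanced 2x fund doubles each day's return, then compounds
etf_total <- prod(1 + leverage * index_returns) - 1

# Naive expectation: twice the index's cumulative return
naive_total <- leverage * index_total

round(c(index = index_total, etf = etf_total, naive = naive_total), 4)
```

In this toy case the index loses 1% over the two days, while the 2x fund loses 4% rather than 2%. The gap comes purely from compounding doubled daily returns, so it depends on the path the index takes and not only on where it ends up.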

## Predicting April month return

March 31, 2010

Bespoke blogged about average monthly returns of the DJI and emphasized April. Before jumping on that information, let’s check some weak points. In that post, only average returns are presented. We need at least the extreme points (min, max) and confidence ranges. Second problem – the normal market has an upward trend and we need to get rid of...

## Lotka-Volterra model ~ intro

March 30, 2010

Many people know about the Lotka-Volterra model (i.e. the predator-prey model) in ecology. This model portrays two species, the predator (y) and the prey (x), interacting with each other in a limited space. The prey grows at a linear rate () and gets eaten by the predator at the rate of (). The predator gains a certain...
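
For readers who want to experiment, here is a minimal base R sketch of the classic two-species system. It is not the post's code. The parameter names (`alpha`, `beta`, `delta`, `gamma`), their values, the starting populations, and the helper functions `lv_rhs` and `simulate_lv` are all illustrative assumptions. The equations are the textbook form, dx/dt = alpha\*x - beta\*x\*y and dy/dt = delta\*x\*y - gamma\*y, integrated with a fixed-step Runge-Kutta scheme.

```r
# Lotka-Volterra right-hand side; parameter names are illustrative assumptions:
#   alpha = prey growth rate, beta = predation rate,
#   delta = predator gain per prey eaten, gamma = predator death rate
lv_rhs <- function(state, p) {
  x <- state[["x"]]
  y <- state[["y"]]
  c(x = p$alpha * x - p$beta * x * y,
    y = p$delta * x * y - p$gamma * y)
}

# Fixed-step fourth-order Runge-Kutta integrator
simulate_lv <- function(state, p, dt = 0.01, steps = 2000) {
  out <- matrix(NA_real_, nrow = steps + 1, ncol = 2,
                dimnames = list(NULL, c("x", "y")))
  out[1, ] <- state
  for (i in seq_len(steps)) {
    k1 <- lv_rhs(state, p)
    k2 <- lv_rhs(state + dt / 2 * k1, p)
    k3 <- lv_rhs(state + dt / 2 * k2, p)
    k4 <- lv_rhs(state + dt * k3, p)
    state <- state + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    out[i + 1, ] <- state
  }
  out
}

# Hypothetical parameter values and starting populations
params <- list(alpha = 1.0, beta = 0.1, delta = 0.02, gamma = 0.5)
traj <- simulate_lv(c(x = 40, y = 9), params)
head(traj)
```

Calling `matplot(traj, type = "l")` on the result should show the two populations cycling out of phase with each other.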

## Some Code for Dumping Data from Twitter Gardenhose

March 30, 2010

Gardenhose is a Streaming API feed that continuously sends a sample (roughly 15% according to Ryan Sarver at the 140tc in September 2009) of all tweets to feed recipients. This is some code for dumping the tweets to files named by date and hour. It is in PHP, which is not my favorite language, but it works nonetheless. I received...

## TTR_0.20-2 on CRAN

March 30, 2010

An updated version of TTR is now on CRAN. It fixes a couple bugs and includes a couple handy tweaks. Here's the full contents of the CHANGES file: TTR version 0.20-2. Changes from version 0.20-1. NEW FEATURES: Added VWAP and VWMA (thanks to Brian Peterson...

## Scientists misusing Statistics

March 30, 2010

In ScienceNews this month, there's a controversial article exposing the fact that results claimed to be "statistically significant" in scientific articles aren't always what they're cracked up to be. The article -- titled "Odds Are, It's Wrong" -- is interesting, but I take a bit of an issue with the sub-headline, "Science fails to face the shortcomings of Statistics". As it...

## Example 7.30: Simulate censored survival data

March 30, 2010

To simulate survival data with censoring, we need to model the hazard functions for both time to event and time to censoring. We simulate both event times from a Weibull distribution with a scale parameter of 1 (this is equivalent to an exponential ra...
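
The idea can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the example's own code: the sample size, the seed, and the censoring scale of 2 are hypothetical choices. In R's `rweibull` parametrization, setting `shape = 1` gives an exponential distribution.

```r
set.seed(42)  # arbitrary seed for reproducibility
n <- 1000     # hypothetical sample size

# With shape = 1, R's Weibull reduces to an exponential distribution
event_time  <- rweibull(n, shape = 1, scale = 1)
censor_time <- rweibull(n, shape = 1, scale = 2)  # hypothetical censoring scale

# Observed time is whichever comes first; status = 1 means the event was seen
obs_time <- pmin(event_time, censor_time)
status   <- as.integer(event_time <= censor_time)

sim <- data.frame(time = obs_time, status = status)
mean(sim$status)  # observed event fraction
```

With these two independent exponential times (rates 1 and 0.5), the event fraction should be close to 2/3. The resulting `time` and `status` columns are in the shape that survival modeling functions usually expect.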

## Smoothing time series with R

March 29, 2010

Smoothing is a statistical technique that helps you to spot trends in noisy data, and especially to compare trends between two or more fluctuating time series. It's a useful visualization tool that I'm pleased to see cropping up more and more in statistical graphics on the Web -- it's now a staple in econometric charts and is heavily used...
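
As a small illustration, the sketch below smooths a simulated noisy series with base R's `lowess()`. The data, the seed, and the span `f = 0.2` are assumptions made for the example. The post itself may use a different smoother.

```r
set.seed(1)  # arbitrary seed for reproducibility
time_idx <- 1:200
trend <- sin(time_idx / 30)                        # hypothetical underlying trend
noisy <- trend + rnorm(length(time_idx), sd = 0.4) # trend plus random noise

# lowess() fits a locally weighted regression; f is the smoother span
smoothed <- lowess(time_idx, noisy, f = 0.2)

plot(time_idx, noisy, col = "grey60", pch = 16, xlab = "time", ylab = "value")
lines(smoothed, col = "red", lwd = 2)
```

A larger `f` gives a smoother but less responsive curve. A smaller `f` follows the data more closely at the cost of picking up more noise.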

## Looking for Software Paths in Windows Registry

March 28, 2010

When we want to call external programs in R under Windows, we often need to know the paths of these programs. For instance, we may want to know where ImageMagick is installed, as we need the convert (convert.exe) utility to convert images to other formats, or where OpenBUGS is installed because we need this path...

## Example 7.29: Bubble plots colored by a fourth variable

March 27, 2010

In Example 7.28, we generated a bubble plot showing the relationship among CESD, age, and number of drinks, for women. An anonymous commenter asked whether it would be possible to color the circles according to gender. In the comments, we showed simp...

## Finance::YahooQuote 0.24

March 26, 2010

Having espoused rule number one in regression testing in the post about yesterday's bug fix upload 0.23, we can now add rule number zero: Do not introduce a new error by omitting the trailing semicolon. I guess it shows that I don't really program in...

## Rcpp 0.7.11

March 26, 2010

A new version, 0.7.11, of Rcpp is awaiting inclusion into CRAN and Debian. It is also available from here. This version fixes a somewhat serious bug uncovered by Doug Bates when working with vectors of strings. We also added a few new accessor function...

## ‘R’ = dna.translate("AGG") . A custom C function for R, My notebook.

March 26, 2010

In the following post, I will show how I've implemented a custom C function for R. This C function will translate a DNA to a protein. I'm very new to 'R' so feel free to make any comment about the code. C code: The data in 'R' are stored in an opaque stru...

## Code Highlights in WordPress

March 26, 2010

I’ve come across a very useful plugin for WordPress which highlights code in posts using GeSHi called WP-Syntax. This plugin is easy to use and adds highlights simply by putting the appropriate tags around code blocks. For instance, we can make the following R code much more readable by using WP-Syntax. ## Generate 100

## Predicting Pizza

March 26, 2010

What's the secret to the best pizza in New York? That's what statistical consultant and R user Jared Lander sought to find out, by analyzing the rankings of NY pizza joints at MenuPages.com, and building a regression model for ratings based on variables like location, price, number of reviews, and pizza-oven type (gas, coal or wood). Here's a scatterplot...

## Summarising data using dot plots

March 26, 2010

A dot plot is a type of display that compares counts, frequencies, totals or other summary measures for a series of categories. The dot plot can be arranged with the categories on either the vertical or the horizontal axis of the display to allow comparison between the different categories as well as comparison within categories where...

## BioMart (and biomaRt)

March 26, 2010

I’ve been vaguely aware of BioMart for a few years. Inexplicably, I’ve only recently started to use it. It’s one of the most useful applications I’ve ever used. The concept is simple. You have a set of identifiers that describe a biological object, such as a gene. These are called filters. They have values –

## Finance::YahooQuote 0.23

March 25, 2010

Rule number one in regression testing is to not depend on volatile data, which I seem to have violated in file t/02simple.t in the Perl package Finance::YahooQuote. That led the automated Perl test scripts to remind me for a few days now that the f...

## How Misinformed are Tea Party Protesters About Tax Policy?

March 25, 2010

For those of you used to reading about international relations, I apologize for the following brief foray into American politics. It appears that the American Enterprise Institute and David Frum have decided to (abruptly) part ways. Before David left, however, he and his team of interns provided some interesting statistical insight into the