abbreviating personality measures in R: a tutorial

March 31, 2010
A while back I blogged about a paper I wrote that uses genetic algorithms to abbreviate personality measures with minimal human intervention. In the paper, I promised to put the R code I used online, so that other people could download and use it. I put off doing that for a long time, because the

Social Media Analytics Research Toolkit Is Moving Into Private Beta

March 31, 2010
Download "Getting Started with the Social Media Analytics Research Toolkit" (PDF, 1.25 megabytes). Download the Social Media Analytics Research Toolkit. My Social Media Analytics Research Toolkit is about to move into private beta. What's in the release?...

March 31, 2010
Adam Bonica, a grad student in political science at NYU, recently published a ranking of the political slant of various professions, based on the amount and recipient (Republican or Democratic) of political donations by lawyers, lobbyists, physicians and many other occupations. This paper (PDF) gives the complete analysis, but the chart below (created using the ggplot2 graphics package in...

Why isn’t my 2X Ultra ETF keeping pace with the market and what is path asymmetry (R ex)?

March 31, 2010
I've been reading a few articles lately, lambasting ultra ETFs for not keeping up with markets and ascribing the problem to weird unexplainable reasons such as portfolio derivative re-balancing and negative drift. I thought it would be nice to revisit...
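
A common way to see why a daily-rebalanced 2x fund can lag twice the index return is to compound a short path by hand. The sketch below is an illustration only, not the code from the post: it assumes a hypothetical fund that delivers exactly twice each daily return, with no fees or tracking error, and the two-day path is made up.

```python
def compound(daily_returns, leverage=1.0):
    """Cumulative return from compounding (leveraged) daily returns."""
    value = 1.0
    for r in daily_returns:
        value *= 1.0 + leverage * r
    return value - 1.0


# Hypothetical two-day path: up 10%, then down 10%.
path = [0.10, -0.10]

index_return = compound(path)              # 1.10 * 0.90 - 1 = -1%
etf_return = compound(path, leverage=2.0)  # 1.20 * 0.80 - 1 = -4%

print(f"index: {index_return:.2%}")
print(f"2x ETF: {etf_return:.2%}")
print(f"2 x index return: {2 * index_return:.2%}")
```

The fund loses 4% while twice the index return is only -2%; the gap comes purely from compounding along the path, not from any rebalancing mystery.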

Predicting April month return

March 31, 2010
Bespoke blogged about average monthly returns of the DJI and emphasized April. Before jumping on that information, let’s check some weak points. In that post, only average returns are presented. We need at least the extreme points (min, max) and confidence ranges. Second problem: the market normally has an upward trend, and we need to get rid of

Lotka-Volterra model ~ intro

March 30, 2010
Many people know about the Lotka-Volterra model (i.e. the predator-prey model) in ecology. This model portrays two species, the predator (y) and the prey (x), interacting with each other in a limited space. The prey grows at a linear rate () and gets eaten by the predator at the rate of (). The predator gains a certain

Some Code for Dumping Data from Twitter Gardenhose

March 30, 2010
Gardenhose is a Streaming API feed that continuously sends a sample (roughly 15% according to Ryan Sarver at the 140tc in September 2009) of all tweets to feed recipients. This is some code for dumping the tweets to files named by date and hour. It is in PHP which is not my favorite language, but works nonetheless. I received...

TTR_0.20-2 on CRAN

March 30, 2010
An updated version of TTR is now on CRAN. It fixes a couple of bugs and includes a couple of handy tweaks. Here's the full contents of the CHANGES file: TTR version 0.20-2. Changes from version 0.20-1. NEW FEATURES: Added VWAP and VWMA (thanks to Brian Peterson...

Scientists misusing Statistics

March 30, 2010
In ScienceNews this month, there's a controversial article exposing the fact that results claimed to be "statistically significant" in scientific articles aren't always what they're cracked up to be. The article, titled "Odds Are, It's Wrong", is interesting, but I take a bit of an issue with the sub-headline, "Science fails to face the shortcomings of Statistics". As it...

Example 7.30: Simulate censored survival data

March 30, 2010
To simulate survival data with censoring, we need to model the hazard functions for both time to event and time to censoring. We simulate both event times from a Weibull distribution with a shape parameter of 1 (this is equivalent to an exponential ra...
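
The general idea can be sketched in a few lines: draw a latent event time and a latent censoring time for each subject, then record whichever comes first along with an indicator of which one it was. This is a minimal Python sketch rather than the code from the post; the function name, the scale values, and the seed are assumptions chosen for illustration, and a Weibull shape of 1 is used so that both draws reduce to exponential times.

```python
import random


def simulate_censored(n, event_scale=1.0, censor_scale=2.0, shape=1.0, seed=42):
    """Return a list of (observed_time, event_observed) pairs.

    event_scale, censor_scale and seed are illustrative assumptions.
    With shape=1.0 the Weibull draws are exponential.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # random.weibullvariate(alpha, beta): alpha is scale, beta is shape.
        event_time = rng.weibullvariate(event_scale, shape)
        censor_time = rng.weibullvariate(censor_scale, shape)
        observed_time = min(event_time, censor_time)
        event_observed = event_time <= censor_time  # False means censored
        rows.append((observed_time, event_observed))
    return rows


data = simulate_censored(1000)
event_fraction = sum(1 for _, observed in data if observed) / len(data)
print(f"fraction of subjects with an observed event: {event_fraction:.2f}")
```

With exponential times, the chance that the event precedes censoring is the event rate divided by the sum of the two rates, so these scales should leave roughly two thirds of subjects uncensored.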