Data Aggregation in R: plyr, sqldf and data.table

April 28, 2011

I’ve previously put up a couple of posts about aggregating data in R. In this post, I’m going to try some alternative methods for aggregating the dataset. Before I begin, I’d like to thank Matthew Dowle for highlighting these to me. It’s a bit daunting at first, deciding which method of aggregating data is...

Read more »

Day #30-31 errorbars here, errorbars there

April 28, 2011

Today I have been playing with the error bars in knime. To recreate the plot from http://flyordie.sin.khk.be/2011/04/20/day-27-a-lot-of-graphics-in-one-place/ I had to be able to create two y-axes and multiple plots on one graph. At the end of the day I ...

Read more »

Job Search Part 5: It’s Policy Time!

April 27, 2011

This is the last post of this special mini-series on the job search and matching theory of unemployment. I will probably be extremely distracted for the next few months, including a month-long vacation in Europe to shake the horrors of undergrad off me...

Read more »

“Inside” Functors — Multiple Arguments

April 27, 2011

Again for HTML reasons this has been taken to http://strugglingthroughproblems.blogspot.com/2011/04/inside-functors-multiple-arguments.html

Read more »

A test of Ledoit-Wolf versus a factor model

April 27, 2011

Statistical factor models and Ledoit-Wolf shrinkage are competing methods for estimating variance matrices of returns. So which is better? This adds a data point for answering that question. Previously: there are past blog posts on the idea of variance matrices and on factor models of variance. The data in this post are from the blog posts: “Weight … Continue reading...

Read more »

How to make 3-D graphics from SAS data

April 27, 2011

The blog SAS Analysis shows how to create 3-D images from SAS data ... using R: Some SAS programmers like to use SAS/IML to call R’s functions. However, it seems that SAS/IML fails to work with the latest versions of R since 2.12. Others tend to play tricks to call R into SAS’s data step...

Read more »

DST is a b!tch, be careful with POSIX in a stupid timezone

April 27, 2011

At the department we have been analyzing some transaction data for some time. We got a new dataset with lots of transactions. Once you need interpurchase times (IPTs), POSIX is quite useful, as you can easily difference transactions to generate IPTs. So ...
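The differencing step described in this excerpt can be sketched in base R. This is only an illustration under stated assumptions, not the author's code: the timestamps are made up, and `Europe/Berlin` is an assumed timezone chosen because it switched to daylight saving time on 27 March 2011. The same wall-clock strings are parsed once in that timezone and once in UTC, and consecutive transactions are then differenced.

```r
# Hypothetical transaction timestamps for one customer (not from the post).
stamps <- c("2011-03-26 12:00:00",
            "2011-03-27 12:00:00",
            "2011-03-28 12:00:00")

# Parse the same wall-clock strings under two timezone assumptions.
local <- as.POSIXct(stamps, tz = "Europe/Berlin")  # assumed zone, observes DST
utc   <- as.POSIXct(stamps, tz = "UTC")            # no DST

# Interpurchase times in hours: difference consecutive transactions.
ipt_local <- as.numeric(diff(local), units = "hours")
ipt_utc   <- as.numeric(diff(utc),   units = "hours")

print(ipt_local)  # the first gap spans the DST switch, so it is one hour short
print(ipt_utc)    # both gaps are a full 24 hours
```

Whether the shorter first gap is the "right" answer depends on whether the analysis cares about elapsed time or calendar days, which is why the timezone attached to the data deserves a check before differencing.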

Read more »

VideoLectures.net Recommender System Competition

April 27, 2011

A guest post to R-bloggers by Bart Blaszczyk. This week a new data competition for the best recommendation system begins. Similar in form to the famous Netflix Prize, it asks data scientists, algorithm geeks and statisticians to devise the most accurate algorithm that suggests, in a personalized way, what movies may be of interest for visitors...

Read more »

Day #28 ggplot2 in knime

April 27, 2011

If you haven’t read yesterday’s post (Day #27: A lot of graphics in one place), I advise you to do so, because this is the fix for yesterday’s problem. I found out how to use ggplot2 in knime. Say, for example, your code is this: library(ggplot2) myplot...

Read more »

Bug Collector

April 26, 2011

Most are quite unamused to find an ant infestation in their kitchen around this time of year. Time to start spraying that stuff that's not supposed to be harmful to humans but that you always wonder if it is anyway. Some bugs, such as ant attacks or so...

Read more »

Plotting Maps with R

April 26, 2011

I stumbled upon this tutorial while not studying and I thought it would be fun to try and plot maps of the San Francisco Bay Area household income, education, population density, poverty, etc. To do this I needed a shapefile for the Bay Area zip codes similar to the London borough file used in...

Read more »

“Inside” Functors

April 26, 2011

So, WordPress doesn’t like RweaveHTML, so this is posted at http://strugglingthroughproblems.blogspot.com/2011/04/inside-functors.html

Read more »

Great FAJ Article on Statistical Measure of Financial Turbulence Part 3

April 26, 2011

Building on posts Great FAJ Article on Statistical Measure of Financial Turbulence and Great FAJ Article on Statistical Measure of Financial Turbulence Part 2, I will now build a system incorporating a new correlation-based measure of turbulence and a ...

Read more »

Revolution R Enterprise 4.3 now available

April 26, 2011

We've just released the latest update to Revolution R Enterprise, version 4.3. This release includes version 2.12.2 of the open-source R engine linked with high-performance libraries, and packages it with additional functionality from Revolution Analytics for big data statistics, web services deployment, a Windows UI for R programming and debugging, and much more. Detailed information about the new release...

Read more »

New functions for linear model inference in Revolution R Enterprise 4.3

April 26, 2011

The latest release of Revolution R Enterprise shows how Revolution Analytics’ package for big data, RevoScaleR, is continuing to add new capabilities for Big Data statistics. RevoScaleR removes the limits on the size of the data that can be processed in R through the use of the highly efficient .Xdf binary file format. Xdf stores data by rows within columns...

Read more »

Great FAJ Article on Statistical Measure of Financial Turbulence Part 2

April 26, 2011

I did not intend for this to be a multi-part series, but after some clear thinking at the beach over the weekend, I decided that it needed some more analysis. For those of you who have read the article or know Mahalanobis distance, the measure I pre...

Read more »

Automatically Save Your Plots to a Folder

April 26, 2011

Suppose you're working on a problem that involves a loop for calculations. At each iteration inside the loop, you want to construct a plot. Not only do you want to see the plot, but you would like to save each plot for a presentation, report or paper...
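The loop-and-save pattern described in this excerpt can be sketched in base R as follows. This is a minimal illustration rather than the post's own code: the output folder, the file names and the stand-in calculation are all assumptions, and `pdf()` is used as the graphics device simply because it is available in every R build.

```r
# Assumed output folder (hypothetical; the post's own path is not shown).
out_dir <- file.path(tempdir(), "plots")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

set.seed(1)
for (i in 1:3) {
  x <- rnorm(100, mean = i)                  # stand-in for the per-iteration calculation
  path <- file.path(out_dir, sprintf("plot_%02d.pdf", i))
  pdf(path, width = 6, height = 4)           # open a file-based graphics device
  hist(x, main = paste("Iteration", i))      # draw the plot into that device
  dev.off()                                  # close the device so the file is written
}

# List what was saved.
saved <- list.files(out_dir, pattern = "\\.pdf$")
print(saved)
```

To also see each plot on screen, one option is to draw it a second time outside the `pdf()` / `dev.off()` pair; zero-padded file names keep the saved plots in iteration order when sorted.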

Read more »

Bayesian job in Cambridge

April 26, 2011

Here is an email that could appeal to some readers: Job in Cambridge, MRC-BSU – Bayesian statistician. Career development fellow, MRC Biostatistics Unit, Cambridge. We are offering an exciting opportunity to work on Bayesian models for infectious disease dynamics. A statistician is required to contribute to a programme of research to develop inferential approaches to...

Read more »

R Bloggers

April 26, 2011

I recently found a great resource for R in the blogosphere, the R Bloggers Blog Aggregator. Basically, the site aggregates posts from a bunch of blogs about R (like this one!) into a giant feed of uses for R. If you are interested in learning more ab...

Read more »

Running Phylip’s contrast application for trait pairs from R

April 26, 2011

Here is some code to run Phylip's contrast application from R and get the output within R to easily manipulate yourself. Importantly, the code is written specifically for trait pairs only as the regular expression work in the code specifically grabs da...

Read more »

Statistical Practice in Epidemiology using R

April 26, 2011

This is a long-running course which usually takes place in Tartu, Estonia. This year we are hosting it at IARC in Lyon, France. The course is intended for epidemiologists and statisticians who wish to use R for statistical modelling … Continue reading →

Read more »

Adonis (PERMANOVA) – Assumptions

April 26, 2011

Before you use PERMANOVA (R-vegan function adonis) you should read the user notes for the original program by the author (Marti J. Anderson) who first came up with this method. An important assumption for PERMANOVA is same "multivariate spread"...

Read more »

Designing and Analyzing Studies with Optmatch and RItools (Part 1)

April 25, 2011

I am currently writing a brief “how-to” for the APSA Section on Experimental Research newsletter on using Optmatch and RItools. The complete paper (a work in progress) can be found on my github page. I have the basics of the paper sketched in, but I would love to get feedback from the online R community,...

Read more »

A Tiny Model of Evolution

April 25, 2011

I've always wanted to write a(n overly) simple model of evolution. The assumptions are minimalistic: only one species, for which each individual's genotype is represented as a one-dimensional real number, e.g. 7.4. Now, the fun stuff: I define a fu...

Read more »

4 lines of R to get you started using the Rook web server interface

April 25, 2011

Jeffrey Horner's new Rook package provides a new interface for developing R-based web applications. Rook allows the same application to run in R's built-in web server or (soon) in the rApache module. This post shows how easy it is to use the package's Rhttpd class to get started with Rook.

Read more »

Job Search Part 4: Timing Beveridge Curve Movements During A Recession

April 25, 2011

This economics blogger feels like he would be cheating the reader if he did not include recent work done by Barnichon and Figura (2010) on timing movements in the unemployment rate during recessions. That is why this is part 4 of my special 5 part mini...

Read more »