Example 9.15: Bar chart with error bars ("Dynamite plot")

November 22, 2011

The "dynamite plot", a bar chart plotting a mean with an error bar, is one of the most reviled types of image among statisticians. Reasons to dislike them are numerous, and are nicely summarized here. (Edward Tufte also suggests they be avoided.) ...

Do we need to deal with ‘big data’ in R?

November 22, 2011

David Smith at the Revolutions blog posted a nice presentation on “big data” (oh, how I dislike that term). It is a nice piece of work and the Revolution guys manage to process a large number of records, starting with …

Sermon Sentiment Analysis

November 22, 2011

Matt Chandler vs. Mark Driscoll. I came across an interesting API from Viral Heat which is capable of “Sentiment Analysis.” This analysis is designed to capture the sentiment of a statement by ranking it on a scale from -1 to 1. For instance, a chipper sentence like “The smell of roses makes me giddy!” is ...

RPostgreSQL 0.2-0, 0.2-1 and an unsung Open Source hero

November 21, 2011

RPostgreSQL goes back to a topic suggestion I had made for the Google Summer of Code 2008, and specifically for the R Project participation that year. And Sameer Kumar Prayaga (whom I then mentored for the project) did a fine job that summer putting t...

Revolution R Enterprise 5.0 now available for free academic download

November 21, 2011

Revolution R Enterprise 5.0, which we announced last week, is now available for free download to students and faculty at academic institutions worldwide. If you've downloaded Revolution R Enterprise via the academic program before and are on the mailing list, you will have already received an email with download instructions; if not, just complete the form linked below and...

Popular Baby Names Walk-Through Part 2 – Graphing the fast movers

November 21, 2011

I will assume you have read through part 1 and have the csv file loaded. While we covered some basic graphing in the last post, I hope to get into a little more of the data crunching. Specifically I am interested in the names which were driven by a spe...

A Simple R Script for Traders

November 21, 2011

Now that we've got the Python implementation under our belts, let's do the same thing with R. We'll still be able to pass command-line arguments to get a quick look at what our stock of interest is doing during the day. And we start with the familiar i...

Functional and Parallel time series cross-validation

November 21, 2011

Rob Hyndman has a great post on his blog with an example of how to cross-validate a time series model. The basic concept is simple: You start with a minimum number of observations (k), and fit a model (e.g. an arima model) to those observation...
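
A minimal sketch of one common form of this idea (rolling-origin evaluation) follows, written in Python to keep it self-contained rather than in R. The teaser is cut off, so everything after the first fit is an assumption here: forecast the next observation, record the error, then extend the training window by one point. A plain mean forecast stands in for the arima model, and the names `rolling_origin_cv` and `mean_forecast` are hypothetical.

```python
from statistics import mean


def mean_forecast(history):
    """Stand-in model: predict the next value as the mean of the history."""
    return mean(history)


def rolling_origin_cv(series, k, forecast=mean_forecast):
    """Mean absolute one-step-ahead error, starting from k training points.

    For each i >= k, the model sees series[:i] and predicts series[i].
    """
    errors = []
    for i in range(k, len(series)):
        prediction = forecast(series[:i])
        errors.append(abs(series[i] - prediction))
    return mean(errors)


series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
mae = rolling_origin_cv(series, k=3)
print("mean absolute error:", mae)
```

Because the forecaster is passed in as a function, a different model can be swapped in without touching the loop, for example `rolling_origin_cv(series, k=3, forecast=lambda h: h[-1])` for a last-value forecast.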

Magical Russell 2000

November 21, 2011

I have marveled at the magical Russell 2000 in Crazy RUT, but I am still surprised at its behavior through this selloff.  With a 20-day move of 30% (6% in one hour) and big outperformance to the developed and developing world, the Russell 2000 con...

Quick-R Gets a Blog

November 21, 2011

After maintaining the Quick-R website (R tutorials and jumpstart) for the past 5 years, I’ve decided to add a blog so that I can go into more detail on topics related to practical data analysis. The statMethods blog will contain articles …

Asynchrony in market data

November 21, 2011

Be careful if you have global daily data. The issue: markets around the world are open at different times. November 21 for the Tokyo stock market is different from November 21 for the London stock market. The New York stock market has yet a different November 21. The effect: the major effect is that correlations …

Popular Baby Names Walk-Through Part 1 – Web Scrapping and ggploting

November 20, 2011

This is the first walk-through I have posted. Reading these types of posts has been incredibly helpful as I have been learning R and other useful tools in the Unix universe. Hopefully you find it helpful. First, I have been watching Google Python Video...

Indexing Nested Lists

November 20, 2011

I’ve long searched for a somewhat efficient approach to indexing nested lists and/or environments and here’s my best solution so far. For me, being able to compute such an index is the crucial part in order to actually manage such nested structures (which are very helpful in a lot of scenarios where formal classes are …

Cross Pollination from Systematic Investor

November 20, 2011

After reading the fine article Style Analysis from Systematic Investor and What we can learn from Bill Miller and the Legg Mason Value Trust from Asymmetric Investment Returns, I thought I should combine the two in R with the FactorAnalytics package. ...

Matrix Performance in R

November 20, 2011

I've been working on an example of the new Graph Template Language from SAS. As I don't have direct access to SAS 9.2, I've been developing via email with a friend who does. In the meantime, I thought I would start to investigate some of the performance properties of R. I work in the financial risk industry and I often...

Interactive presentations with deck.js

November 20, 2011

Data analysis is often an iterative and interactive process. However, when I present on this subject, I often feel limited by the presentation software I use. It doesn't matter if I use LaTeX/PDF, PowerPoint or Keynote. In all cases it is either ver...

Tikz absolute positioning

November 20, 2011

When working with a tikz drawing within a LaTeX document we might want to locate an object using an absolute position on the page rather than leaving LaTeX to make the decision for us. The use of nodes and the current.page label in conjunction with some other parameters attached to the tikz drawing will allow us ...

RcppArmadillo 0.2.30 (and 0.2.29)

November 20, 2011

A few days ago, Conrad Sanderson released the first pre-release version of what will be Armadillo 2.4.*, giving it the 2.3.91 release handle. We folded this into RcppArmadillo release 0.2.30, with Romain making a few adjustments to our template stru...

CloudStat: Learn & Do R Language on the Cloud

November 19, 2011

Hi! My fellow useRs! I’m making a web-based R Language platform ( http://cloudst.at/ ) for my students. My aim is to shorten the learning curve for R and for collaboration. With CloudStat, there is no more download, installation, update and mai...

Keep your files in sync for free

November 19, 2011

It is not uncommon to have two computers at work, four at home and a server out on the wild, wild internet (that's what we have, anyway ... wait, we forgot one in London). How to keep all these files in sync? Here are our file synchronization tips.

Data is everywhere!

November 19, 2011

I was writing earlier today that I am getting really fed up with using the same datasets over and over again. Of course, using the same data over time with different methods (e.g., look at this) works really well for comparison purposes, but we can still use other data in a web world. For example, you ...

Public vote open for Mendeley-PLoS Binary Battle: vote rOpenSci!

November 19, 2011

http://www.surveygizmo.com/s3/722753/Mendeley-PLoS-Binary-Battle-Public-Vote

randu dataset, part 2

November 19, 2011

In my last post I plotted the randu dataset to show that all its points lie on 15 parallel planes. But I was not fully satisfied with the solution and decided to show this numerically. It can be done in four steps: identifying four points lying...
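
The 15-plane claim can also be checked directly from the generator's arithmetic. The sketch below is in Python and is not the four-step method from the post, which is truncated above. It assumes the standard RANDU recurrence x[k+1] = 65539 * x[k] mod 2^31 and regenerates the numbers itself instead of reading R's `randu` data; the function name `randu` and the seed are illustrative choices.

```python
M = 2**31  # RANDU modulus


def randu(seed, n):
    """Return n successive RANDU integers generated from an odd seed."""
    values = []
    x = seed
    for _ in range(n):
        x = (65539 * x) % M
        values.append(x)
    return values


xs = randu(seed=1, n=3000)
triples = list(zip(xs, xs[1:], xs[2:]))

# Since 65539 = 2**16 + 3, squaring the multiplier gives
# x[k+2] = 6*x[k+1] - 9*x[k] (mod 2**31), so this combination is a multiple of M.
combos = [9 * a - 6 * b + c for a, b, c in triples]
assert all(v % M == 0 for v in combos)

# Scaled to the unit cube, each triple satisfies 9u - 6v + w = j for an integer j.
# With 0 < u, v, w < 1 the only possible j are -5..9, so at most 15 planes.
plane_ids = sorted({v // M for v in combos})
print(len(plane_ids), "distinct planes:", plane_ids)
```

The number of planes actually observed depends on the sample, since the extreme values of j are rare, but it can never exceed 15.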

Plotting randu dataset

November 18, 2011

Recently I stumbled on the help description of the randu data from the datasets package. It contains pseudorandom numbers that are flawed. The help says that "In three dimensional displays it is evident that the triples fall on 15 paralle...

Let the Lagging Lead

November 18, 2011

THIS IS NOT INVESTMENT ADVICE AND WILL PROBABLY WIPE OUT ALL YOUR MONEY IF PURSUED.  While exploring utilities, I discovered a strange phenomenon that I have not quite thoroughly understood, but I attribute to the business cycle.  If I dust o...

Analyzing birth rates from census data from RevoScaleR

November 18, 2011

In yesterday's webinar, "New Features in Revolution R Enterprise 5.0 to Support Scalable Data Analysis", Sue Ranney demonstrated the features of the RevoScaleR big data analysis package included with Revolution R Enterprise. In the webinar, she showed how to use the rxImport function to import big data sets from SAS, SPSS or ODBC, how to use the rxDataStep function...

My talk on doing phylogenetics in R

November 18, 2011

I gave a talk today on doing very basic phylogenetics in R, including getting sequence data, aligning sequence data, plotting trees, doing trait evolution stuff, etc. Please comment if you have code for doing Bayesian phylogenetic inference in R. ...
