If you are writing a book on Bayesian statistics

November 23, 2011

This post is somewhat marginal to R in that there are several statistical systems that could be used to tackle the problem. Bayesian statistics is one of those topics that I would like to understand better, much better, in fact. …


Andrew gone NUTS!

November 23, 2011

Matthew Hoffman and Andrew Gelman have posted a paper on arXiv entitled “The No-U-Turn Sampler: Adaptively Setting Path Lengths in Hamiltonian Monte Carlo” and developing an improvement on the Hamiltonian Monte Carlo algorithm called NUTS (!). Here is the abstract: Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) algorithm that avoids the


DecisionStats review of Revolution R Enterprise 5.0

November 23, 2011

At the DecisionStats blog, Ajay Ohri has published his review of Revolution R Enterprise 5.0. The review includes a slideshow highlighting some of the features of the new release, including the expanded code snippets manager and the new cluster job manager. It's well worth checking out if you'd like a quick overview of what's new in the latest release....


The ‘swst’ package to print statistical results in Sweave

November 23, 2011

When I was making the slides for a lecture on using Sweave to combine R and LaTeX, I was unpleasantly surprised at how tedious it can be to extract statistical values and print them in proper LaTeX code. For example, consider a …


Materials of the “LaTeX for Psychological Researchers” course

November 23, 2011

I have given a course on using LaTeX for psychological researchers. This course consisted of four lectures in which I discussed the following subjects: how to obtain a LaTeX distribution; how to use LaTeX to write professional scientific reports; how to use …


Prague Half Marathon Ranking: 2% or 25% missing?

November 23, 2011

I am a regular participant in the Prague International Half Marathon. In a mass event like this, the horde of runners needs a long time to reach the starting line. To make the times mutually comparable, the “start time” is measured and afterwards subtracted from the “finish time”. Also, the crowd is organized into corridors in such...
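
The subtraction described in this teaser can be sketched in a few lines of Python. This is only an illustration, not the post's own analysis: the function name `net_time`, the corridor labels, and every timing value below are made up for the example.

```python
from datetime import timedelta


def net_time(start_crossing, finish_crossing):
    """Net ("chip") time: finish time minus the moment the start line was crossed.

    Both arguments are offsets from the starting gun (hypothetical convention).
    """
    if finish_crossing < start_crossing:
        raise ValueError("finish time cannot precede start time")
    return finish_crossing - start_crossing


# Invented runners: (time to reach the start line, time at the finish line)
runners = {
    "front corridor": (timedelta(seconds=5), timedelta(hours=1, minutes=30, seconds=5)),
    "back corridor": (timedelta(minutes=6), timedelta(hours=1, minutes=35)),
}

net_times = {name: net_time(start, finish) for name, (start, finish) in runners.items()}

for name, t in net_times.items():
    print(name, t)
```

In this made-up case the runner from the back corridor reaches the finish almost five minutes later by the clock, yet has the faster net time, which is why a ranking on net times can differ from the finishing order.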


An R function to analyze your Google Scholar Citations page

November 23, 2011

Google Scholar has now made Google Scholar Citations profiles available to anyone. You can read about these profiles and set one up for yourself here. I asked John Muschelli and Andrew Jaffe to write me a function that would download my Google Scholar...


Updating CODA – a possible post-doctoral project

November 23, 2011

If you are looking for a post-doctoral position in statistical computing in 2012 then you may be able to help me with rewriting the coda package for R. But if you are interested, you don’t have much time to apply. The …


Volume by Price Charts using R

November 23, 2011

R-Bloggers is a wonderful site which offers some great ideas for analysis. While I have been busy of late, and hence could not do much with R, I was inspired by this post by Eric Nguyen on the Volume by Price chart. This chart can be used with a great effe...


Why we need to deal with big data in R

November 22, 2011

Responding to the birth rates analysis in the post earlier this week on big-data analysis with Revolution R Enterprise, Luis Apiolaza asks at the Quantum Forests blog, do we really need to deal with big data in R? My basic question is why would I want to deal with all those 100 million records directly in R? Wouldn’t it...


Misleading Statistics: Too much risk without a financial adviser?

November 22, 2011

This popular article references a report by financial consulting firms that makes a fairly convincing argument (even though they mostly neglect inferential statistics, and some parts of their argument are misleading, or otherwise not convincing) that 401(k) participants who accept "help" from financial experts take less risk and have better returns than those who do


Magical RUT with GIST

November 22, 2011

In search of better ways to post my R code, I finally discovered how GIST can help make my R blogging easier.  I know I am way behind, and I apologize to my loyal readers for my shortcomings.  Here is yesterday’s Magical Russell 2000 code u...


Time series cross-validation 2

November 22, 2011

In my previous post, I shared a function for parallel time-series cross-validation, based on Rob Hyndman's code. I thought I'd expand on that example a little bit, and share some additional wrapper functions I wrote to test other forecasting...


Example 9.15: Bar chart with error bars ("Dynamite plot")

November 22, 2011

The "dynamite plot", a bar chart plotting a mean with an error bar, is one of the most reviled types of image among statisticians. Reasons to dislike them are numerous, and are nicely summarized here. (Edward Tufte also suggests they be avoided.) ...


Do we need to deal with ‘big data’ in R?

November 22, 2011

David Smith at the Revolutions blog posted a nice presentation on “big data” (oh, how I dislike that term). It is a nice piece of work and the Revolution guys manage to process a large number of records, starting with …


Sermon Sentiment Analysis

November 22, 2011

Matt Chandler vs. Mark Driscoll: I came across an interesting API from Viral Heat which is capable of “Sentiment Analysis.” This analysis is designed to capture the sentiment of a statement by ranking it on a scale from -1 to 1. For instance, a chipper sentence like “The smell of roses makes me giddy!” is


RPostgreSQL 0.2-0, 0.2-1 and an unsung Open Source hero

November 21, 2011

RPostgreSQL goes back to a topic suggestion I had made for the Google Summer of Code 2008, and specifically for the R Project participation that year. And Sameer Kumar Prayaga (whom I then mentored for the project) did a fine job that summer putting t...


Revolution R Enterprise 5.0 now available for free academic download

November 21, 2011

Revolution R Enterprise 5.0, which we announced last week, is now available for free download to students and faculty at academic institutions worldwide. If you've downloaded Revolution R Enterprise via the academic program before and are on the mailing list, you will have already received an email with download instructions; if not, just complete the form linked below and...


Popular Baby Names Walk-Through Part 2 – Graphing the fast movers

November 21, 2011

I will assume you have read through part 1 and have the csv file loaded. While we covered some basic graphing in the last post, I hope to get into a little more of the data crunching. Specifically, I am interested in the names which were driven by a spe...


A Simple R Script for Traders

November 21, 2011

Now that we've got the Python implementation under our belts, let's do the same thing with R. We'll still be able to pass command-line arguments to get a quick look at what our stock of interest is doing during the day. And we start with the familiar i...


Functional and Parallel time series cross-validation

November 21, 2011

Rob Hyndman has a great post on his blog with an example of how to cross-validate a time series model. The basic concept is simple: you start with a minimum number of observations (k), and fit a model (e.g. an ARIMA model) to those observation...
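
The teaser cuts off before the procedure is spelled out, so the following is only a minimal sketch of the usual expanding-window ("rolling origin") scheme, under stated assumptions: it is written in Python rather than R, it is not Rob Hyndman's or the post author's code, and the names `rolling_origin_cv`, `naive_model` and `mean_model` as well as the toy series are hypothetical. Two trivial forecasters stand in for the ARIMA model so that the example needs only the standard library.

```python
from statistics import mean


def rolling_origin_cv(series, k, fit, horizon=1):
    """Absolute forecast errors from an expanding-window evaluation.

    For each origin `end` >= k, fit on series[:end] and compare the
    `horizon`-step-ahead forecast with the value actually observed.
    """
    errors = []
    for end in range(k, len(series) - horizon + 1):
        train = series[:end]
        forecast = fit(train)(horizon)       # fit returns a forecasting function
        actual = series[end + horizon - 1]
        errors.append(abs(actual - forecast))
    return errors


def naive_model(train):
    """Stand-in model: forecast the last observed value."""
    last = train[-1]
    return lambda h: last


def mean_model(train):
    """Stand-in model: forecast the mean of the training window."""
    avg = mean(train)
    return lambda h: avg


series = [10, 12, 13, 12, 15, 16, 18, 17, 19, 21]   # made-up data
naive_errors = rolling_origin_cv(series, k=4, fit=naive_model)
mean_errors = rolling_origin_cv(series, k=4, fit=mean_model)
naive_mae = mean(naive_errors)
mean_mae = mean(mean_errors)
print(naive_mae, mean_mae)
```

Because each origin is evaluated independently of the others, the loop body is the part that a parallel version would distribute across workers.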


Magical Russell 2000

November 21, 2011

I have marveled at the magical Russell 2000 in Crazy RUT, but I am still surprised at its behavior through this selloff.  With a 20-day move of 30% (6% in one hour) and big outperformance to the developed and developing world, the Russell 2000 con...


Quick-R Gets a Blog

November 21, 2011

After maintaining the Quick-R website (R tutorials and jumpstart) for the past 5 years, I’ve decided to add a blog so that I can go into more detail on topics related to practical data analysis. The statMethods blog will contain articles …


Asynchrony in market data

November 21, 2011

Be careful if you have global daily data. The issue: markets around the world are open at different times. November 21 for the Tokyo stock market is different from November 21 for the London stock market. The New York stock market has yet a different November 21. The effect: the major effect is that correlations …


Popular Baby Names Walk-Through Part 1 – Web Scrapping and ggploting

November 20, 2011

This is the first walk-through I have posted. Reading these types of posts has been incredibly helpful as I have been learning R and other useful tools in the Unix universe. Hopefully you find it helpful. First, I have been watching Google Python Video...


Indexing Nested Lists

November 20, 2011

I’ve long searched for a somewhat efficient approach to indexing nested lists and/or environments and here’s my best solution so far. For me, being able to compute such an index is the crucial part in order to actually manage such nested structures (which are very helpful in a lot of scenarios where formal classes are …


Cross Pollination from Systematic Investor

November 20, 2011

After reading the fine article Style Analysis from Systematic Investor and What we can learn from Bill Miller and the Legg Mason Value Trust from Asymmetric Investment Returns, I thought I should combine the two in R with the FactorAnalytics package...


Matrix Performance in R

November 20, 2011

I've been working on an example of the new Graph Template Language from SAS. As I don't have direct access to SAS 9.2, I've been developing via email with a friend who does. In the meantime, I thought I would start to investigate some of the performance properties of R. I work in the financial risk industry and I often...


Interactive presentations with deck.js

November 20, 2011

Data analysis is often an iterative and interactive process. However, when I present on this subject, I often feel limited by the presentation software I use. It doesn't matter if I use LaTeX/PDF, PowerPoint or Keynote. In all cases it is either ver...
