R-bloggers

December 5, 2011

For a long time, I have relied on R-bloggers for new, interesting, arcane, and all around useful information related to R and statistics. Now my R-related material is appearing there. If you use the R package at all, R-bloggers should be in your feed a...

The Art of R Programming – my two cents

December 5, 2011

What makes this book different from other books about R is stated clearly by the author Norman Matloff in the introduction: "This book is not a compendium of the myriad types of statistical methods that are available in the wonderful R package. It r...

The volatility mystery continues

December 5, 2011

How do volatility estimates based on monthly versus daily returns differ? Previously: the post “The mystery of volatility estimates from daily versus monthly returns” and its offspring “Another look at autocorrelation in the S&P 500” discussed what appears to be an anomaly in the estimation of volatility from daily versus monthly data. In recent times … Continue reading...

I may have been hasty…

December 4, 2011

I think one of the real reasons that I haven't liked R is that the default interface blows (sucks, whatever). I just discovered the Eclipse plugin StatET. This thing rules. Contextual help, completion, object browser, data browser, etc. ...

Steve Jobs’ 2005 Stanford Commencement Address

December 4, 2011

Given that there are almost 13 million views of Steve Jobs’ commencement address, I am certain that I missed this video when it went viral. I am glad that I did not see it until now because I may not have appreciated his words of wisdom. And although...

Improved Moving Average?

December 4, 2011

When @quantfblog started following me on Twitter, I was delighted to discover their papers Papailias, Fotis and Thomakos, Dimitrios D., An Improved Moving Average Technical Trading Rule (September 11, 2011). Available at SSRN: http://ssrn.com/abstract...

Introducing Biostatistics to first year LCG students

December 4, 2011

Around two weeks ago I gave a talk via Skype to the first-year students of the Undergraduate Program on Genomic Sciences (LCG in Spanish) at the National Autonomous University of Mexico (UNAM in Spanish). The talk was part of the Introduction to Bioinformatics Seminar Series, whose goal is to familiarize the new students with the bioinformatics...

Non-PD Matrices in R, Cont.

December 3, 2011

Let me preface this post by saying I am getting frustrated with R.  The syntax is not intuitive and the performance for matrix operations is slow.  Using Octave, a free Matlab clone, I can get over 6 Gflops on things that R is doing at less than 2.  After this post, I will focus on the statistical functions of R...

Visualizing Unemployment Data

December 3, 2011

So recently the Bureau of Labor Statistics released the Oct. 2011 unemployment data. This is not a discussion of its validity nor its impact, but it is a post on how to visualize it. This post is also for my posterity; I’ve wanted to be able to do this for a while, and it’ll serve as

On the (statistical) road, workshops and R

December 3, 2011

Things have been a bit quiet at Quantum Forest during the last ten days. Last Monday (Sunday for most readers) I flew to Australia to attend a couple of one-day workshops; one on spatial analysis (in Sydney) and another one … Continue reading →

Comparing model selection methods

December 2, 2011

The standard textbook analysis of different model selection methods, like cross-validation or a validation sample, focuses on their ability to estimate in-sample, conditional or expected test error. However, the other interesting question is to compare the...

O’Reilly’s Data Science Kit – Books

December 2, 2011

It is not as if I don't have enough books (and material on the web) to read. But this list compiled by the O'Reilly team should make any data analyst salivate. http://shop.oreilly.com/category/deals/data-science-kit.do The Books and Video included in the...

Easy cell statistics for factorial designs

December 2, 2011

A common task when analyzing multi-group designs is obtaining descriptive statistics for various cells and cell combinations. There are many functions that can help you accomplish this, including aggregate() and by() in the base installation, summaryBy() in the doBy package, and … Continue reading →

Applications of R in Business Contest: Final Entries

December 2, 2011

The revision period for the Applications of R in Business Contest is now at a close, and the competitors have finalized their entries for a chance at $20,000 in prizes from Revolution Analytics. We're now in the judging phase, where the finalists will be rated on applicability to business, innovation and persuasiveness by an independent panel of judges from...

Week in Review 021211 R Language

Happy last month of 2011. I will fly to Sydney to present a paper at the 24th Australasian Finance & Banking Conference next Thursday, so we may not have a review next week. However, feel free to contact me @a_biao to share any useful post. This week's review is highly concentrated on

Working with Wisconsin Voter Data in Access 2007; Analysis with R.

December 2, 2011

Computer Assisted Reporting: This technical note describes manipulation/analysis of Wisconsin voter registration data from June 2011. Wisconsin voter registration data can be purchased from the Wisconsin Government Accountability Board for $12,500, whic...

Wasting away again in Martingaleville

December 1, 2011

Alright, I better start with an apology for the title of this post. I know, it’s really bad. But let’s get on to the good stuff, or, perhaps more accurately, the really frightening stuff. The plot shown at the top of this post is a simulation of the martingale betting strategy. You’ll find code for

Backtesting with Short positions

December 1, 2011

I want to illustrate backtesting with short positions using an interesting strategy introduced by Woodshedder in the Simple, Long-Term Indicator Near to Giving Short Signal post. This strategy was also analyzed in detail by MarketSci in the Woodshedder’s Long-Term Indicator post. The strategy uses the 5 day rate of change (ROC5) and the 252 day rate

Interviews on Revolution R Enterprise 5.0

December 1, 2011

For those looking for more background behind the updates in Revolution R Enterprise 5.0, there are now a couple of interviews online where I talk about the new release. At IT Business Edge ("Revolution Analytics' Goal: Make R Analysis Enterprise-Friendly"), I had a chat with Loraine Lawson about how Revolution R Enterprise fits within the analytics stack, its big-data...

A Friday round-up

December 1, 2011

Just a brief selection of items that caught my eye this week. Note that this is a Friday as opposed to Friday, lest you mistake this for a new, regular feature. 1. R/statistics ggbio A new Bioconductor package which builds on the excellent ggplot graphics library, for the visualization of biological data. R development master

C++ is dead. Long live C++

December 1, 2011

During the summer I was contacted by a hedge fund from the Bahamas. The fund was looking for someone with R language skills on-site and insisted on a phone interview. Besides obvious questions about finance, statistics, coding and how many tennis balls can fit in a Boeing 747 (ok, this question was omitted), they wanted to know if

NG Spreads returns, a reliable earner.

December 1, 2011

Is Drawdown the Biggest Determinant of System Success?

December 1, 2011

In all my system development, I still have not been able to determine what universal underlying conditions significantly improve a system’s chances of outperforming buy-and-hold.  Also, I have found very little discussion, so maybe R with some h...

Fitting distributions with R

December 1, 2011

Fitting distributions with R is something I have to do once in a while. A good starting point to learn more about distribution fitting with R is Vito Ricci's tutorial on CRAN. I also find the vignettes of the actuar and fitdistrplus packag...

Logistic Regression Explained

December 1, 2011

Logistic regression is a type of regression used when the dependent variable is binary or ordinal (e.g. when the outcome is either “dead” or “alive”). It is commonly used for predicting the probability of occurrence of an event, based on several predictor variables that may be either numerical or categorical. For example, suppose a researcher

Producing Google Map Embeds with R Package googleVis

December 1, 2011

(1) To produce HTML code for a Google Map with the R package googleVis, do something like:
library(googleVis)
df <- data.frame(Address = c("Innsbruck", "Wattens"), Tip = c("My Location 1", "My Location 2"))
mymap <- gvisMap(df, "Addre...

More Dabblings With Local Sentencing Data

December 1, 2011

In Accessing and Visualising Sentencing Data for Local Courts I posted a couple of quick ways in to playing with Ministry of Justice sentencing data for the period July 2010-June 2011 at the local court level. At the end of the post, I wondered about how to wrangle the data in R so that I

Path from root to leaf node in mvpart

December 1, 2011

I was recently asked by an R user how one could extract the “rule” in a classification/regression tree. The requirement was to obtain the path traced from the root node to the leaf nodes, that is, all the paths or “rules”. The path.rpart() function in the mvpart package provides this convenience: library(mvpart) # Create a

quantum forest

December 1, 2011

Thanks to a link on R-bloggers, I was introduced to Luis Apiolaza’s blog, Quantum Forest, which covers data analyses and R comments he encounters in his research as a quantitative forester/geneticist. And he works at the University of Canterbury, Christchurch, where I first taught from Bayesian Core in 2006. Which may be why he chose
