Backtesting a Simple Stock Trading Strategy: Part 3

October 17, 2011

Note: This post is NOT financial advice! This is just a fun way to explore some of the capabilities R has for importing and manipulating data. In a previous post, I examined a simple stock trading strategy: Find the high point over the la...

Read more »

Tikz Nodes

October 17, 2011

Nodes are used in tikz to place content in a picture as part of a LaTeX document. When creating a tikz picture, the origin is assumed to be at (0,0) and objects are positioned relative to the origin of the picture. If we wanted to add a grid with

Read more »

Installing rgdal on a Mac

October 16, 2011

Installing rgdal, an important R package for spatial data analysis, can be a bit of a pain on the Mac. Here are two ways to make it happen.

The easy way: in R, run install.packages('rgdal',repos="http://www.stats.ox.ac.uk/pub/RWin")

The hard way: download and install GDAL 1.8 Complete and PROJ framework v4.7.0-2 from: http://www.kyngchaos.com/software/frameworks%29 Download the latest version of rgdal from CRAN.

Read more »

Running SQL Queries in R With the SQLDF Package

October 16, 2011

The sqldf package can be used to run SQL queries on R data frames. The user simply needs to specify a SQL statement enclosed in quotation marks within the sqldf() function. In the following R code, you see various ways of using the sqldf package to run SQL queries on R data frames. The sql
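
To make the idea concrete, here is a minimal sketch of a single query. It is not the code from the original post: the `scores` data frame and its columns are made up for illustration, and the sketch assumes the sqldf package is installed.

```r
# Minimal sketch: query an ordinary data frame with SQL via sqldf().
# Assumes the sqldf package is installed; `scores` is made-up example data.
library(sqldf)

scores <- data.frame(
  team   = c("a", "a", "b", "b", "b"),
  points = c(10, 14, 7, 9, 11)
)

# The SQL statement is passed as a quoted string, and the name of the
# data frame is used as the table name.
team_means <- sqldf("SELECT team, AVG(points) AS mean_points
                     FROM scores
                     GROUP BY team
                     ORDER BY team")
print(team_means)
```

The result is itself a data frame, so it can be passed straight on to other R functions.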

Read more »

Geo-doodlers – Paul Butler and FlowingData

October 16, 2011

I found this great R-Visualization example via an R-Blogger post that xingmowang made. (One more good reason why it is important to read lots of field-related blogs!) Here's the image: If this was merely eye-candy, I would have enjoyed it, but not in...

Read more »

Linear mixed models in R

October 16, 2011

A substantial part of my job has little to do with statistics; nevertheless, a large proportion of the statistical side of things relates to applications of linear mixed models. The bulk of my use of mixed models relates to the …

Read more »

R tells you where weapons go

October 16, 2011

As an amateur programmer (one without proper training in any mainstream programming language such as C or Java), the more I use R, the more I understand the saying “You are only bounded by your imagination”. The other day I …

Read more »

pscl 1.04 live on CRAN

October 15, 2011

Update to my pscl package, now on CRAN. Biggest change: fixing a bug in the way MCMC draws for item parameters were being stored and summarized by ideal.

Read more »

National Gallery of Ireland

October 15, 2011

During a short if profitable visit to Dublin for a SFI meeting on Tuesday/Friday, I had the opportunity to visit the National Gallery of Ireland in my sole hour of free time (as my classy hotel was very close). The building itself is quite nice, being well-inserted between brick houses from the outside, while providing

Read more »

More on higher moments: rolling skewness of S&P 500 daily returns

October 15, 2011

In this post, Portfolio Probe explores a way to decide whether market kurtosis and skewness are predictable. Market skewness, in naive financial modeling, is some kind of measure of (as-)symmetrical distribution of (daily) returns around the average market return. A higher skewness would tend to indicate a denser distribution of higher returns, compared to lower
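
As a rough illustration of what "rolling skewness" means, here is a minimal base-R sketch. It is not Portfolio Probe's code: the function names `skewness` and `rolling_skewness`, the 250-observation window, and the simulated returns are all assumptions made for the example.

```r
# Sample skewness: third central moment divided by the 1.5 power of the
# second central moment (both moments divide by n).
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Skewness over a trailing window of `width` observations.
# The first `width - 1` entries are NA because the window is incomplete.
rolling_skewness <- function(x, width) {
  n <- length(x)
  out <- rep(NA_real_, n)
  if (n >= width) {
    for (i in width:n) {
      out[i] <- skewness(x[(i - width + 1):i])
    }
  }
  out
}

set.seed(1)
returns <- rnorm(500, mean = 0, sd = 0.01)  # simulated, NOT S&P 500 data
roll <- rolling_skewness(returns, width = 250)
```

Plotting `roll` against time for real return data would show how much the estimate moves from window to window, which is the starting point for asking whether it is predictable.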

Read more »

Once you’re comfortable with 2-arrays and 2-matrices, you…

October 15, 2011

Once you’re comfortable with 2-arrays and 2-matrices, you can move up a dimension or two, to 4-arrays or 4-tensors. You can move up to a 3-array / 3-tensor just by imagining a matrix which “extends back into the blackboard”. Like a 5 × 5 ma...

Read more »

Principal component analysis : Use extended to Financial economics : Part 1

October 15, 2011

While working on my financial economics project, I came across an elegant tool called principal component analysis (PCA), which is extremely powerful when it comes to reducing the dimensionality of a data set comprising highly correlated var...

Read more »

Random art on the web

October 15, 2011

Since we explored some statistics of an abstract painting with Pierre (we even have an article in the last issue of Variances!), I have become more sensitive to art linked to randomness. Here are some pointers to related websites I have dug up. Random.org, mentioned here by Pierre, is, as it reads, a true random number service that

Read more »

Free auditing of Stanford AI and Machine Learning Courses w/Peter Norvig

October 14, 2011

Just wanted to notify viewers of a few great courses that are being offered free for auditing and/or participation by well known industry experts, including co-author of the classic text on AI, 'Artificial Intelligence: A Modern Approach,' Peter Norvig...

Read more »

Maximum Loss and Mean-Absolute Deviation risk measures

October 14, 2011

When constructing a typical efficient frontier, risk is usually measured by the standard deviation of the portfolio’s return. Maximum Loss and Mean-Absolute Deviation are alternative measures of risk that I will use to construct an efficient frontier. I will use methods presented in Comparative Analysis of Linear Portfolio Rebalancing Strategies: An Application to Hedge Funds by
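
For readers who want a feel for the two measures before reading the full post, here is a minimal base-R sketch that evaluates them for one fixed portfolio. It does not build a frontier, and it is an assumption-laden illustration rather than the post's method: the helper names `max_loss` and `mean_abs_dev`, the scenario returns, and the equal weights are all invented for the example.

```r
# Rows are return scenarios (e.g. historical periods), columns are assets.
# The numbers are made up for illustration.
scenarios <- matrix(
  c( 0.02, -0.04, 0.01,  0.03,   # asset 1
     0.00, -0.02, 0.03, -0.01),  # asset 2
  ncol = 2
)
w <- c(0.5, 0.5)  # hypothetical portfolio weights

# Maximum loss: the worst portfolio return across scenarios, sign-flipped
# so that a larger number means more risk.
max_loss <- function(returns, weights) {
  port <- as.vector(returns %*% weights)
  -min(port)
}

# Mean-absolute deviation: average absolute distance of the portfolio
# return from its own mean across scenarios.
mean_abs_dev <- function(returns, weights) {
  port <- as.vector(returns %*% weights)
  mean(abs(port - mean(port)))
}

max_loss(scenarios, w)      # 0.03
mean_abs_dev(scenarios, w)  # 0.01625
```

Both measures are built from linear pieces of the portfolio weights, which is what makes them convenient targets for linear programming when tracing out a frontier.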

Read more »

Trading Mean Reversion with Augen Spikes

October 14, 2011

One of the more interesting things I have come across is the idea of looking at price changes in terms of recent standard deviation, a concept put forward by Jeff Augen. The gist is to express a close-to-close return as a function of the standard devia...
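
One way to read that idea is sketched below in base R. The exact definition used in the post is cut off above, so treat this as an assumption: each day's price change is divided by the standard deviation of the previous `n` daily log returns, converted to price units. The function name `augen_spike`, the 20-day window, and the simulated prices are all hypothetical.

```r
# Hypothetical helper: scale each day's price change by the volatility of
# the preceding `n` daily log returns (expressed in price units).
augen_spike <- function(close, n = 20) {
  len <- length(close)
  log_ret <- c(NA, diff(log(close)))  # log_ret[i] is the return into day i
  spike <- rep(NA_real_, len)
  if (len >= n + 2) {
    for (i in (n + 2):len) {
      # Volatility uses returns up to yesterday only, so today's move
      # is not part of its own yardstick.
      sd_prev <- sd(log_ret[(i - n):(i - 1)])
      spike[i] <- (close[i] - close[i - 1]) / (sd_prev * close[i - 1])
    }
  }
  spike
}

set.seed(42)
prices <- 100 * exp(cumsum(c(0, rnorm(59, sd = 0.01))))  # simulated prices
spikes <- augen_spike(prices, n = 20)
```

A value of 2 would then mean that the day's move was about two recent standard deviations, which makes moves comparable across calm and volatile periods.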

Read more »

New food web dataset

October 14, 2011

So, there is a new food web dataset out that was put in Ecological Archives here, and I thought I would play with it. The food web is from Otago Harbour, an intertidal mudflat ecosystem in New Zealand. The web contains 180 nodes, with 1,924 links. Fu...

Read more »

Implementing K-means clustering for Hadoop in R and Java

October 14, 2011

At the Bay Area R User Group meeting this week, Antonio Piccolboni gave an overview of the design goals and implementation of the RHadoop Project packages that connect Hadoop and R: rhdfs, rhbase and rmr. (The image above was captured from Antonio's slides.) The most revealing part of the talk for me was the comparison of implementing the K-means...

Read more »

Tomorrow: ACM Data Mining Camp at eBay

October 14, 2011

If you're in the Bay Area, tomorrow would be a great day to head down to San José for the ACM Data Mining Camp. Hundreds of data scientists, data hackers and data miners will be there for a fun "unconference", with talks and practical sessions organized on the spot according to demand. Revolution Analytics is proud to be a...

Read more »

Mining Lending Club’s Goldmine of Loan Data Part I of II – Visualizations by State

October 14, 2011

I have a few friends that keep bragging about their 14% annual returns by investing their money with Lending Club, a peer-to-peer lending service that cuts out the complexities and difficulties of getting approved for a loan through a bank. To give you an idea of the sheer amount of volume Lending Club has been

Read more »

Another Mystery: sas7bdat != sd2

October 14, 2011

I received an email from a very inconvenienced statistician a few weeks ago. The problem was an old data file with the extension .sd2. Apparently, this is an obsolete data storage format used by past versions of SAS. A quick glance at the file contents revealed that this sd2 formatted file is incompatible with the

Read more »

principles of uncertainty

October 13, 2011

“Bayes Theorem is a simple consequence of the axioms of probability, and is therefore accepted by all as valid. However, some who challenge the use of personal probability reject certain applications of Bayes Theorem.” J. Kadane, p.44

Principles of uncertainty by Joseph (“Jay”) Kadane (Carnegie Mellon University, Pittsburgh) is a profound and mesmerising book on

Read more »

plyr, ggplot2 and triathlon results, part II

October 13, 2011

I ended my previous post by mentioning how one could imagine other ways of looking at the triathlon data with plyr and ggplot2. I couldn’t help but carry on playing with it so here are more stats and graphs from …

Read more »

System in 10 Minutes After Twitter

October 13, 2011

On Twitter last night, I spotted @milktrader from www.algorithmzoo.com doing some range research on equity indexes. I offered a tweet on the crazy Russell 2000 17% move over 7 days. Within 10 minutes, I discovered a signal that worked very ...

Read more »

Maximum likelihood

October 13, 2011

This post is one of those ‘explain to myself how things work’ documents, which are not necessarily completely correct but are close enough to facilitate understanding. Background Let’s assume that we are working with a fairly simple linear model, where …

Read more »

There’s a lot to like about R

October 13, 2011

I once heard John Chambers (the inventor of the S language, and member of the R Core Group) say, "Show me a programming language no-one complains about, and I'll show you a language no-one uses". The R language has its fair share of complainants, to be sure -- and that's to be expected for a language with more than...

Read more »

Waiting in line, waiting on R

October 13, 2011

I should state right away that I know almost nothing about queuing theory. That’s one of the reasons I wanted to do some queuing simulations. Another reason: when I’m waiting in line at the bank, I tend to do mental calculations for how long it should take me to get served. I look at the

Read more »

Example 9.9: Simplifying R using the mosaic package (part 1)

October 13, 2011

While both SAS and R are powerful systems for statistical analysis, they can be frustrating to new users or those learning statistics for the first time. The mosaic package is designed to help simplify the interface for such new users, while allowing ...

Read more »