Monthly Archives: October 2012

EDA Before CDA

October 6, 2012

One Paragraph Summary

Always explore your data visually. Whatever specific hypothesis you have when you go out to collect data is likely to be worse than any of the hypotheses you’ll form after looking at just a few simple visualizations of that data. The most effective hypothesis testing framework in existence is the test of

Read more »

R-bloggers

October 6, 2012

R-bloggers provides a great service, aggregating a universe of blogs which contribute aRticles on R and using R (marked using an "R" tag). This is a nice community service, creating a one-stop shop for readers to learn about R, but also a great idea for a...

Read more »

A quick introduction to ggplot()

October 5, 2012

I gave a short talk today about ggplot. This is what I presented. Additional resources are at the bottom of this post. ggplot is an R package for data exploration and producing plots. It produces fantastic-looking graphics and allows one to slice and dice one’s data in many different ways. Comparing with base...

Read more »

Style your R charts like the Economist, Tableau … or XKCD

October 5, 2012

As we noted last month, the new Themes feature in ggplot2 helps you customize the design of R charts to your liking. Now, R user Jeffrey Arnold has built on this feature to create standardized themes that make R graphics look like those from major publications and other software systems. You can use his ggthemes package to make your...

Read more »

How to read BSMAP methylation ratio files into R via methylKit

October 5, 2012

BSMAP is an aligner for bisulfite sequencing reads. It outputs aligned reads as well as methylation ratios per base (via the methratio.py script). The methylation ratios can be read into R via the methylKit package and regular methylKit analysis can ...

Read more »

DIY ZeroAccess GeoIP Plots

October 5, 2012

Since F-Secure was #spiffy enough to provide us with GeoIP data for mapping the scope of the ZeroAccess botnet, I thought that some aspiring infosec data scientists might want to see how to use something besides Google Maps & Google Earth to view the data. If you look at the CSV file, it’s formatted as

Read more »

Running motivation #An R amusement

October 5, 2012

Henry John-Alder told me once that in a marathon, twice as many runners cross the line at 2h 59m as at 3h 00m. He pointed out that this anomaly in the distribution of finishers per minute (roughly normal shaped) is due …

Read more »

Calculating distances (across matrices)

October 5, 2012

This Gist is mostly for my future self, as a reminder of how to find distances between each row in two different matrices. To create a distance matrix from a single matrix, the function dist() from the stats package is sufficient. There are times, ho...
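
The two-matrix case mentioned above can be sketched in base R as follows. This is only an illustration, not the code from the original Gist: the helper name `cross_dist` and the example matrices are assumptions, and it covers Euclidean distance only.

```r
# Euclidean distance between every row of `a` and every row of `b`.
# `cross_dist` is a hypothetical helper name, not taken from the original Gist.
cross_dist <- function(a, b) {
  stopifnot(is.matrix(a), is.matrix(b), ncol(a) == ncol(b))
  # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 * (x . y), evaluated for all row pairs at once
  sq <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
  # Clamp tiny negative values caused by floating-point rounding before the root
  sqrt(pmax(sq, 0))
}

# Illustrative data: two points in `a`, three points in `b`
a <- matrix(c(0, 0,
              3, 4), ncol = 2, byrow = TRUE)
b <- matrix(c(0, 0,
              6, 8,
              3, 0), ncol = 2, byrow = TRUE)

d <- cross_dist(a, b)
print(d)
```

The result has one row per row of `a` and one column per row of `b`, so `d[i, j]` is the distance between `a[i, ]` and `b[j, ]`. By contrast, `dist()` only returns the distances among the rows of a single matrix.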

Read more »

How to upgrade R in Ubuntu 12.04

October 4, 2012

Open your sources.list file in gedit:

    sudo gedit /etc/apt/sources.list

and add the following line:

    deb http://cran.cnr.berkeley.edu/bin/linux/ubuntu/ precise/

Note that you don't have to use that mirror. You may use any mirror from the list here: http://cran.r-project.org/mirrors.html

Add the secure APT key to your system with one command:

    sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9

Update your sources and upgrade your installation:

    sudo apt-get update...

Read more »

Permanent Portfolio – Simple Tools

October 4, 2012

I have previously described and back-tested the Permanent Portfolio strategy based on the series of posts at the GestaltU blog. Today I want to show how we can improve the Permanent Portfolio strategy's performance using the following simple tools:

- Volatility targeting
- Risk allocation
- Tactical market filter

First, let’s load the historical prices for the stocks (SPY), gold (GLD),

Read more »