Visualizing iOS Text Editors

April 17, 2012

The other day Brett Terpstra posted a gigantic and quite beautifully-executed feature comparison of all of the text editors available for iOS devices. The table is really terrific and also a bit overwhelming, as there’s so much data. On the bus h...

Quickly Explore the Penn World Tables in R

April 17, 2012

The Penn World Tables are one of the greatest sources of worldwide macroeconomic data, but dealing with their web interface is somewhat cumbersome. Fortunately, the data are also available as an R package on CRAN. Having some tools at hand …

More Spectra patterns (1st derivative)

April 17, 2012

For the first derivative of the absorption band, the maximum becomes a zero crossing. Using Savitzky-Golay (SG) filters, we can calculate it with R and view, as in the last posts, the Corrgram matrix. Corrgram for the first derivative for this band: L...

Get your large SQL data in ff swiftly

The ff package is great when you are working with large data in R. Data in corporate environments are usually not so large that a Hadoop system is needed to handle them, but they are mostly large enough to make R choke on its RAM. T...

Montreal R Workshop: Quantile Regression

April 17, 2012

Stewart Biology Building, McGill University (Rm N4/17). Monday, April 24, 2012, 14h-16h. Dr. Arthur Charpentier (UQàM). In this workshop we will examine different concepts related to quantiles, and practical issues based on R code. This workshop will present quantile regression, and the idea of iterative least squares estimation. It will present an illustration on climate

Pair Trading: Quick Update

April 17, 2012

I've been working on different projects lately and my time for this blog, unfortunately, has been close to zero. But that's going to change. Don't expect a new post every day, but there should be a new post at least every two weeks. Anyway, let's get back to the point of this post. One of the readers contacted

Revolution Analytics Spring Webinar Series

April 17, 2012

The webinar team at Revolution Analytics has put together a great program over the next couple of months. With a mix of guest speakers and Revolution Analytics staff, this series will cover topics as diverse as Big Data with R and Hadoop, integrating R with MS Office, spatial statistics with R, data mining with R, retail marketing analytics, and...

The (Un)disputed Champion of Psychotherapy – Clinical psychologists and their theoretical orientations

April 17, 2012

Cognitive Behavioral Therapy is the psychological treatment of choice for many, if not all, mental disorders. Nonetheless, a majority of US clinical psychologists do not primarily identify themselves as either cognitive or behavioral therapists. Looking at data from PubMed publication counts, a clear picture emerges: psychodynamic researchers might just be research loafers.

Calculating the mixing matrix and assortativity coefficient with igraph in R

April 16, 2012

The mixing matrix of a graph gives the density of edges between vertices with different characteristics. The mixing matrix for a given igraph object can be calculated using the following function: The assortativity coefficient, based on Newman’s paper, can be …

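The teaser above omits the post's own function, so here is a minimal base-R sketch of the same two quantities rather than the author's igraph code. It assumes an undirected graph given as a two-column edge list plus one categorical attribute per vertex; the names `edges`, `group`, `mixing_matrix` and `assortativity` are all hypothetical. The coefficient uses Newman's formula r = (tr(e) - sum(a*b)) / (1 - sum(a*b)), where e is the normalized mixing matrix and a, b are its row and column sums.

```r
# Hypothetical toy graph: 6 vertices, 6 undirected edges, two groups.
edges <- matrix(c(1, 2,  2, 3,  3, 1,  4, 5,  5, 6,  3, 4),
                ncol = 2, byrow = TRUE)
group <- c("a", "a", "a", "b", "b", "b")

# Fraction of edge ends joining each pair of groups.
mixing_matrix <- function(edges, group) {
  lev  <- sort(unique(group))
  from <- factor(group[edges[, 1]], levels = lev)
  to   <- factor(group[edges[, 2]], levels = lev)
  m <- unclass(table(from, to))  # plain integer matrix of edge counts
  m <- m + t(m)                  # undirected: count each edge in both directions
  m / sum(m)                     # normalize so all entries sum to 1
}

# Newman's assortativity coefficient for a normalized mixing matrix.
assortativity <- function(e) {
  a <- rowSums(e)
  b <- colSums(e)
  (sum(diag(e)) - sum(a * b)) / (1 - sum(a * b))
}

e <- mixing_matrix(edges, group)
r <- assortativity(e)
print(e)
print(r)
```

A value of r near 1 means edges mostly stay within a group, and a value near 0 means mixing is close to random.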

Math Spectra Patterns

April 16, 2012

I was working today with "R" to get more patterns with the Corrgram. In the demo raw spectra, this time I wanted to look at a band that is as close to Gaussian as possible. I selected it and trimmed the spectra to that region, treated with the MSC ("Multiple Scatter Corr...

Word cloud alternatives

April 16, 2012

Here is an alternative to word clouds that makes it easier to get insights, but also has some of the aesthetic appeal of the traditional word cloud. My first attempt at this looked pretty bad and this is not too much better, but hopefully someone else will help improve it. library(languageR) # get english word

Installing R’s maps package on Ubuntu

April 16, 2012

I recently ran into trouble trying to install the R maps package on Ubuntu 10.04. Here's the error I was getting:
** arch -
gcc -std=gnu99 -O3 -pipe  -g    Gmake.c   -o Gmake
Gmake.c: In function ‘get_lh’:
Gmake.c:111: warning: cast from pointer to integer of different size
Gmake.c:113: warning: cast from pointer to integer of different size
Gmake.c: In function ‘main’:
Gmake.c:211: warning: cast from...

R Quickie: Custom Panel Functions and Default Arguments

April 16, 2012

Sometimes the basic functionality in lattice graphics isn't enough. Custom "panel functions" are one approach to fully customizing the lattice graphics system. Two examples are given below illustrating how to define an (inline) custom panel function fo...

A thought on Linear Models on Stocks

April 16, 2012

Timely Portfolio has a nice post about linear model systems for stocks. The idea follows from the steps below: (1) Get the weekly closing values of the S&P 500. (2) Choose a time window (e.g. 25 weeks) and, for each window, linearly regress the subset of closing values. (3) Choose an investment strategy based on the residuals, the

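The first two steps listed above can be sketched in base R as follows. This is only an illustration under stated assumptions: the weekly closes are simulated (not S&P 500 data), the regression is of price on a time index within each window, and `rolling_last_residual` is a hypothetical helper name. The investment rule itself is cut off in the teaser, so it is not reproduced here.

```r
# Regress each rolling window of closes on time and keep the residual
# of the most recent observation in that window.
rolling_last_residual <- function(closes, window = 25) {
  out <- rep(NA_real_, length(closes))
  for (end in window:length(closes)) {
    idx <- (end - window + 1):end
    d   <- data.frame(y = closes[idx], t = seq_len(window))
    fit <- lm(y ~ t, data = d)
    out[end] <- tail(residuals(fit), 1)  # how far the last close sits from the trend line
  }
  out
}

set.seed(42)
# Simulated weekly closing values (an assumption, for illustration only).
closes <- 1000 + cumsum(rnorm(120, mean = 0.2, sd = 2))
res <- rolling_last_residual(closes, window = 25)
print(head(res, 30))
```

The first `window - 1` entries are `NA` because a full window is not yet available. A positive residual means the latest close is above the fitted trend for that window.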

How NOAA uses R to forecast river flooding

April 16, 2012

Thanks to the lower-than-usual snowfall over most of the US this past winter, there's low risk of major flooding as the snow melts this Spring (for the first time in four years!). Nonetheless, being able to forecast river flood events is of critical importance to local emergency managers, water & electric utilities, river navigation companies, and the US Army...

Example 9.27: Baseball and shrinkage

April 16, 2012

To celebrate the beginning of the professional baseball season here in the US and Canada, we revisit a famous example of using baseball data to demonstrate statistical properties. In 1977, Bradley Efron and Carl Morris published a paper about the Jame...

Benford’s Law

April 16, 2012

Here is a quick quiz. If you visit the Wikipedia page List of countries by GDP, you will find three lists ranking the countries of the world in terms of their Gross Domestic Product (GDP), each list corresponding to a different source of the data. If you pick the list according to the CIA (let’s

Information flows like water

April 16, 2012

Guiding a ship, it takes more than your skill Spark David Rowe’s Risk column this month is about data leverage. The idea is that you are leveraging your data if you are using it to answer questions that are too demanding of information. The piece reminded me of a talk that Dave gave a few …

Borrowing Ideas from Timely Portfolio

April 15, 2012

I want to highlight two great visualization techniques I discovered by reading the fine Timely Portfolio blog. The first method is based on the lm System on Nikkei with New Chart. Let’s visualize the strategy’s Long/Short/Not Invested periods by highlighting the underlying (i.e. buy & hold) with green/red/gray. Following is sample code that implements this

Significance Test for Kendall’s Tau-b

April 15, 2012

A variation of the standard definition of the Kendall correlation coefficient is necessary in order to deal with data samples with tied ranks. It is known as Kendall’s tau-b coefficient and is more effective in determining whether two non-parametric data samples with ties are correlated.

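As a rough illustration of the tau-b idea (not the linked post's code), the sketch below computes the coefficient by hand on made-up data with ties and compares it with base R's `cor(..., method = "kendall")`. It then runs `cor.test()` for the significance test, with `exact = FALSE` because an exact p-value is not available when ties are present. The vectors `x` and `y` and the helper `tau_b` are hypothetical.

```r
# Made-up samples containing tied values.
x <- c(1, 2, 2, 3, 4, 4, 5)
y <- c(2, 1, 3, 3, 5, 4, 5)

# Kendall's tau-b: (concordant - discordant) / sqrt((n0 - n1) * (n0 - n2)),
# where n1 and n2 count the tied pairs in x and y respectively.
tau_b <- function(x, y) {
  n     <- length(x)
  pairs <- combn(n, 2)
  s  <- sum(sign(x[pairs[1, ]] - x[pairs[2, ]]) *
            sign(y[pairs[1, ]] - y[pairs[2, ]]))
  n0 <- n * (n - 1) / 2
  tx <- as.vector(table(x))
  ty <- as.vector(table(y))
  n1 <- sum(tx * (tx - 1) / 2)
  n2 <- sum(ty * (ty - 1) / 2)
  s / sqrt((n0 - n1) * (n0 - n2))
}

manual  <- tau_b(x, y)
builtin <- cor(x, y, method = "kendall")

# Normal-approximation test of the null hypothesis of no association.
result <- cor.test(x, y, method = "kendall", exact = FALSE)
print(c(manual = manual, builtin = builtin))
print(result$p.value)
```

A small p-value would lead to rejecting the hypothesis that the two samples are uncorrelated.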

The Popularity of Statistical Packages

April 15, 2012

No matter what your favourite statistical package is, you'll find this post by Robert Muenchen highly informative. Robert concludes that: "By most of the measures discussed here, R is competing well with the commercial software vendors. However, I advise not over generalizing from this data. SAS and SPSS continue to dominate the corporate world and Stata is doing quite...

ggplot2 Time Series Heatmaps

April 15, 2012

How do you easily get beautiful calendar heatmaps of time series in ggplot2? E.g.: From MarginTale. I was impressed by the lattice-based implementation from Paul Bleicher of Humedica, which you can find referenced in http://blog.revolutionanalytics.c...

The R-Podcast Episode 5: Basic Package Management

April 15, 2012

After a brief delay here’s episode 5 of the R-Podcast. In this episode: R 2.15.0 released, listener feedback, and discussion on basic package management. I discuss helpful resources for finding packages, installation procedures, and how to determine what packages are installed in your R system, among other considerations. If you are interested in providing a

Registration for R/Finance 2012 is Open

April 15, 2012

Registration has been open for a while, but I wanted to point out the pre-conference seminars. Registrations are strong this year, so if you’re interested you’ll need to sign up before they sell out. Register here… As you probably know by now, the fourth annual R/Finance conference for applied finance using R will be held

Visualization of Reading Level Frequency by Congressional Bill Stage

April 15, 2012

Here’s a fun example of how you might use my data on Congressional bill length and complexity. Imagine you want to understand the empirical distribution of Flesch-Kincaid reading level for Congressional bills and how this distribution is related to …

R can write R code, too

April 14, 2012

In a recent blog post by CMastication, a little meme puzzle is presented with the introduction that a preschooler could solve it in 5-10 minutes, a programmer in an hour. I took the bait. The original problem goes like this: …

Linguistic Notation Inside of R Plots!

April 14, 2012

So, I've been playing around with learning knitr, which is a Sweave-like R package for combining LaTeX and R code into one document. There's almost no learning curve if you already use Sweave, and I find a lot of knitr's design and usage to be a lot nicer. I wasn't going to make a blog post or tutorial about...

Sweeping through data in R

April 14, 2012

How do you apply one particular row of your data to all other rows? Today I came across a data set which showed the revenue split by product and location. The data was formatted to show only the split by product for each location and the overall split by...

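One way to apply a single row to every other row is base R's `sweep()`. The sketch below uses a small made-up revenue matrix, so the figures, the row names and the "Total" row are assumptions for illustration rather than the data set from the post. It divides each row by the "Total" row, column by column.

```r
# Hypothetical revenue by location (rows) and product (columns).
revenue <- matrix(c(10, 30,  40,
                    20, 20,  60,
                    30, 50, 100),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("North", "South", "Total"),
                                  c("A", "B", "C")))

# MARGIN = 2 lines the "Total" row up with the columns, so every row
# is divided element-wise by the totals for each product.
share <- sweep(revenue, 2, revenue["Total", ], "/")
print(round(share, 3))
```

Each entry of `share` is then that location's fraction of the product's total revenue, and the "Total" row becomes all ones.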