Extracting an image chunk from a collection of large MrSID images

June 4, 2012

I recently needed to extract a small "chunk" from a collection of adjacent MrSID mosaics, each about 4 GB in size. Once again, GDAL came to the rescue and saved much time and agony while working with very large, compressed, proprietary-format files. T...

Read more »

Generate Quasi-Poisson Distribution Variable

June 4, 2012

Most regression methods assume that the response variable follows a distribution from the exponential family, e.g. Gaussian, Poisson, or Gamma. However, this assumption is frequently violated in real-world data by, for example, the zero-inflated overdispersion problem. A number of methods have been developed to deal with such problems, and among them Quasi-Poisson and Negative Binomial are the most popular, perhaps due...
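One common way to simulate counts with quasi-Poisson-style overdispersion (variance equal to theta times the mean) is a gamma-Poisson mixture, which is a negative binomial with a suitably chosen size. The sketch below is written in Python rather than R and is not taken from the post, whose own method is cut off in this excerpt. The function names `rpois_knuth` and `rqpois` and the parameters `mu` and `theta` are assumptions made for illustration.

```python
import math
import random


def rpois_knuth(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def rqpois(n, mu, theta, seed=None):
    """Draw n counts with mean mu and variance theta * mu (theta > 1).

    Hypothetical helper: mixes a Poisson over a gamma-distributed rate,
    which yields a negative binomial with size = mu / (theta - 1).
    """
    if theta <= 1:
        raise ValueError("theta must exceed 1 for overdispersion")
    rng = random.Random(seed)
    size = mu / (theta - 1)
    # Gamma(shape=size, scale=mu/size) has mean mu and variance mu * (theta - 1),
    # so the mixed counts have variance mu + mu * (theta - 1) = theta * mu.
    return [rpois_knuth(rng.gammavariate(size, mu / size), rng) for _ in range(n)]
```

With a large sample, the ratio of sample variance to sample mean should land near `theta`, which is a quick sanity check on the construction.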

Read more »

Announcing The R markdown Package

June 4, 2012

Many of you have heard about RStudio’s latest release and its new R Markdown feature. Today, I’d like to announce the markdown package for R, a tool for converting Markdown documents to HTML, created in collaboration with RStudio. It...

Read more »

Messy Matters explores the probability of winning a basketball…

June 4, 2012

Messy Matters explores the probability of winning a basketball game when you’re ahead by x points y minutes before the end of the game.

Read more »

How to Convert Sweave LaTeX to knitr R Markdown: Winter Olympic Medals Example

June 4, 2012

The following post shows how to manually convert a Sweave LaTeX document into a knitr R Markdown document. The post (1) reviews many of the required changes; (2) provides an example of a document converted to R Markdown format based on an analysis of Winter Olympic Medal data up to and including 2006; and (3) discusses the pros...

Read more »

Slidify: Things are coming together fast

June 4, 2012

Tools for using R/RStudio as a one-stop shop for research and presentation have been coming out quickly. I think this one has a good shot of being included in future releases of RStudio: The other day I ran across a new R package called slidify by Ramn...

Read more »

Variability in maximum drawdown

June 4, 2012

Maximum drawdown is blazingly variable. Psychology: Probably the most salient feature that an investor notices is the amount lost since the peak: that is, the maximum drawdown. Just because drawdown is noticeable doesn’t mean it is best to notice. Statistics: The paper “About the statistics of the maximum drawdown in financial time series” explores drawdown …
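The definition in the excerpt (the amount lost since the peak) can be made concrete with a minimal Python sketch. It is not code from the post. The function name `max_drawdown` and the sample prices are made up for illustration, and the loss is expressed as a fraction of the running peak, assuming strictly positive prices.

```python
def max_drawdown(prices):
    """Return the largest peak-to-trough loss as a fraction of the peak."""
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)  # highest value seen so far
        worst = max(worst, (peak - price) / peak)
    return worst


# Illustrative series: the peak of 120 is followed by a trough of 80.
prices = [100, 120, 90, 110, 80, 130]
print(max_drawdown(prices))  # about 0.333
```

Because the result depends only on the single worst peak-to-trough stretch, small changes in the path can move it a lot, which fits the excerpt's point about variability.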

Read more »

PDF slides and R code examples on Data Mining and Exploration

June 4, 2012

by Yanchang Zhao, RDataMining.com

There are some nice slides and R code examples on Data Mining and Exploration at http://www.inf.ed.ac.uk/teaching/courses/dme/, which are listed below.

PDF Slides:
- Overview of Data Mining http://www.inf.ed.ac.uk/teaching/courses/dme/2012/slides/datamining_intro4up.pdf
- Visualizing Data http://www.inf.ed.ac.uk/teaching/courses/dme/2012/slides/visualisation4up.pdf
- Decision trees http://www.inf.ed.ac.uk/teaching/courses/dme/2012/slides/classification4up.pdf
…

Read more »

Make R analysis Modules just like MS Excel Templates: Derivative Calculator case study

June 4, 2012

This video tutorial shows how to make R analysis modules just like MS Excel templates, using the Building Derivative Calculator app as a case study. Let’s say we want to know the derivative of tan(x^2 + 3); commonly, we will use th...
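For the example expression itself, the chain rule gives d/dx tan(x^2 + 3) = 2x · sec^2(x^2 + 3). The following Python sketch checks that result numerically. It is independent of the app shown in the video, and the function names `f`, `f_prime`, and `central_difference` are assumptions for illustration.

```python
import math


def f(x):
    return math.tan(x**2 + 3)


def f_prime(x):
    """Analytic derivative by the chain rule: 2x * sec^2(x^2 + 3)."""
    return 2 * x / math.cos(x**2 + 3) ** 2


def central_difference(func, x, h=1e-6):
    """Numerical derivative, used as an independent check."""
    return (func(x + h) - func(x - h)) / (2 * h)


x0 = 0.5
# The two printed values should agree to several decimal places.
print(f_prime(x0), central_difference(f, x0))
```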

Read more »

Grid2Polygons

June 4, 2012

I’d like to introduce you to the Grid2Polygons function, an R function for converting sp spatial objects from class SpatialGridDataFrame to SpatialPolygonsDataFrame. The significance of this conversion is that spatial polygons can be transformed ...

Read more »

Obtaining a protein-protein interaction network for a gene list in R

June 3, 2012

Building a network of interactions between a bunch of genes can help a great deal in understanding the relationships between the seemingly disparate elements of your list. It can seem challenging at first to build such a network, but it's less complicat...

Read more »

How to draw a curve() with ggplot2

June 3, 2012

ggplot2 improves the graphics drawn with R. A (very) short adaptation time is needed to learn how to make graphs equivalent to those of the graphics package. For example, there is no function similar to curve() for drawing the curve of a function. You have to use qplot() and set the stat and geom arguments as

Read more »

Universal portfolio, part 3

June 3, 2012

After the theoretical analysis, section 8 of Universal Portfolios provides examples. We now use logopt and R to reproduce them, the first three in this post. The examples of Universal Portfolios use a long time series...

Read more »

Screencast: The Making of 3dfcc505dc

June 3, 2012

It was all going so well. Until my MacBook began experiencing memory issues. At around the 20 minute mark (which is close to the end), I lost some video explaining the use of the density plot auto-creation super-wizard function. The good news is that i...

Read more »

NBA Playoff Predictions Update 2 and Results (3-1)

June 3, 2012

This is my second follow-up to my previous two posts which were about predicting NBA games with an algorithm, and my first update to the algorithm. The algorithm's record is now 3-1, as it correctly predicted Boston and Oklahoma City as winners of the...

Read more »

Posts about ggplot2 on r-bloggers

June 3, 2012

You can also get your fix of ggplot2 on r-bloggers (where this blog is also syndicated): http://www.r-bloggers.com/search/ggplot

Read more »

Removing "Y" outliers from the "Validation Set"

June 3, 2012

This is a new video about how to monitor and interpret statistics and graphics for validation: Removing "Y" outliers from Validation Set. Previous videos about the Monitor function for validation: Should I adjust the Bias? Monitor: Adding "RER" and "...

Read more »

NBA Playoff Predictions Update 2 and Results (3-1)

June 3, 2012

This is my second follow-up to my previous two posts which were about predicting NBA games with an algorithm, and my first update to the algorithm. The algorithm's record is now 3-1, as it correctly predicted Boston and Oklahoma City as winners of their past games. Upcoming things to do: Sadly, I have been a bit busy, and I...

Read more »

Making interactive slides with Org mode and googleVis in R

June 3, 2012

There’s been a lot of justifiable excitement in the R community about Yihui Xie’s great work, and most recently the incorporation of his knitr package into the RStudio software. Knitr is seen, justifiably, as a worthy successor to Sweave for …

Read more »

Coding a dynamic system and controlling it via a graphical user interface

June 3, 2012

My work in the past year has consisted mostly of coding dynamic models in R, models which I will soon be exporting to a server-based R implementation, possibly thanks to rApache. I usually run my models through an input file where I specify all param...

Read more »

R script to manipulate health data

June 3, 2012

Here is the code that fixed up the World Bank data export for use in Tableau. The databank spits out everything in an untidy format for grouping and aggregating. The reshape2 and plyr packages make it easy to manipulate the whole set …

Read more »

R and theater

June 3, 2012

You might ask what R has to do with theater. I assure you it has. I act in the theater group ‘ndescenze. We will soon present (actually, we just performed) a show based on the Marx Brothers Radio Shows. We shuffle actors and characters during the show (we like it complicated!) and we needed to

Read more »

Visualizing car brand choices in ggplot2

June 2, 2012

Visualizing car brand choices in ggplot2:

Read more »

Fama-MacBeth and Cluster-Robust (by Firm and Time) Standard Errors in R

June 2, 2012

Ever wondered how to estimate Fama-MacBeth or cluster-robust standard errors in R? It can actually be very easy. First, for some background information read Kevin Goulding’s blog post, Mitchell Petersen’s programming advice, Mahmood Arai’s paper/note and code (there is an earlier version of the code with some more comments in it). For more formal references you may want to

Read more »

Pasting Excel data into R on a Mac

June 2, 2012

When starting out with R, getting data in and out can be a bit of a pain. It shouldn't take long to work out a convenient method, depending on what OS you use and what other packages you work with. In my case I prefer to work with Excel spreadsheets (which are versatile and

Read more »

Project Euler — problem 6

June 2, 2012

It’s officially midnight. Let me solve the sixth problem before bed. This is a quick one. The sum of the squares of the first ten natural numbers is 1^2 + 2^2 + … + 10^2 = 385. The square of …
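The excerpt cuts off before the task is fully stated. Assuming the standard wording of Project Euler problem 6 (find the difference between the square of the sum and the sum of the squares of the first n natural numbers), a minimal Python sketch looks like this. It is not the post's own solution, and the function name `sum_square_difference` is an assumption for illustration.

```python
def sum_square_difference(n):
    """Square of the sum minus the sum of the squares for 1..n."""
    sum_of_squares = sum(k * k for k in range(1, n + 1))
    square_of_sum = sum(range(1, n + 1)) ** 2
    return square_of_sum - sum_of_squares


# For the first ten natural numbers: 55^2 - 385 = 3025 - 385.
print(sum_square_difference(10))  # 2640
```

The same function handles the first one hundred natural numbers by passing `n=100`.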

Read more »

11 Million Yellow Slips – City of Toronto Parking Tickets, 2008-2011

June 2, 2012

Introduction: I don't know about you, but I really hate getting parking tickets. Sometimes I feel like it's all just a giant cash grab. Really? I can't park there between the hours of 11 and 3, but every other time is okay? Well, why the hell not? Bu...

Read more »

Visualizing car brand choices in ggplot2

June 2, 2012

I always like to read new posts at chartsnthings, as they inspire me with new ideas for data visualization. Yesterday I read an article in Gazeta.pl on the car brands chosen by members of parliament in Poland. It contains a simple ...

Read more »