Bar Charts and Segmented Bar Charts in R

September 2, 2012

Here are a couple of tutorials I’ve written to help anyone who’s interested in learning how to produce simple bar charts or simple segmented bar charts in R, given that you have

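As a rough illustration of the kind of chart the tutorials cover, here is a minimal base-R sketch. It is not taken from the tutorials themselves: the built-in `mtcars` data set and the choice of the `cyl` and `gear` columns are assumptions made only for this example.

```r
# Illustrative data: the built-in mtcars data set (an assumption, not the tutorial's data).
cyl_counts <- table(mtcars$cyl)

# Simple bar chart: one bar per number of cylinders.
barplot(cyl_counts, xlab = "Cylinders", ylab = "Number of cars")

# Segmented (stacked) bar chart: each bar is split by number of gears.
gear_by_cyl <- table(gear = mtcars$gear, cyl = mtcars$cyl)
barplot(gear_by_cyl, legend.text = TRUE,
        xlab = "Cylinders", ylab = "Number of cars")
```

Passing a two-way table to `barplot()` stacks the rows within each column, which is what produces the segments.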

Stan

September 2, 2012

Andrew Gelman has announced the release of Stan version 1.0.0 and its R interface RStan. Stan – named after Stanislaw Ulam, the inventor of the Monte Carlo method – is a new MCMC program that represents a major technological leap …

Call for contribution: the RDataMining package – an R package for data mining

September 2, 2012

Join the RDataMining project to build a comprehensive R package for data mining (http://www.rdatamining.com/package). We have started the RDataMining project on R-Forge to build an R package for data mining. The package will provide various functionalities for data mining, with …

Discrete colors in ggplot

September 1, 2012

Have you ever wanted an easy way to generate continuous color palettes for a discrete factor? I came across a question over on Stack Overflow about how to add color to a ggplot figure. I often find myself with lots of categories that are discrete when I want a continuous color plot. This can be achieved by writing a quick...

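One way to get a continuous-looking palette for a discrete factor can be sketched in base R. This is not necessarily the approach the post goes on to describe (the excerpt is cut off); the factor `site` and the endpoint colors are made up for illustration.

```r
# Hypothetical discrete factor with many levels (illustrative data only).
set.seed(1)
site <- factor(sample(letters[1:12], 50, replace = TRUE),
               levels = letters[1:12])

# colorRampPalette() returns a function that interpolates between the
# endpoint colors; ask it for one color per factor level.
make_palette <- colorRampPalette(c("navy", "gold", "firebrick"))
level_colors <- make_palette(nlevels(site))
names(level_colors) <- levels(site)

# Look up the color of each observation by its level.
point_colors <- level_colors[as.character(site)]
```

In ggplot2, a named vector like `level_colors` can be passed to `scale_colour_manual(values = level_colors)` so that each level gets its interpolated color.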

New Attribution Functions for PortfolioAnalytics

September 1, 2012

Another Google Summer of Code (GSoC) project this summer focused on creating functions for doing returns-based performance attribution. I’ve always been a little puzzled about why this functionality wasn’t covered already, but I think that most analysts do this kind of work in Excel. That, of course, has its own perils. But beyond the workflow

Getting data on your government

September 1, 2012

I created an R package a while back to interact with some APIs that serve up data on what our elected representatives are up to, including the New York Times Congress API and the Sunlight Labs API. What kinds of things can you do with govdat? Here ...

Add Text Annotations to ggplot2 Faceted Plot

August 31, 2012

In my experience, there are two basic types of R learners: the “show me the code and what it does and let me play” type and the “please give me step by step directions” type. I’ve broken the following tutorial …

On School Boards and Policy Shocks

August 31, 2012

The dissertation process has many steps. The prospectus or proposal is one of the last. A while ago I was lucky enough to have my dissertation proposal defense and pass! My project is seeking to understand the linkage between political activity at...

Compiling Source Packages on R 2.15.x / Mac OS 10.5.x

August 31, 2012

For some reason R is not happy with its 64-bit cousin when installing source packages:

    * installing *source* package ‘XMLSchema’ ...
    ** R
    ** inst
    ** preparing package for lazy loading
    ** help
    *** installing help indices
    ** building package indices
    ** installing vignettes
    ** testing if installed package can be loaded
    *** arch - i386
    *** arch - x86_64
    /Library/Frameworks/R.framework/Resources/bin/R: line 259: /Library/Frameworks/R.framework/Resources/bin/exec/x86_64/R: Bad CPU type in executable
    /Library/Frameworks/R.framework/Resources/bin/R: line 259:...

RStan: Fast, multilevel Bayesian modeling in R

August 31, 2012

For the last decade or so, the go-to software for Bayesian statisticians has been BUGS (and later its open-source incarnation, OpenBUGS, or JAGS). BUGS is used for multi-level modeling: using a specialized notation, you can define random variables of various distributions, set Bayesian priors for their parameters, and create the network of relationships that describe how the random variables...

Compiling rgdal on Mac OS 10.5

August 31, 2012

Why do I always forget how to do this?

    R CMD INSTALL rgdal --configure-args="--with-gdal-config=/Library/Frameworks/GDAL.framework/Versions/Current/unix/bin/gdal-config --with-proj-include=/Library/Frameworks/PROJ.framework/Versions/4/Headers/ --with-proj-lib=/Library/Frameworks/PROJ.framework/Versions/Current/unix/lib/"

You will need to adjust the paths based on your version of the GDAL and Proj4 frameworks.

Russell 2000 Softail Fat Boy

August 31, 2012

If the Russell 2000 were a motorcycle, maybe it should be a Harley-Davidson Softail Fat Boy. I have explored the exceptional case of the Russell 2000 in quite a few posts (More Exploration of Crazy RUT; Where are the Fat Tails?; Crazy RUT) but I st...

Border bias and weighted kernels

August 31, 2012

With Ewen (aka @3wen), not only have we been playing on Twitter this month, we have also been working on kernel estimation for densities of spatial processes. Actually, it is only a part of what he was working on, but that part on kernel estimation has been the opportunity to write a short paper, which can now be downloaded from HAL. The problem...

Another bunch of R (and JAGS) scripts

August 31, 2012

Probably sooner than I expected, I have managed to also upload the code for the examples in Chapter 5 of the book, which deals with doing Bayesian health economic evaluations. Basically, there are 3 examples, which sort of represent the main clas...

PCA or Polluting your Clever Analysis

August 31, 2012

When I learned about principal component analysis (PCA), I thought it would be really useful in big data analysis, but that's not true if you want to do prediction. I tried PCA in my first competition at Kaggle, but it delivered bad results. This post illustrates how PCA can pollute good predictors. When I started examining this problem,...

Cluster Multiple Images with ImageJ and R

August 31, 2012

30.08.2012 With Bio7 1.6 it is possible to send multiple images from ImageJ to R without the need to open them in the Graphical User Interface of ImageJ for speed improvements. With a simple script written in Java, Groovy or BeanShell a new Bio7 API command can be used (see below) to transfer images and

Getting ecology and evolution journal titles from R

August 31, 2012

So I want to mine some #altmetrics data for some research I'm thinking about doing. The steps would be: (1) get journal titles for ecology and evolution journals; (2) get DOIs for all papers in all the above journal titles; (3) get altmetrics data on each DO...

Follow-Up: Making a Word Cloud for a Search Result from GScholar_Scraper_3.1

August 30, 2012

Here's a short follow-up on how to produce a word cloud for a search result from GScholarScraper_3.1:

    # File-Name: GScholarScraper_3.1.R
    # Date: 2012-08-22
    # Author: Kay Cichini
    # Email: [email protected]
    # Purpose: Scrape Google Scholar search result
    # ...

My Course Wish List at CMSE next year

August 30, 2012

Here is the list of courses I wish to teach next year at Chiang Mai School of Economics, though I am not so sure about the demand there!

Undergraduate (B.Econ.)
ECON 304: Economics Statistics (with R)
ECON 408: Research Design in Economics
ECON 417: Managerial Economics
ECON 419: Economic Theory and Entrepreneurship
ECON 443: Industrial Economics
ECON 4xx: Introduction to

Stan is fast

August 30, 2012

10,000 iterations for 4 chains on the (precompiled) efficiently-parameterized 8-schools model:

    > date ()
    "Thu Aug 30 22:12:53 2012"
    > fit3 date ()
    "Thu Aug 30 22:12:55 2012"
    > print (fit3)
    Inference for Stan model: anon_model.
    4 chains: each with iter=10000; warmup=5000; thin=1; 10000 iterations saved.
    mean se_mean sd 2.5% 25% 50% 75%

Spearman’s Rho

August 30, 2012

Spearman’s Rho Rank Correlation. There are generally three types of correlation that a researcher may encounter: Pearson’s r, Kendall’s Tau, and Spearman’s Rho. They each have their own uses and applications depending on the da...

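The three coefficients named above are all available through base R's `cor()`. The sketch below uses simulated data (the variables `x` and `y` are invented for illustration, not taken from the post) and also checks the standard fact that Spearman's rho is Pearson's r computed on ranks.

```r
# Simulated data with a monotone but nonlinear relationship (illustrative only).
set.seed(1)
x <- rnorm(30)
y <- exp(x) + rnorm(30, sd = 0.1)

# The same function computes all three coefficients.
pearson  <- cor(x, y, method = "pearson")
kendall  <- cor(x, y, method = "kendall")
spearman <- cor(x, y, method = "spearman")

# Spearman's rho is Pearson's r applied to the ranks of the data.
spearman_by_hand <- cor(rank(x), rank(y))
```

Because rho depends only on ranks, it is unchanged by monotone transformations such as taking logs, which is one reason to prefer it when the relationship is monotone but not linear.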

Three ways of visualizing the growth of Walmart

August 30, 2012

It's a wonderful thing when people make interesting data sets available to the public. When Thomas Jones wrote a paper in Econometrics about the growth of US retail giant Walmart, he made the data he collected about every Walmart store opening in history (location and date) available to the public. Since then, several people have used different techniques to...

A Stan is Born

August 30, 2012

Stan 1.0.0 and RStan 1.0.0: It’s official. The Stan Development Team is happy to announce the first stable versions of Stan and RStan. What is (R)Stan? Stan is an open-source package for obtaining Bayesian inference using the No-U-Turn sampler, a variant of Hamiltonian Monte Carlo. It’s sort of like BUGS, but with a different language ...

Making matrices with zeros and ones

August 30, 2012

So I was trying to figure out a fast way to make matrices with randomly allocated 0 or 1 in each cell of the matrix. I reached out on Twitter, and got many responses (thanks tweeps!). Here is the solution I came up with. See if you can tell why it...

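The excerpt stops before showing the author's solution, so the following is only one plausible base-R sketch, not the approach from the post. The helper name `random_binary_matrix` and its `prob` argument are hypothetical.

```r
# Hypothetical helper: fill an nrow x ncol matrix with independent 0/1 draws.
# prob is the chance that any given cell is 1.
random_binary_matrix <- function(nrow, ncol, prob = 0.5) {
  # rbinom() with size = 1 draws Bernoulli (0 or 1) values in one vectorized call.
  matrix(rbinom(nrow * ncol, size = 1, prob = prob),
         nrow = nrow, ncol = ncol)
}

set.seed(42)
m <- random_binary_matrix(4, 5)
```

Generating all cells in a single vectorized call and reshaping with `matrix()` avoids looping over cells, which is usually what makes this kind of construction fast in R.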

Another Great Google Summer of Code 2012 R Project

August 30, 2012

Tradeblotter announced the very nice features that will be added to the PerformanceAnalytics package as a result of the Google Summer of Code (GSOC) 2012 project: “…Matthieu commenced to produce dozens of new functions, extend several more existin...

Visually weighted regression in R (à la Solomon Hsiang)

August 30, 2012

, and also the discussions on the Statistical Modeling, Causal

F1 2012 Mid-Season Review

August 30, 2012

Rather belatedly, I got around to posting a series of posts summarising the Formula One season to date: F1 2012 Mid-Season Review – Grid/Classification Analysis: for example, how do the drivers’ grid and final classifications compare? F1 2012 Mid-Season Review – Pit Stops: for example, how does pit stop performance across the teams compare? F1

Late to the ggplot2 party

August 29, 2012

I have resisted learning the popular R graphics package, ggplot2. I dismissed ggplot2 as primarily useful for exploratory graphics and rationalized my avoidance of ggplot2 by assuming that it would require just as many (or more) lines of code as the R base package to whip the default plots into publication-quality figures. The few times
