Global Migration Maps

March 17, 2011

 Migrations of people have existed for millennia and oc

Read more »

basic ggplot2 network graphs

March 17, 2011

I have been looking around on the web and have not found anything yet related to using ggplot2 for making graphs/networks. I put together a few functions to make very simple graphs. The bipartite function especially is not ideal, as of course we only w...

Read more »

Having a problem with R-2.12.2 64-bit and "gam" package!

March 17, 2011

While working with some pitch location data recently, I ran across something strange when using my new computer (with R-2.12.2 64-bit) versus my work computer (with R-2.11.1 x64). Both are 64-bit computers, but I got the new one for portability (it's a laptop) and speed. Anyway, I had been doing some work in the office with Pitch F/X data,...

Read more »

Canabalt Revisited: Gamma Distributions, Multinomial Distributions and More JAGS Goodness

March 16, 2011

Neil Kodner recently got me interested again in analyzing Canabalt scores statistically by writing a great post in which he compared the average scores across iOS devices. Thankfully, Neil’s made his code and data freely available, so I’ve been revising my original analyses using his new data whenever I can find a free minute.

Read more »

How the New York Times uses R for Data Visualization

March 16, 2011

The New York Times introduced R to the world with a feature article in 2009, and has been using R for many years to support its pioneering presentation data analysis and visualization, under the direction of graphics editor Amanda Cox. Last week, the New York R User Group's featured speaker was Amanda Cox, where she presented ... how R...

Read more »

Updates to SoilWeb Mobile: Distance from Nearest Map Unit Boundary

March 16, 2011

Working on some new ideas on how map unit data can be summarized on small screens, particularly for our mobile version of SoilWeb. The distance from the nearest map unit polygon boundary is now printed above mini soil profile sketches. This gives the ...

Read more »

Textural triangle plot in R

March 16, 2011

Hi, these days I'm working with soil textural data, and one of the key points of these data is the presentation of the results. The best way is an old-school texture triangle!!! Because I like to do all my stuff in R, instead of opening draw software such as...

Read more »

sab-R-metrics: Brief Sidetrack for Scatterplot Matrices

March 16, 2011

In my last two posts I talked about Ordinary Least Squares, then extended this discussion to the multiple predictor case and briefly talked about some of the problems that may arise. These problems can include omitted variable bias, heteroskedasticity, non-normality, and multicollinearity. Most of these problems are relatively minor in practice and have easy fixes,...

Read more »

Installing StatET

March 16, 2011

For coding in R I wanted something simple, like using it in Eclipse. StatET was the perfect solution for me. It lets you open an R console inside Eclipse and do everything from Eclipse. So let’s install it. Since I had some troubles inst...

Read more »

Machine Learning Ex4 – Logistic Regression and Newton’s Method

March 16, 2011

Exercise 4 is all about using Newton's Method to implement logistic regression on a classification problem. For all this to make sense, I suggest having a look at Andrew Ng's machine learning lectures on OpenClassroom. We start with a dataset representing 40 students who were admitted to college and 40 students who were not admitted, and their corresponding...

Read more »

Nice simple notes on running R in parallel by Geyer

March 15, 2011

Here, by Charles J. Geyer.

Read more »

Visualizing Growth of a Retail Chain

March 15, 2011

I am a regular reader of the FlowingData blog by Nathan Yau. It is an excellent reference for anyone interested in statistical visualization of data. One of his posts that caught my attention was a visualization of the growth of Walmart in the US. Given my research interests in retail, it was a fascinating insight

Read more »

More pi plus 1 (or plus 0.01) day fun

March 15, 2011

Since I just didn’t get enough this morning, I spent some more time fooling around with estimating pi. Since I was basically counting the number of random x,y pairs inside a quarter circle and computing a sample average for more …

Read more »
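
The counting approach mentioned in the teaser above can be sketched in a few lines. This is only an illustrative Python sketch, not the post's own R code: the function name `estimate_pi` and its parameters are assumptions made for this example. It draws uniform points in the unit square, counts the fraction that falls inside the quarter circle of radius 1, and multiplies by 4.

```python
import random


def estimate_pi(n_points, seed=None):
    """Monte Carlo estimate of pi from points in the unit square.

    Hypothetical helper for illustration; not taken from the original post.
    """
    rng = random.Random(seed)  # local generator so runs are reproducible
    inside = 0
    for _ in range(n_points):
        x, y = rng.random(), rng.random()
        # The point lies inside the quarter circle when x^2 + y^2 <= 1.
        if x * x + y * y <= 1.0:
            inside += 1
    # The quarter circle has area pi/4, so the hit fraction estimates pi/4.
    return 4.0 * inside / n_points


if __name__ == "__main__":
    print(estimate_pi(100_000, seed=42))
```

The hit fraction is a sample average of 0/1 indicators, so its error shrinks roughly with the square root of the number of points.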

RStudio: My thoughts

March 15, 2011

Let me get this out of the way: I just love RStudio. Created by a team led by JJ Allaire, a name that should ring a bell if you were involved in web development during the Clinton administration, RStudio is an R IDE that is actually designed for R from...

Read more »

Webinar on integrating R with applications, March 16

March 15, 2011

A quick reminder that Revolution Analytics' CTO David Champagne will be hosting a live webinar tomorrow (March 16) on Integrating R into 3rd Party and Web Applications Using RevoDeployR. Designed for application developers, this webinar will cover publishing R scripts to the RevoDeployR server, and integrating their results into Web applications, Microsoft Excel, JasperReports Server and more. Complete details...

Read more »

New R User Group in Orange County, CA

March 15, 2011

The Orange County R User Group was formed to bring local R users together in a friendly, business-oriented environment. This is the fifth R user group in California. Founder Ray DiGiacomo, Jr. says, "I feel this group is necessary because the current Los Angeles and San Diego R User Groups are quite far from Orange County. Also, Orange County...

Read more »

Example 8.30: Compare Poisson and negative binomial count models

March 15, 2011

How similar can a negative binomial distribution get to a Poisson distribution? When confronted with modeling count data, our first instinct is to use Poisson regression. But in practice, count data is often overdispersed. We can fit the overdispersio...

Read more »
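
The question in the teaser above can be explored numerically. The following is a hedged Python sketch, not the post's own code: the function names and the mean/size parameterization (variance = mu + mu^2/size) are assumptions chosen for illustration. It measures the total variation distance between a Poisson distribution and a negative binomial with the same mean, which shrinks as the size parameter grows.

```python
import math


def poisson_pmf(k, mu):
    """Poisson probability of count k with mean mu (computed on the log scale)."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def negbin_pmf(k, mu, size):
    """Negative binomial probability of count k with mean mu.

    Assumed parameterization: variance = mu + mu**2 / size.
    """
    log_p = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(size / (size + mu))
        + k * math.log(mu / (size + mu))
    )
    return math.exp(log_p)


def total_variation(mu, size, k_max=500):
    """Half the summed absolute pmf difference over counts 0..k_max."""
    return 0.5 * sum(
        abs(poisson_pmf(k, mu) - negbin_pmf(k, mu, size))
        for k in range(k_max + 1)
    )


if __name__ == "__main__":
    for size in (1, 10, 100, 1000):
        print(size, total_variation(3.0, size))
```

A small size means strong overdispersion and a large distance from the Poisson; as size increases, the extra variance term mu^2/size vanishes and the two distributions become hard to tell apart.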

Want to say one thing and the exact opposite with strong confidence?

March 15, 2011

No need to go into politics. Just take a statistics course. And I am not talking about the misinterpretation of statistics, but about the mathematical foundations of statistical tests. Consider the following parametric test, with a one-dimensional para...

Read more »

Chemometrics with R

March 15, 2011

I just heard that my supervisor's book Chemometrics with R was released, and I immediately requested our library to get a copy. Ron introduced me to R at a time when most in our department were still using Matlab. In fact, I had been maintaining Matlab s...

Read more »

I’m late for π day

March 15, 2011

It is officially no longer pi day, but I didn’t see this Drew Conway post about estimating pi until just a few minutes ago. Because Google Reader doesn’t show github embeds, I also got to try it without seeing Drew’s …

Read more »

How to backtest a strategy in Excel

March 14, 2011

(This is a guest post by Damian from Skill Analytics and ETF Prophet) Let me start by saying that I’m not an expert in backtesting in Excel – there are a load of very smart bloggers out there that have, as I would say, “mad skillz” at working with Excel including (but not limited to) Michael Stokes over...

Read more »

UAH Temperature Anomalies Following Predictable Pattern

March 14, 2011

In this post I show one simple and two multiple regression models to assess the role of time, El Nino – La Nina SSTA and volcanic activity (SATO) on UAH global temperature anomaly trends. The 3rd model provides a reasonable …

Read more »

Parallel computation [revised]

March 14, 2011

We have now completed our revision of the parallel computation paper and hope to send it to JCGS within a few days. As seen on the arXiv version, and given the very positive reviews we received, the changes are minor, mostly focusing on the explanation of the principle and on the argument that it comes

Read more »