2311 search results for "map"

Data fishing: R and XML part 3

February 18, 2013

I’ve recently posted two blogs about gathering data from web pages using functions in R. Both examples showed how we can create our own custom functions to gather data about Minnesota lakes from the Lakefinder website. The first post was an example showing the use of R to create our own custom functions to get

Read more »

Revisiting Cleveland’s The Elements of Graphing Data in ggplot2

February 18, 2013

I was flipping through my copy of William Cleveland’s The Elements of Graphing Data the other day; it’s a book worth revisiting. I’ve always liked Cleveland’s approach to visualization as statistical analysis. His quest to ground visualization principles in the context of human visual cognition (he called it “graphical perception”) generated useful advice for designing

Read more »

When SAP HANA met R – What’s new?

February 18, 2013

Since I wrote my blog post When SAP HANA met R - First kiss, I have received a lot of nice feedback...and one piece of that feedback was..."What's new?"...Well...as you might know, SAP HANA works with R by using Rserve, a package that allows communication to an R Serv...

Read more »

Reshaping Horse Import/Export Data to Fit a Sankey Diagram

February 18, 2013

As the food labeling and substituted horsemeat saga rolls on, I’ve been surprised at how little use has been made of “data” to put the structure of the food chain into some sort of context* (or maybe I’ve just missed those stories?). One place that can almost always be guaranteed to post a few related

Read more »

Google Statistician uses R and other programming tools

February 16, 2013

A great interview on the Simply Statistics blog with Google's Nick Chamandy, PhD in Statistics. He explains that he mainly uses R, among other tools, to perform his work at Google. Also of note is the active data science community within Google ...

Read more »

Version 1.0 of multilevelPSA Available on CRAN

February 14, 2013

Version 1.0 of multilevelPSA has been released to CRAN. The multilevelPSA package provides functions to estimate and visualize propensity score models with multilevel, or clustered, data. The graphics are an extension of the PSAgraphics package by Helmreich and Pruzek. The example below will investigate the differences between private and public schools internationally using the Programme for International Student Assessment...

Read more »

In case you missed it: January 2013 Roundup

February 13, 2013

In case you missed them, here are some articles from January of particular interest to R users. Anthony Damico created an amusing and useful flowchart for finding resources for learning R, especially for survey analysis. All R users: please be counted for the 2013 Rexer Data Miner Survey (R was the #1 software reported in the last survey). Relatedly,...

Read more »

Basic R: rows that contain the maximum value of a variable

February 12, 2013

File under “I keep forgetting how to do this basic, frequently-required task, so I’m writing it down here.” Let’s create a data frame which contains five variables, vars, named A – E, each of which appears twice, along with some measurements: Now, let’s say we want only the rows that contain the maximum values of

Read more »

What Analytic Software are People Discussing?

February 12, 2013

by Robert A. Muenchen. How can we measure the popularity or market share of analytic software? One way is to see what people are discussing. I’m in the process of updating my annual article, The Popularity of Data Analysis Software. Below …

Read more »

The Problem with Testing for Heteroskedasticity in Probit Models

February 12, 2013

A friend recently asked whether I trusted the inferences from heteroskedastic probit models. I said no, because the heteroskedastic probit does not allow a researcher to distinguish between non-constant variance and a mis-specified mean function. In particular, my friend had a hypothesis that the variance of the latent outcome (commonly called "y-star") should increase with an

Read more »