Radical Education Reform? Think Bigger.

April 2, 2012

“My job is to teach you how to think.” –Hugh Young

A few days ago John Naughton published an article summarizing his manifesto on how to reform computer science education. I agree computer science education is in need of drastic...

Read more »

Exploring Pollen Data

April 2, 2012

I wrote a few functions to grab data from the Global Pollen Database:

source("~/code/_Pollen/pollendatafuntions.R")
## Loading required package: stratigraph
## Loading required package: grid
## Loading required package: graphics
## Loading required package: stats
billys <- getpctAP("billys", plot = TRUE)
## Number of taxa: 105
## Number of levels: 77

Arboreal Pollen over time at Billy’s...

Read more »

Web-Scraping in R

April 2, 2012

Web-scraping, or web-crawling, sounds like a seedy activity worthy of an Interpol investigative department. The reality, however, is far less nefarious. Web-scraping is any procedure by which someone extracts data from the internet. Given that it’s possible to get the internet on computers these days, web-scraping opens an array of interesting possibilities to social-science researchers
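As a rough illustration of "extracting data from the internet", here is a minimal sketch in base R. The HTML snippet, the city names, and the values are all made up for the example; in practice the lines would come from something like readLines() on a real URL, and a proper HTML parser (for instance the XML package) is usually more robust than the regular expression used here.

```r
# Hypothetical page content, invented for illustration only.
html <- c(
  "<html><body>",
  "<table>",
  "<tr><td>Albany</td><td>41.2</td></tr>",
  "<tr><td>Boston</td><td>43.7</td></tr>",
  "</table>",
  "</body></html>"
)

# Keep only the table rows, then pull out the text inside each <td> cell.
rows  <- grep("<tr>", html, fixed = TRUE, value = TRUE)
cells <- regmatches(rows, gregexpr("(?<=<td>)[^<]*(?=</td>)", rows, perl = TRUE))

# Assemble the extracted cells into a data frame, one row per table row.
scraped <- data.frame(
  city  = vapply(cells, `[`, character(1), 1),
  value = as.numeric(vapply(cells, `[`, character(1), 2)),
  stringsAsFactors = FALSE
)
```

The regular expression assumes simple, well-formed cells with no nested tags, which is rarely guaranteed on real pages.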

Read more »

An unabashedly narcissistic data analysis of my own tweets. The…

April 2, 2012

pie( table( whence.i.tweet )) qplot( whence ) + coord_polar() pie( log( table( whence )))+RColorBrewer ggplot (see below) plot( density( tweets.len )) qplot(... stat="density") + geom_density qplot(...stat="bin") + geom_text(...) tweeple tweep...

Read more »

Working with Globcolour data

April 2, 2012

The Globcolour project (http://www.globcolour.info/) provides relatively easy access to ocean color remote sensing data. Data is provided at http://hermes.acri.fr/ and the following parameters are available:
· Chlorophyll-a (CHL1 and CHL2)
· Fully normalised water leaving radiances at 412, 443, 490, 510, 531, 550-565, 620, 665-670, 681 and 709 nm (Lxxx)
· Coloured dissolved and detrital...

Read more »

Introduction to ORE Embedded R Script Execution

April 2, 2012

This Oracle R Enterprise (ORE) tutorial, on embedded R execution, is the third in a series to help users get started using ORE. See these links for the first tutorial on the transparency layer and second tutorial on the statistics engine. Oracle R Enterprise is a component in the Oracle Advanced Analytics Option of Oracle Database Enterprise...

Read more »

Add a frame to a map

April 2, 2012

Here is a function that adds a frame of alternating colors to an un-projected map. One defines the extension of each bar (in degrees) and an optional width of the bars (in inches). It uses the "joinPolys" function of the package to trim the bars near the map corners where the axes meet. The map.frame...

Read more »

3-D graphing with Google

April 2, 2012

You probably already knew that you can draw mathematical equations in Google by typing the equation into the search box. For example, here's the Standard Normal density function: I can't find a way to embed the graph directly, but if you click on it you'll find it's interactive: you can inspect points, zoom in/out etc. You can create a...

Read more »

Playing with fire (or water)

April 2, 2012

A few days ago, http://www.futilitycloset.com/ published a short post based on the fourth problem of the 1987 Canadian Mathematical Olympiad (itself based on a problem from the 6th All Soviet Union Mathematical Competition in Voronezh, 1966). The problem i...

Read more »

Example 9.25: It’s been a mighty warm winter? (Plot on a circular axis)

April 2, 2012

Updated (see below). People here in the northeast US consider this to have been an unusually warm winter. Was it? The University of Dayton and the US Environmental Protection Agency maintain an archive of daily average temperatures that's reasonably current. In the case of Albany, NY (the most similar of their records to our...

Read more »

Surveys, Assumptions, and the Need for Data Collection Alternatives

April 2, 2012

This is a long post. My previous posts have mostly been about my thoughts on various research subjects. This one reports an actual analysis. If you don’t want to read the whole thing, here are the highlights: We really need to stop using surveys so much. If we have to use surveys, it’s probably best

Read more »

My main resources for R programming

April 2, 2012

In this article, you will find a list of Internet resources that may be useful if you are programming in R. If you have a problem with R and

Read more »

Linking apples liking to analytical data

April 2, 2012

This post describes the last puzzle piece of the model. The link of instrumental to sensory data. Together with the previous pieces this leads to a model starting from physico-chemical measurements, to sensory data, to consumers' perception and finally...

Read more »

Simple data mining and plotting data on a map with ggplot2

April 2, 2012

In this post I use OpenStreetMap and ggplot2 to plot where psychologists live, using data from a Facebook document.

Read more »

Replacing market indices

April 2, 2012

If equity markets suddenly sprang into existence now, would we create market indices? I’m doubtful. Why an index? The Dow Jones Industrial Average was born in 1896. This was when computers were humans with adding machines (but they did do parallel processing). At that point boiling “the market” down to a single number had value. …

Read more »

Sunday evening, stupid games…

April 1, 2012

This evening, as I was about to wash the dishes, I heard my elders starting a game (call them Him and Her): Him: "I have picked - in my head - a number, lower than 50. Try to guess..." Her: "No way, too difficult..." Him: "You can try five differ...

Read more »

Missing Data Club

April 1, 2012

Welcome to Missing Data Club. There are only three rules. Rule #1 is: There is no missing data. Rule #2 is: THERE IS NO MISSING DATA! Rule #3: If you’ve never built a model using missing data – you must do it...

Read more »

Quantitative finance and computational systems

April 1, 2012

I’m writing a book proposal based on the lecture notes for my R for Quants workshop I conducted at the …

Read more »

A better way of saving and loading objects in R

April 1, 2012

Hadley Wickham (@hadleywickham) this week mentioned on Twitter his preference for saveRDS() over the more familiar save(). Being a new function to me, I thought I’d take a look… save() and load() will be familiar to many R users. They …
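For readers who have not met saveRDS() before, the practical difference from save() can be sketched in a few lines. This is only an illustration, not the code from the post: the object fit and the temporary file paths are hypothetical.

```r
# Hypothetical object, used only for illustration.
fit <- lm(mpg ~ wt, data = mtcars)

rds_path <- tempfile(fileext = ".rds")
rda_path <- tempfile(fileext = ".RData")

# saveRDS() stores a single object without its name,
# so readRDS() returns it and the caller chooses what to call it.
saveRDS(fit, rds_path)
fit_copy <- readRDS(rds_path)

# save() stores the object together with its name;
# load() restores it under that name and returns the names it restored.
save(fit, file = rda_path)
restored_env   <- new.env()
restored_names <- load(rda_path, envir = restored_env)
```

Loading into a separate environment, as done here, is one way to avoid load() silently overwriting an existing object of the same name.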

Read more »

Julia, I Love You

March 31, 2012

Julia is a new language for scientific computing that is winning praise from a slew of very smart people, including Harlan Harris, Chris Fonnesbeck, Douglas Bates, Vince Buffalo and Shane Conway. As a language, it has lofty design goals, which, if attained, will make it noticeably superior to Matlab, R and Python for scientific programming.

Read more »

Back to Blogging

March 31, 2012

If you’re subscribed to this blog, you’ve surely noticed the very long hiatus I’ve taken from writing over the last six months. I wish I’d kept up with blogging more faithfully this year, but, in my defense, I’ve been busy doing a few big things: I wrote a book with Drew Conway called Machine Learning

Read more »

More on Philadelphia Homicide

March 31, 2012

I've been doing more analysis of the Philadelphia Homicide data that the Philadelphia Inquirer has published, and presented some of it at the Philadelphia UseR group yesterday. My slides and source are on GitHub. I should be clear tha...

Read more »

Draw Your Breast with CloudStat – A R Apps (for fun)

March 31, 2012

This is a simple app called “Draw Your Breast with R”, created with R to generate breast-like graphics. With this app, you can change four parameters (Theta, Phi, Expand and Color) to generate graphics like...

Read more »

Ggplot2, PubMed citation frequency and DSM-IV Axis I disorders by year

March 31, 2012

I searched PubMed for several major DSM-IV disorders and downloaded the hits. Using ggplot2 I plotted the number of publications each year for each disorder.

Read more »

Playing with XML-Package: Get No. of Google Search Hits with R

March 30, 2012

GoogleHits <- function(input) { require(XML) require(stringr) require(RCurl) url

Read more »

GBLUP example in R

March 30, 2012

Shirin Amiri was asking about GBLUP (genomic BLUP), and based on her example I set up the following R script to show how GBLUP works. Note that this is the so-called marker model, where we estimate allele substitution effects of the markers, and not the individual-based model, where genomic breeding values are inferred directly. The code: library(package="MatrixModels") dat <- data.frame(...
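To make the idea of the marker model concrete, here is a minimal sketch in base R rather than with the MatrixModels package used in the post. Everything in it is an assumption made for illustration: the genotypes and phenotypes are simulated, the model has an intercept only, and the variance ratio lambda is simply fixed at 1 instead of being estimated.

```r
set.seed(1)  # simulated data, purely illustrative

n_ind <- 20
n_mrk <- 5

# Hypothetical genotypes coded as 0/1/2 copies of an allele.
Z <- matrix(sample(0:2, n_ind * n_mrk, replace = TRUE), n_ind, n_mrk)
a_true <- c(1, -0.5, 0, 0.3, 0)
y <- 10 + drop(Z %*% a_true) + rnorm(n_ind, sd = 0.5)

X <- matrix(1, n_ind, 1)  # intercept only
lambda <- 1               # assumed ratio of residual to marker variance

# Mixed model equations for the marker model:
# [X'X  X'Z           ] [b]   [X'y]
# [Z'X  Z'Z + lambda*I] [a] = [Z'y]
lhs <- rbind(
  cbind(crossprod(X),    crossprod(X, Z)),
  cbind(crossprod(Z, X), crossprod(Z) + lambda * diag(n_mrk))
)
rhs <- rbind(crossprod(X, y), crossprod(Z, y))
sol <- solve(lhs, rhs)

b_hat <- sol[1]             # estimated intercept
a_hat <- sol[-1]            # estimated allele substitution effects
gebv  <- drop(Z %*% a_hat)  # breeding values implied by the marker effects
```

In this formulation the breeding values are not estimated directly; they are obtained afterwards by multiplying the genotypes by the estimated marker effects, which is the distinction the excerpt draws with the individual-based model.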

Read more »

VIDEO: "R" Checking the reference values ("Y" Matrix).

March 30, 2012

(This article was first published on NIR-Quimiometría, and kindly contributed to R-bloggers)

Read more »

See the wind

March 30, 2012

The image below isn't a bearskin rug in the shape of the USA. In fact, it's a visualization of the wind flowing over the United States, as of 4PM EDT today, March 30. You can click through to see the current wind conditions, based on the latest data from the National Digital Forecast Database. But more importantly, as long as...

Read more »