Articles by arthur charpentier

Another Interactive Map for the Cholera Dataset

March 31, 2015 | arthur charpentier

Following my previous post, François (aka @FrancoisKeck) posted a comment mentioning another package I could use to get an interactive map, the rleafmap package. The heatmap was easy to include here. This time, we do not use OpenStreetMap. The first part is still the same, to get the ... [Read more...]

Interactive Maps for John Snow’s Cholera Data

March 28, 2015 | arthur charpentier

This week, in Istanbul, for the second training on data science, we’ve been discussing classification and regression models, but also visualisation, including maps. And we did have a brief introduction to the leaflet package, devtools::install_github("rstudio/leaflet") require(leaflet) To see what can be done with that ... [Read more...]

Splitting a Node in a Tree

March 23, 2015 | arthur charpentier

If we grow a tree with standard functions in R, on the same dataset used to introduce classification trees in a previous post, MYOCARDE=read.table("http://freakonometrics.free.fr/saporta.csv", head=TRUE,sep=";") library(rpart) cart library(rpart.plot) library(rattle) prp(cart,type=2,extra=1) The first step ... [Read more...]

Forecast, Automatic Routines vs. Experience

March 18, 2015 | arthur charpentier

This morning, in our Time Series course, we’ve been playing with some data I got from google.ca/trends/. Actually, we’ve been playing with an older version, downloaded 18 months ago (discussed in a previous post, in French). urls = "http://freakonometrics.free.fr/report-headphones-2015.csv" report=read.table(urls,... [Read more...]

Growing some Trees

March 18, 2015 | arthur charpentier

Consider here the dataset used in a previous post, about visualising a classification (with more than 2 features), MYOCARDE=read.table("http://freakonometrics.free.fr/saporta.csv", header=TRUE,sep=";") The default classification tree is arbre = rpart(factor(PRONO)~.,data=MYOCARDE) rpart.plot(arbre,type=4,extra=6) We can change the options ... [Read more...]

Visualising a Classification in High Dimension

March 6, 2015 | arthur charpentier

So far, when discussing classification, we’ve been playing with my toy dataset (actually, I should not claim it’s mine, it is inspired by the one used in the introduction of Boosting, by Robert Schapire and Yoav Freund). But in real life, there are more observations, and more explanatory variables.... [Read more...]

John Snow, and Google Maps

February 27, 2015 | arthur charpentier

In my previous post, I discussed how to use OpenStreetMap (and standard plotting functions of R) to visualize John Snow’s dataset. But it is also possible to use Google Maps (and ggplot2-style graphs). library(ggmap) get_london [Read more...]

John Snow, and OpenStreetMap

February 27, 2015 | arthur charpentier

While I was preparing a training on data visualization, I wanted to get a nice visual for John Snow’s cholera dataset. This dataset can actually be found in a great package of famous historical datasets. library(HistData) data(Snow.deaths) data(Snow.streets) One can easily visualize the ... [Read more...]

Visualizing Clusters

February 24, 2015 | arthur charpentier

Consider the following dataset, with (only) ten points x=c(.4,.55,.65,.9,.1,.35,.5,.15,.2,.85) y=c(.85,.95,.8,.87,.5,.55,.5,.2,.1,.3) plot(x,y,pch=19,cex=2) We want to get – say – two clusters. Or more specifically, two sets of observations, each of them sharing some similarities. Since the number of observations is rather small, it is actually possible to ... [Read more...]
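
A minimal sketch of one way to split these ten points into two groups, assuming complete-linkage hierarchical clustering on Euclidean distances. This is an illustrative choice, not necessarily the approach the full post takes, and the names pts, tree and groups are introduced here for illustration only.

```r
# Ten points from the excerpt above
x <- c(.4,.55,.65,.9,.1,.35,.5,.15,.2,.85)
y <- c(.85,.95,.8,.87,.5,.55,.5,.2,.1,.3)
pts <- cbind(x, y)

# Assumption: complete-linkage hierarchical clustering on Euclidean distances
tree <- hclust(dist(pts), method = "complete")

# Cut the dendrogram so that exactly two clusters remain
groups <- cutree(tree, k = 2)

# Colour each point by its cluster label
plot(x, y, pch = 19, cex = 2, col = groups)
```

Changing k in cutree() gives a different number of clusters from the same dendrogram, without recomputing the distances.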

k-means clustering and Voronoi sets

February 22, 2015 | arthur charpentier

In the context of k-means, we want to partition the space of our observations into k classes. Each observation belongs to the cluster with the nearest mean. Here “nearest” is in the sense of some norm, usually the Euclidean norm. Consider the case where we have 2 classes. The means being respectively ... [Read more...]
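
The assignment rule described here (each observation goes to the cluster with the nearest mean) can be sketched in a few lines of base R. The ten points are borrowed from the "Visualizing Clusters" excerpt above. The two means are hypothetical values picked by hand for illustration, and the names pts, means, d2 and cluster are not taken from the post.

```r
# Ten points borrowed from the "Visualizing Clusters" excerpt
x <- c(.4,.55,.65,.9,.1,.35,.5,.15,.2,.85)
y <- c(.85,.95,.8,.87,.5,.55,.5,.2,.1,.3)
pts <- cbind(x, y)

# Two hypothetical cluster means (one per row), chosen by hand
means <- rbind(c(.6, .8), c(.3, .3))

# Squared Euclidean distance from every point to every mean (10 x 2 matrix)
d2 <- sapply(seq_len(nrow(means)), function(j) {
  (pts[, 1] - means[j, 1])^2 + (pts[, 2] - means[j, 2])^2
})

# Each observation belongs to the cluster with the nearest mean
cluster <- apply(d2, 1, which.min)

# Points coloured by cluster, with the two means drawn as crosses
plot(x, y, pch = 19, cex = 2, col = cluster)
points(means, pch = 4, cex = 2, lwd = 2)
```

With two means, the regions assigned to each cluster are the two half-planes on either side of the perpendicular bisector of the segment joining the means. With more means, the same rule yields the Voronoi cells named in the title.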

Inequalities and Quantile Regression

February 6, 2015 | arthur charpentier

In the course on inequality measures, we've seen how to compute various (standard) inequality indices, based on some sample of incomes (that can be binned, in various categories). On Thursday, we discussed the fact that incomes can be related to different variables (e.g. experience), and that comparing income inequalities ... [Read more...]
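
As one example of a standard inequality index computed from a sample of incomes, here is a minimal sketch of the Gini index in base R. The excerpt does not say which indices the course covers, so the choice of the Gini index is an assumption, and both the function name gini and the income values are hypothetical.

```r
# Gini index of a vector of non-negative incomes:
# 0 means perfect equality; values close to 1 mean strong concentration
gini <- function(income) {
  income <- sort(income)          # order incomes from lowest to highest
  n <- length(income)
  sum((2 * seq_len(n) - n - 1) * income) / (n * sum(income))
}

incomes <- c(12, 18, 25, 31, 47, 90)  # hypothetical sample of incomes
gini(incomes)
```

When everyone has the same income the index is 0, and when a single person out of n holds all the income it equals (n - 1) / n.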