# Blog Archives

## Non-Uniform Population Density in some European Countries

April 17, 2016

A few months ago, I mentioned that France is a country with strong inequalities, especially in higher education and research teams. Paris has almost 50% of the CNRS researchers, while only 3% of the population lives there. CNRS, "répartition des chercheurs en SHS" (distribution of researchers in the humanities and social sciences) http://t.co/39dcJJBwrF, Paris 47.52% IdF 66.85% (pop 3.39% and 18.18% resp.) pic.twitter.com/OsEXiFywPf - Arthur...

## How long could it take to run a regression

April 6, 2016

This afternoon, while talking with Montserrat (aka @mguillen_estany), we wondered how long it might take to run a regression model, and more specifically how long it might take with a Bayesian approach. My guess was that the time should be roughly linear in $n$, the number of observations. But I thought it would be good to check. Let...
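One rough way to check such a guess is to time a fit for several values of $n$. The sketch below is only illustrative: it uses an ordinary `lm()` fit as a stand-in for the Bayesian model (an assumption, since the excerpt does not show the model used), and `time_regression` and `sizes` are hypothetical names.

```r
# Time a single least-squares fit on n simulated observations.
# lm() is a stand-in for the Bayesian fit discussed above (assumption).
time_regression <- function(n, seed = 1) {
  set.seed(seed)
  x <- rnorm(n)
  y <- 1 + 2 * x + rnorm(n)
  # elapsed wall-clock time, in seconds
  unname(system.time(lm(y ~ x))["elapsed"])
}

# Hypothetical grid of sample sizes
sizes <- c(1e3, 1e4, 1e5)
timings <- data.frame(n = sizes, seconds = sapply(sizes, time_regression))
print(timings)
```

If the time is linear in $n$, the `seconds` column should grow by roughly a factor of ten from one row to the next, up to timing noise.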

## Computational Actuarial Science, with R, in Barcelona

April 5, 2016

This Wednesday, I will give a graduate crash course on computational actuarial science, with R, which will be the second part of Tuesday's lecture. Slides are now available.

## Where People Live, part 2

April 4, 2016

Following my previous post, I wanted to use another dataset to visualize where people live on Earth. The dataset comes from sedac.ciesin.columbia.edu. Once you register, you can download the database, `> base=read.table("glp00ag15.asc",skip=6)` The database is a ‘big’ 1440×572 matrix; each cell (latitude and longitude) holds the population, `> X=t(as.matrix(base,ncol=1440))` `> dim(X)` `1440 572` The dataset...

## Classification on the German Credit Database

March 18, 2016

In our data science course this morning, we used random forests to improve prediction on the German Credit Dataset. The dataset is `> url="http://freakonometrics.free.fr/german_credit.csv"` `> credit=read.csv(url, header = TRUE, sep = ",")` Almost all variables are treated as numeric, but actually, most of them are factors, `> str(credit)` `'data.frame': 1000 obs. of 21 variables: $ Creditability : int 1...`
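Converting integer-coded columns to factors can be sketched on a toy data frame. Only `Creditability` appears in the excerpt above; the other column names, the values, and the choice of which columns are categorical are assumptions made for illustration.

```r
# Toy stand-in for the credit data (values are made up).
# Only `Creditability` is taken from the excerpt; the rest is hypothetical.
toy <- data.frame(
  Creditability = c(1L, 0L, 1L, 1L),
  Purpose       = c(2L, 0L, 9L, 2L),
  Duration      = c(18L, 9L, 12L, 12L)
)

# Columns assumed to be categorical despite being read as integers
factor_cols <- c("Creditability", "Purpose")
toy[factor_cols] <- lapply(toy[factor_cols], as.factor)

str(toy)
```

Columns left out of `factor_cols` (here `Duration`) stay numeric.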

## Forecasts with ARIMA Models

March 16, 2016

In our time series class this morning, I was discussing forecasts with ARIMA models. Consider a simple stationary AR(1) simulated time series, `> n=95` `> set.seed(1)` `> E=rnorm(n)` `> X=rep(0,n)` `> phi=.85` `> for(t in 2:n) X[t]=phi*X[t-1]+E[t]` `> plot(X,type="l")` If we fit an AR(1) model, `> model=arima(X,order=c(1,0,0),` `+ include.mean = FALSE)` `> P=predict(model,n.ahead=20)` `> plot(P$pred)` `> lines(P$pred+2*P$se,col="red")` `> lines(P$pred-2*P$se,col="red")`...
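For a zero-mean AR(1) model, the $h$-step-ahead forecast is $\hat{\varphi}^h X_n$, so it decays geometrically towards zero. The sketch below simulates the series with the recursion $X_t=\varphi X_{t-1}+\varepsilon_t$ and compares that closed form with the output of `predict()`; `phi_hat`, `by_hand` and `max_gap` are names introduced here for illustration.

```r
# Simulate a stationary AR(1) series
set.seed(1)
n <- 95
phi <- 0.85
E <- rnorm(n)
X <- rep(0, n)
for (t in 2:n) X[t] <- phi * X[t - 1] + E[t]

# Fit an AR(1) model without intercept and forecast 20 steps ahead
model <- arima(X, order = c(1, 0, 0), include.mean = FALSE)
P <- predict(model, n.ahead = 20)

# Closed-form forecast: estimated coefficient to the power h, times the last value
phi_hat <- unname(coef(model)["ar1"])
by_hand <- phi_hat^(1:20) * X[n]

# Largest absolute gap between predict() and the closed form
max_gap <- max(abs(as.numeric(P$pred) - by_hand))
print(max_gap)
```

The gap should be zero up to floating-point error, which is one way to see why the forecast curve flattens out as the horizon grows.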

## Where People Live

March 3, 2016

There was an interesting map on reddit this morning, with a visualisation of the latitude and longitude of where people live on Earth. So I tried to reproduce it. To compute the density, I used a kernel-based approach, `> library(maps)` `> data("world.cities")` `> X=world.cities[,c("lat","pop")]` `> liss=function(x,h){` `+ w=dnorm(x-X[,"lat"],0,h)` `+ sum(X[,"pop"]*w)` `+ }` `> vx=seq(-80,80)` `> vy=Vectorize(function(x) liss(x,1))(vx)` `> vy=vy/max(vy)`...
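The kernel idea can be tried without the `maps` package: weight each city's population by a Gaussian kernel centred at the latitude of interest, then sum. The sketch below runs on simulated latitudes and populations; `lat` and `pop` are hypothetical stand-ins for the city data, not the real dataset.

```r
# Hypothetical stand-in for city data: latitudes (degrees) and populations
set.seed(1)
lat <- runif(500, min = -60, max = 70)
pop <- rexp(500, rate = 1e-5)

# Gaussian-kernel weighted sum of population around latitude x, bandwidth h
liss <- function(x, h) sum(pop * dnorm(x - lat, mean = 0, sd = h))

vx <- seq(-80, 80)
vy <- sapply(vx, liss, h = 1)
vy <- vy / max(vy)  # rescale so that the peak equals 1

print(range(vy))
```

A larger bandwidth `h` gives a smoother curve, at the cost of blurring narrow peaks.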

## R Crash Course, Data Science for Actuaries, Year 2

March 1, 2016

This Monday, we will start the second year of the Actuary: Data Science (ADS) program, supported by the (French) Institute of Actuaries. I will be there on Monday morning for the opening, and we will start the R & Datamining course. The slides are now online. In order to get nice slides, I have been using slidify.

## Mortality by Weekday and Age

February 27, 2016

A few days ago, I mentioned on Twitter a nice graph, with Mortality by Weekday and Age https://t.co/LyzQ7nJABZ very interesting difference, young vs. old pic.twitter.com/EfrX0C1GBS - Arthur Charpentier (@freakonometrics) February 27, 2016 My colleague Jean-Philippe was extremely sceptical, so I tried to reproduce that graph. The good thing is that we have the Social Security Death Master File,...

## Spatial and Temporal Viz of Gas Price, in France

February 25, 2016

A great thing in France is that we can play with a great database of gas prices, in all gas stations, almost every day. The file is rather big, so let’s make sure we have enough memory to run our code, `> rm(list=ls())` To extract the data, first, we should extract the xml file, and then convert it in a...