Blog Archives

Confidence vs. Credibility Intervals

November 26, 2014

Tomorrow, for the final lecture of the Mathematical Statistics course, I will try to illustrate - using Monte Carlo simulations - the difference between classical statistics and the Bayesian approach. The (simple) way I see it is the following: for frequentists, a probability is a measure of the frequency of repeated events, so the interpretation is that parameters are...
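A minimal sketch of the kind of Monte Carlo experiment such a lecture could use (my own toy code, not the actual lecture material): check by simulation that a frequentist 95% confidence interval for a Gaussian mean covers the true, fixed parameter in roughly 95% of repeated samples.

```r
set.seed(1)
theta <- 2          # true (fixed) parameter, unknown to the statistician
n     <- 50         # sample size
nsim  <- 10000      # number of repeated experiments

covered <- replicate(nsim, {
  x  <- rnorm(n, mean = theta, sd = 1)
  ci <- mean(x) + c(-1, 1) * qnorm(0.975) / sqrt(n)  # 95% CI, known variance
  ci[1] <= theta && theta <= ci[2]
})
mean(covered)  # empirical coverage, close to 0.95
```

The frequentist statement is about the procedure: across repetitions, the random interval captures the fixed parameter about 95% of the time.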

Reinterpreting Lee-Carter Mortality Model

November 18, 2014

Last week, while I was giving my crash course on R for insurance, we discussed possible extensions of the Lee & Carter (1992) model. If we look at the seminal paper, the model is defined as follows, $\log \mu_{x,t} = \alpha_x + \beta_x \kappa_t + \varepsilon_{x,t}$. Hence, it means that $\mu_{x,t} = \exp(\alpha_x + \beta_x \kappa_t + \varepsilon_{x,t})$. This would be a (non)linear model on the logarithm of the mortality rate. A non-equivalent, but alternative expression...
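As a hedged illustration (on simulated data, not the course material), the classical Lee & Carter estimation can be sketched in base R: take $\alpha_x$ as the row means of the log-mortality surface, then a rank-one SVD of the centered matrix for $\beta_x$ and $\kappa_t$.

```r
set.seed(1)
ages  <- 0:90
years <- 1950:2010
# simulate a log-mortality surface with the Lee-Carter structure (hypothetical values)
ax <- -9 + 0.09 * ages
bx <- rep(1 / length(ages), length(ages))
kt <- seq(20, -20, length = length(years))
logm <- outer(ax, rep(1, length(years))) + outer(bx, kt) +
        matrix(rnorm(length(ages) * length(years), sd = 0.01),
               length(ages), length(years))

# estimation: alpha_x as row means, then first SVD component of the centered matrix
ax.hat <- rowMeans(logm)
s      <- svd(logm - ax.hat)
bx.hat <- s$u[, 1] / sum(s$u[, 1])            # identifiability: sum of beta_x is 1
kt.hat <- s$d[1] * s$v[, 1] * sum(s$u[, 1])   # rescaled so beta_x * kappa_t is unchanged
```

The normalization constraints (here, the $\beta_x$ summing to one) are a convention; the fitted surface $\hat\alpha_x + \hat\beta_x \hat\kappa_t$ does not depend on them.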

Excel (and French people) are such a pain in the…

November 6, 2014

A few days ago, I published a post entitled extracting datasets from excel files in a zipped folder, because I wanted to use datasets that were online, in some (zipped) Excel format. The first difficult part was the folder with a non-standard character (the French é). Because next week I should be using those datasets in a crash course...

Shapefiles from Isodensity Curves

November 3, 2014

Recently, with @3wen, we wanted to play with isodensity curves. The problem is that it is difficult to get – numerically – the equation of the contour (even if we can easily plot it). Consider the following surface (just for fun, in order to illustrate the idea)
> f=function(x,y) x*y+(1-x)*(1-y)
> u=seq(0,1,length=21)
> v=seq(0,1,length=11)
> f=outer(u,v,f)
> persp(u,v,f,theta=angle,phi=10,box=TRUE, +...
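For what it is worth, base R can return the contour coordinates directly, with no plotting at all (a sketch on the same toy surface): `contourLines()` gives, for each requested level, the list of curves as coordinate vectors, which is precisely the numerical "equation" of the isodensity curve.

```r
f <- function(x, y) x * y + (1 - x) * (1 - y)
u <- seq(0, 1, length = 101)
M <- outer(u, u, f)

# extract the coordinates of the level-0.4 contour (no plot needed)
cl <- contourLines(u, u, M, levels = 0.4)
# each element of cl is one curve, with $x and $y coordinate vectors
length(cl)
```

Those coordinate vectors can then be closed into polygons and exported as shapefiles.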

Extracting datasets from excel files in a zipped folder

October 30, 2014

The title of the post is a bit long, but that’s the problem I was facing this morning: importing datasets from files, online. I mean, it was not a “problem” (since I can always download, and manually extract, the files), more a challenge (I should be able to do it in R, directly). The files are located on ressources-actuarielles.net, in a...
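The pipeline can be sketched as follows (the URL is a placeholder, and `readxl::read_excel` is one possible reader, not necessarily the one used in the post):

```r
url <- "https://example.com/folder.zip"   # hypothetical placeholder URL
tmp <- tempfile(fileext = ".zip")
# download.file(url, tmp, mode = "wb")    # uncomment with a real address
# files <- unzip(tmp, exdir = tempdir())  # extracts, and returns the file paths

# keep only the Excel files among the extracted ones
files <- c("table1.xls", "notes.txt", "table2.xlsx")  # e.g. what unzip() may return
xls   <- grep("\\.xlsx?$", files, value = TRUE)
# lapply(xls, readxl::read_excel)         # requires the readxl package
```

Downloading with `mode = "wb"` matters on Windows, since a zip archive is a binary file.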

Kernel Density Estimation with Ripley’s Circumferential Correction

October 21, 2014

The revised version of the paper Kernel Density Estimation with Ripley’s Circumferential Correction, with Ewen Gallic, is now online, on hal.archives-ouvertes.fr/. In this paper, we investigate (and extend) Ripley’s circumference method to correct the bias of density estimation near the edges (or frontiers) of regions. The idea of the method was theoretical and difficult to implement. We provide a simple technique —...
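The geometric idea can be sketched in a few lines (my own toy version on the unit square, not the paper's implementation): weight each kernel contribution by the inverse of the fraction of the circle, centered at the evaluation point with radius equal to the distance to the observation, that lies inside the region.

```r
# fraction of the circle of center (cx, cy) and radius r inside the unit square
inside <- function(cx, cy, r, K = 360) {
  ang <- seq(0, 2 * pi, length.out = K)
  px  <- cx + r * cos(ang)
  py  <- cy + r * sin(ang)
  mean(px >= 0 & px <= 1 & py >= 0 & py <= 1)
}

# Ripley-corrected Gaussian kernel density estimate at point (x, y),
# for observations (X, Y) inside the unit square, bandwidth h
fhat <- function(x, y, X, Y, h) {
  d <- sqrt((x - X)^2 + (y - Y)^2)
  w <- sapply(d, function(r) inside(x, y, r))   # circumferential correction
  mean(exp(-d^2 / (2 * h^2)) / (2 * pi * h^2) / w)
}
```

Near a corner, roughly three quarters of each circle falls outside the square, so the uncorrected estimate is deflated by a factor close to four; dividing by the inside fraction compensates for that.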

Removing Uncited References in a Tex File (with R)

October 18, 2014

Last week, with @3wen, we were working on the revised version of our work on smoothing densities of spatial processes (with edge correction). Usually, once you have revised the paper, some references have been added, others dropped. But you need to spend some time to check that all references are actually mentioned in the paper. For instance, consider the...
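A minimal sketch of the idea (my own toy example, not the actual script from the post): compare the keys used in `\cite{}` commands with the keys declared by `\bibitem{}`, and flag the difference.

```r
# a toy .tex file, as a character vector of lines
tex <- c("See \\cite{lee1992} and \\cite{ripley1977,silverman1986}.",
         "\\begin{thebibliography}{9}",
         "\\bibitem{lee1992} Lee and Carter (1992).",
         "\\bibitem{ripley1977} Ripley (1977).",
         "\\bibitem{silverman1986} Silverman (1986).",
         "\\bibitem{unused2000} Never cited (2000).",
         "\\end{thebibliography}")

# keys used in \cite{...} (possibly comma-separated)
m     <- regmatches(tex, gregexpr("\\\\cite\\{[^}]*\\}", tex))
cited <- trimws(unlist(strsplit(gsub("\\\\cite\\{|\\}", "", unlist(m)), ",")))
# keys declared by \bibitem{...}
refs  <- gsub("^\\\\bibitem\\{([^}]*)\\}.*", "\\1", grep("\\\\bibitem", tex, value = TRUE))

setdiff(refs, cited)  # the uncited entries, candidates for removal
```

A real script would read the file with `readLines()`, and would also handle `\citep`/`\citet` variants if the paper uses natbib.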

What happens if we forget a trivial assumption?

October 4, 2014

Last week, @dmonniaux published an interesting post entitled l’erreur n’a rien d’original (the error is nothing original) on his blog. He was asking the following question: let $a$, $b$ and $c$ denote three real-valued coefficients; under which assumption on those three coefficients does $ax^2+bx+c=0$ have a real-valued root? Everyone answered $b^2-4ac\geq 0$, but no one mentioned that it is necessary to have a proper quadratic equation,...
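The point is easy to check numerically; here is a small sketch (my own, using the usual discriminant condition) that handles the degenerate case $a=0$ explicitly:

```r
has_real_root <- function(a, b, c) {
  if (a == 0) return(b != 0 || c == 0)  # degenerate case: bx + c = 0
  b^2 - 4 * a * c >= 0                  # proper quadratic: discriminant condition
}

has_real_root(1, 1, 1)  # FALSE: negative discriminant
has_real_root(0, 1, 1)  # TRUE : a = 0, but the linear equation x + 1 = 0 has a root
has_real_root(0, 0, 1)  # FALSE: the "equation" 1 = 0 has no root at all
```

The discriminant test alone would wrongly return TRUE for the last two cases, since $b^2 - 4ac = b^2 \geq 0$ whenever $a = 0$.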

Cross Validation for Kernel Density Estimation

October 1, 2014

In a post published in July, I mentioned the so-called Goldilocks principle, in the context of kernel density estimation, and bandwidth selection. The bandwidth should not be too small (the variance would be too large) and it should not be too large (the bias would be too large). Another standard method to select the bandwidth, as mentioned...
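For reference, base R already implements least-squares (unbiased) cross-validation, which selects the bandwidth minimizing an estimate of the integrated squared error; a quick sketch comparing it with Silverman's rule of thumb:

```r
set.seed(1)
x <- rnorm(200)

h.ucv <- bw.ucv(x)   # unbiased (least-squares) cross-validation bandwidth
h.rot <- bw.nrd0(x)  # Silverman's rule of thumb, for comparison
c(ucv = h.ucv, rule.of.thumb = h.rot)
```

Either value can be passed directly to `density(x, bw = ...)`.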

Generating Hurricanes with a Markov Spatial Process

September 30, 2014

The National Hurricane Center (NHC) collects datasets with all storms in the North Atlantic, the North Atlantic Hurricane Database (HURDAT). For all storms, we have the location of the storm, every six hours (at midnight, six a.m., noon and six p.m.). Note that we also have the date, the maximal wind speed – on a 6 hour window – and...