Blog Archives

Conditional densities, on one single graph

December 5, 2013

With Stéphane Tufféry we’ve been working on credit scoring, and we’ve been using the popular German credit dataset,

> myVariableNames <- c("checking_status","duration","credit_history",
+ "purpose","credit_amount","savings","employment","installment_rate",
+ "personal_status","other_parties","residence_since","property_magnitude",
+ "age","other_payment_plans","housing","existing_credits","job",
+ "num_dependents","telephone","foreign_worker","class")
> credit = read.table(
+ "http://archive.ics.uci.edu/ml/machine-learning-databases/statlog/german/german.data",
+ header=FALSE,col.names=myVariableNames)
> credit$class <- credit$class-1

We wanted to get some nice code to produce a graph like the one below. Yesterday, Stéphane...

Binomial regression model

November 18, 2013

Most of the time, when we introduce binomial models, such as the logistic or probit models, we discuss only Bernoulli variables. This year (actually also the year before), I discuss extensions to multinomial regressions, where  is a function on some simplex. The multinomial logistic model was mentioned here. The idea is to consider, for instance with three possible classes, the following...

Maximum Likelihood versus Goodness of Fit

November 8, 2013

On Thursday, I got an interesting question from a colleague of mine (JP). I mean, the way I understood the question turned out to be a nice puzzle (but I have to confess I might have misunderstood). The question is the following: consider an i.i.d. sample of continuous variables. We would like to choose between two (parametric) families for...

Generating functions

November 8, 2013

Today, I wanted to publish a post on generating functions, based on discussions I have had with Jean-Francois over coffee after lunch a couple of times already. The other reason is that I am publishing this post just as my students have finished their Probability exam (and there were a few questions on generating functions). A short introduction (back on...

Smoothing mortality rates

November 4, 2013

This morning, I was working with Julie, a student of mine from Rennes, on mortality tables. Actually, we work on genealogical datasets from a small region in Québec, and we can observe a lot of volatility. If I borrow one of her graphs, we get something like this. Since we have some missing data, we wanted to use some...

Halloween and candies (a ballot problem)

October 30, 2013

This year, for Halloween, a post on candies (I promise, next year I will write another post on zombies). But I don’t want to focus on the kids’ problems (last year, we tried to minimize their walking distance to collect as many candies as possible, with part 1 and part 2); I want to discuss my own problems. Because usually, the kids wear...

More significant? so what…

October 30, 2013

Following my non-life insurance class this morning, I had an interesting question from a student, which I will try to illustrate and reformulate as accurately as possible. Consider a simple regression model, with one variable of interest, and one possible explanatory variable. Assume that we have two possible models, with the following output (yes, I do hide interesting parts...

Pricing Reinsurance Contracts

October 24, 2013

In order to illustrate the next section of the non-life insurance course, consider the following example, inspired by http://sciencepolicy.colorado.edu/…. This is the so-called “Normalized Hurricane Damages in the United States” dataset, for the period 1900-2005, from Pielke et al. (2008). The dataset is available in xls format, so we have to spend some time importing it,

> library(gdata)
>...

GLM, non-linearity and heteroscedasticity

October 22, 2013

Last week in the non-life insurance course, we saw the theory of Generalized Linear Models, emphasizing the two important components: the link function (which is actually the key component in predictive modeling), and the distribution, or the variance function. Just to illustrate, consider my favorite dataset,

lin.mod = lm(dist~speed,data=cars)

A linear model means here where the residuals are assumed to be...

Equidistant points on a map

October 17, 2013

This morning, I had a comment on a recent post, regarding a graph I uploaded to the blog, which was extracted from a paper now online (see http://hal.archives-ouvertes.fr/hal-00871883). Jo (from KUL, I guess I can share that piece of information) asked me: “I was wondering whether you would want to share the R code for plotting figures 1...
