Articles by arthur charpentier

Sharing pictures from holidays in the Canadian Rockies (with R)

August 9, 2020 | arthur charpentier

My kids have a very popular blog (at least among their grandmothers) where they frequently post pictures from everyday’s life (since they live 5000km from them), as well as pictures taken from holidays. This afternoon, I tried to used the popupImage function from the leaflet package to post pictures, ... [Read more...]

Regression discontinuity model for TV series

July 12, 2020 | arthur charpentier

In September, we are usually happy to see our favorite TV series back on air… Or not? Because admit it, if we are happy to see those characters back, most of the time, we are disappointed. So why not look at the data, to confirm this feeling? Nazareno Andrade shared ...

Testing for Covid-19 in the U.S.

April 28, 2020 | arthur charpentier

For almost a month, on a daily basis, we are working with colleagues (Romuald, Chi and Mathieu) on modeling the dynamics of the recent pandemic. I learn of lot of things discussing with them, but we keep struggling with the tests. Paul, in Montréal, helped me a little bit, ...

On the “correlation” between a continuous and a categorical variable

April 4, 2020 | arthur charpentier

Let us get back on the Titanic dataset, loc_fichier = "http://freakonometrics.free.fr/titanic.RData" download.file(loc_fichier, "titanic.RData") load("titanic.RData") base = base[!is.na(base$Age),] On consider two variables, the age (the continuous one) and the survivor indicator (the qualitative one) X = base$Age ...

Modeling Pandemics (3)

March 20, 2020 | arthur charpentier

In Statistical Inference in a Stochastic Epidemic SEIR Model with Control Intervention, a more complex model than the one we’ve seen yesterday was considered (and is called the SEIR model). Consider a population of size , and assume that is the number of susceptible, the number of exposed, the number ...

Modeling pandemics (2)

March 20, 2020 | arthur charpentier

When introducing the SIR model, in our initial post, we got an ordinary differential equation, but we did not really discuss stability, and periodicity. It has to do with the Jacobian matrix of the system. But first of all, we had three equations for three function, but actuallyso it means ...

Modeling pandemics (1)

March 19, 2020 | arthur charpentier

The most popular model to model epidemics is the so-called SIR model – or Kermack-McKendrick. Consider a population of size , and assume that is the number of susceptible, the number of infectious, and for the number recovered (or immune) individuals, so that which implies that . In order to be more realistic, ...

Function basis and regression

March 1, 2020 | arthur charpentier

In the first part of the course on linear models, we’ve seen how to construct a linear model when the vector of covariates is given, so that is either simply (for standard linear models) or a functional of (in GLMs). But more generally, we can consider transformations of the ...

Testing for a causal effect (with 2 time series)

February 19, 2020 | arthur charpentier

A few days ago, I came back on a sentence I found (in a French newspaper), where someone was claiming that “… an old variable explains 85% of the change in a new variable. So we can talk about causality” and I tried to explain that it was just stupid : if we ...

Lasso Regression (home made)

February 17, 2020 | arthur charpentier

To compute Lasso regression, define the soft-thresholding functionThe R function would be soft_thresholding = function(x,a){ sign(x) * pmax(abs(x)-a,0) } To solve our optimization problem, set so that the optimization problem can be written, equivalently hence and one gets or, if we develop Again, if there are ...

Quantile Regression (home made, part 2)

February 17, 2020 | arthur charpentier

A few months ago, I posted a note with some home made codes for quantile regression… there was something odd on the output, but it was because there was a (small) mathematical problem in my equation. So since I should teach those tomorrow, let me fix them. Median Consider a ...

On Cochran Theorem (and Orthogonal Projections)

January 15, 2020 | arthur charpentier

Cochran Theorem – from The distribution of quadratic forms in a normal system, with applications to the analysis of covariance published in 1934 – is probably the most import one in a regression course. It is an application of a nice result on quadratic forms of Gaussian vectors. More precisely, we can prove ...

On the conjugate function

January 13, 2020 | arthur charpentier

In the MAT7381 course (graduate course on regression models), we will talk about optimization, and a classical tool is the so-called conjugate. Given a function its conjugate is function such that so, long story short, is the maximum gap between the linear function and . Just to visualize, consider a simple ...

Combining automatically factor levels with trees

October 3, 2019 | arthur charpentier

Last year, in a post, I discussed how to merge levels of factor variables, using combinatorial techniques (it was for my STT5100 cours, and trees are not in the syllabus), with an extension on trees at the end of the post. consider the following (simulated dataset) n=200 set.seed(1) x1=...

On leverage

October 3, 2019 | arthur charpentier

Last week, in our STT5100 (applied linear models) class, I’ve introduce the hat matrix, and the notion of leverage. In a classical regression model, (in a matrix form), the ordinary least square estimator of parameter is The prediction can then be writtenwhere is called the hat matrix. The matrix ...

Insurance data science : Networks

August 15, 2019 | arthur charpentier

At the Summer School of the Swiss Association of Actuaries, in Lausanne, I will start talking about networks and insurance this Friday. Slides are available online

Insurance data science : Text

August 14, 2019 | arthur charpentier

At the Summer School of the Swiss Association of Actuaries, in Lausanne, I will start talking about text based data and NLP this Thursday. Slides are available online Ewen Gallic (AMSE) will present a tutorial on tweets. I can upload a few additiona...

Insurance data science : Pictures

August 13, 2019 | arthur charpentier

At the Summer School of the Swiss Association of Actuaries, in Lausanne, following the part of Jean-Philippe Boucher (UQAM) on telematic data, I will start talking about pictures this Wednesday. Slides are available online Ewen Gallic (AMSE) will present a tutorial on satellite pictures, and a simple classification problem, related ...

Insurance data science : use and value of unusual data #1

August 5, 2019 | arthur charpentier

Next week, with , I will be at the Summer School of the Swiss Association of Actuaries, in Lausanne, with Jean-Philippe Boucher (UQAM) and Ewen Gallic (AMSE). There will be some hands-on applications, on R. I will share some codes in the slides.

Optimal transport on large networks

July 4, 2019 | arthur charpentier

With Alfred Galichon and Lucas Vernet, we recently uploaded a paper entitled optimal transport on large networks on arxiv. This article presents a set of tools for the modeling of a spatial allocation problem in a large geographic market and gives examples of applications. In our settings, the market is ...

« 1 2 3 4 … 19 »

Copyright © 2025 | MH Corporate basic by MH Themes