Blog Archives

What is the interpretation of the diagonal for a ROC curve

March 25, 2019

Last Friday, we discussed the use of ROC curves to describe the goodness of a classifier. I said that I would post a brief paragraph on the interpretation of the diagonal. If you look around, some say that it describes the “strategy of randomly guessing a class”, that it is obtained with “a diagnostic test that is no...
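To make the first quoted reading concrete, here is a minimal sketch in base R. It is not taken from the post: the sample size, the 30% share of positives and the uniform scores are all assumptions. Scores are drawn independently of the true classes, and the empirical ROC curve is then compared with the diagonal. It only illustrates the "random guessing" reading and does not settle which interpretation is the right one.

```r
set.seed(1)
n <- 10000                             # assumed sample size
y <- rbinom(n, size = 1, prob = 0.3)   # true classes (assumed 30% positives)
s <- runif(n)                          # scores drawn independently of y

# Empirical ROC: for each threshold t, predict class 1 when s > t
thresholds <- seq(0, 1, by = 0.01)
tpr <- sapply(thresholds, function(t) mean(s[y == 1] > t))
fpr <- sapply(thresholds, function(t) mean(s[y == 0] > t))

# Largest vertical gap between the empirical ROC curve and the diagonal
max_gap <- max(abs(tpr - fpr))

# AUC from the rank (Mann-Whitney) statistic: the proportion of
# (positive, negative) pairs in which the positive has the higher score
r  <- rank(s)
n1 <- sum(y == 1)
n0 <- sum(y == 0)
auc <- (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# plot(fpr, tpr, type = "l"); abline(0, 1, lty = 2)  # optional visual check
```

With scores that carry no information about the class, `tpr` and `fpr` should stay close to each other at every threshold, and `auc` should be close to 0.5.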

On the poor performance of classifiers in insurance models

March 13, 2019

Each time we have a case study in my actuarial courses (with real data), students are surprised to have a hard time getting a “good” model, and they are always surprised to have a low AUC when trying to model the probability of claiming a loss, of dying, of committing fraud, etc. And each time, I keep saying, “yes, I know,...

Random thoughts on econometric models with (pure) random features

February 16, 2019

For my lectures on applied linear models, I wanted to illustrate the fact that the $R^2$ is never a good measure of the goodness of the model, since it’s quite easy to improve it. Consider the following dataset

    n=100
    df=data.frame(matrix(rnorm(n*n),n,n))
    names(df)=c("Y",paste("X",1:99,sep=""))

with one variable of interest $Y$, and 99 features $X_j$. All of them are (by construction) independent. And we...
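As a rough illustration of why this measure is easy to improve, the sketch below rebuilds the same kind of dataset and regresses Y on the first k pure-noise features. The seed, the helper `r2()` and the grid `ks` are assumptions added here for illustration and are not from the post.

```r
set.seed(1)
n <- 100
df <- data.frame(matrix(rnorm(n * n), n, n))
names(df) <- c("Y", paste("X", 1:99, sep = ""))

# Hypothetical helper: R-squared of the regression of Y on X1, ..., Xk
r2 <- function(k) {
  fit <- lm(Y ~ ., data = df[, 1:(k + 1)])
  summary(fit)$r.squared
}

ks <- c(1, 10, 25, 50, 75, 90)   # assumed grid of model sizes
r2_values <- sapply(ks, r2)
names(r2_values) <- ks
```

Because the models are nested, `r2_values` cannot decrease as k grows, even though no feature has any link with Y.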

NSERC – Discovery Grants Program, over the past 5 years

February 7, 2019

In a previous post, I discussed how to scrape the NSERC website to get statistics about Discovery Grants. Since we just got the new 2018 figures, I thought it would be a good opportunity to update my graphs,

    library(XML)
    library(stringr)
    url="http://www.nserc-crsng.gc.ca/NSERC-CRSNG/FundingDecisions-DecisionsFinancement/ResearchGrants-SubventionsDeRecherche/ResultsGSC-ResultatsCSS_eng.asp"
    download.file(url,destfile = "GSC.html")
    tables=readHTMLTable("GSC.html")
    # the list index was lost in extraction; [[1]] is an assumption
    GSC=tables[[1]]$V1
    GSC=as.character(GSC)
    namesGSC=tables[[1]]$V2
    namesGSC=as.character(namesGSC)
    Correction = function(x) as.numeric(gsub('', '', x))
    ...

The “probability to win” is hard to estimate…

November 6, 2018

Real-time computation (or estimation) of the “probability to win” is difficult. We’ve seen that in soccer games, in elections… but actually, as a professor, I see it frequently when I grade my students. Consider a classical multiple choice exam. After each question, imagine that you try to compute the probability that the student will pass. Consider here the case...
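A minimal sketch of such a running computation, under strong assumptions that are not in the excerpt: a hypothetical exam of 20 questions, a pass mark of 10 correct answers, and a student who answers each question correctly, independently, with a known probability of 0.5. In practice that probability is unknown, which is presumably part of why the estimation is hard.

```r
set.seed(1)
n_questions <- 20    # hypothetical exam length
pass_mark   <- 10    # hypothetical number of correct answers needed to pass
p           <- 0.5   # assumed known probability of a correct answer

answers        <- rbinom(n_questions, size = 1, prob = p)  # one simulated student
correct_so_far <- cumsum(answers)

# After question k, the student still needs pass_mark - correct_so_far[k]
# correct answers among the n_questions - k remaining questions.
# P(X >= needed) = P(X > needed - 1) for X ~ Binomial(n_questions - k, p)
k <- 1:n_questions
prob_pass <- pbinom(pass_mark - correct_so_far - 1,
                    size = n_questions - k, prob = p,
                    lower.tail = FALSE)

# plot(k, prob_pass, type = "b")  # optional: path of the running probability
```

The path of `prob_pass` moves after every answer and only settles at 0 or 1 once the outcome is decided.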

Solving the Chinese postman problem

October 19, 2018

A pre-Halloween post today. It actually started while I was in Barcelona: the kids wanted to go back to a store we had seen on the first day, in the Gothic part, and I could not remember where it was. And I said to myself that it would take quite long to walk all the streets of the neighborhood. And I discovered...

Monte Carlo techniques to create counterfactuals

October 11, 2018

In the previous STT5100 course, last week, we’ve seen how to use Monte Carlo simulations. The idea is that we do observe in statistics a sample $\{y_1,\dots,y_n\}$, and more generally, in econometrics $\{(y_1,\mathbf{x}_1),\dots,(y_n,\mathbf{x}_n)\}$. But let’s get back to statistics (without covariates) to illustrate. We assume that the observations $y_i$ are realizations of an underlying random variable $Y$. We assume that the $Y_i$ are...
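One way to read "counterfactuals" here is: other samples that could have been observed if the data came from the same underlying random variable. The sketch below assumes a Gaussian model fitted to a stand-in sample; the distribution, the sample size and the number of replications are all assumptions, not the post's choices.

```r
set.seed(1)
y <- rnorm(50, mean = 2, sd = 1)   # stand-in for the observed sample
n <- length(y)

# Counterfactual samples: draw new samples of size n from the fitted
# Gaussian model and record the mean of each one
sims <- replicate(5000, mean(rnorm(n, mean = mean(y), sd = sd(y))))

se_mc     <- sd(sims)              # Monte Carlo standard error of the mean
se_theory <- sd(y) / sqrt(n)       # textbook value, for comparison
```

Comparing `se_mc` with `se_theory` is a quick sanity check that the simulated samples reproduce the variability expected under the assumed model.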

October, grant proposal season

October 9, 2018

In 2012, Danielle Herbert, Adrian Barnett, Philip Clarke and Nicholas Graves published an article entitled “On the time spent preparing grant proposals: an observational study of Australian researchers”, whose conclusions were reported in Nature under a more explicit title, “Australia’s grant system wastes time”! In this study, they included 3700 grant applications sent to the National Health...

Combining automatically factor levels in R

October 6, 2018

Each time we face real applications in an applied econometrics course, we have to deal with categorical variables. And the same question arises from students: how can we automatically combine factor levels? Is there a simple R function? I did upload a few blog posts over the past years. But so far, nothing satisfying. Let me...

Convex Regression Model

July 5, 2018

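This morning during the lecture on nonlinear regression, I mentioned (very) briefly the case of convex regression. Since I forgot to mention the R code, I will publish it here. Assume that $y_i=m(\mathbf{x}_i)+\varepsilon_i$ where $m:\mathbb{R}^d\rightarrow\mathbb{R}$ is some convex function. Then $m$ is convex if and only if $\forall \mathbf{x}_1,\mathbf{x}_2\in\mathbb{R}^d$, $\forall t\in[0,1]$, $m(t\mathbf{x}_1+[1-t]\mathbf{x}_2)\leq t\,m(\mathbf{x}_1)+[1-t]\,m(\mathbf{x}_2)$. Hildreth (1954) proved that if $m^\star=\underset{m\text{ convex}}{\text{argmin}}\left\lbrace\sum_{i=1}^n\big(y_i-m(\mathbf{x}_i)\big)^2\right\rbrace$ then $\boldsymbol{\theta}^\star=(m^\star(\mathbf{x}_1),\dots,m^\star(\mathbf{x}_n))$ is unique. Let $\mathbf{y}=\boldsymbol{\theta}+\boldsymbol{\varepsilon}$, then $\boldsymbol{\theta}^\star=\underset{\boldsymbol{\theta}\in\mathcal{K}}{\text{argmin}}\left\lbrace\sum_{i=1}^n(y_i-\theta_i)^2\right\rbrace$ where $\mathcal{K}=\lbrace\boldsymbol{\theta}\in\mathbb{R}^n:\exists m\text{ convex},\ m(\mathbf{x}_i)=\theta_i\rbrace$. I.e....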
