Articles by arthur charpentier

Game of Friendship Paradox

June 27, 2018 | arthur charpentier

In the introduction of my course next week, I will (briefly) mention networks, and I wanted to provide some illustration of the Friendship Paradox. In “Network of Thrones” (discussed in Beveridge and Shan (2016)), there is a dataset with the network of characters in Game of Thrones. The word “friend” might ...
[Read more...]
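The paradox itself, that on average your friends have more friends than you do, can be checked on any small graph. A minimal sketch in Python (the post itself works in R; the toy adjacency list below is made up):

```python
# Friendship paradox on a toy undirected graph (adjacency list):
# on average, your friends have more friends than you do.
graph = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B", "D"],
    "D": ["B", "C"],
}

deg = {v: len(nb) for v, nb in graph.items()}

# Mean degree of a node picked uniformly at random.
mean_degree = sum(deg.values()) / len(deg)

# Mean degree of a friend: average the degree over every (node, friend) pair.
friend_degrees = [deg[f] for v in graph for f in graph[v]]
mean_friend_degree = sum(friend_degrees) / len(friend_degrees)

print(mean_degree, mean_friend_degree)  # 2.0 vs 2.25: friends have more friends
```

The gap comes from sampling friends proportionally to their degree: popular nodes are over-represented among "friends".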

Linear Regression, with Map-Reduce

June 18, 2018 | arthur charpentier

Sometimes, with big data, matrices are too big to handle, and it is possible to use tricks to still do the computations numerically. Map-Reduce is one of those. With several cores, it is possible to split the problem, to map on each machine, and then to aggregate it back at ... [Read more...]
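The trick behind map-reduce least squares is that both X′X and X′y are sums over rows, so each chunk can be handled separately. A minimal sketch in Python with NumPy (the original post uses R; the data and the number of chunks below are arbitrary):

```python
import numpy as np

# Map-reduce OLS: beta = (X'X)^{-1} X'y, and both X'X and X'y are sums over
# rows. Each chunk (each "machine") computes its own cross-products (map);
# only the small p x p and p x 1 pieces are summed and solved (reduce).
rng = np.random.default_rng(0)
n, p = 1000, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta_true = np.array([1.0, 2.0, -1.0])
y = X @ beta_true + rng.normal(size=n)

# Map: per-chunk cross-products.
chunks = np.array_split(np.arange(n), 4)
xtx_parts = [X[idx].T @ X[idx] for idx in chunks]
xty_parts = [X[idx].T @ y[idx] for idx in chunks]

# Reduce: sum the partial results and solve the small p x p system.
beta_mr = np.linalg.solve(sum(xtx_parts), sum(xty_parts))
beta_full = np.linalg.solve(X.T @ X, X.T @ y)  # same answer, computed at once
```

Only the p × p and p × 1 summaries ever leave each chunk, which is the whole point when n is huge and p is small.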

Quantile Regression (home made)

June 14, 2018 | arthur charpentier

After my series of posts on classification algorithms, it’s time to get back to R codes, this time for quantile regression. Yes, I still want to get a better understanding of optimization routines in R. Before looking at quantile regression, let us compute the median, or the quantile, ...
[Read more...]
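The starting point alluded to is that the tau-quantile minimizes the so-called pinball (check) loss, with the median as the special case tau = 0.5. A minimal sketch in Python (the post works in R; the sample and the grid-scan optimizer below are illustrative choices):

```python
import numpy as np

# The tau-quantile of a sample minimizes the "pinball" (check) loss
#   rho_tau(u) = u * (tau - 1{u < 0}),
# and tau = 0.5 gives (half) the absolute loss, minimized by the median.
def pinball_loss(m, y, tau):
    u = y - m
    return np.sum(u * (tau - (u < 0)))

rng = np.random.default_rng(1)
y = rng.exponential(size=501)

# Crude but transparent optimizer: scan a fine grid of candidate values.
grid = np.linspace(y.min(), y.max(), 20001)
m_hat = grid[np.argmin([pinball_loss(m, y, 0.5) for m in grid])]
# m_hat matches the empirical median np.median(y), up to grid resolution
```

Replacing tau = 0.5 by any other value in (0, 1) recovers the corresponding empirical quantile, which is exactly what quantile regression generalizes to conditional models.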

Discrete or continuous modeling?

June 13, 2018 | arthur charpentier

Tuesday, we got our conference “Insurance, Actuarial Science, Data & Models” and Dylan Possamaï gave a very interesting concluding talk. In the introduction, he came back briefly on a nice discussion we usually have in economics on the kind of model we should consider. It was about optimal control. In many ...
[Read more...]

Classification from scratch, boosting 11/8

June 8, 2018 | arthur charpentier

Eleventh post of our series on classification from scratch. Today, that should be the last one… unless I forgot something important. So today, we discuss boosting. An econometrician’s perspective I might start with a non-conventional introduction. But that’s actually how I understood what boosting was about. And I am ...
[Read more...]
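One standard way to present boosting, close to the residual-fitting view an econometrician would recognize, is: repeatedly fit a weak learner to the current residuals and add a shrunken copy of it to the model. A minimal sketch in Python (the post uses R; the one-split stump and all tuning values below are illustrative):

```python
import numpy as np

# Boosting as iterative residual fitting: start from the mean, fit a weak
# learner to the current residuals, add a shrunken copy (learning rate nu),
# repeat. Weak learner here: a one-split regression stump.
def fit_stump(x, r):
    best = None
    for s in np.unique(x)[:-1]:            # candidate split points
        left, right = r[x <= s], r[x > s]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, s, left.mean(), right.mean())
    _, s, ml, mr = best
    return lambda z, s=s, ml=ml, mr=mr: np.where(z <= s, ml, mr)

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 200)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.1, size=200)

nu, pred = 0.1, np.full_like(y, y.mean())
for _ in range(200):
    stump = fit_stump(x, y - pred)         # fit the current residuals
    pred = pred + nu * stump(x)            # shrunken update
mse = np.mean((y - pred) ** 2)             # well below the initial variance
```

Each round strictly reduces the training error as long as some stump correlates with the residuals; the small learning rate nu is what makes the procedure slow and stable.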

Classification from scratch, bagging and forests 10/8

June 8, 2018 | arthur charpentier

Tenth post of our series on classification from scratch. Today, we’ll see the heuristics of the algorithm inside bagging techniques. Often, bagging is associated with trees, to generate forests. But actually, it is possible to use bagging with any kind of model. Recall that bagging means “bootstrap aggregation”. So, consider ...
[Read more...]
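Since bagging is just bootstrap aggregation, the recipe works with any base model: resample the data with replacement, refit, and average the predictions. A minimal sketch in Python (the post uses R; the cubic base learner and the data below are arbitrary illustrations):

```python
import numpy as np

# Bagging, i.e. "bootstrap aggregation": draw B bootstrap samples, refit the
# same base model on each, and average the B predictions. The base model is
# arbitrary; here, a degree-3 polynomial regression.
rng = np.random.default_rng(5)
x = np.sort(rng.uniform(0, 1, 100))
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=100)

def fit_predict(xs, ys, xnew):
    coefs = np.polyfit(xs, ys, deg=3)      # base learner: cubic regression
    return np.polyval(coefs, xnew)

B = 100
preds = []
for _ in range(B):
    idx = rng.integers(0, len(x), len(x))  # resample rows with replacement
    preds.append(fit_predict(x[idx], y[idx], x))
bagged = np.mean(preds, axis=0)            # aggregate: average the predictions
```

Averaging over bootstrap refits mainly reduces the variance of the base learner, which is why bagging pays off most with unstable models such as trees.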

Classification from scratch, SVM 7/8

June 6, 2018 | arthur charpentier

Seventh post of our series on classification from scratch. The latest one was on neural nets, and today we will discuss SVM, support vector machines. A formal introduction Here y takes values in {−1,+1}. Our model will be m(x) = sign[ω′x + b]. Thus, the space is divided by a (linear) border Δ = {x : ω′x + b = 0}. The distance from point ...
[Read more...]
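A minimal from-scratch illustration of the linear SVM idea is subgradient descent on the regularized hinge loss (the post develops the formal theory in R; the data, labels and tuning values below are made up):

```python
import numpy as np

# Linear SVM sketch: subgradient descent on the regularized hinge loss
#   (1/n) sum_i max(0, 1 - y_i (w'x_i + b)) + lam * ||w||^2,
# with labels y_i in {-1, +1}.
rng = np.random.default_rng(6)
n = 200
X = rng.normal(size=(n, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)  # linearly separable toy labels

w, b, lam, lr = np.zeros(2), 0.0, 0.01, 0.1
for _ in range(500):
    margin = y * (X @ w + b)
    mask = margin < 1                       # points violating the margin
    grad_w = -(y[mask][:, None] * X[mask]).sum(axis=0) / n + 2 * lam * w
    grad_b = -y[mask].sum() / n
    w, b = w - lr * grad_w, b - lr * grad_b

acc = np.mean(np.sign(X @ w + b) == y)      # near-perfect on separable data
```

Only margin-violating points contribute to the subgradient, which is the optimization-side reflection of the fact that the solution depends only on the support vectors.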

Classification from scratch, neural nets 6/8

June 5, 2018 | arthur charpentier

Sixth post of our series on classification from scratch. The latest one was on the lasso regression, which was still based on a logistic regression model, assuming that the variable of interest has a Bernoulli distribution. From now on, we will discuss techniques that did not originate from those probabilistic ...
[Read more...]

Classification from scratch, logistic with kernels 3/8

May 31, 2018 | arthur charpentier

Third post of our series on classification from scratch, following the previous post introducing smoothing techniques with (b)-splines. Consider here kernel-based techniques. Note that here, we do not use the “logistic” model… it is purely non-parametric. Kernel-based estimation, from scratch I like kernels because they are somehow ...
[Read more...]
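The non-parametric estimate in question can be illustrated with a Nadaraya–Watson-style weighted average: weight each observation by a kernel centred at the evaluation point. A minimal sketch in Python (the post uses R; the Gaussian kernel and bandwidth below are illustrative choices):

```python
import numpy as np

# Kernel-based estimate of p(x) = P(Y = 1 | X = x): a locally weighted
# average of the y_i, with Gaussian kernel weights. No logistic model,
# purely non-parametric.
def kernel_prob(x0, x, y, h=0.5):
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)  # Gaussian kernel, bandwidth h
    return np.sum(w * y) / np.sum(w)

rng = np.random.default_rng(3)
x = rng.uniform(-3, 3, 500)
p_true = 1 / (1 + np.exp(-2 * x))           # data-generating probabilities
y = (rng.uniform(size=500) < p_true).astype(float)

p_left = kernel_prob(-2.0, x, y)            # small, since p_true(-2) is small
p_right = kernel_prob(+2.0, x, y)           # large, since p_true(+2) is large
```

The bandwidth h plays the role the smoothing parameter plays for splines: small h tracks the data closely, large h flattens the estimate.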

Classification from scratch, trees 9/8

May 30, 2018 | arthur charpentier

Ninth post of our series on classification from scratch. Today, we’ll see the heuristics of the algorithm inside classification trees. And yes, I promised eight posts in that series, but clearly, that was not sufficient… sorry for the poor prediction. Decision Tree Decision trees are easy to read. So ...
[Read more...]

Classification from scratch, logistic with splines 2/8

May 30, 2018 | arthur charpentier

Today, second post of our series on classification from scratch, following the brief introduction on the logistic regression. Piecewise linear splines To illustrate what’s going on, let us start with a “simple” regression (with only one explanatory variable). The underlying idea is natura non facit saltus, for “nature does ...
[Read more...]
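Piecewise linear splines can be fitted by ordinary least squares once hinge terms (x − k)₊ are added at the knots, so the fitted curve bends without jumping, in the spirit of natura non facit saltus. A minimal sketch in Python (the post works in R; the kinked data and the single knot below are made up):

```python
import numpy as np

# Piecewise linear splines by least squares: regress y on 1, x and the
# hinge terms (x - k)_+; the fit is continuous and changes slope at each knot.
def spline_basis(x, knots):
    cols = [np.ones_like(x), x] + [np.maximum(x - k, 0.0) for k in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(4)
x = np.sort(rng.uniform(0, 10, 300))
y = np.where(x < 5, x, 10 - x) + rng.normal(scale=0.2, size=300)  # kink at 5

B = spline_basis(x, knots=[5.0])
beta, *_ = np.linalg.lstsq(B, y, rcond=None)
# beta is close to [0, 1, -2]: slope 1 before the knot, 1 - 2 = -1 after
```

Each hinge coefficient is directly interpretable as a change of slope at its knot, which is what makes this basis convenient for illustration.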

Classification from scratch, overview 0/8

May 29, 2018 | arthur charpentier

Before my course on “big data and economics” at the University of Barcelona in July, I wanted to upload a series of posts on classification techniques, to get some insight into machine learning tools. According to a common idea, machine learning algorithms are black boxes. I wanted to get back ...
[Read more...]
