R and Python: Gradient Descent
[This article was first published on Analysis with Programming, and kindly contributed to R-bloggers.]
One of the problems often dealt with in statistics is the minimization of an objective function. In contrast to linear models, there is no analytical solution for models that are nonlinear in the parameters, such as logistic regression, neural networks, and nonlinear regression models (like the Michaelis-Menten model). In these situations we have to resort to mathematical programming, or optimization; one popular optimization algorithm is gradient descent, which we illustrate here.

To start with, let's consider a simple function with a closed-form solution, given by

\begin{equation}
f(\beta) \triangleq \beta^4 - 3\beta^3 + 2.
\end{equation}

We want to minimize this function with respect to $\beta$. The quick solution, as calculus taught us, is to compute the first derivative of the function, that is

\begin{equation}
\frac{\text{d}f(\beta)}{\text{d}\beta} = 4\beta^3 - 9\beta^2.
\end{equation}

Setting this to 0 to obtain the stationary point gives us

\begin{align}
\frac{\text{d}f(\beta)}{\text{d}\beta} &\overset{\text{set}}{=} 0 \nonumber\\
4\hat{\beta}^3 - 9\hat{\beta}^2 &= 0 \nonumber\\
4\hat{\beta}^3 &= 9\hat{\beta}^2 \nonumber\\
4\hat{\beta} &= 9 \nonumber\\
\hat{\beta} &= \frac{9}{4},
\end{align}

where dividing through by $\hat{\beta}^2$ assumes $\hat{\beta} \neq 0$ ($\hat{\beta} = 0$ is also a stationary point, but not the minimum). The following plot shows the minimum of the function at $\hat{\beta} = \frac{9}{4}$ (red line in the plot below).
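The original post's plotting code did not survive extraction, so here is a minimal R sketch, under assumed plotting choices, that draws $f(\beta)$ and marks the minimizer with a red line:

```r
# f(beta) = beta^4 - 3*beta^3 + 2
f <- function(beta) beta^4 - 3 * beta^3 + 2

# plot the function over a range containing the minimizer
beta <- seq(-1, 3.5, length.out = 500)
plot(beta, f(beta), type = "l",
     xlab = expression(beta), ylab = expression(f(beta)))
abline(v = 9 / 4, col = "red")  # red line at the minimum, beta-hat = 9/4
```

In practice, however, many objective functions have no such closed-form solution, and we must minimize them iteratively. The gradient descent algorithm proceeds as follows: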
- Initialize $\mathbf{x}_{r}, r = 0$
- while $\lVert \mathbf{x}_{r} - \mathbf{x}_{r+1} \rVert > \nu$
- $\quad \mathbf{x}_{r+1} \leftarrow \mathbf{x}_{r} - \gamma \nabla f(\mathbf{x}_r)$
- $\quad r \leftarrow r + 1$
- end while
- return $\mathbf{x}_{r}$ and $r$
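The R and Python code from the original post did not survive extraction, so what follows is a minimal R sketch of the loop above applied to $f(\beta)$. The names `gamma` and `beta_new` match the variables referenced in the text below, but the initial value and the tolerance `nu` are assumptions, so the exact iteration counts reported below need not be reproduced:

```r
# gradient of f(beta) = beta^4 - 3*beta^3 + 2
grad <- function(beta) 4 * beta^3 - 9 * beta^2

gamma    <- 0.01  # step size
nu       <- 1e-6  # convergence tolerance (assumed)
beta_new <- 4     # initial value (assumed)
r        <- 0     # iteration counter

repeat {
  beta_old <- beta_new
  beta_new <- beta_old - gamma * grad(beta_old)  # x_{r+1} <- x_r - gamma * grad f(x_r)
  r        <- r + 1
  if (abs(beta_new - beta_old) <= nu) break      # ||x_r - x_{r+1}|| <= nu
}

beta_new  # estimate of the minimizer, close to 9/4
r         # number of iterations until convergence
```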
With the step size $\gamma$ (`gamma` in the codes above) set to .01, the algorithm converges at the 42nd iteration. To support that claim, see the steps of its gradient in the plot below.

Changing the initial value (`beta_new` in the codes above) to .1, with $\gamma = .01$ the algorithm converges at the 173rd iteration, with estimate $\hat{\beta}_{173} = 2.249962 \approx \frac{9}{4}$ (see the plot below).
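Reusing the definitions from the sketch above, that second run would look something like this (again an assumed reconstruction, not the original code):

```r
# restart the descent from beta_new = .1 with the same step size gamma
beta_new <- 0.1
r        <- 0
repeat {
  beta_old <- beta_new
  beta_new <- beta_old - gamma * grad(beta_old)
  r        <- r + 1
  if (abs(beta_new - beta_old) <= nu) break
}
c(estimate = beta_new, iterations = r)  # the estimate again approaches 9/4
```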