Machine Learning Ex3 – Multivariate Linear Regression
[This article was first published on YGC » R, and kindly contributed to R-bloggers].
Part 1. Finding alpha.
The first task in Exercise 3 is to pick a good learning rate alpha.
This requires making an initial selection, running gradient descent, and observing how the cost function changes from one iteration to the next.
I tested alpha values ranging from 0.01 to 1.
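For reference, the quantities being monitored can be written out explicitly. This is the standard formulation of batch gradient descent for linear regression; the symbols X (the design matrix including the intercept column), m (the number of training examples) and theta (the parameter vector) are notation assumed here, not taken from the original post.

```latex
J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2,
\qquad h_\theta(x) = \theta^{T} x

\theta := \theta - \frac{\alpha}{m} \, X^{T} \left( X\theta - y \right)
```

If alpha is too small, J decreases slowly; if it is too large, J can oscillate or diverge, which is why the cost is tracked per iteration for each candidate value.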
## preparing data input.
x <- read.table("ex3x.dat", header=F)
y <- read.table("ex3y.dat", header=F)
#normalize features using Z-score.
x[,1] <- (x[,1] - mean(x[,1]))/sd(x[,1])
x[,2] <- (x[,2] - mean(x[,2]))/sd(x[,2])
x <- cbind(x0=rep(1, nrow(x)), x)
x <- as.matrix(x)
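## gradient descent algorithm.
gradDescent_internal <- function(theta, x, y, m, alpha) {
    ## hypothesis for all m examples at once
    h <- as.vector(x %*% theta)
    ## gradient of J: (1/m) * (h - y)' X, a 1 x n row vector
    j <- t(h-y) %*% x
    grad <- 1/m * j
    ## theta is a column vector, so transpose around the update
    theta <- t(theta) - alpha * grad
    theta <- t(theta)
    return(theta)
}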
## cost function: J(theta) = 1/(2m) * sum((h - y)^2).
J <- function(theta, x, y, m) {
    h <- as.vector(x %*% theta)
    j <- sum((h-y)^2)/(2*m)
    return(j)
}
## calculate the cost function J at every iteration for a given alpha value.
testLearningRate <- function(x, y, alpha, niter=50) {
    j <- rep(0, niter)
    m <- nrow(x)
    theta <- matrix(rep(0, ncol(x)), ncol=1)
    for (i in 1:niter) {
        theta <- gradDescent_internal(theta, x, y, m, alpha)
        j[i] <- J(theta, x, y, m)
    }
    return(j)
}
## test learning rate.
alpha <- c(0.01, 0.03, 0.1, 0.3, 1)
xxx <- sapply(alpha, testLearningRate, x=x, y=y)
colnames(xxx) <- as.character(alpha)
library(ggplot2)
library(reshape2)  # provides melt()
xxx <- melt(xxx)
names(xxx) <- c("niter", "alpha", "J")
p <- ggplot(xxx, aes(x=niter, y=J))
p + geom_line(aes(colour=factor(alpha))) + xlab("Number of iterations") + ylab("Cost J")
alpha = 0.3 seems to be the best.
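The choice above is read off the plot. One possible numerical cross-check is to compare the cost reached after a fixed number of iterations for each alpha. The sketch below is self-contained and runs on synthetic data, because ex3x.dat and ex3y.dat are not reproduced here; the simulated values and the helper name final_cost are assumptions made for illustration only, so the ranking it prints need not match the one obtained on the exercise data.

```r
# Hypothetical synthetic data standing in for ex3x.dat / ex3y.dat.
set.seed(1)
m <- 50
x_raw <- cbind(area  = runif(m, 800, 4500),
               rooms = sample(1:5, m, replace = TRUE))
y <- 90000 + 140 * x_raw[, "area"] - 9000 * x_raw[, "rooms"] +
    rnorm(m, sd = 10000)

# Z-score normalization via scale(), then prepend the intercept column.
x <- cbind(x0 = 1, scale(x_raw))

# Cost J reached after `niter` steps of batch gradient descent at one alpha.
final_cost <- function(alpha, x, y, niter = 50) {
    theta <- rep(0, ncol(x))
    for (i in seq_len(niter)) {
        grad <- as.vector(t(x) %*% (x %*% theta - y)) / nrow(x)
        theta <- theta - alpha * grad
    }
    sum((x %*% theta - y)^2) / (2 * nrow(x))
}

alphas <- c(0.01, 0.03, 0.1, 0.3, 1)
costs <- sapply(alphas, final_cost, x = x, y = y)
names(costs) <- alphas
print(costs)
```

A lower value after the same number of iterations means faster convergence at that alpha. A value that grows with niter would indicate divergence.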
Part 2. Normal Equations.
to be continued…
