(This article was first published on **S+/R – Yet Another Blog in Statistical Computing**, and kindly contributed to R-bloggers)

As pointed out in chapter 10 of “The Elements of Statistical Learning”, artificial neural networks (ANN) and support vector machines (SVM) share similar pros and cons, e.g. good predictive power but a lack of interpretability. However, whereas ANN training often gets stuck in local minima, fitting an SVM for given parameter values is a convex optimization problem, so the solution found is a global optimum. In addition, SVM is less prone to over-fitting given a good choice of free parameters, which can usually be identified through cross-validation.
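To make the idea of cross-validation concrete, here is a minimal base-R sketch of how observations might be assigned to K folds. The helper name `assign_folds` and the sizes used are illustrative assumptions and are not part of the original post.

```r
# Hypothetical helper (not from the original post): randomly assign
# n observations to k folds of near-equal size.
assign_folds <- function(n, k) {
  sample(rep(seq_len(k), length.out = n))
}

set.seed(2016)
fold <- assign_folds(n = 10, k = 4)

# Each fold serves once as the hold-out set; the rest is used for fitting.
for (j in sort(unique(fold))) {
  test_idx  <- which(fold == j)
  train_idx <- which(fold != j)
  cat("fold", j, ": train on", length(train_idx),
      "rows, test on", length(test_idx), "rows\n")
}
```

Every observation lands in exactly one fold, so each row is used for testing once and for fitting K - 1 times.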

In the R package “e1071”, the tune() function can be used to search for SVM parameters, but it can be very slow because it evaluates the parameter combinations sequentially rather than in parallel. In the code snippet below, the grid search for SVM parameters is performed with K-fold cross-validation, and the folds for each parameter combination are evaluated in parallel.
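For comparison, the sequential search with tune() might look roughly like the sketch below. It uses the built-in iris data purely as a stand-in, since the credit data set is not bundled with R, and `cross = 4` is an assumption chosen to give a 4-fold setup. The parallel version from the post follows afterwards.

```r
library(e1071)

set.seed(2016)

# Sequential grid search over cost and gamma with 4-fold cross-validation.
# iris is only a placeholder data set for illustration.
tuned <- tune(svm, Species ~ ., data = iris,
              kernel = "radial",
              ranges = list(cost = c(10, 100), gamma = c(1, 2)),
              tunecontrol = tune.control(sampling = "cross", cross = 4))

# One row per parameter combination, with the cross-validated error
print(tuned$performances)

# The combination with the lowest cross-validated error
print(tuned$best.parameters)
```

Note that tune() ranks combinations by classification error by default, whereas the post's own code ranks them by AUC.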

```r
pkgs <- c('foreach', 'doParallel')
lapply(pkgs, require, character.only = TRUE)
registerDoParallel(cores = 4)

### PREPARE FOR THE DATA ###
df1 <- read.csv("credit_count.txt")
df2 <- df1[df1$CARDHLDR == 1, ]
x <- "AGE + ACADMOS + ADEPCNT + MAJORDRG + MINORDRG + OWNRENT + INCOME + SELFEMPL + INCPER + EXP_INC"
fml <- as.formula(paste("as.factor(DEFAULT) ~ ", x))

### SPLIT DATA INTO K FOLDS ###
set.seed(2016)
df2$fold <- caret::createFolds(1:nrow(df2), k = 4, list = FALSE)

### PARAMETER LIST ###
cost <- c(10, 100)
gamma <- c(1, 2)
parms <- expand.grid(cost = cost, gamma = gamma)

### LOOP THROUGH PARAMETER VALUES ###
result <- foreach(i = 1:nrow(parms), .combine = rbind) %do% {
  # named cst rather than c to avoid masking base::c()
  cst <- parms[i, ]$cost
  g <- parms[i, ]$gamma
  ### K-FOLD VALIDATION ###
  out <- foreach(j = 1:max(df2$fold), .combine = rbind, .inorder = FALSE) %dopar% {
    deve <- df2[df2$fold != j, ]
    test <- df2[df2$fold == j, ]
    mdl <- e1071::svm(fml, data = deve, type = "C-classification", kernel = "radial",
                      cost = cst, gamma = g, probability = TRUE)
    pred <- predict(mdl, test, decision.values = TRUE, probability = TRUE)
    # pick the probability column by class label rather than by position;
    # this assumes DEFAULT is coded 0/1, with "1" marking a default
    data.frame(y = test$DEFAULT, prob = attributes(pred)$probabilities[, "1"])
  }
  ### CALCULATE SVM PERFORMANCE ###
  roc <- pROC::roc(as.factor(out$y), out$prob)
  data.frame(parms[i, ], roc = roc$auc[1])
}
```
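Once the grid search has finished, the combination with the highest AUC still has to be picked out of `result`. A minimal sketch follows; `select_best` is a hypothetical helper, not part of the original post, and it only assumes a data frame with one row per parameter combination and a numeric `roc` column.

```r
# Hypothetical helper (not from the original post): return the row of a
# grid-search result with the largest value of the chosen metric.
select_best <- function(result, metric = "roc") {
  stopifnot(is.data.frame(result), metric %in% names(result))
  # sort in decreasing order of the metric; ties keep their original order
  ranked <- result[order(-result[[metric]]), , drop = FALSE]
  rownames(ranked) <- NULL
  ranked[1, , drop = FALSE]
}

# Only runs when a grid-search result is already in the workspace
if (exists("result") && is.data.frame(result)) {
  print(select_best(result))
}
```

The selected `cost` and `gamma` could then be used to refit a single model on the full data. When the search is done, the workers registered by registerDoParallel() can be released with `doParallel::stopImplicitCluster()`.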
