Blog Archives

R: microbenchmark, reshaping big data features

January 7, 2016

pacman::p_load(data.table, microbenchmark) train train_mat f1 f2 microbenchmark(f1(), f2(), times = 10)

R: Remove constant and identical features programmatically

January 7, 2016

##### Removing constant features
cat("\n## Removing the constant features.\n")
for (f in names(train)) {
  if (length(unique(train[[f]])) == 1) {
    cat(f, "is constant in train. We delete it.\n")
    train[[f]] <- NULL
    t...
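The excerpt above is cut off, so here is a minimal base R sketch of the same idea on a toy data frame. It is an illustration, not the post's full code. The column names and values are hypothetical, and the duplicate-column step is one possible way to handle the "identical features" part of the title.

```r
# Toy data frame; column names and values are hypothetical.
train <- data.frame(
  a = c(1, 2, 3),
  b = c(5, 5, 5),        # constant column
  c = c(1, 2, 3),        # identical to column a
  d = c("x", "y", "x")
)

# Flag columns that contain a single unique value, then drop them.
is_constant <- vapply(train, function(col) length(unique(col)) == 1, logical(1))
train <- train[, !is_constant, drop = FALSE]

# Flag columns whose values repeat an earlier column, then drop them.
is_duplicate <- duplicated(as.list(train))
train <- train[, !is_duplicate, drop = FALSE]

names(train)
```

The same logical vectors could be reused to drop the matching columns from a test set so that both data frames keep identical features.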

R: Setup a grid search for xgboost (!!)

January 7, 2016

I find this code super useful because R’s implementation of xgboost (and to my knowledge Python’s) otherwise lacks support for a grid search:
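The post's own code is not shown in this excerpt. As a rough sketch of how such a grid search could be laid out, the example below builds every parameter combination with expand.grid() and scores each one. The parameter values are illustrative assumptions, and cv_error() is a hypothetical stand-in for a real cross-validation call (for example, a wrapper around xgboost::xgb.cv()).

```r
# Candidate values are illustrative assumptions, not tuned recommendations.
grid <- expand.grid(eta = c(0.01, 0.1, 0.3), max_depth = c(3, 6))

# Hypothetical stand-in for a cross-validated error. In real use this
# function would train and evaluate a model for the given parameters.
cv_error <- function(eta, max_depth) {
  (eta - 0.1)^2 + (max_depth - 6)^2 / 100
}

# Score every row of the grid and keep the best combination.
grid$error <- mapply(cv_error, grid$eta, grid$max_depth)
best <- grid[which.min(grid$error), ]
best
```

Keeping the grid in a data frame means the full table of scores stays available for inspection after the search, not just the winning row.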

How to Conditionally Remove Character of a Vector Element in R

October 30, 2015

I have (sometimes incomplete) address data that looks like this: data <- c("1600 Pennsylvania Avenue, Washington DC", ",Siem Reap,FC,", "11 Wall Street, New York, NY", ",Addis Ababa,FC,") where I need to remove the first and/or last character if either one of them is a comma. Avinash Raj was able to help me with this
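The answer itself is not included in this excerpt. One way to do this in base R, offered as a sketch and not necessarily the solution the post describes, is a single gsub() call with anchors for the start and end of the string. The name cleaned is introduced here for illustration.

```r
data <- c("1600 Pennsylvania Avenue, Washington DC",
          ",Siem Reap,FC,",
          "11 Wall Street, New York, NY",
          ",Addis Ababa,FC,")

# "^," matches a comma at the start, ",$" matches a comma at the end.
# Commas in the middle of a string are left alone.
cleaned <- gsub("^,|,$", "", data)
cleaned
```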

Write an R Package from Scratch with Github

September 9, 2015

Writing an R package is simple. Writing an R package via Github is simple and smart. Github adds all the traditional benefits of version control, in addition to showing off your work and facilitating publication of your package. This tutorial was inspired by a blog post from the beautiful Hilary Parker last year.

R: Happy Pi Day

March 14, 2015

Today, 3/14/2015, is Pi Day (see http://piday.org). In honor of Pi Day, I threw together a little R code on Github, which discusses pi, prints it, and creates Julia set (fractal) images based on it: https://github.com/hack-r/Rpiday Happy Pi Day!

R: How to Transform “prob” Predictions to a Single Column of Predicted Values

March 13, 2015

# Recombine Test + Training ----
a <- cbind(x1, y1)
b <- cbind(x, y)
a$actual <- a$y1
b$actual <- b$y
a$y1 <- NULL
b$y <- NULL
c <- rbind(a, b)
# Run Predictions for Entire Data Set ----
all_preds <- predict(rf, newdata = c, type = "prob")
colSums(all_preds)
summary(c$actual)
c$predicted <- apply(all_preds, 1, which.max)
then
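The excerpt stops before the next step. Note that which.max returns the column index of the largest probability, not the class label. A small self-contained sketch of turning that index into a label is shown below. The probability matrix and the class names are hypothetical stand-ins for what predict(..., type = "prob") would return.

```r
# Hypothetical class-probability matrix, one row per observation.
all_preds <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.3, 0.6,
    0.2, 0.5, 0.3),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("low", "mid", "high"))
)

# Column index of the most probable class for each row.
idx <- apply(all_preds, 1, which.max)

# Map each index back to the class label stored in the column names.
predicted <- colnames(all_preds)[idx]
predicted
```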

Machine Learning: Definition of %Var(y) in R’s randomForest package’s regression method

March 13, 2015

The second column is simply the first column divided by the variance of the responses that have been OOB (out-of-bag) up to that point (20 trees), times 100. Source: https://stat.ethz.ch/pipermail/r-help/2008-July/167748.html
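As a quick arithmetic illustration of that definition, the sketch below uses made-up numbers. It assumes the first column of the trace output is the OOB mean squared error, and the variable names are hypothetical.

```r
# Hypothetical values for illustration only.
oob_mse <- 12.5  # first column: assumed to be the OOB mean squared error
var_y   <- 50    # variance of the responses that have been OOB so far

# Second column: first column divided by that variance, times 100.
pct_var <- 100 * oob_mse / var_y
pct_var
```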

R: Add smoother to ggplot2 plot (geom_smooth()) in 1 line

March 13, 2015

Just use qplot(votes, rating, data = movies) + geom_smooth()

Did you know? Source of ggplot2 in R

March 13, 2015

You thought it was Hadley Wickham, right? Nope! The ideas behind ggplot2 come from the Grammar of Graphics, developed by Leland Wilkinson.
