Posts Tagged ‘programming’

DIY ZeroAccess GeoIP Plots

October 5, 2012

Since F-Secure was #spiffy enough to provide us with GeoIP data for mapping the scope of the ZeroAccess botnet, I thought that some aspiring infosec data scientists might want to see how to use something besides Google Maps & Google Earth to view the data. If you look at the CSV file, it’s formatted as

Scraping pages and downloading files using R

October 1, 2012

I have written a few posts discussing descriptive analyses of the evaluation of National Standards for New Zealand primary schools. The data for roughly half of the schools was made available by the media, but the full version of the dataset is …

m x n matrix with randomly assigned 0/1

August 28, 2012

Today Scott Chamberlain tweeted asking for a better/faster solution to building an m x n matrix with randomly assigned 0/1. He already had a working version: Now, I’m the first to acknowledge that I’ve never got the ‘apply’ family of …

read raster data in parallel

August 18, 2012

Use library(parallel) to read raster data in parallel fashion. Recently, I have been doing some analysis for a project I am involved in. In particular, I was...

My New Book: Developing, Deploying and Debugging Multi-Armed Bandit Algorithms

July 28, 2012

I’m happy to announce that I’ve started writing a new book for O’Reilly, which will focus on teaching readers how to use Multi-Armed Bandit Algorithms to build better websites. My hope is that the book can help web developers build up an intuition for the core conundrum facing anyone who wants to build a successful

renaming data frame columns in lists

July 24, 2012

Renaming the columns of data frames which are stored in lists of lists. OK, so the scenario is as follows: we have a...

Automatic Hyperparameter Tuning Methods

July 20, 2012

At MSR this week, we had two very good talks on algorithmic methods for tuning the hyperparameters of machine learning models. Selecting appropriate settings for hyperparameters is a constant problem in machine learning, which is somewhat surprising given how much expertise the machine learning community has in optimization theory. I suspect there’s interesting psychological and

Outer Product of Character Vectors in R

July 19, 2012

What follows is like a kata to strengthen your R fundamentals. The lovely Stats in the Wild recently posted some hott data analysis of Olympians’ ages and sexes. Because I’m annoyingly picky about graphics, I asked for his code so I could ...

introduction to R: learning by doing (part 2: plots)

July 10, 2012

Let’s go on with the second part of learning R by doing R (you will find the first part here). As we have used vectors, matrices and loops in the first part, we will concentrate on graphics in this one. But first we will need data to plot: Sometimes you will need several plots in

Optimization Functions in Julia

July 9, 2012

Over the last few weeks, I’ve made a concerted effort to develop a basic suite of optimization algorithms for Julia so that Matlab programmers used to using fminunc() and R programmers used to using optim() can start to transition code over to Julia that requires access to simple optimization algorithms like L-BFGS and the Nelder-Mead
