DIY ZeroAccess GeoIP Plots

October 5, 2012 | hrbrmstr

Since F-Secure was #spiffy enough to provide us with GeoIP data for mapping the scope of the ZeroAccess botnet, I thought that some aspiring infosec data scientists might want to see how to use something besides Google Maps & Google Earth to view the data. If you look at the CSV ...

Scraping pages and downloading files using R

October 1, 2012 | Luis

I have written a few posts discussing descriptive analyses of evaluation of National Standards for New Zealand primary schools. The data for roughly half of the schools was made available by the media, but the full version of the dataset is …

m x n matrix with randomly assigned 0/1

August 28, 2012 | Luis

Today Scott Chamberlain tweeted asking for a better/faster solution to building an m x n matrix with randomly assigned 0/1. He already had a working version. Now, I’m the first to acknowledge that I’ve never got the ‘apply’ family of …

read raster data in parallel

August 18, 2012 | Tim Salabim

Use library(parallel) to read raster data in parallel fashion. Recently, I have been doing some analysis for a project I am involved in. In particular, I was interested in what role Pacific sea surface temperatures play with regard to rainfall ...

Automatic Hyperparameter Tuning Methods

July 20, 2012 | John Myles White

At MSR this week, we had two very good talks on algorithmic methods for tuning the hyperparameters of machine learning models. Selecting appropriate settings for hyperparameters is a constant problem in machine learning, which is somewhat surprising given how much expertise the machine learning community has in optimization theory. I ...

Outer Product of Character Vectors in R

July 19, 2012 | Isomorphismes

What follows is like a kata to strengthen your R fundamentals. The lovely stats in the wild recently posted some hott data analysis of Olympians’ ages and sexes. Because I’m annoyingly picky about graphics, I asked for his code so I could ...

Optimization Functions in Julia

July 9, 2012 | John Myles White

Over the last few weeks, I’ve made a concerted effort to develop a basic suite of optimization algorithms for Julia so that Matlab programmers used to using fminunc() and R programmers used to using optim() can start to transition code over to Julia that requires access to simple optimization ...

Bayesian Nonparametrics in R

June 25, 2012 | John Myles White

On July 25th, I’ll be presenting at the Seattle R Meetup about implementing Bayesian nonparametrics in R. If you’re not sure what Bayesian nonparametric methods are, they’re a family of methods that allow you to fit traditional statistical models, such as mixture models or latent factor models, ...

The Great Julia RNG Refactor

June 21, 2012 | John Myles White

Many readers of this blog will know that I’m a big fan of Bayesian methods, in large part because automated inference tools like JAGS allow modelers to focus on the types of structure they want to extract from data rather than worry about the algorithmic details of how they ...

integrating R with other systems

June 16, 2012 | Harlan

I just returned from the useR! 2012 conference for developers and users of R. One of the common themes to many of the presentations was integration of R-based statistical systems with other systems, be they other programming languages, web systems, or enterprise data systems. Some highlights for me were an update ...

R’s increasing popularity. Should we care?

May 17, 2012 | Luis

Some people will say ‘you have to learn R if you want to get a job doing statistics/data science’. I say bullshit, you have to learn statistics and learn to work in a variety of languages if you want to …

cumplyr: Extending the plyr Package to Handle Cross-Dependencies

May 3, 2012 | John Myles White

For me, Hadley Wickham’s reshape and plyr packages are invaluable because they encapsulate omnipresent design patterns in statistical computing: reshape handles switching between the different possible representations of the same underlying data, while plyr automates what Hadley calls the Split-Apply-Combine strategy, in which you split up your data ...
