The R world has come a long way since Jay & I wrote Data-Driven Security. We had to make a conscious decision to stick with R 2.14.0 (R is at version 3.2.1 now) and packages such as knitr and dplyr either didn’t exist or were in their infancy.
In Chapter 4, we showed some very basic exploratory data analysis and visualization. One of those examples showed how to do a basic network visualization of the ZeuS botnet nodes, clustered by country of origin.
We turned some of the functions that collected metadata on the ZeuS IP addresses into a new R package, cymruservices, which will be on CRAN soon. If you’re new to installing from GitHub, you’ll need to install and load the devtools package, then run devtools::install_github("hrbrmstr/cymruservices") to work with that package until it gets on CRAN. (UPDATE: It’s on CRAN.)
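For anyone who hasn’t installed a package from GitHub before, the steps look roughly like this. Treat it as a sketch: it assumes the package is still available from both sources, and you only need one of the two options.

```r
# Option 1: install the released version from CRAN
install.packages("cymruservices")

# Option 2: install the development version from GitHub
# (devtools must be installed first)
install.packages("devtools")
devtools::install_github("hrbrmstr/cymruservices")

# Load the package for the current session
library(cymruservices)
```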
We’ll re-create the first network visualization from listing 4-12 (page 94) using this package and also modify the code to use dplyr functions and visualize the graph with networkD3, a super-spiffy htmlwidget package. You’ll be able to pan & zoom the visualization and hopefully get some inspiration to “Try This At Home”.
We’ve placed the ZeuS botnet data used in the book on our website to make it easier to replicate the example. The code is (unsurprisingly) similar to the listing in the book:
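As a rough illustration of the general shape (this is not the book’s listing), here is a minimal sketch of the dplyr and networkD3 steps. The `origin` data frame is a made-up stand-in for what a metadata lookup such as `cymruservices::bulk_origin()` might return: the IP addresses come from reserved documentation ranges, the country codes are invented, and the column names `ip` and `cc` are assumptions for this example.

```r
library(dplyr)
library(networkD3)

# Hypothetical stand-in for the IP metadata lookup; in a real run this
# would come from the ZeuS IP list plus a lookup such as
# cymruservices::bulk_origin(ips). Values here are illustrative only.
origin <- data.frame(
  ip = c("192.0.2.10", "192.0.2.20", "198.51.100.5", "203.0.113.7"),
  cc = c("US", "US", "DE", "BR"),
  stringsAsFactors = FALSE
)

# Node table: one node per country, followed by one node per IP.
# forceNetwork() expects zero-based node indices, hence the "- 1".
nodes <- bind_rows(
  origin %>% distinct(cc) %>% transmute(name = cc, group = cc),
  origin %>% transmute(name = ip, group = cc)
) %>%
  mutate(id = row_number() - 1)

# Edge table: each IP node links to its country node
links <- origin %>%
  transmute(
    source = nodes$id[match(ip, nodes$name)],
    target = nodes$id[match(cc, nodes$name)],
    value  = 1
  )

# Build the interactive widget; zoom = TRUE enables pan & zoom
widget <- forceNetwork(
  Links = links, Nodes = nodes,
  Source = "source", Target = "target", Value = "value",
  NodeID = "name", Group = "group",
  opacity = 0.9, zoom = TRUE
)

widget
```

Printing `widget` at the console (or in a knitr document) renders the graph. Nodes are colored by the `group` column, so each country and its IPs share a color.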
If you have the book, take a look at some of the subtle changes and also see how easy it is to make existing, static R visualizations dynamic.
There are a few more interesting functions in that package that will get you tons of useful metadata for your security data science projects. The package should be helpful when creating features for classification or just uncovering relationships between objects that you might never have known existed. Plus, you now have a new visualization toy to play with!