Blog Archives

Maybe I Don’t Really Know R After All

June 26, 2014

Lately, I’ve been feeling that I’m spreading myself too thin in terms of programming languages. At work, I spend most of my time in Hive/SQL, with the occasional Python for my smaller data. I really prefer Julia, but I’m alone at work on that one. And since I maintain a package on CRAN (RSiteCatalyst), I frequently spend...

Using Julia As A ‘Glue’ Language

June 24, 2014

While much of the focus in the Julia community has been on the performance aspects of Julia relative to other scientific computing languages, Julia is also perfectly suited to ‘glue’ together multiple data sources/languages. In this blog post, I will cover how to create an interactive plot using Gadfly.jl, by first preparing the data using...

Five Hard-Won Lessons Using Hive

June 12, 2014

I’ve been spending a ton of time lately on the data engineering side of ‘data science’, so I’ve been writing a lot of Hive queries. Hive is a great tool for querying large amounts of data, without having to know very much about the underpinnings of Hadoop. Unfortunately, there are a lot of things about...

Building JSON in R: Three Methods

May 13, 2014

When I set out to build RSiteCatalyst, I had a few major goals: learn R, build a CRAN-worthy package and learn the Adobe Analytics API. As I reflect on how the package has evolved over the past two years and what I’ve learned, I think my biggest lesson was how to deal with JSON...

Real-time Reporting with the Adobe Analytics API

March 10, 2014

Starting with version 1.3.1 of RSiteCatalyst, you can now access the real-time reporting capabilities of the Adobe Analytics API through a familiar R interface. Here’s how to get started… GetRealTimeConfiguration: Before using the real-time reporting capabilities of Adobe Analytics, you first need to indicate which metrics and elements you are interested in seeing in real time. To...

RSiteCatalyst Version 1.3 Release Notes

February 4, 2014

Version 1.3 of the RSiteCatalyst package to access the Adobe Analytics API is now available on CRAN! Changes include: search via regex functionality in QueueRanked/QueueTrended functions; support for Realtime API reports (Overtime and one-element Ranked report); allow for variable API request timing in Queue functions; fixed validate flag in JSON request to work correctly; deprecated...

Quickly Create Dummy Variables in a Data Frame

January 2, 2014

On Quora, someone asked how to fix the error caused by the randomForest package in R not being able to handle more than 32 levels in a categorical variable. Since I’ve seen this question asked on Kaggle forums, StackOverflow and elsewhere, here’s the answer: code your own dummy variables instead of...

Adobe Analytics Implementation Documentation in 60 seconds

December 9, 2013

When I was working as a digital analytics consultant, no question had quite the ability to cause belly laughs AND angst as, “Can you send me an updated copy of your implementation documentation?” I saw companies that were spending six or seven figures annually on their analytics infrastructure, multi-millions in salary for employees and yet the only way...

RSiteCatalyst Version 1.2 Release Notes

November 5, 2013

Version 1.2 of the RSiteCatalyst package to access the Adobe Analytics API is now available on CRAN! Changes include: removed RCurl package dependency; changed argument order for GetAdminConsoleLog to avoid error when date not passed; return proper numeric type for metric columns; fixed bug in GetEVars function; added validate:true flag to API to improve error reporting; removed remaining...

Clustering Search Keywords Using K-Means Clustering

September 17, 2013

One of the key tenets of doing impactful digital analysis is understanding what your visitors are trying to accomplish. One of the easiest ways to do this is to analyze the words your visitors use to arrive on site (search keywords) and the words they use while on the site (on-site search). Although Google has...
