Articles by Randy Zwitch

Visualizing Website Pathing With Sankey Charts

September 10, 2014 | Randy Zwitch

In my prior post on visualizing website structure using network graphs, I noted that network graphs show the pairwise relationships between two pages (in a bi-directional manner). However, if you want to analyze how your visitors are pathing through your site, you can visualize your data using a Sankey chart. ... [Read more...]

Visualizing Website Pathing With Network Graphs

September 8, 2014 | Randy Zwitch

Last week, version 1.4 of RSiteCatalyst was released, and it’s now possible to get site pathing information directly within R. That makes it easy to create impressive-looking network graphs from your Adobe Analytics data using RSiteCatalyst and d3Network. In this blog post, I will cover simple and force-directed ... [Read more...]

RSiteCatalyst Version 1.4 Release Notes

September 1, 2014 | Randy Zwitch

It felt like it would never happen, but RSiteCatalyst v1.4 is now available on CRAN! There are numerous changes in this version of the package, so unlike previous posts, there won’t be any code examples. THIS VERSION IS ONE BIG BREAKING CHANGE. While not the most important improvement, it ... [Read more...]

Maybe I Don’t Really Know R After All

June 26, 2014 | Randy Zwitch

Lately, I’ve been feeling that I’m spreading myself too thin in terms of programming languages. At work, I spend most of my time in Hive/SQL, with the occasional Python for my smaller data. I really prefer Julia, but I’m alone at work on that one. And ... [Read more...]

Using Julia As A ‘Glue’ Language

June 24, 2014 | Randy Zwitch

While much of the focus in the Julia community has been on the performance aspects of Julia relative to other scientific computing languages, Julia is also perfectly suited to ‘glue’ together multiple data sources/languages. In this blog post, I will cover how to create an interactive plot using Gadfly.... [Read more...]

Five Hard-Won Lessons Using Hive

June 12, 2014 | Randy Zwitch

I’ve been spending a ton of time lately on the data engineering side of ‘data science’, so I’ve been writing a lot of Hive queries. Hive is a great tool for querying large amounts of data, without having to know very much about the underpinnings of Hadoop. Unfortunately, ... [Read more...]

Building JSON in R: Three Methods

May 13, 2014 | Randy Zwitch

When I set out to build RSiteCatalyst, I had a few major goals: learn R, build a CRAN-worthy package and learn the Adobe Analytics API. As I reflect back on how the package has evolved over the past two years and what I’ve learned, I think my greatest learning ... [Read more...]

Real-time Reporting with the Adobe Analytics API

March 10, 2014 | Randy Zwitch

Starting with version 1.3.1 of RSiteCatalyst, you can now access the real-time reporting capabilities of the Adobe Analytics API through a familiar R interface. Here’s how to get started… GetRealTimeConfiguration: Before using the real-time reporting capabilities of Adobe Analytics, you first need to indicate which metrics and elements you are ... [Read more...]

RSiteCatalyst Version 1.3 Release Notes

February 4, 2014 | Randy Zwitch

Version 1.3 of the RSiteCatalyst package to access the Adobe Analytics API is now available on CRAN! Changes include: Search via regex functionality in QueueRanked/QueueTrended functions; Support for Realtime API reports: Overtime and one-element Ranked report; Allow for variable API request timing in Queue functions; Fixed validate flag in JSON ... [Read more...]

Quickly Create Dummy Variables in a Data Frame

January 2, 2014 | Randy Zwitch

On Quora, a question was asked about how to fix the error of the randomForest package in R not being able to handle more than 32 levels in a categorical variable. Seeing as how I’ve seen this question asked on Kaggle forums, StackOverflow and elsewhere, here’s the answer: code ... [Read more...]

Adobe Analytics Implementation Documentation in 60 seconds

December 9, 2013 | Randy Zwitch

When I was working as a digital analytics consultant, no question had quite the ability to cause belly laughs AND angst as, “Can you send me an updated copy of your implementation documentation?” I saw companies that were spending six or seven figures annually on their analytics infrastructure, multi-millions in salary for employees ... [Read more...]

RSiteCatalyst Version 1.2 Release Notes

November 5, 2013 | Randy Zwitch

Version 1.2 of the RSiteCatalyst package to access the Adobe Analytics API is now available on CRAN! Changes include: Removed RCurl package dependency; Changed argument order for GetAdminConsoleLog to avoid error when date not passed; Return proper numeric type for metric columns; Fixed bug in GetEVars function; Added validate:true flag ... [Read more...]

Clustering Search Keywords Using K-Means Clustering

September 17, 2013 | Randy Zwitch

One of the key tenets of doing impactful digital analysis is understanding what your visitors are trying to accomplish. One of the easiest methods to do this is by analyzing the words your visitors use to arrive on site (search keywords) and what words they are using while on the ... [Read more...]

RSiteCatalyst Version 1.1 Release Notes

August 25, 2013 | Randy Zwitch

RSiteCatalyst version 1.1 is now available on CRAN. Changes from version 1 include: Support for Correlations/Subrelations in the QueueRanked function; Support for Current Data in all ‘Queue’ functions; Support Anomaly Detection for QueueOvertime and QueueTrended functions (example usage with ggplot2 graph); Decrease in wait time for API calls (from 5 seconds to 2 ... [Read more...]

Anomaly Detection Using The Adobe Analytics API

August 15, 2013 | Randy Zwitch

As digital marketers & analysts, we’re often asked to quantify when a metric goes beyond just random variation and becomes an actual “unexpected” result. In cases such as A/B..N testing, it’s easy to calculate a t-test to quantify the difference between two testing populations, but for time-series ... [Read more...]

Tabular Data I/O in Julia

August 6, 2013 | Randy Zwitch

Importing tabular data into Julia can be done in (at least) three ways: reading a delimited file into an array, reading a delimited file into a DataFrame and accessing databases using ODBC. Reading a file into an array using readdlm: The most basic way to read data into Julia is ... [Read more...]

A Beginner’s Look at Julia

July 23, 2013 | Randy Zwitch

Over the past month or so, I’ve been playing with a new scientific programming language called ‘Julia’, which aims to be a high-level language with performance approaching that of C. With that goal in mind, Julia could be a solution to the ‘multi-language’ problem of needing to move between ... [Read more...]

Innovation Will Never Be At The Push Of A Button

May 17, 2013 | Randy Zwitch

“@randyzwitch @benjamingaines @usujason I am envisioning the data science equivalent of an autonomous vehicle pileup.” (Todd Belcher, @toddmetrics, May 16, 2013)

Recently, I’ve been getting my blood pressure up reading (marketing) articles about “big data” and “data science”. What saddens me about the whole discussion is that there is the underlying ... [Read more...]