Vignette: Simulating a minimal SPSS dataset from R

April 29, 2020 | Martin Chan

What this is about: I will simulate a minimal labelled survey dataset that can be exported as an SPSS (.SAV) file (with full variable and value labels) in R. I will also attempt to fabricate ‘meaningful patterns’ in the dataset such that it can be more effectively used for creating ...

Data cleaning with Kamehamehas in R

April 10, 2020 | Martin Chan

Background Given present circumstances in the world, I thought it might be nice to write a post on a lighter subject. Recently, I came across an interesting Kaggle dataset that features the power levels of Dragon Ball characters at different points in the franchise. Whilst the dataset itself is ...

Vignette: Downloadable tables in RMarkdown with the DT package

December 24, 2019 | Martin Chan

Background In an earlier post in April this year, I discussed using flexdashboard (with RMarkdown) as an appealing and practical R alternative to Excel-based reporting dashboards. Since it’s possible to (i) export these ‘flexdashboards’ as static HTML files that can be opened on practically any computer (virtually no dependencies), (ii) ...

Vignette: Google Trends with the gtrendsR package

October 17, 2019 | Martin Chan

Background Google Trends is a well-known, free tool provided by Google that allows you to analyse the popularity of top search queries on its Google search engine. In market exploration work, we often use Google Trends to get a very quick view of what behaviours, language, and general things are ...

Data Chats: An Interview with Avision Ho

August 1, 2019 | Martin Chan

Introduction Why do an interview? On this occasion, I’ve decided to have a conversation with a data scientist for a change, as opposed to creating a vignette or reviewing a package (atypical of the content on this blog). I’ve always enjoyed int...

A Short Essay on Duplicated R Artefacts

July 5, 2019 | Martin Chan

Organic Development of R Artefacts In a previous post, I alluded to the point that one of the great strengths (but also one of the challenges) of R is the organic way in which R ‘artefacts’ are developed. One characteristic of this “organic d...

Working with SPSS labels in R

June 12, 2019 | Martin Chan

TL;DR: This post provides an overview of R functions for dealing with survey data labels, particularly ones that I wish I’d known when I first started out analysing survey data in R (primarily stored in SPSS data files). Some of these functions come from surveytoolbox, a package I’...

Vignette: Scraping Amazon Reviews in R

May 15, 2019 | Martin Chan

Background One of the pet projects that I had been working on earlier in the year was to figure out an efficient way to gain an insight into what is going on in a consumer market, e.g.: What do people look for when they’re buying a product? What ...

Two Styles of Learning R

May 5, 2019 | Martin Chan

What’s the best way to learn R? Motivations behind the debate Some argue that R fundamentally has a steep learning curve, and that there are no real shortcuts for learning R. I don’t completely agree with that: I think that there are easier way...
