Articles by Andrew Collier

satRday in Cape Town (from Exegetic Analytics)

November 30, 2016 | Andrew Collier

The second satRday (and first satRday on African soil) will happen in Cape Town on 18 February 2017. It’s going to be a one-day celebration of R. We have a trio of phenomenal keynote speakers (Hilary Parker, Jenny Bryan and Julia Silge) who will be giving inspiring talks at the ... [Read more...]

satRday Cape Town: Call for Submissions

October 26, 2016 | Andrew Collier

satRday Cape Town will happen on 18 February 2017 at Workshop 17, Victoria & Alfred Waterfront, Cape Town, South Africa. Keynotes and Workshops We have a trio of fantastic keynote speakers: Hilary Parker, Jennifer Bryan and Julia Silge, who’ll be dazzling you on the day as well as presenting workshops on the two ... [Read more...]

Fixing “Peer certificate cannot be authenticated”

September 16, 2016 | Andrew Collier

I’m currently getting the following error on a Windows machine: The machine in question is sitting behind a gnarly firewall and proxy, which I suspect are the source of the problem. I also need to use --ignore-certificate-errors when running chromium-browser, which points to the same issue. This seems to ... [Read more...]

ubeR: A Package for the Uber API

August 31, 2016 | Andrew Collier

Uber exposes an extensive API for interacting with their service. ubeR is an R package for working with that API which Arthur Wu and I put together during a Hackathon at iXperience. Installation The package is currently hosted on GitHub. Installation is simple using the devtools package. Authentication To work ... [Read more...]

Sportsbook Betting (Part 2): Bookmakers’ Odds

August 10, 2016 | Andrew Collier

In the first instalment of this series we gained an understanding of the various types of odds used in Sportsbook betting and the link between those odds and implied probabilities. We noted that the implied probabilities for all possible outcomes in an event may sum to more than 100%. At first ... [Read more...]

feedeR: Reading RSS and Atom Feeds from R

August 8, 2016 | Andrew Collier

I’m working on a project in which I need to systematically parse a number of RSS and Atom feeds from within R. I was somewhat surprised to find that no package currently exists on CRAN to handle this task. So this presented the opportunity for a bit of DIY. ... [Read more...]

Web Scraping and “invalid multibyte string”

August 2, 2016 | Andrew Collier

A couple of my collaborators have had trouble using read_html() from the xml2 package to access this Wikipedia page. Specifically they have been getting errors like this: Since I couldn’t reproduce these errors on my machine it appeared to be something relating to their particular machine setup. Looking ... [Read more...]

Sportsbook Betting (Part 1): Odds

August 1, 2016 | Andrew Collier

This series of articles was written as support material for Statistics exercises in a course that I’m teaching for iXperience. In the series I’ll be using illustrative examples for wagering on a variety of Sportsbook events including Horse Racing, Rugby and Tennis. The same principles can be applied ... [Read more...]

Building a Life Table

July 28, 2016 | Andrew Collier

After writing my previous post, Mortality by Year and Age, I’ve become progressively more interested in the mortality data. Perhaps those actuaries are onto something? I found this report, which has a wealth of pertinent information. On p. 13 the report gives details on constructing a Life Table, which is ... [Read more...]

Calculating Pi using Buffon’s Needle

July 26, 2016 | Andrew Collier

I put together this example to illustrate some general R programming principles for my Data Science class at iXperience. The idea is to use Buffon’s Needle to generate a stochastic estimate for pi. Here are the results (click on the image for an interactive version). The orange line is ... [Read more...]

Mortality by Year and Age

July 22, 2016 | Andrew Collier

Taking another look at the data from the lifespan package. Plot below shows the evolution of mortality in the US as a function of year and age. Also, following up on a suggestion from @robjohnnoble, population data have been included in the package. The post Mortality by Year and Age ... [Read more...]

Life Expectancy by Country

July 20, 2016 | Andrew Collier

I was rather inspired by this plot on Wikipedia’s List of Countries by Life Expectancy. Shouldn’t be too hard to reproduce with a bit of scraping. Here are the results (click on the static image to view the interactive plot): The bubble plot above compares female and male ... [Read more...]

Escalating Life Expectancy

July 18, 2016 | Andrew Collier

I’ve added mortality data to the lifespan package. A result that immediately emerges from these data is that average life expectancy is steadily climbing. The effect is more pronounced for men, rising from around 66.5 in 1994 to 70.0 in 2014. The corresponding values for women are 74.6 and 76.5 respectively. Good news for everyone. ... [Read more...]

Birth Month by Gender

July 16, 2016 | Andrew Collier

Based on some feedback to a previous post I normalised the birth counts by the (average) number of days in each month. As pointed out by a reader, the results indicate a gradual increase in the number of conceptions during (northern hemisphere) Autumn and Winter, roughly up to the end ... [Read more...]

Most Probable Birth Month

July 14, 2016 | Andrew Collier

In a previous post I showed that the data from www.baseball-reference.com support Malcolm Gladwell’s contention that more professional baseball players are born in August than any other month. Although this might be explained by the 31 July cutoff for admission to baseball leagues, it was suggested that it ... [Read more...]

Major League Baseball Birth Months

July 5, 2016 | Andrew Collier

“The cutoff date for almost all nonschool baseball leagues in the United States is July 31, with the result that more major league players are born in August than in any other month.” (Malcolm Gladwell, Outliers) A quick analysis to confirm Gladwell’s assertion above. Used data scraped from www.baseball-reference.... [Read more...]

R Saturday [satRday] in Cape Town

May 12, 2016 | Andrew Collier

I put in a proposal to host an R Saturday [satRday] in Cape Town next year. The R Consortium has committed to funding three of these events: one will be in Hungary, another will be somewhere in the USA and the third will be elsewhere in the world. The voting ... [Read more...]

International Open Data Day

March 5, 2016 | Andrew Collier

As part of International Open Data Day we spent the morning with a bunch of like-minded people poring over some open Census South Africa data. Excellent initiative, @opendatadurban, I’m very excited to see where this is all going and look forward to contributing to the journey! The data ... [Read more...]

R, HDF5 Data and Lightning

February 23, 2016 | Andrew Collier

I used to spend an inordinate amount of time digging through lightning data. These data came from a number of sources, the World Wide Lightning Location Network (WWLLN) and LIS/OTD being the most common. I recently needed to work with some Hierarchical Data Format (HDF) data. HDF is something ... [Read more...]