R-bloggers

Creating a Movie with Data from Outer Space in R

July 2, 2019

The Rosetta mission of the European Space Agency (ESA) is one of the greatest (yet underappreciated) triumphs of humankind: it was launched in 2004 and landed the spacecraft Philae ten years later on a small comet, named 67P/Churyumov–Gerasimenko (for the whole timeline of the mission see here: Timeline of Rosetta spacecraft). ESA provided the world … Continue reading "Creating...

Read more »

VLOOKUP in R with Schwartau Beehive Data

July 1, 2019

I started learning R back in 2016 in college thanks to a couple of my professors who used it to teach statistics: Dr. Grimshaw and Dr. Lawson. Thanks to the R community I’ve learned a lot more since then, but recently I did an embarrassing Google search for “how to do VLOOKUP in r.” For those of you who don’t know, VLOOKUP...

Read more »

Prob/Stat for Data Sci: Math + R + Data

June 30, 2019

My new book, Probability and Statistics for Data Science: Math + R + Data, published by CRC Press, was released on June 24! This book arose from an open-source text I wrote and have been teaching from. The open-source version will still be available, though it is rather different from the published one. This is … Continue reading Prob/Stat...

Read more »

Powerlytics: Impact of Age, Gender, and Weight on Total Weight Lifted in Powerlifting Meets

June 30, 2019

A. Background The Open Powerlifting initiative attempts to create an accurate and open archive of all powerlifting meet data throughout the world. As someone who recently started competing again after a six-year break from powerlifting, I often mess around with the Open Powerlifting data as it’s of personal interest. Most of the analysis that … Continue reading Powerlytics:...

Read more »

Comrades Marathon (2019) Splits

June 30, 2019

I’m looking at ways to effectively visualise the splits data for the 2019 edition of the Comrades Marathon. My objectives are to provide: an overall view of the splits across the entire field and a detailed view for individual runners (relative to the rest of the field). Ridge Plot My working solution for visualising the global splits data is a ridgeline plot created...

Read more »

Hubway Station Metrics

June 30, 2019

Hubway, a bike sharing system in Boston, was launched in July of 2011. In the past 8 years, they have expanded to over 150 locations throughout the city. In 2014, as a part of a data science challenge, Hubway made 3 years of its data public. This reflected every time a user started or ended

Read more »

B3 is shutting down its ftp site

June 30, 2019

Well, bad news travels fast. Over the last couple of weeks I’ve received a couple of emails regarding B3’s decision to shut down its ftp site. More specifically, users are eager to know how it will impact my data-grabbing packages in ...

Read more »

Imagine your Data Before You Collect It

June 30, 2019

As data scientists, we are often presented with a dataset and are asked to use it to produce insights. We use R to wrangle, visualize, model, and produce tables and plots for sharing or publication. When we focus on the data in hand in this way, we don’t get to consider where the data came from. The sample size...

Read more »

Reordering and facetting for ggplot2

June 30, 2019

I recently wrote about the release of tidytext 0.2.1, and one of the most useful new features in this release is a couple of helper functions for making plots with ggplot2. These helper functions address a class of challenges that often arises when dealing with text data, so we’ve included them in the tidytext package. Let’s work through an example To...

Read more »

Where the Beer is

June 30, 2019

Since Prohibition ended in 1933, the American brewing industry has undergone two massive transformations. The first saw hundreds of regional breweries from across the country, often brewing beer unique to their respective regions, become consolidated into a handful of behemoths. During the period of greatest consolidation in the early 1980s the ten largest breweries produced almost

Read more »
