Articles by sindri

Fighting Factors with Cats: Exercises

July 20, 2018 | sindri

In this exercise set, we will practice using the forcats factor manipulation package by Hadley Wickham. In the last exercise set, we saw that it is entirely possible to deal with factors in base R, but also that things can get a bit involved and unintuitive. Forcats simplifies many common ...
[Read more...]

Melt and cast the shape of your data.frame – Exercises

June 22, 2018 | sindri

Datasets often arrive in a form that is different from what our modelling or visualisation functions need, and those functions in turn don’t necessarily require the same format. Reshaping data.frames is a step that all analysts need but many struggle with. Practicing this meta-skill will in ...
[Read more...]

Create and Format a Google Sheet Within R: Exercises

May 10, 2018 | sindri

In this exercise set, we will practice using the Google Sheets package to create and manipulate a Google spreadsheet within R. After completing this exercise set, you will be able to prepare a basic Google Sheets document using just R, leaving behind a reproducible R-script. Note that using Google Sheets ...
[Read more...]

Well-Behaved Functions – Exercises

April 26, 2018 | sindri

It is said that, in R, everything that happens is a function call. So, if we want to improve our ability to make things happen the way we want them to, maybe it’s worth getting very comfortable with how functions work in R. In this exercise set, we’ll ...
[Read more...]

K-Means Clustering in R – Exercises

April 13, 2018 | sindri

K-means is efficient and perhaps the most popular clustering method. It is a way of finding natural groups in otherwise unlabeled data. You specify the number of clusters you want and the algorithm minimizes the total within-cluster variance. In this exercise, we will play around with the base R ...
[Read more...]

Loops in R – Exercises

March 30, 2018 | sindri

Using loops is generally discouraged in R when it is possible to avoid them using vectorized alternatives. Vectorized solutions are faster to write, read and execute, except sometimes they aren’t, and the definition of vectorization isn’t always straightforward. In any event, solutions using loops can be: ...
[Read more...]

Answer probability questions with simulation (part-2)

September 20, 2017 | sindri

This is the second exercise set on answering probability questions with simulation. Finishing the first exercise set is not a prerequisite. The difficulty level is about the same, so if you are looking for a challenge, aim at writing faster, more elegant algorithms. As always, it pays off to ... [Read more...]

Basics of data.table: Smooth data exploration

August 23, 2017 | sindri

The data.table package provides perhaps the fastest way to wrangle data in R. The syntax is concise and is made to resemble SQL. After studying the basics of data.table and finishing this exercise set successfully, you will be able to start easing into using data.table for all ... [Read more...]

Answer probability questions with simulation

August 20, 2017 | sindri

Probability is at the heart of data science. Simulation is also commonly used in algorithms such as the bootstrap. After completing this exercise, you will have a slightly stronger intuition for probability and for writing your own simulation algorithms. Most of the problems in this set have an exact analytical ... [Read more...]

Soccer data sparring: Scraping, merging and analyzing exercises

August 8, 2017 | sindri

While understanding and spending time improving specific techniques and strengthening individual muscles is important, occasionally it is necessary to do some rounds of actual sparring to see your flow and spot weaknesses. This exercise set forces you to use all that you have practiced: to scrape links, download data, regular ... [Read more...]

Working with the xlsx package Exercises (part 2)

June 28, 2017 | sindri

This exercise set provides (further) practice in writing Excel documents using the xlsx package, as well as in importing and general data manipulation. Specifically, we have loops in order for you to practice scaling. A previous exercise set focused on writing a simple sheet with the same package, see here. We ... [Read more...]

Using the xlsx package to create an Excel file

June 17, 2017 | sindri

Microsoft Excel is perhaps the most popular data analysis tool out there. While arguably convenient, spreadsheet software is error-prone and Excel code can be very hard to review and test. After successfully completing this exercise set, you will be able to prepare a basic Excel document using just R (... [Read more...]
