Riding Watopia

[This article was first published on R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

When it’s raining or when it’s just cold outside, I like to ride on Zwift. I especially like Watopia. Let’s see what can be done with the FIT files that are generated. Via the Zwift dashboard you can download the FIT file which is created after each ride. It’s also located on your PC/Mac.

Getting the data

I use the cycleRtools package to read the FIT file but unfortunately all conversions to a data.frame do not work (more about the error at the end of this post). I have to do that manually. It’s bit of extra work but but gives some insights in what the package normally does for you.

library(cycleRtools)

zwift <- read_fit("file.fit", format = FALSE)
# list to data frame
zwift <- do.call(cbind.data.frame, zwift)
# rename columns and modify data
names(zwift) <- c("timestamp","lat","lon","dist","heart","alt","speed_ms","pwr","cad",
                  "enh_alt","enh_speed_ms","X","lap")
zwift$X <- NULL                                                         # don't know what that is, throw it out
zwift$timestamp <- as.POSIXct(zwift$timestamp, origin="1989-12-31")     # this is the zwift reference date
zwift$dist <- zwift$dist / 1000                                         # distance in km
zwift$lat <- zwift$lat / 11930465                                       # 11930465 (2^32 / 360)
zwift$lon <- zwift$lon / 11930465                                       # see ref [1] for background
zwift$moving <- as.POSIXct(c(1:nrow(zwift)), 
                origin="1989-12-31 00:00:00") - 3600                    # moving time, origin is required
zwift$speed <- zwift$speed_ms * 3.6                                     # speed in km/h

str(zwift) provides info on the data frame. Note that timestamp is the actual date and moving has 1989-12-31 as reference date. It’s in the data but the date part won’t be used, so it doesn’t really matter.

'data.frame':    9233 obs. of  13 variables:
$ timestamp   : POSIXct, format: "2021-01-23 20:33:12" "2021-01-23 20:33:13" "2021-01-23 20:33:14" …
$ lat         : num  -11.6 -11.6 -11.6 -11.6 -11.6 …
$ lon         : num  167 167 167 167 167 …
$ dist        : num  0.00192 0.00398 0.00652 0.00944 0.01265 …
$ heart       : int  116 116 116 116 116 116 116 116 116 116 …
$ alt         : num  12.8 12.8 12.8 12.8 12.8 …
$ speed_ms    : num  1.74 2.29 2.73 3.11 3.42 …
$ pwr         : int  83 99 100 100 98 98 98 106 106 86 …
$ cad         : int  26 35 42 48 1 52 54 54 55 56 …
$ enh_alt     : num  12.8 12.8 12.8 12.8 12.8 …
$ enh_speed_ms: num  1.74 2.29 2.73 3.11 3.42 …
$ lap         : Ord.factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 …
$ moving      : POSIXct, format: "1989-12-31 00:00:01" "1989-12-31 00:00:02" "1989-12-31 00:00:03" …
$ speed       : num  6.26 8.24 9.83 11.2 12.33 …

Basic route and height profile

We now can create basic plots which give some information on the route (Watopia, Big Foot Hills) and height profile.

par(mfrow=c(1,2), mai=c(0.9,0.9,0.1,0.1))                               # set the plot area in a 1*2 array
plot(zwift$lon, zwift$lat, type = "l", lwd = 2, xlab = "Latitude", ylab = "Longitude")
plot(zwift$moving, zwift$alt, type = "h", lwd = 2, col = "darkgrey", xlab= "Moving time", 
     ylab = "Elevation [m]")

To get information on the time on the bike and the actual moving time we can do this. I find calculating with times/dates usually a bit tricky as they give a lot of errors before it eventually works. Probably due to my lack of knowledge on this topic. The hms package brings relief.

library(hms)                                               #needed for time calculations
total_time <- as_hms(difftime(zwift$timestamp[NROW(zwift)], zwift$timestamp[1]))
moving_time <- as_hms(difftime(zwift$moving[NROW(zwift)], zwift$moving[1]))
break_time <- as_hms(difftime(total_time, moving_time))

Which gives these results.

   > total_time
   02:48:32
   > moving_time
   02:33:52
   > break_time
   00:14:40 

The data also contains heart rate and speed and I’d like to compare that with climbing the hills of this Watopian route. The hill segments are nicely described by What’s on Zwift and I used that data for the next steps. First a new data frame.

#Round down the distance entries in the source table, otherwise the positions cannot be found
zwift$rounddist <- round(zwift$dist,2)

#Data about the hills
colstart <- c(10.30, 19.30, 29.20, 47.15, 59.30, 67.00)
colend <- c(13.025, 20.30, 33.10, 48.10, 61.95, 67.50)
coltype <- c("col", "col", "col", "col", "col", "sprint")
colname <- c("Titans Grove KOM\n2.6km, 2.2%", "Zwift KOM\n 900m, 5%", "Volcano KOM\n3.75km, 3.2%", 
             "Titans Grove KOM Reverse\n890m, 6.6%", "Zwift KOM Reverse\n2.4km, 2%", 
             "Fuego Flats\n496m Sprint")
 
# Combine in data frame
cols <- data.frame(colstart, colend, coltype, colname)
# this is a lead in to the actual route, a warm up. This distance needs to be added
lead_in = 2.47  
starts <- zwift[match(unique(round(cols$colstart + lead_in),2), zwift$rounddist), 4]  #4th column is distance
ends <- zwift[match(unique(round(cols$colend + lead_in),2), zwift$rounddist), 4]  
cols$colstart_li <- starts
cols$colend_li <- ends


# Postion of the start of the label, x = mean value start + finish. y postion = find altitude
cols$labelposx <- (cols$colend_li + cols$colstart_li)/2
# Find the elevation at the middle point of the climb in column 6. And add 5 for an offset of the text balloon
cols$labelposy <- zwift[match(unique(round(cols$labelposx),2), zwift$rounddist), 6] + 5

# Clean up
rm(colstart,colend,coltype,colname, starts, ends)

Heart rate and speed on hilly segments

Create three plots: Speed, heart rate and elevation versus distance. Then combine them using the grid package. The heart rate and speed data has been smoothed a bit with geom_spline(), to get rid of the spikes. The conclusion is as expected: Climbing uphill reduces speed and increases heartrate. Riding downhill does the opposite.

The code could be improved a bit by combineing the theme changes in a custom theme but I didn’t do that here. Perhaps there is also a way to combine the code for each hill (instead of repeating code and increasing the index) but there are a few variables in there like the color and the height. Also a for..next loop does not work within a ggplot.

library(ggrepel)
library(ggformula)
colcol <- "pink"
colsprint <- "lightgreen"

#Height profile with hills
p_height <- ggplot(zwift, aes(x=dist, y=ele)) +
   geom_rect(aes(xmin=cols$colstart_li[1], xmax=cols$colend_li[1], ymin = 0, ymax = 100), fill = colcol, alpha = 0.05) +
   geom_rect(aes(xmin=cols$colstart_li[2], xmax=cols$colend_li[2], ymin = 0, ymax = 100), fill = colcol, alpha = 0.05) +
   geom_rect(aes(xmin=cols$colstart_li[3], xmax=cols$colend_li[3], ymin = 0, ymax = Inf), fill = colcol, alpha = 0.05) +
   geom_rect(aes(xmin=cols$colstart_li[4], xmax=cols$colend_li[4], ymin = 0, ymax = 100), fill = colcol, alpha = 0.05) +
   geom_rect(aes(xmin=cols$colstart_li[5], xmax=cols$colend_li[5], ymin = 0, ymax = 100), fill = colcol, alpha = 0.05) +
   geom_rect(aes(xmin=cols$colstart_li[6], xmax=cols$colend_li[6], ymin = 0, ymax = 50), fill = colsprint, alpha = 0.05) +
   geom_line(lwd = 0.5, colour = "black", alpha = 0.9) +
   geom_area(fill = "black", alpha = 0.5) +
   xlab("Distance [km]") +
   ylab("Elevation [m]") +
   scale_x_continuous(expand = expansion(mult = 0.02), breaks=seq(0,75,5)) +
   scale_y_continuous(breaks=seq(0,150,25)) +
   theme_classic() 

#Heart rate plot
p_heart <- ggplot(zwift, aes(x=dist)) +
   geom_spline(aes(y = heart), col="red", lwd = 0.5) +
   theme_classic() +
   ylab("Heartrate [BPM]") +
   scale_x_continuous(expand = expansion(mult = 0.02), breaks=seq(0,75,5)) +
   theme(axis.title.x = element_blank(),
         axis.text.x = element_blank(),
         axis.ticks.x = element_blank(),
         axis.line.x = element_blank())

#Speed plot
p_speed <- ggplot(zwift, aes(x=dist)) +
   geom_spline(aes(y = speed), col="blue", lwd = 0.5) +
   ylab("Speed [km/h]") +
   scale_x_continuous(expand = expansion(mult = 0.02), breaks=seq(0,75,5)) +
   theme_classic() +
   theme(axis.title.x = element_blank(),
         axis.text.x = element_blank(),
         axis.ticks.x = element_blank(),
         axis.line.x = element_blank())

#combining and plotting above each other
library(grid)
grid.newpage()
grid.draw(rbind(ggplotGrob(p_speed), ggplotGrob(p_heart), ggplotGrob(p_height), size = "last"))

Drop shadow added with magick:

library(magick)
image_read("image.png") %>% 
   image_shadow(bg="white", color="black", geometry="25x5+15+15",
                operator = "atop", offset = "+10+10") %>% 
   image_write("image_shadow.png")
image shadow created with magick

Adding labels to segments

Analog to the above graphs you can only fill the area below to line with geom_area (part of the ggplot package). Filling just a part of the graphs requires to set a few limits of course. I didn’t figure out how to code this in a more compact way, using a line for each coloured area bit results in ugly code.

ggplot(zwift, aes(x=dist, y=ele)) + 
geom_area(fill = "black", alpha = 0.75) +   
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[1] & dist<cols$colend_li[1] , dist , 0 )), fill = "red", alpha = 0.7) +
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[2] & dist<cols$colend_li[2] , dist , 0 )), fill = "red", alpha = 0.7) +
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[3] & dist<cols$colend_li[3] , dist , 0 )), fill = "red", alpha = 0.7) +
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[4] & dist<cols$colend_li[4] , dist , 0 )), fill = "red", alpha = 0.7) +
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[5] & dist<cols$colend_li[5] , dist , 0 )), fill = "red", alpha = 0.7) +
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[6] & dist<cols$colend_li[6] , dist , 0 )), fill = "green", alpha = 0.7) +
xlab("Distance [km]") +
ylab("Elevation [m]") +
scale_x_continuous(breaks=seq(0,70,10), limits = c(0,75)) +
scale_y_continuous(breaks=seq(0,130,25), limits = c(-5,160)) +
geom_line(lwd = 0.75, colour = "black", alpha = 0.9) +
theme_classic()

With geom_label_repel, from the ggrepel package, the labels that were defined earlier can be added to the plot. Several prameters define position, size, fill color etc. Fairly self explanatory I would say. Color #0786e3 is the same color as the waters around Watopia, a bit softened using alpha = .5 otherwise it is very hard against a white background.

geom_label_repel(data = cols, aes(x=labelposx, y=labelposy, label=colname), size = 4.2, alpha = .5, 
                 fill = "#0786e3", min.segment.length = 0.5, segment.color = "black", segment.size = 1, 
                 nudge_x = 0, nudge_y = 50)

Read_fit error

Earlier I mentioned issues with reading the FIT files from Zwift (also with Garmin). A little background: read_fit has a default and advised setting format = TRUE, this creates a formatted dataframe and adds some columns. Unfortunately I get the following error message. Chaning to format = FALSE resolves the issue but requires some manual work to get it right.

   zwift <- read_fit("2021-01-23-20-32-55.fit", format = TRUE)
   Reading .fit file…
   Error in if ((diffs["count", which.max(diffs["count", ])]/length(tdat)) <  : 
     argument is of length zero 

References

[1] Converting FIT coordinates to degrees https://gis.stackexchange.com/questions/122186/convert-garmin-or-iphone-weird-gps-coordinates

To leave a comment for the author, please follow the link and comment on their blog: R.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)