[This article was first published on R, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

When it’s raining or when it’s just cold outside, I like to ride on Zwift. I especially like Watopia. Let’s see what can be done with the FIT files that are generated. Via the Zwift dashboard you can download the FIT file which is created after each ride. It’s also located on your PC/Mac.

1. Labelling those segments
2. References
3. Part – II: animation (working on this)

## Getting the data

I use the cycleRtools package to read the FIT file but unfortunately all conversions to a data.frame do not work (more about the error at the end of this post). I have to do that manually. It’s bit of extra work but but gives some insights in what the package normally does for you.

library(cycleRtools)

zwift <- read_fit("file.fit", format = FALSE)
# list to data frame
zwift <- do.call(cbind.data.frame, zwift)
# rename columns and modify data
"enh_alt","enh_speed_ms","X","lap")
zwift$X <- NULL # don't know what that is, throw it out zwift$timestamp <- as.POSIXct(zwift$timestamp, origin="1989-12-31") # this is the zwift reference date zwift$dist <- zwift$dist / 1000 # distance in km zwift$lat <- zwift$lat / 11930465 # 11930465 (2^32 / 360) zwift$lon <- zwift$lon / 11930465 # see ref [1] for background zwift$moving <- as.POSIXct(c(1:nrow(zwift)),
origin="1989-12-31 00:00:00") - 3600                    # moving time, origin is required
zwift$speed <- zwift$speed_ms * 3.6                                     # speed in km/h

str(zwift) provides info on the data frame. Note that timestamp is the actual date and moving has 1989-12-31 as reference date. It’s in the data but the date part won’t be used, so it doesn’t really matter.

'data.frame':    9233 obs. of  13 variables:
$timestamp : POSIXct, format: "2021-01-23 20:33:12" "2021-01-23 20:33:13" "2021-01-23 20:33:14" …$ lat         : num  -11.6 -11.6 -11.6 -11.6 -11.6 …
$lon : num 167 167 167 167 167 …$ dist        : num  0.00192 0.00398 0.00652 0.00944 0.01265 …
$heart : int 116 116 116 116 116 116 116 116 116 116 …$ alt         : num  12.8 12.8 12.8 12.8 12.8 …
$speed_ms : num 1.74 2.29 2.73 3.11 3.42 …$ pwr         : int  83 99 100 100 98 98 98 106 106 86 …
$cad : int 26 35 42 48 1 52 54 54 55 56 …$ enh_alt     : num  12.8 12.8 12.8 12.8 12.8 …
$enh_speed_ms: num 1.74 2.29 2.73 3.11 3.42 …$ lap         : Ord.factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 …
$moving : POSIXct, format: "1989-12-31 00:00:01" "1989-12-31 00:00:02" "1989-12-31 00:00:03" …$ speed       : num  6.26 8.24 9.83 11.2 12.33 …

## Basic route and height profile

We now can create basic plots which give some information on the route (Watopia, Big Foot Hills) and height profile.

par(mfrow=c(1,2), mai=c(0.9,0.9,0.1,0.1))                               # set the plot area in a 1*2 array
plot(zwift$lon, zwift$lat, type = "l", lwd = 2, xlab = "Latitude", ylab = "Longitude")
plot(zwift$moving, zwift$alt, type = "h", lwd = 2, col = "darkgrey", xlab= "Moving time",
ylab = "Elevation [m]")

To get information on the time on the bike and the actual moving time we can do this. I find calculating with times/dates usually a bit tricky as they give a lot of errors before it eventually works. Probably due to my lack of knowledge on this topic. The hms package brings relief.

library(hms)                                               #needed for time calculations
total_time <- as_hms(difftime(zwift$timestamp[NROW(zwift)], zwift$timestamp[1]))
moving_time <- as_hms(difftime(zwift$moving[NROW(zwift)], zwift$moving[1]))
break_time <- as_hms(difftime(total_time, moving_time))

Which gives these results.

   > total_time
02:48:32
> moving_time
02:33:52
> break_time
00:14:40 

The data also contains heart rate and speed and I’d like to compare that with climbing the hills of this Watopian route. The hill segments are nicely described by What’s on Zwift and I used that data for the next steps. First a new data frame.

#Round down the distance entries in the source table, otherwise the positions cannot be found
zwift$rounddist <- round(zwift$dist,2)

#Data about the hills
colstart <- c(10.30, 19.30, 29.20, 47.15, 59.30, 67.00)
colend <- c(13.025, 20.30, 33.10, 48.10, 61.95, 67.50)
coltype <- c("col", "col", "col", "col", "col", "sprint")
colname <- c("Titans Grove KOM\n2.6km, 2.2%", "Zwift KOM\n 900m, 5%", "Volcano KOM\n3.75km, 3.2%",
"Titans Grove KOM Reverse\n890m, 6.6%", "Zwift KOM Reverse\n2.4km, 2%",
"Fuego Flats\n496m Sprint")

# Combine in data frame
cols <- data.frame(colstart, colend, coltype, colname)
# this is a lead in to the actual route, a warm up. This distance needs to be added
starts <- zwift[match(unique(round(cols$colstart + lead_in),2), zwift$rounddist), 4]  #4th column is distance
ends <- zwift[match(unique(round(cols$colend + lead_in),2), zwift$rounddist), 4]
cols$colstart_li <- starts cols$colend_li <- ends

# Postion of the start of the label, x = mean value start + finish. y postion = find altitude
cols$labelposx <- (cols$colend_li + cols$colstart_li)/2 # Find the elevation at the middle point of the climb in column 6. And add 5 for an offset of the text balloon cols$labelposy <- zwift[match(unique(round(cols$labelposx),2), zwift$rounddist), 6] + 5

# Clean up
rm(colstart,colend,coltype,colname, starts, ends)

## Heart rate and speed on hilly segments

Create three plots: Speed, heart rate and elevation versus distance. Then combine them using the grid package. The heart rate and speed data has been smoothed a bit with geom_spline(), to get rid of the spikes. The conclusion is as expected: Climbing uphill reduces speed and increases heartrate. Riding downhill does the opposite.

The code could be improved a bit by combineing the theme changes in a custom theme but I didn’t do that here. Perhaps there is also a way to combine the code for each hill (instead of repeating code and increasing the index) but there are a few variables in there like the color and the height. Also a for..next loop does not work within a ggplot.

library(ggrepel)
library(ggformula)
colcol <- "pink"
colsprint <- "lightgreen"

#Height profile with hills
p_height <- ggplot(zwift, aes(x=dist, y=ele)) +
geom_rect(aes(xmin=cols$colstart_li[1], xmax=cols$colend_li[1], ymin = 0, ymax = 100), fill = colcol, alpha = 0.05) +
geom_rect(aes(xmin=cols$colstart_li[2], xmax=cols$colend_li[2], ymin = 0, ymax = 100), fill = colcol, alpha = 0.05) +
geom_rect(aes(xmin=cols$colstart_li[3], xmax=cols$colend_li[3], ymin = 0, ymax = Inf), fill = colcol, alpha = 0.05) +
geom_rect(aes(xmin=cols$colstart_li[4], xmax=cols$colend_li[4], ymin = 0, ymax = 100), fill = colcol, alpha = 0.05) +
geom_rect(aes(xmin=cols$colstart_li[5], xmax=cols$colend_li[5], ymin = 0, ymax = 100), fill = colcol, alpha = 0.05) +
geom_rect(aes(xmin=cols$colstart_li[6], xmax=cols$colend_li[6], ymin = 0, ymax = 50), fill = colsprint, alpha = 0.05) +
geom_line(lwd = 0.5, colour = "black", alpha = 0.9) +
geom_area(fill = "black", alpha = 0.5) +
xlab("Distance [km]") +
ylab("Elevation [m]") +
scale_x_continuous(expand = expansion(mult = 0.02), breaks=seq(0,75,5)) +
scale_y_continuous(breaks=seq(0,150,25)) +
theme_classic()

#Heart rate plot
p_heart <- ggplot(zwift, aes(x=dist)) +
geom_spline(aes(y = heart), col="red", lwd = 0.5) +
theme_classic() +
ylab("Heartrate [BPM]") +
scale_x_continuous(expand = expansion(mult = 0.02), breaks=seq(0,75,5)) +
theme(axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.line.x = element_blank())

#Speed plot
p_speed <- ggplot(zwift, aes(x=dist)) +
geom_spline(aes(y = speed), col="blue", lwd = 0.5) +
ylab("Speed [km/h]") +
scale_x_continuous(expand = expansion(mult = 0.02), breaks=seq(0,75,5)) +
theme_classic() +
theme(axis.title.x = element_blank(),
axis.text.x = element_blank(),
axis.ticks.x = element_blank(),
axis.line.x = element_blank())

#combining and plotting above each other
library(grid)
grid.newpage()
grid.draw(rbind(ggplotGrob(p_speed), ggplotGrob(p_heart), ggplotGrob(p_height), size = "last"))

Drop shadow added with magick:

library(magick)
operator = "atop", offset = "+10+10") %>%
image_write("image_shadow.png")

## Adding labels to segments

Analog to the above graphs you can only fill the area below to line with geom_area (part of the ggplot package). Filling just a part of the graphs requires to set a few limits of course. I didn’t figure out how to code this in a more compact way, using a line for each coloured area bit results in ugly code.

ggplot(zwift, aes(x=dist, y=ele)) +
geom_area(fill = "black", alpha = 0.75) +
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[1] & dist<cols$colend_li[1] , dist , 0 )), fill = "red", alpha = 0.7) +
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[2] & dist<cols$colend_li[2] , dist , 0 )), fill = "red", alpha = 0.7) +
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[3] & dist<cols$colend_li[3] , dist , 0 )), fill = "red", alpha = 0.7) +
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[4] & dist<cols$colend_li[4] , dist , 0 )), fill = "red", alpha = 0.7) +
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[5] & dist<cols$colend_li[5] , dist , 0 )), fill = "red", alpha = 0.7) +
geom_area(mapping = aes(x = ifelse(dist>cols$colstart_li[6] & dist<cols$colend_li[6] , dist , 0 )), fill = "green", alpha = 0.7) +
xlab("Distance [km]") +
ylab("Elevation [m]") +
scale_x_continuous(breaks=seq(0,70,10), limits = c(0,75)) +
scale_y_continuous(breaks=seq(0,130,25), limits = c(-5,160)) +
geom_line(lwd = 0.75, colour = "black", alpha = 0.9) +
theme_classic()

With geom_label_repel, from the ggrepel package, the labels that were defined earlier can be added to the plot. Several prameters define position, size, fill color etc. Fairly self explanatory I would say. Color #0786e3 is the same color as the waters around Watopia, a bit softened using alpha = .5 otherwise it is very hard against a white background.

geom_label_repel(data = cols, aes(x=labelposx, y=labelposy, label=colname), size = 4.2, alpha = .5,
fill = "#0786e3", min.segment.length = 0.5, segment.color = "black", segment.size = 1,
nudge_x = 0, nudge_y = 50)

Earlier I mentioned issues with reading the FIT files from Zwift (also with Garmin). A little background: read_fit has a default and advised setting format = TRUE, this creates a formatted dataframe and adds some columns. Unfortunately I get the following error message. Chaning to format = FALSE resolves the issue but requires some manual work to get it right.

   zwift <- read_fit("2021-01-23-20-32-55.fit", format = TRUE)
Error in if ((diffs["count", which.max(diffs["count", ])]/length(tdat)) <  :
argument is of length zero 

## References

[1] Converting FIT coordinates to degrees https://gis.stackexchange.com/questions/122186/convert-garmin-or-iphone-weird-gps-coordinates