Some years ago I read an article – I forget where – describing how our general knowledge often becomes frozen in time. Asked to name the tallest building in the world, you confidently proclaim “the Sears Tower!”, because for most of your childhood that was the case – never mind that the record was surpassed long ago and it isn’t even called the Sears Tower anymore. From memory, the example in the article was of a middle-aged speaker who constantly referred to a figure of 4 billion for the human population – again, because that’s what he learned in school and he had never mentally updated it.
Is this the case with programming too? Oh yes – as I learned today when performing the simplest of tasks: reading CSV files using R.
Here’s the scenario: given a directory containing CSV files with the same columns, read them into a single data frame with an additional column containing the file name.
We start with list.files() of course, something along the lines of:
csv_files <- list.files(path = "path/to/the/folder", pattern = "\\.csv$", full.names = TRUE)
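One detail worth knowing here: the pattern argument of list.files() is a regular expression, not a shell-style glob, so an unescaped dot matches any character. The sketch below uses grepl() on some made-up file names (hypothetical, for illustration only) to show the difference between a loose pattern and one with an escaped dot anchored to the end of the name.

```r
# Hypothetical file names, used only to illustrate the matching rules
candidates <- c("data.csv", "notes_csv.txt", "old.csv.bak", "results.CSV")

# Loose pattern: "." matches any single character, and nothing anchors the end
loose <- grepl(".csv", candidates)

# Stricter pattern: a literal dot, with "$" anchoring the match to the end
strict <- grepl("\\.csv$", candidates)

# Show which names each pattern accepts
print(data.frame(candidates, loose, strict))
```

Matching is case-sensitive by default; list.files() has an ignore.case argument if files with an upper-case extension should be picked up too.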
My frozen, outdated knowledge tells me that the next steps are: (1) use lapply() to read the CSV files into a list of data frames, (2) use the vector of file names as names for the list and (3) use dplyr::bind_rows() to create a single data frame and add the column of file names, here named “path”.
library(dplyr)
library(readr)
csv_data <- lapply(csv_files, read_csv)
names(csv_data) <- csv_files
csv_data <- bind_rows(csv_data, .id = "path")
I have used readr::read_csv() for years. Only today did I learn that not only can it read multiple files given a vector of file names, but it can also add a column for those file names. All in one line.
csv_data <- read_csv(csv_files, id = "path")
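To try the two approaches side by side without touching real data, here is a minimal sketch. It assumes readr 2.0.0 or later (the release that, as far as I know, introduced multi-file reading and the id argument) and writes two small, made-up CSV files to a temporary folder; the folder name, file names and values are hypothetical.

```r
library(dplyr)
library(readr)

# Hypothetical demo data: two small CSV files in a temporary folder
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
write_csv(data.frame(x = 1:2, y = c("a", "b")), file.path(demo_dir, "first.csv"))
write_csv(data.frame(x = 3:4, y = c("c", "d")), file.path(demo_dir, "second.csv"))

demo_files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Old approach: read each file, name the list, then bind with an id column
old_way <- lapply(demo_files, read_csv, show_col_types = FALSE)
names(old_way) <- demo_files
old_way <- bind_rows(old_way, .id = "path")

# New approach: a single call (assumes readr >= 2.0.0)
new_way <- read_csv(demo_files, id = "path", show_col_types = FALSE)

# Print both results for a visual comparison
print(old_way)
print(new_way)
```

Both results should contain the same rows, with a “path” column recording which file each row came from.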
Why did I not know this? I guess because I had a solution that worked, and I’d never bothered to go back and see if something better had been invented since I learned my solution.
How can we unlearn our frozen, outdated knowledge and update our skills? Right now my answer is “once in a while take the time to read the help page when you use a function, even if it’s one you use all the time, in case it’s been updated with something new and useful.”
Any better ideas?