I thought I’d try Julia out and see how far I could get with nothing but Google on my side.
I’ve had it installed for a while, but never really done anything with it. My aims for this exercise were:
- download some open data
- wrangle it, or at least do some sort of manipulation on it
- plot it
I’ve used the daily Covid-19 statistics provided by Public Health Scotland (see code for url).
These get updated daily, and published after 2pm, so there may be a short period during the day when they are not available whilst the update is in progress.
After installation, the first thing to do is open a terminal (PowerShell if you’re a Windows user) and type `julia` to launch the REPL (the equivalent of the RStudio console, for the R users out there).
With that done, I realised I needed to install some packages.
Julia has its own package manager, called `Pkg`.
You can launch that in the REPL by hitting the `]` key.
The prompt then changes from `julia>` to one ending in `pkg>`.
Install a package with `add` followed by the package name, for example `add CSV`.
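The same installs can also be scripted with the Pkg API instead of typed at the `pkg>` prompt. This is a minimal sketch, assuming you want the registered packages used later in this post and that you have network access the first time it runs:

```julia
using Pkg

# Registered packages used later in this post. Dates and Downloads ship
# with recent Julia versions as standard libraries, so they are not listed.
packages = ["CSV", "DataFrames", "Chain", "VegaLite"]

# Equivalent to typing `add CSV DataFrames Chain VegaLite` at the pkg> prompt.
Pkg.add(packages)
```

Running this a second time is harmless; packages that are already installed are simply resolved again.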
Once you’ve installed packages, you need to do the equivalent of `library` in R, which in Julia is `using`:
    using Pkg
    using Dates
    using CSV
    using Downloads
    using DataFrames
    using Chain
    using VegaLite

    url = "https://www.opendata.nhs.scot/dataset/b318bddf-a4dc-4262-971f-0ba329e09b87/resource/427f9a25-db22-4014-a3bc-893b68243055/download/trend_ca_20220301.csv"

    # Download to a temporary file and parse it, treating "NA" as missing
    file = CSV.File(Downloads.download(url), missingstring = "NA", dateformat = "yyyymmdd")
    df = DataFrame(file)

    # The Date column arrives as an integer: convert to string, then to Date
    df[!, :Date] = string.(df[!, :Date])
    df[!, :Date] = Date.(df[!, :Date], "yyyymmdd")

    # One small line chart per council area, four charts per row
    df |> @vlplot(:line, columns = 4, wrap = "CAName:o", x = "Date", y = "DailyPositive")
Here, I load the packages and define the url for the Covid-19 data.
`CSV` is a package for working with CSV / flat files, while `Downloads` also does what it says on the tin.
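To see what `missingstring = "NA"` does without hitting the network, here is a small sketch that parses an in-memory CSV. The column names are the ones used in the plot call; the rows themselves are made up for illustration and are not real figures:

```julia
using CSV
using DataFrames

# Made-up rows in the shape of the real file; only the column names
# (Date, CAName, DailyPositive) come from the code in this post.
raw = """
Date,CAName,DailyPositive
20220301,Aberdeen City,NA
20220301,Fife,12
20220302,Fife,15
"""

# missingstring = "NA" turns the literal text NA into Julia's `missing`.
df = CSV.read(IOBuffer(raw), DataFrame; missingstring = "NA")

println(df)
```

Without the `missingstring` argument, the `NA` entry would force the whole column to be read as text rather than numbers.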
You can use `describe` as the equivalent of `str` in R: it gives you an overview of the object.
From this, I could see that the Date column was an integer, so I needed to convert it to a string, and from there, to a date.
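The inspect-then-convert step can be tried on a tiny table. This sketch assumes a `Date` column stored as integers such as 20220301; the values are made up:

```julia
using DataFrames
using Dates

# Made-up values; the Date column is stored as integers like 20220301.
df = DataFrame(Date = [20220301, 20220302], DailyPositive = [12, 15])

# describe reports each column's element type, among other summaries.
println(describe(df, :eltype))

# Integer -> String -> Date, broadcasting over the whole column.
df.Date = Date.(string.(df.Date), dateformat"yyyymmdd")

println(eltype(df.Date))
```

The dot after `Date` and `string` is Julia's broadcasting syntax, which applies the function to every element of the column.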
Finally, I piped the df to the VegaLite package (which has a fairly effortless ability to make small multiples). Now, this may not be particularly polished (and that’s on me, I have kept this as minimal as possible), but it’s certainly more than good enough for a first look at a dataset.
At the top right of the plot window, three dots appear; clicking on them brings up a menu to save the plot in various formats, including svg.
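The plot can also be saved from code rather than through the menu. A hedged sketch, assuming VegaLite.jl's `save` function and using made-up data and a hypothetical file name `positives.svg`:

```julia
using DataFrames
using Dates
using VegaLite

# Made-up values standing in for the real data.
df = DataFrame(
    Date = [Date(2022, 3, 1), Date(2022, 3, 2), Date(2022, 3, 3)],
    DailyPositive = [12, 15, 9],
)

# ":t" marks the x field as temporal and ":q" marks y as quantitative.
plt = df |> @vlplot(:line, x = "Date:t", y = "DailyPositive:q")

# The file extension selects the output format.
plt |> save("positives.svg")
```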
Among the things I searched up: how to filter dates (still haven’t quite sussed that out yet, but I suspect I need to spend some time here).
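For what it’s worth, one way this might be done is with the `filter` function from DataFrames and ordinary comparisons on `Date` values. This is only a sketch on made-up data, not something I have run against the real dataset:

```julia
using DataFrames
using Dates

# Made-up values for illustration.
df = DataFrame(
    Date = [Date(2022, 2, 27), Date(2022, 2, 28), Date(2022, 3, 1)],
    DailyPositive = [20, 18, 12],
)

cutoff = Date(2022, 2, 28)

# Keep rows on or after the cutoff; the original df is left untouched.
recent = filter(:Date => d -> d >= cutoff, df)

# Keep rows inside a closed date range.
window = filter(:Date => d -> Date(2022, 2, 27) <= d <= Date(2022, 2, 28), df)

println(recent)
println(window)
```

This only works once the column really is a `Date`, which is another reason to do the integer-to-date conversion first.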
I also looked at Gadfly, but couldn’t suss out how to get the small multiples to work legibly; again, that’s on me and my lack of time. [I will need to look at it again](http://gadflyjl.org/stable/man/compositing/).
One other thing I discovered in passing was the Julia equivalent of the `here` functionality, namely `joinpath`, which allows you to build filepaths from parts so they are independent of OS.
    root = dirname(@__FILE__)
    joinpath(root, "positives.csv")
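`joinpath` takes any number of parts, so nested folders work the same way. A small sketch, using the built-in `@__DIR__` macro for the current file’s directory and a hypothetical `data` subfolder:

```julia
# @__DIR__ expands to the directory of the current file, or to the
# working directory when run from the REPL.
root = @__DIR__

# "data" is a hypothetical subfolder, used only for illustration.
path = joinpath(root, "data", "positives.csv")

println(path)

# splitpath is the inverse: it breaks a path back into its parts.
println(splitpath(path)[end-1:end])
```

The separator in the printed path will be `\` on Windows and `/` elsewhere, which is the point of not writing it by hand.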
There is a lot to learn, and I am looking for something more structured, but for a quick dabble, this has been a useful exercise.