Digital Water Utility Competencies: R for Water Professionals

[This article was first published on R Language – The Lucid Manager, and kindly contributed to R-bloggers.]

R for Water Professionals: Developing a Digital Water Utility

The digital water utility is a fashionable catchphrase in the water industry. Managing reliable water services requires not only a sufficient volume of water but also significant amounts of data. Water professionals continuously measure the flow and quality of the water and how customers perceive their service, and they analyse this data to monitor processes and decide the appropriate course of action. R for Water Professionals is an online workshop that introduces water engineers, biologists, economists and other water professionals to the principles of using code to create value from data. The content of this course is also freely available on GitHub, including the data and scripts used in the case studies.

Digital Water Utility Competencies

One of the three main competencies of a data scientist is domain knowledge. To make sense of what data scientists analyse, they need to have subject-matter expertise. A data scientist without this knowledge can easily make mistakes because they view their problem as abstract, instead of being part of reality.

Data science competencies (Based on Conway, 2009).

Data scientists with competencies in all three areas are rare. Some people refer to them as data science unicorns or full-stack data scientists. Most people enter data science from computing or mathematics. For a mathematician or a computer scientist to become a domain expert, in this case a water professional, takes several years of training and experience. So why not reverse this equation and train existing subject-matter experts to write data science code?

R for Water Professionals teaches the basics of using the R language to solve water management problems. I developed this course because I want to help fellow water professionals to ditch the spreadsheet and write code to analyse data. The objective of this course is to give participants a starting point for further self-study. To become a digital water utility, water professionals need to develop competencies in data science to fully embrace the benefits of digitisation.

Case Study Approach

This workshop uses a case study approach to introduce participants to coding in the R language. Rather than a systematic theoretical approach to teach coding, the case studies form a starting point to introduce various concepts of writing data science code. Each case study starts with a problem statement that participants resolve with R code. The case studies relate to water quality data, customer perception and water consumption.

Firstly, the workshop introduces the principles of strategic data science, which is an extract from my book on that topic. This session presents a framework for best practice in data science:

  • Useful: Actionable intelligence
  • Sound: Valid, reliable and reproducible
  • Aesthetic: Interpretable and explainable

Case Study 1

The first case study introduces the participant to the basics of using R and RStudio by assessing a set of lab results from a water supply network against local regulations. Above all, this case study demonstrates the importance of sound data science by reviewing the various ways percentiles can be calculated.

Analysing turbidity data in various systems.
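
To illustrate why the percentile method matters, here is a minimal sketch that is not taken from the course materials. The turbidity values and the 5 NTU limit are made up for illustration. The only R feature it relies on is the `type` argument of the base `quantile()` function, which selects one of nine calculation methods.

```r
# Hypothetical turbidity results in NTU (made-up values, not course data)
turbidity <- c(0.1, 0.2, 0.2, 0.3, 0.4, 0.5, 0.8, 1.2, 2.5, 5.1)

# Hypothetical regulatory limit for the 95th percentile (an assumption)
limit <- 5

# Calculate the 95th percentile with each of the nine methods in quantile()
p95 <- sapply(1:9, function(t) {
  quantile(turbidity, probs = 0.95, type = t, names = FALSE)
})
names(p95) <- paste0("type", 1:9)

# Compare each result with the limit
compliance <- data.frame(
  method     = names(p95),
  percentile = round(p95, 2),
  compliant  = p95 <= limit
)
print(compliance, row.names = FALSE)
```

With these made-up values, type 1 returns the largest observation, while the default type 7 interpolates between the two largest observations and lands below the limit. The same sample can therefore pass or fail depending on the method, which is why a report should state which method was used.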

Case Study 2

The digital water utility is not only about technology. All technology is in service of providing a convenient service to consumers. The second case study looks at the results of a customer survey about their perception of water services. This study delves into cleaning the data and exploring it through visualisation. An earlier article on this website showed how to use factor analysis for this data.

Best practice in data visualisation is to maximise the data-to-pixel ratio.
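
As a rough sketch of what cleaning and exploring survey data can look like in base R, consider the example below. The data frame, the column names `q1` and `q2`, and the 1 to 7 response scale are all assumptions made for illustration. They are not the survey used in the course.

```r
# Hypothetical survey responses on a 1-7 agreement scale (made-up data)
survey <- data.frame(
  respondent = 1:8,
  q1 = c(5, 7, 99, 4, NA, 6, 2, 7),
  q2 = c(6, 6, 5, 0, 3, 7, 4, 5)
)

# Cleaning: treat answers outside the valid range as missing
items <- c("q1", "q2")
survey[items] <- lapply(survey[items], function(x) {
  replace(x, !is.na(x) & (x < 1 | x > 7), NA)
})

# Keep only respondents who answered every item
clean <- survey[complete.cases(survey[items]), ]

# Exploration: count the answers per score, including unused scores
counts <- table(factor(clean$q1, levels = 1:7))
print(counts)

# A plain bar chart without decoration keeps the focus on the data
if (interactive()) {
  barplot(counts, border = NA, xlab = "Score", ylab = "Responses")
}
```

Invalid codes such as 99 or 0 become missing values instead of distorting the counts, and the chart avoids gridlines, borders and other ink that carries no data.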

Case Study 3

The last case study uses simulated smart meter data to introduce the basics of literate programming to develop a report. The assignment for this case study is to report on water consumption and find properties with a leak.

Diurnal curve from smart water meters.
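
The sketch below shows one possible way to approach this kind of assignment. It is an assumption about the approach, not the method used in the course. It simulates a week of hourly readings for three hypothetical properties, derives a diurnal curve, and flags a property as a suspected leak when its flow never drops to zero. The hourly pattern, the property names and the 8 litres per hour leak are all invented for the example.

```r
set.seed(42)

# Hypothetical average demand in litres per hour for hours 0 to 23
pattern <- c(0, 0, 0, 0, 0, 5, 30, 45, 30, 15, 10, 10,
             15, 10, 10, 15, 25, 40, 45, 30, 20, 10, 5, 0)

# Simulate one week of hourly readings for three made-up properties
meters <- expand.grid(hour = 0:23, day = 1:7, property = c("A", "B", "C"))
meters$flow <- rpois(nrow(meters), lambda = pattern[meters$hour + 1])

# Property C gets a constant leak of 8 litres per hour (an assumption)
leaking <- meters$property == "C"
meters$flow[leaking] <- meters$flow[leaking] + 8

# Diurnal curve: mean flow for each hour of the day across all properties
diurnal <- aggregate(flow ~ hour, data = meters, FUN = mean)

# Simple leak heuristic: the flow at a property never drops to zero
min_flow <- aggregate(flow ~ property, data = meters, FUN = min)
suspected <- as.character(min_flow$property[min_flow$flow > 0])
print(suspected)
```

Real meter data is noisier than this simulation, so a practical rule would need a threshold and a minimum duration rather than a strict comparison with zero.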


In addition to the case studies, the course closes with a capstone project where participants develop a water quality reporting dashboard for a Board of Directors, using R Markdown.

This course helps to develop the competencies required to transform into a digital water utility. I hope to establish a group of R users within the water industry globally to promote using code and develop further case studies or even a library.

R for Water Professionals: Developing Digital Water Utility Competencies

The course is currently under development. I am conducting the first face-to-face session on 17 July 2019 in Melbourne. You can sign up for the online version for free while the content is not yet complete. Once you register for the course, you will be notified of any updates.


The post Digital Water Utility Competencies: R for Water Professionals appeared first on The Lucid Manager.
