Character to Date conversion using lazy queries

[This article was first published on klogr::blog(posts), and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Problem

How to convert a string with format DDMMYYYY, to a date type using lazy queries. The problem is that base r function as.Date() does not have a translatable SQL equivalent. More information on this is detailed in this sql translation article.

This problem is largely addressed in the special forms article. However, no example is given for converting a character to a date data type. This post will provide this example.

Like most of my self described “self-tech-support” post categories, it was introduced through work. Took me a while to come to the answer so here is to saving someone sometime somewhere.

Let’s begin with simulating a lazy connection.

# Load packages
library(dplyr)
library(dbplyr)
library(DT)

# Create a local lazy tibble with Postgres SQL connection
test_tbl <- lazy_frame(char_date = c("01012020", "01022020", "01032020"),
                       con = simulate_postgres())

test_tbl
<SQL>
SELECT *
FROM `df`

Solution

The below provides two workable examples or at least the translations. Whilst you can denote a SQL function by using uppercase as shown in example 1, the second example does provide an explicit marker for what is to be used as-is in SQL with function sql().

# example 1 
mutate(test_tbl, date = TO_DATE(char_date, 'DDMMYYYY'))
<SQL>
SELECT `df`.*, TO_DATE(`char_date`, 'DDMMYYYY') AS `date`
FROM `df`
# example 2
mutate(test_tbl, date = sql("TO_DATE(char_date, 'DDMMYYYY')"))
<SQL>
SELECT `df`.*, TO_DATE(char_date, 'DDMMYYYY') AS `date`
FROM `df`

Example (kind of)

It’s “kind of” because it’s not a direct translation applied to SQL but creating an example without having an instance of SQL is difficult. At least for free. I’ll provide instruction below if you wish to follow along:

  1. Head to DB Fiddle
  2. Change SQL in the top left to Postgres SQL version 17
  3. Copy & paste below in Schema SQL box
CREATE TABLE test_tbl (
  char_date varchar(8) NOT NULL
);
INSERT INTO test_tbl (char_date) VALUES ('01012020');
INSERT INTO test_tbl (char_date) VALUES ('01022020');
INSERT INTO test_tbl (char_date) VALUES ('01032020');
  1. Copy & paste below in Query SQL box
SELECT char_date, pg_typeof(char_date) AS type
FROM test_tbl;

SELECT TO_DATE(char_date, 'DDMMYYYY') AS char_date, 
       pg_typeof(TO_DATE(char_date, 'DDMMYYYY')) AS type 
FROM test_tbl;

Query #1

Query #2

Query 2 shows the output of the translated mutate from Solution albeit manually added into SQL.

Consider

The flavour of SQL for this example was be PostgreSQL version 17. Not all SQL flavours support the TO_DATE function, below is a short and not exhaustive list.

Conclusion

Lazy all the way! My blocker, this date field as a character, was keeping my lazy queries from continued laziness. Thankfully there was a workaround. The blog does miss the step of directly showing the translated mutate query a SQL database but I hope this example from two sides helps.

Acknowledgements

r-bloggers.com for the reach, platform, and content

DB Fiddle for SQL sandpit

Packages and package maintainer(s):

  • dplyr | Hadley Wickham
  • dbplyr | Hadley Wickham
  • DT | Joe Cheng
To leave a comment for the author, please follow the link and comment on their blog: klogr::blog(posts).

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)