Character to Date conversion using lazy queries
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Problem
How to convert a string with format DDMMYYYY, to a date type using lazy queries. The problem is that base r function as.Date()
does not have a translatable SQL equivalent. More information on this is detailed in this sql translation article.
This problem is largely addressed in the special forms article. However, no example is given for converting a character to a date data type. This post will provide this example.
Like most of my self described “self-tech-support” post categories, it was introduced through work. Took me a while to come to the answer so here is to saving someone sometime somewhere.
Let’s begin with simulating a lazy connection.
# Load packages library(dplyr) library(dbplyr) library(DT) # Create a local lazy tibble with Postgres SQL connection test_tbl <- lazy_frame(char_date = c("01012020", "01022020", "01032020"), con = simulate_postgres()) test_tbl
<SQL> SELECT * FROM `df`
Solution
The below provides two workable examples or at least the translations. Whilst you can denote a SQL function by using uppercase as shown in example 1, the second example does provide an explicit marker for what is to be used as-is in SQL with function sql()
.
# example 1 mutate(test_tbl, date = TO_DATE(char_date, 'DDMMYYYY'))
<SQL> SELECT `df`.*, TO_DATE(`char_date`, 'DDMMYYYY') AS `date` FROM `df`
# example 2 mutate(test_tbl, date = sql("TO_DATE(char_date, 'DDMMYYYY')"))
<SQL> SELECT `df`.*, TO_DATE(char_date, 'DDMMYYYY') AS `date` FROM `df`
Example (kind of)
It’s “kind of” because it’s not a direct translation applied to SQL but creating an example without having an instance of SQL is difficult. At least for free. I’ll provide instruction below if you wish to follow along:
- Head to DB Fiddle
- Change SQL in the top left to Postgres SQL version 17
- Copy & paste below in Schema SQL box
CREATE TABLE test_tbl ( char_date varchar(8) NOT NULL ); INSERT INTO test_tbl (char_date) VALUES ('01012020'); INSERT INTO test_tbl (char_date) VALUES ('01022020'); INSERT INTO test_tbl (char_date) VALUES ('01032020');
- Copy & paste below in Query SQL box
SELECT char_date, pg_typeof(char_date) AS type FROM test_tbl; SELECT TO_DATE(char_date, 'DDMMYYYY') AS char_date, pg_typeof(TO_DATE(char_date, 'DDMMYYYY')) AS type FROM test_tbl;
Query #1
Query #2
Query 2 shows the output of the translated mutate from Solution albeit manually added into SQL.
Consider
The flavour of SQL for this example was be PostgreSQL version 17. Not all SQL flavours support the TO_DATE
function, below is a short and not exhaustive list.
Conclusion
Lazy all the way! My blocker, this date field as a character, was keeping my lazy queries from continued laziness. Thankfully there was a workaround. The blog does miss the step of directly showing the translated mutate query a SQL database but I hope this example from two sides helps.
Acknowledgements
r-bloggers.com for the reach, platform, and content
DB Fiddle for SQL sandpit
Packages and package maintainer(s):
- dplyr | Hadley Wickham
- dbplyr | Hadley Wickham
- DT | Joe Cheng
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.