A new version of the package redshiftTools has arrived with improvements and it’s now available in CRAN! This package let’s you efficiently upload data into an Amazon Redshift database using the approach recommended by Amazon
This package is helpful because otherwise uploading data with inserts in Redshift is super slow, this is the recommended way of doing replaces and upserts per the Redshift documentation, which consists of generating various CSV files, uploading them to an S3 bucket and then calling a copy command on the Redshift server, all of that is handled by the package.
To install this package, use the following command:
After installing, you’ll have these functions to use, which are explained in full detail in the package’s man pages.
rs_create_statement: Generates the SQL statement to create a table based on the structure of a data.frame. It allows you to specify sort key, dist key and if you want to allow compression to be added or not.
rs_replace_table: Deletes all records in a table, then uploads the provided data frame into it. It runs as a transaction so the table is never empty to the other users.
rs_upsert_table: Deletes all records matching the provided keys from the uploaded dataset, and then inserts the rows from the dataset. If no keys are provided, it acts as a regular insert.
rs_cols_upsert_table: Like rs_upsert_table but can choose only some columns to update
rs_append_table: Like the previous functions but only appends data without altering existing data.
rs_create_table: This just runs rs_create_statement and then rs_replace_table, creating a table with the same structure as your data frame and then uploading the data frame to it.
For more details, read the official README in https://github.com/sicarul/redshiftTools
A special thanks to all the collaborators that sent contributions to the package:
- Quentin Rousseau – kwent
- Ryan Johnson – rtjohn
- Ilya Goldin – ilyaminati
- Farkhan Novianto – mfarkhann
- Emelie Hofland – Emelieh21
For future versions, i plan to include additional utility functions that allow you to obtain table metadata, optimize table encoding, check table permissions, etc. If you feel like you have some cool functionality to share please share your pull request!