redshiftTools v1.0.0 – CRAN Release!

[This article was first published on R - Data Science Heroes Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

redshiftTools v1.0.0 - CRAN Release!

A new version of the package redshiftTools has arrived with improvements and it’s now available in CRAN! This package let’s you efficiently upload data into an Amazon Redshift database using the approach recommended by Amazon

This package is helpful because otherwise uploading data with inserts in Redshift is super slow, this is the recommended way of doing replaces and upserts per the Redshift documentation, which consists of generating various CSV files, uploading them to an S3 bucket and then calling a copy command on the Redshift server, all of that is handled by the package.

To install this package, use the following command:

install.packages('redshiftTools')

After installing, you’ll have these functions to use, which are explained in full detail in the package’s man pages.

rs_create_statement: Generates the SQL statement to create a table based on the structure of a data.frame. It allows you to specify sort key, dist key and if you want to allow compression to be added or not.

rs_replace_table: Deletes all records in a table, then uploads the provided data frame into it. It runs as a transaction so the table is never empty to the other users.

rs_upsert_table: Deletes all records matching the provided keys from the uploaded dataset, and then inserts the rows from the dataset. If no keys are provided, it acts as a regular insert.

rs_cols_upsert_table: Like rs_upsert_table but can choose only some columns to update

rs_append_table: Like the previous functions but only appends data without altering existing data.

rs_create_table: This just runs rs_create_statement and then rs_replace_table, creating a table with the same structure as your data frame and then uploading the data frame to it.

For more details, read the official README in https://github.com/sicarul/redshiftTools

A special thanks to all the collaborators that sent contributions to the package:

Future Plans

For future versions, i plan to include additional utility functions that allow you to obtain table metadata, optimize table encoding, check table permissions, etc. If you feel like you have some cool functionality to share please share your pull request!

To leave a comment for the author, please follow the link and comment on their blog: R - Data Science Heroes Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)