R Work Areas. Standardize and Automate.
Before beginning work on a new data science project I like to do the following:
1. Get my work area ready by creating an R Project for use with the RStudio IDE.
2. Organize my work area by creating a series of directories to store my project inputs and outputs. I create ‘data’ (raw data), ‘src’ (R code), ‘reports’ (markdown documents etc.) and ‘documentation’ (help files) directories.
3. Set up tracking of my work by initializing a Git repo.
4. Take measures to avoid tracking sensitive information (such as data or passwords) by adding certain file names or extensions to the .gitignore file.
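As a rough illustration of step 4, the sketch below adds two ignore patterns and then asks Git whether a given path would be ignored. The patterns (`*.csv`, `.Renviron`) and the file names are assumptions chosen for the example, and it runs in a throwaway directory so it does not touch a real project.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a temporary repo so the demo leaves real projects alone.
demo_dir="$(mktemp -d)"
cd "$demo_dir"
git init --quiet

# Hypothetical patterns: CSV data files and an .Renviron file holding credentials.
printf '%s\n' '*.csv' '.Renviron' >> .gitignore

# 'git check-ignore --quiet' exits with status 0 when the path is ignored.
if git check-ignore --quiet data/customers.csv; then
    echo "data/customers.csv is ignored"
fi
```

The same `git check-ignore` call can be run in an existing project before the first commit, to confirm a sensitive file will not be tracked.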
You can of course achieve the desired result using the RStudio IDE GUI, but I have found it handy to automate the process with a shell script. Because I use Windows, I execute this script using the Git BASH emulator. If you have a Mac or Linux machine, just use the terminal.
Steps:
1. Navigate to the directory you want to designate as your area of work and run
bash project_setup projectname
where “projectname” is a name of your choosing.
2. Open the freshly generated R Project in RStudio. This will create your .Rproj.user directory.
3. Start work!
You can see my script below; just modify it to suit your own requirements. Notice I have set the R project options ‘Restore Workspace’, ‘Save Workspace’ and ‘Always Save History’ to an explicit ‘No’, increased the ‘Number of Spaces for Tab’ to 4 and prevented the tracking of .csv data files.
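A minimal sketch of what such a script could look like follows. It is not necessarily the original. It assumes the work area is the current directory, that the argument only names the `.Rproj` file, and that the ignore list includes the usual R session files alongside `*.csv`. The function name `setup_project` is made up for this example.

```shell
#!/usr/bin/env bash
# project_setup: sketch of a work-area setup script.
# Usage: bash project_setup projectname
set -euo pipefail

setup_project() {
    local name="$1"

    # Directories for raw data, R code, reports and help files.
    mkdir -p data src reports documentation

    # R Project file: workspace/history options set to "No", 4-space tabs.
    cat > "${name}.Rproj" <<'EOF'
Version: 1.0

RestoreWorkspace: No
SaveWorkspace: No
AlwaysSaveHistory: No

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 4
Encoding: UTF-8
EOF

    # Assumed ignore list: RStudio/R session files plus .csv data files.
    printf '%s\n' '.Rproj.user' '.Rhistory' '.RData' '*.csv' > .gitignore

    # Initialise the repo; skip with a warning if git is not on the PATH.
    if command -v git >/dev/null 2>&1; then
        git init --quiet
    else
        echo "git not found; skipping 'git init'" >&2
    fi
}

if [ "$#" -eq 1 ]; then
    setup_project "$1"
else
    echo "Usage: bash project_setup projectname" >&2
fi
```

Note that this sketch overwrites any existing `.gitignore` and `.Rproj` file of the same name, so it is best run in a fresh directory.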