Before beginning work on a new data science project I like to do the following:
1. Get my work area ready by creating an R Project for use with the RStudio IDE.
2. Organize my work area by creating a series of directories to store my project inputs and outputs. I create ‘data’ (raw data), ‘src’ (R code), ‘reports’ (markdown documents etc.) and ‘documentation’ (help files) directories.
3. Set up tracking of my work by initializing a Git repository.
4. Take measures to avoid tracking sensitive information (such as data or passwords) by adding certain file names or extensions to the .gitignore file.
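As a rough command-line illustration of steps 2 to 4 above, here is a minimal sketch. It is not the full script: it leaves out step 1 (creating the .Rproj file), the function name setup_project is hypothetical, and the .Rproj.user entry in .gitignore is an assumption on my part rather than a requirement.

```shell
#!/usr/bin/env bash
# Minimal sketch of steps 2-4: directories, Git repository, .gitignore.
# The function name and the exact ignore entries are illustrative assumptions.
set -euo pipefail

setup_project() {
    local name="$1"

    # Step 2: create the project directory and its sub-directories.
    mkdir -p "$name"/{data,src,reports,documentation}

    # Step 3: initialize a Git repository inside the project directory.
    git -C "$name" init --quiet

    # Step 4: keep data files and RStudio's per-user state out of version control.
    printf '%s\n' '*.csv' '.Rproj.user/' > "$name/.gitignore"
}
```

With the function loaded into a shell session, running setup_project projectname creates a ‘projectname’ directory laid out as described in the list above.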
You can, of course, achieve the desired result using the RStudio IDE GUI, but I have found it handy to automate the process with a shell script. Because I use Windows, I execute this script using the Git BASH emulator. If you have a Mac or Linux machine, just use the terminal.
1. Navigate to the directory you want to designate as your area of work and run
bash project_setup projectname
where “projectname” is a name of your choosing.
2. Open the freshly generated R Project in RStudio. This will create your .Rproj.user directory.
3. Start work!
You can see my script below; just modify it to suit your own requirements. Notice that I have set the R project options ‘Restore Workspace’, ‘Save Workspace’ and ‘Always Save History’ to an explicit ‘No’, increased the ‘Number of Spaces for Tab’ to 4, and prevented the tracking of .csv data files.