Building a productivity system in R, Part 1

July 29, 2014

(This article was first published on Bearded Analytics » R, and kindly contributed to R-bloggers)

I recently came to the conclusion that I need a more meaningful way to track my productivity than the spreadsheet I am currently using, so my next few posts are going to be about building a system in R to track this.  If you’re building your own productivity tracking system then by all means take this as inspiration, but don’t expect it to suit your needs.  I’m making it to suit my needs using terminology that is common in my workplace and you’ll have to figure out what will work for your needs in your workplace.

As with all such endeavors, the thing that is really going to make or break this tracking is the data model, so let’s define that first.

At the very top level I have projects.  Each client will have one or more projects.  I’m not interested in tracking work for particular clients (at least for now) so I’m skipping that level, but it is necessary to note that each client has a 4-digit number.  Each project also has a 4-digit number, so the combination of the client digits and the project digits forms a partial billing code.  The addition of the task-level 4-digit number makes a complete billing code that can be entered into my timesheet, but we’re not there yet.  At the project level, the first two quartets are all that is necessary.  Additionally, we’re going to have a name for the project, the date the project gets added, and the date the project gets removed.  Projects can often be multi-year endeavors, so understanding just how long you’ve been working on various tasks for a project can be useful.  For referencing across different datasets in this data model, a project ID will also be defined.
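As a rough sketch of how the quartets combine, the helper below zero-pads each number to four digits and pastes them together.  The function name `makeBillingCode`, the `sep` argument, and the example numbers are hypothetical and only for illustration; real codes may well be formatted differently (for example, with a separator between quartets).

```r
# Hypothetical helper: combine 4-digit quartets into a billing code.
# Assumes each part is a non-negative whole number below 10000.
makeBillingCode <- function(client, project, task = NULL, sep = "") {
  # When task is NULL it drops out of c(), which yields a partial code.
  parts <- c(client, project, task)
  stopifnot(is.numeric(parts),
            all(parts >= 0),
            all(parts < 10000),
            all(parts == floor(parts)))
  # Zero-pad every quartet to exactly four digits, then join them.
  paste(sprintf("%04d", as.integer(parts)), collapse = sep)
}

partial  <- makeBillingCode(12, 345)        # project level: two quartets
complete <- makeBillingCode(12, 345, 6789)  # task level: three quartets
```

Keeping the code as a character string rather than a number preserves the leading zeros, which is why the data frames further down declare `BillingCode` as `character()`.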

Below the project level, as mentioned, are tasks.  Each task is a concrete goal that has been assigned for me to work on for that project.  Sometimes I only have one task for an entire project; other times I might have several tasks simultaneously.  Some tasks may also depend on the completion of other tasks.  So we’re going to want the following things: task ID, task name, project ID, complete 12-digit billing code, whether the task depends on the completion of another task, add date, complete date, budgeted hours, total used hours (will be cumulative), impact, effort, and notes.  I’m using the impact and effort fields to automatically assign priorities.  They will each be given a value from 1 to 10, with 10 being the highest.  I’m not going to get into how impact and effort will be used to create the priority here, since I will go into more detail about that in a future post, but see this article for my inspiration.
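Since impact and effort are restricted to whole numbers from 1 to 10, a small range check can catch typos before they reach the task data.  This is only a sketch: the function name `isValidScore` and the sample values are hypothetical, and it says nothing about how the two scores are later turned into a priority.

```r
# Hypothetical check: a score is valid if it is a whole number from 1 to 10.
isValidScore <- function(x) {
  # Non-numeric input (e.g. character) is never a valid score.
  if (!is.numeric(x)) return(rep(FALSE, length(x)))
  # NA is treated as invalid rather than propagated.
  !is.na(x) & x >= 1 & x <= 10 & x == floor(x)
}

scores <- c(1L, 10L, 0L, 11L, NA)  # illustrative values only
valid  <- isValidScore(scores)
```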

Finally, I want to track the actual hours in the day that I do the work.  So for this dataset I just want the task ID, the date/time in, and the date/time out.
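To illustrate how these in/out timestamps could feed the cumulative used-hours figure on a task, here is a minimal sketch.  The sample rows, the `Duration` column, and the `usedHours` name are made up for the example, and it uses base R date-times rather than any particular package.

```r
# Made-up sample of the hours data; task IDs and times are illustrative only.
hours <- data.frame(
  TaskID  = c("T1", "T1", "T2"),
  TimeIn  = as.POSIXct(c("2014-07-28 09:00:00",
                         "2014-07-28 13:00:00",
                         "2014-07-29 10:00:00"), tz = "UTC"),
  TimeOut = as.POSIXct(c("2014-07-28 11:30:00",
                         "2014-07-28 14:15:00",
                         "2014-07-29 12:00:00"), tz = "UTC"),
  stringsAsFactors = FALSE
)

# Length of each work session, expressed in hours.
hours$Duration <- as.numeric(difftime(hours$TimeOut, hours$TimeIn,
                                      units = "hours"))

# Sum the sessions per task to get the cumulative hours used so far.
usedHours <- tapply(hours$Duration, hours$TaskID, sum)
```

Storing only the raw in/out times and deriving totals on demand means the hours data stays the single source of truth for time spent.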

Since I want all of this to appear as a single object I’m going to use a list containing three data frames.  Below is a function that will actually generate this object.  I expect I’ll only ever have to use it once, but it’s still useful to me to think in this way.  My next post will get into adding projects and tasks.

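library(lubridate)  # provides ymd() and ymd_hms()

# Build the empty tracking structure: a list of three zero-row data frames.
createStructure <- function() {
  Projects <- data.frame(ProjectID = character(),
                         ProjectName = character(),
                         BillingCode = character(), # possibly partial
                         AddDate = ymd(character()),
                         RemoveDate = ymd(character()),
                         stringsAsFactors = FALSE)
  Tasks <- data.frame(TaskID = character(),
                      TaskName = character(),
                      ProjectName = character(),
                      BillingCode = character(), # should be complete; multiple codes spill into Notes
                      AddDate = ymd(character()),
                      CompleteDate = ymd(character()),
                      BudgetHours = numeric(),
                      TotalUsedHours = numeric(),
                      Impact = integer(),
                      Effort = integer(),
                      Notes = character(),
                      stringsAsFactors = FALSE)
  Hours <- data.frame(TaskID = character(),
                      TimeIn = ymd_hms(character()),
                      TimeOut = ymd_hms(character()),
                      stringsAsFactors = FALSE)
  # Name the elements so they can be reached as x$Projects, x$Tasks, x$Hours.
  return(list(Projects = Projects, Tasks = Tasks, Hours = Hours))
}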
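To show why it is worth declaring typed columns up front even with no data, here is a small standalone sketch.  It uses base R only and a hypothetical two-column miniature of the hours table rather than the full structure above; the names `emptyHours`, `newRow`, and `filled` are made up for the example.

```r
# Hypothetical miniature of the hours table, built with base R types.
emptyHours <- data.frame(TaskID = character(),
                         TimeIn = as.POSIXct(character(), tz = "UTC"),
                         stringsAsFactors = FALSE)

nrow(emptyHours)  # no rows yet

# The column types are already fixed, even though the frame is empty.
colTypes <- sapply(emptyHours, function(col) class(col)[1])

# A row with matching columns binds on without changing those types.
newRow <- data.frame(TaskID = "T1",
                     TimeIn = as.POSIXct("2014-07-29 09:00:00", tz = "UTC"),
                     stringsAsFactors = FALSE)
filled <- rbind(emptyHours, newRow)
```

The empty frame acts as a schema: anything added later has to line up with the column names and types declared at creation.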

