Tracking my Work Hours with IFTTT and R

[This article was first published on Jasmine Dumas' R Blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

An ongoing effort to work more efficiently –

TL;DR: For the forgetful, IFTTT can be a great way to track your work hours and provide insight about trends through visualization.

Manually entering your own timesheet and logging your specific project time hours for work (and contracting) can cause inconsistencies, namely forgetting how may hours to submit. In November 2015 I began a new role as a Data Science Intern at The Hartford Insurance Group headquartered in Hartford, CT and I was responsible for logging my working hours in both the contractor database as well as The Hartford’s database so that I can be promptly paid each week. After the first few days, I made an honest assessment about how likely I was to screw up my actual work hours for entry and decided if I wanted to not be scolded, I should think of a automated way of recording hours.

It has been said that:

“When you can measure what you are speaking about, and express it in numbers, you know something about it, when you cannot express it in numbers, your knowledge is of a meager and unsatisfactory kind; it may be the beginning of knowledge, but you have scarely, in your thoughts advanced to the stage of science.” – William Thomson, 1st Baron Kelvin

So with previous exposure to IFTTT, which is a free web-based service which creates recipes that connect the digital and physical world in the realm of IoT (Internet of Things), I set out the create a recipe that would use my iPhone’s location services and a google spreadsheet for tracking timestamps to and from work. I established a perimeter around my work location in the IFTTT application and specified that I wanted the arrival and departure time from that area. Back in November I only intended for this recipe to be a plan b if I had forgot how many hours I had worked for my time sheets. In the interest of becoming more informed about the entirety of my work hours and after accumulating approximately 258 check-ins over the last 7 months on the google spreadsheet, I set out to extract some meaning and tangible statistics to understand how often I was working.

For this project analysis I used the googlesheets package by Jenny Bryan which is a Google Spreadsheets R API, dplyr for tidy data forms and ggplot2 and plotly for interactive graphics.

Here is the resulting visualization:

From the graph above, it is pretty clear when I switched from being a part-time intern to a full-time data scientist in early April. Also its interesting to see when I started to work a compressed work week in early June. I unfortunately sometimes walk to lunch events that inadvertently exit my location services out of the original perimeter and also when traveling through Hartford to visit family in New York.

In conclusion, IFTTT was a bit irregular and not completely accurate for logging the location coordinates but overall this is a good start to generating more data and providing insight to about my working hours!

<span class="c1">##############################
# Explore my work schedule from a google sheet where I log
# my travel time to Hartford, CT with IFTTT
##############################
</span><span class="n">library</span><span class="p">(</span><span class="n">magrittr</span><span class="p">)</span><span class="w">
</span><span class="c1">################
# data clean
################
# install googlesheets
</span><span class="n">library</span><span class="p">(</span><span class="n">googlesheets</span><span class="p">)</span><span class="w">
</span><span class="c1"># register the google sheet as a URL with share ability enable
</span><span class="n">sheet</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">gs_url</span><span class="p">(</span><span class="s2">"https://docs.google.com/spreadsheets/d/1EFH_QytjB661YUe0VKZkuik-wx-v_w2sJvL12IiEMOY/edit?usp=sharing"</span><span class="p">,</span><span class="w">
                </span><span class="n">lookup</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="nb">T</span><span class="p">)</span><span class="w">
</span><span class="c1"># read the sheet in
</span><span class="n">time_sheet</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">sheet</span><span class="w"> </span><span class="o">%>%</span><span class="w"> </span><span class="n">gs_read</span><span class="p">(</span><span class="n">ws</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"Sheet1"</span><span class="p">)</span><span class="w">
</span><span class="n">head</span><span class="p">(</span><span class="n">time_sheet</span><span class="p">)</span><span class="w">
</span><span class="n">sapply</span><span class="p">(</span><span class="n">time_sheet</span><span class="p">,</span><span class="w"> </span><span class="n">class</span><span class="p">)</span><span class="w">

</span><span class="c1"># remove duplicates
</span><span class="n">time_sheet</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">time_sheet</span><span class="p">[</span><span class="o">-</span><span class="nf">c</span><span class="p">(</span><span class="m">96</span><span class="o">:</span><span class="m">98</span><span class="p">),]</span><span class="w">

</span><span class="c1"># extract date from time
</span><span class="n">library</span><span class="p">(</span><span class="n">stringr</span><span class="p">)</span><span class="w">
</span><span class="n">date</span><span class="w"> </span><span class="o"><-</span><span class="n">str_extract_all</span><span class="p">(</span><span class="n">string</span><span class="o">=</span><span class="n">time_sheet</span><span class="o">$</span><span class="n">date_time</span><span class="p">,</span><span class="w"> </span><span class="n">pattern</span><span class="o">=</span><span class="s1">'\\w+\\s\\d+(st)?(nd)?(rd)?(th)?,\\s+\\d+'</span><span class="p">)</span><span class="w">
</span><span class="n">time_sheet</span><span class="o">$</span><span class="n">date</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">date</span><span class="w">

</span><span class="c1"># extract out the time to a separate column
</span><span class="n">time</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">str_extract_all</span><span class="p">(</span><span class="n">string</span><span class="o">=</span><span class="n">time_sheet</span><span class="o">$</span><span class="n">date_time</span><span class="p">,</span><span class="w"> </span><span class="n">pattern</span><span class="o">=</span><span class="s1">'\\b((1[0-2]|0?[1-9]):([0-5][0-9])([AaPp][Mm]))'</span><span class="p">)</span><span class="w">
</span><span class="n">time_sheet</span><span class="o">$</span><span class="n">time</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">time</span><span class="w">

</span><span class="c1"># transform the types for later data munging
</span><span class="n">library</span><span class="p">(</span><span class="n">lubridate</span><span class="p">)</span><span class="w">
</span><span class="n">time_sheet</span><span class="o">$</span><span class="n">date_format</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">mdy</span><span class="p">(</span><span class="n">time_sheet</span><span class="o">$</span><span class="n">date</span><span class="p">)</span><span class="w">
</span><span class="n">time_sheet</span><span class="o">$</span><span class="n">time</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">as.character</span><span class="p">(</span><span class="n">time_sheet</span><span class="o">$</span><span class="n">time</span><span class="p">)</span><span class="w">
</span><span class="n">head</span><span class="p">(</span><span class="n">time_sheet</span><span class="p">)</span><span class="w">

</span><span class="c1"># remove the second column
</span><span class="n">time_sheet</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">time_sheet</span><span class="p">[,</span><span class="w"> </span><span class="o">-</span><span class="nf">c</span><span class="p">(</span><span class="m">2</span><span class="p">)]</span><span class="w">

</span><span class="c1"># look at a data frame with specific types of exit, enter times
</span><span class="n">library</span><span class="p">(</span><span class="n">dplyr</span><span class="p">)</span><span class="w">
</span><span class="n">exit</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">filter</span><span class="p">(</span><span class="n">time_sheet</span><span class="p">,</span><span class="w"> </span><span class="n">type</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="s1">'exited'</span><span class="p">)</span><span class="w">
</span><span class="n">enter</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">filter</span><span class="p">(</span><span class="n">time_sheet</span><span class="p">,</span><span class="w"> </span><span class="n">type</span><span class="w"> </span><span class="o">==</span><span class="w"> </span><span class="s1">'entered'</span><span class="p">)</span><span class="w">

</span><span class="c1"># merge the two, which will align the dates with enter and exit time labels
</span><span class="n">merge_time_sheet</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">merge</span><span class="p">(</span><span class="n">enter</span><span class="p">,</span><span class="w"> </span><span class="n">exit</span><span class="p">,</span><span class="w"> </span><span class="n">by</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s1">'date_format'</span><span class="p">)</span><span class="w">

</span><span class="c1"># remove duplicate middle columns of unformatted date and labels
</span><span class="n">merge_time_sheet</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">merge_time_sheet</span><span class="p">[,</span><span class="w"> </span><span class="o">-</span><span class="nf">c</span><span class="p">(</span><span class="m">2</span><span class="o">:</span><span class="m">3</span><span class="p">,</span><span class="w"> </span><span class="m">5</span><span class="o">:</span><span class="m">6</span><span class="p">)]</span><span class="w">

</span><span class="c1"># change column names
</span><span class="n">colnames</span><span class="p">(</span><span class="n">merge_time_sheet</span><span class="p">)</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="s2">"date"</span><span class="p">,</span><span class="w"> </span><span class="s2">"enter_time"</span><span class="p">,</span><span class="w"> </span><span class="s2">"exit_time"</span><span class="p">)</span><span class="w">
</span><span class="n">head</span><span class="p">(</span><span class="n">merge_time_sheet</span><span class="p">)</span><span class="w">

</span><span class="c1"># format the time column to get the difference from enter and exit (in seconds)
</span><span class="n">merge_time_sheet</span><span class="o">$</span><span class="n">time_diff</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">strptime</span><span class="p">(</span><span class="n">merge_time_sheet</span><span class="o">$</span><span class="n">enter_time</span><span class="p">,</span><span class="w"> </span><span class="s2">"%H:%M%p"</span><span class="p">)</span><span class="w"> </span><span class="o">-</span><span class="w"> </span><span class="n">strptime</span><span class="p">(</span><span class="n">merge_time_sheet</span><span class="o">$</span><span class="n">exit_time</span><span class="p">,</span><span class="w"> </span><span class="s2">"%H:%M%p"</span><span class="p">)</span><span class="w">
</span><span class="n">merge_time_sheet</span><span class="o">$</span><span class="n">time_diff</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">abs</span><span class="p">(</span><span class="n">merge_time_sheet</span><span class="o">$</span><span class="n">time_diff</span><span class="p">)</span><span class="w">

</span><span class="c1"># change seconds into hours (3600 in an hour)
</span><span class="n">merge_time_sheet</span><span class="o">$</span><span class="n">time_diff</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">merge_time_sheet</span><span class="o">$</span><span class="n">time_diff</span><span class="w"> </span><span class="o">/</span><span class="w"> </span><span class="m">3600</span><span class="w">

</span><span class="c1"># so I travel through hartford alot on the highway or walk around
# hartford for lunch, so I want to remove small time diff observations
</span><span class="n">merge_time_sheet</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">merge_time_sheet</span><span class="p">[</span><span class="o">-</span><span class="nf">c</span><span class="p">(</span><span class="n">which</span><span class="p">(</span><span class="n">merge_time_sheet</span><span class="o">$</span><span class="n">time_diff</span><span class="w"> </span><span class="o"><</span><span class="w"> </span><span class="m">3.51666666666667</span><span class="p">)),</span><span class="w"> </span><span class="p">]</span><span class="w">
</span><span class="c1"># there are some duplicates but I will leave those for now
</span><span class="w">
</span><span class="c1"># round time diff
</span><span class="n">merge_time_sheet</span><span class="o">$</span><span class="n">time_diff</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">round</span><span class="p">(</span><span class="n">merge_time_sheet</span><span class="o">$</span><span class="n">time_diff</span><span class="p">,</span><span class="w"> </span><span class="m">2</span><span class="p">)</span><span class="w">

</span><span class="c1">#################
# time summaries
#################
</span><span class="n">mean</span><span class="p">(</span><span class="n">merge_time_sheet</span><span class="o">$</span><span class="n">time_diff</span><span class="p">)</span><span class="w"> </span><span class="c1"># mean of 5.1 hours (even though it says secs)
</span><span class="w">
</span><span class="c1">##########
# data viz
##########
</span><span class="n">library</span><span class="p">(</span><span class="n">ggplot2</span><span class="p">)</span><span class="w">
</span><span class="n">work_hrs</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">ggplot</span><span class="p">(</span><span class="n">merge_time_sheet</span><span class="p">,</span><span class="w"> </span><span class="n">aes</span><span class="p">(</span><span class="n">date</span><span class="p">,</span><span class="w"> </span><span class="nf">c</span><span class="p">(</span><span class="n">time_diff</span><span class="p">)))</span><span class="w"> </span><span class="o">+</span><span class="w">
                  </span><span class="n">geom_line</span><span class="p">(</span><span class="n">color</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="s2">"dodgerblue"</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
                  </span><span class="n">xlab</span><span class="p">(</span><span class="s2">""</span><span class="p">)</span><span class="w"> </span><span class="o">+</span><span class="w">
                  </span><span class="n">ylab</span><span class="p">(</span><span class="s2">"Working Hours"</span><span class="p">)</span><span class="w">
</span><span class="n">work_hrs</span><span class="w">
</span><span class="n">library</span><span class="p">(</span><span class="n">plotly</span><span class="p">)</span><span class="w">
</span><span class="n">ggplotly</span><span class="p">(</span><span class="n">work_hrs</span><span class="p">)</span><span class="w">
</span><span class="c1">## provided my username and API key before this step
</span><span class="n">plotly_POST</span><span class="p">(</span><span class="n">work_hrs</span><span class="p">,</span><span class="w"> </span><span class="s2">"Time Series Analysis of Work Hours"</span><span class="p">)</span><span class="w">


</span>

To leave a comment for the author, please follow the link and comment on their blog: Jasmine Dumas' R Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)