Fix missing dates with R

September 2, 2011

(This article was first published on plausibel, and kindly contributed to R-bloggers)

I have data on user access to a website. The log file (helpdesk log.csv) contains only the date of access and the number of accesses counted on that date. It looks like this:

Date hits
13-07-2011 2
14-07-2011 1
16-07-2011 3
17-07-2011 4
...

As you can see, days with no access (15-07-2011, for example) have no entry.
I wanted to draw a graph showing the number of hits over time. Plotting the raw data gives the graph below, but it only includes days with at least one hit, so it's a bit misleading: where a date is missing, the plot simply joins the neighbouring days instead of showing zero hits.
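To make the gaps concrete, here is a minimal sketch that parses the sample rows shown above and measures the spacing between consecutive entries. The data is typed inline rather than read from the log file, and the names `helpdesk` and `gaps` are made up for this illustration.

```r
# Sample rows copied from the listing above (inline, not read from the file)
log_text <- "Date hits
13-07-2011 2
14-07-2011 1
16-07-2011 3
17-07-2011 4"

helpdesk <- read.table(text = log_text, header = TRUE,
                       stringsAsFactors = FALSE)

# The log stores dates as day-month-year, so the format must be given
helpdesk$Date <- as.Date(helpdesk$Date, format = "%d-%m-%Y")

# Number of days between consecutive entries; anything above 1 is a gap
gaps <- as.numeric(diff(helpdesk$Date), units = "days")
print(gaps)
```

A spacing larger than one day marks a stretch of dates that is absent from the log, such as 15-07-2011 here.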



What I plot looks like this:
2011-06-28 1
2011-06-29 2
2011-06-30 3
2011-07-01 1
2011-07-04 3
2011-07-05 3


There is no data for 2011-07-02 and 2011-07-03, where I would want an entry with a value of zero. In other words, I want this:

2011-06-28 1
2011-06-29 2
2011-06-30 3
2011-07-01 1
2011-07-02 0
2011-07-03 0
2011-07-04 3
2011-07-05 3

So, I need to insert the date and a value of zero for each date with no activity. There's an easy way to do this in R.
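The original code is not reproduced here, so what follows is only a sketch of one way to do it, under stated assumptions: the names `daycount` and `actind` are taken from the text, while `dates` and `alldays` are made up for this illustration, and the data is the six-row example from above. The idea is to build the full sequence of calendar days with `seq()` and look each one up in the observed dates with `match()`.

```r
# Observed dates and their hit counts (the six rows shown above)
dates    <- as.Date(c("2011-06-28", "2011-06-29", "2011-06-30",
                      "2011-07-01", "2011-07-04", "2011-07-05"))
daycount <- c(1, 2, 3, 1, 3, 3)

# Every calendar day from the first to the last observed date
alldays <- seq(min(dates), max(dates), by = "day")

# For each calendar day, its position in 'dates', or 0 if it is missing
actind <- match(alldays, dates, nomatch = 0)

print(data.frame(alldays, actind))
```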


Here actind is an index vector over the full range of dates. Its first seven entries are:

2011-06-28 1
2011-06-29 2
2011-06-30 3
2011-07-01 4
2011-07-02 0
2011-07-03 0
2011-07-04 5

where each row corresponds to a consecutive date, zero means no activity on that date, and a positive number is the INDEX of the element in "daycount" (the short vector) corresponding to that date.
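Given such an index vector, the zero-filled series follows in two lines. Again this is a sketch rather than the original code: `hits` and `alldays` are hypothetical names, and the example data is repeated so the block runs on its own.

```r
# Example data repeated so this block is self-contained
dates    <- as.Date(c("2011-06-28", "2011-06-29", "2011-06-30",
                      "2011-07-01", "2011-07-04", "2011-07-05"))
daycount <- c(1, 2, 3, 1, 3, 3)
alldays  <- seq(min(dates), max(dates), by = "day")
actind   <- match(alldays, dates, nomatch = 0)

# Start with zero hits on every calendar day ...
hits <- numeric(length(alldays))

# ... then copy the observed counts into the days that have an entry.
# Zero indices are dropped by R, so daycount[actind] has one value
# per day that has an entry.
hits[actind > 0] <- daycount[actind]

print(data.frame(date = alldays, hits = hits))
```

Plotting the full series, for instance with `plot(alldays, hits, type = "h")`, then shows the days without activity as zeros instead of skipping them.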

The correct graph is this one:





