(This article was first published on **plausibel**, and kindly contributed to R-bloggers)

I have data on user access to a website. The log file (helpdesk log.csv) contains only the date of access and the number of accesses counted on that date. It looks like this:

Date hits

13-07-2011 2

14-07-2011 1

16-07-2011 3

17-07-2011 4

…

As you can see, days with no access (15-07-2011, for example) have no entry at all.
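A minimal sketch of reading such a log in R and parsing the day-month-year strings as dates. The inline sample rows, the whitespace separator and the name `raw` are assumptions for illustration; the real file may need `read.csv` with a different separator.

```r
# Sample rows copied from the excerpt above. The layout of the real file
# (separator, header) is an assumption.
raw <- read.table(text = "Date hits
13-07-2011 2
14-07-2011 1
16-07-2011 3
17-07-2011 4", header = TRUE, stringsAsFactors = FALSE)

# Parse the day-month-year strings into Date objects.
raw$Date <- as.Date(raw$Date, format = "%d-%m-%Y")

# A difference of more than one day between consecutive rows
# reveals dates that have no entry in the log.
gaps <- diff(raw$Date)
```

Once the column is a `Date`, R prints it in year-month-day order, which is the format used in the rest of this post.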

I wanted to draw a graph showing the number of hits over time. Plotting the raw data gives the graph below, but it is conditional on there having been at least one hit, so it’s a bit misleading: for a day without an entry, the plot does not tell us whether there were zero hits or one.

What I plot looks like this:

2011-06-28 1

2011-06-29 2

2011-06-30 3

2011-07-01 1

2011-07-04 3

2011-07-05 3

Obviously, there is no data for 2011-07-02 and 2011-07-03, whereas I want explicit entries such as 2011-07-02 = 0. In other words, I want this:

2011-06-28 1

2011-06-29 2

2011-06-30 3

2011-07-01 1

2011-07-02 0

2011-07-03 0

2011-07-04 3

2011-07-05 3

So I need to insert the date and a value of zero for each date with no activity. There’s an easy way to do this in R.
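One way to do this, sketched under assumptions, is to build the full sequence of calendar days and look each day up in the short data set with `match()`. The names `daycount` and `actind` appear in this post; the column names `Date` and `hits`, the helper `alldays`, and the use of `match()` itself are assumptions for illustration.

```r
# The short data set: only days with at least one hit (values from the text).
daycount <- data.frame(
  Date = as.Date(c("2011-06-28", "2011-06-29", "2011-06-30",
                   "2011-07-01", "2011-07-04", "2011-07-05")),
  hits = c(1, 2, 3, 1, 3, 3)
)

# Every calendar day from the first to the last observed date.
alldays <- seq(min(daycount$Date), max(daycount$Date), by = "day")

# For each calendar day: its position in daycount$Date, or 0 if it is absent.
actind <- match(alldays, daycount$Date, nomatch = 0)
```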

So actind is an index vector. The first seven entries are:

2011-06-28 1

2011-06-29 2

2011-06-30 3

2011-07-01 4

2011-07-02 0

2011-07-03 0

2011-07-04 5

where each row corresponds to a consecutive date, zero means no activity on that date, and a positive number is the INDEX of the element in “daycount” (the short vector) corresponding to that date.
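Given such an index vector, the zero-filled series can be assembled as in the following sketch. The values are taken from the examples in this post; `daycount_hits`, `alldays`, `hits_full` and `full` are hypothetical names, not necessarily those used in the original code.

```r
# Hit counts of the short vector and the index vector described above.
daycount_hits <- c(1, 2, 3, 1, 3, 3)
actind <- c(1, 2, 3, 4, 0, 0, 5, 6)

# One consecutive date per entry of actind.
alldays <- seq(as.Date("2011-06-28"), by = "day", length.out = length(actind))

# Start with zeros, then copy the counts into the days that have an index.
# Indexing with 0 selects nothing, so daycount_hits[actind] has one value
# per positive entry of actind.
hits_full <- numeric(length(actind))
hits_full[actind > 0] <- daycount_hits[actind]

full <- data.frame(Date = alldays, hits = hits_full)

# To draw the series including the zero days:
# plot(full$Date, full$hits, type = "h")
```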

The correct graph is this one:
