# Simulating a Queue in R

May 30, 2010
By

(This article was first published on Taking the Pith Out of Performance, and kindly contributed to R-bloggers)

In the GCaP class earlier this month, we talked about the meaning of the load average (in Unix and Linux) and simulating a grocery store checkout lane, but I didn't actually do it. So, I decided to take a shot at constructing a discrete-event simulation (as opposed to Monte Carlo simulation) of a simple M/M/1 queue in R.

We can make use of a lot of conveniences in R to accomplish such a simulation. For example, we don't have to worry about random number generation, we can simply use the rexp() function for an M/M/1 queue. It may not be the fastest code on the planet but it is guaranteed to be reliable. We also have the ease of integrating PDQ (Pretty Damn Quick) for analytic comparison, as well as the nice statistical analysis and plotting capabilities available in R.

Simulation Variables
As usual, we start with a list of the necessary variables for the simulation and its instrumentation.
t.end   <- 10^5 # duration of simt.clock <- 0    # sim timeTa <- 1.3333    # interarrival periodTs <- 1.0000    # service periodt1 <- 0         # time for next arrivalt2 <- t.end     # time for next departuretn <- t.clock   # tmp var for last event timetb <- 0         # tmp var for last busy-time startn <- 0          # number in systems <- 0          # cumulative number-time productb <- 0          # total busy timec <- 0          # total completionsqc <- 0         # plot instantaneous q sizetc <- 0         # plot time deltaplotSamples <- 100set.seed(1)

Next, we need to write the R code to perform the actual M/M/1 simulation of arrivals into and departures from the queue.

Simulation Loop
This code meant to be pedagogic so, I haven't bothered to do anything spiffy like pre-allocating the Exp variates, for example. I based it on the example in Mac MacDougall's book Simulating Computer Systems (an oldie but a goodie), rather than the example in the more recent Introduction to Scientific Programming and Simulation Using R book, because I think there's a bug in their R code, but I didn't spend any time trying to find it. Also, that code is not instrumented.
while (t.clock < t.end) {    if (t1 < t2) {      # arrival event        t.clock <- t1        s <- s + n * (t.clock - tn)  # delta time-weighted number in queue        n <- n + 1        if (t.clock < plotSamples) {             qc <- append(qc,n)            tc <- append(tc,t.clock)         }        tn <- t.clock        t1 <- t.clock + rexp(1, 1/Ta)        if(n == 1) {             tb <- t.clock            t2 <- t.clock + rexp(1, 1/Ts)  # exponential  interarrival period        }    } else {            # departure event        t.clock <- t2        s <- s + n * (t.clock - tn)  # delta time-weighted number in queue        n <- n - 1        if (t.clock < plotSamples) {             qc <- append(qc,n)            tc <- append(tc,t.clock)        }        tn <- t.clock        c <- c + 1        if (n > 0) {             t2 <- t.clock + rexp(1, 1/Ts)  # exponential  service period        }        else {             t2 <- t.end            b <- b + t.clock - tb        }    }   }

So, now we have the simulation workhorse in place.

Instrumented Metrics
Here, we collect the instrumentation data to form some well-known performance metrics. They correspond to the definitions given in class.
u <- b/t.clock       # utilization B/TN <- s/t.clock       # mean queue length (see the Load Average notes)x <- c/t.clock       # mean throughput C/Tr <- N/x             # mean residence time (from Little's law: Q = XR)q <- sum(qc)/max(tc) # estimated queue length for plot

Queue Length
This is a plot of instantaneous queue length à la load average data. This is what queueing fluctuations look like. As I point out in class, they're responsible for the usually complicated math seen in queueing-theory textbooks that can make your head hurt.

The box shown as the red dashed line has the same area as that under the stair-step curve. Since the box has the same width (in time) as the curve, its height of 1.7845 represents the time-averaged queue length based on a sample of 100 time steps out of the total of 100,000 steps. As you'll see in the next section, the height of the box approaches to the steady-state value of Q = 3.00 predicted by PDQ as the simulation time is increased.

PDQ Model
For analytic comparison, we also include the corresponding PDQ-R model in the same script using the online manual for reference.
Init("")CreateOpen("w",1/Ta)     # arrivals into queueCreateNode("n",CEN,FCFS) # the M/M/1 queueSetDemand("n","w",Ts)    # service timeSolve(CANON) # Collect individual performance metricsR <- GetResidenceTime("n","w",TRANS)Q <- GetQueueLength("n","w",TRANS)U <- GetUtilization("n","w",TRANS)X <- GetThruput(TRANS,"w")

Yes, these few lines are equivalent to the above simulation code with instrumentation, and it's guaranteed to be in steady state. Running PDQ, even in R, is essentially instantaneous. The simulation will take longer, but given the plethora of MIPS/core available today, especially on laptops, running simulations in R is entirely feasible.

Results
Finally, we can compare the simulated M/M/1 queue with the corresponding PDQ results. As usual, it's best to break them into inputs and outputs.

1. Inputs:
Tsim: 1.00e+05Ta: 1.3333, Ts: 1.0000    # timesAr: 0.7500, Sr: 1.0000    # rates

2. Outputs:
Usim: 0.7477, Updq: 0.75Xsim: 0.7495, Xpdq: 0.75Rsim: 4.0316, Rpdq: 4.00Qsim: 3.0219, Qpdq: 3.00

Within the expected limits of precision, we can conclude that the simulation reached steady state during the specified 105 time-steps.

No doubt, I'll go into more detail about doing simulations in R during the upcoming GDAT class in August.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...