Simulating a Queue in R

[This article was first published on Taking the Pith Out of Performance, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In the GCaP class earlier this month, we talked about the meaning of the load average (in Unix and Linux) and simulating a grocery store checkout lane, but I didn’t actually do it. So, I decided to take a shot at constructing a discrete-event simulation (as opposed to Monte Carlo simulation) of a simple M/M/1 queue in R.

We can make use of a lot of conveniences in R to accomplish such a simulation. For example, we don’t have to worry about random number generation, we can simply use the rexp() function for an M/M/1 queue. It may not be the fastest code on the planet but it is guaranteed to be reliable. We also have the ease of integrating PDQ (Pretty Damn Quick) for analytic comparison, as well as the nice statistical analysis and plotting capabilities available in R.

Simulation Variables
As usual, we start with a list of the necessary variables for the simulation and its instrumentation.
t.end   <- 10^5 # duration of sim
t.clock <- 0    # sim time
Ta <- 1.3333    # interarrival period
Ts <- 1.0000    # service period
t1 <- 0         # time for next arrival
t2 <- t.end     # time for next departure
tn <- t.clock   # tmp var for last event time
tb <- 0         # tmp var for last busy-time start
n <- 0          # number in system
s <- 0          # cumulative number-time product
b <- 0          # total busy time
c <- 0          # total completions
qc <- 0         # plot instantaneous q size
tc <- 0         # plot time delta
plotSamples <- 100
set.seed(1)

Next, we need to write the R code to perform the actual M/M/1 simulation of arrivals into and departures from the queue.

Simulation Loop
This code meant to be pedagogic so, I haven't bothered to do anything spiffy like pre-allocating the Exp variates, for example. I based it on the example in Mac MacDougall's book Simulating Computer Systems (an oldie but a goodie), rather than the example in the more recent Introduction to Scientific Programming and Simulation Using R book, because I think there's a bug in their R code, but I didn't spend any time trying to find it. Also, that code is not instrumented.
while (t.clock < t.end) {
    if (t1 < t2) {      # arrival event
        t.clock <- t1
        s <- s + n * (t.clock - tn)  # delta time-weighted number in queue
        n <- n + 1
        if (t.clock < plotSamples) { 
            qc <- append(qc,n)
            tc <- append(tc,t.clock) 
        }
        tn <- t.clock
        t1 <- t.clock + rexp(1, 1/Ta)
        if(n == 1) { 
            tb <- t.clock
            t2 <- t.clock + rexp(1, 1/Ts)  # exponential  interarrival period
        }
    } else {            # departure event
        t.clock <- t2
        s <- s + n * (t.clock - tn)  # delta time-weighted number in queue
        n <- n - 1
        if (t.clock < plotSamples) { 
            qc <- append(qc,n)
            tc <- append(tc,t.clock)
        }
        tn <- t.clock
        c <- c + 1
        if (n > 0) { 
            t2 <- t.clock + rexp(1, 1/Ts)  # exponential  service period
        }
        else { 
            t2 <- t.end
            b <- b + t.clock - tb
        }
    }   
}

So, now we have the simulation workhorse in place.

Instrumented Metrics
Here, we collect the instrumentation data to form some well-known performance metrics. They correspond to the definitions given in class.
u <- b/t.clock       # utilization B/T
N <- s/t.clock       # mean queue length (see the Load Average notes)
x <- c/t.clock       # mean throughput C/T
r <- N/x             # mean residence time (from Little's law: Q = XR)
q <- sum(qc)/max(tc) # estimated queue length for plot


Queue Length
This is a plot of instantaneous queue length à la load average data. This is what queueing fluctuations look like. As I point out in class, they're responsible for the usually complicated math seen in queueing-theory textbooks that can make your head hurt.

The box shown as the red dashed line has the same area as that under the stair-step curve. Since the box has the same width (in time) as the curve, its height of 1.7845 represents the time-averaged queue length based on a sample of 100 time steps out of the total of 100,000 steps. As you'll see in the next section, the height of the box approaches to the steady-state value of Q = 3.00 predicted by PDQ as the simulation time is increased.

PDQ Model
For analytic comparison, we also include the corresponding PDQ-R model in the same script using the online manual for reference.
Init("")
CreateOpen("w",1/Ta)     # arrivals into queue
CreateNode("n",CEN,FCFS) # the M/M/1 queue
SetDemand("n","w",Ts)    # service time
Solve(CANON) 

# Collect individual performance metrics
R <- GetResidenceTime("n","w",TRANS)
Q <- GetQueueLength("n","w",TRANS)
U <- GetUtilization("n","w",TRANS)
X <- GetThruput(TRANS,"w")

Yes, these few lines are equivalent to the above simulation code with instrumentation, and it's guaranteed to be in steady state. Running PDQ, even in R, is essentially instantaneous. The simulation will take longer, but given the plethora of MIPS/core available today, especially on laptops, running simulations in R is entirely feasible.

Results
Finally, we can compare the simulated M/M/1 queue with the corresponding PDQ results. As usual, it's best to break them into inputs and outputs.

  1. Inputs:
    Tsim: 1.00e+05
    Ta: 1.3333, Ts: 1.0000    # times
    Ar: 0.7500, Sr: 1.0000    # rates
    

  2. Outputs:
    Usim: 0.7477, Updq: 0.75
    Xsim: 0.7495, Xpdq: 0.75
    Rsim: 4.0316, Rpdq: 4.00
    Qsim: 3.0219, Qpdq: 3.00

Within the expected limits of precision, we can conclude that the simulation reached steady state during the specified 105 time-steps.

No doubt, I'll go into more detail about doing simulations in R during the upcoming GDAT class in August.

To leave a comment for the author, please follow the link and comment on their blog: Taking the Pith Out of Performance.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)