1. General Behavior of Queueing Systems
In this section, we develop a simulation environment for queueing systems to explore their behavior and visualize queue-length dynamics.
The first aspect to address is system stability. Intuitively, when work arrives faster, on average, than it can be served, the queue grows; when work arrives more slowly, the queue remains close to zero. To illustrate this, \cref{fig:drift} shows queue-length sample paths for several arrival rates and service capacities.
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("whitegrid")
tex_fonts = {
"text.usetex": True,
"font.family": "fourier",
"axes.labelsize": 10,
"font.size": 10,
"legend.fontsize": 8,
"xtick.labelsize": 8,
"ytick.labelsize": 8,
}
plt.rcParams.update(tex_fonts)
After importing the standard modules (cf. \cref{sec:simul-psych-case}), the function below simulates the queue-length process for arrivals \(a=(a_{0}, a_{1}, \ldots, a_{n})\) and capacities \(c = (c_{0}, c_{1}, \ldots, c_{n})\).
We set the initial queue length to L0 and start simulating from time \(1\), so \(a_{0}\) and \(c_{0}\) remain unused.
Jobs arriving in period \(i\) can be served in that same period.
def compute_queue_length(a, c, L0=0):
    "Departures and queue lengths for arrivals, capacities, and init level L0."
    L = np.zeros_like(a)
    d = np.zeros_like(a)
    L[0] = L0
    for i in range(1, len(a)):
        d[i] = min(c[i], L[i - 1] + a[i])
        L[i] = L[i - 1] + a[i] - d[i]
    return d, L
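As a quick illustration (a sketch with made-up numbers), we can feed the function a short arrival pattern and a constant capacity of \(2\) per period, and check the output by hand.
a = np.array([0, 3, 0, 4, 1])  # a[0] is unused
c = np.array([0, 2, 2, 2, 2])  # c[0] is unused
d, L = compute_queue_length(a, c, L0=0)
print(d)  # [0 2 1 2 2]
print(L)  # [0 1 0 2 1]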
Next, we set the parameters for the simulation.
labda, mu = 6, 7 # arrival and service rates
L0 = 40 # starting level of queue
num = 50 # run length
x = np.arange(0, num) # x values for the plot
To illustrate stochastic variability, we plot several sample paths of the queue-length process.
We use ten random seeds, simulate each case, and plot all paths.
For the simulation, we take \(a_{k}\) uniformly distributed on the integers \(\lambda-1, \lambda, \lambda +1\) with \(\lambda=6\), so that \(6\) jobs arrive on average in a period.
Because rng.integers(l, h) draws iid integers uniformly from l up to and including h - 1, we set h = labda + 2.
The constant service capacity \(\mu=\lambda + 1 =7\) makes the system stable.
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(6, 3), sharey=True)
ax1.set_xlabel("time")
ax1.set_ylabel("Queue length")
ax1.set_title("Stable system")
for seed in range(10):
    rng = np.random.default_rng(seed)
    a = rng.integers(labda - 1, labda + 2, size=num)
    c = mu * np.ones_like(a)
    d, L = compute_queue_length(a, c, L0)
    ax1.plot(x, L, linewidth=0.75)
Repeating with \(\mu=5\) yields an unstable system. We finish by writing the figure to a file. \cref{fig:drift} shows the result of our work.
mu = 5
ax2.set_xlabel("time")
ax2.set_title("Unstable system")
for seed in range(10):
    rng = np.random.default_rng(seed)
    a = rng.integers(labda - 1, labda + 2, size=num)
    c = mu * np.ones_like(a)
    d, L = compute_queue_length(a, c, L0)
    ax2.plot(x, L, linewidth=0.75)
fig.tight_layout()
fig.savefig('../figures/queue-discrete-time-stability.pdf')
How about reducing the difference between \(\lambda\) and \(\mu\)? When \(\mu\) barely exceeds \(\lambda\), large fluctuations take longer to clear. To see this in quantitative terms, we run two simulations with smaller values for \(\mu\) while keeping \(\lambda=6\), and we compute the mean and variance of the number of jobs in the system.
We need a sequence of random integers \(c_{k}\) with mean \(\mu=6.5\). One way to make such a sequence is to set \(c_{k} = 6 + b_{k}\), where \(b_{k}\) is a Bernoulli rv with success probability \(p=\mu - 6 = 0.5\).
labda, mu, L0 = 6, 6.5, 0
num = 5000
rng = np.random.default_rng(4)
a = rng.integers(labda - 1, labda + 2, size=num)
c = int(mu) * np.ones_like(a)
p = mu - int(mu) # fractional part of mu
c += rng.binomial(1, p, size=len(c))
print(f"{c.mean()=:.2f}") # just a check
d, L = compute_queue_length(a, c, L0)
print(f"{L.mean()=:.2f}, {L.var()=:.2f}")
c.mean()=6.49
L.mean()=0.45, L.var()=0.64
If we rerun with \(\mu=6.1\), we obtain this.
labda, mu, L0 = 6, 6.1, 0
num = 5000
rng = np.random.default_rng(4)
a = rng.integers(labda - 1, labda + 2, size=num)
c = int(mu) * np.ones_like(a)
p = mu - int(mu) # fractional part of mu
c += rng.binomial(1, p, size=len(c))
print(f"{c.mean()=:.2f}") # just a check
d, L = compute_queue_length(a, c, L0)
print(f"{L.mean()=:.2f}, {L.var()=:.2f}")
c.mean()=6.10
L.mean()=4.36, L.var()=25.30
Here \(\E L\) stands for the sample average \(n^{-1}\sum_{k=1}^{n} L_{k}\), and likewise for \(\V L\).\footnote{We are in a state of sin here: the mean of a rv is not the same as its sample average. The strong law of large numbers makes a statement only about the limit; for small sample sizes, the difference can be quite large.} As expected, the mean queue length rises when capacity tightens.
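To illustrate how much the sample average can vary, here is a small sketch that reruns the \(\mu=6.1\) case for a few seeds (using a separately named generator, rng_check, so the state of rng used above is left untouched); the spread of the printed sample averages gives an impression of the estimation error.
for seed in range(5):
    rng_check = np.random.default_rng(seed)
    a = rng_check.integers(labda - 1, labda + 2, size=num)
    c = int(mu) * np.ones_like(a)
    c += rng_check.binomial(1, mu - int(mu), size=len(c))
    d, L = compute_queue_length(a, c, L0)
    print(f"{seed=}, {L.mean()=:.2f}")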
Would variability affect the average queue length? We can suspect this because when \(a_{k}=1\) and \(c_{k}=1\) for all \(k\), and \(L_{0} = 0\), then \(L_{k} = 0\) for all \(k\). If instead \(c_{k}=5\) when \(k\) is a multiple of \(5\) and \(0\) otherwise, queues do appear and disappear.\footnote{Observe that the average service capacity is still \(1\) per period; only the periods in which service is offered have changed.} Let us use simulation to see how variability can influence the queue-length process. The goal is now to make \cref{fig:scv}.
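Before turning to \cref{fig:scv}, here is a tiny check of this thought experiment (a sketch using compute_queue_length from above).
a = np.ones(11, dtype=int)
c = np.ones(11, dtype=int)
d, L = compute_queue_length(a, c)
print(L)  # all zeros: the queue never builds up
c = np.array([5 if k % 5 == 0 else 0 for k in range(11)])  # capacity 5 every fifth period
d, L = compute_queue_length(a, c)
print(L)  # the queue builds up and drains every five periods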
We first need a way to quantify relative variability. Suppose the standard deviation of the service time is \(1\) minute. If the mean service time is 1 hour, we are inclined to call the service times very regular, while if the mean is just 1 minute, we would call them irregular. To capture this relative effect, we define the \recall{square coefficient of variation (scv)} of the rv \(X\) as\footnote{The scv is unit-less.}
\begin{equation*}
C^2_{X}:= \frac{\V X}{(\E X)^2}.
\end{equation*}
Continuing with the simulation, we seek a probability distribution for \(\{c_{k}\}\) with mean \(\mu\) and scv \(c^{2}\). A simple way to achieve this is to use a rv \(X\) such that \(\P{X=b} = p\) and \(\P{X=0} = 1-p\), and require that \(\E X = p b= \mu\) and \(C^2_X = c^{2}\). Using \(\V X = \E{X^{2}} - (\E X)^{2}\), we see that \(C^{2}_{X} = \E{X^{2}}/(\E X)^{2} - 1\). Rewriting, we get \(\E{X^{2}} = \mu^{2}(c^2+1)\), but we also know that \(\E{X^{2}} = p b^{2}\) for this scaled Bernoulli rv. Solving for \(b\) and \(p\) in terms of \(\mu\) and \(c^{2}\) gives \(p = 1 / (1 + c^{2})\) and \(b=\mu/p = \mu (1+c^{2})\).
The next function implements the construction of a list of rvs with the required properties.
def make_vector_mean_scv(mu, scv, num):
    "vector of iid rvs with a given mean mu and scv."
    p = 1 / (1 + scv)
    b = mu / p
    vec = b * rng.binomial(1, p, size=num)
    return vec
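As a sanity check (a small sketch that draws from the module-level rng used inside the function), we can verify that the sample mean and scv of such a vector are close to the targets.
vec = make_vector_mean_scv(mu=6.5, scv=2, num=10000)
mean, scv_est = vec.mean(), vec.var() / vec.mean() ** 2
print(f"{mean=:.2f}, {scv_est=:.2f}")  # should be close to 6.5 and 2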
We simulate for \(c^2=0.5, 1, 2\) and with constant arrivals.
num = 5000
x = np.arange(0, num)
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(6, 3), sharey=True)
ax1.set_title("Constant arrivals")
ax1.set_xlabel("time")
ax1.set_ylabel("Queue length")
a = make_vector_mean_scv(mu=6, scv=0, num=num) # constant arrivals
for scv in (2, 1, 0.5):
    c = make_vector_mean_scv(mu=6.5, scv=scv, num=num)
    d, L = compute_queue_length(a, c, L0)
    ax1.plot(x, L, linewidth=0.5, label=f"$c^2 = {scv}$")
ax1.legend()
We repeat this to see the effect of variability in both the arrivals and services. \cref{fig:scv} shows that higher variability yields larger queue-length fluctuations.
ax2.set_xlabel("time")
ax2.set_title("Both variable")
for scv in (2, 1, 0.5):
    a = make_vector_mean_scv(mu=6, scv=scv, num=num)
    c = make_vector_mean_scv(mu=6.5, scv=scv, num=num)
    d, L = compute_queue_length(a, c, L0)
    ax2.plot(x, L, linewidth=0.5, label=f"$c^2 = {scv}$")
fig.tight_layout()
fig.savefig('../figures/queue-discrete-time-scv.pdf')
In conclusion, tight capacity or high relative variability leads to large queues. To improve performance, we have basically four options: increase capacity, reduce the load (or demand), reduce the scv of the arrivals, or reduce the scv of the services.\footnote{Of course, these options are not exclusive; for example, we can try to reduce the variability of the service process and add some capacity at the same time.}
Finally, we consider queues in sequence. For example, obtaining grades for a course involves several sequential steps. First, I grade the exams (and handle all changes that might occur during the review). I send a csv file with the grades to the secretary, who uploads it. The secretary prints the file for my signature, after which it is logged by the administration. In abstract terms, there are three stations (me, the secretary, and the back office), and I am involved at least twice: once for the grading and a second time for the signing. More generally, most service systems consist of multiple stations that have to be visited\footnote{Perhaps even multiple times.} to complete a service.
To gain insight into such systems, we simulate a chain of \(4\) stations in tandem, which means that jobs arrive at station 1, are then sent to station 2, and so on, until they leave the network after station \(4\). Jobs neither revisit a station nor leave the network prematurely.
We generate arrivals at station 1 and service capacities at each of the stations. Note that station 2 has the slowest server.\footnote{Even though a Poisson distributed rv can have a fractional rate, the values of the rv are integer.}
labda, L0 = 6, 0
num = 5000
x = np.arange(num)
rng = np.random.default_rng(3)
a = rng.integers(labda - 1, labda + 2, size=num)
c1 = rng.poisson(labda + 1, num)
c2 = rng.poisson(labda + 0.1, num)
c3 = rng.poisson(labda + 1.5, num)
c4 = rng.poisson(labda + 1.0, num)
It remains to make the plots.
fig, ax = plt.subplots(figsize=(5, 3))
ax.set_xlabel("time")
ax.set_ylabel("Queue length")
d1, L1 = compute_queue_length(a, c1)
ax.plot(x, L1, linewidth=0.5, label="Q1")
d2, L2 = compute_queue_length(d1, c2)
ax.plot(x, L2, linewidth=0.5, label="Q2")
d3, L3 = compute_queue_length(d2, c3)
ax.plot(x, L3, linewidth=0.5, label="Q3")
d4, L4 = compute_queue_length(d3, c4)
ax.plot(x, L4, linewidth=0.5, label="Q4")
plt.legend()
fig.tight_layout()
fig.savefig('../figures/queue-discrete-network.pdf')
\cref{fig:tandem} shows that the station with the slowest server typically has the longest queue. This brings us to an interesting idea: to improve performance, invest in the bottleneck. In practice, identifying the bottleneck can be difficult, because service capacities are hard to estimate.\footnote{The capacity may depend on the person at the station on a certain day. There can also be disturbances due to missing raw materials or machine breakdowns. In short, in practical situations, there are many messy details that disturb the assembly of high-quality data.} However, often the station with the largest queue is the bottleneck.
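As a numeric complement to the figure, the following small sketch (reusing L1 to L4 from the simulation above) prints the time-average queue length per station; the largest value points to the bottleneck.
for name, Lq in [("Q1", L1), ("Q2", L2), ("Q3", L3), ("Q4", L4)]:
    print(f"{name}: average queue length = {Lq.mean():.2f}")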
A general queueing system is stable when the arrival rate is not higher than the service rate.
Solution
False. The condition allows \(\lambda = \mu\), in which case stochastic queues are not stable.
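To see this, here is a small simulation sketch (an illustration with \(\lambda=\mu=6\), reusing compute_queue_length from above): with zero drift, the sample averages typically keep growing with the run length instead of settling down.
import numpy as np

rng = np.random.default_rng(3)
for num in (1000, 10000, 100000):
    a = rng.integers(5, 8, size=num)  # 5, 6, or 7 arrivals, mean 6
    c = 6 * np.ones_like(a)  # constant capacity 6, so lambda = mu
    d, L = compute_queue_length(a, c, L0=0)
    print(f"{num=}, {L.mean()=:.1f}, max queue={L.max()}")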
If we find the average queue length too large, then we must increase the average service rate to decrease the average queue length.
Solution
False. We can also block jobs, or reduce the variability of the arrival or service process.
We consider a discrete-time queueing system with \(a_k\) the number of arrivals in period \(k\), and \(c_k\) the service capacity. Let \(d_k = \min\{L_{k-1}, c_k\}\), \(L_k = L_{k-1} -d_k + a_k\). Take \(a_k \sim \Pois{2}\), \(c_k\sim \Pois{1}\), and \(L_0 = 0\). Claim: \(\P{L_{1000} \geq 100} \leq 1/2\).
Solution
False. Since, on average, two jobs arrive per period and one leaves, the queue length increases on average by one job per period. After 1000 periods, the system must contain about 1000 jobs. As the Poisson distribution has a small relative variability (if \(X\sim \Pois{\lambda}\), then \(\V X / (\E X)^{2} = 1/\lambda\)), the probability that \(L_{1000}< 100\) is exceedingly small.
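A quick simulation sketch of this argument (note that, following the exercise, the departures are taken before the arrivals of the period are added):
import numpy as np

rng = np.random.default_rng(3)
num = 1001  # periods 0, 1, ..., 1000
a = rng.poisson(2, size=num)
c = rng.poisson(1, size=num)
L = np.zeros(num, dtype=int)
for k in range(1, num):
    d = min(L[k - 1], c[k])
    L[k] = L[k - 1] - d + a[k]
print(L[1000])  # close to 1000, hence far above 100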
The scv of a rv \(X\) is the same as the scv of \(\alpha X\) when \(\alpha\) is a positive scalar.
Solution
True. Since \(\V{\alpha X} = \alpha^{2} \V X\) and \((\E{\alpha X})^{2} = \alpha^{2} (\E X)^{2}\), the factor \(\alpha^{2}\) cancels in the ratio.
The parking meters at the universities are redesigned. A study has shown that the average amount of time individuals need to pay at a parking meter has decreased. Claim: As long as the inter-arrival time distribution stays the same, the average queue length at the parking meter will be smaller.
Solution
False. The variability of the payment times also affects the queue length.
Claim: When \(X \sim \Exp{\lambda}\) its scv exceeds 1.
Solution
False. For \(X \sim \Exp{\lambda}\), we have \(\V X = 1/\lambda^{2}\) and \((\E X)^{2} = 1/\lambda^{2}\), so the scv equals \(1\) and does not exceed it.
Consider a queue with an exponential inter-arrival distribution and a uniform service time distribution on \([0,A]\). Claim: \(C_s^2 = \frac{1}{3}\).
Solution
True. For a service time \(S\) uniform on \([0,A]\), \(\V S = A^{2}/12\) and \((\E S)^{2} = A^{2}/4\), hence \(C_s^2 = 1/3\).
A queueing system is under periodic review, that is, at the end of each period the queue length is measured. Jobs arriving in period \(k\) cannot be served in period \(k\), and the system cannot contain more than \(K\) jobs; in other words, jobs are blocked when the system is too full. Develop code to simulate \(\{L_k\}\) and compute the number of jobs lost per period and the fraction of jobs lost after simulating for \(T\) periods.
Solution
Jobs that would push the queue length above \(K\) must be dropped.
import numpy as np
rng = np.random.default_rng(3)
a = rng.integers(3, 8, size=100)
c = 2 * np.ones_like(a)
L = np.zeros_like(a)
l = np.zeros_like(a)  # store the lost jobs
K = 8  # loss level
for k in range(1, len(a)):
    d = min(L[k - 1], c[k])
    Lp = L[k - 1] + a[k] - d  # queue length if there were no loss
    L[k] = min(Lp, K)  # chop off at K
    l[k] = Lp - L[k]  # number lost in period k
print(l.mean())  # average number of jobs lost per period
print(sum(l) / sum(a))  # fraction of jobs lost
All items produced by a machine are tested. With probability \(p\) an item turns out to be defective and is put at the head of the queue. The production times of new and faulty items are equal. Supposing that items cannot be served in the period in which they arrive, develop a set of recursions to simulate \(\{L_k\}\).
Hint
When the machine makes \(n\) items, and each of these is faulty with probability \(p\), then the number of faulty items is \(\Bin{n, p}\).
Solution
Here is the code.
import numpy as np
rng = np.random.default_rng(3)
a = rng.integers(3, 8, size=100)
c = 7 * np.ones_like(a)
L = np.zeros_like(a)
d = np.zeros_like(a)
p = 0.1
for k in range(1, len(a)):
    produced = min(L[k - 1], c[k])
    faulty = rng.binomial(produced, p)
    d[k] = produced - faulty
    L[k] = L[k - 1] + a[k] - d[k]
print(L.mean())
Observe that faulty items remain in the system.
An interesting challenge: Can you use these recursions to \emph{prove} that the long-run average service capacity \(n^{-1}\sum_{k=1}^n c_k\) must be larger than \(\gamma/(1-p)\), where \(\gamma = \lim_{n\to \infty} n^{-1}\sum_{k=1}^n a_k\) is the arrival rate of new jobs?
Sometimes items need re-work. Suppose a fraction \(p\) of items does not meet the quality requirements after the first pass at a machine, but requires a second pass to repair the problems. Assume that repair jobs need half the service time of new jobs and are served with priority over new jobs. Develop the recursions.
Hint
Make a queue for new and repaired items and use \cref{ex:l-117}.
Solution
In code:
a = [0, 4, 8, 2, 1]
c = [3] * len(a)
L_R = [0] * len(a)  # repair jobs
d_R = [0] * len(a)  # departing repair jobs
L_N = [0] * len(a)  # new jobs
d_N = [0] * len(a)  # departing new jobs
p = 0.2
L_R[0] = 2
L_N[0] = 8
for k in range(1, len(a)):
    l = L_R[k - 1]
    if l % 2 == 1:
        l -= 1
    d_R[k] = min(l, 2 * c[k])
    c_N = c[k] - d_R[k] / 2  # capacity left for new jobs
    d_N[k] = min(L_N[k - 1], c_N)
    a_R = int(p * d_N[k] + 0.5)  # rounding
    a_N = a[k]
    L_R[k] = L_R[k - 1] + a_R - d_R[k]
    L_N[k] = L_N[k - 1] + a_N - d_N[k]
print(L_R)
print(L_N)
We need a trick to get around the division by 2 in the computation of c_N: if L_R[k - 1] is odd, then d_R[k] can be odd, and d_R[k] / 2 would end in 0.5.
For this, we check whether L_R[k - 1] is odd.
If so, it must be at least 1, and we can subtract 1 to make \(l\) even.
This \(l\) is then used to compute the number of departing repair jobs.
The above code is easily converted to mathematics; only the lines that make \(l\) even might seem a bit problematic, but that is also easy. By writing the departures of the repair jobs as
\begin{equation} \label{eq:20} d_{R,k} = 2 \min\{\lfloor L_{R,k-1}/2\rfloor, c_{k}\}, \end{equation}
we obtain the effect of the evenness trick and the capacity constraint in one step.
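A quick brute-force check (a small sketch) that this formula indeed reproduces the evenness trick used in the code:
for L_prev in range(20):  # plays the role of L_R[k - 1]
    for cap in range(6):  # plays the role of c[k]
        l = L_prev - 1 if L_prev % 2 == 1 else L_prev  # evenness trick
        assert min(l, 2 * cap) == 2 * min(L_prev // 2, cap)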
As a general observation: there are many ways to organize the repairs. Do we serve them with priority or not? Do we bin them and make new items instead? Do we do the repairs at another, separate, station? Such differences should be reflected in a simulation model.
Consider a network with two production stations in tandem, i.e., all jobs from station 1 move on to station 2, and with blocking, i.e., station 1 is not allowed to produce more than the amount \(M\) that station \(2\) can maximally contain. Assume that jobs moving from station 1 to station 2 arrive at the end of a period (hence cannot be served in that period) and must fit at the second station before any jobs from station 2 leave. Find recursions.
Hint
In addition to the regular conditions, we need to impose \(c^1_{k+1} \leq M-L^2_k\).
Solution
Using \cref{ex:l-118} with \(d^1_k = \min\{L^1_{k-1}, c_k^1, M-L^2_{k-1}\}\) is nearly correct. But what if \(L^2_{k-1}>M\) for the first few periods? Therefore, instead take \(d^1_k = \min\{L_{k-1}^1, c_k^1, [M-L^2_{k-1}]^+\}\).
a1 = [0, 2, 3, 8, 0, 9]
c1 = [3] * len(a1)
c2 = [2] * len(a1)
L1 = [0] * len(a1)
L2 = [0] * len(a1)
M2 = 10
if L2[0] > M2:
    print("The starting value of L2 is too large")
    exit(0)
for k in range(1, len(a1)):
    d1 = min(L1[k - 1], c1[k], M2 - L2[k - 1])
    L1[k] = L1[k - 1] + a1[k] - d1
    d2 = min(L2[k - 1], c2[k])
    L2[k] = L2[k - 1] + d1 - d2
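A simple sanity check (an optional addition) confirms that station 2 never exceeds its limit:
assert max(L2) <= M2, "blocking failed: station 2 exceeded its limit"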
Consider the discrete-time model specified by \cref{eq:31}. We assume that \(a_{k}\) is a \emph{batch} of jobs arriving in period \(k\) after the departures \(d_k\) have left. Provide a simulation to estimate \(\P{\L\leq m}\) for a situation in which the first job of the batch sees \(L_{k-1} - d_k\) jobs in the system, the second sees \(L_{k-1}-d_k + 1\) jobs, and so on.
Solution
The idea is as follows.
The dictionary L_count counts how many arriving jobs see \(0\) jobs in the system, how many see \(1\), and so on.
Here is the code. As an aside, it was hard to make it this simple.
from collections import defaultdict
L_count = defaultdict(int)
a = [0, 2, 5, 1, 2]
c = [0, 1, 1, 1, 1]
d = [0] * len(a)
L = [0] * len(a)
for k in range(1, len(a)):
    d[k] = min(L[k - 1], c[k])
    L[k] = L[k - 1] + a[k] - d[k]
    for i in range(a[k]):
        L_count[L[k - 1] - d[k] + i] += 1
# normalize
tot = sum(L_count.values())
L_dist = {k: v / tot for k, v in L_count.items()}
print(L_count)
print(L_dist)
defaultdict(<class 'int'>, {0: 1, 1: 2, 2: 1, 3: 1, 4: 1, 5: 3, 6: 1})
{0: 0.1, 1: 0.2, 2: 0.1, 3: 0.1, 4: 0.1, 5: 0.3, 6: 0.1}
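The exercise asks for \(\P{\L\leq m}\); a small final step (a sketch) converts the estimated distribution L_dist into its cumulative version.
L_cdf = {}
cum = 0
for m in sorted(L_dist):
    cum += L_dist[m]
    L_cdf[m] = cum
print({m: round(v, 2) for m, v in L_cdf.items()})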
This work is licensed by the University of Groningen under a
Creative Commons Attribution-ShareAlike 4.0 International License.