1. Simulating Waiting Times
Using the recursions of \cref{sec:constr-gg1-queu}, we can simulate the waiting time of the \(G/G/1\) queue. We use this to analyze how various blocking strategies can prevent long waiting times. In passing we will encounter subtle problems with stochastic dependence and with (plotting) pmfs.
The dynamics of the single-server queue in continuous time is given by the recursions in \cref{eq:qc-3}.
The corresponding code implements the formulas of \cref{sec:constr-gg1-queu} to compute the departure times \(D\), the sojourn times \(J\), the waiting times \(W\), and the time-average number of jobs in the system from \cref{eq:50}.
We set X[0] = 0 so that A[0] = 0.
Since job \(0\) is not served, we set S[0] = 0.
import numpy as np
from functions import Plus
rng = np.random.default_rng(3)
labda, mu = 1, 1.1
num = 10000
X = rng.exponential(scale=1 / labda, size=num)
S = rng.exponential(scale=1 / mu, size=num)
X[0] = S[0] = 0
A = X.cumsum()
D = np.zeros_like(X)
for i in range(1, num):
D[i] = max(A[i], D[i - 1]) + S[i]
J = D - A # sojourn times
W = J - S # waiting times
L_tilde = J.sum() / D[-1] # time-average number in system
print(f"{J.mean()=:.2f}, {J.std()=:.2f}")
print(f"{W.mean()=:.2f}, {W.std()=:.2f}")
print(f"{L_tilde=:0.3f}")
J.mean()=10.11, J.std()=9.03
W.mean()=9.20, W.std()=8.99
L_tilde=10.167
There is a small point of concern.
Calling J.mean() computes the mean sojourn time, but this includes job 0.
Since the simulation excludes this job, the correct mean is J.mean() * len(J) / (len(J) -1).
Here we neglect this.1
Be aware, in critical software you should not use such tricks.
We can also compute the maximum waiting time and the fraction of jobs that have to wait longer than \(T=5\).
The expression (W >= T) returns a Boolean array whose \(i\) element equals \(\1{W_{i} \geq T}\), so (W >= T) gives the total number of jobs with long waits, and (W >= T).mean() the fraction.
We see that the maximum waiting time is indeed much larger than \(T\), and that more than half of the jobs have to wait a long time.
T = 5 # threshold for long waiting
print(f"{W.max()=:.2f}, {(W >= T).mean()=:.2f}")
W.max()=52.12, (W >= T).mean()=0.59
Blocking jobs can be a remarkably effective strategy to prevent long and variable waiting times. One rule, known as complete acceptance, only accepts a job when \(W_k \leq T\), and otherwise blocks the job.
Here is a hack to implement this rule. When \(W_k > T\), job \(k\) should not enter. That means that its service time should not be added to the waiting time, which can be achieved by setting \(S_k=0\).
W = np.zeros_like(S)
for k in range(1, len(S)):
W[k] = Plus(W[k - 1] + S[k - 1] - X[k])
S[k] = S[k] if W[k] <= T else 0
Such code is undesirable, it modifies the primary data.2 Here the service times. Consequently, these data cannot be reused later.3 Golden rule: never modify primary data.
A better approach uses an indicator \(I_k\).
When \(W_k \geq T\), set \(I_k=0\) (reject); otherwise \(I_k=1\) (accept).
Note that job \(k\) depends on I[k-1].
#+begin_src python :results none
W = np.zeros_like(S) # Reset waiting times
I = np.zeros_like(S, dtype=int) # Accept or not
for k in range(1, len(S)):
W[k] = Plus(W[k - 1] + S[k - 1] * I[k - 1] - X[k])
I[k] = 1 if W[k] <= T else 0
Under this rule, about 90% of the jobs is accepted (I.mean() is 0.9), yet \(\E W\) drops by a factor 4, and the maximum by more than a factor 10.
print(f"{W.mean()=:.2f}, {I.mean()=:.2f}")
print(f"{W[I == 1].mean()=:.2f}, {W[I==1].max()=:.2f}")
print(f"{len(W)=}, {len(W[I == 1])=}")
W.mean()=2.23, I.mean()=0.90
W[I == 1].mean()=1.83, W[I==1].max()=5.00
len(W)=10000, len(W[I == 1])=8994
There are other interesting rules to block jobs. A second rule, complete rejection, accepts only if \(J_k \leq T\).
W = np.zeros_like(S)
I = np.zeros_like(S, dtype=int) # Accept or not
for k in range(1, len(S)):
W[k] = Plus(W[k - 1] + S[k - 1] * I[k - 1] - X[k])
I[k] = 1 if W[k] + S[k] <= T else 0
This is an interesting result. Compared with complete acceptance, the acceptance probability is higher because jobs never occupy the server longer than \(T=5\).4 In other words, only small jobs are accepted.
print(f"{W.mean()=:.2f}, {I.mean()=:.2f}")
print(f"{S.mean()=:.2f}, {S[I == 1].mean()=:.2f}")
print(f"{S.max()=:.2f}, {S[I == 1].max()=:.2f}")
W.mean()=1.42, I.mean()=0.94
S.mean()=0.91, S[I == 1].mean()=0.81
S.max()=9.41, S[I == 1].max()=4.92
A third rule, partial acceptance, keeps a middle ground: it accepts only the portion of a job’s service time that fits, that is, \(\min\{T - W_{k}, S_{k}\}\).
W = np.zeros_like(S)
J = np.zeros_like(S) # Sojourn times
accepted_service = np.ones_like(S)
for k in range(1, len(S)):
W[k] = Plus(W[k - 1] + accepted_service[k - 1] - X[k])
accepted_service[k] = min(T - W[k], S[k])
J[k] = W[k] + accepted_service[k]
print(f"{W.mean()=:.2f}, {J.mean()=:.2f}")
print(f"{accepted_service.mean()=:.2f}")
print(f"{accepted_service.max()=:.2f}")
W.mean()=1.84, J.mean()=2.64
accepted_service.mean()=0.80
accepted_service.max()=5.00
We next show how to obtain the \emph{distribution} of the waiting time and departure time of job \(k=30\).5 Job \(k=30\) is chosen arbitrarily. We use both the probabilistic arithmetic of \cref{sec:prob_arithmetic} and simulation.
import numpy as np
import matplotlib.pyplot as plt
import random_variable as rv
from latex_figures import apply_figure_settings
apply_figure_settings(True)
The inter-arrival and service distributions are as follows. We choose \(\E S > \E X\) to get more widely spread values in the figures.
X = rv.RV({3: 1 / 3, 4: 1 / 3, 5: 1 / 3})
S = rv.RV({4: 1 / 3, 5: 1 / 3, 7: 1 / 3})
print(f"{X.mean()=:0.2f}, {S.mean()=:0.2f}")
X.mean()=4.00, S.mean()=5.33
We set the initial arrival such that \(\P{A_{0} = 0}=1\).
The first call A += X computes the pmf of the rv \(A_{1}\).
Since the system starts empty, \(\P{W_{1} = 0} = 1\).
In the for loop, we update \(A_{k}\) according to the rule \(A_{k} = A_{k-1} + X_{k}\) and the waiting time with \(W_{k} = [W_{k-1} + S_{k-1} - X_{k}]^{+}\).
We start the for loop at \(2\), because the lines before the loop compute the pmf of \(A_{1}\) and \(W_{1}\).
We later explain why the line computing \(D\) is \emph{wrong}.
horizon = 30
A = rv.RV({0: 1}) # Arrival time A_0
A += X # Arrival time A_1
W = rv.RV({0: 1}) # Waiting time W_1
# Mind that we start at 2 instead of 1
for i in range(2, horizon):
A += X
W = rv.apply_function(Plus, W + S - X)
D = A + W + S # These are not the departure times!
print(f"{W.mean()=:0.2f}, {W.std()=:0.2f}")
print(f"{D.mean()=:0.2f}, {D.std()=:0.2f}")
W.mean()=37.51, W.std()=7.82
D.mean()=158.84, D.std()=9.06
Next, we use simulation to estimate the pmf and cdf of \(W_{30}\) and \(D_{30}\).
The printed values show that the mean and the standard deviations for the waiting time match well between theory and simulation.
However, the standard deviation of d30 differs noticeably, indicating an issue.
num_runs = 1000
w30 = np.zeros(num_runs)
d30 = np.zeros(num_runs)
rng = np.random.default_rng(3)
for run_no in range(num_runs):
x = X.rvs(horizon, rng)
s = S.rvs(horizon, rng)
x[0] = s[0] = w = 0
for i in range(1, horizon):
w = Plus(w + s[i - 1] - x[i])
w30[run_no] = w
a = x.cumsum()
d30[run_no] = a[i] + w + s[i]
print(f"{w30.mean()=:0.2f}, {w30.std()=:0.2f}")
print(f"{d30.mean()=:0.2f}, {d30.std()=:0.2f}")
w30.mean()=37.36, w30.std()=7.61
d30.mean()=158.74, d30.std()=6.45
We now make a plot of the cdf and pmf of \(W_{30}\), see 1.
The keyword cumulative=True converts a histogram into a cdf.6
Recall that a histogram approximates a pmf.
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(6, 3))
ax1.set_title("Waiting times cdf")
ax1.set_xlabel("Time")
ax1.plot(
W.support(), [W.cdf(k) for k in W.support()], c='k', lw=0.75, label="exact"
)
ax1.hist(
w30,
bins=30,
density=True,
cumulative=True,
histtype='step',
align="right",
color='k',
lw=0.5,
label="sim",
)
ax1.legend()
ax2.set_title("Waiting times pmf")
ax2.set_xlabel("Time")
ax2.hist(
w30,
bins=30,
density=True,
histtype='step',
align="right",
color='k',
lw=0.7,
label="sim",
)
x = W.support()
y = [W.pmf(k) for k in W.support()]
ax2.plot(x, y, color='k', lw=0.75, label="exact")
ax2.legend()
fig.tight_layout()
fig.savefig("../figures/waiting_time_distribution.pdf")
./../figures/waiting_time_distribution.pdf
Let us now discuss why the theoretical \(D_{30}\) and simulated d30 have different standard deviations.
The reason is that \(A_k\) and \(W_k\) are negatively correlated.
Recall that \(A_k = A_{k-1}+X_k\) and \(W_k = [W_{k-1} + S_{k-1} - X_k]^+\).
If \(X_k\) is very large, \(A_k\) is large, but \(W_k\) is zero (or small), and then \(D_k \approx A_{k} + S_{k}\) .
In probabilistic arithmetic addition, of two rvs is only allowed when they are independent.7
Handling such dependencies requires the modeling of the joint distribution of rvs.
The python package lea supports this.
The expression D = A + W + S violates this assumption.
But why then is the recursion \(W_k = [W_{k-1} + S_{k-1} - X_k]^{+}\) correct?
This is because \(W_{k-1}\) \emph{is} independent of \(S_{k-1}\) and \(X_{k}\).
Stochastic dependence is easily overlooked. To avoid making mistakes, check analytical results with simulation. If the results disagree, there is probably an error8 Often not just one error! somewhere.
In general, the cdf behaves more regularly than the pmf, even though the pmf seems easier to interpret. To show this, modify the support of \(X\) to \(\{3, 4.1, 5\}\), keep all else the same, and simulate. The resulting pmf in 2 differs markedly from that in 1!
./../figures/waiting_time_distribution_pmf_4.1.pdf
X = rv.RV({4.1: 1 / 3, 5: 1 / 3, 3: 1 / 3})
S = rv.RV({4: 1 / 3, 5: 1 / 3, 7: 1 / 3})
horizon = 30
A = rv.RV({0: 1}) # arrival time
W = rv.RV({0: 1}) # waiting time
for i in range(1, horizon):
A += X
W = rv.apply_function(Plus, W + S - X)
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(6, 3))
ax1.set_title("Waiting times cdf")
ax1.set_xlabel("Time")
ax1.plot(
W.support(), [W.cdf(k) for k in W.support()], c='k', lw=0.75, label="exact"
)
ax2.set_title("Waiting times pmf")
ax2.set_xlabel("Time")
ax2.plot(
W.support(), [W.pmf(k) for k in W.support()], c='k', lw=0.75, label="exact"
)
fig.tight_layout()
fig.savefig("../figures/waiting_time_distribution_pmf_4.1.pdf")
We finish this section with coding the multi-server queue of \cref{eq:41}. We take \(\mu(i)=1\) for all servers.
import numpy as np
c = 3
N = 10
one = np.ones(c, dtype=int) # vector with ones
mu = np.array([1, 1, 1])
X = np.ones(N + 1, dtype=int)
S = 5 * np.ones(N, dtype=int)
w = np.zeros(c, dtype=int)
W = J = A = D = 0
for k in range(1, N):
A += X[k] # arrival time
s = w.argmin() # server chosen
W = w[s] # waiting time
S = S[k] / mu[s]
J = W + S # sojourn time
D = A + J # departure time
print(k, S, W, w)
w[s] += S # update w
w = np.maximum(0, w - X[k + 1] * one)
Claim: This code prints the mean departure time.
import numpy as np
rng = np.random.default_rng(3)
labda, mu = 3, 4
num = 10
X = rng.exponential(scale=1 / labda, size=num)
Z = rng.exponential(scale=1 / mu, size=num)
X[0] = Z[0] = 0
A = X.cumsum()
D = np.zeros_like(X)
for i in range(1, num):
D[i] = max(A[i], D[i - 1]) + Z[i]
print(D.mean())
Solution
Solution, for real
False. The correct quantity is \texttt{D[1:].mean()}.
Consider a continuous-time queue with inter-arrival times \(\{X_k\}\), service times \(\{S_k\}\), waiting times \(\{W_k\}\), and sojourn times \(\{J_k\}\). Claim: we can compute the waiting times and sojourn times using the recursion: \[ W_k = [J_{k-1} - X_k]^+, \quad J_k = W_k + S_k. \]
Solution
Solution, for real
True.
Claim: This code uses the recursion \(D_k = \max\{D_{k-1}, A_k\} + S_k\) in the correct way to compute the probability distribution of \(\{D_k\}\).
A = rv.RV({0: 1})
D = rv.RV({0: 1})
for i in range(1, horizon):
A += X
D = rv.compose_function(max, D, A) + S
Solution
Solution, for real
False, it neglects the dependence between \(A_k\) and \(D_{k-1}\).
Claim: In the \(G/G/c\) queue, the next rule selects the queue with the smallest sojourn time.
for k in range(1, N):
A += X[k] # arrival time
s = w.argmin() # server chosen
W = w[s] # waiting time
S = S[k] / mu[s]
J = W + S # sojourn time
D = A + J # departure time
print(k, S, W, w)
w[s] += S # update w
w = np.maximum(0, w - X[k + 1] * one)
Solution
Solution, for real
False. It picks the server with the smallest waiting time. When the service rates are different, the sojourn time is also affected by the speed of the server.
This work is licensed by the University of Groningen under a
Creative Commons Attribution-ShareAlike 4.0 International License.