In this section we analyze the correctness of \eqref{org8619836} and \eqref{orgf33074a} for continuous memoryless rvs, i.e., exponentially distributed rvs.
I decided to analyze this in as much detail as I could, hoping that this would provide a lead on how to generalize the equation \(\E M = \E L + \E X\) so that it also covers the case of geometric rvs.
First we need to recall some basic facts about the exponential distribution.
Show that \(X\) is memoryless.
Solution
With \(t \geq s\),
\begin{equation*}
\P{X > t|X > s} = \frac{\P{X > t}}{\P{X > s}} = e^{-\lambda t} e^{\lambda s} = e^{-\lambda(t-s)} = \P{X > t - s}.
\end{equation*}
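Numerically, the memoryless property is easy to probe by simulation. Below is a minimal sketch in Python; the rate \(\lambda=2\), the points \(s, t\), and the sample size are arbitrary choices, and note that numpy parametrizes the exponential by its mean \(1/\lambda\).
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
lam = 2.0  # arbitrary rate
X = rng.exponential(scale=1 / lam, size=1_000_000)

s, t = 0.4, 1.0  # arbitrary points with t >= s
lhs = (X > t).mean() / (X > s).mean()  # estimates P{X > t | X > s}
print(lhs, (X > t - s).mean())         # both approximate e^{-lam (t - s)}
\end{verbatim}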
Show that \(\E X = 1/\lambda\).
Hint
Use partial integration.
Solution
It is essential that you know both methods to solve this integral.
\begin{align*}
\E X
&= \int_{0}^{\infty} x f_{X}(x) \d x \\
&= \int_{0}^{\infty} x \lambda e^{-\lambda x} \d x \\
&= - xe^{-\lambda x} \biggr|_{0}^{\infty} + \int_{0}^{\infty} e^{-\lambda x} \d x \\
&= - \frac 1 \lambda e^{-\lambda x} \biggr|_{0}^{\infty} = \frac 1 \lambda.
\end{align*}
Substitution is also a very important technique to solve such integrals. Here we go again:
\begin{align*}
\E X
&= \int_{0}^{\infty} x f_{X}(x) \d x \\
&= \int_{0}^{\infty} x \lambda e^{-\lambda x} \d x \\
&= \frac 1 \lambda \int_{0}^{\infty} u e^{- u} \d u,
\end{align*}
by the substitution \(u=\lambda x \implies \d u = \d (\lambda x) \implies \d u = \lambda \d x \implies \d x = \d u / \lambda\).
With partial integration (do it!), the integral evaluates to \(1\).
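As a quick sanity check (a sketch with arbitrary rate and sample size), we can also estimate \(\E X\) by simulation and compare it with \(1/\lambda\).
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
lam = 2.0
X = rng.exponential(scale=1 / lam, size=100_000)  # scale is the mean 1/lambda
print(X.mean(), 1 / lam)                          # both close to 0.5
\end{verbatim}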
Now we can shift our attention to the rvs \(L\) and \(M\).
Show that \(F_L(x) = 1-e^{-2\lambda x}\).
Hint
Use independence and the property of the minimum \(L\) that \(\{L > x\} \iff \{X > x, Y > x\}\).
Solution
With the hint,
\begin{align*}
G_L(x) = \P{L > x} &= \P{X > x, Y > x} = \P{X > x}\P{Y > x} = G_X(x)^{2} = e^{-2\lambda x}.
\end{align*}
The result follows since \(F_{L}(x) = 1-G_{L}(x)\).
Clearly, this implies that \(L\sim \Exp{2\lambda}\) and \(\E L = 1/(2\lambda) = \E X /2\). Hence we see that \eqref{org17bdd80} holds.
Moreover, with the same trick we see that the distribution function \(F_{M}\) for the maximum \(M\) is given by
\begin{equation*}
F_{M}(v) = \P{M\leq v} = \P{X \leq v, Y \leq v} = (F_{X}(v))^{2} = (1-e^{-\lambda v})^{2}.
\end{equation*}
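Both CDFs are straightforward to verify by simulation; here is a minimal sketch in which the test point \(v\) and all parameters are arbitrary.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
lam, size = 2.0, 100_000
X = rng.exponential(scale=1 / lam, size=size)
Y = rng.exponential(scale=1 / lam, size=size)
L, M = np.minimum(X, Y), np.maximum(X, Y)

v = 0.3
print((L <= v).mean(), 1 - np.exp(-2 * lam * v))     # F_L(v)
print((M <= v).mean(), (1 - np.exp(-lam * v)) ** 2)  # F_M(v)
\end{verbatim}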
Of course this is a nice trick, but it is not a method that allows us to compute the distribution for more general functions of \(X\) and \(Y\).
For more general cases, we have to use the fundamental bridge and LOTUS, that is,
for any set \(A\) in the range of \(g\),
\begin{equation*}
\begin{split}
\P{g(X, Y) \in A}
&= \E{\1{g(X,Y)\in A}}
= \iint \1{g(x,y) \in A} f_{X,Y}(x,y) \d x \d y \\
&= \iint_{g^{-1}(A)} f_{X,Y}(x, y) \d x \d y
= \iint_{\{(x,y): g(x,y) \in A\}} f_{X,Y}(x,y) \d x \d y.
\end{split}
\end{equation*}
The joint CDF \(F_{X,Y}\) then follows because \(F_{X,Y}(x,y) = \P{X\leq x, Y\leq y} = \E{\1{X\leq x, Y\leq y}}\).
A word of warning is in order: conceptually this approach is easy, but doing the integration can be very challenging (or impossible). Nevertheless, this expression is very important, as it is the preferred way to compute distributions by numerical methods and simulation.
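To illustrate how the fundamental bridge drives simulation, here is a sketch that estimates \(\P{g(X,Y)\in A}\) by averaging the indicator over simulated pairs. As an example (chosen for illustration, not taken from the text) I take \(g(x,y)=x+y\) and \(A=[0,t]\); for iid \(\Exp{\lambda}\) rvs the sum has the Erlang-2 CDF \(1-e^{-\lambda t}(1+\lambda t)\), which serves as the exact reference value.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
lam, size = 2.0, 100_000
X = rng.exponential(scale=1 / lam, size=size)
Y = rng.exponential(scale=1 / lam, size=size)

t = 1.0
est = (X + Y <= t).mean()                     # mean of 1{g(X, Y) in A}
exact = 1 - np.exp(-lam * t) * (1 + lam * t)  # Erlang-2 CDF at t
print(est, exact)
\end{verbatim}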
Use the fundamental bridge to re-derive the above expression for \(F_M(v)\).
Solution
\begin{align*}
\P{M\leq v}
&= \E{\1{M\leq v}} \\
&=\int_0^{\infty} \int_0^{\infty} \1{x\leq v, y\leq v} f_{X,Y}(x,y) \d x \d y \\
&= \int_0^{\infty} \int_0^{\infty} \1{x\leq v, y\leq v} f_{X}(x) f_{Y}(y) \d x \d y \\
&= \int_0^{v} f_{X}(x) \d x \int_0^{v} f_{Y}(y) \d y \\
&= F_{X}(v) F_Y(v) = (F_X(v))^2.
\end{align*}
Show that the density of \(M\) has the form \(f_{M}(v)=\partial_{v} F_M(v) = 2(1-e^{-\lambda v}) \lambda e^{-\lambda v}\).
Solution
\begin{equation*}
f_M(v) = F_M'(v) = 2 F_X(v) f_X(v) = 2(1-e^{-\lambda v}) \lambda e^{- \lambda v}.
\end{equation*}
Use the density \(f_{M}\) to show that \(\E M = 2\E X - \E L\).
Solution
\begin{align*}
\E M
&= \int_{0}^{\infty} v f_M(v) \d v \\
&= 2 \lambda \int_{0}^{\infty} v (1-e^{-\lambda v}) e^{-\lambda v} \d v \\
&= 2 \lambda \int_{0}^{\infty} v e^{-\lambda v} \d v - 2 \lambda \int_{0}^{\infty} v e^{-2 \lambda v} \d v \\
&= 2 \E X - \E L,
\end{align*}
where the last equality follows from the previous exercises.
Recalling that we already obtained \(\E L = \E X/ 2\), we see that \(\E M = 2 \E X - \E L = 3\E X/2\), which settles the truth of \eqref{org17bdd80}.
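These two expectations are also easy to check by simulation; a minimal sketch:
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
lam, size = 2.0, 100_000
X = rng.exponential(scale=1 / lam, size=size)
Y = rng.exponential(scale=1 / lam, size=size)
print(np.minimum(X, Y).mean(), 1 / (2 * lam))  # E L = E X / 2
print(np.maximum(X, Y).mean(), 3 / (2 * lam))  # E M = 3 E X / 2
\end{verbatim}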
We can also compute the densities \(f_{M}(y)\) and \(f_{L}(x)\) by marginalizing the joint density \(f_{L,M}(x,y)\).
However, for this, we first need the joint distribution \(F_{L,M}\), and then we can get \(f_{L,M}\) by differentiation, i.e., \(f_{X,Y} = \partial_{x}\partial_y F_{X,Y}\). Let us try this approach too.
Use the fundamental bridge to show that for \(u\leq v\),
\begin{equation*}
F_{L,M}(u,v) = \P{L\leq u, M\leq v} = 2\int_0^u (F_Y(v)- F_Y(x)) f_X(x) \d x.
\end{equation*}
Solution
First the joint distribution. With \(u\leq v\), splitting the event according to whether \(X \leq Y\) or \(Y < X\) and using symmetry,
\begin{align*}
F_{L,M}(u,v) &= \P{L\leq u, M \leq v} \\
&= 2\iint \1{x \leq u, y\leq v, x\leq y} f_{X,Y}(x,y)\d x \d y \\
&= 2\int_0^u \int_x^v f_Y(y) \d y f_X(x) \d x && \text{by independence} \\
&= 2\int_0^u (F_Y(v)- F_Y(x)) f_X(x) \d x.
\end{align*}
Take partial derivatives to show that
\begin{equation*}
f_{L,M}(u,v) = 2f_X(u) f_Y(v)\1{u\leq v}.
\end{equation*}
Solution
Taking partial derivatives for \(u\leq v\) (for \(u>v\) the CDF no longer depends on \(u\), so the density vanishes there, which explains the indicator),
\begin{align*}
f_{L,M}(u,v)
&=\partial_v\partial_{u}F_{L,M}(u,v) \\
&=2 \partial_v\partial_{u} \int_0^u (F_Y(v)- F_Y(x)) f_X(x) \d x \\
&=2 \partial_v \left\{(F_Y(v)- F_Y(u)) f_X(u) \right \} \\
&=2 f_X(u)\partial_v F_Y(v) \\
&=2 f_X(u)f_Y(v).
\end{align*}
Using the joint density \(f_{L,M}\) of the previous exercise, marginalize out \(L\) to find \(f_M\), and marginalize out \(M\) to find \(f_L\).
Solution
\begin{align*}
f_M(v) &= \int_0^{\infty} f_{L,M}(u,v) \d u \\
&= 2 \int_0^{v} f_{X}(u) f_Y(v) \d u \\
&= 2 f_Y(v) \int_0^{v} f_{X}(u) \d u \\
&= 2f_Y(v) F_X(v), \\
f_L(u) &= \int_0^{\infty} f_{L,M}(u,v) \d v \\
&= 2 f_{X}(u) \int_u^{\infty} f_Y(v) \d v \\
&= 2 f_{X}(u) G_{Y}(u).
\end{align*}
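As a numerical check (a sketch relying on scipy's quad routine), both marginal densities should integrate to one, and \(f_M\) should reproduce \(\E M = 3/(2\lambda)\).
\begin{verbatim}
import numpy as np
from scipy.integrate import quad

lam = 2.0
f_M = lambda v: 2 * lam * np.exp(-lam * v) * (1 - np.exp(-lam * v))  # 2 f_Y F_X
f_L = lambda u: 2 * lam * np.exp(-2 * lam * u)                       # 2 f_X G_Y

print(quad(f_M, 0, np.inf)[0], quad(f_L, 0, np.inf)[0])         # both 1
print(quad(lambda v: v * f_M(v), 0, np.inf)[0], 3 / (2 * lam))  # E M
\end{verbatim}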
We did a number of checks for the case that \(X, Y\) are iid and \(\sim\Exp{\lambda}\),
but I have a second way to check the consistency of our results.
For this I use the idea that the geometric distribution is the discrete analog of the exponential distribution.
Now we study how this works, and that by taking proper limits we can obtain the results for the continuous setting from the discrete setting.
First, let's try to obtain an intuitive understanding of how \(X/n\), with \(X\sim \Geo{\lambda/n}\), approaches \(Y\sim\Exp{\lambda}\) as \(n\to\infty\).
For this, divide the interval \([0, \infty)\) into many small intervals of length \(1/n\).
Let \(X\sim \Geo{\lambda/n}\) for some \(\lambda > 0\) and \(n\gg 0\). Then take some \(x\geq 0\) and let \(i\) be such that \(x\in[i/n, (i+1)/n)\).
Show that
\begin{equation}
\label{org01835a1}
\P{X/n \approx x} \approx \frac{\lambda} n \left(1-\frac \lambda n\right)^{x n}.
\end{equation}
Solution
First,
\begin{align*}
\P{X/n \approx x} &= \P{X/n \in [i/n, (i+1)/n)} = \P{X = i} = p q^{i} \\
&\approx p q^{n x} = \frac{\lambda} n \left(1-\frac \lambda n\right)^{xn},
\end{align*}
since \(p=\lambda/n\), \(q=1-p=1-\lambda/n\), and \(i \approx nx\).
Next, introduce the highly intuitive notation \(\d x = \lim_{n\to \infty} 1/n\), and use the standard limit \((1-\lambda/n)^n\to e^{-\lambda}\) as \(n\to\infty\) to see that \eqref{org01835a1} converges to
\begin{equation*}
\P{X/n \approx x} \to \lambda e^{-\lambda x} \d x = f_{Y}(x) \d x,\quad\text{ as } n \to \infty.
\end{equation*}
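This convergence can be inspected numerically by comparing survival functions. With the pmf \(p q^i\) on \(i = 0, 1, \ldots\), we have \(\P{X > i} = q^{i+1}\), hence \(\P{X/n > x} = q^{\lfloor nx \rfloor + 1}\); the sketch below uses arbitrary values for \(\lambda\) and \(x\).
\begin{verbatim}
import numpy as np

lam, x = 2.0, 0.7
for n in (10, 100, 1000, 10_000):
    q = 1 - lam / n
    i = int(np.floor(n * x))
    print(n, q ** (i + 1), np.exp(-lam * x))  # the survival probs converge
\end{verbatim}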
If you don't like this trick with \(\d x\), here is another method, based on moment-generating functions.
Derive the moment-generating function \(M_{X/n}(s)\) of \(X/n\) when \(X\sim \Geo{p}\).
Then, let \(p = \lambda/n\), and show that \(\lim_{n\to\infty}M_{X/n}(s) = M_{Y}(s)\), where \(Y \sim \Exp{\lambda}\).
Hint
If you recall the Poisson distribution, you know that \(e^{\lambda} = \sum_{i=0}^{\infty}\lambda^{i}/i!\).
In fact, this is precisely Taylor's expansion of \(e^{\lambda}\).
Solution
\begin{align*}
M_{X/n}(s)
&= \E{e^{s X/n}} = \sum_{i} e^{s i/n} p q^{i} = p\sum_{i} (qe^{s/n})^{i} = \frac{p}{1-qe^{s/n}}.
\end{align*}
With \(p=\lambda/n\) this becomes
\begin{align*}
M_{X/n}(s)
&= \frac{\lambda/n}{1-(1-\lambda/n) (1+s/n + 1/n^{2}\times(\cdots))} \\
&= \frac{\lambda/n}{\lambda/n - s/n + 1/n^{2}\times (\cdots)} \\
&= \frac{\lambda}{\lambda - s + 1/n\times(\cdots)} \\
&\to \frac{\lambda}{\lambda - s}, \quad\text{as } n\to \infty,
\end{align*}
where we write \(1/n^{2}\times(\cdots)\) for all terms that will disappear when we take the limit \(n\to \infty\).
This is just handy notation to hide details in which we are not interested.
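The convergence of the MGFs is just as easy to inspect numerically; here is a sketch, assuming \(s < \lambda\) so that the limiting MGF exists.
\begin{verbatim}
import numpy as np

lam, s = 2.0, 0.5  # need s < lam
for n in (10, 100, 1000, 10_000):
    p = lam / n
    q = 1 - p
    print(n, p / (1 - q * np.exp(s / n)), lam / (lam - s))  # -> lam/(lam - s)
\end{verbatim}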
With these limits in place, we can relate the minimum \(L=\min\{X, Y\}\) for the discrete and the continuous settings.
Suppose that \(X, Y\sim \Geo{\lambda/n}\), then check that \(\lim_{n\to\infty}\E{L/n} = 1/(2\lambda)\).
Hint
Use \eqref{org347686d}.
Solution
\begin{align*}
\E{L/n} &= \frac 1 n \E L = \frac 1 n \frac{q^2}{1-q^{2}} \\
&= \frac 1 n \frac{(1-\lambda/n)^2}{1-(1-\lambda/n)^{2}} \\
&= \frac 1 n \frac{1-2 \lambda/n + (\lambda/n)^2}{2 \lambda /n - (\lambda/n)^{2}} \\
&= \frac{1-2 \lambda/n + (\lambda/n)^2}{2 \lambda - \lambda^{2}/n} \\
&\to \frac 1{2\lambda}.
\end{align*}
Clearly, \(1/(2\lambda)=\E X/2\) when \(X\sim\Exp{\lambda}\).
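We can check this limit by simulating geometric rvs. Beware that numpy's geometric samples start at \(1\), so we subtract \(1\) to match the pmf \(p q^k\) on \(\{0, 1, \ldots\}\); the parameters below are arbitrary.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
lam, n, size = 2.0, 1000, 100_000
p = lam / n
X = rng.geometric(p, size=size) - 1  # shift support to {0, 1, ...}
Y = rng.geometric(p, size=size) - 1
print((np.minimum(X, Y) / n).mean(), 1 / (2 * lam))
\end{verbatim}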
Here is yet another check on the correctness of \(f_M(x)\).
Show that the PMF \(\P{M=k}\) of the discrete \(M\) derived earlier converges to the density \(f_M(x)\) obtained above when \(n\to \infty\). Take \(k\) suitably.
Solution
Take \(p=\lambda/n\), \(q=1-\lambda/n\), and \(k \approx x n\), hence \(k/n \approx x\). Then,
\begin{align*}
\P{M/n=k/n} = \P{M=k}
&= 2 p q^{k}(1-q^{k}) + p^2q^{2k} \\
&= 2 \frac \lambda n \left(1-\frac \lambda n\right)^{k}\left(1-\left(1-\frac \lambda n\right)^{k}\right) + \frac{\lambda^2}{n^2}\left(1-\frac \lambda n \right)^{2k} \\
&\to 2 \lambda \d x\, e^{-\lambda x} \left(1 - e^{-\lambda x}\right) + \lambda^{2} \d x^{2}\, e^{-2\lambda x},
\end{align*}
since \(k \approx nx\) implies \((1-\lambda/n)^{k} \to e^{-\lambda x}\), and \(1/n \to \d x\).
Now observe that the second term, being proportional to \(\d x^{2}\), can be neglected, so that \(\P{M/n \approx x} \to f_M(x) \d x\).
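Numerically, since \(\d x \approx 1/n\), the quantity \(n \P{M=k}\) with \(k \approx nx\) should approach \(f_M(x)\); a sketch:
\begin{verbatim}
import numpy as np

lam, x = 2.0, 0.7
for n in (10, 100, 1000, 10_000):
    p, q = lam / n, 1 - lam / n
    k = int(round(n * x))
    pmf = 2 * p * q**k * (1 - q**k) + p**2 * q ** (2 * k)
    f = 2 * lam * np.exp(-lam * x) * (1 - np.exp(-lam * x))
    print(n, n * pmf, f)  # n P{M = k} approaches f_M(x)
\end{verbatim}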
Finally, here is a third way to check the above results, namely by verifying \eqref{orgf33074a}, i.e., \(\E{M-L} = \E X\).
For this, we compute the joint CDF \(F_{L, M-L}(x, y)\). With this, you'll see directly how to compute \(\E{M-L}\).
Use the fundamental bridge to obtain
\begin{equation*}
F_{L,M-L}(x,y) = (1- e^{-2\lambda x}) (1-e^{-\lambda y}) = F_L(x) F_Y(y).
\end{equation*}
Hint
The idea is not difficult, but the technical details require attention, in particular the limits in the integrations.
Solution
\begin{align*}
F_{L, M-L}(x, y)
&=\P{L\leq x, M-L\leq y}\\
&= 2\P{X\leq x, Y-X \leq y, X \leq Y} \\
&= 2\int_0^{\infty} \int_0^{\infty} \1{u\leq x, v-u\leq y, u \leq v} f_{X,Y}(u, v) \d u \d v\\
&= 2\int_0^{\infty} \int_0^{\infty} \1{u\leq x, v-u\leq y, u\leq v} \lambda^2e^{-\lambda u}e^{-\lambda v}\d u \d v\\
&= 2\int_0^{x} \int_0^{\infty} \1{u\leq v\leq u+y} \lambda^2e^{-\lambda u}e^{-\lambda v}\d v \d u\\
&= 2\int_0^{x} \lambda e^{-\lambda u} \int_u ^{u + y} \lambda e^{-\lambda v}\d v \d u\\
&= 2\int_0^{x} \lambda e^{-\lambda u} (-e^{-\lambda v}) \biggr|_u^{u+y}\d u \\
&= 2\int_0^{x} \lambda e^{-\lambda u} (e^{-\lambda u} - e^{-\lambda (u+y)})\d u \\
&= 2\lambda\int_0^{x} e^{-2 \lambda u}\d u - 2\lambda \int_{0}^{x} e^{-\lambda (2u+y)}\d u \\
&= 2\lambda\int_0^{x} e^{-2 \lambda u}\d u - 2\lambda e^{-\lambda y }\int_{0}^{x} e^{-2 \lambda u}\d u \\
&= (1-e^{-\lambda y}) 2 \lambda \int_0^{x} e^{-2 \lambda u} \d u\\
&= (1-e^{-\lambda y}) ( -e^{-2\lambda u}) \biggr|_0^{x} \\
&= (1-e^{-\lambda y})(1- e^{-2\lambda x}).
\end{align*}
Conclude that \(M-L\) and \(L\) are independent, and \(M-L\sim Y\).
Solution
Since \(F_{L,M-L}(x,y) = F_Y(y) F_L(x)\), the CDF factors into a function of \(x\) only times a function of \(y\) only. This implies that \(L\) and \(M-L\) are independent, and moreover that \(F_{M-L}(y) = F_Y(y)\), so \(M-L \sim Y\). We can also see this from the joint PDF:
\begin{equation*}
f_{L, M-L}(x,y) = \partial_x \partial_y (F_Y(y) F_L(x)) = f_Y(y) f_L(x),
\end{equation*}
so the joint PDF (of course) also factors. The independence now follows from BH 7.1.21.
Because \(L\) and \(M-L\) are independent, the conditional density equals the marginal density:
\begin{equation*}
f_{M-L|L}(y|x) = \frac{f_{L, M-L}(x, y)}{f_L(x)} = \frac{f_Y(y) f_L(x)}{f_L(x)} = f_Y(y).
\end{equation*}
By the above exercise, we find that \(\E{M-L} = \E Y = \E X\), as \(X\) and \(Y\) are iid.
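Both claims, the independence of \(L\) and \(M-L\) and the identity \(\E{M-L} = \E X\), can be probed by simulation; in the sketch below the test points \(a, b\) are arbitrary.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
lam, size = 2.0, 100_000
X = rng.exponential(scale=1 / lam, size=size)
Y = rng.exponential(scale=1 / lam, size=size)
L = np.minimum(X, Y)
D = np.maximum(X, Y) - L  # M - L

print(D.mean(), 1 / lam)  # E{M - L} = E X
a, b = 0.2, 0.5           # the joint CDF should factor
print(((L <= a) & (D <= b)).mean(), (L <= a).mean() * (D <= b).mean())
\end{verbatim}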
This makes me wonder whether \(M-L\) and \(L\) are also independent in the discrete case, i.e., when \(X, Y\) are iid and \(\sim \Geo{p}\).
Hence, we should check that for all \(i, j\)
\begin{equation}
\label{org0679a78}
\P{L=i, M-L=j} = \P{L=i} \P{M-L=j}.
\end{equation}
Use the joint PMF of \(L\) and \(M\) derived earlier to see that
\begin{equation*}
\P{L=i, M-L=j} = 2p^2q^{2i+j} \1{j\geq 1} + p^2q^{2i}\1{j=0}.
\end{equation*}
Hint
Realize that \(\P{L=i, M-L=j}= \P{L=i, M=i+j}\). Now fill in the formula for the joint PMF of \(L\) and \(M\).
Now for the RHS.
Derive that
\begin{equation*}
\P{M-L=j} = \frac{2p^2q^{j}}{1-q^2}\1{j\geq 1} + \frac{p^2q^{j}}{1-q^2}\1{j=0}.
\end{equation*}
Solution
Suppose \(j > 0\); for \(j=0\) the computation is the same, except that the factor \(2\) is absent. Then,
\begin{align*}
\P{M-L=j}
&=2\sum_{i=0}^{\infty}\P{X =i, Y=i+j } \\
&=2\sum_{i=0}^{\infty}p q^i pq^{i+j} \\
&=2p^{2}q^{j}\sum_{i=0}^{\infty}q^{2i} \\
&=2p^{2}q^{j}/(1-q^2).
\end{align*}
Recalling that \(\P{L=i} = (1-q^2)q^{2i}\), it follows right away from \eqref{org0679a78} that \(L\) and \(M-L\) are independent.
Interestingly, in the continuous case we saw above that \(M-L\sim Y\). However, here, for the geometric case, \(\P{M-L=j} \neq pq^{j} = \P{Y=j}\). This explains why \(\E M \neq \E L + \E X\) for geometric rvs: we should be more careful in how we split \(M\) in terms of \(L\) and \(X\).
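A final simulation sketch confirms the formula for \(\P{M-L=j}\) and shows that \(\E{M-L}\) indeed differs from \(\E X = q/p\); summing \(j \P{M-L=j}\) gives \(\E{M-L} = 2q/(p(1+q))\), which I include as a reference value.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(3)
p, size = 0.2, 200_000
q = 1 - p
X = rng.geometric(p, size=size) - 1  # pmf p q^k on {0, 1, ...}
Y = rng.geometric(p, size=size) - 1
D = np.maximum(X, Y) - np.minimum(X, Y)

j = 3
print((D == j).mean(), 2 * p**2 * q**j / (1 - q**2))  # P{M-L = j}, j >= 1
print(D.mean(), 2 * q / (p * (1 + q)), q / p)         # E{M-L} != E X
\end{verbatim}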
All in all, we have checked and double-checked all our expressions and limits for the geometric and exponential distributions.
We had success too: the solution of the last exercise provides the key to understand why \eqref{org8619836} and \eqref{orgf33074a} are true for exponentially distributed rvs, but not for geometric random variables.
In fact, in the solutions we can see that the term corresponding to \(X=Y\) for \(X,Y\sim \Geo{\lambda/n}\) becomes negligibly small when \(n\to \infty\).
In other words, \(\P{X=Y}>0\) when \(X\) and \(Y\) are discrete, but \(\P{X=Y}=0\) when \(X\) and \(Y\) are continuous.
Moreover, by linearity of expectation, \(\E M = \E{L + M-L} = \E L + \E{M-L}\), but \(\E{M-L} \neq \E X\). So, to resolve our leading problem, we should reconsider \(\E{M-L}\).