
Bernoulli and Related Distributions

The simplest experiment we can model is one in which there are just two possible outcomes. By convention, one of these is labelled $S$ or $Success$ and the other is labelled $F$ or $Failure$. Such experiments are referred to as Bernoulli trials. The corresponding Bernoulli random variable assigns the value 1 to the outcome $S$ and 0 to $F$. If $p$ denotes the probability of observing the outcome labelled $S$, then the p.m.f. of a Bernoulli random variable $X$ is given by

\begin{displaymath}
p(x) = \left\{ \begin{array}{ll} 1-p,&\ x=0,\\
p,&\ x=1,\\
0,&\ {\rm otherwise}.
\end{array} \right.
\end{displaymath}

The corresponding distribution function is,

\begin{displaymath}
F(x) = \left\{ \begin{array}{ll} 0,&\ x<0,\\
1-p,&\ 0\le x<1,\\
1,&\ x\ge 1.
\end{array} \right.
\end{displaymath}

Note that this distribution is characterized by the single parameter, $p$.

The expected value and variance of a Bernoulli r.v. can be obtained easily:

\begin{displaymath}
E(X) = (0)(1-p) + (1)p = p,
\end{displaymath}


\begin{displaymath}
E(X^2) = (0^2)(1-p) + (1^2)p = p,
\end{displaymath}

and so,

\begin{displaymath}
Var(X) = E(X^2) - [E(X)]^2 = p - p^2 = p(1-p).
\end{displaymath}

Note that the s.d. of the Bernoulli is just

\begin{displaymath}
SD(X) = \sqrt{p(1-p)}.
\end{displaymath}
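As a quick numerical illustration, the minimal Python sketch below (standard library only) computes $E(X)$, $Var(X)$ and $SD(X)$ directly from the p.m.f.; the value $p=0.3$ is an arbitrary choice made for illustration.

\begin{verbatim}
import math

p = 0.3
pmf = {0: 1 - p, 1: p}                  # p.m.f. of a Bernoulli r.v.

mean = sum(x * pmf[x] for x in pmf)     # E(X) = p
ex2  = sum(x**2 * pmf[x] for x in pmf)  # E(X^2) = p
var  = ex2 - mean**2                    # Var(X) = p(1-p)
sd   = math.sqrt(var)                   # SD(X) = sqrt(p(1-p))

print(mean, var, sd)                    # 0.3  0.21  0.458...
\end{verbatim}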

A direct extension of this experiment is one in which a series of $n$ independent Bernoulli trials is performed, each with the same probability $p$ of success. Let $X_i$ denote the $i^{th}$ Bernoulli random variable, and let $N$ denote the total number of successes among the $n$ trials. Then,

\begin{displaymath}
N = \sum_{i=1}^n X_i.
\end{displaymath}
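To see this construction concretely, the sketch below simulates $n$ independent Bernoulli trials and records the number of successes $N$; the values $n=10$, $p=0.3$ and the number of replications are arbitrary illustrative choices.

\begin{verbatim}
import random

random.seed(1)
n, p, reps = 10, 0.3, 100000

def num_successes():
    # each X_i is 1 with probability p and 0 otherwise; N is their sum
    return sum(1 if random.random() < p else 0 for _ in range(n))

counts = [num_successes() for _ in range(reps)]
print(sum(counts) / reps)    # average count is close to np = 3
\end{verbatim}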

Note that the gambler's problem described in the previous section is an example of this type of experiment in which $S$ represents the event that the first gambler wins a game and each game is a Bernoulli trial. Using the same arguments here as we did for the gambler's problem, we can see that the p.m.f. of $N$ is given by,

\begin{displaymath}
p(k) = P(N=k) = {n\choose k}p^k(1-p)^{n-k},\ 0\le k\le n.
\end{displaymath}

As noted earlier, the p.m.f. of this random variable involves terms of the binomial series, and so this random variable is referred to as the binomial random variable and its distribution is referred to as the binomial distribution. This distribution is characterized by two parameters, the number of trials, $n$, and the success probability, $p$.
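The p.m.f. can be evaluated directly from this formula, as in the minimal Python sketch below; $n=10$ and $p=0.3$ are arbitrary illustrative values.

\begin{verbatim}
from math import comb

n, p = 10, 0.3
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

print(sum(pmf))    # the probabilities sum to 1
print(pmf[3])      # P(N = 3) = C(10,3) (0.3)^3 (0.7)^7
\end{verbatim}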

The binomial distribution can be used to model a sampling experiment in which a sample of $n$ objects is to be randomly selected with replacement from a population that consists of two types of objects. Let $p$ denote the proportion of the population that are type 1 objects. The sampling can be viewed as a sequence of Bernoulli trials, with $S$ denoting the event that a type 1 object is selected on a trial. Since the object selected on each trial is returned to the population before the next selection, the trials are identical and independent, each with success probability $p$. Hence, the number of type 1 objects selected in such an experiment has a binomial distribution with $n$ trials and success probability $p$.

If the sampling is performed without replacement, then the trials will no longer be independent and the probability of success for each trial will no longer be the same. However, if the sample size is small compared to the population size, then the binomial distribution is a reasonable approximation to the actual probabilities.
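The sketch below compares the exact without-replacement probabilities (obtained by counting samples) with the binomial approximation; the population size $M=10000$, the number of type 1 objects $K=2000$ (so $p=0.2$) and the sample size $n=20$ are arbitrary illustrative values.

\begin{verbatim}
from math import comb

M, K, n = 10000, 2000, 20    # population size, type 1 count, sample size
p = K / M

for k in (0, 4, 8):
    # exact probability of k type 1 objects when sampling without replacement
    exact  = comb(K, k) * comb(M - K, n - k) / comb(M, n)
    # binomial approximation
    approx = comb(n, k) * p**k * (1 - p)**(n - k)
    print(k, exact, approx)
\end{verbatim}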

The expected value and variance of $N$ can be derived directly from this representation as a sum of Bernoulli r.v.'s.

\begin{displaymath}
E(N) = E\left[\sum_{k=1}^n X_k\right] = \sum_{k=1}^n E(X_k) = np.
\end{displaymath}

Since $N$ can be expressed as a sum of independent Bernoulli r.v.'s, the variance of this sum is the sum of the individual variances,

\begin{displaymath}
Var(N) = \sum_{k=1}^n Var(X_k) = np(1-p),
\end{displaymath}

and so,

\begin{displaymath}
SD(N) = \sqrt{np(1-p)}.
\end{displaymath}
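These formulas can be checked numerically against the p.m.f. given above, as in the sketch below; again $n=10$ and $p=0.3$ are arbitrary illustrative values.

\begin{verbatim}
from math import comb, sqrt

n, p = 10, 0.3
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

mean = sum(k * pmf[k] for k in range(n + 1))
var  = sum(k**2 * pmf[k] for k in range(n + 1)) - mean**2

print(mean, n * p)              # both are np = 3 (up to rounding)
print(var, n * p * (1 - p))     # both are np(1-p) = 2.1 (up to rounding)
print(sqrt(var))                # SD(N)
\end{verbatim}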

Example. Suppose that we are interested in the proportion of defectives within a large population, and so randomly select a sample of size $n$ from this population. If $n$ is small compared to the population size, then we can use the binomial distribution as an approximation to the distribution of the number of defectives that will be found in the sample. Let $N$ denote this number and note that

\begin{displaymath}
\hat{p}_n = \frac{N}{n}
\end{displaymath}

represents the proportion of defectives in the sample. We would expect that the sample proportion should be fairly close to the population proportion $p$. This can be expressed probabilistically by applying Chebyshev's inequality. Let $c$ denote an arbitrary positive real number. Then,

\begin{eqnarray*}
P(\vert\hat{p}_n - p\vert > c) &=& P(\vert N - np\vert > cn) \\
&\le & \frac{np(1-p)}{c^2n^2} \\
&=& \frac{p(1-p)}{nc^2}.
\end{eqnarray*}

This means that if $n$ is large, then the probability that the sample proportion $\hat{p}_n$ is more than $c$ away from the population proportion $p$ is very small. In fact, as $n\rightarrow \infty$, this probability goes to 0.
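The sketch below compares this Chebyshev bound with the exact binomial probability of the same event; $n=400$, $p=0.2$ and $c=0.05$ are arbitrary illustrative values.

\begin{verbatim}
from math import comb

n, p, c = 400, 0.2, 0.05
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# exact P(|N/n - p| > c) and the Chebyshev bound p(1-p)/(n c^2)
exact = sum(pmf[k] for k in range(n + 1) if abs(k / n - p) > c)
bound = p * (1 - p) / (n * c**2)

print(exact, bound)   # the exact probability is much smaller than the bound
\end{verbatim}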

A different extension of Bernoulli trials is to continue performing the trials until the first success is observed. Let $G$ denote the number of trials required to obtain the first success. Under the assumption that the trials are independent with the same probability of success, we have,

\begin{displaymath}
P(G=k) = (1-p)^{k-1}p,\ k\ge 1.
\end{displaymath}

This follows from the fact that the first success occurs on trial $k$ if and only if the first $k-1$ trials are failures and trial $k$ is a success. Since the p.m.f. of this random variable involves terms of the geometric series, this random variable is referred to as the geometric random variable and its distribution is referred to as the geometric distribution.
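As a quick check, these probabilities sum to 1 by the geometric series:

\begin{displaymath}
\sum_{k=1}^\infty (1-p)^{k-1}p = p\sum_{j=0}^\infty (1-p)^j = \frac{p}{1-(1-p)} = 1.
\end{displaymath}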

A related random variable that is sometimes more convenient to work with is $Y$, defined to be the number of failures observed before the first success occurs. It can be seen that $Y$ is related to $G$ by $Y=G-1$. Hence, its p.m.f. is given by,

\begin{displaymath}
p_Y(k) = P(Y=k) = (1-p)^kp,\ k\ge 0.
\end{displaymath}

The expected value and variance of the geometric distribution, writing $q=1-p$, are:

\begin{eqnarray*}
E[G] &=& \frac{1}{p},\\
Var(G) &=& \frac{q}{p^2}\\
E[Y] &=& E[G-1] = \frac{q}{p}\\
Var(Y) &=& Var(G) = \frac{q}{p^2}.
\end{eqnarray*}
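The formula for $E[G]$, for example, follows by differentiating the geometric series term by term:

\begin{displaymath}
E[G] = \sum_{k=1}^\infty k q^{k-1}p = p\,\frac{d}{dq}\sum_{k=0}^\infty q^k
     = p\,\frac{d}{dq}\left[\frac{1}{1-q}\right] = \frac{p}{(1-q)^2} = \frac{1}{p},
\end{displaymath}

and a similar calculation, differentiating twice, yields $Var(G)=q/p^2$.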


Larry Ammann
2013-12-17