
Introduction to Probability Models

Probability is a mathematical description of a process whose outcome is uncertain. We call such a process an experiment. This could be something as simple as tossing a coin or as complicated as a large-scale clinical trial consisting of three phases involving hundreds of patients and a variety of treatments. The sample space of an experiment is the set of all possible outcomes of the experiment, and an event is a set of possible outcomes, that is, a subset of the sample space.

For example, the sample space of an experiment in which three coins are tossed consists of the outcomes

$$\{HHH, HHT, HTH, THH, HTT, THT, TTH, TTT\}$$

while the sample space of an experiment in which a disk drive is selected and the time until first failure is observed consists of the positive real numbers. In the first case, the event that exactly two heads are observed is the set $\{HHT,HTH,THH\}$. In the second case, the event that the time to first failure of the drive exceeds 1000 hours is the interval $(1000,\infty)$.
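As a minimal sketch of these definitions for the coin experiment, the following Python snippet enumerates the sample space and picks out events as subsets of it. The names `sample_space` and `heads_event` are illustrative and do not come from the notes.

```python
from itertools import product

# Sample space for tossing three coins: every length-3 string over {H, T}.
sample_space = {"".join(toss) for toss in product("HT", repeat=3)}

def heads_event(k):
    """Event (a subset of the sample space) in which exactly k heads appear."""
    return {outcome for outcome in sample_space if outcome.count("H") == k}

print(sorted(sample_space))
# ['HHH', 'HHT', 'HTH', 'HTT', 'THH', 'THT', 'TTH', 'TTT']
print(sorted(heads_event(2)))
# ['HHT', 'HTH', 'THH']
```

An uncountable sample space such as the positive real numbers cannot be enumerated this way; there an event is described by a condition, for example a failure time greater than 1000 hours.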

Probability arose originally from descriptions of games of chance (gambling), which have their origins far back in human history. It is usually interpreted as the proportion or percentage of times a particular outcome is observed if the experiment is repeated a large number of times. We can think of this proportion as representing the likelihood of that outcome occurring whenever the experiment is performed. Formally, probability is defined to be a function that assigns a real number to each event associated with an experiment according to a set of basic rules. These rules are designed to coincide with our intuitive notions of likelihood, but they must also be mathematically consistent.
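The long-run proportion interpretation can be illustrated with a small simulation. This is only a sketch: it assumes a fair coin modelled by Python's pseudo-random generator, and the function name `relative_frequency` and the fixed seed are choices made here for repeatability.

```python
import random

def relative_frequency(num_tosses, seed=0):
    """Proportion of heads in num_tosses simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(num_tosses))
    return heads / num_tosses

# The observed proportion should settle near 0.5 as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(n))
```

For small numbers of tosses the proportion can be far from 0.5; it is only over many repetitions that it is expected to stabilize.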

This mathematical representation is simplest when the sample space contains a finite or countably infinite number of elements. However, our mathematics and our intuition collide when working with an experiment that has an uncountable sample space, for example an interval of real numbers. Consider, for example, the following experiment. You purchase a spring-driven clock, set it at 12:00 (ignore AM and PM), wind the clock and let it run until it stops. We can represent the sample space of this experiment as the interval $[0,12)$, and we can ask questions such as:

  1. What is the probability the clock stops between 1:00 and 2:00?
  2. What is the probability the clock stops between 4:00 and 4:30?
  3. What is the probability the clock stops between 7:05 and 7:06?

We can answer each of these questions using our intuitive ideas of likelihood. For the first question, since we know nothing about the clock, we can assume that no interval of time is preferred over any other interval of the same length as the place where the clock stops. Therefore, we would expect each of the 12 intervals of length one hour to be equally likely to contain the stopping time of the clock, and so the likelihood that it stops between 1:00 and 2:00 would be $1/12$. Similarly, the likelihood that it stops between 4:00 and 4:30 would be $1/24$, since there are 24 intervals of length $1/2$ hour, and the likelihood that it stops between 7:05 and 7:06 would be $1/720$, since there are 720 intervals of length one minute. In each case our intuition tells us that the likelihood of an event for this experiment is the reciprocal of the number of non-overlapping intervals of the same length that make up the sample space, since each such interval is assumed to be equally likely to contain the stopping point of the clock. Note also that the interval $[1,2)$, corresponding to the times between 1:00 and 2:00, contains the non-overlapping intervals $[1,1.5)$ and $[1.5,2)$. Each of these intervals has likelihood $1/24$, and the sum of these two likelihoods equals the likelihood of the entire interval. This illustrates the additive nature of likelihood.
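The arithmetic for the clock experiment can be sketched in Python with exact fractions. The function name `interval_probability` is hypothetical, and the rule it encodes (likelihood equals interval length divided by 12) is the equally-likely assumption described above, not a general formula.

```python
from fractions import Fraction

CLOCK_HOURS = 12  # the sample space is the interval [0, 12)

def interval_probability(start, end):
    """Likelihood that the clock stops in [start, end), assuming intervals
    of equal length are equally likely: (end - start) / 12."""
    if not (0 <= start <= end <= CLOCK_HOURS):
        raise ValueError("interval must lie inside [0, 12)")
    return (Fraction(end) - Fraction(start)) / CLOCK_HOURS

one_minute = Fraction(1, 60)  # one minute, measured in hours

p_hour = interval_probability(1, 2)                    # 1:00 to 2:00
p_half = interval_probability(4, Fraction(9, 2))       # 4:00 to 4:30
p_minute = interval_probability(7 + 5 * one_minute,    # 7:05 to 7:06
                                7 + 6 * one_minute)
print(p_hour, p_half, p_minute)  # 1/12 1/24 1/720

# Additivity: [1, 1.5) and [1.5, 2) do not overlap and together make up [1, 2).
left = interval_probability(1, Fraction(3, 2))
right = interval_probability(Fraction(3, 2), 2)
print(left + right == p_hour)  # True
```

Exact fractions are used rather than floating point so that the additivity check is an equality and not an approximate comparison.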

A problem occurs if we ask a question such as: what is the probability that the clock stops at precisely $\sqrt{2}$ minutes past 1:00? There are uncountably many such instants in the interval $[0,12)$, so the likelihood we would assign to any one of them would be $1/\infty = 0$. However, the sum of the likelihoods of all such instants between 1:00 and 2:00 would then be 0, not $1/12$ as we have derived above. This inconsistency requires that we modify the rules somewhat. In the case of uncountably infinite sample spaces, we only require that probability be defined for an interesting set of events. In the case of the clock experiment, this interesting set of events consists of all interval subsets of the sample space with positive length, along with the events that can be formed from countable unions and intersections of such intervals. This collection of events is referred to here as the probability space for the experiment (many texts call this collection a $\sigma$-algebra or event space). In the case of finite or countably infinite sample spaces, the probability space can be the set of all possible subsets of the sample space. Unless specified otherwise, all events used here are assumed to be in the probability space.

The basic rules or axioms of probability are then:

  1. Probability is a function $P:\mathcal{F}\rightarrow [0,1]$, where $\mathcal{F}$ is the probability space. That is, the probability function assigns a number between 0 and 1 to each event in the probability space.
  2. $P(S) = 1$, where $S$ is the sample space. That is, the probability that an outcome in the sample space occurs is 1.
  3. For any countable collection of mutually exclusive events in $\mathcal{F}$, $A_i,\ i\ge 1$, we have

    $$P\left(\bigcup_{i=1}^\infty A_i\right) = \sum_{i=1}^\infty P(A_i).$$

    That is, the probability of the union of non-overlapping events equals the sum of the individual probabilities.

All other properties of probability derive from these basic axioms along with any additional definitions we construct.
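As a hedged illustration, the axioms can be checked on a small finite example. The sketch below assumes the three-coin experiment with all eight outcomes equally likely, so that $P(A) = |A|/|S|$; the names `S` and `P` mirror the notation above, but the code itself is not part of the notes, and it only exercises the finite case of the third axiom.

```python
from fractions import Fraction
from itertools import product

# Finite sample space: three coin tosses, outcomes assumed equally likely.
S = frozenset("".join(toss) for toss in product("HT", repeat=3))

def P(event):
    """Equally likely model: P(A) = |A| / |S| for any subset A of S."""
    event = frozenset(event)
    if not event <= S:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(S))

no_heads = {o for o in S if o.count("H") == 0}
one_head = {o for o in S if o.count("H") == 1}
two_heads = {o for o in S if o.count("H") == 2}
events = (no_heads, one_head, two_heads)

# Axiom 1: every probability lies between 0 and 1.
assert all(0 <= P(A) <= 1 for A in events)
# Axiom 2: the sample space has probability 1.
assert P(S) == 1
# Axiom 3 (finite case): these events are mutually exclusive, so probabilities add.
assert P(no_heads | one_head | two_heads) == sum(P(A) for A in events)
# A derived property: the complement of A has probability 1 - P(A).
assert P(S - two_heads) == 1 - P(two_heads)

print(P(no_heads), P(one_head), P(two_heads))  # 1/8 3/8 3/8
```

A check like this verifies particular events only; it is not a proof that the axioms hold for every event.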

As noted previously, when working with experiments that have equally likely outcomes, it is only necessary to count the number of outcomes contained in events to determine their probabilities. Events in many such experiments involve the selection of objects from a population. The method used for counting outcomes in such situations depends on whether the order in which the objects are selected is recognized, and on whether a selected object is returned to the population before the next selection is made (selection with replacement) or not returned (selection without replacement). We use the term permutation to refer to a selection of objects in which selection order is distinguished, and the term combination to refer to a selection in which order is not distinguished. We will consider three of these methods here: permutations with replacement, permutations without replacement, and combinations without replacement.
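The three counting methods can be previewed with a short Python sketch. The formulas used ($n^k$, $n!/(n-k)!$, and $n!/(k!\,(n-k)!)$) are the standard ones and are stated here as an assumption, since these notes develop them in the sections that follow. The function names are illustrative; `math.perm` and `math.comb` require Python 3.8 or later.

```python
import math

def permutations_with_replacement(n, k):
    """Ordered selections of k objects from n, each returned before the next draw: n**k."""
    return n ** k

def permutations_without_replacement(n, k):
    """Ordered selections of k distinct objects from n: n! / (n - k)!."""
    return math.perm(n, k)

def combinations_without_replacement(n, k):
    """Unordered selections of k distinct objects from n: n! / (k! (n - k)!)."""
    return math.comb(n, k)

# Selecting k = 3 objects from a population of n = 5.
print(permutations_with_replacement(5, 3))     # 125
print(permutations_without_replacement(5, 3))  # 60
print(combinations_without_replacement(5, 3))  # 10
```

Each unordered selection of 3 distinct objects corresponds to $3! = 6$ ordered selections, which is why the last two counts differ by a factor of 6.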
