Probability is a mathematical description of a process whose outcome is
uncertain. We call such a process an **experiment**. This could be
something as simple as tossing a coin or as complicated as a large-scale
clinical trial consisting of three phases involving hundreds of patients and
a variety of treatments. The **sample space** of an experiment is the
set of all possible outcomes of the experiment, and an **event** is a
set of possible outcomes, that is, a subset of the sample space.

For example, the sample space of an experiment in which three coins are tossed
consists of the outcomes

$$\{HHH, HHT, HTH, THH, HTT, THT, TTH, TTT\},$$

while the sample space of an experiment in which a disk drive is selected and the time until first failure is observed would consist of the positive real numbers. In the first case, the event that exactly one head is observed is the set $\{HTT, THT, TTH\}$. In the second case, the event that the time to first failure of the drive exceeds 1000 hours is the interval $(1000, \infty)$.
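The three-coin example can be enumerated directly. The following is a minimal sketch; the names `sample_space` and `exactly_one_head` are illustrative choices, not notation from the text.

```python
from itertools import product

# Sample space: every sequence of H/T over three tosses (8 outcomes).
sample_space = {"".join(tosses) for tosses in product("HT", repeat=3)}

# Event: exactly one head is observed. An event is a subset of the sample space.
exactly_one_head = {outcome for outcome in sample_space if outcome.count("H") == 1}

print(sorted(sample_space))
print(sorted(exactly_one_head))
```

Representing both the sample space and the event as Python sets mirrors the definition of an event as a subset of the sample space.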

Probability arose originally from descriptions of games of chance (gambling)
that have their origins far back in human history. It is usually interpreted
as the proportion or percentage of times a particular outcome is observed if
the experiment is repeated a large number of times. We can think of this
proportion as representing the *likelihood* of that outcome occurring
whenever the experiment is performed. **Probability** is formally defined
to be a function that assigns a real number to each event associated with an
experiment according to a set of basic rules. These rules are designed to
coincide with our intuitive notions of likelihood, but they must also be
mathematically consistent.
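The long-run proportion interpretation can be illustrated with a small simulation of the three-coin experiment. This is only a sketch under an added assumption: the coins are fair, so each of the 8 outcomes is equally likely and the event "exactly one head" should occur in about 3/8 of the repetitions. The function name `relative_frequency` and the seed are hypothetical choices made for repeatability.

```python
import random

def relative_frequency(trials, seed=0):
    """Proportion of trials in which three fair coin tosses show exactly one head."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    hits = 0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(3))
        if heads == 1:
            hits += 1
    return hits / trials

# The observed proportion should settle near 3/8 = 0.375 as trials increase.
for n in (100, 10_000, 100_000):
    print(n, relative_frequency(n))
```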

This mathematical representation is simplest when the sample space contains a finite or countably infinite number of elements. However, our mathematics and our intuition collide when working with an experiment that has an uncountable sample space, for example an interval of real numbers. Consider the following experiment. You purchase a spring-driven clock, set it at 12:00 (ignore AM and PM), wind the clock, and let it run until it stops. Measuring the stopping time in hours past 12:00, we can represent the sample space of this experiment as the interval $[0, 12)$, and we can ask questions such as

- What is the probability the clock stops between 1:00 and 2:00?
- What is the probability the clock stops between 4:00 and 4:30?
- What is the probability the clock stops between 7:05 and 7:06?

We can answer each of these questions using our intuitive ideas of likelihood. For the first question, since we know nothing about the clock, we can assume that the clock has no preference for stopping in one interval of time over any other interval of the same length. Therefore, we would expect each of the 12 intervals of length one hour to be equally likely to contain the stopping time of the clock, and so the likelihood that it stops between 1:00 and 2:00 would be $1/12$. Similarly, the likelihood that it stops between 4:00 and 4:30 would be $1/24$ since there are 24 intervals of length $1/2$ hour, and the likelihood that it stops between 7:05 and 7:06 would be $1/720$ since there are 720 intervals of length one minute. In each case our intuition tells us that the likelihood of an event for this experiment is the reciprocal of the number of non-overlapping intervals of the same length, since each such interval is assumed to be equally likely to contain the stopping point of the clock.

Note also that the interval $[1, 2)$, corresponding to the times between 1:00 and 2:00, contains the non-overlapping intervals $[1, 1.5)$ and $[1.5, 2)$. Each of these intervals has likelihood $1/24$, and the sum of these two likelihoods equals the likelihood of the entire interval, $1/12$. This illustrates the additive nature that we expect likelihood to have.
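The interval reasoning above amounts to dividing the length of an interval by the length of the whole dial. The sketch below assumes times are measured in hours past 12:00, so that the sample space is the interval from 0 to 12; the helper `likelihood` and the constant `CLOCK_HOURS` are hypothetical names introduced for illustration.

```python
from fractions import Fraction

CLOCK_HOURS = 12  # assumed sample space: [0, 12), in hours past 12:00

def likelihood(start, end):
    """Likelihood that the clock stops in [start, end), with times in hours."""
    if not (0 <= start <= end <= CLOCK_HOURS):
        raise ValueError("interval must lie within the 12-hour dial")
    return Fraction(end - start) / CLOCK_HOURS

one_to_two = likelihood(1, 2)                           # 1:00 to 2:00
four_to_four_thirty = likelihood(4, Fraction(9, 2))     # 4:00 to 4:30
one_minute = likelihood(7 + Fraction(5, 60), 7 + Fraction(6, 60))  # 7:05 to 7:06

print(one_to_two, four_to_four_thirty, one_minute)  # 1/12 1/24 1/720

# Additivity: the two half-hour pieces of [1, 2) sum to the whole hour.
first_half = likelihood(1, Fraction(3, 2))
second_half = likelihood(Fraction(3, 2), 2)
print(first_half + second_half == one_to_two)  # True
```

Exact fractions are used instead of floating-point numbers so that the additivity check is an exact equality.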

A problem occurs if we ask a question such as: what is the probability that the
clock stops at precisely one given instant, say exactly $t$ minutes past 1:00? There is an
uncountably infinite number of such instants in the interval $[1, 2)$ between 1:00 and 2:00, so the
likelihood we would assign to any one of them would be $0$. However,
the sum of the likelihoods for all such events between 1:00 and 2:00 would then be
0, not $1/12$ as we have derived above. This inconsistency requires that we
modify the rules somewhat. In the case of uncountably infinite sample spaces,
we only require that probability be defined for an *interesting*
set of events. In the case of the clock experiment, this *interesting*
set of events would consist of all interval subsets of the sample space with
positive length along with events that can be formed from countable unions and
intersections of such intervals. This collection of events is referred to as
the **probability space** for the experiment. In the case of finite or
countably infinite sample spaces, the probability space can be the set of all
possible subsets of the sample space. Unless specified otherwise, all events
used here are assumed to be in the probability space.

The basic rules or axioms of probability are then:

- Probability is a function $P: \mathcal{F} \to [0, 1]$, where $\mathcal{F}$ is the probability space. That is, the probability function assigns a number between 0 and 1 to each event in the probability space.
- $P(S) = 1$, where $S$ is the sample space. That is, the probability that an outcome in the sample space occurs is 1.
- For any countable collection of mutually exclusive events $A_1, A_2, \ldots$ in $\mathcal{F}$, we have

  $$P\left(\bigcup_{i} A_i\right) = \sum_{i} P(A_i).$$

  That is, the probability of the union of non-overlapping events equals the sum of the individual probabilities.
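For a finite sample space with equally likely outcomes, these rules can be checked by brute force. The sketch below reuses the three-coin experiment and assumes fair coins, so that the probability of an event is its size divided by the size of the sample space. The names `S`, `P`, and `events` are illustrative, and the third rule is checked only for pairs of mutually exclusive events, which is the building block for finite additivity.

```python
from fractions import Fraction
from itertools import combinations, product

# Sample space of the three-coin experiment (8 outcomes, assumed equally likely).
S = frozenset("".join(t) for t in product("HT", repeat=3))

def P(event):
    """Equally likely outcomes: P(A) = |A| / |S|."""
    event = frozenset(event)
    if not event <= S:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(S))

# In the finite case the probability space can be every subset of S.
events = [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

# Rule 1: every event is assigned a number between 0 and 1.
assert all(0 <= P(A) <= 1 for A in events)
# Rule 2: the sample space has probability 1.
assert P(S) == 1
# Rule 3 (pairwise form): probabilities add over mutually exclusive events.
assert all(P(A | B) == P(A) + P(B) for A in events for B in events if not A & B)

print(P({"HTT", "THT", "TTH"}))  # 3/8
```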

As noted previously, when working with experiments that have equally likely
outcomes, it is only necessary to count the number of outcomes contained in
events to determine their probabilities. Events in many such experiments
involve the selection of objects from a population. There are different
methods for counting outcomes in such situations, depending on whether
the order of selection matters and whether a
selected object is returned to the population before the next selection is
made (**selection with replacement**) or not returned (**selection
without replacement**). We use the term **permutation** for a
selection in which order is distinguished and the term
**combination** for a selection in which order is not
distinguished. We will consider here three of these methods: permutations
with and without replacement, and combinations without replacement.
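The three counting methods can be compared on a small case. The sketch below assumes a hypothetical population of $n = 4$ objects and selections of size $k = 2$, and uses the standard counts $n^k$, $n!/(n-k)!$, and $\binom{n}{k}$; the variable names are illustrative only.

```python
from itertools import combinations, permutations, product
from math import comb, perm

population = ["a", "b", "c", "d"]  # hypothetical population of n = 4 objects
n, k = len(population), 2

# Order distinguished, objects returned before the next selection.
ordered_with_replacement = list(product(population, repeat=k))
# Order distinguished, objects not returned.
ordered_without_replacement = list(permutations(population, k))
# Order not distinguished, objects not returned.
unordered_without_replacement = list(combinations(population, k))

# Each enumeration agrees with its standard counting formula.
assert len(ordered_with_replacement) == n ** k            # 16
assert len(ordered_without_replacement) == perm(n, k)     # 12
assert len(unordered_without_replacement) == comb(n, k)   # 6

print(len(ordered_with_replacement),
      len(ordered_without_replacement),
      len(unordered_without_replacement))  # 16 12 6
```

Enumerating the selections is practical only for small populations; for larger ones, `math.perm` and `math.comb` (Python 3.8 and later) give the counts directly.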

2016-09-28