
Hypothesis tests to compare means of two populations - independent samples

The tests described in the previous section compare a population mean with a standard. More commonly, we need to compare the means of two populations. Suppose, for example, that we would like to compare the mean salaries of female and male financial analysts. This comparison can be expressed as a test of the hypotheses

$$H_0: \mu_m = \mu_f \quad \text{versus} \quad H_A: \mu_m \neq \mu_f,$$

where $\mu_m$ represents the population mean salary of all male financial analysts and $\mu_f$ represents the population mean salary of all female financial analysts. This problem is stated as a two-sided hypothesis so that we can detect an increase as well as a decrease in female salaries compared to male salaries. We will assume that these populations have approximately normal distributions or that we have large sample sizes so that the central limit theorem can be applied. The simplest way to make this comparison is to select a random sample separately from each group. This sampling method produces independent samples.

Let $\mu_m, \sigma_m$ denote the population mean and standard deviation of male salaries, let $\mu_f, \sigma_f$ denote the population mean and standard deviation of female salaries, and let $n_m, \bar{X}_m, s_m$ and $n_f, \bar{X}_f, s_f$ denote the sample sizes, sample means, and sample standard deviations for the respective samples. It would be reasonable to base our decision on $\bar{X}_m - \bar{X}_f$, the difference between the sample means. To construct a test statistic based on this difference, we need to determine its sampling distribution. That is, we must find the distribution of $\bar{X}_m - \bar{X}_f$ from all possible samples of size $n_m$ for males and $n_f$ for females. Let

$$s_d = \sqrt{\frac{s_m^2}{n_m} + \frac{s_f^2}{n_f}}.$$

Statistical theory shows that if the populations are approximately normal or if the sample sizes are large, then

$$\frac{(\bar{X}_m - \bar{X}_f) - (\mu_m - \mu_f)}{s_d}$$

has approximately a t-distribution with degrees of freedom given by

$$\nu = \frac{\left(\dfrac{s_m^2}{n_m} + \dfrac{s_f^2}{n_f}\right)^2}{\dfrac{(s_m^2/n_m)^2}{n_m - 1} + \dfrac{(s_f^2/n_f)^2}{n_f - 1}}.$$

Under the assumption that the null hypothesis is true, $\mu_m - \mu_f = 0$, and so

$$\frac{\bar{X}_m - \bar{X}_f}{s_d}$$

has approximately a t-distribution with $\nu$ degrees of freedom. Strong evidence for this two-sided alternative would be sample means that are far apart. Therefore, the p-value is $P(|t_\nu| > |T|) = 2P(t_\nu > |T|)$, where $t_\nu$ denotes a random variable with a t-distribution with $\nu$ degrees of freedom and

$$T = \frac{\bar{X}_m - \bar{X}_f}{\sqrt{\dfrac{s_m^2}{n_m} + \dfrac{s_f^2}{n_f}}}.$$

This test is referred to as Welch's approximation to the two-sample t-test.
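As a rough illustration of this calculation, the following Python sketch computes the Welch test statistic (difference of sample means divided by the square root of the sum of each sample variance over its sample size) and the approximate degrees of freedom from summary statistics. The function name `welch_statistic` and the input values are hypothetical and chosen only for illustration; they are not taken from the text.

```python
import math

def welch_statistic(mean1, sd1, n1, mean2, sd2, n2):
    """Return (T, df) for Welch's two-sample t-test from summary statistics."""
    v1 = sd1 ** 2 / n1          # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2          # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)     # standard error of the difference of means
    t_stat = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df

# Hypothetical summary statistics (illustration only, not from the text).
t_stat, df = welch_statistic(10.0, 2.0, 10, 8.0, 3.0, 15)
print(f"T = {t_stat:.3f}, df = {df:.2f}")
```

The degrees of freedom are usually not an integer; they always lie between the smaller sample size minus one and the sum of the sample sizes minus two.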

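Care should be taken with one-sided alternatives, since only one direction indicates strong evidence for the alternative. If the hypotheses are

$$H_0: \mu_m \le \mu_f \quad \text{versus} \quad H_A: \mu_m > \mu_f,$$

where the subscripts $m$ and $f$ refer to males and females, then strong evidence for the alternative hypothesis would be a value of $\bar{X}_m - \bar{X}_f$ that is a large positive number. If $\bar{X}_f$ is much larger than $\bar{X}_m$, then the decision should be to not reject the null hypothesis even though the sample means are far apart. Likewise, if the hypotheses are

$$H_0: \mu_m \ge \mu_f \quad \text{versus} \quad H_A: \mu_m < \mu_f,$$

then strong evidence for the alternative hypothesis would be a value of $\bar{X}_m - \bar{X}_f$ that is a large negative number. If $\bar{X}_m$ is much larger than $\bar{X}_f$, then the decision should be to not reject the null hypothesis. The easiest way to handle these one-sided hypotheses is to form the test statistic according to the alternative hypothesis. If the hypotheses are

$$H_0: \mu_m \le \mu_f \quad \text{versus} \quad H_A: \mu_m > \mu_f,$$

then let

$$T = \frac{\bar{X}_m - \bar{X}_f}{\sqrt{\dfrac{s_m^2}{n_m} + \dfrac{s_f^2}{n_f}}}.$$

The p-value is $P(t_\nu > T)$, where $t_\nu$ has a t-distribution with the Welch degrees of freedom $\nu$. If $\bar{X}_f$ is larger than $\bar{X}_m$, then $T$ would be negative and so this p-value would be greater than 0.5 and we would not reject the null hypothesis. If the hypotheses are

$$H_0: \mu_m \ge \mu_f \quad \text{versus} \quad H_A: \mu_m < \mu_f,$$

then let

$$T = \frac{\bar{X}_f - \bar{X}_m}{\sqrt{\dfrac{s_m^2}{n_m} + \dfrac{s_f^2}{n_f}}}.$$

The p-value in this case is again $P(t_\nu > T)$. If $\bar{X}_m$ is larger than $\bar{X}_f$, then $T$ would be negative and so this p-value would be greater than 0.5 and we would not reject the null hypothesis.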
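The idea of forming the statistic according to the alternative can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `t_upper_tail` and `one_sided_p` are hypothetical, the upper-tail probability of the t-distribution is approximated by Simpson's rule on the t density (a statistics package would normally supply this), and the summary statistics are made up.

```python
import math

def t_upper_tail(x, df, steps=2000):
    """Approximate P(t_df > x) by Simpson's rule applied to the t density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # integral of the density from 0 to |x|
    return 0.5 - area if x >= 0 else 0.5 + area

def one_sided_p(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """p-value for H_A: mu_a > mu_b; swap the groups to test H_A: mu_b > mu_a."""
    v_a, v_b = sd_a ** 2 / n_a, sd_b ** 2 / n_b
    t_stat = (mean_a - mean_b) / math.sqrt(v_a + v_b)
    df = (v_a + v_b) ** 2 / (v_a ** 2 / (n_a - 1) + v_b ** 2 / (n_b - 1))
    return t_upper_tail(t_stat, df)

# Hypothetical summary statistics: group a has the larger sample mean.
p_a_greater = one_sided_p(10.0, 2.0, 10, 8.0, 3.0, 15)   # H_A: mu_a > mu_b
p_b_greater = one_sided_p(8.0, 3.0, 15, 10.0, 2.0, 10)   # H_A: mu_b > mu_a
print(f"p (a > b) = {p_a_greater:.4f}, p (b > a) = {p_b_greater:.4f}")
```

Because the sample mean of group a is the larger one here, the p-value for the alternative that group b has the larger mean exceeds 0.5, and the two p-values sum to 1.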

The validity of this two-sample test depends on the assumption that both populations are normally distributed. If the populations are not normally distributed and the sample sizes are not large enough to compensate for this non-normality via the Central Limit Theorem, then the p-values obtained as described above will not be valid. A non-parametric test called the Wilcoxon-Mann-Whitney rank sum test can be used in place of the two-sample t-test. Most statistical computer packages include this test as part of their set of two-sample test methods, but this method will not be discussed here.

Example. Suppose we wish to test the hypotheses

$$H_0: \mu_m = \mu_f \quad \text{versus} \quad H_A: \mu_m \neq \mu_f$$

based on a random sample of 25 male financial analysts and a random sample of 18 female financial analysts, using a 10% level of significance. Suppose that the salaries in these samples give sample means $\bar{X}_m$, $\bar{X}_f$ and sample standard deviations $s_m$, $s_f$. It will be easier to express the salaries in units of \$1000 rather than in dollars. Then the test statistic

$$T = \frac{\bar{X}_m - \bar{X}_f}{\sqrt{\dfrac{s_m^2}{n_m} + \dfrac{s_f^2}{n_f}}}$$

has absolute value $|T| = 2.257$, and the degrees of freedom are $\nu = 27.6$. The p-value can be computed with the R function pt as 2*pt(-2.257,27.6), which gives 0.032. Since this p-value is less than 0.10, our decision is to reject the null hypothesis at the 10% level of significance.

Since we now believe that there is a difference between the means, we could ask how great that difference is. This can be accomplished with a confidence interval for the difference between the population means. This confidence interval has the form

$$(\bar{X}_m - \bar{X}_f) \pm t_{\nu}\, s_d,$$

where the degrees of freedom for the t-value $t_\nu$ are the same as for the test statistic, and the standard deviation is the denominator of the test statistic,

$$s_d = \sqrt{\frac{s_m^2}{n_m} + \frac{s_f^2}{n_f}}.$$

A 95% confidence interval for the difference between the mean salaries for males and females is $[0.5, 10.5]$ in units of \$1000. This confidence interval expressed in dollars is [\$500, \$10,500]. That is, we are 95% confident that the difference between the means is within this interval. Note that all of these values are positive, indicating that the mean for males is greater than the mean for females.
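The calculations in this example can be sketched in Python. The sample sizes (25 and 18) come from the example, but the numerical sample means and standard deviations are not shown above, so the values used below (means of 60.5 and 55.0 and standard deviations of 6 and 9, in units of \$1000) are assumptions chosen only because they reproduce the reported test statistic, degrees of freedom, p-value, and interval. They should not be read as the original data. The helper names are hypothetical, and the t-distribution tail and quantile are approximated numerically rather than taken from a statistics package.

```python
import math

def t_upper_tail(x, df, steps=2000):
    """Approximate P(t_df > x) by Simpson's rule applied to the t density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    a = abs(x)
    h = a / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3               # integral of the density from 0 to |x|
    return 0.5 - area if x >= 0 else 0.5 + area

def t_quantile_upper(alpha, df):
    """Find q with P(t_df > q) = alpha by bisection (for 0 < alpha < 0.5)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n_m, n_f = 25, 18                # sample sizes from the example
mean_m, mean_f = 60.5, 55.0      # ASSUMED sample means, in $1000
sd_m, sd_f = 6.0, 9.0            # ASSUMED sample standard deviations, in $1000

v_m, v_f = sd_m ** 2 / n_m, sd_f ** 2 / n_f
s_d = math.sqrt(v_m + v_f)                        # denominator of T
diff = mean_m - mean_f
t_stat = diff / s_d
df = (v_m + v_f) ** 2 / (v_m ** 2 / (n_m - 1) + v_f ** 2 / (n_f - 1))
p_value = 2 * t_upper_tail(abs(t_stat), df)       # two-sided p-value

t_crit = t_quantile_upper(0.025, df)              # for a 95% interval
ci = (diff - t_crit * s_d, diff + t_crit * s_d)

print(f"T = {t_stat:.3f}, df = {df:.1f}, p-value = {p_value:.3f}")
print(f"95% CI (in $1000): [{ci[0]:.2f}, {ci[1]:.2f}]")
```

Only the difference between the two assumed means affects the results; any pair of means 5.5 apart would give the same statistic and interval.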

ammann
2017-11-16