Suppose a large corporation has 20 open positions for entry-level accountants, and all applications are
initially screened to identify those that satisfy the job requirements. Suppose there were 200 qualified
applicants, of whom 110 were male and 90 were female, and that of those hired, 16 were male and 4 were
female. These results can be summarized in the following table:

| | Hired | Not Hired | Total |
|---|---|---|---|
| M | 16 | 94 | 110 |
| F | 4 | 86 | 90 |
| Total | 20 | 180 | 200 |

Note that 10% (20/200) of all qualified applicants were hired, but 14.5% (16/110) of qualified male applicants were
hired and only 4.4% (4/90) of qualified female applicants were hired. So the chances of being hired appear to
differ between males and females. We say that hiring and gender are independent if the probabilities of
being hired for males and females are the same as the overall probability, *0.10*. Therefore, to have
exact independence between hiring and gender in this case, the company would have needed to hire
10% of qualified male applicants (11 of 110) and 10% of qualified female applicants (9 of 90). The table of expected frequencies
in this case would be

| | Hired | Not Hired | Total |
|---|---|---|---|
| M | 11 | 99 | 110 |
| F | 9 | 81 | 90 |
| Total | 20 | 180 | 200 |
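The expected frequencies under independence can be reproduced from the margins of the observed table. The following is a minimal R sketch, not taken from the original text: the object names (`observed`, `expected`, `rate_by_gender`, `overall_rate`) are illustrative assumptions, and each expected count is computed as (row total × column total) / grand total.

```r
# Observed counts from the hiring example (rows: gender, columns: outcome).
# Object names here are illustrative, not from the original text.
observed <- matrix(c(16, 94,
                      4, 86),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Gender = c("M", "F"),
                                   Outcome = c("Hired", "Not Hired")))

# Hiring rate within each gender, and the overall hiring rate.
rate_by_gender <- observed[, "Hired"] / rowSums(observed)
overall_rate <- sum(observed[, "Hired"]) / sum(observed)

# Expected counts under independence:
# (row total * column total) / grand total for every cell.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

print(round(rate_by_gender, 3))
print(overall_rate)
print(expected)
```

Each row of `expected` sums to the same row total as the observed data (110 and 90), which is a quick check that the expected table has been built correctly.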

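We can construct a measure of distance between the actual frequencies and the expected frequencies
under independence, given by

$$X^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where $O_{ij}$ is the observed frequency in row $i$ and column $j$ of the table, and $E_{ij}$ is the corresponding expected frequency under independence, computed as (row $i$ total × column $j$ total) / $n$. For the hiring data,

$$X^2 = \frac{(16-11)^2}{11} + \frac{(94-99)^2}{99} + \frac{(4-9)^2}{9} + \frac{(86-81)^2}{81} \approx 5.612$$

The sampling distribution for this distance is approximately a chi-square distribution with degrees of freedom given by

$$df = (r-1)(c-1) = (2-1)(2-1) = 1$$

where $r$ and $c$ are the numbers of rows and columns in the table. The p-value is the area to the right of $X^2 = 5.612$ under this distribution:

1 - pchisq(5.612, 1) = 0.018.

If we use a 5% level of significance for this test, then we would reject the null hypothesis of independence and conclude that hiring and gender are not independent.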
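The whole test can also be carried out in a few lines of R. This is a sketch under stated assumptions rather than the original author's code: the variable names are illustrative, the statistic is the uncorrected Pearson form (no continuity correction), and the expected counts are derived from the margins of the observed table. The built-in `chisq.test` with `correct = FALSE` is included only as a cross-check.

```r
# Observed 2x2 table: rows are M and F, columns are Hired and Not Hired.
# Variable names are illustrative, not from the original text.
observed <- matrix(c(16, 94,
                      4, 86),
                   nrow = 2, byrow = TRUE)

# Expected counts under independence, from the row and column totals.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square distance between observed and expected counts.
x2 <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1).
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# p-value: area to the right of x2 under the chi-square distribution.
p_value <- 1 - pchisq(x2, df)

# Cross-check against the built-in test without continuity correction.
builtin <- chisq.test(observed, correct = FALSE)

print(c(statistic = x2, df = df, p_value = p_value))
print(builtin)
```

Note that `chisq.test` applies Yates' continuity correction to 2×2 tables by default, so `correct = FALSE` is needed for it to agree with the hand computation.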

2015-05-01