Suppose a large corporation has 20 open positions for entry-level accountants and all applications
are initially screened to identify those who satisfy the job requirements. Suppose there were 200
qualified applicants, 110 of whom were male and 90 female, and that of those hired, 16 were
male and 4 were female. These results can be summarized in the following table:

| | Hired | Not Hired | Total |
|---|---|---|---|
| M | 16 | 94 | 110 |
| F | 4 | 86 | 90 |
| Total | 20 | 180 | 200 |

Note that 10% of all qualified applicants were hired, but 14.5% (16/110) of qualified male
applicants were hired and 4.4% (4/90) of qualified female applicants were hired. So the chances
of being hired appear to differ between males and females. We say that hiring and gender are
independent if the probabilities of being hired for males and females are the same as the overall
probability, *0.10*. Therefore, to have exact independence between hiring and gender in this
case, the company would have needed to hire 10% of qualified male applicants (11 of 110) and 10%
of qualified female applicants (9 of 90). The table of expected frequencies in this case would be

| | Hired | Not Hired | Total |
|---|---|---|---|
| M | 11 | 99 | 110 |
| F | 9 | 81 | 90 |
| Total | 20 | 180 | 200 |
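As a minimal sketch of the arithmetic above, the hiring rates and the expected frequencies can be computed in R from the observed counts alone. The variable names (`observed`, `hire_rate`, `expected`) are illustrative choices, not part of the original example. Each expected count is the row total times the column total, divided by the grand total.

```r
# Observed counts from the hiring example (rows: gender, columns: outcome).
observed <- matrix(c(16, 94,
                      4, 86),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(Gender = c("M", "F"),
                                   Outcome = c("Hired", "Not Hired")))

# Proportion hired within each gender (16/110 and 4/90).
hire_rate <- observed[, "Hired"] / rowSums(observed)

# Expected counts under independence:
# (row total * column total) / grand total for every cell.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

print(round(hire_rate, 3))
print(expected)
```

The expected counts keep the same margins as the observed table, so each row of `expected` still sums to 110 and 90 respectively.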

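We can construct a measure of distance between the actual frequencies and the expected frequencies
under independence, given by

$$X^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i},$$

where $O_i$ is the observed frequency and $E_i$ is the expected frequency under independence in cell $i$, with the sum taken over the four interior cells of the table. For these data, $X^2 \approx 5.612$.

The sampling distribution for this distance is approximately a chi-square distribution with degrees of freedom given by

$$df = (r - 1)(c - 1) = (2 - 1)(2 - 1) = 1,$$

where $r$ and $c$ are the numbers of rows and columns in the table, excluding totals. The null hypothesis is that hiring and gender are independent, and the p-value is the area to the right of $X^2 = 5.612$ under the chi-square density with 1 degree of freedom: `1 - pchisq(5.612, 1) = 0.0178`.

If we use a 5% level of significance for this test, then we reject the null hypothesis and conclude that hiring and gender are not independent.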
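The test can be reproduced end to end with a short R sketch. It assumes the observed counts from the first table. The names `x2`, `df`, `p_value`, and `builtin` are illustrative. The last step cross-checks the hand computation against R's built-in `chisq.test`, with `correct = FALSE` so that no continuity correction is applied.

```r
# Observed counts: rows are M and F, columns are Hired and Not Hired.
observed <- matrix(c(16, 94,
                      4, 86),
                   nrow = 2, byrow = TRUE)

# Expected counts under independence.
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square statistic: sum of (O - E)^2 / E over all cells.
x2 <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1).
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# p-value: upper-tail area of the chi-square distribution.
p_value <- pchisq(x2, df, lower.tail = FALSE)

# Cross-check with the built-in test, without continuity correction.
builtin <- chisq.test(observed, correct = FALSE)

print(c(statistic = x2, df = df, p_value = p_value))
print(builtin)
```

Both approaches should agree with the statistic of about 5.612 and the p-value of about 0.0178 quoted in the text. By default, `chisq.test` applies Yates' continuity correction to 2-by-2 tables, which would give a smaller statistic and a larger p-value than the uncorrected calculation used here.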

2017-11-16