2. A study was conducted to see if a student's year in school had any effect on how well they did in their classes. The following data for all the students in all their classes was obtained from the Registrar.

          A     B     C     D     F
Fresh  1569  1567  1214   432   987
Soph   1484  1581  1322   212   643
Jun    1592  1609  1472   153   421
Sen    1685  1623  1211    75   315

Use a chi-square test to determine if there is a significant difference between the grades received by students in different classes.

First we need to get our row and column totals.

           A     B     C     D     F  Totals
Fresh   1569  1567  1214   432   987    5769
Soph    1484  1581  1322   212   643    5242
Jun     1592  1609  1472   153   421    5247
Sen     1685  1623  1211    75   315    4909
Totals  6330  6380  5219   872  2366   21167

Observed frequencies
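The totals can be checked with a few lines of Python. This is only a sketch: the variable names (`observed`, `row_totals`, `col_totals`, `grand_total`) are illustrative choices, and the counts are copied from the table of observed frequencies.

```python
# Observed grade counts; rows are class years, columns are grades A, B, C, D, F.
grades = ["A", "B", "C", "D", "F"]
observed = {
    "Fresh": [1569, 1567, 1214, 432, 987],
    "Soph":  [1484, 1581, 1322, 212, 643],
    "Jun":   [1592, 1609, 1472, 153, 421],
    "Sen":   [1685, 1623, 1211, 75, 315],
}

# Row totals: add across the grades for each class year.
row_totals = {year: sum(counts) for year, counts in observed.items()}

# Column totals: add down each grade column.
col_totals = [sum(column) for column in zip(*observed.values())]

# Grand total: every count in the table.
grand_total = sum(row_totals.values())

print(row_totals)
print(dict(zip(grades, col_totals)))
print(grand_total)
```

The row totals and the column totals must both add up to the same grand total, which is a quick way to catch an arithmetic slip.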

Now we can enter the expected frequency for each cell in the table. The expected frequency is computed as (row total) × (column total) / (grand total). For example, the expected number of A's for freshmen is 5769 × 6330 / 21167 = 1725.22.

                A        B        C        D        F   Totals
Fresh     1725.22  1738.85  1422.42   237.66   644.85     5769
Soph      1567.62  1580.00  1292.48   215.95   585.94     5242
Jun       1569.12  1581.51  1293.72   216.16   586.50     5247
Sen       1468.04  1479.63  1210.38   202.23   548.72     4909
Totals       6330     6380     5219      872     2366    21167

Expected frequencies
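As a rough check on the expected frequencies, the rule (row total) × (column total) / (grand total) can be applied to every cell in a short loop. The sketch below uses illustrative variable names and keeps full precision until printing, so its values may differ from hand-rounded ones in the last digit.

```python
# Observed counts; rows are Fresh, Soph, Jun, Sen and columns are A, B, C, D, F.
observed = [
    [1569, 1567, 1214, 432, 987],
    [1484, 1581, 1322, 212, 643],
    [1592, 1609, 1472, 153, 421],
    [1685, 1623, 1211, 75, 315],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell under the null hypothesis of no association.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

# Print the table rounded to the nearest hundredth.
for row in expected:
    print("  ".join(f"{value:8.2f}" for value in row))
```

Because the unrounded expected values in each row add up to that row's total exactly (up to floating-point error), any discrepancy in a hand-computed table comes from rounding or a copying slip.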

Since these values have all been rounded to the nearest hundredth, you may find slight discrepancies in the row and column totals due to round-off error. We now compute the difference between the observed and expected frequencies in each cell.

               A        B        C        D        F
Fresh    -156.22  -171.85  -208.42   194.34   342.15
Soph      -83.62     1.00    29.52    -3.95    57.06
Jun        22.88    27.49   178.28   -63.16  -165.50
Sen       216.96   143.37     0.62  -127.23  -233.72

Observed frequencies - expected frequencies

The row and column totals of these differences should all be 0 at this point. Again, you may find slight discrepancies due to round-off error.

The χ² statistic is obtained by squaring the difference in each cell, dividing by the expected value for that cell, and adding up the results over all cells.

              A       B       C       D       F  Totals
Fresh     14.15   16.98   30.54  158.92  181.54
Soph       4.46    0.00    0.67    0.07    5.56
Jun        0.33    0.48   24.57   18.45   46.70
Sen       32.06   13.89    0.00   80.05   99.55
Totals                                           728.97

χ² = 728.97
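The whole calculation, from observed counts to the statistic, can be sketched in Python as follows. The names are illustrative, and because this version does not round the intermediate values, its result may differ from the hand-computed 728.97 in the last decimal place.

```python
# Observed counts; rows are Fresh, Soph, Jun, Sen and columns are A, B, C, D, F.
observed = [
    [1569, 1567, 1214, 432, 987],
    [1484, 1581, 1322, 212, 643],
    [1592, 1609, 1472, 153, 421],
    [1685, 1623, 1211, 75, 315],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count for this cell if year and grade were independent.
        exp = row_totals[i] * col_totals[j] / grand_total
        # This cell's contribution: (observed - expected)^2 / expected.
        chi_square += (obs - exp) ** 2 / exp

print(round(chi_square, 2))
```

Most of the total comes from the D and F columns for freshmen and seniors, which matches the largest entries in the table of contributions.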

The number of degrees of freedom is (4-1)(5-1) = 12. This value of χ² is far beyond the tabulated critical values (for 12 degrees of freedom, the critical value at the 0.05 level is 21.03), so we reject the null hypothesis. That means that students in different class years can expect to get different grades. Looking at the tables, we would conclude that the upper division students get significantly better grades than the lower division students.
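To see just how far beyond the critical values this statistic lies, the P-value can be computed directly. The sketch below relies on the standard closed form for the upper tail of a chi-square distribution with an even number of degrees of freedom, which applies here because 12 is even. The function name `chi_square_sf_even_df` is a hypothetical helper for this example, not part of any library.

```python
import math

def chi_square_sf_even_df(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable with even df.

    Uses the identity P(X >= x) = exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!,
    which holds only when df is a positive even integer.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only applies to positive even df")
    half = x / 2.0
    term = 1.0   # the k = 0 term of the sum
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k   # builds (x/2)^k / k! from the previous term
        total += term
    return math.exp(-half) * total

rows, cols = 4, 5
df = (rows - 1) * (cols - 1)   # degrees of freedom for a 4 x 5 table
chi_square = 728.97            # statistic from the worked table
critical_05 = 21.026           # tabulated critical value for df = 12, alpha = 0.05

p_value = chi_square_sf_even_df(chi_square, df)
print(df, chi_square > critical_05, p_value)
```

Evaluating the same function at the tabulated critical value 21.026 returns approximately 0.05, which is a useful sanity check on the formula.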

The Data Desk printout (not reproduced here) rounds the chi-square statistic to the nearest tenth. It also displays the P-value, which is so small that we would reject the null hypothesis.