Frequency table & Chi-square test

Command: Statistics > Categorical data > Frequency table & Chi-square test

Description

The Frequency table & Chi-square test procedure can be used for the following:

  • To test the hypothesis that for one classification table (e.g. gender), all classification levels have the same frequency.
  • To test the relationship between two classification factors (e.g. gender and profession).

Required input

In the Frequency table & Chi-square test dialog box, you must identify one or two discrete variables that contain the classification data. Classification data may be either numeric or alphanumeric (string) values. If required, you can convert a continuous variable into a discrete variable using the IF function (see elsewhere).

Results

After you have completed the dialog box, click the OK button to obtain the frequency table with the relevant statistics.

Chi-square test

When you want to test the hypothesis that, for a single classification table (e.g. gender), all classification levels have the same frequency, identify only one discrete variable in the dialog box. In this case the null hypothesis is that all classification levels have the same frequency. If the calculated P-value is low (P<0.05), you reject the null hypothesis and accept the alternative hypothesis that the frequencies of the classification levels differ significantly.
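
The one-variable test described above can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook formula, not MedCalc's own code; the function names `chi2_sf` and `equal_frequency_test` and the sample data are hypothetical, and no continuity correction is applied here.

```python
import math
from collections import Counter


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite series exp(-x/2) * sum_{k<df/2} (x/2)^k / k!
        term = math.exp(-half)
        total = term
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return total
    # Odd df: erfc(sqrt(x/2)) plus a finite series in half-integer powers.
    total = math.erfc(math.sqrt(half))
    term = math.exp(-half) * math.sqrt(half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        total += term
        term *= half / (k + 0.5)
    return total


def equal_frequency_test(values):
    """Test H0: all classification levels have the same frequency."""
    counts = Counter(values)
    n = sum(counts.values())
    expected = n / len(counts)  # same expected count for every level
    stat = sum((obs - expected) ** 2 / expected for obs in counts.values())
    df = len(counts) - 1
    return counts, stat, df, chi2_sf(stat, df)


# Hypothetical data: three levels with 20, 30 and 40 observations.
data = ["A"] * 20 + ["B"] * 30 + ["C"] * 40
counts, stat, df, p_value = equal_frequency_test(data)
print(dict(counts), round(stat, 3), df, round(p_value, 4))
```

With these counts the expected frequency is 30 per level, so the statistic is (100 + 0 + 100) / 30 with 2 degrees of freedom, and the P-value falls below 0.05.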

In a single classification table the mode of the observations is the most common observation or category (the observation with the highest frequency). A unimodal distribution has one mode; a bimodal distribution, two modes.
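
As a small illustration of the definition above, the following sketch finds the mode or modes of a classification variable. The function name `modes` and the sample values are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every level that reaches the highest observed frequency."""
    counts = Counter(values)
    top = max(counts.values())
    return [level for level, n in counts.items() if n == top]


# Hypothetical observations of a "profession" variable.
unimodal = modes(["nurse", "doctor", "nurse", "technician", "doctor", "nurse"])
bimodal = modes(["a", "a", "b", "b", "c"])
print(unimodal, bimodal)
```

A result with one element corresponds to a unimodal distribution, a result with two elements to a bimodal one.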

When you want to study the relationship between two classification factors (e.g. gender and profession), identify the two discrete variables in the dialog box. In this case the null hypothesis is that the two factors are independent. If the calculated P-value is low (P<0.05), you reject the null hypothesis and accept the alternative hypothesis that the two factors are related.
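
The test of independence can be sketched as follows, using the usual Pearson statistic with expected counts computed from the row and column totals. The function name `chi_square_independence` and the table are hypothetical. To keep the sketch short, the P-value is computed only for the example's 2 degrees of freedom, where the upper tail of the chi-square distribution has the closed form exp(-x/2).

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two factors.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts: 2 genders (rows) by 3 professions (columns).
table = [
    [20, 30, 10],
    [30, 20, 10],
]
stat, df = chi_square_independence(table)
# Closed form valid for df == 2 only.
p_value = math.exp(-stat / 2)
print(stat, df, round(p_value, 4))
```

Here the P-value is above 0.05, so the null hypothesis of independence would not be rejected for these made-up counts.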

Note that when the number of degrees of freedom equals 1, e.g. for a 2x2 table, MedCalc uses Yates' correction for continuity.
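
The effect of the continuity correction on a 2x2 table can be sketched with the standard shortcut formula. This is the textbook form of Yates' correction and is assumed, not confirmed, to match MedCalc's computation; the function name `chi_square_2x2` and the cell counts are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic and P-value (df = 1) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates' correction: shrink |ad - bc| by n/2, never below zero.
        diff = max(0.0, diff - n / 2)
    stat = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail is erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


corrected, p_corrected = chi_square_2x2(12, 8, 5, 15)
uncorrected, p_uncorrected = chi_square_2x2(12, 8, 5, 15, yates=False)
print(round(corrected, 3), round(uncorrected, 3))
```

The corrected statistic is always the smaller of the two, so the corrected P-value is the more conservative one.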

Chi-square test for trend

If the table has two columns and three or more rows (or two rows and three or more columns), and the categories can be quantified, MedCalc will also perform a Chi-square test for trend. This calculation tests whether there is a linear trend between row (or column) number and the fraction of subjects in the left column (or top row). The chi-square test for trend provides a more powerful test than the unordered independence test above.

If there is no meaningful order in the row (or column) categories, then you should ignore this calculation.
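
A chi-square test for trend on a table with two columns and three or more ordered rows can be sketched as below. The sketch uses the common textbook form of the statistic with the row numbers 1, 2, 3, ... as scores; whether MedCalc uses exactly these scores and this variant is an assumption. The function name `chi_square_trend` and the counts are hypothetical.

```python
import math


def chi_square_trend(table, scores=None):
    """Chi-square for linear trend (df = 1) in a k x 2 table of counts.

    Each row is (left column count, right column count).
    """
    if scores is None:
        scores = range(1, len(table) + 1)  # assumed scores: row numbers
    scores = list(scores)
    n_i = [left + right for left, right in table]  # row totals
    r_i = [left for left, _ in table]              # counts in the left column
    n_total, r_total = sum(n_i), sum(r_i)
    p_bar = r_total / n_total                      # overall left-column fraction
    x_bar = sum(n * x for n, x in zip(n_i, scores)) / n_total
    numerator = (sum(r * x for r, x in zip(r_i, scores)) - r_total * x_bar) ** 2
    spread = sum(n * x * x for n, x in zip(n_i, scores)) - n_total * x_bar ** 2
    stat = numerator / (p_bar * (1 - p_bar) * spread)
    return stat, math.erfc(math.sqrt(stat / 2))


# Hypothetical ordered categories (e.g. low, medium, high dose).
table = [(10, 30), (20, 20), (30, 10)]
stat, p_value = chi_square_trend(table)
print(stat, p_value)
```

In this example the fraction in the left column rises from 0.25 to 0.50 to 0.75, a perfectly linear pattern, so the trend statistic is large and the P-value very small.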

Analysis of 2x2 table

  • When the expected frequencies in the 2x2 table are low (as is the case when the total number of observations is less than 20), the table should be tested using Fisher's exact test;
  • When the two classification factors are not independent, or when you want to test the difference between proportions in related or paired observations (e.g. in studies in which patients serve as their own control), you must use the McNemar test.
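
Both alternatives can be sketched with the Python standard library. The Fisher sketch sums the probabilities of all tables that are no more likely than the observed one (one common two-sided definition), and the McNemar sketch uses the chi-square form with continuity correction on the two discordant cells. Both choices are assumptions and may differ from what MedCalc reports; the function names and counts are hypothetical.

```python
import math


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]]."""
    r1, r2, c1 = a + b, c + d, a + c
    denom = math.comb(r1 + r2, c1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return math.comb(r1, x) * math.comb(r2, c1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Small relative tolerance so ties with the observed table are included.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def mcnemar_test(b, c):
    """McNemar chi-square (df = 1, continuity-corrected) from discordant pairs."""
    if b + c == 0:
        return 0.0, 1.0
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    return stat, math.erfc(math.sqrt(stat / 2))


# Hypothetical small table: 8 observations in total.
p_fisher = fisher_exact_two_sided(3, 1, 1, 3)
# Hypothetical paired study: 15 pairs changed one way, 5 the other way.
mcnemar_stat, p_mcnemar = mcnemar_test(15, 5)
print(round(p_fisher, 4), mcnemar_stat, round(p_mcnemar, 4))
```

Note that the McNemar sketch uses only the discordant pairs; pairs with the same outcome on both occasions do not enter the statistic.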

Literature

  • Altman DG (1991) Practical statistics for medical research. London: Chapman and Hall.
