11.3. Test of Independence: Contingency Tables

 

General Case:

Suppose there are two variables, a column variable (with m categories) and a row variable (with p categories). We want to test the hypothesis

$H_0$: The row variable is independent of the column variable

vs.

$H_a$: The row variable is not independent of the column variable.

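Suppose the sample size is n, and let $n_{ij}$ be the observed number in row i and column j. The contingency table is

                                  Column Variable (m columns)
                     1                    ...  j                    ...  m                    proportions
Row Variable   1     $n_{11}$             ...  $n_{1j}$             ...  $n_{1m}$             $\hat{p}_{1\cdot}$
(p rows)       ...   ...                       ...                       ...                  ...
               i     $n_{i1}$             ...  $n_{ij}$             ...  $n_{im}$             $\hat{p}_{i\cdot}$
               ...   ...                       ...                       ...                  ...
               p     $n_{p1}$             ...  $n_{pj}$             ...  $n_{pm}$             $\hat{p}_{p\cdot}$
proportions          $\hat{p}_{\cdot 1}$  ...  $\hat{p}_{\cdot j}$  ...  $\hat{p}_{\cdot m}$  1

Here $\hat{p}_{i\cdot}$ is the proportion of the sample falling in row i and $\hat{p}_{\cdot j}$ is the proportion falling in column j.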

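If $H_0$ is true, then the expected numbers $e_{ij}$ under $H_0$ are

                                  Column Variable (m columns)
                     1                    ...  j                    ...  m                    proportions
Row Variable   1     $e_{11}$             ...  $e_{1j}$             ...  $e_{1m}$             $\hat{p}_{1\cdot}$
(p rows)       ...   ...                       ...                       ...                  ...
               i     $e_{i1}$             ...  $e_{ij}$             ...  $e_{im}$             $\hat{p}_{i\cdot}$
               ...   ...                       ...                       ...                  ...
               p     $e_{p1}$             ...  $e_{pj}$             ...  $e_{pm}$             $\hat{p}_{p\cdot}$
proportions          $\hat{p}_{\cdot 1}$  ...  $\hat{p}_{\cdot j}$  ...  $\hat{p}_{\cdot m}$  1

Note:

$e_{ij} = n \hat{p}_{i\cdot} \hat{p}_{\cdot j} = \frac{n_{i\cdot} \, n_{\cdot j}}{n}$,

where

$\hat{p}_{i\cdot} = \frac{n_{i\cdot}}{n}$, with $n_{i\cdot} = \sum_{j=1}^{m} n_{ij}$ the total of the observed numbers $n_{ij}$ in row i,

and

$\hat{p}_{\cdot j} = \frac{n_{\cdot j}}{n}$, with $n_{\cdot j} = \sum_{i=1}^{p} n_{ij}$ the total of the observed numbers in column j.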
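The expected numbers can be computed directly from the row and column totals. The following is a minimal Python sketch, assuming the observed numbers are stored as a list of rows; the function name `expected_counts` and the small table are illustrative and do not come from the notes.

```python
def expected_counts(observed):
    """Return the table of expected numbers under independence.

    `observed` is a list of p rows, each a list of m observed numbers.
    Each expected number is (row total) * (column total) / n.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Illustrative 2 x 2 table (made-up numbers, not from the notes).
table = [[30, 20], [20, 30]]
print(expected_counts(table))  # [[25.0, 25.0], [25.0, 25.0]]
```

Every row and column total in this made-up table is 50 and n = 100, so each expected number is 50 * 50 / 100 = 25.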

 

Chi-Square Test:

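Let

$\chi^2 = \sum_{i=1}^{p} \sum_{j=1}^{m} \frac{(n_{ij} - e_{ij})^2}{e_{ij}}$,

where $n_{ij}$ and $e_{ij}$ are the observed and expected numbers in row i and column j. When $e_{ij} \ge 5$ for every i and j, the chi-square test with level of significance $\alpha$ for

$H_0$: The row variable is independent of the column variable

vs.

$H_a$: The row variable is not independent of the column variable

is to

reject $H_0$ if $\chi^2 > \chi^2_{(p-1)(m-1),\alpha}$,

where $\chi^2_{(p-1)(m-1),\alpha}$ can be obtained by

$P\left(\chi^2_{(p-1)(m-1)} > \chi^2_{(p-1)(m-1),\alpha}\right) = \alpha$,

with $\chi^2_{(p-1)(m-1)}$ denoting a chi-square random variable with $(p-1)(m-1)$ degrees of freedom.

In addition,

p-value $= P\left(\chi^2_{(p-1)(m-1)} > \chi^2\right)$.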
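The test procedure above can be sketched in Python using only the standard library. This is an illustration under stated assumptions, not the method used by any particular package: the names `chi2_sf`, `chi2_critical`, and `independence_test` are hypothetical, the tail probability uses the standard finite-sum formulas for integer degrees of freedom, and the critical value is found by bisection. The sketch does not check the condition that every expected number is at least 5.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2)
    #         * sum_{r=1}^{(df-1)/2} x^(r - 1/2) / (1 * 3 * ... * (2r - 1))
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total


def chi2_critical(alpha, df):
    """Solve P(X > c) = alpha for c by bisection (the tail decreases in c)."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def independence_test(observed, alpha=0.05):
    """Chi-square test of independence for a p x m table of observed numbers.

    Assumes every row total and column total is positive.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, n_ij in enumerate(row):
            e_ij = row_totals[i] * col_totals[j] / n  # expected number under H0
            stat += (n_ij - e_ij) ** 2 / e_ij
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    critical = chi2_critical(alpha, df)
    return {
        "statistic": stat,
        "df": df,
        "critical": critical,
        "p_value": chi2_sf(stat, df),
        "reject": stat > critical,
    }


# Illustrative 2 x 2 table (made-up numbers, not from the notes).
print(independence_test([[30, 20], [20, 30]]))
```

The rejection rule and the p-value always agree: the statistic exceeds the critical value exactly when the p-value is below $\alpha$.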

 

Example 3:

The following data are the numbers of people who are in favor of, are not in favor of, and have no comment on some proposal:

          Favor   Not Favor   No Comment
Male      252     145         203
Female    148     105         147

Test whether females and males differ in their opinions about the proposal at the 5% level of significance.

[solution:]

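The column totals are 252 + 148 = 400, 145 + 105 = 250, and 203 + 147 = 350, while the row totals are 252 + 145 + 203 = 600 and 148 + 105 + 147 = 400.

In addition, the total number is 1000. Each expected number is (row total)(column total)/1000, for example $e_{11} = 600 \cdot 400 / 1000 = 240$. The table for the expected numbers $e_{ij}$ is

               Favor   Not Favor   No Comment   Row Total
Male           240     150         210          600
Female         160     100         140          400
Column Total   400     250         350          1000

Thus,

$\chi^2 = \frac{(252-240)^2}{240} + \frac{(145-150)^2}{150} + \frac{(203-210)^2}{210} + \frac{(148-160)^2}{160} + \frac{(105-100)^2}{100} + \frac{(147-140)^2}{140} = 2.5$.

Since, with $(2-1)(3-1) = 2$ degrees of freedom,

$\chi^2 = 2.5 < 5.991 = \chi^2_{2,0.05}$,

we do not reject $H_0$.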
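The p-value gives the same decision. As a worked check, take the statistic $\chi^2 = 2.5$ for this table on $(2-1)(3-1) = 2$ degrees of freedom, and use the closed form $P(\chi^2_2 > x) = e^{-x/2}$, a standard property of the chi-square distribution with 2 degrees of freedom that the notes do not state:

```latex
\text{p-value} = P\left(\chi^2_{2} > 2.5\right) = e^{-2.5/2} = e^{-1.25} \approx 0.2865 > 0.05 .
```

Because the p-value exceeds the 5% level of significance, the data do not show a difference in opinion between males and females.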

 

JavaStatSoft:

Chi-square test:

Statistics -> Tests -> Chi-Square Tests -> Test of Independence