Running the χ² test
For this tutorial we will use data from an example offered in STAT 500 Applied Statistics where participants were asked to give their party affiliation (Democrat or Republican) and their opinion on a tax reform bill (Favor, Indifferent, or Opposed).
           | Favor | Indifferent | Opposed
Democrat   | 138   | 83          | 64
Republican | 64    | 67          | 84
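The researcher wants to know whether a relationship exists between party affiliation and opinion, which corresponds to the following statistical hypotheses:
H0: Party affiliation and opinion on the tax reform bill are independent (no association).
HA: Party affiliation and opinion on the tax reform bill are not independent (an association exists).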
Using the data.frame() function we can easily code the data from the above table into R. Each column of the table is entered as a vector, reading from top to bottom within a column and moving left to right across columns. Putting a name before the = in each line sets the name of that column, and the row.names option adds the desired names to the rows.
opinion <- data.frame(Favor = c(138, 64),
                      Indifferent = c(83, 67),
                      Opposed = c(64, 84),
                      row.names = c("Democrat", "Republican"))
opinion
## Favor Indifferent Opposed
## Democrat 138 83 64
## Republican 64 67 84
Note that if we wanted our columns to be party affiliation and our rows to be the opinion on the tax reform bill, we could simply change how we set up the data frame (or transpose it with t()); the χ² test would give the same statistical results.
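The claim that orientation does not matter can be checked directly. The following is a minimal sketch, not part of the original tutorial; the object names res_original and res_transposed are hypothetical, chosen only for this illustration.

```r
# Data from the tutorial's table
opinion <- data.frame(Favor = c(138, 64),
                      Indifferent = c(83, 67),
                      Opposed = c(64, 84),
                      row.names = c("Democrat", "Republican"))

# Test on the original layout (rows = party, columns = opinion)
res_original <- chisq.test(opinion)

# t() returns a matrix with rows = opinion, columns = party;
# chisq.test() accepts a matrix as well as a data frame
res_transposed <- chisq.test(t(opinion))

# Compare the test statistic, degrees of freedom, and p-value
all.equal(unname(res_original$statistic), unname(res_transposed$statistic))
all.equal(unname(res_original$parameter), unname(res_transposed$parameter))
all.equal(res_original$p.value, res_transposed$p.value)
```

Only the layout of the expected-count table changes; the statistic, degrees of freedom, and p-value should agree between the two calls.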
Now that we have our data in a data frame we can run the χ² test with the chisq.test() function. We will assign the results to a new object so that we can get some additional information out of it later.
opinion.chisq <- chisq.test(opinion)
opinion.chisq
##
## Pearson's Chi-squared test
##
## data: opinion
## X-squared = 22.152, df = 2, p-value = 1.548e-05
The output of the chisq.test() function gives us the test statistic (X-squared = 22.152), the degrees of freedom (df = 2), and the p-value associated with the test statistic (p-value = 1.548e-05). Importantly for answering our original question, the p-value is far below the conventional 0.05 significance level, so we can conclude that an association does exist between party affiliation and a person's opinion on the tax reform bill.
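The printed values can also be pulled out of the result object by name, which is handy when the decision needs to be made in code. This is a small sketch under one assumption: the 0.05 threshold (alpha) is a conventional choice made for this illustration, and reject_null is a hypothetical name.

```r
# Data from the tutorial's table
opinion <- data.frame(Favor = c(138, 64),
                      Indifferent = c(83, 67),
                      Opposed = c(64, 84),
                      row.names = c("Democrat", "Republican"))
opinion.chisq <- chisq.test(opinion)

# Named components of the object returned by chisq.test()
opinion.chisq$statistic  # test statistic (X-squared)
opinion.chisq$parameter  # degrees of freedom
opinion.chisq$p.value    # p-value

# alpha = 0.05 is an assumed, conventional significance level
alpha <- 0.05
reject_null <- opinion.chisq$p.value < alpha
reject_null
```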
As part of the χ² test, R calculates the counts we would expect to observe if the null hypothesis were true. Since we assigned the results to an object, we can append $expected to it to print a table of these expected counts.
opinion.chisq$expected
## Favor Indifferent Opposed
## Democrat 115.14 85.5 84.36
## Republican 86.86 64.5 63.64
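To see where these numbers come from, the expected counts and the test statistic can be reproduced by hand. Under independence, each expected count is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells. The sketch below uses hypothetical variable names (observed, expected, x_squared, df, p_value) and is meant as an illustration of the standard Pearson formula, not as a replacement for chisq.test().

```r
# Data from the tutorial's table
opinion <- data.frame(Favor = c(138, 64),
                      Indifferent = c(83, 67),
                      Opposed = c(64, 84),
                      row.names = c("Democrat", "Republican"))

observed <- as.matrix(opinion)
n <- sum(observed)  # grand total of respondents

# Expected count for each cell: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / n
expected

# Pearson chi-square statistic: sum over cells of (O - E)^2 / E
x_squared <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(x_squared, df = df, lower.tail = FALSE)

c(X_squared = x_squared, df = df, p_value = p_value)
```

For example, the Democrat/Favor cell is 285 × 202 / 500 = 115.14, matching the table printed by opinion.chisq$expected.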
Comparing the expected counts with the observed counts in our original table, we can see that respondents who identified as Republican gave more Opposed responses than expected (84 observed versus 63.64 expected), while those who identified as Democrat gave more Favor responses than expected (138 observed versus 115.14 expected). In contrast, the observed and expected counts almost match for the Indifferent responses from both parties.
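The same comparison can be made less by eye by looking at the residuals stored in the result object. The sketch below prints the Pearson residuals, (observed − expected) / sqrt(expected), and the standardized residuals that chisq.test() returns; cells with residuals far from zero contribute most to the test statistic. Treating any particular cutoff as "large" would be a judgment call and is not something the tutorial specifies.

```r
# Data from the tutorial's table
opinion <- data.frame(Favor = c(138, 64),
                      Indifferent = c(83, 67),
                      Opposed = c(64, 84),
                      row.names = c("Democrat", "Republican"))
opinion.chisq <- chisq.test(opinion)

# Pearson residuals: (observed - expected) / sqrt(expected)
round(opinion.chisq$residuals, 2)

# Standardized residuals, also returned by chisq.test()
round(opinion.chisq$stdres, 2)
```

Positive values mark cells with more responses than expected under independence, negative values mark cells with fewer.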
If we wanted to run post-hoc analyses to determine statistically which specific responses differ, we could consider proportion tests with prop.test() or further χ² tests for each pairwise comparison, with a multiple-testing correction. However, in this case it is quite clear that Democrat respondents view the tax bill more favorably than those who identify as Republican.
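One possible way to set up such post-hoc comparisons is sketched below. It is an illustration only: the choice of the Holm correction, the pairing of opinion categories, and the names pairs, raw_p, adjusted_p, and favor_test are assumptions made here, not part of the original analysis. Note that chisq.test() applies Yates' continuity correction by default when given a 2 × 2 table.

```r
# Data from the tutorial's table
opinion <- data.frame(Favor = c(138, 64),
                      Indifferent = c(83, 67),
                      Opposed = c(64, 84),
                      row.names = c("Democrat", "Republican"))

# All pairs of opinion categories (three 2 x 2 sub-tables)
pairs <- combn(names(opinion), 2, simplify = FALSE)

# Chi-square test on each 2 x 2 sub-table, keeping the raw p-value
raw_p <- sapply(pairs, function(cols) chisq.test(opinion[, cols])$p.value)
names(raw_p) <- sapply(pairs, paste, collapse = " vs ")

# Holm adjustment for multiple comparisons (an assumed choice)
adjusted_p <- p.adjust(raw_p, method = "holm")
adjusted_p

# Proportion test: share of each party giving a Favor response
favor_test <- prop.test(x = opinion$Favor, n = rowSums(opinion))
favor_test
```

Other corrections available through p.adjust(), such as "bonferroni", could be substituted depending on how conservative the analysis should be.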
Full code block
# Put data into a data frame and print
opinion <- data.frame(Favor = c(138, 64),
                      Indifferent = c(83, 67),
                      Opposed = c(64, 84),
                      row.names = c("Democrat", "Republican"))
opinion
# Fit chi-square test and print results
opinion.chisq <- chisq.test(opinion)
opinion.chisq
# Print table of expected counts
opinion.chisq$expected