chi-sq-test.knit

Running the χ² test

For this tutorial we will use data from an example offered in STAT 500 Applied Statistics where participants were asked to give their party affiliation (Democrat or Republican) and their opinion on a tax reform bill (Favor, Indifferent, or Opposed).

	Favor	Indifferent	Opposed
Democrat	138	83	64
Republican	64	67	84

The researcher wants to know whether a relationship exists between party affiliation and opinion, which corresponds to the following statistical hypotheses:

H₀: No relationship exists between party affiliation and opinion on the tax reform bill H_A: There is a significant relationship between party affiliation and opinion on the tax reform bill

Using the data.frame() function we can easily code the data from the above table into R going from top to bottom row-wise then left to right column-wise. By putting the variable name before the = in each line we can set the desired names for each column then set the row.names option to add the desired names to the rows.

opinion <- data.frame(Favor = c(138, 64),
                      Indifferent = c(83, 67),
                      Opposed = c(64, 84),
                      row.names = c("Democrat", "Republican"))

opinion

##            Favor Indifferent Opposed
## Democrat     138          83      64
## Republican    64          67      84

Note that if we wanted our columns to be party affiliation and rows the opinion on the tax reform bill we can simply change how we set up our data frame (or transpose them with t()) and the χ² test will still have the same statistical results.

Now that we have our data in a data frame we can run the χ² test with the chisq.test() function. We will assign the results to a new object so that we can get some additional information out of it later.

opinion.chisq <- chisq.test(opinion)

opinion.chisq

## 
##  Pearson's Chi-squared test
## 
## data:  opinion
## X-squared = 22.152, df = 2, p-value = 1.548e-05

The output of the chisq.test() function gives us the test statistic (X-squared = 22.152), the degrees of freedom (df = 2), and the p-value associated with the test statistics (p-value = 1.548e-05). Importantly for answering our original question, the p-value is much less than 0 so that we can conclude that an association does exist between party affiliation and a person’s opinion on the tax reform bill.

With the χ² test the expected observations if the null hypothesis was true are calculated. Since we assigned the results to an object we can append to it $expected to print a table of the expected counts.

opinion.chisq$expected

##             Favor Indifferent Opposed
## Democrat   115.14        85.5   84.36
## Republican  86.86        64.5   63.64

Comparing the expected counts with the observed counts in our original table we can see that respondents who identified as Republican had more Opposed responses than expected compared to those who identified as Democrat, for who had more responses for Favor. Conversely, the observed and expected counts almost match for the Indifferent responses from both parties.

If we wanted to run post-hoc analyses to statistically determine which specific responses are different we could consider proportion tests with prop.test() or further χ² tests for each pairwise comparison with multiple test corrections. However, in this case it is quite clear that Democrat respondents view the tax bill more favorably than those who identify as Republican.

Full code block

# Put data into a data frame and print
opinion <- data.frame(Favor = c(138, 64),
                      Indifferent = c(83, 67),
                      Opposed = c(64, 84),
                      row.names = c("Democrat", "Republican"))

opinion

# Fit chi-square test and print results
opinion.chisq <- chisq.test(opinion)

opinion.chisq

# Print table of expected counts
opinion.chisq$expected

Running the χ2 test

Full code block

Running the χ² test