Logo

Data Learning Center Statistical Guides

MANOVA.knit

MANOVA in R

In this tutorial we will analyze a data set where red and white wines (type) were identified as one of three quality levels (quality) along with five measures of their chemical attributes:

  • residual_sugar - grams per liter of sugars remaining after fermentation.
  • pH - acidity as measured by the negative log10 of the solution’s hydronium ion concentration.
  • alcohol - percent volume that is alcohol.
  • chlorides - grams per liter of sodium chloride.
  • sulphates - grams per liter of potassium sulphate.

Using the MANOVA test we want to identify whether any of these chemical attributes vary between red and white wines as well as their quality. To assist us with our analysis we will employ functions from three different packages:

  • car (Companion to Applied Regression) to add more statistical tests for our analysis.
  • rstatix for pipe-friendly statistical functions.
  • tidyverse which contains multiple packages for data manipulation, visualization, and analysis (more information on tidyverse).

While these libraries are not required to perform a MANOVA test in R, they do provide additional tools to improve the processes of checking assumptions and performing statistical analyses for multiple variables.

library(car)
library(rstatix)
library(tidyverse)

After loading the packages we can bring in wine quality data set using the read_csv() function, a tidy-friendly function to read .csv files similar to read.csv(). We will also get a look at our data after it is loaded in using head().

wine <- read_csv("../../dat/wine-quality.csv")

head(wine)
## # A tibble: 6 × 7
##   type  quality residual_sugar    pH alcohol chlorides sulphates
##   <chr> <chr>            <dbl> <dbl>   <dbl>     <dbl>     <dbl>
## 1 red   good               1.8  3.38    11.8     0.066      0.72
## 2 red   good               2    2.94     9.2     0.082      0.66
## 3 red   good               3    3.16    11.5     0.103      1   
## 4 red   good               2.5  3.27     9.3     0.097      0.6 
## 5 red   good               2.6  3.36     9.4     0.093      0.86
## 6 red   good               3.6  3.39    11       0.067      0.66

Note that the type and quality variables are encoded as characters. This is usually fine, but for ease of use we should go ahead and recode them to factors. For the type variable we can code is as a normal factor using facter(). However since the quality variable is actually on a scale where some values are greater or lesser than others we should recode it as an ordered factor. We can do so using the same function but setting the levels option to a concatenated list of values using c() and setting the ordered option to TRUE. Note that when setting an ordered factor you will want the list passed to levels to be in order (typically from lowest to highest for ease of interpretation). To check that everything is coded properly and to get another look at our data set we will then use the str() function to see its structure in R.

wine <- wine %>%
  mutate(type = factor(wine$type),
         quality = factor(wine$quality,
                       levels = c("mediocre", "good", "very good"),
                       ordered = TRUE))

str(wine)
## tibble [60 × 7] (S3: tbl_df/tbl/data.frame)
##  $ type          : Factor w/ 2 levels "red","white": 1 1 1 1 1 1 1 1 1 1 ...
##  $ quality       : Ord.factor w/ 3 levels "mediocre"<"good"<..: 2 2 2 2 2 2 2 2 2 2 ...
##  $ residual_sugar: num [1:60] 1.8 2 3 2.5 2.6 3.6 2 2 8.8 2 ...
##  $ pH            : num [1:60] 3.38 2.94 3.16 3.27 3.36 3.39 3.22 3.37 3.38 3.11 ...
##  $ alcohol       : num [1:60] 11.8 9.2 11.5 9.3 9.4 11 11.2 9.4 10.1 10.1 ...
##  $ chlorides     : num [1:60] 0.066 0.082 0.103 0.097 0.093 0.067 0.12 0.082 0.084 0.067 ...
##  $ sulphates     : num [1:60] 0.72 0.66 1 0.6 0.86 0.66 0.58 0.5 0.64 0.97 ...

From the output of str() we can see that type and quality are now coded as a factor and an ordered factor where its levels are ordered as mediocre < good < very good, respectively. Now that our data is set up properly we can begin testing whether or not the MANOVA assumptions are valid.

Checking assumptions

Before performing the MANOVA analysis we should first check whether our data set meets its assumptions. These assumptions include:

  • Adequate sample size,
  • No multivariate outliers or high leverage points,
  • Approximately multivariate normality,
  • No major multicollinearity, and
  • Homogeneity of covariances, or homoscedasticity

Additionally, while not an assumption having a balanced design (equal group sizes) improves the robustness of the MANOVA test when some of these assumptions are violated, particularly when the data are not approximately multivariate normally distributed.

Sample size

For the MANOVA test to be valid, the number of observations for each group must be greater than the number of outcome variables. With this data, we have 2 levels in the type variable and 3 levels in the quality variable for \(2*3 = 6\) total groups, so we will want the number of observations for each group to be greater than 6. To easily check this we can pipe (%>%) our data into the group_by() function to create groups based on the type and quality variables. We then pipe the grouped data into summarize() and create a column named n which will be the total counts of each group as performed by the n() function.

wine %>%
  group_by(type, quality) %>%
  summarize(n = n())
## # A tibble: 6 × 3
## # Groups:   type [2]
##   type  quality       n
##   <fct> <ord>     <int>
## 1 red   mediocre     10
## 2 red   good         10
## 3 red   very good    10
## 4 white mediocre     10
## 5 white good         10
## 6 white very good    10

As we can see from the output each group has enough observations to meet this assumption. Additionally, as there are 10 observations in each group we can say that the design is balanced.

Multivariate outliers

To flag multivariate outliers we can calculate the Mahalanobis distances for each point, then compare those distances against a χ2 distribution where the degrees of freedom is equal to the number of outcome variables (in this case 3) and α = 0.001. We first pipe our data set into group_by() to again group our independent variables of interest, type and quality, which then is piped into mahalanobis_distance() to calculate the Mahalanobis distances. We will also use the filter() function to only print out significant outliers, otherwise R would print out the results for every data point.

wine %>%
  group_by(type, quality) %>%
  mahalanobis_distance() %>%
  filter(is.outlier == TRUE)
## # A tibble: 0 × 7
## # … with 7 variables: residual_sugar <dbl>, pH <dbl>, alcohol <dbl>, chlorides <dbl>,
## #   sulphates <dbl>, mahal.dist <dbl>, is.outlier <lgl>

Since the output is empty we can conclude that there are no significant multivariate outliers as determined from the Mahalanobis distances.

Multivariate normality

To determine whether the data is approximately multivariate normally distributed we can use the multivariate Shapiro-Wilk test of normality, an extension of the univariate Shapiro-Wilk normality test. For this test we first select the variables we want to test for approximately normal distributions which are the 5 chemical composition variables. Using select() we can select these variables from our data set and pipe them into mshapiro_test() to perform the multivariate Shapiro-Wilk normality test.

Note: with select() we can include the name for every variable we want to select from the data set, however when their columns are in sequence we can use : to select the first and last columns we want to select along with all columns in between.

wine %>%
  select(residual_sugar:sulphates) %>%
  mshapiro_test()
## # A tibble: 1 × 2
##   statistic  p.value
##       <dbl>    <dbl>
## 1     0.899 0.000116

The results of the test indicates that our data does deviate from an approximately multivariate normal distribution, and since we did not identify any outliers from the Mahalanobis distances this departure is unlikely due to extreme points. Although this is not ideal, since we have a balanced design there are methods to improve the robustness of the MANOVA which will be covered later. To note, if our sample sizes are large enough (n ≥ 20) then we can also ignore violations to this assumption due to the central limit theorem.

Multicollinearity

The next assumption we should test is whether a high degree of multicollinearity exists among the dependent variables. A simple way to do so is to calculate the correlations between each variable and determine whether they are below a certain threshold. Here, we will use a threshold of r2 ≥ 0.90 to determine whether we need to remove any variables. To calculate the correlations between the dependent variables we can simply select them with select() then pipe them into the cor_mat() function which will print a matrix of pairwise Person correlations.

wine %>%
  select(residual_sugar:sulphates) %>%
  cor_mat()
## # A tibble: 5 × 6
##   rowname        residual_sugar    pH alcohol chlorides sulphates
## * <chr>                   <dbl> <dbl>   <dbl>     <dbl>     <dbl>
## 1 residual_sugar           1    -0.33   -0.36     -0.29     -0.3 
## 2 pH                      -0.33  1       0.18      0.11      0.11
## 3 alcohol                 -0.36  0.18    1        -0.22      0.14
## 4 chlorides               -0.29  0.11   -0.22      1         0.36
## 5 sulphates               -0.3   0.11    0.14      0.36      1

From the correlation matrix we see that all of the pairwise correlations are well below our threshold and thus can conclude that there is not a problematic degree of multicollinearity in our data. If we were to see high multicollinearity where variables should be removed then it would be best to remove only one variable then reassess multicollinearity on the remaining variables, then repeating if necessary.

Homoscedasticity

Equal covariances, or homoscedasticity , can be tested using Box’s M-test for homogeneity of covariance matrices. We will again use select() to select the dependent variables from the data set then pipe them into the box_m() function which will be used to perform Box’s M-test. However, box_m() function requires an argument, group, to know how to group the data points. If we had only one independent variable we would use it for this argument, however since we have two grouping factors we will be analyzing in the MANOVA both should be used in this argument. So, we will use the fct_cross() function to cross quality and type and create the 6 groupings for the Box’s M-test.

wine %>%
  select(residual_sugar:sulphates) %>%
  box_m(fct_cross(wine$quality, wine$type))
## # A tibble: 1 × 4
##   statistic    p.value parameter method                                             
##       <dbl>      <dbl>     <dbl> <chr>                                              
## 1      145. 0.00000204        75 Box's M-test for Homogeneity of Covariance Matrices

The significant results from the Box’s M-test indicate that there are not equal variances between the groups, or that they have heteroscedasticity . This is not surprising since our data also violated the assumption of approximately multivariate normal distributions. However, since we have a balanced design we can similarly use statistical methods to improve the robustness of the MANOVA even though this assumption was violated.

One-way MANOVA

Now that we have checked our assumptions and decided to continue with our analysis we can perform the MANOVA test. Although we have two grouping variables in the interests of this guide we will begin with a one-way MANOVA analysis using quality as our only independent variable.

To perform a MANOVA test we can use the manova() function with two necessary arguments, formula and data. In the formula argument we will list each of our variables in the cbind() function to combine their columns, followed by ~ then our independent variable, quality. We then add our data set, wine, to the data argument. We will also assign the results of the test to a new object names wine_manova1. We will observe these results with the Manova() function, which will print a MANOVA table using one of four test statistics. The default test statistic is the Pillai’s trace, though the Wilk’s lambda, Hotelling’s trace, or Roy’s largest root statistics can instead be used by changing the test.statistic option. Each statistic has advantages and disadvantages against the other statistics depending on design balanced and violations of assumptions, and it is usually a good idea to asses them all (more information on these statistics can be found here).

wine_manova1 <- manova(cbind(pH, alcohol, residual_sugar, chlorides, sulphates) ~ quality,
                       data = wine)

Manova(wine_manova1)
Manova(wine_manova1, test.statistic = "Wilks")
Manova(wine_manova1, test.statistic = "Hotelling-Lawley")
Manova(wine_manova1, test.statistic = "Roy")
## 
## Type II MANOVA Tests: Pillai test statistic
##         Df test stat approx F num Df den Df  Pr(>F)  
## quality  2   0.33506   2.1734     10    108 0.02472 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Type II MANOVA Tests: Wilks test statistic
##         Df test stat approx F num Df den Df  Pr(>F)  
## quality  2   0.67572    2.295     10    106 0.01764 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Type II MANOVA Tests: Hotelling-Lawley test statistic
##         Df test stat approx F num Df den Df  Pr(>F)  
## quality  2   0.46394   2.4125     10    104 0.01272 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Type II MANOVA Tests: Roy test statistic
##         Df test stat approx F num Df den Df   Pr(>F)   
## quality  2   0.42653   4.6065      5     54 0.001426 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

From the results we see that each of the test statistics provide statistically significant results, indicating that there is a different in the means of at least one of the chemical composition variables when grouped by their associated wine quality. To determine which of these variables vary significantly by wine quality we will need to perform post-hoc univariate ANOVA tests for each of the 5 chemical composition variables.

Post-hoc tests

ANOVA

Since we have a 5 chemical composition variables, instead of writing many lines of code to independently run an ANOVA test for each of them we can use tidyverse and rstatix to run and print the tests in a more elegant way.

First, we will pipe our data set into pivot_longer() which will convert our data from “wide” to “long” format. This is done by creating two new columns where in the first each row is the name of one of the variable (titled name by default) and the second column the value of that observation (titled value by default). We will list the columns we want to lengthen with c() and assign them to the cols argument.

Note: if you are unfamiliar with converting data between wide and long format or want to see how the data is transformed then run the below code that comes before piping into group_by().

The lengthened data set is then piped into group_by() to put each variable into its own grouping. The grouped data is then piped into welch_anova_test(), with a model formula of value ~ quality to perform ANOVAs for each variable.

Note that we are using the Welch one-way ANOVA test rather than the standard one-way ANOVA, which can be done with anova_test(), because the assumption of homoscedascticity was violated.

wine %>%
  pivot_longer(cols = c(pH, alcohol, residual_sugar, chlorides, sulphates)) %>%
  group_by(name) %>%
  welch_anova_test(value ~ quality)
## # A tibble: 5 × 8
##   name           .y.       n statistic   DFn   DFd        p method     
## * <chr>          <chr> <int>     <dbl> <dbl> <dbl>    <dbl> <chr>      
## 1 alcohol        value    60     10.9      2  37.0 0.000193 Welch ANOVA
## 2 chlorides      value    60      0.58     2  37.1 0.566    Welch ANOVA
## 3 pH             value    60      1.37     2  37.7 0.266    Welch ANOVA
## 4 residual_sugar value    60      1.09     2  37.2 0.346    Welch ANOVA
## 5 sulphates      value    60      2.51     2  37.4 0.095    Welch ANOVA

From the results of the five tests we can see that only alcohol varies significantly by wine quality. However, as we have 3 levels of wine qualities we are not certain how these levels differ. We will need to perform a second post-hoc test to make these pairwise comparisons.

Pairwise comparisons

There are multiple post-hoc pairwise tests that can be performed, but for this gudie will we use Tukey’s honest significant differences test. Because we only had one chemical composition variable that was significant in the ANOVA test we do not need to convert our data from wide to long to run all of our tests. Instead, all we need to do is pipe our data set into tukey_hsd() with the model formula as alcohol ~ quality. Additionally, because quality is an ordered factor we should pass TRUE to the ordered argument so that the order of the levels is accounted for before taking their differences.

wine %>%
  tukey_hsd(alcohol ~ quality, ordered = TRUE)
## # A tibble: 3 × 9
##   term    group1   group2    null.value estimate conf.low conf.high    p.adj p.adj.signif
## * <chr>   <chr>    <chr>          <dbl>    <dbl>    <dbl>     <dbl>    <dbl> <chr>       
## 1 quality mediocre good               0    0.907   0.0585      1.75 0.0336   *           
## 2 quality mediocre very good          0    1.55    0.707       2.40 0.000135 ***         
## 3 quality good     very good          0    0.648  -0.200       1.50 0.166    ns

From the results we can see that the mediocre wines have significantly different alcohol content compared to good and very good quality of wines, for which do not vary in alcohol content significantly. The estimate column shows the differences in the means between group2 and group1, indicating that the alcohol content in mediocre wine is 0.907% less than good wines and 1.55% less than very good wines on average.

Two-way MANOVA

The two-way MANOVA can be written similarly to the one-way MANOVA except we add type to the right of the ~ in the model formula. Additionally, we use * to indicate that we want to include both type and the interaction between quality and type in our model. Because we are adding this interaction, we should use a type 3 sum of squares approach instead of the default type 2 sum of squares by assigning 3 to the type argument.

wine_manova2 <- manova(cbind(pH, alcohol, residual_sugar, chlorides, sulphates) ~ quality * type,
                       data = wine)

Manova(wine_manova2, type = 3)
Manova(wine_manova2, test.statistic = "Wilks", type = 3)
Manova(wine_manova2, test.statistic = "Hotelling-Lawley", type = 3)
Manova(wine_manova2, test.statistic = "Roy", type = 3)
## 
## Type III MANOVA Tests: Pillai test statistic
##              Df test stat approx F num Df den Df    Pr(>F)    
## (Intercept)   1   0.99821   5570.7      5     50 < 2.2e-16 ***
## quality       2   0.36076      2.2     10    102   0.02065 *  
## type          1   0.78403     36.3      5     50 1.635e-15 ***
## quality:type  2   0.24110      1.4     10    102   0.19173    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Type III MANOVA Tests: Wilks test statistic
##              Df test stat approx F num Df den Df    Pr(>F)    
## (Intercept)   1   0.00179   5570.7      5     50 < 2.2e-16 ***
## quality       2   0.66153      2.3     10    100   0.01808 *  
## type          1   0.21597     36.3      5     50 1.635e-15 ***
## quality:type  2   0.77248      1.4     10    100   0.20148    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Type III MANOVA Tests: Hotelling-Lawley test statistic
##              Df test stat approx F num Df den Df    Pr(>F)    
## (Intercept)   1    557.07   5570.7      5     50 < 2.2e-16 ***
## quality       2      0.48      2.3     10     98   0.01598 *  
## type          1      3.63     36.3      5     50 1.635e-15 ***
## quality:type  2      0.28      1.4     10     98   0.21175    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Type III MANOVA Tests: Roy test statistic
##              Df test stat approx F num Df den Df    Pr(>F)    
## (Intercept)   1    557.07   5570.7      5     50 < 2.2e-16 ***
## quality       2      0.39      4.0      5     51   0.00389 ** 
## type          1      3.63     36.3      5     50 1.635e-15 ***
## quality:type  2      0.18      1.8      5     51   0.12545    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In the output above we see that again each of the four test statistics are in agreement. First, the interaction term, quality:type is insignificant. Conversely, both the quality and type terms are significant in each test, indicating that at least one of the chemical composition variables vary significantly by wine quality and by the type of wine. However, again the MANOVA does not indicate which of these variables vary significantly, so we will need to perform a post-hoc two-way ANOVA test.

Post-hoc tests

ANOVA

Again, the method for performing the post-hoc two-way ANOVAs as with the one-way ANOVAs except that we need to include the second term. Since we did not see significance in the interaction term in the MANOVA we can exclude it in the follow-up models by using + to add type instead of *. Additionally, we will use anova_test() instead of welch_anova_test() as the latter function can only perform one-way ANOVAs. To apply a correction for heteroscedasticity we assign TRUE to the white.adjust option.

wine %>%
  pivot_longer(cols = c(pH, alcohol, residual_sugar, chlorides, sulphates)) %>%
  group_by(name) %>%
  anova_test(value ~ quality + type, white.adjust = TRUE)
## # A tibble: 10 × 7
##    name           Effect    DFn   DFd      F        p `p<.05`
##  * <chr>          <chr>   <dbl> <dbl>  <dbl>    <dbl> <chr>  
##  1 alcohol        quality     2    56 10.3   1.58e- 4 "*"    
##  2 alcohol        type        1    56  0.048 8.28e- 1 ""     
##  3 chlorides      quality     2    56  1.12  3.35e- 1 ""     
##  4 chlorides      type        1    56 83.7   1.04e-12 "*"    
##  5 pH             quality     2    56  1.42  2.5 e- 1 ""     
##  6 pH             type        1    56  8.58  5   e- 3 "*"    
##  7 residual_sugar quality     2    56  1.47  2.38e- 1 ""     
##  8 residual_sugar type        1    56 18.1   7.9 e- 5 "*"    
##  9 sulphates      quality     2    56  2.76  7.2 e- 2 ""     
## 10 sulphates      type        1    56 22.3   1.6 e- 5 "*"

From the results of the two-way ANOVA tests we see that alcohol is the only variable that varies by quality (similar to our one-way MANOVA results) while chlorides, pH, residual_sugar, and sulphates all vary significantly by type. Again however, the results of the two-way ANOVA do not tell us how exactly the levels in these variables vary, so we will need to follow-up with post-hoc pairwise comparison tests.

Pairwise comparisons

For the post-hoc pairwise comparisons we will only want to test the grouped variables that had signficant results. So, we will run two sets of code, one for the variables that significantly vary by quality and a second for those significantly varying by type. The first set will be the same as with the one-way MANOVA, where the data set is piped to tukey_hsd() with the model formula as alcohol ~ quality.

wine %>%
  tukey_hsd(alcohol ~ quality, ordered = TRUE)
## # A tibble: 3 × 9
##   term    group1   group2    null.value estimate conf.low conf.high    p.adj p.adj.signif
## * <chr>   <chr>    <chr>          <dbl>    <dbl>    <dbl>     <dbl>    <dbl> <chr>       
## 1 quality mediocre good               0    0.907   0.0585      1.75 0.0336   *           
## 2 quality mediocre very good          0    1.55    0.707       2.40 0.000135 ***         
## 3 quality good     very good          0    0.648  -0.200       1.50 0.166    ns

Note that the results are the same as those from the one-way MANOVA post-hoc test, so are interpeted the same.

The second block of code is similar to the one we used to perform the two-way ANOVAs with a couple of changes. First, we will not include alcohol in the list of columns to lengthen by pivot_longer(). Second, we use tukey_hsd() with a model formula of value ~ type to run a Tukey’s test for each of the 4 variables.

wine %>%
  pivot_longer(cols = c(pH, residual_sugar, chlorides, sulphates)) %>%
  group_by(name) %>%
  tukey_hsd(value ~ type)
## # A tibble: 4 × 10
##   name           term  group1 group2 null.value estimate conf.low conf.high    p.adj p.adj.signif
## * <chr>          <chr> <chr>  <chr>       <dbl>    <dbl>    <dbl>     <dbl>    <dbl> <chr>       
## 1 chlorides      type  red    white           0  -0.0356  -0.0432   -0.0280 1.52e-11 ****        
## 2 pH             type  red    white           0  -0.106   -0.177    -0.0352 4.06e- 3 **          
## 3 residual_sugar type  red    white           0   4.59     2.48      6.71   5.61e- 5 ****        
## 4 sulphates      type  red    white           0  -0.188   -0.268    -0.108  1.5 e- 5 ****

From these results we see that since there are only two types of wine in our type group, there is only one pairwise comparison for each of the chemical composition variables, each being statistically significant. From the estimate column we can see how these means vary between red and white wines, with chlorides, pH, and sulphates being lower and residual_sugar higher in concentration on average in white wines.

DLC_statistical_guides