Repeated
measures ANOVA in R
In this guide we will be using a data set provided by STAT 510 - Applied Time Series Analysis, which contains observations from an experiment designed to detect phlebitis (venous inflammation) by measuring temperature during intravenous administration of a drug of interest in the ear of animals over time. The data set has four variables:
- Animal - the unique identifier for the 15 animals tested.
- Treatment - one of three treatments administered:
- the drug in a carrier solution,
- only carrier solution, or
- saline solution.
- Time - the time point in minutes when the temperature was recorded.
- Y - the difference in temperature between the treated and untreated ears of the animal.
First, we will load the data set using read.csv()
, then
we will use the head()
and str()
functions to
take a peak at the data and its structure.
<- read.csv("../../dat/phlebitis.csv")
phleb
head(phleb)
str(phleb)
## Animal Treatment Time Y
## 1 1 drug 0 -0.3
## 2 1 drug 30 -0.2
## 3 1 drug 60 1.2
## 4 1 drug 90 3.1
## 5 2 drug 0 -0.5
## 6 2 drug 30 2.2
## 'data.frame': 60 obs. of 4 variables:
## $ Animal : int 1 1 1 1 2 2 2 2 3 3 ...
## $ Treatment: chr "drug" "drug" "drug" "drug" ...
## $ Time : int 0 30 60 90 0 30 60 90 0 30 ...
## $ Y : num -0.3 -0.2 1.2 3.1 -0.5 2.2 3.3 3.7 -1.1 2.4 ...
From the output we see that Animal was assigned as an
integer while Treatment was classified as a character, when we
would actually want them to be of the factor type. We will then reassign
both variables using factor()
.
$Treatment <- factor(phleb$Treatment)
phleb$Animal <- factor(phleb$Animal) phleb
Now that our variables are in the formats we want we can move on to our analysis. There are multiple statistical approaches to choose from when analyzing repeated measures experiments, so for this guide we will focus on just a few of them.
Split-plot ANOVA approach
The split-plot ANOVA is the most commonly used approach for repeated measure designs. For our phlebitis data we can use the split-plot ANOVA in two ways: 1) reassigning Time as a factor to perform a two-way ANOVA, or 2) keep Time as a numerical variable to perform an ANCOVA.
Time as a factor
Although the repeated measures ANOVA can be performed in base
R
, to reduce complexity we will instead use the
lme4
package, which contains functions built around
mixed-effect modeling. Additionally, we will load the
lmerTest
package which expands on the lme4
package by including more readable statistical output and the
car
package which will can easily perform type 2 and type 3
sum of squares.
library(lme4)
library(lmerTest)
library(car)
From the lme4
package we will use the
lmer()
function to fit our model, in which we will code a
model formula with the response varaible Y on the left side of
a ~
and the predictor variables on thr right side. In the
predictor variables we will include Treatment and Time
with their interaction, where rather than recoding Time as a
factor within the data set we can assign it as a factor using
factor()
within the model formula.. To include our repeated
measure, we will add (1 | Animal)
to the model formula to
indicate that measurements were repeated on each animal. We will then
use the Anova()
function from the car
package
to print out the resulting type 3 sum of squares table.
<- lmer(Y ~ Treatment * factor(Time) + (1 | Animal), data = phleb)
phleb_lmer
Anova(phleb_lmer, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: Y
## Chisq Df Pr(>Chisq)
## (Intercept) 2.5421 1 0.11085
## Treatment 0.8040 2 0.66897
## factor(Time) 2.3690 3 0.49943
## Treatment:factor(Time) 17.4003 6 0.00792 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
From the results there is clearly a statistically significant
interaction between Treatment and Time, indicating
that the difference in ear temperature (Y) changes differently
between at least some of the treatments and how those temperatures vary
over time is also different for at least some of the treatments. To
visualize these variations we can use the with()
and
interaction.plot()
functions to plot the differences in ear
temperature over time for each treatment.
with(phleb,
interaction.plot(Time, Treatment, Y))
From the interaction plot it appears that the differences in ear
temperatures for both the carrier and saline groups
remain relatively flat over the time points while the drug
treatment rises over time. However, while we know from the ANOVA table
that there is a significant interaction, we are given no information
regarding which comparisons between Treatment and Time
are statistically different. There are a few methods available to
perform post-hoc pairwise tests to determine which comparisons differ
statistically, but for this guide we will compute estimate marginal
means (EMMs), or least-squares means, using the emmeans
package.
After loading the emmeans
package we will use the
emmeans()
function to compute the EMMs and pairwise
comparisons on our model. In the function we will include
list(pairwise ~ Treatment * factor(Time))
to specify that
we want to calculate the pairwise comparisons among our two factors,
Treatment and Time. We will additionally include
adjust = "tukey"
to perform Tukey’s multiple comparisons
adjustments. We will then use the $
operator on the results
to only print out the pairwise differences in the interaction between
Treatment and Time rather than all pairwise
comparisons.
library(emmeans)
<- emmeans(phleb_lmer, list(pairwise ~ Treatment * factor(Time)), adjust = "tukey")
phleb_emm
$`pairwise differences of Treatment, Time` phleb_emm
## 1 estimate SE df t.ratio p.value
## carrier Time0 - drug Time0 -0.34 0.514 45.8 -0.661 0.9999
## carrier Time0 - saline Time0 0.10 0.514 45.8 0.194 1.0000
## carrier Time0 - carrier Time30 -0.06 0.481 36.0 -0.125 1.0000
## carrier Time0 - drug Time30 -1.96 0.514 45.8 -3.810 0.0188
## carrier Time0 - saline Time30 -0.82 0.514 45.8 -1.594 0.9025
## carrier Time0 - carrier Time60 -0.52 0.481 36.0 -1.081 0.9937
## carrier Time0 - drug Time60 -2.46 0.514 45.8 -4.782 0.0010
## carrier Time0 - saline Time60 -0.16 0.514 45.8 -0.311 1.0000
## carrier Time0 - carrier Time90 -0.58 0.481 36.0 -1.206 0.9850
## carrier Time0 - drug Time90 -3.16 0.514 45.8 -6.142 <.0001
## carrier Time0 - saline Time90 -0.80 0.514 45.8 -1.555 0.9160
## drug Time0 - saline Time0 0.44 0.514 45.8 0.855 0.9993
## drug Time0 - carrier Time30 0.28 0.514 45.8 0.544 1.0000
## drug Time0 - drug Time30 -1.62 0.481 36.0 -3.368 0.0668
## drug Time0 - saline Time30 -0.48 0.514 45.8 -0.933 0.9983
## drug Time0 - carrier Time60 -0.18 0.514 45.8 -0.350 1.0000
## drug Time0 - drug Time60 -2.12 0.481 36.0 -4.408 0.0044
## drug Time0 - saline Time60 0.18 0.514 45.8 0.350 1.0000
## drug Time0 - carrier Time90 -0.24 0.514 45.8 -0.467 1.0000
## drug Time0 - drug Time90 -2.82 0.481 36.0 -5.863 0.0001
## drug Time0 - saline Time90 -0.46 0.514 45.8 -0.894 0.9989
## saline Time0 - carrier Time30 -0.16 0.514 45.8 -0.311 1.0000
## saline Time0 - drug Time30 -2.06 0.514 45.8 -4.004 0.0108
## saline Time0 - saline Time30 -0.92 0.481 36.0 -1.913 0.7439
## saline Time0 - carrier Time60 -0.62 0.514 45.8 -1.205 0.9859
## saline Time0 - drug Time60 -2.56 0.514 45.8 -4.976 0.0005
## saline Time0 - saline Time60 -0.26 0.481 36.0 -0.541 1.0000
## saline Time0 - carrier Time90 -0.68 0.514 45.8 -1.322 0.9718
## saline Time0 - drug Time90 -3.26 0.514 45.8 -6.337 <.0001
## saline Time0 - saline Time90 -0.90 0.481 36.0 -1.871 0.7682
## carrier Time30 - drug Time30 -1.90 0.514 45.8 -3.693 0.0258
## carrier Time30 - saline Time30 -0.76 0.514 45.8 -1.477 0.9393
## carrier Time30 - carrier Time60 -0.46 0.481 36.0 -0.956 0.9978
## carrier Time30 - drug Time60 -2.40 0.514 45.8 -4.665 0.0015
## carrier Time30 - saline Time60 -0.10 0.514 45.8 -0.194 1.0000
## carrier Time30 - carrier Time90 -0.52 0.481 36.0 -1.081 0.9937
## carrier Time30 - drug Time90 -3.10 0.514 45.8 -6.026 <.0001
## carrier Time30 - saline Time90 -0.74 0.514 45.8 -1.438 0.9491
## drug Time30 - saline Time30 1.14 0.514 45.8 2.216 0.5472
## drug Time30 - carrier Time60 1.44 0.514 45.8 2.799 0.2144
## drug Time30 - drug Time60 -0.50 0.481 36.0 -1.040 0.9955
## drug Time30 - saline Time60 1.80 0.514 45.8 3.499 0.0432
## drug Time30 - carrier Time90 1.38 0.514 45.8 2.682 0.2673
## drug Time30 - drug Time90 -1.20 0.481 36.0 -2.495 0.3746
## drug Time30 - saline Time90 1.16 0.514 45.8 2.255 0.5211
## saline Time30 - carrier Time60 0.30 0.514 45.8 0.583 1.0000
## saline Time30 - drug Time60 -1.64 0.514 45.8 -3.188 0.0929
## saline Time30 - saline Time60 0.66 0.481 36.0 1.372 0.9617
## saline Time30 - carrier Time90 0.24 0.514 45.8 0.467 1.0000
## saline Time30 - drug Time90 -2.34 0.514 45.8 -4.548 0.0021
## saline Time30 - saline Time90 0.02 0.481 36.0 0.042 1.0000
## carrier Time60 - drug Time60 -1.94 0.514 45.8 -3.771 0.0209
## carrier Time60 - saline Time60 0.36 0.514 45.8 0.700 0.9999
## carrier Time60 - carrier Time90 -0.06 0.481 36.0 -0.125 1.0000
## carrier Time60 - drug Time90 -2.64 0.514 45.8 -5.132 0.0003
## carrier Time60 - saline Time90 -0.28 0.514 45.8 -0.544 1.0000
## drug Time60 - saline Time60 2.30 0.514 45.8 4.471 0.0027
## drug Time60 - carrier Time90 1.88 0.514 45.8 3.654 0.0287
## drug Time60 - drug Time90 -0.70 0.481 36.0 -1.455 0.9431
## drug Time60 - saline Time90 1.66 0.514 45.8 3.227 0.0848
## saline Time60 - carrier Time90 -0.42 0.514 45.8 -0.816 0.9995
## saline Time60 - drug Time90 -3.00 0.514 45.8 -5.831 <.0001
## saline Time60 - saline Time90 -0.64 0.481 36.0 -1.331 0.9691
## carrier Time90 - drug Time90 -2.58 0.514 45.8 -5.015 0.0005
## carrier Time90 - saline Time90 -0.22 0.514 45.8 -0.428 1.0000
## drug Time90 - saline Time90 2.36 0.514 45.8 4.587 0.0019
##
## Degrees-of-freedom method: kenward-roger
## P value adjustment: tukey method for comparing a family of 12 estimates
With Time set as a factor, to look at all pairwise comparisons in the interaction between Time with 4 levels and Treatment with 3 levels we have a number of comparisons equal to:
\[\frac{k_{Time}*k_{Treatment}*(k_{Time}*k_{Treatment}-1)}{2} = 66\]
Though this information could be useful depending on the purpose of the experiment, it can be fairly tedious to analyze and interpret 66 comparisons. Instead, for this experiment it may be best to interpret the results as the drug significantly increases the ear temperatures at the 30, 60, and 90 minute time points compared to the saline and carrier groups.
Time as a numeric variable
As Time is continuous it may be more suitable to code it as
a numeric variable rather than a factor in the split-plot model,
therefore performing an ANCOVA with repeated measures rather than a
two-way ANOVA. We can then fit and summarize a similar model as above
but without wrapping Time in factor()
.
<- lmer(Y ~ Treatment * Time + (1 | Animal), data = phleb)
phleb_lmer
Anova(phleb_lmer, type = 3)
## Analysis of Deviance Table (Type III Wald chisquare tests)
##
## Response: Y
## Chisq Df Pr(>Chisq)
## (Intercept) 3.9066 1 0.048096 *
## Treatment 2.5515 2 0.279221
## Time 2.0621 1 0.150997
## Treatment:Time 13.2946 2 0.001298 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Note that from the results we again have a statistically significant
interaction term. However, it also does not give us information
regarding which comparisons in Treatment and Time are
statistically different, so we will again employ the
emmeans()
function to test these pairwise comparisons and
use the $
operator to only print out the comparisons for
the interaction.
<- emmeans(phleb_lmer, list(pairwise ~ Treatment * Time), adjust = "tukey")
phleb_emm
$`pairwise differences of Treatment, Time` phleb_emm
## 1 estimate SE df t.ratio p.value
## carrier Time45 - drug Time45 -1.69 0.302 12 -5.597 0.0003
## carrier Time45 - saline Time45 -0.13 0.302 12 -0.431 0.9037
## drug Time45 - saline Time45 1.56 0.302 12 5.166 0.0006
##
## Degrees-of-freedom method: kenward-roger
## P value adjustment: tukey method for comparing a family of 3 estimates
From the results of the Tukey-adjusted pairwise comparisons we can see that the drug treatment is significantly higher than both the carrier and saline treatments, of which are not statistically different from one another. However, it is important to note that since we used Time as a continuous variable this comparison was made over the average of the time measures and not for each time point. So, while we do have simpler statistical results to interpret compared to the previous model, we do lose some information on which time points the effect of the drug is significant. Specifically, the results here do not show that at the first time point (0) there was not a significant difference between any of the treatments, something we would expect in a good experiemntal design.
Autoregressive model of order 1 approach
With some experiments where the response is expected to change over time we may also make the assumption that those changes per time point rely on their values at the previous time point. In these cases we may want to use an autoregressive model of order 1 (AR(1)) structured model to account for measurements at the previous time point.
To do so, we will need to employ another library, nlme
,
to fit a linear model using generalized least squares (GLS) that
includes the AR(1) component. We will use the gls()
function to fit this model with a similar model formula as above,
however we will add the random effects component in
corAR1()
to specify the AR(1) correlation structure for the
model. We will also assign Time as a factor so that we can
assess the differences among the drugs at each time point. We then
similarly use the Anova()
function from the
car
package to print a summary of the results.
library(nlme)
<- gls(Y ~ Treatment * factor(Time), data = phleb, correlation = corAR1(form = ~ 1 | Animal))
phleb_gls
Anova(phleb_gls, type = 3)
## Analysis of Deviance Table (Type III tests)
##
## Response: Y
## Df Chisq Pr(>Chisq)
## (Intercept) 1 2.5675 0.1091
## Treatment 2 0.8121 0.6663
## factor(Time) 3 1.9670 0.5793
## Treatment:factor(Time) 6 16.6644 0.0106 *
## ---
## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
From the ANOVA table above we see again that there is a statistically
significant interaction between Treatment and Time. To
test which groups vary from one another we will again employ the
emmeans()
function to perform Tukey-adjusted pairwise
comparisons.
<- emmeans(phleb_gls, list(pairwise ~ Treatment * Time), adjust = "tukey")
phleb_emm
$`pairwise differences of Treatment, Time` phleb_emm
## 1 estimate SE df t.ratio p.value
## carrier Time0 - drug Time0 -0.34 0.512 43.2 -0.664 0.9999
## carrier Time0 - saline Time0 0.10 0.512 43.2 0.195 1.0000
## carrier Time0 - carrier Time30 -0.06 0.432 33.3 -0.139 1.0000
## carrier Time0 - drug Time30 -1.96 0.512 43.2 -3.829 0.0186
## carrier Time0 - saline Time30 -0.82 0.512 43.2 -1.602 0.8990
## carrier Time0 - carrier Time60 -0.52 0.490 47.4 -1.060 0.9950
## carrier Time0 - drug Time60 -2.46 0.512 43.2 -4.806 0.0010
## carrier Time0 - saline Time60 -0.16 0.512 43.2 -0.313 1.0000
## carrier Time0 - carrier Time90 -0.58 0.506 46.9 -1.147 0.9905
## carrier Time0 - drug Time90 -3.16 0.512 43.2 -6.173 <.0001
## carrier Time0 - saline Time90 -0.80 0.512 43.2 -1.563 0.9129
## drug Time0 - saline Time0 0.44 0.512 43.2 0.860 0.9992
## drug Time0 - carrier Time30 0.28 0.512 43.2 0.547 1.0000
## drug Time0 - drug Time30 -1.62 0.432 33.3 -3.747 0.0280
## drug Time0 - saline Time30 -0.48 0.512 43.2 -0.938 0.9982
## drug Time0 - carrier Time60 -0.18 0.512 43.2 -0.352 1.0000
## drug Time0 - drug Time60 -2.12 0.490 47.4 -4.323 0.0040
## drug Time0 - saline Time60 0.18 0.512 43.2 0.352 1.0000
## drug Time0 - carrier Time90 -0.24 0.512 43.2 -0.469 1.0000
## drug Time0 - drug Time90 -2.82 0.506 46.9 -5.575 0.0001
## drug Time0 - saline Time90 -0.46 0.512 43.2 -0.899 0.9988
## saline Time0 - carrier Time30 -0.16 0.512 43.2 -0.313 1.0000
## saline Time0 - drug Time30 -2.06 0.512 43.2 -4.024 0.0108
## saline Time0 - saline Time30 -0.92 0.432 33.3 -2.128 0.6079
## saline Time0 - carrier Time60 -0.62 0.512 43.2 -1.211 0.9852
## saline Time0 - drug Time60 -2.56 0.512 43.2 -5.001 0.0006
## saline Time0 - saline Time60 -0.26 0.490 47.4 -0.530 1.0000
## saline Time0 - carrier Time90 -0.68 0.512 43.2 -1.328 0.9704
## saline Time0 - drug Time90 -3.26 0.512 43.2 -6.368 <.0001
## saline Time0 - saline Time90 -0.90 0.506 46.9 -1.779 0.8206
## carrier Time30 - drug Time30 -1.90 0.512 43.2 -3.712 0.0255
## carrier Time30 - saline Time30 -0.76 0.512 43.2 -1.485 0.9369
## carrier Time30 - carrier Time60 -0.46 0.432 33.3 -1.064 0.9944
## carrier Time30 - drug Time60 -2.40 0.512 43.2 -4.688 0.0015
## carrier Time30 - saline Time60 -0.10 0.512 43.2 -0.195 1.0000
## carrier Time30 - carrier Time90 -0.52 0.490 47.4 -1.060 0.9950
## carrier Time30 - drug Time90 -3.10 0.512 43.2 -6.056 <.0001
## carrier Time30 - saline Time90 -0.74 0.512 43.2 -1.446 0.9470
## drug Time30 - saline Time30 1.14 0.512 43.2 2.227 0.5403
## drug Time30 - carrier Time60 1.44 0.512 43.2 2.813 0.2105
## drug Time30 - drug Time60 -0.50 0.432 33.3 -1.156 0.9890
## drug Time30 - saline Time60 1.80 0.512 43.2 3.516 0.0425
## drug Time30 - carrier Time90 1.38 0.512 43.2 2.696 0.2626
## drug Time30 - drug Time90 -1.20 0.490 47.4 -2.447 0.3967
## drug Time30 - saline Time90 1.16 0.512 43.2 2.266 0.5143
## saline Time30 - carrier Time60 0.30 0.512 43.2 0.586 1.0000
## saline Time30 - drug Time60 -1.64 0.512 43.2 -3.204 0.0912
## saline Time30 - saline Time60 0.66 0.432 33.3 1.527 0.9222
## saline Time30 - carrier Time90 0.24 0.512 43.2 0.469 1.0000
## saline Time30 - drug Time90 -2.34 0.512 43.2 -4.571 0.0021
## saline Time30 - saline Time90 0.02 0.490 47.4 0.041 1.0000
## carrier Time60 - drug Time60 -1.94 0.512 43.2 -3.790 0.0207
## carrier Time60 - saline Time60 0.36 0.512 43.2 0.703 0.9999
## carrier Time60 - carrier Time90 -0.06 0.432 33.3 -0.139 1.0000
## carrier Time60 - drug Time90 -2.64 0.512 43.2 -5.157 0.0003
## carrier Time60 - saline Time90 -0.28 0.512 43.2 -0.547 1.0000
## drug Time60 - saline Time60 2.30 0.512 43.2 4.493 0.0027
## drug Time60 - carrier Time90 1.88 0.512 43.2 3.673 0.0283
## drug Time60 - drug Time90 -0.70 0.432 33.3 -1.619 0.8898
## drug Time60 - saline Time90 1.66 0.512 43.2 3.243 0.0833
## saline Time60 - carrier Time90 -0.42 0.512 43.2 -0.820 0.9995
## saline Time60 - drug Time90 -3.00 0.512 43.2 -5.861 <.0001
## saline Time60 - saline Time90 -0.64 0.432 33.3 -1.480 0.9358
## carrier Time90 - drug Time90 -2.58 0.512 43.2 -5.040 0.0005
## carrier Time90 - saline Time90 -0.22 0.512 43.2 -0.430 1.0000
## drug Time90 - saline Time90 2.36 0.512 43.2 4.610 0.0019
##
## Degrees-of-freedom method: satterthwaite
## P value adjustment: tukey method for comparing a family of 12 estimates
Similar to the split-plot models above the drug treatment causes higher differences in ear temperature on average at the 30, 60, and 90 minute time points compared to the carrier and saline treatments, which are not statistically different from one another at any time point.