Spline Model

Example 1: Stress Treatment Effects

Read the new data set representing a randomized-control experiment into treatment and control groups. Stress are measured weekly for 14 weeks.

dat1 <- read.table("lecture12ex1.csv", sep=",", header=TRUE)

See descriptive statistics of all variables within the data set.

library(psych)
describe(dat1)
##        vars     n    mean      sd median trimmed     mad min   max range  skew
## rowid     1 14000 7000.50 4041.60 7000.5 7000.50 5189.10   1 14000 13999  0.00
## pid       2 14000  500.50  288.69  500.5  500.50  370.65   1  1000   999  0.00
## time      3 14000    7.50    4.03    7.5    7.50    5.19   1    14    13  0.00
## stress    4 14000   50.21    9.85   51.0   50.39   10.38  11    83    72 -0.20
## treat     5 14000    0.50    0.50    0.5    0.50    0.74   0     1     1  0.00
## sleep     6 14000  423.58   58.69  423.0  423.53   60.79 226   635   409  0.01
## age       7 14000   36.64    6.20   36.0   36.58    5.93  18    60    42  0.11
##        kurtosis    se
## rowid     -1.20 34.16
## pid       -1.20  2.44
## time      -1.21  0.03
## stress     0.10  0.08
## treat     -2.00  0.00
## sleep     -0.13  0.50
## age        0.00  0.05

In this example, the time will be separated into three periods: baseline, treatment, and follow-up. The treatment starts right after measuring Week 5. The treatments are implemented at Week 5, 6, 7, 8, 9, and 10. The measurements at Week 6, 7, 8, 9, 10, and 11 are the direct impacts of the treatment so these measurements are included in the treatment period. Thus, Week 12, 13, and 14 are the pure follow-up period.

Separate Time Variables into Three Variables
Time Centered.Time Baseline Treatment Follow.up
1 0 0 0 0
2 1 1 0 0
3 2 2 0 0
4 3 3 0 0
5 4 4 0 0
6 5 4 1 0
7 6 4 2 0
8 7 4 3 0
9 8 4 4 0
10 9 4 5 0
11 10 4 6 0
12 11 4 6 1
13 12 4 6 2
14 13 4 6 3
dat1$timec <- dat1$time - 1
dat1$timebase <- dat1$timec
dat1[dat1$timebase > 4, "timebase"] <- 4
dat1$timetreat <- dat1$timec - dat1$timebase
dat1[dat1$timetreat > 6, "timetreat"] <- 6
dat1$timefollow <- dat1$timec - dat1$timebase - dat1$timetreat

Check the null model to investigate the intraclass correlation of stress.

library(lme4)
out1a <- lmer(stress ~ 1 + (1|pid), data=dat1, REML=FALSE)
summary(out1a)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: stress ~ 1 + (1 | pid)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
## 103315.8 103338.4 -51654.9 103309.8    13997 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.7791 -0.6489  0.0212  0.6755  3.2755 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  pid      (Intercept)  9.076   3.013   
##  Residual             88.021   9.382   
## Number of obs: 14000, groups:  pid, 1000
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  50.2054     0.1239     405

Use timec to predict job satisfaction without random slopes.

out1b <- lmer(stress ~ 1 + timec + (1|pid), data=dat1, REML=FALSE)
summary(out1b)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: stress ~ 1 + timec + (1 | pid)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
## 102801.3 102831.5 -51396.6 102793.3    13996 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.8198 -0.6463  0.0086  0.6626  3.2683 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  pid      (Intercept)  9.321   3.053   
##  Residual             84.592   9.197   
## Number of obs: 14000, groups:  pid, 1000
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 53.08240    0.17628  301.13
## timec       -0.44262    0.01928  -22.95
## 
## Correlation of Fixed Effects:
##       (Intr)
## timec -0.711

Let’s use three time variables, timebase, timetreat, and timefollow, to model the changes in stress.

out1c <- lmer(stress ~ 1 + timebase + timetreat + timefollow + (1|pid), data=dat1, REML=FALSE)
anova(out1b, out1c)
## Data: dat1
## Models:
## out1b: stress ~ 1 + timec + (1 | pid)
## out1c: stress ~ 1 + timebase + timetreat + timefollow + (1 | pid)
##       npar    AIC    BIC logLik deviance  Chisq Df Pr(>Chisq)    
## out1b    4 102801 102831 -51397   102793                         
## out1c    6 102693 102738 -51341   102681 112.21  2  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The model with three time variables was significantly better than the model with only one time variable. Check the output of the model with three time variables.

summary(out1c)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: stress ~ 1 + timebase + timetreat + timefollow + (1 | pid)
##    Data: dat1
## 
##      AIC      BIC   logLik deviance df.resid 
## 102693.1 102738.3 -51340.5 102681.1    13994 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.8092 -0.6436  0.0062  0.6596  3.3402 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  pid      (Intercept)  9.373   3.062   
##  Residual             83.865   9.158   
## Number of obs: 14000, groups:  pid, 1000
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 52.44186    0.23946 218.998
## timebase    -0.01506    0.07770  -0.194
## timetreat   -0.87487    0.04711 -18.572
## timefollow   0.58362    0.10854   5.377
## 
## Correlation of Fixed Effects:
##            (Intr) timebs timtrt
## timebase   -0.775              
## timetreat   0.191 -0.588       
## timefollow -0.053  0.163 -0.574

Let’s investigate the change in stress across times. First, make each line represent the raw stress level of each individual across time.

library(ggplot2)
## 
## Attaching package: 'ggplot2'
## The following objects are masked from 'package:psych':
## 
##     %+%, alpha
dat1_2 <- dat1[dat1$pid%%20 == 0,]
dat1_2$treat <- factor(dat1_2$treat, labels=c("Control", "Treatment"))
ggplot(data=dat1_2, aes(x=time, y=stress, group=pid, colour=treat)) + geom_line()

Second, in each individual, all three time variables are used to predict stress levels. Then, the predicted values of stress are saved as stress2. Then, the predicted stress force a linear change within individuals. See the spline change of stress level in each client.

datl <- split(dat1, dat1$pid)
stresspred <- lapply(datl, function(x) { predict(lm(stress ~ timebase + timetreat + timefollow, data=x)) })
dat1$stress2 <- do.call(c, stresspred)
dat1_2 <- dat1[dat1$pid%%20 == 0,]
dat1_2$treat <- factor(dat1_2$treat, labels=c("Control", "Treatment"))
ggplot(data=dat1_2, aes(x=time, y=stress2, group=pid, colour=treat)) + geom_line()

Next, test the random slopes of each time variable. The model with the random slope of timebase was not convergent and the fixed effects of timebase was not significant too. Because timebase is not the crucial effect of the research, timebase is dropped from further analyses.

out1d <- lmer(stress ~ 1 + timebase + timetreat + timefollow + (1 + timebase|pid), data=dat1, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
## boundary (singular) fit: see help('isSingular')
anova(out1c, out1d)
## Data: dat1
## Models:
## out1c: stress ~ 1 + timebase + timetreat + timefollow + (1 | pid)
## out1d: stress ~ 1 + timebase + timetreat + timefollow + (1 + timebase | pid)
##       npar    AIC    BIC logLik deviance  Chisq Df Pr(>Chisq)    
## out1c    6 102693 102738 -51341   102681                         
## out1d    8 102422 102483 -51203   102406 274.68  2  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Next, the random slopes of timetreat was significant.

out1e <- lmer(stress ~ 1 + timebase + timetreat + timefollow + (1 + timetreat|pid), data=dat1, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
anova(out1c, out1e)
## Data: dat1
## Models:
## out1c: stress ~ 1 + timebase + timetreat + timefollow + (1 | pid)
## out1e: stress ~ 1 + timebase + timetreat + timefollow + (1 + timetreat | pid)
##       npar    AIC    BIC logLik deviance  Chisq Df Pr(>Chisq)    
## out1c    6 102693 102738 -51341   102681                         
## out1e    8 101783 101843 -50884   101767 913.99  2  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Next, the random slopes of timefollow was significant.

out1f <- lmer(stress ~ 1 + timebase + timetreat + timefollow + (1 + timefollow|pid), data=dat1, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
anova(out1c, out1f)
## Data: dat1
## Models:
## out1c: stress ~ 1 + timebase + timetreat + timefollow + (1 | pid)
## out1f: stress ~ 1 + timebase + timetreat + timefollow + (1 + timefollow | pid)
##       npar    AIC    BIC logLik deviance  Chisq Df Pr(>Chisq)    
## out1c    6 102693 102738 -51341   102681                         
## out1f    8 102450 102510 -51217   102434 247.44  2  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In sum, two random effects were significant. Thus, both random effects are analyzed in the same model and remove timebase from the model.

out1g <- lmer(stress ~ 1 + timetreat + timefollow + (1 + timetreat+ timefollow|pid),  data=dat1, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
summary(out1g)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: stress ~ 1 + timetreat + timefollow + (1 + timetreat + timefollow |  
##     pid)
##    Data: dat1
## Control: lmerControl(optimizer = "Nelder_Mead")
## 
##      AIC      BIC   logLik deviance df.resid 
## 101766.7 101842.1 -50873.3 101746.7    13990 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.0056 -0.6302  0.0125  0.6550  3.2871 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  pid      (Intercept)  0.3362  0.5799              
##           timetreat    1.3151  1.1468    0.20      
##           timefollow   1.8105  1.3455   -0.67 -0.31
##  Residual             74.5464  8.6340              
## Number of obs: 14000, groups:  pid, 1000
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 52.40588    0.11110 471.680
## timetreat   -0.88024    0.05103 -17.248
## timefollow   0.58704    0.10957   5.358
## 
## Correlation of Fixed Effects:
##            (Intr) timtrt
## timetreat  -0.445       
## timefollow  0.096 -0.474

Let’s plot the scatterplot matrix of the random effects. The easiest way is to generate data from the output of the random slope models. See the Random effects: section. The standard deviations and the correlation matrix of the random effects are saved. Next, see the means of each random effect in the Fixed effects: section. From means, standard deviations, and the correlation matrix, data are simulated by the mvrnorm function in the MASS package. Then, the simulated data are used to create scatterplot matrix using the pairs.panels in the psych package.

tempsd <- c(0.5799, 1.1468, 1.3455)
tempr <- matrix(c(1, 0.2, -0.67,
           0.2, 1, -0.31,
           -0.67, -0.31, 1), 3, 3)
library(lavaan)
tempcov <- cor2cov(tempr, tempsd)
library(MASS)
temp <- mvrnorm(1000, c(52.41, -0.88, 0.587), tempcov)
colnames(temp) <- c("intcept", "timetreat", "timefollow")
library(psych)
pairs.panels(temp, main = "Scatter Plot Matrix of the Random Effects") 

Next, the scatterplots of each pair are created.

plot(temp[,1], temp[,2], xlab="Baseline Stress", ylab="Stress Change in Treatment Period")
abline(h = 0, col="red", lty=2)
abline(v = 0, col="red", lty=2)

plot(temp[,1], temp[,3], xlab="Baseline Stress", ylab="Stress Change in Follow-up Period")
abline(h = 0, col="red", lty=2)
abline(v = 0, col="red", lty=2)

plot(temp[,2], temp[,3], xlab="Stress Change in Treatment Period", ylab="Stress Change in Follow-up Period")
abline(h = 0, col="red", lty=2)
abline(v = 0, col="red", lty=2)

Next, all covariates are added in the model. sleephrs is the only time-varying covariate. It is separated into individual means and individual-mean-centered variable.

dat1$sleephrs <- dat1$sleep / 60
dat1$avesleep <- ave(dat1$sleephrs, dat1$pid)
dat1$diffsleep <- dat1$sleephrs - dat1$avesleep

All time-invariant covariates, avesleep, treat, and age are grand-mean centered. Thus, the grand means must be calculated by (a) making a new data set where rows represent each individual and (b) calculating the means across individuals.

dat1a <- dat1[!duplicated(dat1$pid),]
round(apply(dat1a, 2, mean), 3)
##      rowid        pid       time     stress      treat      sleep        age 
##   6994.000    500.500      1.000     52.543      0.500    424.507     36.638 
##      timec   timebase  timetreat timefollow    stress2   sleephrs   avesleep 
##      0.000      0.000      0.000      0.000     52.442      7.075      7.060 
##  diffsleep 
##      0.015

Next, all covariates are included in the models.

out1h <- lmer(stress ~ 1 + timetreat + timefollow + diffsleep + I(avesleep - 7.075) + I(treat - 0.5) + I(age - 36.638) + (1 + timetreat+ timefollow|pid), data=dat1, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
summary(out1h)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: stress ~ 1 + timetreat + timefollow + diffsleep + I(avesleep -  
##     7.075) + I(treat - 0.5) + I(age - 36.638) + (1 + timetreat +  
##     timefollow | pid)
##    Data: dat1
## Control: lmerControl(optimizer = "Nelder_Mead")
## 
##      AIC      BIC   logLik deviance df.resid 
## 101443.5 101549.2 -50707.7 101415.5    13986 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.0920 -0.6324  0.0072  0.6530  3.2911 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  pid      (Intercept)  2.446   1.564               
##           timetreat    1.306   1.143    -0.65      
##           timefollow   1.713   1.309     0.16 -0.30
##  Residual             73.511   8.574               
## Number of obs: 14000, groups:  pid, 1000
## 
## Fixed effects:
##                     Estimate Std. Error t value
## (Intercept)         52.41799    0.11955 438.465
## timetreat           -0.88473    0.05077 -17.427
## timefollow           0.59457    0.10847   5.481
## diffsleep           -1.91626    0.14072 -13.617
## I(avesleep - 7.075)  0.18586    0.12980   1.432
## I(treat - 0.5)      -2.82121    0.18266 -15.445
## I(age - 36.638)      0.01326    0.01724   0.769
## 
## Correlation of Fixed Effects:
##             (Intr) timtrt tmfllw dffslp I(-7.0 I(-0.5
## timetreat   -0.625                                   
## timefollow   0.153 -0.472                            
## diffsleep   -0.006  0.006 -0.005                     
## I(vs-7.075)  0.017  0.000  0.000  0.001              
## I(tret-0.5)  0.000  0.000  0.000  0.003  0.003       
## I(g-36.638)  0.009  0.000  0.000  0.000  0.520  0.004

Next, all time-invariate covariates are centered at the calculated grand means. Also, all cross-level interactions are added to predict the random slopes of time.

out1i <- lmer(stress ~ 1 + timetreat + timefollow + diffsleep + timetreat*I(avesleep - 7.075) + timetreat*I(treat - 0.5) + timetreat*I(age - 36.638) + timefollow*I(avesleep - 7.075) + timefollow*I(treat - 0.5) + timefollow*I(age - 36.638) + (1 + timetreat+ timefollow|pid), data=dat1, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead", calc.derivs = FALSE))
summary(out1i)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: stress ~ 1 + timetreat + timefollow + diffsleep + timetreat *  
##     I(avesleep - 7.075) + timetreat * I(treat - 0.5) + timetreat *  
##     I(age - 36.638) + timefollow * I(avesleep - 7.075) + timefollow *  
##     I(treat - 0.5) + timefollow * I(age - 36.638) + (1 + timetreat +  
##     timefollow | pid)
##    Data: dat1
## Control: lmerControl(optimizer = "Nelder_Mead", calc.derivs = FALSE)
## 
##      AIC      BIC   logLik deviance df.resid 
## 101023.0 101173.9 -50491.5 100983.0    13980 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.0674 -0.6351  0.0053  0.6621  3.3091 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  pid      (Intercept)  0.3553  0.5960              
##           timetreat    0.4344  0.6591    0.46      
##           timefollow   1.3450  1.1598   -0.79  0.14
##  Residual             73.5184  8.5743              
## Number of obs: 14000, groups:  pid, 1000
## 
## Fixed effects:
##                                 Estimate Std. Error t value
## (Intercept)                    52.418891   0.110471 474.503
## timetreat                      -0.885219   0.041316 -21.426
## timefollow                      0.593959   0.106793   5.562
## diffsleep                      -1.892950   0.139462 -13.573
## I(avesleep - 7.075)             0.251976   0.156970   1.605
## I(treat - 0.5)                  0.059906   0.220894   0.271
## I(age - 36.638)                 0.013452   0.020844   0.645
## timetreat:I(avesleep - 7.075)  -0.035437   0.058707  -0.604
## timetreat:I(treat - 0.5)       -1.864007   0.082616 -22.562
## timetreat:I(age - 36.638)       0.002091   0.007796   0.268
## timefollow:I(avesleep - 7.075) -0.033808   0.151753  -0.223
## timefollow:I(treat - 0.5)       1.188626   0.213551   5.566
## timefollow:I(age - 36.638)     -0.018400   0.020154  -0.913
## 
## Correlation matrix not shown by default, as p = 13 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it

Plot the expected changes in the stress level in both treatment and control groups. First, the data from Individual 1 are selected to get the values of time, timetreat, and timefollow at each time point. Next, the regression equation of expect means is obtained from the fixed effects results involving treat, timetreat, and timefollow. Next, the expected means of each treatment condition and each time point are calculated and plotted.

dat1g1 <- dat1[dat1$pid == 1,]
w <- 1
ytreat <- 52.42 + 0.06*(w - 0.5) - 0.88*dat1g1$timetreat - 1.864*(w-0.5)*dat1g1$timetreat + 0.59*dat1g1$timefollow + 1.189*(w-0.5)*dat1g1$timefollow
w <- 0
yctrl <- 52.42 + 0.06*(w - 0.5) - 0.88*dat1g1$timetreat - 1.864*(w-0.5)*dat1g1$timetreat + 0.59*dat1g1$timefollow + 1.189*(w-0.5)*dat1g1$timefollow
plot(dat1g1$time, ytreat, type="n", xlab="time", ylab="stress", ylim=c(40,55))
lines(dat1g1$time, ytreat, col="red")
lines(dat1g1$time, yctrl, col="green")
legend(1, 43, c("Treatment", "Control"), lty=1, col=c("red", "green"))

treat significantly predicted the random slopes of timetreat and timefollow. Thus, both cross-level interactions are probed by the interactions package. Before using the package, all variables must be centered by creating new variables.

dat1$avesleepc <- dat1$avesleep - 7.075
dat1$treatc <- dat1$treat - 0.5
dat1$agec <- dat1$age - 36.638
out1in <- lmer(stress ~ 1 + timetreat + timefollow + diffsleep + timetreat*avesleepc + timetreat*treatc + timetreat*agec + timefollow*avesleepc + timefollow*treatc + timefollow*agec + (1 + timetreat+ timefollow|pid), data=dat1, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead", calc.derivs = FALSE))
summary(out1in)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: stress ~ 1 + timetreat + timefollow + diffsleep + timetreat *  
##     avesleepc + timetreat * treatc + timetreat * agec + timefollow *  
##     avesleepc + timefollow * treatc + timefollow * agec + (1 +  
##     timetreat + timefollow | pid)
##    Data: dat1
## Control: lmerControl(optimizer = "Nelder_Mead", calc.derivs = FALSE)
## 
##      AIC      BIC   logLik deviance df.resid 
## 101023.0 101173.9 -50491.5 100983.0    13980 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.0674 -0.6351  0.0053  0.6621  3.3091 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr       
##  pid      (Intercept)  0.3553  0.5960              
##           timetreat    0.4344  0.6591    0.46      
##           timefollow   1.3450  1.1598   -0.79  0.14
##  Residual             73.5184  8.5743              
## Number of obs: 14000, groups:  pid, 1000
## 
## Fixed effects:
##                       Estimate Std. Error t value
## (Intercept)          52.418891   0.110471 474.503
## timetreat            -0.885219   0.041316 -21.426
## timefollow            0.593959   0.106793   5.562
## diffsleep            -1.892950   0.139462 -13.573
## avesleepc             0.251976   0.156970   1.605
## treatc                0.059906   0.220894   0.271
## agec                  0.013452   0.020844   0.645
## timetreat:avesleepc  -0.035437   0.058707  -0.604
## timetreat:treatc     -1.864007   0.082616 -22.562
## timetreat:agec        0.002091   0.007796   0.268
## timefollow:avesleepc -0.033808   0.151753  -0.223
## timefollow:treatc     1.188626   0.213551   5.566
## timefollow:agec      -0.018400   0.020154  -0.913
## 
## Correlation matrix not shown by default, as p = 13 > 12.
## Use print(x, correlation=TRUE)  or
##     vcov(x)        if you need it

First, the simple slopes of timetreat at each treatment condition is calculated.

treatcval <- c(0, 1) - 0.5
library(interactions)
ss11 <- sim_slopes(model=out1in, pred=timetreat, modx=treatc, modx.values=treatcval)
ss11
## JOHNSON-NEYMAN INTERVAL 
## 
## When treatc is OUTSIDE the interval [-0.54, -0.42], the slope of timetreat
## is p < .05.
## 
## Note: The range of observed values of treatc is [-0.50, 0.50]
## 
## SIMPLE SLOPES ANALYSIS 
## 
## Slope of timetreat when treatc = -0.50: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   0.05   0.06     0.81   0.42
## 
## Slope of timetreat when treatc =  0.50: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -1.82   0.06   -31.06   0.00

Plot the interaction where X-axis is timetreat and each line represents each value of treat.

interact_plot(model=out1in, pred=timetreat, modx=treatc, modx.values=treatcval)

Second, the simple slopes of timefollow at each treatment condition is calculated.

ss12 <- sim_slopes(model=out1in, pred=timefollow, modx=treatc, modx.values=treatcval)
ss12
## JOHNSON-NEYMAN INTERVAL 
## 
## When treatc is OUTSIDE the interval [-0.85, -0.30], the slope of timefollow
## is p < .05.
## 
## Note: The range of observed values of treatc is [-0.50, 0.50]
## 
## SIMPLE SLOPES ANALYSIS 
## 
## Slope of timefollow when treatc = -0.50: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   0.00   0.15     0.00   1.00
## 
## Slope of timefollow when treatc =  0.50: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   1.19   0.15     7.85   0.00

Plot the interaction where X-axis is timefollow and each line represents each value of treat.

interact_plot(model=out1in, pred=timefollow, modx=treatc, modx.values=treatcval)

Third, the expected differences between treatment conditions at each time is calculated. However, the time variables are separated into timetreat and timefollow. Thus, timetreat is treated as the first moderator and timefollow is treated as the second moderator. To test the expected differences at all time points, all possible data points of timetreat and timefollow are included. Note that the time used for this probing interaction is more than one hour.

timetreatval <- c(0, 1, 2, 3, 4, 5, 6)
timefollowval <- c(0, 1, 2, 3)
ss13 <- sim_slopes(model=out1in, pred=treatc, modx=timetreat, mod2=timefollow, modx.values=timetreatval, mod2.values=timefollowval)
ss13
## ████████████████████ While timefollow (2nd moderator) = 3.00 ███████████████████ 
## 
## JOHNSON-NEYMAN INTERVAL 
## 
## When timetreat is OUTSIDE the interval [1.27, 2.57], the slope of treatc is
## p < .05.
## 
## Note: The range of observed values of timetreat is [0.00, 6.00]
## 
## SIMPLE SLOPES ANALYSIS 
## 
## Slope of treatc when timetreat = 6.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -7.56   0.57   -13.32   0.00
## 
## Slope of treatc when timetreat = 5.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -5.69   0.56   -10.14   0.00
## 
## Slope of treatc when timetreat = 4.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -3.83   0.57    -6.75   0.00
## 
## Slope of treatc when timetreat = 3.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -1.97   0.59    -3.35   0.00
## 
## Slope of treatc when timetreat = 2.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -0.10   0.61    -0.17   0.87
## 
## Slope of treatc when timetreat = 1.00: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   1.76   0.65     2.70   0.01
## 
## Slope of treatc when timetreat = 0.00: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   3.63   0.70     5.19   0.00
## 
## ████████████████████ While timefollow (2nd moderator) = 2.00 ███████████████████ 
## 
## JOHNSON-NEYMAN INTERVAL 
## 
## When timetreat is OUTSIDE the interval [0.82, 1.75], the slope of treatc is
## p < .05.
## 
## Note: The range of observed values of timetreat is [0.00, 6.00]
## 
## SIMPLE SLOPES ANALYSIS 
## 
## Slope of treatc when timetreat = 6.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -8.75   0.42   -20.60   0.00
## 
## Slope of treatc when timetreat = 5.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -6.88   0.40   -17.24   0.00
## 
## Slope of treatc when timetreat = 4.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -5.02   0.38   -13.16   0.00
## 
## Slope of treatc when timetreat = 3.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -3.15   0.39    -8.02   0.00
## 
## Slope of treatc when timetreat = 2.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -1.29   0.42    -3.11   0.00
## 
## Slope of treatc when timetreat = 1.00: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   0.57   0.45     1.27   0.21
## 
## Slope of treatc when timetreat = 0.00: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   2.44   0.49     4.98   0.00
## 
## ████████████████████ While timefollow (2nd moderator) = 1.00 ███████████████████ 
## 
## JOHNSON-NEYMAN INTERVAL 
## 
## When timetreat is OUTSIDE the interval [0.35, 0.96], the slope of treatc is
## p < .05.
## 
## Note: The range of observed values of timetreat is [0.00, 6.00]
## 
## SIMPLE SLOPES ANALYSIS 
## 
## Slope of treatc when timetreat = 6.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -9.94   0.37   -27.11   0.00
## 
## Slope of treatc when timetreat = 5.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -8.07   0.31   -26.19   0.00
## 
## Slope of treatc when timetreat = 4.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -6.21   0.26   -23.56   0.00
## 
## Slope of treatc when timetreat = 3.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -4.34   0.24   -18.14   0.00
## 
## Slope of treatc when timetreat = 2.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -2.48   0.24   -10.19   0.00
## 
## Slope of treatc when timetreat = 1.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -0.62   0.27    -2.27   0.02
## 
## Slope of treatc when timetreat = 0.00: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   1.25   0.32     3.87   0.00
## 
## ████████████████████ While timefollow (2nd moderator) = 0.00 ███████████████████ 
## 
## JOHNSON-NEYMAN INTERVAL 
## 
## When timetreat is OUTSIDE the interval [-0.21, 0.25], the slope of treatc
## is p < .05.
## 
## Note: The range of observed values of timetreat is [0.00, 6.00]
## 
## SIMPLE SLOPES ANALYSIS 
## 
## Slope of treatc when timetreat = 6.00: 
## 
##     Est.   S.E.   t val.      p
## -------- ------ -------- ------
##   -11.12   0.42   -26.37   0.00
## 
## Slope of treatc when timetreat = 5.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -9.26   0.35   -26.49   0.00
## 
## Slope of treatc when timetreat = 4.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -7.40   0.28   -26.13   0.00
## 
## Slope of treatc when timetreat = 3.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -5.53   0.23   -24.31   0.00
## 
## Slope of treatc when timetreat = 2.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -3.67   0.16   -22.45   0.00
## 
## Slope of treatc when timetreat = 1.00: 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -1.80   0.19    -9.49   0.00
## 
## Slope of treatc when timetreat = 0.00: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   0.06   0.22     0.27   0.78

Heteroscedastic Models

Example 2: Changes in Sense of Belonging in Undergraduate Students

Read the data set from Lecture 11.

dat2 <- read.table("lecture11ex2.csv", sep=",", header=TRUE)

Run the model where time is centered at the first year and stress is separated into Level-1 and Level-2 covariates by group mean centering and predicts the sense of belonging.

dat2$timec <- dat2$time - 1
dat2$avestress <- ave(dat2$stress, dat2$pid)
dat2$diffstress <- dat2$stress - dat2$avestress
library(lme4)
out2 <- lmer(mem ~ 1 + timec + diffstress + avestress + (1 + timec|pid), data=dat2, REML=FALSE)
summary(out2)
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: mem ~ 1 + timec + diffstress + avestress + (1 + timec | pid)
##    Data: dat2
## 
##      AIC      BIC   logLik deviance df.resid 
##  25552.9  25602.0 -12768.5  25536.9     3392 
## 
## Scaled residuals: 
##      Min       1Q   Median       3Q      Max 
## -3.10411 -0.48077  0.00681  0.49606  2.61723 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr 
##  pid      (Intercept) 234.68   15.319        
##           timec        35.29    5.941   -0.15
##  Residual              25.92    5.091        
## Number of obs: 3400, groups:  pid, 850
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept) 56.78964    2.57724  22.035
## timec       -1.88457    0.21823  -8.636
## diffstress   0.40878    0.01212  33.730
## avestress    0.10050    0.05011   2.006
## 
## Correlation of Fixed Effects:
##            (Intr) timec  dffstr
## timec      -0.045              
## diffstress -0.004  0.011       
## avestress  -0.977  0.000  0.003

The lme4 package does not have features for heteroscedastic models. Rather the nlme package is used instead. Let’s run the previous model with nlme.

library(nlme)
out2a <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat2, method="ML", na.action=na.omit)
summary(out2a)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat2 
##        AIC   BIC    logLik
##   25552.95 25602 -12768.47
## 
## Random effects:
##  Formula: ~1 + timec | pid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 15.319143 (Intr)
## timec        5.940565 -0.152
## Residual     5.090837       
## 
## Fixed effects:  mem ~ 1 + timec + diffstress + avestress 
##                Value Std.Error   DF  t-value p-value
## (Intercept) 56.78964 2.5787499 2548 22.02216  0.0000
## timec       -1.88457 0.2183537 2548 -8.63082  0.0000
## diffstress   0.40878 0.0121265 2548 33.70962  0.0000
## avestress    0.10050 0.0501375  848  2.00458  0.0453
##  Correlation: 
##            (Intr) timec  dffstr
## timec      -0.045              
## diffstress -0.004  0.011       
## avestress  -0.977  0.000  0.003
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.104097747 -0.480769826  0.006814801  0.496061463  2.617219474 
## 
## Number of Observations: 3400
## Number of Groups: 850

Next, run the models where the residual variances are different across four time points. The weights argument sets the variance structure. In this case, the variance is identical at each value of timec.

out2b <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat2, method="ML", na.action=na.omit, weights=varIdent(form = ~1|timec))
summary(out2b)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat2 
##       AIC      BIC    logLik
##   25510.7 25578.15 -12744.35
## 
## Random effects:
##  Formula: ~1 + timec | pid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 15.332924 (Intr)
## timec        5.888768 -0.154
## Residual     6.527777       
## 
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | timec 
##  Parameter estimates:
##         0         1         2         3 
## 1.0000000 0.6696634 0.9035280 0.3313754 
## Fixed effects:  mem ~ 1 + timec + diffstress + avestress 
##                Value Std.Error   DF  t-value p-value
## (Intercept) 56.91609 2.5920715 2548 21.95776  0.0000
## timec       -1.78621 0.2110654 2548 -8.46281  0.0000
## diffstress   0.40682 0.0119806 2548 33.95677  0.0000
## avestress    0.09579 0.0503951  848  1.90079  0.0577
##  Correlation: 
##            (Intr) timec  dffstr
## timec      -0.046              
## diffstress -0.005  0.014       
## avestress  -0.977  0.000  0.004
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.559353311 -0.410386627  0.009440652  0.406160670  2.627260285 
## 
## Number of Observations: 3400
## Number of Groups: 850

Compare the models with equal and unequal variances across time. The unequal residual variances were significant.

anova(out2a, out2b)
##       Model df      AIC      BIC    logLik   Test L.Ratio p-value
## out2a     1  8 25552.95 25602.00 -12768.47                       
## out2b     2 11 25510.70 25578.15 -12744.35 1 vs 2 48.2462  <.0001

Let’s calculate residual standard deviations at each time point.

out2bsigma <- as.numeric(VarCorr(out2b)[3, 2])
out2bsigmatime <- out2bsigma * coef(out2b$modelStruct$varStruct, uncons = FALSE)
c(out2bsigma, out2bsigmatime)
##                 1        2        3 
## 6.527777 4.371413 5.898029 2.163145

Next, run the models where the residual variances have the exponential changes. The weights argument is specified that the time variable predicts the residual variance by the exponential function.

out2c <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat2, method="ML", na.action=na.omit, weights=varExp(form = ~time))
summary(out2c)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat2 
##        AIC      BIC   logLik
##   25552.81 25607.99 -12767.4
## 
## Random effects:
##  Formula: ~1 + timec | pid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 15.271240 (Intr)
## timec        5.899544 -0.144
## Residual     5.560063       
## 
## Variance function:
##  Structure: Exponential of variance covariate
##  Formula: ~time 
##  Parameter estimates:
##       expon 
## -0.03465414 
## Fixed effects:  mem ~ 1 + timec + diffstress + avestress 
##                Value Std.Error   DF  t-value p-value
## (Intercept) 56.81090 2.5799359 2548 22.02028   0.000
## timec       -1.87412 0.2170987 2548 -8.63256   0.000
## diffstress   0.40759 0.0121412 2548 33.57090   0.000
## avestress    0.09976 0.0501617  848  1.98877   0.047
##  Correlation: 
##            (Intr) timec  dffstr
## timec      -0.045              
## diffstress -0.004  0.012       
## avestress  -0.977  0.000  0.003
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -3.20408953 -0.48006748  0.01228221  0.49351786  2.65221045 
## 
## Number of Observations: 3400
## Number of Groups: 850

The exponential change in residual variances is not significantly better than the equal variances model.

anova(out2a, out2c)
##       Model df      AIC      BIC    logLik   Test L.Ratio p-value
## out2a     1  8 25552.95 25602.00 -12768.47                       
## out2c     2  9 25552.81 25607.99 -12767.40 1 vs 2 2.14082  0.1434

Let’s calculate residual standard deviations at each time point.

out2csigma <- as.numeric(VarCorr(out2c)[3, 2])
expvalue <- coef(out2c$modelStruct$varStruct, uncons = FALSE)
out2csigma * exp(expvalue * (1:4))
## [1] 5.370684 5.187756 5.011058 4.840378

Next, run the models where the residual variances is the power function of time.

out2d <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat2, method="ML", na.action=na.omit, weights=varPower(form = ~time))
summary(out2d)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat2 
##        AIC      BIC    logLik
##   25551.99 25607.17 -12766.99
## 
## Random effects:
##  Formula: ~1 + timec | pid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 15.252205 (Intr)
## timec        5.883418 -0.141
## Residual     5.495255       
## 
## Variance function:
##  Structure: Power of variance covariate
##  Formula: ~time 
##  Parameter estimates:
##       power 
## -0.08991349 
## Fixed effects:  mem ~ 1 + timec + diffstress + avestress 
##                Value Std.Error   DF  t-value p-value
## (Intercept) 56.82431 2.5804668 2548 22.02094   0.000
## timec       -1.87242 0.2168923 2548 -8.63293   0.000
## diffstress   0.40734 0.0121381 2548 33.55850   0.000
## avestress    0.09938 0.0501726  848  1.98067   0.048
##  Correlation: 
##            (Intr) timec  dffstr
## timec      -0.045              
## diffstress -0.004  0.012       
## avestress  -0.977  0.000  0.003
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.205885624 -0.476766976  0.009982646  0.491381290  2.649537461 
## 
## Number of Observations: 3400
## Number of Groups: 850

The power change in residual variances is not significantly better than the equal variances model.

anova(out2a, out2d)
##       Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## out2a     1  8 25552.95 25602.00 -12768.47                        
## out2d     2  9 25551.99 25607.17 -12767.00 1 vs 2 2.958496  0.0854

Let’s calculate residual standard deviations at each time point.

out2dsigma <- as.numeric(VarCorr(out2d)[3, 2])
powervalue <- coef(out2d$modelStruct$varStruct, uncons = FALSE)
out2dsigma * (1:4)^powervalue
## [1] 5.495255 5.163227 4.978382 4.851260

Next, run the models where the residual variances is the power function of time with added constant.

out2e <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat2, method="ML", na.action=na.omit, weights=varConstPower(form = ~time), control = lmeControl(maxIter=200, msMaxIter = 200))
summary(out2e)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat2 
##        AIC      BIC   logLik
##   25547.19 25608.51 -12763.6
## 
## Random effects:
##  Formula: ~1 + timec | pid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 15.192096 (Intr)
## timec        5.813162 -0.126
## Residual     1.157254       
## 
## Variance function:
##  Structure: Constant plus power of variance covariate
##  Formula: ~time 
##  Parameter estimates:
##      const      power 
##   4.246851 -23.276544 
## Fixed effects:  mem ~ 1 + timec + diffstress + avestress 
##                Value Std.Error   DF  t-value p-value
## (Intercept) 56.88875 2.5840961 2548 22.01495  0.0000
## timec       -1.86365 0.2160190 2548 -8.62726  0.0000
## diffstress   0.40658 0.0121021 2548 33.59568  0.0000
## avestress    0.09755 0.0502450  848  1.94158  0.0525
##  Correlation: 
##            (Intr) timec  dffstr
## timec      -0.044              
## diffstress -0.005  0.012       
## avestress  -0.977  0.000  0.003
## 
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max 
## -3.18681037 -0.48089878  0.01053416  0.49459340  2.62536918 
## 
## Number of Observations: 3400
## Number of Groups: 850

The constant power change in residual variances is significantly better than the equal variances model.

anova(out2a, out2e)
##       Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## out2a     1  8 25552.95 25602.00 -12768.47                        
## out2e     2 10 25547.19 25608.51 -12763.60 1 vs 2 9.753754  0.0076

Let’s calculate residual standard deviations at each time point.

out2esigma <- as.numeric(VarCorr(out2e)[3, 2])
paramvar <- coef(out2e$modelStruct$varStruct, uncons = FALSE)
out2esigma * (paramvar[1] + (1:4)^paramvar[2])
## [1] 6.071939 4.914686 4.914685 4.914685

The model with different variances across times and the model where the variances are predicted by the constant power function are compared. The latter model is the restricted model. The test was significant so the model with different variances across times is selected.

anova(out2e, out2b)
##       Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## out2e     1 10 25547.19 25608.51 -12763.60                        
## out2b     2 11 25510.70 25578.15 -12744.35 1 vs 2 38.49245  <.0001
Example 3: Interviewee’s Ratings based on the Interaction between Interviewer’s and Interviewee’s Sex

Read the data set from Lecture 4.

dat3 <- read.table("lecture4ex2.csv", sep=",", header=TRUE)

Run the model predicting interviewee’s ratings by the interaction between interviewer’s and interviewee’s sex. In this model, the residual variances are equal across all combinations of interviewers and interviewees.

out3a <- lme(score ~ 1 + ersex + eesex + eesex:ersex + I(iq - 100), random = ~1 + eesex|erid, data=dat3, method="ML", na.action=na.omit)
summary(out3a)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat3 
##        AIC      BIC   logLik
##   61640.99 61705.88 -30811.5
## 
## Random effects:
##  Formula: ~1 + eesex | erid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 4.4849848 (Intr)
## eesex       0.8932304 0.125 
## Residual    4.6652507       
## 
## Fixed effects:  score ~ 1 + ersex + eesex + eesex:ersex + I(iq - 100) 
##                Value  Std.Error   DF  t-value p-value
## (Intercept) 75.53870 0.22127122 8997 341.3851  0.0000
## ersex        0.00989 0.31292405  998   0.0316  0.9748
## eesex       -0.41323 0.13790177 8997  -2.9966  0.0027
## I(iq - 100)  0.02942 0.00329961 8997   8.9155  0.0000
## ersex:eesex  1.07365 0.19502911 8997   5.5051  0.0000
##  Correlation: 
##             (Intr) ersex  eesex  I(-100
## ersex       -0.707                     
## eesex       -0.253  0.179              
## I(iq - 100) -0.003  0.002  0.001       
## ersex:eesex  0.179 -0.253 -0.707 -0.008
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.741278287 -0.636668210 -0.004775941  0.640990961  3.432036414 
## 
## Number of Observations: 10000
## Number of Groups: 1000

Next, estimate different residual variances across interviewee’s sex.

out3b <- lme(score ~ 1 + ersex + eesex + eesex:ersex + I(iq - 100), random = ~1 + eesex|erid, data=dat3, method="ML", na.action=na.omit, weights=varIdent(form = ~1|eesex))
summary(out3b)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat3 
##        AIC      BIC    logLik
##   61642.39 61714.49 -30811.19
## 
## Random effects:
##  Formula: ~1 + eesex | erid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 4.4790343 (Intr)
## eesex       0.8932344 0.138 
## Residual    4.6937412       
## 
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | eesex 
##  Parameter estimates:
##         0         1 
## 1.0000000 0.9878226 
## Fixed effects:  score ~ 1 + ersex + eesex + eesex:ersex + I(iq - 100) 
##                Value  Std.Error   DF  t-value p-value
## (Intercept) 75.53870 0.22127116 8997 341.3852  0.0000
## ersex        0.00989 0.31292398  998   0.0316  0.9748
## eesex       -0.41323 0.13790180 8997  -2.9966  0.0027
## I(iq - 100)  0.02941 0.00329918 8997   8.9143  0.0000
## ersex:eesex  1.07365 0.19502915 8997   5.5051  0.0000
##  Correlation: 
##             (Intr) ersex  eesex  I(-100
## ersex       -0.707                     
## eesex       -0.253  0.179              
## I(iq - 100) -0.003  0.002  0.001       
## ersex:eesex  0.179 -0.253 -0.707 -0.008
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.728448706 -0.635616701 -0.004869759  0.641990770  3.409480268 
## 
## Number of Observations: 10000
## Number of Groups: 1000

The likelihood ratio test indicated that residual variances are not significantly different across interviewee’s sex.

anova(out3a, out3b)
##       Model df      AIC      BIC    logLik   Test   L.Ratio p-value
## out3a     1  9 61640.99 61705.88 -30811.49                         
## out3b     2 10 61642.39 61714.49 -30811.19 1 vs 2 0.6004416  0.4384

Next, estimate different residual variances across interviewer’s sex.

out3c <- lme(score ~ 1 + ersex + eesex + eesex:ersex + I(iq - 100), random = ~1 + eesex|erid, data=dat3, method="ML", na.action=na.omit, weights=varIdent(form = ~1|ersex))
summary(out3c)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat3 
##        AIC      BIC    logLik
##   61642.85 61714.95 -30811.43
## 
## Random effects:
##  Formula: ~1 + eesex | erid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 4.4850481 (Intr)
## eesex       0.8929552 0.125 
## Residual    4.6783279       
## 
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | ersex 
##  Parameter estimates:
##         0         1 
## 1.0000000 0.9944001 
## Fixed effects:  score ~ 1 + ersex + eesex + eesex:ersex + I(iq - 100) 
##                Value  Std.Error   DF  t-value p-value
## (Intercept) 75.53871 0.22138425 8997 341.2108  0.0000
## ersex        0.00988 0.31292764  998   0.0316  0.9748
## eesex       -0.41324 0.13825236 8997  -2.9890  0.0028
## I(iq - 100)  0.02939 0.00329958 8997   8.9061  0.0000
## ersex:eesex  1.07366 0.19502394 8997   5.5053  0.0000
##  Correlation: 
##             (Intr) ersex  eesex  I(-100
## ersex       -0.707                     
## eesex       -0.253  0.179              
## I(iq - 100) -0.003  0.002  0.001       
## ersex:eesex  0.180 -0.253 -0.709 -0.008
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.731018675 -0.635825449 -0.004824803  0.640120756  3.442342623 
## 
## Number of Observations: 10000
## Number of Groups: 1000

The likelihood ratio test indicated that residual variances are not significantly different across interviewer’s sex.

anova(out3a, out3c)
##       Model df      AIC      BIC    logLik   Test   L.Ratio p-value
## out3a     1  9 61640.99 61705.88 -30811.49                         
## out3c     2 10 61642.85 61714.95 -30811.42 1 vs 2 0.1395039  0.7088

Finally, estimate different residual variances across both interviewee’s and interviewer’s sex.

out3d <- lme(score ~ 1 + ersex + eesex + eesex:ersex + I(iq - 100), random = ~1 + eesex|erid, data=dat3, method="ML", na.action=na.omit, weights=varIdent(form = ~1|eesex * ersex))
summary(out3d)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat3 
##        AIC      BIC    logLik
##   61646.21 61732.73 -30811.11
## 
## Random effects:
##  Formula: ~1 + eesex | erid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 4.4791314 (Intr)
## eesex       0.8928173 0.139 
## Residual    4.7142122       
## 
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | eesex * ersex 
##  Parameter estimates:
##       0*0       1*0       0*1       1*1 
## 1.0000000 0.9847232 0.9913074 0.9823369 
## Fixed effects:  score ~ 1 + ersex + eesex + eesex:ersex + I(iq - 100) 
##                Value  Std.Error   DF  t-value p-value
## (Intercept) 75.53871 0.22144919 8997 341.1108  0.0000
## ersex        0.00988 0.31292985  998   0.0316  0.9748
## eesex       -0.41324 0.13825094 8997  -2.9890  0.0028
## I(iq - 100)  0.02938 0.00329901 8997   8.9058  0.0000
## ersex:eesex  1.07367 0.19502180 8997   5.5054  0.0000
##  Correlation: 
##             (Intr) ersex  eesex  I(-100
## ersex       -0.708                     
## eesex       -0.254  0.180              
## I(iq - 100) -0.003  0.002  0.001       
## ersex:eesex  0.180 -0.253 -0.709 -0.008
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.714797353 -0.636329830 -0.004603456  0.641754174  3.425222190 
## 
## Number of Observations: 10000
## Number of Groups: 1000

The likelihood ratio test indicated that residual variances are not significantly different across interviewer’s sex.

anova(out3a, out3d)
##       Model df      AIC      BIC    logLik   Test   L.Ratio p-value
## out3a     1  9 61640.99 61705.88 -30811.49                         
## out3d     2 12 61646.21 61732.73 -30811.10 1 vs 2 0.7796672  0.8543

Autocorrelation

Example 4: Changes in Sense of Belonging in Undergraduate Students

Read the data set from Lecture 11.

dat4 <- read.table("lecture11ex2.csv", sep=",", header=TRUE)

Again, the nlme package is needed. Let’s make the model predicting sense of belonging where (a) time is centered at Year 1, (b) stress is separated into individual means and deviations from individual means, and (3) time has random slopes.

dat4$timec <- dat4$time - 1
dat4$avestress <- ave(dat4$stress, dat4$pid)
dat4$diffstress <- dat4$stress - dat4$avestress
library(nlme)
out4a <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat4, method="ML", na.action=na.omit)
summary(out4a)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat4 
##        AIC   BIC    logLik
##   25552.95 25602 -12768.47
## 
## Random effects:
##  Formula: ~1 + timec | pid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 15.319143 (Intr)
## timec        5.940565 -0.152
## Residual     5.090837       
## 
## Fixed effects:  mem ~ 1 + timec + diffstress + avestress 
##                Value Std.Error   DF  t-value p-value
## (Intercept) 56.78964 2.5787499 2548 22.02216  0.0000
## timec       -1.88457 0.2183537 2548 -8.63082  0.0000
## diffstress   0.40878 0.0121265 2548 33.70962  0.0000
## avestress    0.10050 0.0501375  848  2.00458  0.0453
##  Correlation: 
##            (Intr) timec  dffstr
## timec      -0.045              
## diffstress -0.004  0.011       
## avestress  -0.977  0.000  0.003
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -3.104097747 -0.480769826  0.006814801  0.496061463  2.617219474 
## 
## Number of Observations: 3400
## Number of Groups: 850

The first-order autoregressive model (AR1) is used to predict the error relationship. The correlation argument is used. The corARMA function is used to indicate that the autoregressive model is used. p = 1 means to use the first-order model.

out4e <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat4, method="ML", na.action=na.omit, correlation=corARMA(form = ~ 1 + timec | pid, p = 1))
summary(out4e)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat4 
##        AIC      BIC    logLik
##   25546.58 25601.76 -12764.29
## 
## Random effects:
##  Formula: ~1 + timec | pid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 15.040700 (Intr)
## timec        5.761745 -0.133
## Residual     5.762395       
## 
## Correlation Structure: AR(1)
##  Formula: ~1 + timec | pid 
##  Parameter estimate(s):
##       Phi 
## 0.2107787 
## Fixed effects:  mem ~ 1 + timec + diffstress + avestress 
##                Value Std.Error   DF  t-value p-value
## (Intercept) 56.73589 2.5786531 2548 22.00214  0.0000
## timec       -1.87752 0.2177654 2548 -8.62177  0.0000
## diffstress   0.40889 0.0120720 2548 33.87091  0.0000
## avestress    0.10181 0.0501331  848  2.03083  0.0426
##  Correlation: 
##            (Intr) timec  dffstr
## timec      -0.046              
## diffstress -0.004  0.011       
## avestress  -0.977  0.000  0.003
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -2.959984993 -0.435218926  0.005818563  0.435684614  2.496634314 
## 
## Number of Observations: 3400
## Number of Groups: 850

The AR1 is significant.

anova(out4a, out4e)
##       Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## out4a     1  8 25552.95 25602.00 -12768.47                        
## out4e     2  9 25546.58 25601.76 -12764.29 1 vs 2 8.370098  0.0038

Plot the error structure.

set.seed(123321)
ar1 <- coef(out4e$modelStruct$corStruct, uncons = FALSE)
out4esigma <- as.numeric(VarCorr(out4e)["Residual", "StdDev"])
e1 <- rnorm(20, 0, out4esigma/sqrt(1 - ar1^2)) 
e2 <- ar1*e1 + rnorm(20, 0, out4esigma)
e3 <- ar1*e2 + rnorm(20, 0, out4esigma)
e4 <- ar1*e3 + rnorm(20, 0, out4esigma)

mye <- c(e1, e2, e3, e4)
myid <- rep(1:20, 4)
myt <- rep(1:4, each=20)
daterror5 <- data.frame(t=myt, y=mye, id=myid)
library(ggplot2)
g <- ggplot(data=daterror5, aes(x=t, y=y, group=id)) 
g + geom_line(aes(color=factor(id)), size=1)
## Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
## ℹ Please use `linewidth` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

The second-order autoregressive model (AR2) is used to predict the error relationship and compared with the AR1 model. p = 2 means to use the second-order model in the corARMA function.

out4f <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat4, method="ML", na.action=na.omit, correlation=corARMA(form = ~ 1 + timec | pid, p = 2))
summary(out4f)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat4 
##        AIC      BIC    logLik
##   25547.67 25608.98 -12763.83
## 
## Random effects:
##  Formula: ~1 + timec | pid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 14.017434 (Intr)
## timec        5.522055 -0.107
## Residual     7.838567       
## 
## Correlation Structure: ARMA(2,0)
##  Formula: ~1 + timec | pid 
##  Parameter estimate(s):
##      Phi1      Phi2 
## 0.4761749 0.1355516 
## Fixed effects:  mem ~ 1 + timec + diffstress + avestress 
##                Value Std.Error   DF  t-value p-value
## (Intercept) 56.64618 2.5773895 2548 21.97812  0.0000
## timec       -1.87812 0.2178019 2548 -8.62307  0.0000
## diffstress   0.40870 0.0120715 2548 33.85683  0.0000
## avestress    0.10435 0.0501044  848  2.08275  0.0376
##  Correlation: 
##            (Intr) timec  dffstr
## timec      -0.048              
## diffstress -0.004  0.011       
## avestress  -0.977  0.000  0.003
## 
## Standardized Within-Group Residuals:
##          Min           Q1          Med           Q3          Max 
## -2.495698943 -0.393252700 -0.007488328  0.368454214  1.939632824 
## 
## Number of Observations: 3400
## Number of Groups: 850

The AR2 model was not significantly better than the AR1 model.

anova(out4e, out4f)
##       Model df      AIC      BIC    logLik   Test   L.Ratio p-value
## out4e     1  9 25546.58 25601.76 -12764.29                         
## out4f     2 10 25547.67 25608.99 -12763.83 1 vs 2 0.9078822  0.3407

Plot the error structure.

set.seed(123321)
ar <- coef(out4f$modelStruct$corStruct, uncons = FALSE)
out4fsigma <- as.numeric(VarCorr(out4f)["Residual", "StdDev"])
out4fsigma <- out4fsigma/sqrt(1 - ar[1]^2 - ar[2]^2 - ((ar[1]^2)*ar[2]/(1 - ar[2])))
e1 <- rnorm(20, 0, out4fsigma)
e2 <- ar[1]*e1 + rnorm(20, 0, out4fsigma)
e3 <- ar[1]*e2 + ar[2]*e1 + rnorm(20, 0, out4fsigma)
e4 <- ar[1]*e3 + ar[2]*e2 + rnorm(20, 0, out4fsigma)
mye <- c(e1, e2, e3, e4)
myid <- rep(1:20, 4)
myt <- rep(1:4, each=20)
daterror6 <- data.frame(t=myt, y=mye, id=myid)
library(ggplot2)
g <- ggplot(data=daterror6, aes(x=t, y=y, group=id)) 
g + geom_line(aes(color=factor(id)), size=1)

The AR1 model with different residual variances across time is used. The weights argument is added in the AR1 model.

out4g <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat4, method="ML", na.action=na.omit, weights=varIdent(form = ~1|timec), correlation=corARMA(form = ~ 1 + timec | pid, p = 1))
summary(out4g)
## Linear mixed-effects model fit by maximum likelihood
##   Data: dat4 
##        AIC      BIC    logLik
##   25493.77 25567.35 -12734.89
## 
## Random effects:
##  Formula: ~1 + timec | pid
##  Structure: General positive-definite, Log-Cholesky parametrization
##             StdDev    Corr  
## (Intercept) 14.711750 (Intr)
## timec        5.794567 -0.108
## Residual     7.054760       
## 
## Correlation Structure: AR(1)
##  Formula: ~1 + timec | pid 
##  Parameter estimate(s):
##       Phi 
## 0.2803926 
## Variance function:
##  Structure: Different standard deviations per stratum
##  Formula: ~1 | timec 
##  Parameter estimates:
##         0         1         2         3 
## 1.0000000 0.7982827 0.9267105 0.1845166 
## Fixed effects:  mem ~ 1 + timec + diffstress + avestress 
##                Value Std.Error   DF  t-value p-value
## (Intercept) 56.74832 2.5775886 2548 22.01605  0.0000
## timec       -1.79050 0.2109160 2548 -8.48917  0.0000
## diffstress   0.40719 0.0119572 2548 34.05388  0.0000
## avestress    0.10043 0.0501128  848  2.00404  0.0454
##  Correlation: 
##            (Intr) timec  dffstr
## timec      -0.046              
## diffstress -0.005  0.013       
## avestress  -0.977  0.000  0.003
## 
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max 
## -3.6979843 -0.3670609 -0.0138252  0.3270481  2.4832686 
## 
## Number of Observations: 3400
## Number of Groups: 850

The heteroscedastic AR1 model is compared with the homoscedastic AR1 model.

anova(out4e, out4g)
##       Model df      AIC      BIC    logLik   Test  L.Ratio p-value
## out4e     1  9 25546.58 25601.76 -12764.29                        
## out4g     2 12 25493.77 25567.35 -12734.89 1 vs 2 58.80574  <.0001

Plot the error structure.

set.seed(123321)
ar1 <- coef(out4g$modelStruct$corStruct, uncons = FALSE)
sdratio <- coef(out4g$modelStruct$varStruct, uncons = FALSE)
out4gsigma <- as.numeric(VarCorr(out4g)["Residual", "StdDev"])
e1 <- rnorm(20, 0, out4gsigma/sqrt(1 - ar1^2))
e2 <- ar1*e1 + rnorm(20, 0, out4gsigma * sdratio[1])
e3 <- ar1*e2 + rnorm(20, 0, out4gsigma * sdratio[2])
e4 <- ar1*e3 + rnorm(20, 0, out4gsigma * sdratio[3])
mye <- c(e1, e2, e3, e4)
myid <- rep(1:20, 4)
myt <- rep(1:4, each=20)
daterror5 <- data.frame(t=myt, y=mye, id=myid)
library(ggplot2)
g <- ggplot(data=daterror5, aes(x=t, y=y, group=id)) 
g + geom_line(aes(color=factor(id)), size=1)

R-squared in Growth Curve Model

Example 5: Changes in Sense of Belonging in Undergraduate Students

Read the data set from Lecture 11.

dat5 <- read.table("lecture11ex2.csv", sep=",", header=TRUE)

Run the model with equal residual variances and without autocorrelation.

dat5$timec <- dat5$time - 1
dat5$avestress <- ave(dat5$stress, dat5$pid)
dat5$diffstress <- dat5$stress - dat5$avestress
library(nlme)
out5a <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat5, method="ML", na.action=na.omit)
sumout5a <- summary(out5a)

See that the output from the summary function is saved as an object. The output will be used in the r2mlm_long_manual function as mentioned in Lecture 8. \(R^2\) can be calculated as follows:

library(r2mlm)
## 
## Please cite as:
##  Shaw, M., Rights, J.D., Sterba, S.S., Flake, J.K. (2023). r2mlm: An R package calculating R-squared measures for multilevel models. Behavior Research Methods, 55(4), 1942-1964. doi:10.3758/s13428-022-01841-4
r2mlm_long_manual(data=dat5, covs=c("timec", "diffstress", "avestress"), random_covs=c("timec"), gammas=coef(sumout5a)[-1, "Value"], clusterID="pid", Tau=getVarCov(out5a), sigma2=out5a$sigma^2, bargraph=FALSE)
## $Decompositions
##                                     total    within     between
## fixed slopes (within)         0.047108859 0.1952116          NA
## fixed slopes (between)        0.003166966        NA 0.004174322
## slope variation (within)      0.122351796 0.5070063          NA
## slope variation (between)     0.000000000        NA 0.000000000
## intercept variation (between) 0.755511013        NA 0.995825678
## residual (within)             0.071861365 0.2977820          NA
## 
## $R2s
##           total    within     between
## f1  0.047108859 0.1952116          NA
## f2  0.003166966        NA 0.004174322
## v1  0.122351796 0.5070063          NA
## v2  0.000000000        NA 0.000000000
## m   0.755511013        NA 0.995825678
## f   0.050275825        NA          NA
## fv  0.172627621 0.7022180 0.004174322
## fvm 0.928138635        NA          NA

Next, the model with equal residual variances and AR1 is analyzed.

out5e <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat5, method="ML", na.action=na.omit, correlation=corARMA(form = ~ 1 + timec | pid, p = 1))
sumout5e <- summary(out5e)

In the AR1 model, the residual variance in the summary function cannot be used directly in the sigma2 argument. However, residual variance can be calculated by \(\sigma^2/(1 - \rho^2)\)

r2mlm_long_manual(data=dat5, covs=c("timec", "diffstress", "avestress"), random_covs=c("timec"), gammas=coef(sumout5e)[-1, "Value"], clusterID="pid", Tau=getVarCov(out5e), sigma2=out5e$sigma^2/(1 - ar1^2), bargraph=FALSE)
## $Decompositions
##                                    total    within    between
## fixed slopes (within)         0.04684901 0.1794703         NA
## fixed slopes (between)        0.00323730        NA 0.00438089
## slope variation (within)      0.11465107 0.4392079         NA
## slope variation (between)     0.00000000        NA 0.00000000
## intercept variation (between) 0.73572216        NA 0.99561911
## residual (within)             0.09954046 0.3813218         NA
## 
## $R2s
##          total    within    between
## f1  0.04684901 0.1794703         NA
## f2  0.00323730        NA 0.00438089
## v1  0.11465107 0.4392079         NA
## v2  0.00000000        NA 0.00000000
## m   0.73572216        NA 0.99561911
## f   0.05008631        NA         NA
## fv  0.16473738 0.6186782 0.00438089
## fvm 0.90045954        NA         NA

Next, the model with unequal residual variances and without autocorrelation is analyzed.

out5b <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat5, method="ML", na.action=na.omit, weights=varIdent(form = ~1|timec))
sumout5b <- summary(out5b)

In the heteroscedastic model, the residual variances are unequal across time. Thus, residual variances are averaged across all data points. Then, put the average in the formula.

out5bsigma <- out5b$sigma
out5bsigmatime <- out5bsigma * coef(out5b$modelStruct$varStruct, uncons = FALSE)
out5berrvar <- c(out5bsigma, out5bsigmatime)^2
out5berrvar <- cbind(1:4, out5berrvar)
matchindex5 <- match(dat5$time, out5berrvar[,1])
dat5$vare <- out5berrvar[matchindex5, 2]
aveerrvar5b <- mean(dat5$vare)
r2mlm_long_manual(data=dat5, covs=c("timec", "diffstress", "avestress"), random_covs=c("timec"), gammas=coef(sumout5b)[-1, "Value"], clusterID="pid", Tau=getVarCov(out5b), sigma2=aveerrvar5b, bargraph=FALSE)
## $Decompositions
##                                     total    within     between
## fixed slopes (within)         0.045863872 0.1928090          NA
## fixed slopes (between)        0.002901599        NA 0.003807234
## slope variation (within)      0.121261997 0.5097782          NA
## slope variation (between)     0.000000000        NA 0.000000000
## intercept variation (between) 0.759226336        NA 0.996192766
## residual (within)             0.070746196 0.2974128          NA
## 
## $R2s
##           total    within     between
## f1  0.045863872 0.1928090          NA
## f2  0.002901599        NA 0.003807234
## v1  0.121261997 0.5097782          NA
## v2  0.000000000        NA 0.000000000
## m   0.759226336        NA 0.996192766
## f   0.048765471        NA          NA
## fv  0.170027468 0.7025872 0.003807234
## fvm 0.929253804        NA          NA

Finally, the model with unequal variances and AR1 is calculated.

out5g <- lme(mem ~ 1 + timec + diffstress + avestress, random = ~1 + timec|pid, data=dat5, method="ML", na.action=na.omit, weights=varIdent(form = ~1|timec), correlation=corARMA(form = ~ 1 + timec | pid, p = 1))
sumout5g <- summary(out5g)

Then, the calculation is quite complicated. The residual variance in the first time point is assumed to be equal to the variance of the white noise. Then, the residual variances of the following time points are calculated based on the impact of the AR1 function. After that, the error variances across time points are averaged and put in the formula.

ar1 <- coef(out5g$modelStruct$corStruct, uncons = FALSE)
sdratio <- coef(out5g$modelStruct$varStruct, uncons = FALSE)
out5gsigmatime <- out5g$sigma * sdratio
out5gerrvar <- c(out5g$sigma, out5gsigmatime)^2
errvar1g <- out5gerrvar[1] / (1 - ar1^2)
errvar2g <- (ar1^2)*errvar1g + out5gerrvar[2]
errvar3g <- (ar1^2)*errvar2g + out5gerrvar[3]
errvar4g <- (ar1^2)*errvar3g + out5gerrvar[4]
out5gerrvar <- cbind(1:4, c(errvar1g, errvar2g, errvar3g, errvar4g))
matchindex5 <- match(dat5$time, out5gerrvar[,1])
dat5$varg <- out5gerrvar[matchindex5, 2]
aveerrvar5g <- mean(dat5$varg)
r2mlm_long_manual(data=dat5, covs=c("timec", "diffstress", "avestress"), random_covs=c("timec"), gammas=coef(sumout5g)[-1, "Value"], clusterID="pid", Tau=getVarCov(out5g), sigma2=aveerrvar5g, bargraph=FALSE)
## $Decompositions
##                                     total    within     between
## fixed slopes (within)         0.045780213 0.1755998          NA
## fixed slopes (between)        0.003175372        NA 0.004295151
## slope variation (within)      0.116899033 0.4483911          NA
## slope variation (between)     0.000000000        NA 0.000000000
## intercept variation (between) 0.736116906        NA 0.995704849
## residual (within)             0.098028476 0.3760091          NA
## 
## $R2s
##           total    within     between
## f1  0.045780213 0.1755998          NA
## f2  0.003175372        NA 0.004295151
## v1  0.116899033 0.4483911          NA
## v2  0.000000000        NA 0.000000000
## m   0.736116906        NA 0.995704849
## f   0.048955585        NA          NA
## fv  0.165854618 0.6239909 0.004295151
## fvm 0.901971524        NA          NA