Read the data set and run two models which are different in only the
fixed effect of goldcolor
. The restricted model,
out1r
, does not have the fixed effect of
goldcolor
and the full model, out1f
, have the
fixed effect of goldcolor
.
dat1 <-read.table("lecture4ex1.csv", sep=",", header=TRUE)
library(lme4)
## Warning: package 'lme4' was built under R version 4.2.3
## Loading required package: Matrix
## Warning: package 'Matrix' was built under R version 4.2.3
out1r <- lmer(consume ~ 1 + I(length - 15) + (1|groupid), data=dat1, REML=FALSE)
out1f <- lmer(consume ~ 1 + I(length - 15) + goldcolor + (1|groupid), data=dat1, REML=FALSE)
To run the likelihood ratio test, the anova
function is
used. If significant, the full model is preferred. If not significant,
the restricted model is preferred.
anova(out1r, out1f)
## Data: dat1
## Models:
## out1r: consume ~ 1 + I(length - 15) + (1 | groupid)
## out1f: consume ~ 1 + I(length - 15) + goldcolor + (1 | groupid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## out1r 4 4131.8 4150.6 -2061.9 4123.8
## out1f 5 4127.7 4151.2 -2058.8 4117.7 6.1466 1 0.01317 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Let’s start with the random analysis of covariance model (which is a
submodel of the random intercept model). length
and
goldcolor
are both goldfish-level predictors. In this
model, the effects of both length
and
goldcolor
are fixed across tanks. Note that
control = lmerControl(optimizer ="Nelder_Mead")
is used
here to help the analysis to get a convergent result.
library(lme4)
out1m0 <- lmer(consume ~ 1 + I(length - 15) + goldcolor + (1|groupid), data=dat1, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
Next, the effect of length
is varied across tanks.
(1 + I(length - 15)|groupid)
means that both 1
and I(length - 15)
effects are varied across
groupid
. 1
means that the intercepts are
varied across groupid
and I(length - 15)
means
that the slopes of this variable is varied across groupid
.
The anova
function is used to run likelihood ratio test to
check whether allowing random slopes is significant.
out1m0a <- lmer(consume ~ 1 + I(length - 15) + goldcolor + (1 + I(length - 15)|groupid), data=dat1, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
anova(out1m0, out1m0a)
## Data: dat1
## Models:
## out1m0: consume ~ 1 + I(length - 15) + goldcolor + (1 | groupid)
## out1m0a: consume ~ 1 + I(length - 15) + goldcolor + (1 + I(length - 15) | groupid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## out1m0 5 4127.7 4151.2 -2058.8 4117.7
## out1m0a 7 4109.4 4142.3 -2047.7 4095.4 22.299 2 1.438e-05 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Next, the effect of goldcolor
is varied across tanks.
The anova
function is used to run likelihood ratio test to
check whether allowing random slopes is significant.
out1m0b <- lmer(consume ~ 1 + I(length - 15) + goldcolor + (1 + goldcolor|groupid), data=dat1, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
anova(out1m0, out1m0b)
## Data: dat1
## Models:
## out1m0: consume ~ 1 + I(length - 15) + goldcolor + (1 | groupid)
## out1m0b: consume ~ 1 + I(length - 15) + goldcolor + (1 + goldcolor | groupid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## out1m0 5 4127.7 4151.2 -2058.8 4117.7
## out1m0b 7 4131.1 4164.0 -2058.6 4117.1 0.5479 2 0.7604
The random slope of length
is significantly better than
the fixed slope of length
. However, the random slope of
goldcolor
was not significant. Thus, the model with only
random slope of length
is used.
summary(out1m0a)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: consume ~ 1 + I(length - 15) + goldcolor + (1 + I(length - 15) |
## groupid)
## Data: dat1
## Control: lmerControl(optimizer = "Nelder_Mead")
##
## AIC BIC logLik deviance df.resid
## 4109.4 4142.3 -2047.7 4095.4 804
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.2693 -0.6408 0.0377 0.6230 2.8841
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## groupid (Intercept) 19.36131 4.4001
## I(length - 15) 0.01636 0.1279 0.67
## Residual 5.97388 2.4442
## Number of obs: 811, groups: groupid, 100
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 39.28800 0.46021 85.370
## I(length - 15) 0.87494 0.02321 37.694
## goldcolor -0.46422 0.18375 -2.526
##
## Correlation of Fixed Effects:
## (Intr) I(-15)
## I(lngth-15) 0.394
## goldcolor -0.203 0.026
Check the scatterplot between random effects in this model by the
ranef
and plot
functions. The random
intercepts (on Y axis) is positively correlated with the random slopes
of length
(on X axis). That is, the stronger the effect of
length
in a tank, the more food fish eats in that tank.
ranef1m0a <- ranef(out1m0a)
plot(ranef1m0a)
## $groupid
Again, the random ANCOVA model with two interviewee-level predictors
is used as the baseline model to compare with the random slope models.
Note that the iq
is scaled because the model with
original-scaled iq
was not convergent.
dat2 <-read.table("lecture4ex2.csv", sep=",", header=TRUE)
out2m0 <- lmer(score ~ 1 + eesex + I(scale(iq)) + (1|erid), data=dat2, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
Next, the effect of interviewees’ sex, eesex
, is random.
The anova
function is used to run likelihood ratio test to
check whether allowing random slopes is significant.
out2m0a <- lmer(score ~ 1 + eesex + I(scale(iq)) + (1 + eesex|erid), data=dat2, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
anova(out2m0, out2m0a)
## Data: dat2
## Models:
## out2m0: score ~ 1 + eesex + I(scale(iq)) + (1 | erid)
## out2m0a: score ~ 1 + eesex + I(scale(iq)) + (1 + eesex | erid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## out2m0 5 61676 61712 -30833 61666
## out2m0a 7 61669 61719 -30828 61655 11.183 2 0.00373 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Next, the effect of interviewees’ IQ, iq
, is random. The
anova
function is used to run likelihood ratio test to
check whether allowing random slopes is significant.
out2m0b <- lmer(score ~ 1 + eesex + I(scale(iq)) + (1 + I(scale(iq))|erid), data=dat2, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
anova(out2m0, out2m0b)
## Data: dat2
## Models:
## out2m0: score ~ 1 + eesex + I(scale(iq)) + (1 | erid)
## out2m0b: score ~ 1 + eesex + I(scale(iq)) + (1 + I(scale(iq)) | erid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## out2m0 5 61676 61712 -30833 61666
## out2m0b 7 61680 61730 -30833 61666 0.6003 2 0.7407
The random slope of interviewees’ sex was significant so this random
slope is retained. Before checking the results by the
summary
function, analyzing iq
without random
slopes does not require the standard scores of iq
and the
control
argument. Thus, the original score of
iq
centered at 100 is used and the control
argument is removed.
out2m0a <- lmer(score ~ 1 + eesex + I(iq - 100) + (1 + eesex|erid), data=dat2, REML=FALSE)
summary(out2m0a)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: score ~ 1 + eesex + I(iq - 100) + (1 + eesex | erid)
## Data: dat2
##
## AIC BIC logLik deviance df.resid
## 61669.0 61719.4 -30827.5 61655.0 9993
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.6718 -0.6331 -0.0005 0.6408 3.3843
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## erid (Intercept) 20.115 4.485
## eesex 1.086 1.042 0.11
## Residual 21.765 4.665
## Number of obs: 10000, groups: erid, 1000
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 75.543630 0.156424 482.943
## eesex 0.123562 0.098957 1.249
## I(iq - 100) 0.029552 0.003304 8.945
##
## Correlation of Fixed Effects:
## (Intr) eesex
## eesex -0.249
## I(iq - 100) -0.002 -0.007
Plot the scatterplot between random intercepts and random slopes.
ranef2m0a <- ranef(out2m0a)
plot(ranef2m0a)
## $erid
From the random slope model, the interviewer-level predictor can be added to predict the ratings. This model is similar to the random intercept model from the previous chapter with the random slope of the effect of interviewees’ sex.
out2m0c <- lmer(score ~ 1 + eesex + I(iq - 100) + ersex + (1 + eesex|erid), data=dat2, REML=FALSE)
summary(out2m0c)
## Linear mixed model fit by maximum likelihood ['lmerMod']
## Formula: score ~ 1 + eesex + I(iq - 100) + ersex + (1 + eesex | erid)
## Data: dat2
##
## AIC BIC logLik deviance df.resid
## 61668.9 61726.5 -30826.4 61652.9 9992
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.6647 -0.6348 -0.0018 0.6427 3.3811
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## erid (Intercept) 20.163 4.490
## eesex 1.086 1.042 0.08
## Residual 21.765 4.665
## Number of obs: 10000, groups: erid, 1000
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 75.321033 0.217765 345.883
## eesex 0.123560 0.098956 1.249
## I(iq - 100) 0.029563 0.003304 8.948
## ersex 0.445192 0.302694 1.471
##
## Correlation of Fixed Effects:
## (Intr) eesex I(-100
## eesex -0.184
## I(iq - 100) -0.002 -0.007
## ersex -0.695 0.000 0.000
The random ANCOVA model is used as the baseline model. Two customer-level predictors are used to test the random slope: age and sex.
dat3 <-read.table("lecture4ex3.csv", sep=",", header=TRUE)
out3m0 <- lmer(sat ~ 1 + I(age - 40) + female + (1|tableid), data=dat3, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
Check whether the age effects on customer satisfaction are different across tables.
out3m0a <- lmer(sat ~ 1 + I(age - 40) + female + (1 + I(age - 40)|tableid), data=dat3, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
anova(out3m0, out3m0a)
## Data: dat3
## Models:
## out3m0: sat ~ 1 + I(age - 40) + female + (1 | tableid)
## out3m0a: sat ~ 1 + I(age - 40) + female + (1 + I(age - 40) | tableid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## out3m0 5 16553 16582 -8271.5 16543
## out3m0a 7 16555 16595 -8270.3 16541 2.3274 2 0.3123
Check whether the sex effects on customer satisfaction are different across tables.
out3m0b <- lmer(sat ~ 1 + I(age - 40) + female + (1 + female|tableid), data=dat3, REML=FALSE, control = lmerControl(optimizer ="Nelder_Mead"))
anova(out3m0, out3m0b)
## Data: dat3
## Models:
## out3m0: sat ~ 1 + I(age - 40) + female + (1 | tableid)
## out3m0b: sat ~ 1 + I(age - 40) + female + (1 + female | tableid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## out3m0 5 16553 16582 -8271.5 16543
## out3m0b 7 16554 16594 -8269.9 16540 3.1026 2 0.212