Intraclass Correlation

Example 1: The Impact of Couples’ Neuroticism on Distress

Read the data set investigating male-female couples. Males’ distress and females’ distress are predicted by males’ neuroticism, females’ neuroticism, and length of relationship.

dat1 <- read.table("lecture15ex1.csv", sep=",", header=TRUE)

Check the descriptive statistics of the variables inside the data set.

library(psych)
describe(dat1)
##               vars   n   mean    sd median trimmed   mad   min    max  range
## coupleid         1 250 125.50 72.31 125.50  125.50 92.66  1.00 250.00 249.00
## menneuro         2 250  50.83 10.27  51.75   51.16  9.88 17.31  76.30  58.99
## mendistress      3 250  49.78  9.23  50.39   50.07  9.40 25.78  71.54  45.76
## womenneuro       4 250  50.00 10.24  50.29   49.94 10.76 20.67  77.09  56.42
## womendistress    5 250  50.40  8.76  51.13   50.41  8.85 27.55  73.31  45.76
## lengthrela       6 250   2.59  1.67   2.30    2.37  1.25  0.10  10.58  10.48
##                skew kurtosis   se
## coupleid       0.00    -1.21 4.57
## menneuro      -0.35     0.02 0.65
## mendistress   -0.26    -0.30 0.58
## womenneuro    -0.02    -0.30 0.65
## womendistress -0.03    -0.23 0.55
## lengthrela     1.54     3.33 0.11

Intraclass correlations (ICC) of the distinguishable dyads are simply the correlation of the same variable between roles.

with(dat1, cor(menneuro, womenneuro))
## [1] -0.06424998
with(dat1, cor(mendistress, womendistress))
## [1] 0.7704816
Example 2: Personality between best friends

Read the data set investigating best-friend dyads. In this case, two people cannot be separated by roles. Extraversion and neuroticism are measured from both people in the dyads.

dat2 <- read.table("lecture15ex2.csv", sep=",", header=TRUE)

Check the descriptive statistics of the variables inside the data set.

describe(dat2)
##           vars    n   mean     sd median trimmed    mad min  max range skew
## coupleid     1 1000 250.50 144.41  250.5  250.50 185.32   1  500   499 0.00
## subjectid    2 1000 500.50 288.82  500.5  500.50 370.65   1 1000   999 0.00
## neuro        3 1000  49.61  10.02   49.0   49.43  10.38  16   80    64 0.15
## extro        4 1000  49.58   9.68   49.0   49.48  10.38  20   75    55 0.03
##           kurtosis   se
## coupleid     -1.20 4.57
## subjectid    -1.20 9.13
## neuro        -0.15 0.32
## extro        -0.09 0.31

Use the null multilevel models to investigate ICC in indistinguishable dyads.

library(lme4)
out2a <- lmer(extro ~ 1 + (1|coupleid), data=dat2, REML=FALSE)
summary(out2a) 
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: extro ~ 1 + (1 | coupleid)
##    Data: dat2
## 
##      AIC      BIC   logLik deviance df.resid 
##   7325.6   7340.3  -3659.8   7319.6      997 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.1928 -0.5824 -0.0212  0.5686  2.6089 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  coupleid (Intercept) 30.70    5.541   
##  Residual             62.86    7.929   
## Number of obs: 1000, groups:  coupleid, 500
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  49.5760     0.3525   140.6
out2b <- lmer(neuro ~ 1 + (1|coupleid), data=dat2, REML=FALSE)
## boundary (singular) fit: see help('isSingular')
summary(out2b) 
## Linear mixed model fit by maximum likelihood  ['lmerMod']
## Formula: neuro ~ 1 + (1 | coupleid)
##    Data: dat2
## 
##      AIC      BIC   logLik deviance df.resid 
##   7452.8   7467.5  -3723.4   7446.8      997 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3547 -0.6598 -0.0609  0.6378  3.0333 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  coupleid (Intercept)   0.0     0.00   
##  Residual             100.4    10.02   
## Number of obs: 1000, groups:  coupleid, 500
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)  49.6100     0.3168   156.6
## optimizer (nloptwrap) convergence code: 0 (OK)
## boundary (singular) fit: see help('isSingular')

However, when ICC is negative, the regular MLM is inappropriate because ICC is measured via variance. The lower bound is 0. Instead, use the gls function in the nlme package. The gls function is similar to the lm function but it allows correlated residuals. To model the indistinguishable dyads, people from the same dyads are allowed to have correlated residuals. The correlation value is ICC, which can be negative.

library(nlme)
out2c <- gls(extro ~ 1, data = dat2, correlation = corCompSymm(form = ~1|coupleid))
summary(out2c)
## Generalized least squares fit by REML
##   Model: extro ~ 1 
##   Data: dat2 
##        AIC      BIC    logLik
##   7325.848 7340.569 -3659.924
## 
## Correlation Structure: Compound symmetry
##  Formula: ~1 | coupleid 
##  Parameter estimate(s):
##       Rho 
## 0.3290423 
## 
## Coefficients:
##              Value Std.Error  t-value p-value
## (Intercept) 49.576 0.3528763 140.4912       0
## 
## Standardized residuals:
##         Min          Q1         Med          Q3         Max 
## -3.05552852 -0.67937367 -0.05950718  0.66367038  2.62658092 
## 
## Residual standard error: 9.679504 
## Degrees of freedom: 1000 total; 999 residual
out2d <- gls(neuro ~ 1, data = dat2, correlation = corCompSymm(form = ~1|coupleid))
summary(out2d)
## Generalized least squares fit by REML
##   Model: neuro ~ 1 
##   Data: dat2 
##        AIC      BIC    logLik
##   7332.788 7347.508 -3663.394
## 
## Correlation Structure: Compound symmetry
##  Formula: ~1 | coupleid 
##  Parameter estimate(s):
##       Rho 
## -0.462961 
## 
## Coefficients:
##             Value Std.Error  t-value p-value
## (Intercept) 49.61 0.2322362 213.6187       0
## 
## Standardized residuals:
##         Min          Q1         Med          Q3         Max 
## -3.35383324 -0.65959053 -0.06086993  0.63763744  3.03251985 
## 
## Residual standard error: 10.02137 
## Degrees of freedom: 1000 total; 999 residual

Another method is to change the long data set to pairwise data set. Then, correlate the variable between actor and partner. To make the pairwise data set, first, make two copies of the data. The first copy has the increasing order of the dyad ID and the increasing order of the subject ID. The second copy has the increasing order of the dyad ID but the decreasing order of the subject ID.

dat2a <- dat2[order(dat2$coupleid, dat2$subjectid),]
head(dat2a)
##   coupleid subjectid neuro extro
## 1        1         1    49    40
## 2        1         2    62    58
## 3        2         3    37    54
## 4        2         4    53    45
## 5        3         5    47    56
## 6        3         6    56    62
dat2b <- dat2[order(dat2$coupleid, -dat2$subjectid),]
head(dat2b)
##   coupleid subjectid neuro extro
## 2        1         2    62    58
## 1        1         1    49    40
## 4        2         4    53    45
## 3        2         3    37    54
## 6        3         6    56    62
## 5        3         5    47    56

Then, the level-1 variables in the second copy are attached to the first copy with different names.

dat2a$extropartner <- dat2b$extro
dat2a$neuropartner <- dat2b$neuro
head(dat2a)
##   coupleid subjectid neuro extro extropartner neuropartner
## 1        1         1    49    40           58           62
## 2        1         2    62    58           40           49
## 3        2         3    37    54           45           53
## 4        2         4    53    45           54           37
## 5        3         5    47    56           62           56
## 6        3         6    56    62           56           47

dat2a is currently in the pairwise format. Find the correlation between the same variables to get the ICC.

with(dat2a, cor(extro, extropartner))
## [1] 0.328148
with(dat2a, cor(neuro, neuropartner))
## [1] -0.463747

Test of Distinguishability

Example 1: The Impact of Couples’ Neuroticism on Distress

Let \(X\) and \(Y\) be level-1 variables and \(A\) and \(B\) be different roles. Test of distinguishability is the combination of the following tests.

  • \(H_0:\mu_{X_A}=\mu_{X_B}\)
  • \(H_0:\mu_{Y_A}=\mu_{Y_B}\)
  • \(H_0:\sigma^2_{X_A}=\sigma^2_{X_B}\)
  • \(H_0:\sigma^2_{Y_A}=\sigma^2_{Y_B}\)
  • \(H_0:\rho_{X_A,Y_A}=\rho_{X_B,Y_B}\)
  • \(H_0:\rho_{X_A,Y_B}=\rho_{X_B,Y_A}\)

If any tests in this set is rejected, the variable used to separate the roles in distinguishable dyads is good. If all null hypotheses are retained, the role variable can be dropped and these dyads can be treated as indistinguishable dyads.

The lavaan package is used to test the distinguishability. In lavaan script, all parameters are listed including \(\mu_{X_A},\mu_{X_B},\mu_{Y_A},\mu_{Y_B},\sigma^2_{X_A},\sigma^2_{X_B},\sigma^2_{Y_A},\sigma^2_{Y_B},\rho_{X_A,Y_A},\rho_{X_B,Y_B},\rho_{X_A,Y_B},\rho_{X_B,Y_A}\). Let’s show the example in the men and women distress data. Four variables are involved in the analysis: mendistress, womendistress, menneuro, and womenneuro.

First, the means of each variable are listed in the script.

scriptmean <- '
mendistress ~ 1
womendistress ~ 1
menneuro ~ 1
womenneuro ~ 1
'

Next, the variances of each variable are listed in the script.

scriptvariance <- '
mendistress ~~ mendistress
womendistress ~~ womendistress
menneuro ~~ menneuro
womenneuro ~~ womenneuro
'

Next, the tested covariances of each variable are listed in the script. Note that, if variances are equal, the equality in covariances is the same as the equality in correlation.

scriptcovariance <- '
mendistress ~~ menneuro 
womendistress ~~ womenneuro
mendistress ~~ womenneuro
womendistress ~~ menneuro
'

Only four covariances have been listed. The other two covariances, \(\rho_{X_A,X_B},\rho_{Y_A,Y_B}\) are attached in the script. These two covariances can be standardized which represent intraclass correlations in distinguishable dyads. Thus, if the test of distinguishability is not rejected, these two covariances cannot be interpreted as ICC.

scriptextra <- '
mendistress ~~ womendistress 
menneuro ~~ womenneuro
'

All four scripts are attached into the same script.

scriptparent <- paste(scriptmean, scriptvariance, scriptcovariance, scriptextra)

Then, use the sem function to run the script on the data. Users may use the summary function to check the output now (we will show it later). The output is not shown here to save space. You will see that all means, variances, and covariances are not equal. This is the unconstrained (or parent) model.

library(lavaan)
outparent <- sem(scriptparent, data=dat1)

Next, make the script of the constrained model where means, variances, and covariances have some equality constraints. In lavaan, users can name each parameter called labels. If any parameters have the same labels, it means that equality constraints are applied.

First, the means of each variable are listed and constrained in the script.

scriptmean2 <- '
mendistress ~ m1*1
womendistress ~ m1*1
menneuro ~ m2*1
womenneuro ~ m2*1
'

The means of mendistress and womendistress are equally constrained by putting the same label as m1. The means of menneuro and womenneuro are equally constrained by putting the same label as m2. Next, the variances of each variable are listed and constrained in the script.

scriptvariance2 <- '
mendistress ~~ v1*mendistress
womendistress ~~ v1*womendistress
menneuro ~~ v2*menneuro
womenneuro ~~ v2*womenneuro
'

Next, the tested covariances of each variable are listed and constrained in the script.

scriptcovariance2 <- '
mendistress ~~ same*menneuro 
womendistress ~~ same*womenneuro
mendistress ~~ diff*womenneuro
womendistress ~~ diff*menneuro 
'

Finally, two additional covariances are attached with labels but without equality constraints.

scriptextra2 <- '
mendistress ~~ dis*womendistress 
menneuro ~~ neuro*womenneuro
'

All four scripts are attached into the same script.

scriptnested <- paste(scriptmean2, scriptvariance2, scriptcovariance2, scriptextra2)

Then, use the sem function to run the script on the data. Users may use the summary function to check the output now. The output is not shown here to save space. You will see that the equality constraints are indeed applied. This is the constrained (or nested) model.

outnested <- sem(scriptnested, data=dat1)

Run the likelihood ratio test. The test was significant so the test of distinguishability is rejected. The role of men and women are valid in distinguish types of people in the dyads.

anova(outnested, outparent)
## 
## Chi-Squared Difference Test
## 
##           Df    AIC    BIC  Chisq Chisq diff   RMSEA Df diff Pr(>Chisq)    
## outparent  0 6941.8 6991.1  0.000                                          
## outnested  6 7023.4 7051.5 93.601     93.601 0.24166       6  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

See the output of the parent model. The standardized columns show the standardized parameters, which you can see the correlation between variables.

summary(outparent, standardize=TRUE)
## lavaan 0.6.17 ended normally after 94 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                        14
## 
##   Number of observations                           250
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   mendistress ~~                                                        
##     menneuro         39.720    6.478    6.132    0.000   39.720    0.421
##   womendistress ~~                                                      
##     womenneuro       41.093    6.219    6.608    0.000   41.093    0.460
##   mendistress ~~                                                        
##     womenneuro       53.721    6.856    7.836    0.000   53.721    0.571
##   womendistress ~~                                                      
##     menneuro         13.432    5.729    2.345    0.019   13.432    0.150
##   mendistress ~~                                                        
##     womendistress    62.025    6.427    9.650    0.000   62.025    0.770
##   menneuro ~~                                                           
##     womenneuro       -6.731    6.639   -1.014    0.311   -6.731   -0.064
## 
## Intercepts:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     mendistress      49.779    0.583   85.448    0.000   49.779    5.404
##     womendistress    50.400    0.553   91.181    0.000   50.400    5.767
##     menneuro         50.830    0.648   78.413    0.000   50.830    4.959
##     womenneuro       50.001    0.646   77.349    0.000   50.001    4.892
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##     mendistress      84.844    7.589   11.180    0.000   84.844    1.000
##     womendistress    76.381    6.832   11.180    0.000   76.381    1.000
##     menneuro        105.053    9.396   11.180    0.000  105.053    1.000
##     womenneuro      104.469    9.344   11.180    0.000  104.469    1.000

Actor-Partner Interdependence Model (APIM)

Example 1: The Impact of Couples’ Neuroticism on Distress

For distinguishable dyads, path analysis is used to run APIM. Again, the lavaan package is used. In this example, psychological distress is predicted by both male’s and female’s neuroticism and length of relationships. The actor effect is mendistress ~ menneuro and womendistress ~ womenneuro. The partner effect is mendistress ~ womenneuro and womendistress ~ menneuro.

library(lavaan)
script1 <- '
mendistress ~ menneuro + womenneuro + lengthrela
womendistress ~ menneuro + womenneuro + lengthrela
mendistress ~~ womendistress
'
out1 <- sem(script1, data=dat1)
summary(out1, standardize=TRUE, rsquare=TRUE)
## lavaan 0.6.17 ended normally after 20 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         9
## 
##   Number of observations                           250
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##   mendistress ~                                                         
##     menneuro          0.357    0.035   10.196    0.000    0.357    0.398
##     womenneuro        0.451    0.036   12.456    0.000    0.451    0.500
##     lengthrela       -1.850    0.224   -8.260    0.000   -1.850   -0.336
##   womendistress ~                                                       
##     menneuro          0.085    0.042    2.014    0.044    0.085    0.099
##     womenneuro        0.291    0.043    6.697    0.000    0.291    0.341
##     lengthrela       -2.300    0.269   -8.544    0.000   -2.300   -0.440
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##  .mendistress ~~                                                        
##    .womendistress    23.757    2.791    8.512    0.000   23.757    0.639
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
##    .mendistress      30.949    2.768   11.180    0.000   30.949    0.365
##    .womendistress    44.694    3.998   11.180    0.000   44.694    0.585
## 
## R-Square:
##                    Estimate
##     mendistress       0.635
##     womendistress     0.415

lavaan can define additional parameters. Thus, users can add or subtract parameters to test some hypotheses. For example, users may have a hypothesis that the actor effects of men is less than the actor effect of women. Users can make separate labels for the actor effects of men (am) and women (aw). Then, diffa is defined as am - aw. If diffa is significant and less than 0, the user’s hypothesis is supported. See the lecture for the details of all additional parameters shown in the script.

script2 <- '
mendistress ~ am*menneuro + pmw*womenneuro + cm*lengthrela
womendistress ~ pwm*menneuro + aw*womenneuro + cw*lengthrela
mendistress ~~ womendistress

# Define New Parameters
apmen := am - pmw                         # Compare actor and partner on mendistress
apwomen := aw - pwm                       # Compare actor and partner on womendistress
diffa := am - aw                          # Compare actor effects on each role
diffp := pmw - pwm                        # Compare partner effects on each role
diffmenneuro := am - pwm                  # Compare the effects on menneuro on men and women distress
diffwomenneuro := pmw - aw                # Compare the effects on womenneuro on men and women distress
difflengthrela := cm - cw                 # Compare the effects on lengthrela on men and women distress
aveactor := (am + aw) / 2                 # Average actor effects
avepartner := (pmw + pwm) / 2             # Average partner effects
diffactorpartner := aveactor - avepartner # Compare actor and partner effects
'
out2 <- sem(script2, data=dat1)
summary(out2)
## lavaan 0.6.17 ended normally after 20 iterations
## 
##   Estimator                                         ML
##   Optimization method                           NLMINB
##   Number of model parameters                         9
## 
##   Number of observations                           250
## 
## Model Test User Model:
##                                                       
##   Test statistic                                 0.000
##   Degrees of freedom                                 0
## 
## Parameter Estimates:
## 
##   Standard errors                             Standard
##   Information                                 Expected
##   Information saturated (h1) model          Structured
## 
## Regressions:
##                    Estimate  Std.Err  z-value  P(>|z|)
##   mendistress ~                                       
##     menneuro  (am)    0.357    0.035   10.196    0.000
##     womenner (pmw)    0.451    0.036   12.456    0.000
##     lengthrl  (cm)   -1.850    0.224   -8.260    0.000
##   womendistress ~                                     
##     menneuro (pwm)    0.085    0.042    2.014    0.044
##     womenner  (aw)    0.291    0.043    6.697    0.000
##     lengthrl  (cw)   -2.300    0.269   -8.544    0.000
## 
## Covariances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##  .mendistress ~~                                      
##    .womendistress    23.757    2.791    8.512    0.000
## 
## Variances:
##                    Estimate  Std.Err  z-value  P(>|z|)
##    .mendistress      30.949    2.768   11.180    0.000
##    .womendistress    44.694    3.998   11.180    0.000
## 
## Defined Parameters:
##                    Estimate  Std.Err  z-value  P(>|z|)
##     apmen            -0.093    0.047   -1.972    0.049
##     apwomen           0.206    0.057    3.629    0.000
##     diffa             0.066    0.054    1.231    0.218
##     diffp             0.366    0.053    6.848    0.000
##     diffmenneuro      0.273    0.033    8.156    0.000
##     diffwomenneuro    0.159    0.034    4.624    0.000
##     difflengthrela    0.450    0.214    2.106    0.035
##     aveactor          0.324    0.029   11.208    0.000
##     avepartner        0.268    0.029    9.304    0.000
##     diffactorprtnr    0.057    0.023    2.506    0.012
Example 3: The Effect of Roommates’ Housework Hours on Satisfaction

For indistinguishable dyads, the gls function with a pairwise data set is used to analyze APIM. Let’s read the new data set of duo roommates. This data analyze the actor and partner effects of housework each roommate spent time on the roommates’ satisfaction.

dat3 <- read.table("lecture15ex3.csv", sep=",", header=TRUE)

Check the descriptive statistics.

describe(dat3)
##           vars    n   mean     sd median trimmed    mad min    max range  skew
## coupleid     1 1000 250.50 144.41  250.5  250.50 185.32   1  500.0 499.0  0.00
## subjectid    2 1000 500.50 288.82  500.5  500.50 370.65   1 1000.0 999.0  0.00
## house        3 1000   3.26   1.00    3.2    3.24   1.04   0    6.3   6.3  0.15
## sat          4 1000  50.02  12.70   50.0   50.06  13.34   6   89.0  83.0 -0.05
##           kurtosis   se
## coupleid     -1.20 4.57
## subjectid    -1.20 9.13
## house        -0.14 0.03
## sat          -0.14 0.40

Make the pairwise data set. house is the amount of time each person spending on housework. housepartner is the amount of time each partner spending on housework.

dat3a <- dat3[order(dat3$coupleid, dat3$subjectid),]
dat3b <- dat3[order(dat3$coupleid, -dat3$subjectid),]
dat3a$housepartner <- dat3b$house

Run the gls function to investigate the actor (house) and the partner (housepartner) effects.

library(nlme)
out3a <- gls(sat ~ 1 + house + housepartner,
            data = dat3a, correlation = corCompSymm(form = ~1|coupleid))
summary(out3a)
## Generalized least squares fit by REML
##   Model: sat ~ 1 + house + housepartner 
##   Data: dat3a 
##        AIC      BIC    logLik
##   7339.369 7363.893 -3664.685
## 
## Correlation Structure: Compound symmetry
##  Formula: ~1 | coupleid 
##  Parameter estimate(s):
##       Rho 
## 0.3421043 
## 
## Coefficients:
##                 Value Std.Error   t-value p-value
## (Intercept)  34.12828 2.1089358 16.182701       0
## house        -2.20285 0.3532515 -6.235925       0
## housepartner  7.08331 0.3532515 20.051761       0
## 
##  Correlation: 
##              (Intr) house 
## house        -0.890       
## housepartner -0.890  0.632
## 
## Standardized residuals:
##         Min          Q1         Med          Q3         Max 
## -3.08051766 -0.67341594 -0.02880145  0.68980361  2.66457362 
## 
## Residual standard error: 9.754758 
## Degrees of freedom: 1000 total; 997 residual

Let’s say that analysts want to check whether the actor and partner effects are different. In this case, the difference between the regression coefficients of actor effect and partner effect is tested. Thus, house - housepartner is needed. The contrast coeficients for (Intercept), house, and housepartner would be \(\begin{bmatrix}0&1&-1\end{bmatrix}\). The multcomp package can be used to test the contrast.

library(multcomp)
diffhouse <- matrix(c(0, 1, -1), nrow=1, ncol=3)
ct <- glht(out3a, linfct = diffhouse)
summary(ct)
## 
##   Simultaneous Tests for General Linear Hypotheses
## 
## Fit: gls(model = sat ~ 1 + house + housepartner, data = dat3a, correlation = corCompSymm(form = ~1 | 
##     coupleid))
## 
## Linear Hypotheses:
##        Estimate Std. Error z value Pr(>|z|)    
## 1 == 0   -9.286      0.303  -30.65   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## (Adjusted p values reported -- single-step method)

Let’s make another model where the actor and partner effects have the interaction predicting satisfaction.

out3b <- gls(sat ~ 1 + house*housepartner,
             data = dat3a, correlation = corCompSymm(form = ~1|coupleid))
summary(out3b)
## Generalized least squares fit by REML
##   Model: sat ~ 1 + house * housepartner 
##   Data: dat3a 
##        AIC      BIC    logLik
##   7331.238 7360.661 -3659.619
## 
## Correlation Structure: Compound symmetry
##  Formula: ~1 | coupleid 
##  Parameter estimate(s):
##       Rho 
## 0.3337279 
## 
## Coefficients:
##                       Value Std.Error   t-value p-value
## (Intercept)        45.53177  4.091386 11.128691  0.0000
## house              -5.71393  1.138436 -5.019105  0.0000
## housepartner        3.57223  1.138436  3.137843  0.0018
## house:housepartner  1.11993  0.345484  3.241623  0.0012
## 
##  Correlation: 
##                    (Intr) house  hsprtn
## house              -0.958              
## housepartner       -0.958  0.965       
## house:housepartner  0.860 -0.951 -0.951
## 
## Standardized residuals:
##         Min          Q1         Med          Q3         Max 
## -3.13128937 -0.68620229 -0.02168284  0.67313229  2.63806304 
## 
## Residual standard error: 9.693245 
## Degrees of freedom: 1000 total; 996 residual

The interaction was significant so predictor values for probing interaction are needed. Check the mean and standard deviation of the hours spent on housework. The values of 2.25, 3.25, and 4.25 are used for probing interactions.

mean(dat3$house)
## [1] 3.2566
sd(dat3$house)
## [1] 0.9981647
housevalue <- c(2.25, 3.25, 4.25)

Unfortunately, the interactions package does not support the gls function. Thus, the function I wrote to analyze simple slopes (mentioned in Lecture 13) is used here. Let’s check the effect of house at each level of housepartner.

source("quicksimslopeslme4.R")
quick_sim_slopes(out3b, pred="house", modx="housepartner", modx.values=housevalue)
## $`Conditional Intercept`
##   modx.values      est       se        z p
## 1        2.25 53.56930 1.796044 29.82628 0
## 2        3.25 57.14153 1.196587 47.75378 0
## 3        4.25 60.71376 1.493298 40.65749 0
## 
## $`Conditional Slope`
##   modx.values        est        se         z        p
## 1        2.25 -3.1940930 0.4651709 -6.866494 0.000000
## 2        3.25 -2.0741653 0.3527814 -5.879464 0.000000
## 3        4.25 -0.9542376 0.5208103 -1.832217 0.066919

The resulting coefficients of simple intercepts and simple slopes are used to plot the interaction with the help of the abline function.

cond_house <- quick_sim_slopes(out3b, pred="house", modx="housepartner", modx.values=housevalue)
cond_house_int <- cond_house[[1]][,"est"]
cond_house_slope <- cond_house[[2]][,"est"]
plot(dat3a$house, dat3a$sat, type="n", xlab="Self Housework Hours per Month", 
     ylab="Relationship Satisfaction", main="Actor Effect")
abline(a = cond_house_int[1], b = cond_house_slope[1], col="red")
abline(a = cond_house_int[2], b = cond_house_slope[2], col="green")
abline(a = cond_house_int[3], b = cond_house_slope[3], col="blue")
legend(0.1, 40, c("Low (2.25)", "Medium (3.25)", "High (4.25)"), 
       col=c("red", "green", "blue"), lty=1, title="Partner Housework Hours per Month")

Alternatively let’s check the effect of housepartner at each level of house.

quick_sim_slopes(out3b, pred="housepartner", modx="house", modx.values=housevalue)
## $`Conditional Intercept`
##   modx.values      est       se        z p
## 1        2.25 32.67542 1.796044 18.19300 0
## 2        3.25 26.96149 1.196587 22.53200 0
## 3        4.25 21.24756 1.493298 14.22861 0
## 
## $`Conditional Slope`
##   modx.values      est        se        z p
## 1        2.25 6.092072 0.4651709 13.09642 0
## 2        3.25 7.211999 0.3527814 20.44325 0
## 3        4.25 8.331927 0.5208103 15.99801 0

Plot the effect of housepartner at each level of house.

cond_housep <- quick_sim_slopes(out3b, pred="housepartner", modx="house", modx.values=housevalue)
cond_housep_int <- cond_housep[[1]][,"est"]
cond_housep_slope <- cond_housep[[2]][,"est"]
plot(dat3a$house, dat3a$sat, type="n", xlab="Partner Housework Hours per Month", 
     ylab="Relationship Satisfaction", main="Partner Effect")
abline(a = cond_housep_int[1], b = cond_housep_slope[1], col="red")
abline(a = cond_housep_int[2], b = cond_housep_slope[2], col="green")
abline(a = cond_housep_int[3], b = cond_housep_slope[3], col="blue")
legend(0.1, 90, c("Low (2.25)", "Medium (3.25)", "High (4.25)"), 
       col=c("red", "green", "blue"), lty=1, title="Self Housework Hours per Month")

Biases in Actor Effect Estimation

From the house and housepartner effect without interaction, the regression coefficient of house is -2.20. First, pretend that dyadic data is not available. Researchers can collect only one person in each room. In this case, the data will not have the nested data structure. The lm function is used.

dat3c <- dat3[!duplicated(dat3$coupleid),]
summary(lm(sat ~ house, data=dat3c))
## 
## Call:
## lm(formula = sat ~ house, data = dat3c)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -29.690  -8.277  -0.051   7.874  34.159 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   64.306      1.768   36.38  < 2e-16 ***
## house         -4.302      0.524   -8.21 1.91e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.68 on 498 degrees of freedom
## Multiple R-squared:  0.1192, Adjusted R-squared:  0.1174 
## F-statistic:  67.4 on 1 and 498 DF,  p-value: 1.912e-15

When the dyadic data is not available, the regression coefficient represents the actor effect without controlling the partner effect, which is -4.302. This value is quite different from APIM.

Assume that the dyadic data is available. Now, use only house to predict the dependent variable without housepartner. In this case, the regression coefficient will also represents the actor effect but this model assumes that the partner effect is 0. You will see that the actor effect from the model is also quite different from APIM, which is -7.59.

out3bias <- gls(sat ~ 1 + house,
             data = dat3a, correlation = corCompSymm(form = ~1|coupleid))
summary(out3bias)
## Generalized least squares fit by REML
##   Model: sat ~ 1 + house 
##   Data: dat3a 
##        AIC      BIC    logLik
##   7651.409 7671.032 -3821.704
## 
## Correlation Structure: Compound symmetry
##  Formula: ~1 | coupleid 
##  Parameter estimate(s):
##       Rho 
## 0.5441211 
## 
## Coefficients:
##                Value Std.Error   t-value p-value
## (Intercept) 74.74862 1.0658673  70.12939       0
## house       -7.59277 0.2930558 -25.90896       0
## 
##  Correlation: 
##       (Intr)
## house -0.895
## 
## Standardized residuals:
##          Min           Q1          Med           Q3          Max 
## -2.988840118 -0.699754559  0.001710381  0.664810861  3.029049389 
## 
## Residual standard error: 12.07816 
## Degrees of freedom: 1000 total; 998 residual

R-squared

In distinguishable dyads, users can use the sem function in lavaan and add rsquare=TRUE when they call the summary function. The proportion of variance explained in the dependent variables of each role can be seen.

In indistinguishable dyads, the \(R^2\) is not automatically provided. There are two ways to get the \(R^2\). First, users can use the the predict function to get the predicted values. Then the squared correlation between the predicted values and actual values is \(R^2\).

cor(predict(out3b), dat3$sat)^2
## [1] 0.4194868

Alternatively, if ICC is positive, the lme or lmer can be used. The r2mlm function can extract \(R^2\) from the output.

out3d <- lme(sat ~ 1 + house*housepartner, random = ~1 |coupleid, data = dat3a)
library(r2mlm)
r2mlm(out3d)

## $Decompositions
##                     total
## fixed           0.4185914
## slope variation 0.0000000
## mean variation  0.1940320
## sigma2          0.3873767
## 
## $R2s
##         total
## f   0.4185914
## v   0.0000000
## m   0.1940320
## fv  0.4185914
## fvm 0.6126233