Example 1: The Effect of Transformational Leadership on Job Satisfaction

Read the data set

ex1 <- read.table("lecture3ex1.csv", sep=",", header=TRUE)

Find the descriptive statistics of the variables in the data set

library(psych)
## Warning: package 'psych' was built under R version 4.2.3
describe(ex1)
##          vars  n  mean    sd median trimmed   mad   min   max range  skew
## tlleader    1 60 48.90 10.96  51.03   49.61  9.63 17.31 70.75 53.44 -0.70
## security    2 60 51.65 10.22  52.10   51.87  9.78 28.29 76.30 48.01 -0.13
## jobsat      3 60 51.21 15.74  50.99   50.93 17.83 20.00 80.00 60.00  0.06
##          kurtosis   se
## tlleader     0.40 1.41
## security    -0.26 1.32
## jobsat      -0.89 2.03

Test whether each correlation in a correlation matrix is significantly different from 0.

corr.test(ex1)
## Call:corr.test(x = ex1)
## Correlation matrix 
##          tlleader security jobsat
## tlleader     1.00     0.09  -0.14
## security     0.09     1.00   0.57
## jobsat      -0.14     0.57   1.00
## Sample Size 
## [1] 60
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##          tlleader security jobsat
## tlleader     0.00     0.58   0.58
## security     0.51     0.00   0.00
## jobsat       0.29     0.00   0.00
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option
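
As the note in the output suggests, the confidence intervals of the correlations can be printed with the short=FALSE option (shown here only as a suggestion; the resulting output is omitted).

print(corr.test(ex1), short=FALSE)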

Run a multiple regression with jobsat as the outcome and tlleader and security as the predictors. Save the result as out1 and use summary to see the output.

out1 <- lm(jobsat ~ tlleader + security, data=ex1)
summary(out1)
## 
## Call:
## lm(formula = jobsat ~ tlleader + security, data = ex1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -39.683  -8.821   2.048   7.479  26.441 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  17.8111    10.9031   1.634   0.1079    
## tlleader     -0.2724     0.1527  -1.785   0.0797 .  
## security      0.9045     0.1637   5.527 8.45e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.8 on 57 degrees of freedom
## Multiple R-squared:  0.3613, Adjusted R-squared:  0.3389 
## F-statistic: 16.12 on 2 and 57 DF,  p-value: 2.82e-06

In an lm formula, tlleader*security represents the effect of each predictor alone plus the product of the two variables. That is, the interaction can be specified with * in R.

out1a <- lm(jobsat ~ tlleader*security, data=ex1)
summary(out1a)
## 
## Call:
## lm(formula = jobsat ~ tlleader * security, data = ex1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -22.9307  -8.8247   0.3722   7.6131  18.6684 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       -152.11000   30.80450  -4.938 7.48e-06 ***
## tlleader             3.25737    0.62576   5.205 2.86e-06 ***
## security             4.19433    0.58680   7.148 1.97e-09 ***
## tlleader:security   -0.06808    0.01184  -5.751 3.84e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.24 on 56 degrees of freedom
## Multiple R-squared:  0.5985, Adjusted R-squared:  0.577 
## F-statistic: 27.83 on 3 and 56 DF,  p-value: 3.788e-11

You will see that tlleader:security is the interaction effect, which is significant here.
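
Equivalently, the same model can be written by listing the main effects and the interaction term explicitly, because * is shorthand for this expansion (a sketch only; out1b is just an illustrative name).

out1b <- lm(jobsat ~ tlleader + security + tlleader:security, data=ex1)
# out1b gives the same fit as out1a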

The anova function compares the models with and without the interaction.

anova(out1, out1a)
## Analysis of Variance Table
## 
## Model 1: jobsat ~ tlleader + security
## Model 2: jobsat ~ tlleader * security
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1     57 9335.4                                  
## 2     56 5868.8  1    3466.7 33.079 3.843e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
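
Because the two models differ by only one parameter, the F statistic in this comparison is simply the squared t value of the interaction term: F = (-5.751)^2 ≈ 33.08, which matches the table above.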

The intercept is not meaningful here because tlleader = 0 and security = 0 are not meaningful values of the predictors. Thus, both predictors are centered at their means.

out1_1 <- lm(jobsat ~ I(tlleader - mean(tlleader)) + I(security - mean(security)), data=ex1)
summary(out1_1)
## 
## Call:
## lm(formula = jobsat ~ I(tlleader - mean(tlleader)) + I(security - 
##     mean(security)), data = ex1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -39.683  -8.821   2.048   7.479  26.441 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                   51.2075     1.6522  30.994  < 2e-16 ***
## I(tlleader - mean(tlleader))  -0.2724     0.1527  -1.785   0.0797 .  
## I(security - mean(security))   0.9045     0.1637   5.527 8.45e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.8 on 57 degrees of freedom
## Multiple R-squared:  0.3613, Adjusted R-squared:  0.3389 
## F-statistic: 16.12 on 2 and 57 DF,  p-value: 2.82e-06
out1a_1 <- lm(jobsat ~ I(tlleader - mean(tlleader))*I(security - mean(security)), data=ex1)
summary(out1a_1)
## 
## Call:
## lm(formula = jobsat ~ I(tlleader - mean(tlleader)) * I(security - 
##     mean(security)), data = ex1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -22.9307  -8.8247   0.3722   7.6131  18.6684 
## 
## Coefficients:
##                                                           Estimate Std. Error
## (Intercept)                                               51.86394    1.32653
## I(tlleader - mean(tlleader))                              -0.25935    0.12215
## I(security - mean(security))                               0.86493    0.13109
## I(tlleader - mean(tlleader)):I(security - mean(security)) -0.06808    0.01184
##                                                           t value Pr(>|t|)    
## (Intercept)                                                39.097  < 2e-16 ***
## I(tlleader - mean(tlleader))                               -2.123   0.0382 *  
## I(security - mean(security))                                6.598 1.59e-08 ***
## I(tlleader - mean(tlleader)):I(security - mean(security))  -5.751 3.84e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 10.24 on 56 degrees of freedom
## Multiple R-squared:  0.5985, Adjusted R-squared:  0.577 
## F-statistic: 27.83 on 3 and 56 DF,  p-value: 3.788e-11

You can see that the anova results for the centered models are the same as those for the uncentered models.

anova(out1_1, out1a_1)
## Analysis of Variance Table
## 
## Model 1: jobsat ~ I(tlleader - mean(tlleader)) + I(security - mean(security))
## Model 2: jobsat ~ I(tlleader - mean(tlleader)) * I(security - mean(security))
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1     57 9335.4                                  
## 2     56 5868.8  1    3466.7 33.079 3.843e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
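
Mean-centering can also be done with the scale function by turning off rescaling (an equivalent formulation shown as a sketch; out1a_1b is an illustrative name, and the fit is the same as out1a_1).

out1a_1b <- lm(jobsat ~ I(scale(tlleader, scale=FALSE))*I(scale(security, scale=FALSE)), data=ex1)
# scale(x, scale=FALSE) subtracts the mean without dividing by the standard deviation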

The betaDelta package cannot be used to obtain standardized estimates here. Instead, each variable must be standardized (with the scale function) before the product term is formed and the regression is run; the I and scale functions are used inside the lm formula to do this. Note that the resulting test statistics are not correct, so interpret only the standardized point estimates.

out1_2 <- lm(I(scale(jobsat)) ~ I(scale(tlleader)) + I(scale(security)), data=ex1)
summary(out1_2)
## 
## Call:
## lm(formula = I(scale(jobsat)) ~ I(scale(tlleader)) + I(scale(security)), 
##     data = ex1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.5212 -0.5604  0.1301  0.4752  1.6799 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         8.091e-17  1.050e-01   0.000   1.0000    
## I(scale(tlleader)) -1.896e-01  1.063e-01  -1.785   0.0797 .  
## I(scale(security))  5.873e-01  1.063e-01   5.527 8.45e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8131 on 57 degrees of freedom
## Multiple R-squared:  0.3613, Adjusted R-squared:  0.3389 
## F-statistic: 16.12 on 2 and 57 DF,  p-value: 2.82e-06
out1a_2 <- lm(I(scale(jobsat)) ~ I(scale(tlleader))*I(scale(security)), data=ex1)
summary(out1a_2)
## 
## Call:
## lm(formula = I(scale(jobsat)) ~ I(scale(tlleader)) * I(scale(security)), 
##     data = ex1)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.45683 -0.56065  0.02365  0.48368  1.18605 
## 
## Coefficients:
##                                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                            0.04170    0.08428   0.495   0.6227    
## I(scale(tlleader))                    -0.18051    0.08501  -2.123   0.0382 *  
## I(scale(security))                     0.56160    0.08512   6.598 1.59e-08 ***
## I(scale(tlleader)):I(scale(security)) -0.48428    0.08420  -5.751 3.84e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6504 on 56 degrees of freedom
## Multiple R-squared:  0.5985, Adjusted R-squared:  0.577 
## F-statistic: 27.83 on 3 and 56 DF,  p-value: 3.788e-11
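
As a quick algebraic check (not part of the original output), the standardized interaction estimate equals the unstandardized estimate rescaled by the standard deviations of the two predictors and the outcome.

coef(out1a)["tlleader:security"] * sd(ex1[,"tlleader"]) * sd(ex1[,"security"]) / sd(ex1[,"jobsat"])
# approximately -0.484, matching the I(scale(tlleader)):I(scale(security)) estimate above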

The interactions package is used for probing interactions. First, use interact_plot to visualize the interaction effect.

library(interactions)
## Warning: package 'interactions' was built under R version 4.2.3
interact_plot(out1a, pred="tlleader", modx="security")

The moderator values at which lines are drawn can be changed with modx.values, and plot.points=TRUE adds the observed data points to the plot.

interact_plot(out1a, pred="tlleader", modx="security", modx.values = c(30, 40, 50, 60, 70), plot.points=TRUE)

Many other arguments can be changed; see the help page for details.

interact_plot(out1a, pred="tlleader", modx="security", interval=TRUE, int.width=.9,
   x.label = "Transformational Leadership", y.label = "Job Satisfaction",
   main.title = NULL,  legend.main = "Job Security",
   colors = "seagreen") 

Simple slopes tests examine whether the effect of the focal predictor is significant at each level of the moderator.

sim_slopes(out1a, pred="tlleader", modx="security")
## JOHNSON-NEYMAN INTERVAL 
## 
## When security is OUTSIDE the interval [43.14, 51.44], the slope of tlleader
## is p < .05.
## 
## Note: The range of observed values of security is [28.29, 76.30]
## 
## SIMPLE SLOPES ANALYSIS 
## 
## Slope of tlleader when security = 41.43465 (- 1 SD): 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   0.44   0.17     2.52   0.01
## 
## Slope of tlleader when security = 51.65472 (Mean): 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -0.26   0.12    -2.12   0.04
## 
## Slope of tlleader when security = 61.87480 (+ 1 SD): 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -0.96   0.17    -5.61   0.00

Get and test the conditional intercepts.

sim_slopes(out1a, pred="tlleader", modx="security", cond.int=TRUE)
## JOHNSON-NEYMAN INTERVAL 
## 
## When security is OUTSIDE the interval [43.14, 51.44], the slope of tlleader
## is p < .05.
## 
## Note: The range of observed values of security is [28.29, 76.30]
## 
## SIMPLE SLOPES ANALYSIS 
## 
## When security = 41.43465 (- 1 SD): 
## 
##                                Est.   S.E.   t val.      p
## --------------------------- ------- ------ -------- ------
## Slope of tlleader              0.44   0.17     2.52   0.01
## Conditional intercept         43.02   1.89    22.77   0.00
## 
## When security = 51.65472 (Mean): 
## 
##                                Est.   S.E.   t val.      p
## --------------------------- ------- ------ -------- ------
## Slope of tlleader             -0.26   0.12    -2.12   0.04
## Conditional intercept         51.86   1.33    39.10   0.00
## 
## When security = 61.87480 (+ 1 SD): 
## 
##                                Est.   S.E.   t val.      p
## --------------------------- ------- ------ -------- ------
## Slope of tlleader             -0.96   0.17    -5.61   0.00
## Conditional intercept         60.70   1.88    32.27   0.00

Plot the confidence intervals of simple slopes at each level of the moderator.

ss <- sim_slopes(out1a, pred="tlleader", modx="security", modx.values = seq(30, 75, 5))
plot(ss)
## Registered S3 methods overwritten by 'broom':
##   method            from  
##   tidy.glht         jtools
##   tidy.summary.glht jtools
## Loading required namespace: broom.mixed

Check the Johnson-Neyman plot.

johnson_neyman(out1a, pred="tlleader", modx="security", alpha = .05)
## JOHNSON-NEYMAN INTERVAL 
## 
## When security is OUTSIDE the interval [43.14, 51.44], the slope of tlleader
## is p < .05.
## 
## Note: The range of observed values of security is [28.29, 76.30]

Example 2: Factors influencing attitude toward classical music.

Read the data set

ex2 <- read.table("lecture3ex2.csv", sep=",", header=TRUE)

Find the descriptive statistics of the variables in the data set

describe(ex2)
##          vars   n  mean    sd median trimmed  mad   min   max range  skew
## ses         1 100  4.96  2.10   5.24    5.01 2.18  0.00  9.67  9.67 -0.19
## age         2 100 41.66  9.09  42.27   41.84 8.01 17.90 64.49 46.59 -0.17
## guarlike    3 100 50.77 10.81  52.08   51.33 9.16 20.00 76.16 56.16 -0.49
## partlike    4 100 64.72  9.71  65.46   65.09 9.74 30.84 80.00 49.16 -0.59
##          kurtosis   se
## ses         -0.51 0.21
## age         -0.10 0.91
## guarlike     0.29 1.08
## partlike     0.26 0.97

Test whether each correlation in a correlation matrix is significantly different from 0.

corr.test(ex2)
## Call:corr.test(x = ex2)
## Correlation matrix 
##            ses   age guarlike partlike
## ses       1.00 -0.06     0.63     0.47
## age      -0.06  1.00    -0.09     0.14
## guarlike  0.63 -0.09     1.00     0.60
## partlike  0.47  0.14     0.60     1.00
## Sample Size 
## [1] 100
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##           ses  age guarlike partlike
## ses      0.00 0.72     0.00     0.00
## age      0.56 0.00     0.72     0.46
## guarlike 0.00 0.36     0.00     0.00
## partlike 0.00 0.15     0.00     0.00
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option

Run the multiple regression without interactions.

out2 <- lm(partlike ~ age + ses + guarlike, data=ex2)
summary(out2)
## 
## Call:
## lm(formula = partlike ~ age + ses + guarlike, data = ex2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -20.5429  -4.7012  -0.3549   5.8684  15.6513 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 28.49941    5.32750   5.349 5.98e-07 ***
## age          0.21546    0.08404   2.564   0.0119 *  
## ses          0.70153    0.46733   1.501   0.1366    
## guarlike     0.46814    0.09105   5.142 1.44e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.564 on 96 degrees of freedom
## Multiple R-squared:  0.4121, Adjusted R-squared:  0.3937 
## F-statistic: 22.43 on 3 and 96 DF,  p-value: 4.345e-11

Run the multiple regression with the interaction between age and ses.

out2a <- lm(partlike ~ age*ses + guarlike, data=ex2)
summary(out2a)
## 
## Call:
## lm(formula = partlike ~ age * ses + guarlike, data = ex2)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -16.3513  -5.8078   0.0796   4.8115  14.7311 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 59.90367    9.82356   6.098  2.3e-08 ***
## age         -0.52191    0.21361  -2.443 0.016399 *  
## ses         -5.25065    1.66125  -3.161 0.002112 ** 
## guarlike     0.45095    0.08565   5.265  8.7e-07 ***
## age:ses      0.14456    0.03891   3.715 0.000343 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.105 on 95 degrees of freedom
## Multiple R-squared:  0.4866, Adjusted R-squared:  0.465 
## F-statistic: 22.51 on 4 and 95 DF,  p-value: 4.239e-13

The anova function compares the models with and without the interaction.

anova(out2, out2a)
## Analysis of Variance Table
## 
## Model 1: partlike ~ age + ses + guarlike
## Model 2: partlike ~ age * ses + guarlike
##   Res.Df    RSS Df Sum of Sq      F   Pr(>F)    
## 1     96 5492.7                                 
## 2     95 4796.0  1    696.74 13.801 0.000343 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Check the simple slopes of ses at each level of age.

sim_slopes(out2a, pred="ses", modx="age")
## JOHNSON-NEYMAN INTERVAL 
## 
## When age is OUTSIDE the interval [26.38, 42.38], the slope of ses is p <
## .05.
## 
## Note: The range of observed values of age is [17.90, 64.49]
## 
## SIMPLE SLOPES ANALYSIS 
## 
## Slope of ses when age = 32.57988 (- 1 SD): 
## 
##    Est.   S.E.   t val.      p
## ------- ------ -------- ------
##   -0.54   0.55    -0.98   0.33
## 
## Slope of ses when age = 41.66498 (Mean): 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   0.77   0.44     1.76   0.08
## 
## Slope of ses when age = 50.75007 (+ 1 SD): 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   2.09   0.58     3.62   0.00

Plot the interaction between ses and age.

interact_plot(out2a, pred="ses", modx="age")
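
The johnson_neyman plot used in Example 1 can be applied here as well to visualize the range of age over which the slope of ses is significant (a suggestion; the corresponding output is not shown above).

johnson_neyman(out2a, pred="ses", modx="age", alpha = .05)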

Example 3: Factors influencing anger toward a person who is late.

Read the data set

ex3 <- read.table("lecture3ex3.csv", sep=",", header=TRUE)

Find the descriptive statistics of the variables in the data set

describe(ex3)
##          vars   n  mean   sd median trimmed  mad   min   max range skew
## latemins    1 150  3.31 2.51   3.00    3.04 2.97  0.00 12.00 12.00 1.05
## country*    2 150  2.00 0.82   2.00    2.00 1.48  1.00  3.00  2.00 0.00
## angry       3 150 36.18 9.48  34.92   35.37 8.76 19.68 73.74 54.05 0.88
##          kurtosis   se
## latemins     1.12 0.21
## country*    -1.52 0.07
## angry        1.20 0.77

Find the frequency table of country.

table(ex3[,"country"])
## 
## chinese english    thai 
##      50      50      50

Test whether each correlation in a correlation matrix (only between continuous variables) is significantly different from 0.

corr.test(ex3[,c("latemins", "angry")])
## Call:corr.test(x = ex3[, c("latemins", "angry")])
## Correlation matrix 
##          latemins angry
## latemins     1.00  0.53
## angry        0.53  1.00
## Sample Size 
## [1] 150
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##          latemins angry
## latemins        0     0
## angry           0     0
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option

Change country to be in the factor format and change the reference group to thai.

ex3[,"country"] <- factor(ex3[,"country"])
ex3[,"country"] <- relevel(ex3[,"country"], ref="thai")

Run the multiple regression without interactions.

out3 <- lm(angry ~ latemins + country, data=ex3)
summary(out3)
## 
## Call:
## lm(formula = angry ~ latemins + country, data = ex3)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -15.0516  -4.0123  -0.2607   4.2589  18.5488 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     26.3328     1.2353  21.318  < 2e-16 ***
## latemins         1.9676     0.2266   8.685 6.98e-15 ***
## countrychinese   0.8071     1.3907   0.580    0.563    
## countryenglish   9.1790     1.3907   6.600 7.10e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.953 on 146 degrees of freedom
## Multiple R-squared:  0.4733, Adjusted R-squared:  0.4625 
## F-statistic: 43.73 on 3 and 146 DF,  p-value: < 2.2e-16

Run the multiple regression with the interaction between latemins and country. You will see that the dummy variables are multiplied by the continuous variable automatically.

out3a <- lm(angry ~ latemins*country, data=ex3)
summary(out3a)
## 
## Call:
## lm(formula = angry ~ latemins * country, data = ex3)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.6988  -4.5530   0.1675   3.7784  16.0432 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)              29.5190     1.5485  19.063  < 2e-16 ***
## latemins                  1.0021     0.3826   2.619  0.00976 ** 
## countrychinese           -0.2551     2.0873  -0.122  0.90290    
## countryenglish            0.2512     2.1779   0.115  0.90833    
## latemins:countrychinese   0.3140     0.5051   0.622  0.53516    
## latemins:countryenglish   2.6642     0.5304   5.023 1.48e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.339 on 144 degrees of freedom
## Multiple R-squared:  0.5682, Adjusted R-squared:  0.5532 
## F-statistic:  37.9 on 5 and 144 DF,  p-value: < 2.2e-16
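
To see how R builds the dummy variables and their products with latemins, the design matrix of the fitted model can be inspected (a suggestion; the output is omitted).

head(model.matrix(out3a))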

The anova function compares the models with and without the interaction.

anova(out3, out3a)
## Analysis of Variance Table
## 
## Model 1: angry ~ latemins + country
## Model 2: angry ~ latemins * country
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    146 7058.5                                  
## 2    144 5786.4  2    1272.1 15.828 6.113e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Check the simple slopes of latemins at each country.

sim_slopes(out3a, pred="latemins", modx="country")
## Warning: Johnson-Neyman intervals are not available for factor moderators.
## SIMPLE SLOPES ANALYSIS 
## 
## Slope of latemins when country = english: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   3.67   0.37     9.98   0.00
## 
## Slope of latemins when country = chinese: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   1.32   0.33     3.99   0.00
## 
## Slope of latemins when country = thai: 
## 
##   Est.   S.E.   t val.      p
## ------ ------ -------- ------
##   1.00   0.38     2.62   0.01
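
These simple slopes can also be recovered directly from the coefficients of out3a: the slope for the reference group (thai) is the latemins coefficient, and each other country adds its interaction coefficient (a quick check of the values above).

coef(out3a)["latemins"]                                            # thai: about 1.00
coef(out3a)["latemins"] + coef(out3a)["latemins:countrychinese"]   # chinese: about 1.32
coef(out3a)["latemins"] + coef(out3a)["latemins:countryenglish"]   # english: about 3.67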

Plot the interaction between latemins and country.

interact_plot(out3a, pred="latemins", modx="country")

The interactions package can test the simple slopes of a continuous predictor at each level of a categorical moderator, but it cannot test whether the differences between countries are significant at a given number of late minutes. To test those differences, the lm formula must be adjusted.

First, check whether the countries differ at latemins = 0. One model includes country in the lm formula so that the countries may differ; the other model drops country from the formula, which constrains the differences between countries at latemins = 0 to be zero. Then use anova to compare the two models.

Note that latemins:country is the interaction term alone, whereas latemins*country includes the first-order terms and their interaction; that is, latemins, country, and latemins:country.

out3a <- lm(angry ~ latemins*country, data=ex3)
out3ax <- lm(angry ~ latemins + latemins:country, data=ex3)
anova(out3ax, out3a)
## Analysis of Variance Table
## 
## Model 1: angry ~ latemins + latemins:country
## Model 2: angry ~ latemins * country
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1    146 5788.8                           
## 2    144 5786.4  2    2.4004 0.0299 0.9706

Next, test the differences between countries when the person is three minutes late. Use centering to test these differences.

out3a3 <- lm(angry ~ I(latemins - 3)*country, data=ex3)
out3a3x <- lm(angry ~ I(latemins - 3) + I(latemins - 3):country, data=ex3)
anova(out3a3x, out3a3)
## Analysis of Variance Table
## 
## Model 1: angry ~ I(latemins - 3) + I(latemins - 3):country
## Model 2: angry ~ I(latemins - 3) * country
##   Res.Df    RSS Df Sum of Sq      F   Pr(>F)    
## 1    146 7836.4                                 
## 2    144 5786.4  2      2050 25.508 3.29e-10 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Next, test the differences between countries when the person is six minutes late. Use centering to test these differences.

out3a6 <- lm(angry ~ I(latemins - 6)*country, data=ex3)
out3a6x <- lm(angry ~ I(latemins - 6) + I(latemins - 6):country, data=ex3)
anova(out3a6x, out3a6)
## Analysis of Variance Table
## 
## Model 1: angry ~ I(latemins - 6) + I(latemins - 6):country
## Model 2: angry ~ I(latemins - 6) * country
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    146 9467.6                                  
## 2    144 5786.4  2    3681.2 45.805 4.019e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Next, test the differences between countries when the person is nine minutes late. Use centering to test these differences.

out3a9 <- lm(angry ~ I(latemins - 9)*country, data=ex3)
out3a9x <- lm(angry ~ I(latemins - 9) + I(latemins - 9):country, data=ex3)
anova(out3a9x, out3a9)
## Analysis of Variance Table
## 
## Model 1: angry ~ I(latemins - 9) + I(latemins - 9):country
## Model 2: angry ~ I(latemins - 9) * country
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    146 8589.9                                  
## 2    144 5786.4  2    2803.5 34.884 4.429e-13 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Next, test the differences between countries when the person is twelve minutes late. Use centering to test these differences.

out3a12 <- lm(angry ~ I(latemins - 12)*country, data=ex3)
out3a12x <- lm(angry ~ I(latemins - 12) + I(latemins - 12):country, data=ex3)
anova(out3a12x, out3a12)
## Analysis of Variance Table
## 
## Model 1: angry ~ I(latemins - 12) + I(latemins - 12):country
## Model 2: angry ~ I(latemins - 12) * country
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)    
## 1    146 8107.5                                  
## 2    144 5786.4  2    2321.1 28.882 2.841e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
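
Because the same two-model comparison is repeated at several centering values, it can be wrapped in a small helper function (a sketch only; testCountryDiff is a hypothetical name, not part of the original code).

testCountryDiff <- function(center, data) {
  # full model: countries may differ at latemins = center
  full <- lm(angry ~ I(latemins - center)*country, data=data)
  # reduced model: country differences at latemins = center are constrained to zero
  reduced <- lm(angry ~ I(latemins - center) + I(latemins - center):country, data=data)
  anova(reduced, full)
}
testCountryDiff(3, ex3)   # reproduces the three-minute comparison above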

Example 4: The curvilinear relationship between the frequency of fear-based advertisements and smoking frequency.

Read the data set

ex4 <- read.table("lecture3ex4.csv", sep=",", header=TRUE)

Find the descriptive statistics of the variables in the data set

describe(ex4)
##           vars  n  mean   sd median trimmed  mad  min   max range  skew
## fearprob     1 50  0.59 0.28   0.61     0.6 0.32 0.01  0.99  0.98 -0.36
## cigarette    2 50 14.26 6.94  15.00    14.2 5.93 0.00 36.00 36.00  0.23
##           kurtosis   se
## fearprob     -0.95 0.04
## cigarette     0.56 0.98

Test whether each correlation in a correlation matrix (only between continuous variables) is significantly different from 0.

corr.test(ex4)
## Call:corr.test(x = ex4)
## Correlation matrix 
##           fearprob cigarette
## fearprob      1.00     -0.59
## cigarette    -0.59      1.00
## Sample Size 
## [1] 50
## Probability values (Entries above the diagonal are adjusted for multiple tests.) 
##           fearprob cigarette
## fearprob         0         0
## cigarette        0         0
## 
##  To see confidence intervals of the correlations, print with the short=FALSE option

Run the regression without the second-order term.

out4 <- lm(cigarette ~ fearprob, data=ex4)
summary(out4)
## 
## Call:
## lm(formula = cigarette ~ fearprob, data = ex4)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.0786  -3.2576  -0.7166   2.7475  13.2565 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   22.897      1.898  12.067 3.82e-16 ***
## fearprob     -14.699      2.926  -5.024 7.44e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.679 on 48 degrees of freedom
## Multiple R-squared:  0.3446, Adjusted R-squared:  0.3309 
## F-statistic: 25.24 on 1 and 48 DF,  p-value: 7.436e-06

Run the multiple regression with the second-order term.

out4a <- lm(cigarette ~ fearprob + I(fearprob^2), data=ex4)
summary(out4a)
## 
## Call:
## lm(formula = cigarette ~ fearprob + I(fearprob^2), data = ex4)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.6048  -3.8607   0.4359   3.3665   9.0118 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     29.920      2.645  11.311 5.18e-15 ***
## fearprob       -50.608     10.650  -4.752 1.94e-05 ***
## I(fearprob^2)   33.467      9.617   3.480  0.00109 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.117 on 47 degrees of freedom
## Multiple R-squared:  0.4789, Adjusted R-squared:  0.4567 
## F-statistic:  21.6 on 2 and 47 DF,  p-value: 2.228e-07

The anova function compares the models with and without the second-order term.

anova(out4, out4a)
## Analysis of Variance Table
## 
## Model 1: cigarette ~ fearprob
## Model 2: cigarette ~ fearprob + I(fearprob^2)
##   Res.Df    RSS Df Sum of Sq      F   Pr(>F)   
## 1     48 1547.8                                
## 2     47 1230.7  1    317.12 12.111 0.001093 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Plot the curvilinear relationship.

with(ex4, plot(fearprob, cigarette, xlab="Probability of the Fear Advertisement", ylab="Number of Cigarettes Smoked"))
myorder <- order(ex4[,"fearprob"])
yhat <- predict(out4a)
lines(ex4[myorder,"fearprob"], yhat[myorder])

Probe the instantaneous rate of change when the probability of the fear advertisement is 20%.

out4a2 <- lm(cigarette ~ I(fearprob - 0.2) + I((fearprob - 0.2)^2), data=ex4)
summary(out4a2)
## 
## Call:
## lm(formula = cigarette ~ I(fearprob - 0.2) + I((fearprob - 0.2)^2), 
##     data = ex4)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.6048  -3.8607   0.4359   3.3665   9.0118 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             21.138      1.297  16.293  < 2e-16 ***
## I(fearprob - 0.2)      -37.221      6.988  -5.326 2.77e-06 ***
## I((fearprob - 0.2)^2)   33.467      9.617   3.480  0.00109 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.117 on 47 degrees of freedom
## Multiple R-squared:  0.4789, Adjusted R-squared:  0.4567 
## F-statistic:  21.6 on 2 and 47 DF,  p-value: 2.228e-07

Probe the instantaneous rate of change when the probability of the fear advertisement is 70%.

out4a7 <- lm(cigarette ~ I(fearprob - 0.7) + I((fearprob - 0.7)^2), data=ex4)
summary(out4a7)
## 
## Call:
## lm(formula = cigarette ~ I(fearprob - 0.7) + I((fearprob - 0.7)^2), 
##     data = ex4)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.6048  -3.8607   0.4359   3.3665   9.0118 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            10.8939     0.9241  11.788 1.22e-15 ***
## I(fearprob - 0.7)      -3.7540     4.1040  -0.915  0.36501    
## I((fearprob - 0.7)^2)  33.4669     9.6167   3.480  0.00109 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.117 on 47 degrees of freedom
## Multiple R-squared:  0.4789, Adjusted R-squared:  0.4567 
## F-statistic:  21.6 on 2 and 47 DF,  p-value: 2.228e-07
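
These conditional slopes follow from the quadratic model: the instantaneous rate of change at fearprob = c is b1 + 2*b2*c, where b1 and b2 are the first- and second-order coefficients of out4a. A quick check (not part of the original output):

b <- coef(out4a)
b["fearprob"] + 2*b["I(fearprob^2)"]*0.2   # about -37.22, matching I(fearprob - 0.2)
b["fearprob"] + 2*b["I(fearprob^2)"]*0.7   # about -3.75, matching I(fearprob - 0.7)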

Example 3 (cont.): Factors influencing anger toward a person who is late.

Examine the assumptions by applying the plot function to the output.

plot(out3)
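
plot on an lm object produces several diagnostic plots in sequence. They can be arranged on a single page by changing the plotting layout first (a suggestion, not part of the original code).

par(mfrow=c(2, 2))   # 2-by-2 layout for the diagnostic plots
plot(out3)
par(mfrow=c(1, 1))   # restore the default layout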

Use the car package, which provides useful functions for checking assumptions. First, check for outliers.

library(car)
## Warning: package 'car' was built under R version 4.2.3
## Loading required package: carData
## Warning: package 'carData' was built under R version 4.2.3
## 
## Attaching package: 'car'
## The following object is masked from 'package:psych':
## 
##     logit
outlierTest(out3)
## No Studentized residuals with Bonferroni p < 0.05
## Largest |rstudent|:
##     rstudent unadjusted p-value Bonferroni p
## 108 2.826432           0.005372      0.80579

The Q-Q plot can be used to check the normality of the residuals and to identify outliers.

qqPlot(out3)

## [1] 108 110

Leverage plots can be used to check for influential cases.

leveragePlots(out3)