Naive Standardization of Interaction Models #480

@Kss2k

This issue is similar to #427, but concerns the standardizedSolution() function when it is used in the presence of interaction effects. It is relevant for interactions between observed variables as well as latent variables (using the sam() function). Here I highlight two current issues.

Naive Standardization Interaction Term Variances

No doubt you're already aware of this (first) issue, but I'll detail it here for clarity. When standardizing an interaction model, standardizedSolution() constrains the variance of the interaction term to 1. This is in general only appropriate if the variables in the interaction term are orthogonal to each other. Since the variance of the interaction term is "wrong", the standardized coefficient (assuming it is non-zero) is "wrong" as well.

The modsem package has implemented a correction, which I use here to highlight the differences.

library(lavaan)
library(modsem)

model <- '
  X =~ x1 + x2 + x3
  Z =~ z1 + z2 + z3
  Y =~ y1 + y2 + y3
  Y ~ X + Z + X:Z
'

fit <- sam(model, oneInt)
standardizedSolution(fit) # naive solution
#>    lhs op rhs est.std    se       z pvalue ci.lower ci.upper
#> 10   Y  ~   X   0.423 0.018  23.117  0.000    0.387    0.459
#> 11   Y  ~   Z   0.358 0.017  21.036  0.000    0.325    0.392
#> 12   Y  ~ X:Z   0.455 0.017  26.242  0.000    0.421    0.489
#> 25 X:Z ~~ X:Z   1.000 0.000      NA     NA    1.000    1.000
#> 27   X ~~ X:Z   0.017 0.041   0.401  0.689   -0.064    0.097
#> 28   Z ~~ X:Z   0.060 0.044   1.369  0.171   -0.026    0.147

standardized_estimates(fit, correction = TRUE, std.errors = "delta")
#>    lhs op rhs    est std.error z.value p.value ci.lower ci.upper
#> 1    Y  ~   X  0.423     0.018  23.076   0.000    0.387    0.459
#> 2    Y  ~   Z  0.358     0.017  20.915   0.000    0.325    0.392
#> 3    Y  ~ X:Z  0.453     0.017  26.571   0.000    0.419    0.486
#> 7  X:Z ~~ X:Z  1.012     0.049  20.459   0.000    0.915    1.109
#> 9    X ~~ X:Z  0.017     0.041   0.401   0.688   -0.065    0.098
#> 10   Z ~~ X:Z  0.061     0.045   1.359   0.174   -0.027    0.148

Here the difference is of course small, but for stronger correlations between X and Z, the difference would be more substantial.
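
To give a rough sense of scale, here is a sketch under an assumption of my own for illustration: $X$ and $Z$ are bivariate normal with zero means, unit variances and correlation $\rho$. The standard fourth-moment identity for normal variables then gives the variance of the product:

$$
\mathrm{Var}(XZ) = E[X^2Z^2] - E[XZ]^2 = (1 + 2\rho^2) - \rho^2 = 1 + \rho^2
$$

So $\rho = 0.1$ gives a variance of about 1.01, while $\rho = 0.5$ gives 1.25. Under this assumption a variance of exactly 1 requires $\rho = 0$.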

Naive Standardization of Mean Structure

A starker difference can be seen in how the mean structure of the model is treated. standardizedSolution() seems to disregard the mean structure of the model, transforming only the path coefficients and (co-)variances. This usually doesn't matter, as changing the mean structure doesn't affect the path coefficients and (co-)variances. That is not the case when we include an interaction term in the model, because the means of the interacting variables affect the simple main effects. While this might not be an error, I think it is highly unintuitive. It would be more intuitive if standardizedSolution() returned the path coefficients belonging to the centered solution.
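
To spell out why the means matter, here is a short derivation on the unstandardized structural equation. The symbols $b_0, \dots, b_3$, $\mu_X$, $\mu_Z$ and the centered variables $X_c = X - \mu_X$, $Z_c = Z - \mu_Z$ are my own notation for illustration, not lavaan's parameter names:

$$
\begin{aligned}
Y &= b_0 + b_1 X + b_2 Z + b_3 XZ \\
  &= (b_0 + b_1\mu_X + b_2\mu_Z + b_3\mu_X\mu_Z) + (b_1 + b_3\mu_Z)\,X_c + (b_2 + b_3\mu_X)\,Z_c + b_3\,X_c Z_c
\end{aligned}
$$

The unstandardized interaction coefficient $b_3$ is unchanged, but the simple main effects shift by $b_3\mu_Z$ and $b_3\mu_X$. They only coincide with the centered ones when both means are zero (or $b_3 = 0$). The standardized interaction coefficient can still change, since the variance of $XZ$ itself depends on the means.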

Here is the previous example again, now with the latent mean structure included in the model.

library(lavaan)
library(modsem)

model <- '
  X =~ x1 + x2 + x3
  Z =~ z1 + z2 + z3
  Y =~ y1 + y2 + y3
  Y ~ X + Z + X:Z

  X ~ 1
  Z ~ 1
  x1 ~ 0 * 1
  z1 ~ 0 * 1
'

fit <- sam(model, oneInt)
standardizedSolution(fit) # naive solution
#>    lhs op rhs est.std    se       z pvalue ci.lower ci.upper
#> 10   Y  ~   X  -0.031 0.023  -1.303  0.193   -0.077    0.015
#> 11   Y  ~   Z  -0.109 0.024  -4.501  0.000   -0.157   -0.062
#> 12   Y  ~ X:Z   0.866 0.030  29.320  0.000    0.808    0.924
#> 29 X:Z ~~ X:Z   1.000 0.000      NA     NA    1.000    1.000
#> 31   X ~~ X:Z   0.641 0.018  36.276  0.000    0.607    0.676
#> 32   Z ~~ X:Z   0.677 0.015  45.459  0.000    0.648    0.706

standardized_estimates(fit, correction = TRUE, std.errors = "delta")
#>    lhs op rhs   est std.error z.value p.value ci.lower ci.upper
#> 1    Y  ~   X 0.427     0.024  17.887       0    0.381    0.474
#> 2    Y  ~   Z 0.362     0.022  16.120       0    0.318    0.406
#> 3    Y  ~ X:Z 0.457     0.024  18.719       0    0.409    0.505
#> 8  X:Z ~~ X:Z 1.041     0.193   5.380       0    0.661    1.420
#> 9    X ~~ X:Z 0.000     0.066   0.000       1   -0.129    0.129
#> 10   Z ~~ X:Z 0.000     0.073   0.000       1   -0.143    0.143

Here we see that the results from standardizedSolution() differ wildly between the centered and non-centered solutions.

NOTE: The results from modsem::standardized_estimates() also deviate slightly from the centered solution. This is partly because modsem::standardized_estimates() replaces the (co-)variances associated with X:Z when centering the solution, assuming that X and Z are normally distributed ($\sigma(X,XZ)=\sigma(Z,XZ)=0$). This should not, however, affect the estimated path coefficient for Y~X:Z.
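
As a sketch of where the zero covariances come from (in my own notation, and assuming that $(X, Z)$ is bivariate normal with means $\mu_X, \mu_Z$, variances $\sigma_X^2, \sigma_Z^2$ and covariance $\sigma_{XZ}$): writing $XZ = X_c Z_c + \mu_Z X_c + \mu_X Z_c + \mu_X\mu_Z$ with $X_c = X - \mu_X$ and $Z_c = Z - \mu_Z$, and using that third central moments of a normal distribution vanish,

$$
\begin{aligned}
\sigma(X, XZ) &= E[X_c^2 Z_c] + \mu_Z\sigma_X^2 + \mu_X\sigma_{XZ} = \mu_Z\sigma_X^2 + \mu_X\sigma_{XZ} \\
\sigma(Z, XZ) &= E[X_c Z_c^2] + \mu_X\sigma_Z^2 + \mu_Z\sigma_{XZ} = \mu_X\sigma_Z^2 + \mu_Z\sigma_{XZ}
\end{aligned}
$$

Both are zero when $\mu_X = \mu_Z = 0$. With non-zero means they are generally not, which is consistent with the large X ~~ X:Z and Z ~~ X:Z values in the non-centered naive output.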

Feature Request

I would suggest that standardizedSolution() in the future does not constrain the variance of interaction terms to 1, and instead identifies the appropriate (co-)variance structure. Additionally, I would suggest that standardizedSolution() (at least by default) returns path coefficients corresponding to the mean-centered solution.

For the time being, it might be possible to add a warning, particularly if one of the variables used in the interaction term has a non-zero mean.
