Currently we have support for the following (arbitrary) structures:
- Crossed: `~ J + K + M + ...`
- Nested: `~ J/K/M/...`

It would be nice to have support for more complex arbitrary designs, such as `~ J + K/M`.
The basic algorithm would be:
For each outcome:
- Compute the group means for each crossed structure separately (e.g., `J` and `K/M`) according to the currently implemented algorithm.
- For each row in the input data, compute the sum of the group means.
- Finally, subtract the sum from the outcome vector to obtain the cluster-mean-centered (level 1) values (the issue raised in "Rethinking `degroup()` for cross-classified data" #637 still applies).
- Return the group means and the cluster-mean-centered values.
This is already possible using multiple calls to `demean()`:
set.seed(111)
data(iris)
iris$ID <- sample(1:4, nrow(iris), replace = TRUE) # fake-ID
iris$binary <- as.factor(rbinom(150, 1, .35)) # binary variable
# (Step 1)
x1 <- datawizard::demean(
  iris,
  select = c("Sepal.Length"),
  by = "ID",
  append = FALSE
)
x2 <- datawizard::demean(
  iris,
  select = c("Sepal.Length"),
  by = "Species/binary",
  append = FALSE
)
group_means <- data.frame(Sepal.Length_ID_between = x1$Sepal.Length_between, x2[1:2])
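# (Step 2)
# Note: the algorithm above calls for the row-wise sum (rowSums());
# the output shown below was produced with rowMeans(), as written here.
summ <- rowMeans(group_means)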
# (Step 3)
group_means$Sepal.Length_within <- iris$Sepal.Length - summ
# (Step 4)
head(group_means)
#> Sepal.Length_ID_between Sepal.Length_Species_between Sepal.Length_binary_between Sepal.Length_within
#> 1 5.878788 5.006 0.009789474 1.4684742
#> 2 5.843243 5.006 -0.006000000 1.2855856
#> 3 5.812821 5.006 -0.006000000 1.0957265
#> 4 5.843243 5.006 -0.006000000 0.9855856
#> 5 5.843243 5.006 -0.006000000 1.3855856
#> 6 5.843902 5.006 0.009789474 1.7801027
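
For discussion, the four steps could also be wrapped in a single helper. The following is only a minimal base-R sketch under stated assumptions, not datawizard's implementation: `demean_mixed()` and its arguments `crossed` and `nested` are hypothetical names, both arguments are assumed to be named lists of grouping vectors, and `nested` is assumed to be ordered from the outermost to the innermost level. It follows the steps literally, so it uses the row-wise sum of the group means, and the concern from #637 applies to it unchanged.

```r
# Hypothetical helper (not part of datawizard): base-R sketch of steps 1-4.
demean_mixed <- function(y, crossed = list(), nested = list()) {
  # Step 1a: group means for every crossed term, one column per term
  means <- lapply(crossed, function(g) ave(y, g))

  # Step 1b: nested terms, ordered outer -> inner. The first level keeps its
  # raw group mean; each deeper level is stored as the deviation of its cell
  # mean from the cell mean of the level above.
  groups <- list()
  previous <- 0
  for (nm in names(nested)) {
    groups[[nm]] <- nested[[nm]]
    cell_mean <- ave(y, interaction(groups, drop = TRUE))
    means[[nm]] <- cell_mean - previous
    previous <- cell_mean
  }

  out <- as.data.frame(means)
  names(out) <- paste0(names(out), "_between")

  # Steps 2 and 3: sum the group means per row, subtract from the outcome
  out$within <- y - rowSums(out)

  # Step 4: return group means and the centered values
  out
}

# Usage with the same fake data as above
set.seed(111)
data(iris)
iris$ID <- sample(1:4, nrow(iris), replace = TRUE)
iris$binary <- as.factor(rbinom(150, 1, 0.35))

res <- demean_mixed(
  iris$Sepal.Length,
  crossed = list(ID = iris$ID),
  nested = list(Species = iris$Species, binary = iris$binary)
)
head(res)
```

Taking plain vectors instead of a formula keeps the sketch free of parsing logic; a real implementation would presumably derive the crossed and nested terms from the `by` argument instead.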