Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add extractor methods #8

Open
6 tasks
s3alfisc opened this issue Aug 22, 2022 · 2 comments
Open
6 tasks

Add extractor methods #8

s3alfisc opened this issue Aug 22, 2022 · 2 comments
Labels
enhancement New feature or request
Milestone

Comments

@s3alfisc
Copy link
Owner

For

  • - leverage
  • - partial leverage
  • - p-values
  • - t-statistics
  • - confidence intervals
  • - jackknife beta's
@s3alfisc s3alfisc added this to the 0.7 milestone Feb 25, 2023
@s3alfisc s3alfisc added the enhancement New feature or request label Feb 25, 2023
@SergeyVAlexeev
Copy link

Title: Request for Cluster-by-Cluster Information Similar to Stata's Default Output

Body:

I'm currently working on a paper recommending the use of summclust for cluster analysis in clinical trials. While using your package, I've noticed a discrepancy between the cluster-by-cluster information output in R versus Stata. I would greatly appreciate assistance in aligning the R output with Stata's default presentation.

Current Situation

Using the following R code with summclust on scrambled_cherish.csv:

# Set a wider display
options(width = 150)

# Read the data
data <- read.csv("scrambled_cherish.csv")

# Create dummy variables
data$treat <- ifelse(data$intervention_factor == "Intervention", 1,
                     ifelse(data$intervention_factor == "Control", 0, NA))
data$elect <- as.numeric(data$adm_cat == "Elective")
data$female <- as.numeric(data$gender == "Female")
data$cog_dec <- ifelse(data$cognitive_impairment_admission == "1", 1,
                       ifelse(data$cognitive_impairment_admission == "0", 0, NA))
data$fun_dec <- ifelse(data$hospital_associated_functional_decline == "1", 1,
                       ifelse(data$hospital_associated_functional_decline == "0", 0, NA))

# Create factor variables
data$hospital <- as.factor(data$hospital)
data$ward <- as.factor(data$ward)

# Load necessary libraries
library(sandwich)
library(summclust)

# Specify the model formula
mf <- formula(
  delirium_new ~ treat + 
  age + 
  female + 
  cognitive_impairment_admission + 
  hospital_associated_functional_decline +
  cci_unadjusted_for_age +
  elect +
  factor(hospital)
)

# Testing clusters
fit <- lm(mf, data = data)
summ <- summclust(fit, params = ~ treat, cluster = ~ ward)
summary(summ)

# Create cluster by cluster information
cluster_by_cluster <- data.frame(
  Cluster = seq_along(summ$N_G),
  Ng = as.vector(summ$N_G),
  Leverage = summ$leverage_g,
  Partial_L = summ$partial_leverage[1,],
  Beta_no_g = summ$beta_jack[1,]
)
print(cluster_by_cluster, row.names = FALSE)

I get this output:

 Cluster Ng  Leverage  Partial_L    Beta_no_g
       1 65 1.5615371 0.13941055 -0.030189263
       2 37 1.2127338 0.10579421 -0.040323642
       3 38 1.0415643 0.09938913  0.021841058
       4 39 1.8487431 0.16080947  0.025654521
       5 59 1.0121208 0.09115682 -0.001107840
       6 65 0.9095753 0.09388891 -0.004918237
       7 44 1.7757687 0.16309955  0.005450797
       8 60 1.6379568 0.14645136  0.008303581

Desired Output

When running a similar analysis in Stata, I get:

  ward_cat |       Ng      Leverage     Partial L.  beta no g    
-----------+-----------------------------------------------------
         1 |       65      1.775769       0.136151   0.005451    
         2 |       37      1.012121       0.070870  -0.001108    
         3 |       38      0.909575       0.083975  -0.004918    
         4 |       39      1.041564       0.114481   0.021841    
         5 |       59      1.561537       0.144157  -0.030189    
         6 |       65      1.848743       0.178644   0.025655    
         7 |       44      1.212734       0.124528  -0.040324    
         8 |       60      1.637957       0.147194   0.008304    

Issue

The values are correct, but the order differs. For example, cluster 1's leverage in Stata is 1.775769, while in R it's 1.5615371 (which corresponds to cluster 5 in Stata).

Request

Could you please provide guidance or adjust the summclust package to allow extraction of cluster-by-cluster information in an order consistent with Stata's output? This would greatly enhance the package's utility for researchers transitioning between R and Stata or comparing results across platforms.

Thank you for your attention to this matter. Your assistance will significantly contribute to the adoption of summclust in clinical trial analysis.

Stata code for reference

// Saving output
log using Mudge_output.log, replace text

// Retrieving and preparing data
import delimited "scrambled_cherish.csv"

// Create the dummy variable for treated 
gen treat = .
replace treat = 1 if intervention_factor == "Intervention"
replace treat = 0 if intervention_factor == "Control"

// Create the dummy variable for elective admissions 
gen elect = (adm_cat == "Elective")

// Female dummy
gen female = (gender == "Female")

// Cognitive decline dummy
gen cog_dec = .
replace cog_dec = 1 if cognitive_impairment_admission == "1"
replace cog_dec = 0 if cognitive_impairment_admission == "0"

// Functional decline dummy
gen fun_dec = .
replace fun_dec = 1 if hospital_associated_functional_d == "1"
replace fun_dec = 0 if hospital_associated_functional_d == "0"

// Hospital categories
encode hospital, generate(hospital_cat)

// Ward categories
encode ward, generate(ward_cat)

// Testing clusters
summclust ///
delirium_new /// primary outcome 
treat /// treatment indicator
age ///
female ///
cog_dec ///
fun_dec ///
cci_unadjusted_for_age ///
elect ///
, fevar(hospital_cat) ///
cluster(ward_cat) ///
table addmeans jack rho(0.5) regtable // choice of tests

@s3alfisc
Copy link
Owner Author

s3alfisc commented Sep 6, 2024

Hi @SergeyVAlexeev, thanks for raising this - I'm travelling at the moment so haven't really had time to look into it; I'll do so once I'm back home by the end of the week!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants