Quick question: we're using `splatter` to simulate 3 vs. 3 samples from 2 conditions and 3 clusters. However, it seems that condition differences (`cde`) are introduced globally, i.e., they are the same for every `de` group. Is there any way to alter this, e.g., to have `cde` affect only one group?
I tried simulating the clusters independently with a fixed seed, which lets me control `cde`. However, this kills any `de` between clusters, so it won't do. Example PCs and the corresponding code are below.
Hope that makes sense, and happy to clarify if not. Cheers!!
library(splatter)
library(SingleCellExperiment)

seed <- 241122
t <- 1.5; s <- 2; b <- 0
vcf <- mockVCF(n.samples=6, seed=seed)
gff <- mockGFF(n.genes=800, seed=seed)
# simulate 'Group's = clusters in a loop
lys <- lapply(seq(3), \(.) {
    # keep 'cde.facLoc=s' = condition effect in the first
    # 'Group' = cluster only; set it to 0 in all others
    if (. != 1) s <- 0
    # same seed in every iteration, so all clusters share the same draws
    set.seed(seed)
    # simulate w/o DE effects
    p <- newSplatPopParams(
        # type
        #de.facLoc=t,
        #de.prob=0.1,
        #de.facScale=0.1*t,
        # state
        cde.prob=0.1,
        cde.facLoc=s,
        cde.facScale=0.1,
        bcv.common=1.5,
        batchCells=50,
        eqtl.ES.rate=30,
        similarity.scale=15,
        #group.prob=rep(1/3, 3), # cluster
        condition.prob=c(0.5, 0.5)) # condition
    # multiply overall 'lib.loc' by 1+'t' = cluster effect
    # (now clusters are always the same, though...?)
    # (and changing this would affect all genes...?)
    ll <- "lib.loc"
    attr(p, ll) <- attr(p, ll)*(1+t)
    sim <- splatPopSimulate(
        params=p, vcf=vcf, gff=gff,
        verbose=FALSE, sparsify=FALSE)
    sim$Group <- letters[.]
    return(sim)
})
# drop gene metadata so the per-cluster objects can be combined
tmp <- lapply(lys, \(.) {
    rowData(.) <- NULL
    .
})
x <- do.call(cbind, tmp)
colData(x) <- DataFrame(
    cluster_id=x$Group,
    sample_id=x$Sample,
    group_id=x$Condition)
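To check whether the condition effect really ends up in one cluster only, a rough diagnostic is to compare per-cluster log-fold changes between the two conditions. The sketch below is base R on a toy Poisson matrix rather than on the simulation itself. `cluster_lfc()` is a hypothetical helper, not part of `splatter`, and it assumes exactly two conditions and roughly comparable library sizes, since no normalisation is done.

```r
# Hypothetical helper (not part of splatter): per-cluster log2 fold change
# between two conditions, based on mean raw counts plus a pseudocount.
cluster_lfc <- function(counts, cluster_id, group_id, pseudo = 1) {
    stopifnot(ncol(counts) == length(cluster_id),
              ncol(counts) == length(group_id))
    group_id <- as.character(group_id)
    groups <- sort(unique(group_id))
    stopifnot(length(groups) == 2)
    # one column of log2 fold changes per cluster
    sapply(split(seq_len(ncol(counts)), cluster_id), function(cells) {
        g <- group_id[cells]
        m1 <- rowMeans(counts[, cells[g == groups[1]], drop = FALSE])
        m2 <- rowMeans(counts[, cells[g == groups[2]], drop = FALSE])
        log2((m2 + pseudo) / (m1 + pseudo))
    })
}

# Toy data: 200 genes, 3 clusters of 100 cells, conditions alternating
set.seed(1)
n_genes <- 200; n_cells <- 300
cluster_id <- rep(c("a", "b", "c"), each = 100)
group_id <- rep(c("Condition1", "Condition2"), times = 150)
mu <- matrix(10, n_genes, n_cells)
# inject a 4-fold condition effect into the first 20 genes of cluster "a" only
hit <- cluster_id == "a" & group_id == "Condition2"
mu[1:20, hit] <- 40
counts <- matrix(rpois(length(mu), mu), n_genes, n_cells)

lfc <- cluster_lfc(counts, cluster_id, group_id)
# number of genes per cluster with |log2FC| > 1
n_hits <- colSums(abs(lfc) > 1)
print(n_hits)
```

On the combined object from the code above, the equivalent call would presumably be `cluster_lfc(counts(x), x$cluster_id, x$group_id)`, assuming the counts are stored in the `counts` assay as a dense matrix.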