-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DoF computing mistake in function verifyHomogeneity #142
Comments
May I open a new ticket, but it seems there is a bug in the calculation of "verifyHomogeneity" test. I made some simulations in order to evaluate the distribution of the test-statistic under null hypothesis. Below, I will find R code which compute by simulation the empirical distribution of the statistics given by verifyHomogeneity test, and a log-likelihood ratio.
|
Hi @drobelin first of all I wish to thank you for the issue. I am quite
busy at the moment, but I will check if possible. A possible quick solution
is for you to suggest a solution and to post a revised version of the
function either here or by github pull request
Il gio 23 nov 2017, 10:14 drobelin <[email protected]> ha scritto:
… May I open a new ticket, but it seems there is a bug in the calculation of
"verifyHomogeneity" test. I made some simulations in order to evaluate the
distribution of the test-statistic under null hypothesis. Below, I will
find R code which compute by simulation the empirical distribution of the
statistics given by verifyHomogeneity test, and a log-likelihood ratio.
I think that the formula p.35 of
"an_introduction_to_markovchain_package.pdf" needs a correction too.
require(markovchain)
nbsimu = 10000 # Nb of simulation
statesNames=c("A","B") # 2 states : A , B
n=c(100,200) # Respective length of both simulated markov chains
transitionMatrix1 = matrix(c(0.7,0.3,0.1,0.9),byrow=TRUE, nrow=2, dimnames=list(statesNames,statesNames))
transitionMatrix2 = matrix(c(0.7,0.3,0.1,0.9),byrow=TRUE, nrow=2, dimnames=list(statesNames,statesNames))
res = matrix(nrow = nbsimu, ncol = 5) #To store results of each iteration
for(i in 1:nbsimu) {
M1 = new("markovchain", transitionMatrix=transitionMatrix1)
M2 = new("markovchain", transitionMatrix=transitionMatrix2)
m1 = markovchainSequence(n = n[1], markovchain = M1)
m2 = markovchainSequence(n = n[2], markovchain = M2)
m = c(m1,NA,m2) #useful to compute model under null hypothesis
m1.cpt = unclass(table(m1,c(m1[-1],NA))) #class "matrix"
m2.cpt = unclass(table(m2,c(m2[-1],NA))) #class "matrix"
m1f = markovchainFit(m1,confint=FALSE)
m2f = markovchainFit(m2,confint=FALSE)
mf = markovchainFit(m)
vh = verifyHomogeneity(list(m1.cpt,m2.cpt), verbose = FALSE)
res[i,] = c(m1f$logLikelihood, m2f$logLikelihood, mf$logLikelihood, vh$statistic, vh$pvalue)
}
res = as.data.frame(res)
names(res) <- c("L1", "L2", "L","vhStat","vhPvalue")
# Correcting DoF following above comment (2 instead of 3)
res$vhPvalueCorr <- pchisq(res$vhStat, df = 2, lower.tail = FALSE)
#Log-likelihood ratio statistic
res$F = 2*((res$L1+res$L2)-res$L)
res$pvalue = pchisq(res$F,df = 2,lower.tail = FALSE)
# Under null hypothesis, p-value should follow a uniform(0,1) distribution
plot(hist(res$vhPvalue))
plot(hist(res$vhPvalueCorr))
plot(hist(res$pvalue))
plot(density(res$F))
plot(density(res$vhStat))
ks.test(res$F,"pchisq",df=2)
ks.test(res$vhStat,"pchisq",df=3)
ks.test(res$vhStat,"pchisq",df=3)
mean(res$F) ; var(res$F)
mean(res$vhStat) ; var(res$vhStat)
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#142 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAyWxrQXNtlf7Rr7TmzgFUhKLrK1j_dmks5s5Td7gaJpZM4Qb73Y>
.
|
So far I have seen, the degrees of freedom are calculated folowing the formula written in the "Kullbak et al 1962" publication (see table 9.1). However, I do not have enough knowlegde to judge if this is the proper way of calculating the dof´s. |
Hi,
Seems there is a mistake in the calculation of the number of degree of freedom of the "verifyHomogeneity" test.
Running the example shows 35 DoF instead of 30 following the documentation ("an_introduction_to_markovchain_package.pdf, p.35), as the considered Markov chains have 6 states.
The text was updated successfully, but these errors were encountered: