RobustICA behaviour on non-convergence #1047
Comments
In testing out the method, my suspicion is that, if it doesn't converge with one starting seed, it often won't converge with another. The parameter that will likely make a difference is the number of components requested at initialization. I'm still trying to get a better handle on the optimal number of components to request, but most of the failures I've seen occur when I request too many, and the little variance spread across many components is just inconsistent. For the datasets you're testing, I wonder if you see a higher IQ score with fewer initial components requested. For your Qs:
RE point 3, it's possible that the kinds of augmentations that would be ideal from a TEDANA standpoint would both be applicable in other ICA contexts and be more appropriately / cleanly implemented within RobustICA. So it might be the case that the result of this thread is to bounce feature requests to RobustICA, in terms of e.g.:
I'm not an ICA person so can't really comment on precedents in these regards, but I might be able to give advice in terms of software implementation.
Thanks @Lestropie and @handwerkerd for your suggestions and comments. I am working on the PR and will incorporate your suggestions as much as possible.
I want to discuss a confound intrinsic to #1013, which, given its esoteric nature, I felt it better to write up as a separate Issue so as to not cross-contaminate discussions.
For any given dataset, it is possible for the RobustICA run to be "inadequate".
There are two levels to this:
Then, depending on internal logic, there are multiple potential reactions:
(obviously only applicable to the low IQ case, not the exception case)
In its current state at tip 979d026, the code in #1013 is doing the following:
I want to approach this question with a fresh set of eyes, as the current logic may not be the optimal choice for pushing code out to the public.
In part, there is a question about the evaluation that led to addition of attempting an alternative clustering algorithm if DBSCAN fails. It's possible that AgglomerativeClustering was only determined to be necessary for testing purposes on aggressively downsampled data, and that for public consumption it would be preferable to omit that extra logic. Hopefully @BahmanTahayori can provide further insight here.
The discussion I want to initiate here is however not specific to that one point.
If, for a fixed seed + clustering method + number of runs, the clustering is "inadequate", what should the TEDANA software do?
Should the RobustICA clustering method be exposed at the TEDANA command-line?
Upon clustering being deemed "inadequate", would it be better to, instead of changing the clustering algorithm, increase the number of runs?
It is also likely that you would want to specify a maximum number of runs to prevent the software from running indefinitely.
I.e., the result obtained using a fixed seed of 42 and 40 runs should be identical to the result obtained using a fixed seed of 42 when 30 runs were first generated, clustering failed, and an additional 10 runs were appended and clustering was re-run.
Whether or not this is possible will depend on the exposed interface of RobustICA; hopefully @BahmanTahayori can look into this and feed back.
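To make the "append runs, capped at a maximum" idea concrete, here is a minimal sketch in plain Python. It is not RobustICA's or TEDANA's API: `run_seeds`, `run_until_adequate`, `run_once` and `is_adequate` are all hypothetical names, and it assumes each ICA run can be seeded independently. The point is only that if per-run seeds are derived from the base seed in a fixed order, then 30 runs plus 10 appended runs use exactly the same seeds as 40 runs requested up front.

```python
import random
from typing import Callable, List, Tuple


def run_seeds(base_seed: int, n_runs: int) -> List[int]:
    """Derive one seed per run (hypothetical helper).

    The first k seeds are identical for any n_runs >= k, because they are
    drawn in order from a generator seeded only by base_seed.
    """
    rng = random.Random(base_seed)
    return [rng.randrange(2**32) for _ in range(n_runs)]


def run_until_adequate(
    base_seed: int,
    initial_runs: int,
    step: int,
    max_runs: int,
    run_once: Callable[[int], object],
    is_adequate: Callable[[List[object]], bool],
) -> Tuple[List[object], bool]:
    """Append runs in batches until the check passes or max_runs is hit.

    run_once stands in for a single seeded ICA run and is_adequate for the
    clustering quality check; both are assumptions of this sketch.
    """
    seeds = run_seeds(base_seed, max_runs)
    results: List[object] = []
    n_target = min(initial_runs, max_runs)
    while True:
        # Only the runs not yet computed are executed; earlier ones are reused.
        for seed in seeds[len(results):n_target]:
            results.append(run_once(seed))
        ok = is_adequate(results)
        if ok or n_target >= max_runs:
            return results, ok
        n_target = min(n_target + step, max_runs)
```

With this layout, the outcome depends only on the base seed and the final number of runs, not on how many batches it took to get there. Whether RobustICA can accept externally supplied per-run seeds or previously computed runs is exactly the open question above.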