Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protect against corrupted channels in legacy openephys. #1436

Merged
merged 2 commits into from
May 3, 2024

Conversation

samuelgarcia
Copy link
Contributor

@weiglszonja : Here this raise an error when a file is corrupted (bad timestamp)

@zm711
Copy link
Contributor

zm711 commented Mar 19, 2024

So I don't know openephys very well at all, but I'm wondering if there would ever be a case where the timestamps could be corrupted but the other channels are okay (ie warn the user of the corruption but then try to salvage the data and let the end-user decide whether to continue using that dataset). But maybe timestamp corruption always indicates that the whole file is messed up and shouldn't be used. I just don't know personally.

@samuelgarcia
Copy link
Contributor Author

@zm711 : I get a data example from @weiglszonja where only on channelf ile is corrupted and prevent the entire opening. Other channels are OK (but with gaps).

Skipping the channel would be tedious in the code because it is a mix of index and key.
So the raise erreur indicate a clear solution for the user I think no ?

@zm711
Copy link
Contributor

zm711 commented Mar 19, 2024

In this issue #1438, someone is asking to ignore corrupted channel and just use samples. If it's too annoying to work around I think that's fine, but it seems like at least some users would like to still use datasets even if some channels are corrupt. I'm fine with just raising the error though. Just wanted to keep it on your radar.

@samuelgarcia
Copy link
Contributor Author

I recently add zeros padding when gaps are detected. So we cannot ignore this otherwise events could be not aligned with traces. But if one channel have bad timstamps that are not monotically increasing then we cannot do anything. Ignoring the gaps like it used to be the case for a while is not good I think.

Here the PR is for the legacy openephys format.
The issue you are mentionning for the "new" binary openephys format. No ?

@zm711
Copy link
Contributor

zm711 commented Mar 19, 2024

Yes it is for the binary/new format. My point is just that some people regardless of the corruption status of their data may want to try to salvage something. That doesn't mean we have to support it, but I think it does come up.

@weiglszonja
Copy link
Contributor

My point is just that some people regardless of the corruption status of their data may want to try to salvage something. That doesn't mean we have to support it, but I think it does come up.

I think what @samuelgarcia did here still provides a way for the user to salvage the dataset, they would have to move out the corrupted file from the folder and rerun. But maybe I misunderstood your comment here.

@zm711
Copy link
Contributor

zm711 commented Mar 21, 2024

Hi @weiglszonja!

If that's all it takes then shouldn't that be in the error message? Again, I don't use openephys so I might be saying nonsensical things, but if we can still parse the files by removing the corrupted files shouldn't we just warn the user and do that? Or let them know that that is a possibility? Or we say that as a library that we don't do that because if part of the file/folder of files is corrupted we don't want to potentially return other data being corrupted.

@alejoe91 alejoe91 added this to the 0.13.1 milestone Apr 5, 2024
@zm711 zm711 merged commit 84b672c into NeuralEnsemble:master May 3, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants