A quick append mode for already existing hdf5 files #146
Pull Request Test Coverage Report for Build 5952173941
💛 - Coveralls |
I fully support having the possibility to append data. Maybe this could also work nicely with the ideas of Area A that were discussed yesterday, i.e., allowing ELN data to be appended individually (e.g., from a Python dict)? In any case, the proposed changes LGTM. Overwriting old data seems a bit more tricky, since you wouldn't want to accidentally overwrite existing data (especially if the writing is automated, e.g. during export of experimental data). A y/n prompt may be a good compromise. |
I am strongly against adding an option to overwrite existing data, especially if there is no rock-solid mechanism in place that keeps hashes and provenance of each piece of information or grouping of pieces of information. |
I understand. But the problem is: if a user wants to replace existing data, do we block them from doing so and ignore the replacing keys for now? So no asking (y/n), just a warning? They could always go around our code and replace anything they like in the h5 files. So I feel that giving the user more power to outright replace what they like using our framework is better. |
I see the point. People will do it if they really want to. Maybe we could do something like the |
I'm on board with that. Do we make it work just like Nomad, where the user has to run the program again with this flag, or do we let the user type in "I am really sure!"? Note that I capitalized the "I" and added an exclamation mark there to make for a stronger approval. |
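The two consent mechanisms mentioned here (a command-line flag versus a typed phrase) could be sketched roughly as follows. The flag name `--overwrite` and the helper names are hypothetical and chosen only for illustration; the thread does not specify them.

```python
import argparse

# Exact phrase the user would have to type, including the capital "I"
# and the exclamation mark, as suggested in the comment above.
CONFIRM_PHRASE = "I am really sure!"


def overwrite_allowed(flag_given: bool, typed: str = "") -> bool:
    """Return True if consent was given via the flag or the exact typed phrase."""
    return flag_given or typed == CONFIRM_PHRASE


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a hypothetical --overwrite flag (name is an assumption)."""
    parser = argparse.ArgumentParser(description="Append data to an existing HDF5 file.")
    parser.add_argument(
        "--overwrite",
        action="store_true",
        help="allow replacing paths that already exist in the file",
    )
    return parser
```

With the flag variant the user has to re-run the command, which makes consent explicit and scriptable; the typed phrase only works in an interactive session.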
Even if we get strong consent from users, either by typing or via a flag, I would say it is still important to keep track of the old data at the same time, so that nobody can exploit this functionality with bad intentions. Appending new data for new fields, however, can be considered and would be nice to have. |
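One way to keep track of old data is to move the previous value to a backup location, tagged with a content hash, before writing the new one. The following is only a minimal sketch: a plain dict stands in for the HDF5 file, the `/previous_versions` location and the function name are hypothetical, and hashing `repr()` of the value is a simplification (a real implementation would hash the raw dataset bytes).

```python
import hashlib
from typing import Any, MutableMapping, Optional

BACKUP_ROOT = "/previous_versions"  # hypothetical backup location, not from the PR


def overwrite_with_backup(
    store: MutableMapping[str, Any], path: str, new_value: Any
) -> Optional[str]:
    """Write new_value at path; if path existed, keep the old value under a hashed backup path.

    Returns the backup path, or None if nothing had to be backed up.
    """
    if path not in store:
        store[path] = new_value
        return None
    old_value = store[path]
    # Simplification: hash the repr of the old value to get a short content tag.
    digest = hashlib.sha256(repr(old_value).encode("utf-8")).hexdigest()
    backup_path = f"{BACKUP_ROOT}/{digest[:12]}{path}"
    store[backup_path] = old_value
    store[path] = new_value
    return backup_path
```

Because the backup path includes the hash of the old content, overwriting the same path with different previous values does not clobber earlier backups.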
The issue is that we cannot prevent anyone from doing anything nefarious. It would be good to provide an option to keep old data. But in most cases, we will only append /nexus/paths. We will most likely never overwrite "raw" h5 data paths. Where I could see an overwrite conflict is with entries added by one of our readers. In that case, the overwrite will provide a more recent version of the /nexus/path in question, either bug-fixed or more feature-rich. For example, we gain some new plot functionality and a user wants to quickly update their NXS files to gain this new feature. I want to accommodate these users and not make it a hassle for them. We can, in the future, provide an option to preserve old data under certain NeXus concepts with pynxtool/reader versions, etc. |
I would find such an option very important, e.g. to add evaluation results to a file. What is the status on this? |
This is in line with what @mkuehbach brought up. I just cooked up a basic draft to share ideas on this.
One of the questions we had was what to do if we are adding new data to an existing hdf5 path in the file. We decided to offer a simple (y/n) prompt for the overwrite. But this can be discussed.
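The prompt-on-conflict behaviour could look roughly like the sketch below. It is not the code in this PR: the function names are made up for illustration, and the store is typed as a generic mutable mapping so the logic can be exercised without an HDF5 file.

```python
from typing import Any, Callable, List, Mapping, MutableMapping


def ask_yes_no(question: str) -> bool:
    """Ask on the terminal until the user answers y or n."""
    while True:
        answer = input(f"{question} (y/n): ").strip().lower()
        if answer in ("y", "n"):
            return answer == "y"


def append_entries(
    store: MutableMapping[str, Any],
    new_entries: Mapping[str, Any],
    confirm: Callable[[str], bool] = ask_yes_no,
) -> List[str]:
    """Write new_entries into store and return the paths that were skipped."""
    skipped = []
    for path, value in new_entries.items():
        if path in store:
            if not confirm(f"'{path}' already exists. Overwrite?"):
                skipped.append(path)
                continue
            del store[path]  # remove the old entry before writing the new one
        store[path] = value
    return skipped
```

Passing `confirm` as a parameter keeps the prompt replaceable, for example by a function that always returns `False` in automated exports. An `h5py.File` opened in append mode (`"a"`) supports the same `in`, `del`, and item-assignment operations, so it could presumably be passed as `store`, though that wiring is an assumption here.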
If there is anything else you would like to add, please feel free to work on this branch or leave comments.
@domna @RubelMozumder @sanbrock @lukaspie