Prevent to deep-copy nodes that have been stored. Fixes #241 #243
I believe that tests are not necessary since this feature is tested already when the various child classes of |
sphuber
left a comment
I think that we actually do need tests here. What you describe being tested is that things run, but not that nodes are not accidentally cloned, which is actually happening in this implementation. You are only checking the top-level namespace for nodes, but they could be nested deeply within. Those will still get deepcopied. So to do this properly, you need to make it recursive, and in a backtracking manner: go all the way down the namespaces to find where the nodes are, then deepcopy the dictionaries and the mutable values within them by hand, except for the nodes, which you keep as they are.
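The recursive approach described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation that ended up in this PR: `Node` is a hypothetical stand-in for the AiiDA node class, and `copy_nested_dict` is an invented helper name.

```python
import copy
from collections.abc import Mapping


class Node:
    """Hypothetical stand-in for the AiiDA ``Node`` class (assumption, for illustration only)."""


def copy_nested_dict(value):
    """Return a copy of ``value`` in which ``Node`` instances are kept by reference.

    Mappings are rebuilt recursively so that nodes nested at any depth are found.
    Every other value is deep-copied with ``copy.deepcopy``.
    """
    if isinstance(value, Node):
        return value  # never clone a node
    if isinstance(value, Mapping):
        return {key: copy_nested_dict(item) for key, item in value.items()}
    return copy.deepcopy(value)


node = Node()
inputs = {'structure': node, 'nested': {'deep': {'node': node, 'settings': [1, 2]}}}
copied = copy_nested_dict(inputs)
```

In this sketch `copied['nested']['deep']['node']` is the very same object as `node`, while the dictionaries and the `settings` list are fresh copies. Note that it only descends into mappings: a node stored inside a list or tuple would still be deep-copied.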
sphuber
left a comment
Thanks @bosonie . Implementation looks good. Not sure if we actually need to distinguish between stored and unstored nodes, but can't hurt for now. Anyway, they would be stored before the get_builder is called when going through the compound workflows. Just have some changes in naming and general cleaning up, but then should be good to go.
Co-authored-by: Sebastiaan Huber <mail@sphuber.net>
This is a valid point. In other words, it does not make sense to distinguish stored and unstored nodes since it can always be that the node is stored before going through the |
The problem of deepcopying the inputs of an input generator does not affect only the Data nodes. Other nodes such as `ProcessNode` should not be deepcopied either (in that case the code actually crashes when trying to deepcopy). So the deepcopy is now prevented for all `Node`s. Also eliminated the distinction between stored and unstored nodes. An implementation cannot know in advance whether the passed Data node is stored or not. Therefore no implementation should modify nodes at all, and there is no need to deepcopy nodes that are not stored.
@sphuber I have finished implementing your modifications. Please note that:
sphuber
left a comment
Thanks @bosonie, just a few remaining things
tests/generators/test_generator.py
Outdated
def test_get_builder_mutable_kwargs(generate_input_generator_cls):
    """
    Test that calling ``get_builder`` does mutate the ``kwargs`` when they are not nodes.
    In this case we want to deepcopy the ``kwargs`` before entering the ``get_builder``.
Who is "we" in this sentence? The person calling get_builder, or the get_builder implementation? I think you mean the latter, because it is get_builder that wants to perform a deepcopy, but then "before entering the get_builder" doesn't make sense.
    kwargs = {'mutable': {'test': 333}}
    generator.get_builder(**kwargs)
    assert kwargs['mutable'] == {'test': 333}
It took me a while to understand why this test was actually testing something. It relies on the _construct_builder implementation of the class that is returned by the generate_input_generator_cls factory. This is not at all clear, and the docstring doesn't say so either. For this purpose, I would really advise having the implementation of _construct_builder locally (either in this test file or, even better, in the test itself). You don't have to copy the entire class and implement it again; you can just monkeypatch the relevant method. I would suggest doing the following:
def test_get_builder_mutable_kwargs(generate_input_generator_cls, monkeypatch):

    def construct_builder(self, **kwargs):
        kwargs['mutable']['test'] = 'whatever'
        return self.process_class.get_builder()

    cls = generate_input_generator_cls(inputs_dict={'mutable': dict})
    monkeypatch.setattr(cls, '_construct_builder', construct_builder)
    # and then do the rest of the test.

This is a lot clearer and also less susceptible to accidentally breaking the test if the fixture generate_input_generator_cls removes the logic that changes the arbitrary keys in the kwargs, the purpose of which is not clear at all.
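For readers without the project's fixtures at hand, the property this test checks can be reproduced in a self-contained sketch. Everything here is an assumption made for illustration: `DummyGenerator` is a hypothetical class standing in for the real input generator, its `get_builder` is assumed to deep-copy the inputs before delegating, and the method is patched by plain assignment instead of pytest's `monkeypatch.setattr`.

```python
import copy


class DummyGenerator:
    """Hypothetical stand-in for an input generator (not the project's real class)."""

    def get_builder(self, **kwargs):
        # Assumed behaviour under test: copy the inputs before delegating.
        return self._construct_builder(**copy.deepcopy(kwargs))

    def _construct_builder(self, **kwargs):
        return kwargs


def construct_builder(self, **kwargs):
    """Replacement implementation that deliberately mutates its inputs."""
    kwargs['mutable']['test'] = 'whatever'
    return kwargs


def test_get_builder_mutable_kwargs():
    # In a pytest suite, ``monkeypatch.setattr`` would undo this patch automatically.
    DummyGenerator._construct_builder = construct_builder
    kwargs = {'mutable': {'test': 333}}
    result = DummyGenerator().get_builder(**kwargs)
    assert result['mutable'] == {'test': 'whatever'}  # the copy was mutated
    assert kwargs['mutable'] == {'test': 333}  # the caller's dict was not


test_get_builder_mutable_kwargs()
```

Keeping the mutating `construct_builder` next to the assertion makes it obvious what the test depends on, which is the point of the suggestion above.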
tests/conftest.py
Outdated
if 'mutable' in kwargs:
    kwargs['mutable']['test'] = 'whatever'
I would remove this. It is specific to one test, but it is really difficult to figure out which one. It should be implemented locally. See the other comment.
Just format of a comment Co-authored-by: Sebastiaan Huber <mail@sphuber.net>
@sphuber Thanks for the comments. I hope I clarified all the points.
Thanks @bosonie
Fixes #241