AAP-17690 Inventory variables sourced from git project are not getting deleted after being removed from source #15928
base: devel
Conversation
    on_delete=models.CASCADE,
)
group_name = models.CharField(max_length=256)
variables = models.JSONField()  # The group variables, including their history.
I'd like to ask that someone, like @chrismeyersfsu, weigh in on the choice of field type here, JSONField. There is some baggage with this, and I see json.dumps being used to save to this field.
For now I removed the superfluous json.dumps and json.loads in front of the JSONField. Thanks for the heads-up, I totally didn't see that.
If @chrismeyersfsu has concerns regarding the performance impact of a JSONField here, I could switch to a TextField with explicit serialization through json.dumps and json.loads.
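For reference, a minimal sketch of the two options being discussed; the model and field names are illustrative and not the ones in this PR:

```python
# Illustrative only; not the actual model from this PR.
import json

from django.db import models


class GroupVarsJSON(models.Model):
    # Option 1: JSONField. Django handles (de)serialization, so no
    # json.dumps()/json.loads() is needed when reading or writing.
    variables = models.JSONField(default=dict)


class GroupVarsText(models.Model):
    # Option 2: TextField with explicit serialization, the proposed
    # fallback if JSONField raises performance concerns.
    variables_raw = models.TextField(default="{}")

    @property
    def variables(self):
        return json.loads(self.variables_raw)

    @variables.setter
    def variables(self, value):
        self.variables_raw = json.dumps(value)
```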
awx/main/utils/inventory_vars.py (Outdated)
        for name, inv_var in self._vars.items():
            self[name] = inv_var.value

    def from_dict(self, state: dict[str, update_queue]) -> "InventoryGroupVariables":
It's common to have a from_* method be a @classmethod, and I do kind of feel like that would be more clear here. Basically it would be better to always instantiate using a from_* method, avoiding ever having an object that's in a state where it can't be used yet.
Also, if I resolve the type hint, it gives
state: dict[str, list[tuple[int, TypeAlias = str | int]]]
Are you sure this is what you meant? Seems like a lot of nesting.
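A minimal sketch of the pattern being suggested, assuming a hypothetical shape for the class; the names and the _load helper are stand-ins, not the PR's actual code:

```python
# Hypothetical sketch of the @classmethod pattern suggested above.
class InventoryGroupVariables(dict):
    @classmethod
    def from_dict(cls, state: dict) -> "InventoryGroupVariables":
        """Return a fully usable instance built from a serialized state."""
        obj = cls()
        obj._load(state)  # hypothetical helper that replays the serialized state
        return obj

    def _load(self, state: dict) -> None:
        for name, history in state.items():
            if history:
                self[name] = history[-1][1]  # surface the most recent value
```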
Regarding the type, well, it is a lot of nesting indeed. But on the other hand it's the natural structure of the group variables state:
group_vars_state = {<var_name>: <var_history>, ...}
var_history = [(<src_id>, <value>), (<src_id>, <value>), ...]
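Concretely, a state where variable x was first set by source 1 and later overridden by source 2 would look like this (an illustrative example using the structure above):

```python
# Example state for one group, following the structure above.
group_vars_state = {
    "x": [(1, "value_from_source_1"), (2, "value_from_source_2")],
}
```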
> It's common to have a from_* method be a @classmethod, and I do kind of feel like that would be more clear here. Basically it would be better to always instantiate using a from_* method, avoiding ever having an object that's in a state where it can't be used yet.

So I probably chose the wrong name for the from_dict method. It is not meant to initialize the object into a usable state; rather, it is a loader that moves an existing instance into a particular state deserialized from a dict. I understand your concern that the method doesn't follow the usual semantics of a from_* method. I would propose changing the method name to better reflect its actual purpose.
I renamed .to_dict and .from_dict to .save_state and .load_state respectively, to better reflect their purpose. Together with the reasoning from my comment above, would that suffice to mitigate your concerns in this conversation, @AlanCoding?
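For clarity, a sketch of how the renamed methods differ from a classmethod constructor: load_state mutates an existing instance rather than building a new one, and save_state returns a serializable snapshot. The body below is illustrative only, not the actual implementation.

```python
# Illustrative sketch of the renamed methods, not the PR's implementation.
class InventoryGroupVariables(dict):
    def __init__(self):
        super().__init__()
        self._vars = {}  # hypothetical internal store: name -> history list

    def load_state(self, state: dict) -> None:
        """Restore a previously serialized state into this instance."""
        self._vars = dict(state)
        for name, history in state.items():
            if history:
                self[name] = history[-1][1]  # surface the most recent value

    def save_state(self) -> dict:
        """Return a serializable snapshot of the current state."""
        return dict(self._vars)
```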
I had a quick look at the tests, and there's a clear theme: API PATCH requests are getting a 400 when editing a simple inventory.
This almost certainly comes from the variables parsing within the serializer. That should be a pretty good clue as to what's going on. Even full integration tests are hitting this, so it's not something weird done directly by a test.
Codecov Report. Attention: Patch coverage is …
✅ All tests successful. No failed tests found.
Open topics for this PR: …
Proposal for product documentation update is available in this comment in the Jira ticket.
Can you walk me through the intended data migration path? Why is there not a data migration? Presumably this leaves the new field blank and the old variables in place. Will those all then be treated like user-entered variables? If "yes", then what I'm not understanding is why the serializer would handle user-entered variables as …
Without yet having your answer to my question here, I'm skeptical that it's a good idea for the data model to look different in these two scenarios.
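To make the question concrete, a backfill migration for existing data might look roughly like the sketch below. Everything in it is an assumption drawn from this discussion: the model, app, and FK names, that pre-existing group variables are YAML/JSON text, and the convention of source id -1 for user-entered values.

```python
# Hypothetical data migration sketch; model, app, and field names are assumptions.
from django.db import migrations
import yaml


def backfill_group_variable_history(apps, schema_editor):
    Group = apps.get_model("main", "Group")
    InventoryGroupVariables = apps.get_model("main", "InventoryGroupVariables")
    for group in Group.objects.iterator():
        existing = yaml.safe_load(group.variables or "{}") or {}
        # Treat pre-existing values as user-entered, i.e. source id -1.
        state = {name: [[-1, value]] for name, value in existing.items()}
        InventoryGroupVariables.objects.create(
            inventory=group.inventory,  # assumed FK target; the diff above truncates it
            group_name=group.name,
            variables=state,
        )


class Migration(migrations.Migration):
    dependencies = [("main", "xxxx_previous_migration")]  # placeholder
    operations = [
        migrations.RunPython(backfill_group_variable_history, migrations.RunPython.noop),
    ]
```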
I tested a few things: …
with open(path, "w") as fp:
    fp.write("[all:vars]\n")
    fp.write("b=value_b\n")
subprocess.run('git add .; git commit -m "Update variables"', cwd=repo_path, shell=True)
To say it out loud: it is my intent that running git commands in a subprocess is in scope for the awx/main/tests/live tests. For other test suites, something like this would be a blocker, which is why I want to say it. In this space, it's okay to do this stuff.
This was apparent as a use case from the get-go for issues like #13845.
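To spell that out, a condensed sketch of the pattern used in this diff; the helper name is illustrative, and this style is only appropriate in the awx/main/tests/live suite:

```python
# Condensed sketch of the pattern in this test; helper name is illustrative.
import subprocess


def commit_inventory_file(repo_path, filename, content):
    """Overwrite an inventory file in the test repo and commit the change."""
    with open(f"{repo_path}/{filename}", "w") as fp:
        fp.write(content)
    subprocess.run(
        'git add .; git commit -m "Update variables"',
        cwd=repo_path,
        shell=True,
        check=True,
    )


# Second revision of the source: variable "a" is gone, only "b" remains.
# commit_inventory_file(repo_path, "inventory_var_deleted_in_source.ini", "[all:vars]\nb=value_b\n")
```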
path = f"{repo_path}/inventory_var_deleted_in_source.ini" | ||
with open(path, "w") as fp: | ||
fp.write("[all:vars]\n") | ||
fp.write("b=value_b\n") |
Interesting method. I don't mean to criticize how you solved this problem, but I do want to share how I approached the same problem earlier.
https://github.com/ansible/test-playbooks/blob/main/inventories/changes.py
By using an inventory script, it's possible to put in dynamic values for either keys or values. In your case, a randomized key will result in the "old" key disappearing in a subsequent update. This can allow doing the same full-integration test without needing to run git commands.
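For illustration, a dynamic inventory script along those lines might look like this; it is a sketch modeled on the idea behind the linked changes.py, not a copy of it. Every run emits a differently named variable, so the previous run's key disappears from the source on the next update.

```python
#!/usr/bin/env python
# Hypothetical dynamic inventory script with a per-run randomized variable name.
import json
import uuid

inventory = {
    "all": {
        "hosts": ["localhost"],
        "vars": {
            # A new key on every invocation, so the key from the previous
            # inventory update no longer exists in the source on the next one.
            f"changing_var_{uuid.uuid4().hex[:8]}": "value",
        },
    },
    "_meta": {"hostvars": {"localhost": {}}},
}

print(json.dumps(inventory))
```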
I'll look into that!
I wasn't aware that I could use script-generated inventory files here. Interesting approach, but I guess we cannot preserve state between subsequent calls to such a script. The challenge would be knowing what to assert in the test function when we use, e.g., random or timestamp-based variable names.
For the time being I would like to keep the test method simple, because it does verify the issue the PR is supposed to resolve, and I do not want to delay the merge of this PR longer than necessary.
        :return: None
        """
        self.name = name
        self._update_queue: update_queue = []
I just got around to really digging into this. Here's what I don't follow: why is this [] as opposed to a {}? To understand the question, see a value:
{'foo': [[-1, 'bar']]}
The data model makes possible {'foo': [[-1, 'bar'], [-1, 'bar2']]}, which is an invalid state, if I understand correctly.
If a variable will only ever have one value per source, then indexing based on the source seems like the best way to structure this.
I think that maybe ordering matters? You probably thought more about this than I have, about the rules for which value surfaces depending on the ordering of actions.
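To make the trade-off explicit, a small illustration of the two shapes (not the PR's code):

```python
# Illustrative comparison of the two shapes, not the PR's implementation.

# List of (source_id, value) pairs: nothing in the structure prevents two
# entries for the same source, e.g. the arguably invalid state below.
history_as_list = {"foo": [[-1, "bar"], [-1, "bar2"]]}

# Dict keyed by source id: at most one value per source by construction,
# but the ordering of updates is no longer encoded in the structure itself
# and would have to be tracked some other way.
history_as_dict = {"foo": {-1: "bar2"}}
```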
I checked out the branch and had a go at writing tests with mock API requests. Either way, I would like to get some test content in along this general structure. This is failing right now, and I want your help to sort out what's going on. Abbreviated human-readable steps: …
SUMMARY
Fixes AAP-17690 Inventory variables sourced from git project are not getting deleted after being removed from source
The source history for each variable is now preserved, allowing updates from a source to modify the variables as a user would expect.
Let A and B be two inventory sources for inventory INV.
A={x:1}, B={x:2} -> sync A -> INV={x:1} -> sync B -> INV={x:2} -> B={} -> sync B -> INV={x:1}
One can see that deleting variable x from source B does not delete x altogether; it makes the value from the previous update from source A reappear.
You may think of the source updates as creating overlays on a variable, each covering the previous values. Deleting a variable from a source and updating from that source again surfaces the value from the next layer down, i.e. the previous update.
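A minimal, self-contained sketch of this overlay behavior (illustrative only, not the PR's actual code): each variable keeps a stack of (source_id, value) layers, a sync replaces or removes that source's layer, and the effective value is the top remaining layer.

```python
# Minimal sketch of the overlay semantics described above (illustrative only).
class VariableOverlay:
    def __init__(self):
        self._history = {}  # var name -> list of (source_id, value) layers

    def sync(self, source_id, variables):
        """Apply an update from one source: add or refresh its layer for the
        variables it defines, drop its layer for variables it no longer has."""
        for name, value in variables.items():
            layers = self._history.setdefault(name, [])
            layers[:] = [(s, v) for s, v in layers if s != source_id]
            layers.append((source_id, value))
        for name, layers in list(self._history.items()):
            if name not in variables:
                layers[:] = [(s, v) for s, v in layers if s != source_id]
                if not layers:
                    del self._history[name]

    def effective(self):
        """The value each variable currently surfaces: the top remaining layer."""
        return {name: layers[-1][1] for name, layers in self._history.items()}


# Reproduces the A/B example from the summary:
inv = VariableOverlay()
inv.sync("A", {"x": 1})   # INV == {'x': 1}
inv.sync("B", {"x": 2})   # INV == {'x': 2}
inv.sync("B", {})         # B no longer defines x -> INV == {'x': 1}
assert inv.effective() == {"x": 1}
```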
If an inventory source has overwrite_vars=True set, an update from this source will prune the history of all variables of the group and keep only this update as the new history.
ISSUE TYPE
COMPONENT NAME
AWX VERSION
ADDITIONAL INFORMATION
To reproduce the issue, do the following:
src1.ini: …
src2.ini: …
The inventory now shows x: 2 in the Variables field.
src2.ini: …
Now the issue can be observed: since variable x is no longer defined in source B, the inventory should either revert x to the value before the update from B or remove it altogether, but it still shows x: 2!
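For illustration, a minimal pair of source files matching the A={x:1}, B={x:2} example from the summary could be produced like this; the contents are hypothetical and the original report's exact files are not shown here.

```python
# Hypothetical minimal source files for the A={x:1}, B={x:2} reproduction.
SRC1_INI = "[all:vars]\nx=1\n"      # source A
SRC2_INI = "[all:vars]\nx=2\n"      # source B, first revision
SRC2_INI_NO_X = "[all:vars]\n"      # source B after deleting x

with open("src1.ini", "w") as fp:
    fp.write(SRC1_INI)
with open("src2.ini", "w") as fp:
    fp.write(SRC2_INI)
# ... sync A, then B, then replace src2.ini with the revision that drops x
# and sync B again to observe the behavior:
with open("src2.ini", "w") as fp:
    fp.write(SRC2_INI_NO_X)
```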