-
-
Notifications
You must be signed in to change notification settings - Fork 116
Redo netvm setup after unpause #722
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #722 +/- ##
==========================================
- Coverage 70.40% 70.32% -0.09%
==========================================
Files 61 61
Lines 13682 13739 +57
==========================================
+ Hits 9633 9662 +29
- Misses 4049 4077 +28
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
e3c1b74
to
9d232cd
Compare
9d232cd
to
97e2ac8
Compare
97e2ac8
to
75c1084
Compare
OpenQA test summaryComplete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2025102219-4.3&flavor=pull-requests Test run included the following:
New failures, excluding unstableCompared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2025081011-4.3&flavor=update
Failed tests12 failures
Fixed failuresCompared to: https://openqa.qubes-os.org/tests/149225#dependencies 84 fixed
Unstable testsPerformance TestsPerformance degradation:17 performance degradations
Remaining performance tests:163 tests
|
qubes/tests/integ/network.py
Outdated
self.assertEqual(self.run_cmd(self.testvm1, self.ping_ip), 0) | ||
self.assertEqual(self.run_cmd(self.testvm1, self.ping_name), 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those two should fail, since you changed netvm to None, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes... why it didn't fail, I believe there is a race between
- event domain-unpaused
- ping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is not a race, the detach_network
is not called if netvm is None
. If I try to call it, detachDevice
, it fails on
File "/usr/share/qubes/templates/libvirt/devices/net.xml", line 7, in top-level template code
<backenddomain name="{{ vm.netvm.name }}" />
^^^^^^^^^^^^^^^^^^^^^
jinja2.exceptions.UndefinedError: 'None' has no attribute 'name'
So I have to pass the vm.netvm
somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did manage to write the jinja, but failed as vm.ip
is None
. The ip
needs to be saved so it can be removed with vif-route-qubes
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe detach can be changed to retrieve info from libvirt, not VM properties? If the XML is needed, maybe you can take it out from libvirt_domain.XMLDesc()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems viable, yes:
<interface type='ethernet'>
<mac address='00:16:3e:5e:6c:00'/>
<ip address='10.137.0.21' family='ipv4'/>
<script path='vif-route-qubes'/>
<backenddomain name='test-inst-netvm1'/>
</interface>
qubes/vm/mix/net.py
Outdated
except (qubes.exc.QubesException, libvirt.libvirtError): | ||
vm.log.warning("Cannot attach network", exc_info=1) | ||
|
||
def reset_deferred_netvm(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"reset" is a bit unfortunate here (means more "clear" than re-set). Maybe "apply"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
qubes/vm/mix/net.py
Outdated
): # pylint: disable=unused-argument | ||
"""Check for deferred netvm changes in case qube was paused while | ||
changes happened.""" | ||
if getattr(self, "is_preload", False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be problematic - self.use_preload()
is also called in a domain-unpaused
handler, so depending on execution order, is_preload
may be false here already. See fire_event
docstring, and check if the order is correct according to that description (and add info why it's correct into commit message). But if the order is wrong, or undefined, this will need some other solution
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# qubes/vm/dispvm.py
@qubes.events.handler("domain-unpaused")
def on_domain_unpaused_dispvm(...):
if self.is_preload:
self.use_preload()
def use_preload(...):
...
self.apply_deferred_netvm()
self.preload_requested = None
...
# qubes/vm/mix/net.py
@qubes.events.handler("domain-unpaused")
def on_domain_unpaused_net(...):
if getattr(self, "is_preload", False):
return
self.apply_deferred_netvm()
def apply_deferred_netvm(...):
deferred_from = self.features.get("deferred-netvm-original", None)
if deferred_from is None:
return
So the is_preload
is only False
after running the apply_deferred_netvm()
from DispVM.use_preload
. If it is False
and reaches on_domain_unpaused_net()
, it will call the apply_deferred_netvm
but won't execute because deferred_from is None
. I will think how to include this in the commit message.
c093f2b
to
38cc8bb
Compare
qubes/vm/mix/net.py
Outdated
vm.ip6 = ip_elem.get("address") | ||
self.libvirt_domain.detachDevice( | ||
self.app.env.get_template("libvirt/devices/net.xml").render(vm=self) | ||
self.app.env.get_template("libvirt/devices/net.xml").render(vm=vm) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could simply use XML you already got (interface
variable), instead of re-building it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is much simpler, thanks for the correction.
38cc8bb
to
dd74dcc
Compare
with self.subTest("resetting netvm to default"): | ||
original_netvm = vm.netvm.name if vm.netvm else "" | ||
del vm.netvm | ||
mock_detach.assert_called() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it should be called on paused vm. It looks like you are simply missing reset_mock() calls after previous sub-test.
dd74dcc
to
0ce5c48
Compare
Starting a client stats the netvm. Also, the netvm is already started in |
Oh no, now the test failed in network_ipv6 job :( I'll look at it a bit... |
FYI update on debugging this: I think it's a kernel issue, I reported it to xen-devel, and in parallel came up with QubesOS/qubes-linux-kernel#1193, but based on test results, it doesn't fully fix it yet (for some reason, tests in network_ipv6 still fail...) |
So, at this point, I'd like to merge this, even if the new test sometimes (often) fail. I'll do one more test run, and depending on the outcome, maybe mark it as expected failure or something. After all this PR fixes actual issue and failure seems to be caused by a bug elsewhere, not here. |
This comment was marked as outdated.
This comment was marked as outdated.
Applying deferred netvm for preloaded disposables is handled separately to ensure that before a disposable is returned to the user, the networking is already set up. If the domain-unpaused event of the NetVMMixin kicks in before the preload is used, it is ignored by the "is_preload" attribute, if it kicks after, it is ignored by the absent "deferred-netvm-original" feature. Fixes: QubesOS/qubes-issues#10173 For: QubesOS/qubes-issues#1512
Concatenating empty ('') value is lost on translation. For: QubesOS/qubes-issues#1512
Therefore I will set to skip shutdown_old and purge_old for ipv4 and ipv6. |
1a2155a
to
fe20264
Compare
Fixes: QubesOS/qubes-issues#10173
For: QubesOS/qubes-issues#1512
Requires on the templates: QubesOS/qubes-core-agent-linux#603