-
Notifications
You must be signed in to change notification settings - Fork 1.1k
try to make certificate addition/removal reloadable in some cases #1468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
73e7946
to
a8b091c
Compare
f.l.WithError(err).WithField("udpAddr", addr). | ||
WithField("handshake", m{"stage": 1, "style": "ix_psk0"}).WithField("cert", remoteCert). | ||
Info("Unable to handshake with host due to missing certificate version") | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change makes this flow actually work
them(v1+v2) ----handshake(v1)---> me (v2 only)
a potential drawback is if them
is actually an old client with only v1 certs, we may make it upset?
with this current flow, we'll also create a connectionstate with mismatched cert versions. I think this is actually fine? But maybe we should try to trigger a rehandshake when the connectionmanager sees that, so everyone can know everyone's VPN IPs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think yes. I added that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm maybe we should only bother rehandshaking if there's a meaningful difference though? (like different number of VPN-IPs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a potential drawback is if them is actually an old client with only v1 certs, we may make it upset?
I think this is all of the problem, how does the remote behave when encountered with an unknown certificate version. I think anything pre 1.9.3ish is going to have a bad time.
hmm maybe we should only bother rehandshaking if there's a meaningful difference though? (like different number of VPN-IPs)
The only thing we can safely make a determination about is our own local state. Is this tunnel using a certificate at the current configured version level? Is the certificate the latest one available? I would be very wary of trying to reason about the remote state in a re-handshake flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing we can safely make a determination about is our own local state. Is this tunnel using a certificate at the current configured version level? Is the certificate the latest one available? I would be very wary of trying to reason about the remote state in a re-handshake flow.
Exactly, I agree. What I was referring to here was:
- I define two parties, one named Host, with v1+v2 certs, and Peer, with only a v2 cert
- Host handshakes to Peer. Host uses a V1 cert, as this is the default, most-compatible behavior
- Peer doesn't have a V1 cert. Peer replies to Host with a V2 cert.
- Host accepts Peer's V2 cert
- Host now has a decision to make:
a. if Host'scertv1.Networks() == certv2.Networks()
, (imo), there's no reason to do anything. Just use this tunnel.
b. else, (imo) it's worth having Host trigger a rehandshake with its V2 cert, so Peer will be informed of all of Host's VPN IPs. This is important on mixed-dual-stack networks for things like relays to behave reliably (and dual stack configurations that lack a dual-stack lighthouse!)
c. currently this PR always chooses path (b)
All the other rehandshake-flow bits should remain in place as they always have. I think my phrasing was poor: I meant "bother rehandshaking when scenario 5.b applies"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is all of the problem, how does the remote behave when encountered with an unknown certificate version. I think anything pre 1.9.3ish is going to have a bad time.
We should specifically test this to be sure, but shouldn't it just drop the packet? We'll error unmarshalling the not-proto in stage 1 and delete the handshake-hostinfo.
This class of problem should also only crop up when an un-updated, v1-only host, attempts to contact a v2-only host, I think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think it's "initiator only understands v1 and responder only has v2".
The other side of this is "initiator only has v1 but understands v2 and is trying to handshake with an ipv6 overlay address" which isn't allowed at the moment through https://github.com/slackhq/nebula/pull/1468/files#diff-d27b1b01b189fd93eae405de59c38e5bd23ff1c9868a0dabac92703842666b78R31
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did confirm the packet gets dropped with an ugly log line, no bad behavior occurs otherwise
345ceef
to
eb5954f
Compare
…ot match with a cert that we can indeed match
…sion than our peer, and we have a newer-version cert available
eb5954f
to
dc3081e
Compare
…unnels (this is fine as long as we don't let you change your set of vpnNetworks on reload)
WithField("fingerprint", remoteCert.Fingerprint). | ||
Info("Remote certificate is no longer valid, tearing down the tunnel") | ||
return true | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, I'd leave out the else branch and return false
at the end of the function to try and avoid future shenanigans
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda see this the other way. If all the return statements are inside a conditional, anything outside the conditions is unreachable. It avoids future shenanigans in the form of "someone adds another condition that doesn't have a return
".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strong opinion loosely held though, I can swap it if you wanna
if hh.initiatingVersionOverride != cert.VersionPre1 { | ||
v = hh.initiatingVersionOverride | ||
} else if v < cert.Version2 { | ||
// If we're connecting to a v6 address we must use a v2 cert |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the must
here is really a should try super hard but maybe its ok
No description provided.