-
Notifications
You must be signed in to change notification settings - Fork 1
Registry protocols implementation #34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ion from a hobby project Signed-off-by: AbstractionFactory <[email protected]>
For some reason the Windows tests are timing out, but I can't reproduce it locally at all, even though I have a Windows box. :( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left some minor notes inline as I was reading, but a lot of it is subjective and I don't feel super strongly about it.
I must confess that I'm not super familiar with Swagger and so although I did review the YAML docs and the generated code I was relying on some assumptions and guesses as to how that works and how things might change if the protocols evolve in future.
Overall I think this is a great start and since this library is explicitly experimental I think it would be fine to merge this as-is or almost-as-is and then iterate on smaller details, rather than trying to "perfect" everything here in one huge PR.
RegistryPortNumber: | ||
format: int64 | ||
type: integer | ||
RegistryServiceDiscoveryResponse: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I believe this does correctly capture the main three documented service types, I have a light concern that this is specified in a much more rigid way than the library that OpenTofu itself currently uses for service discovery, which seems to be written intentionally to minimize assumptions about which specific service types are defined, so it therefore wouldn't always be necessary to update that library when a new service is defined.
Offering a different level of abstraction here might make this particular client harder to evolve if the set of protocols changes in ways that are permitted by the flexibility of the upstream library but not so easy to achieve with this rigid definition. 🤔
Do you think it would be reasonable to make this particular package a little more generic, allowing callers to specify arbitrary service identifiers to request rather than hard-coding them in the schema here, and then let the other protocol-specific packages be responsible for choosing what to request? For example, the providerregistryprotocol
package could be the one to "know" that providers.v1
is the service identifier for that protocol and that its property is just a base URL string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree that the service discovery could be more flexible, I tried to implement this with convenience in mind, otherwise the entire file would basically be a map[string]any
, which isn't exactly helpful when trying to use it. Personally, I dislike multi-staged schemas because they make things complicated to use. The intent behind this library is to make it easy to use. The current list of services in SD is but a handful and adding schemas for all of them shouldn't expand the library by much, so I would rather pursue that path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing library does have a little more structure than just map[string]any
, though indeed it does require the caller to know which of the two currently-supported "shapes" a particular service keyword uses:
Host.ServiceURL
asserts that the value ought to be a string containing a URL. The library automatically parses the URL and, if it's relative, resolves it to an absolute URL using the discovery document's own URL as the base. This one is the main one used for most services, includingmodules.v1
andproviders.v1
.Host.ServiceOAuthClient
deals with the specialized structure that's currently used only bylogin.v1
to represent a full OAuth client configuration. As withServiceURL
this automatically parses and resolves URLs relative to the discovery document's URL.
A typical pattern for using it would therefore be something like:
services := disco.New()
host, err := disco.Discover(providerAddr.Host)
// (error handling)
registryBaseURL, err := host.ServiceURL("providers.v1")
// (error handling)
client := NewProviderRegistryClient(registryBaseURL)
...so there's no any
exposed in the public API intended for use in client configuration, despite the fact that it does of course use such a representation a little internally. (You can find it exposed in the Disco.ForceHostServices
but that's an OpenTofu CLI configuration concern -- the undocumented host
block for overriding host services during development -- and so not really in scope for what this library is about.)
The spec you've put in this package does explicitly encode the fact that providers.v1
exists and its raw value is a string, but since it just exposes the raw URLs strings without parsing/resolving them I assume the client would be responsible for the URL resolution here, and they don't seem to actually have enough information to perform that step because the final service discovery document URL (after following any redirects) isn't exposed anywhere.
I guess a way to get "the best of both worlds" here would be to retain this package's awareness of the specific services providers.v1
, modules.v1
, and login.v1
but to bring in the logic that parses and resolves the URLs to absolute *url.URL
, instead of just exposing the raw URL strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My goal here was to ultimately avoid a two-phased handling and also create a unified OpenAPI document that people can easily implement. We should consider if that is of value to us at the expense of flexibility.
registry/servicediscovery/client.go
Outdated
} | ||
} | ||
|
||
func WithEndpoint(endpoint string) ClientOpt { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having this be WithHostname
(or ForHostname
?) and taking a svchost.Hostname
might make this package tessellate better with how surrounding packages work.
For example, tfaddr.Provider
and tfaddr.ModulePackage
from the "registry-address" module both provide the hostname portion of the source address as a svchost.Hostname
, and so someone using both of these libraries together could just pass the hostname over here without needing first transform it into a full base URL. (That's also, incidentally, how the corresponding code in OpenTofu itself is treating it.)
Relatedly, RFC 8615 requires that the /.well-known/
segment always appear at the start of the path, so it would be non-compliant for the discovery document to be placed anywhere other than the root of the authority implied by the svchost.Hostname
value. It seems reasonable to expose that situation as an alternative for contrived situations like unit tests, but I think it's preferable for the main entry point to force a spec-compliant implementation, especially when that also allows some consistency with the other library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would tend to agree with the RFC reference. However, I would maybe consider adding a bit of flexibility in order to enable use cases where someone wants to host a "mini registry" on GitHub Pages without a dedicated domain name. Maybe we should make OpenTofu more flexible instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is quite a stack of assumptions built on the idea that service providers are identified by so-called "friendly hostnames", including their use in the source address syntax for both providers and registry-published modules, so it isn't really clear to me how we would get there from where we currently are, but I imagine that's a much bigger discussion than this PR.
It seems strange to me for this client to prematurely offer such flexibility before we've designed how that flexibility might manifest in the rest of the system. I would expect this library to implement the protocols as they are currently defined first, and then evolve along with the rest of the system if we choose to change the protocols later.
// Copyright (c) The OpenTofu Authors | ||
// SPDX-License-Identifier: MPL-2.0 | ||
|
||
package providerregistryprotocol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is certainly not the most important thing and I don't feel strongly about it, but this seems rather long compared to typical Go package naming conventions, given that the package name often appears with high frequency in calling code. 🤔
The compromise I thought about is package providerregistry
: still a little long, but the noun "protocol" doesn't seem super crucial here since everything in this module is related to protocols.
(Similar comment for package providermirrorprotocol
too.)
description: |- | ||
A list of hashes for the ZIP archive. | ||
|
||
TODO: describe the hash format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is certainly not something we need to treat as a blocker for merging the client code here 😀 but I'd separately like to factor out the calculation functions for both of the currently-supported hash schemes (the h1:
and zh:
schemes) into some other library.
It seems beyond the scope of this library, which is focused on the registry protocols themselves rather than the specific data within, but perhaps there's something in a separate library that knows how to work with the actual provider packages (both in their .zip
archive and unpacked forms), including generating checksums for them, and then that could be shared between OpenTofu's package getproviders
and separate tools such as Renovate updating a dependency lock file, building a provider mirror with more flexibility than tofu providers mirror
offers today, etc.
(This comment is really just notes we might refer back to in future, not actually feedback on your PR. Feel free to immediately resolve it. 😀 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, we should describe the hash format, I missed this. This library intends to make working with registries (e.g. building a proxy, etc.) as simple as possible and the hash formats are definitely part of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Circling back to this, I think we should still make sure that this description contains some language-agnostic description for the hash format. h1:
may be difficult to describe in that way though.
} | ||
|
||
if resp.ProvidersV1 == "" { | ||
return nil, fmt.Errorf("no providers endpoint found") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The package name registryclient
suggested to me that this particular component was intended to be broader than just provider registries, and would potentially grow to support module registries in future too. If that's true, then I would've expected this to support a registry that only supports one of the two current registry protocols, such as only offering a module registry and not also a provider registry.
If your intention was for this to be only for provider registry then perhaps the package name could be more specific to allow for adding a separate package for module registry client later?
Alternatively, I imagine a potential design where this ServiceDiscovery
method fails only if the overall service discovery protocol failed in itself, regardless of which specific services were returned, and then the returned DiscoveredClient
object is a struct with a method like ProviderClient() (providerregistryprotocol.Client, error)
where this error
is where we'd report "no providers endpoint found", since the caller has then explicitly requested that particular service.
All of this is loose ideas, not hard objection. Seems like there's a few different approaches here with some different tradeoffs, so I'm not meaning to try to force any particular design.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct, once module registries are added this will need to be changed. Currently, they are not implemented in this version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally looks ok.
I cannot really add much info on the schemas since I am not that familiar with the registry's API. But the general structure and overall idea, looks ok to me.
One thing, yeah, the name of the packages could be shorter here and there.
But definitely looks good.
Signed-off-by: AbstractionFactory <[email protected]>
Donating the following from a hobby project: