-
Notifications
You must be signed in to change notification settings - Fork 26
estuary-cdk: refactor the placement of client creds in OAuth requests #3503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors OAuth client credentials placement in the estuary-cdk to support flexible credential placement strategies (headers vs. form body) and simplifies the OAuth specification models.
- Introduces a builder pattern via
with_client_credentials_placement()method to override default credential placement - Merges
OAuth2RotatingTokenSpecintoOAuth2Specby addingadditionalTokenExchangeBodyfield with a default empty dict - Simplifies token exchange logic by using
grant_typeandclient_credentials_placementClassVars to eliminate redundant match/case logic
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| estuary-cdk/estuary_cdk/flow.py | Adds RequestDataPlacement type, adds grant_type and client_credentials_placement ClassVars to OAuth credential classes, adds with_client_credentials_placement() method to credential classes, merges OAuth2RotatingTokenSpec into OAuth2Spec, and refactors for_provider() methods to use cls parameter |
| estuary-cdk/estuary_cdk/http.py | Refactors _fetch_oauth2_token() to use credential ClassVars (grant_type, client_credentials_placement) instead of match/case on credential types, updates to use merged OAuth2Spec, and includes formatting improvements |
| estuary-cdk/estuary_cdk/capture/common.py | Removes OAuth2RotatingTokenSpec import |
| estuary-cdk/estuary_cdk/capture/base_capture_connector.py | Formatting and import organization improvements |
| source-zendesk-support-native/source_zendesk_support_native/models.py | Updates OAuth2RotatingTokenSpec to OAuth2Spec |
| source-outreach/source_outreach/models.py | Updates OAuth2RotatingTokenSpec to OAuth2Spec and removes redundant additionalTokenExchangeBody=None parameter |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
c08f6dd to
84f3e25
Compare
Alex-Bair
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a few small comments.
Additionally whenever I make changes that affect multiple connectors, I make sure that the associated tests are passing in the CI checks. That helps put me a little at ease & makes it easier to claim that my CDK level changes didn't break individual connectors. It looks like the source-zendesk-support-native tests are failing, so we should figure out why they're failing & get them passing before merging this PR.
84f3e25 to
73a8389
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
0f5e012 to
4fb85cc
Compare
Alex-Bair
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good! I had a few more comments, but this is getting really close to being ready to be merged.
| @classmethod | ||
| def with_client_credentials_placement( | ||
| cls, | ||
| placement: OAuth2ClientCredentialsPlacement, | ||
| ) -> type[Self]: | ||
| """ | ||
| Returns a subclass with a custom client credentials placement. | ||
| """ | ||
|
|
||
| return type(cls.__name__, (cls,), {"client_credentials_placement": placement}) | ||
|
|
||
| @staticmethod | ||
| def for_provider( | ||
| provider: str, | ||
| ) -> type["AuthorizationCodeFlowOAuth2Credentials"]: | ||
| @classmethod | ||
| def for_provider(cls, provider: str) -> type[Self]: | ||
| """ | ||
| Builds an OAuth2Credentials model for the given OAuth2 `provider`. | ||
| This routine is only available in Pydantic V2 environments. | ||
| """ | ||
| from pydantic import ConfigDict | ||
|
|
||
| class _OAuth2Credentials(AuthorizationCodeFlowOAuth2Credentials): | ||
| model_config = ConfigDict( | ||
| json_schema_extra={"x-oauth2-provider": provider}, | ||
| title="OAuth", | ||
| ) | ||
| return type( | ||
| cls.__name__, | ||
| (cls,), | ||
| { | ||
| "model_config": ConfigDict( | ||
| json_schema_extra={"x-oauth2-provider": provider}, | ||
| title="OAuth", | ||
| ), | ||
| "_you_must_build_oauth2_credentials_for_a_provider": lambda _: None, | ||
| }, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pylance is complaining about the return values for with_client_credentials_placement and for_provider:
Type "type[_]" is not assignable to return type "type[Self@_BaseOAuth2CredentialsData]"
Type "type[_]" is not assignable to type "type[Self@_BaseOAuth2CredentialsData]"PylancereportReturnType
It'd be great to address Pylance's concerns. Does changing the return type to type["_BaseOAuth2CredentialsData"] help?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would help, but the return type would stay the same for child classes -- I think casting to type[Self] would be slightly more expressive 💪
PS: I got the same error to show up on my IDE by migrating from pyright to basedpyright, which is the open source alternative to Pylance. So going forward I should be able to catch these on my own
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I'm not extremely fond of using cast() unless there aren't any better options. cast() doesn't verify the cast is valid. It tells mypy to blindly trust us, and I'd like to keep as many type checking guardrails in place as possible.
What do you think about this approach instead? Double check me, but I think it keeps the expressiveness with the __name__ and __qualname__ replacements:
@classmethod
def for_provider(cls, provider: str) -> type[Self]:
from pydantic import ConfigDict
class _ProviderCredentials(cls):
model_config = ConfigDict(
json_schema_extra={"x-oauth2-provider": provider},
title="OAuth",
)
def _you_must_build_oauth2_credentials_for_a_provider(self) -> ...
_ProviderCredentials.__name__ = cls.__name__
_ProviderCredentials.__qualname__ = cls.__qualname__
return _ProviderCredentialsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did test this approach but no luck :/ even after setting __name__, __qualname__ and __module__ model serialisations still report _ProviderCredentials as their class name.
I could have all dynamic subclasses use _OAuth2Credentials and circumvent this typing issue entirely, at the cost of losing type expressiveness in connector specs. Which option would you prefer?
estuary-cdk/estuary_cdk/flow.py
Outdated
|
|
||
| class OAuth2RotatingTokenSpec(OAuth2Spec): | ||
| additionalTokenExchangeBody: dict[str, str | int] | None | ||
| additionalTokenExchangeBody: dict[str, str | int] = {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The source-zendesk-support-native check is failing, and I believe it's because the additionalTokenExchangeBody it uses is now in this OAuth2Spec class.
The Flow runtime doesn't expect an additionalTokenExchangeBody field to be present in an connector's Spec response, and that's what's causing the check to fail.
Error: failed to process connector output: could not parse "***\"spec\":***\"configSchema\":***\"$defs\":***\"Advanced\":***\"properties\":***\"incremental_export_page_size\":***\"default\":1000,\"description\":\"Page size for incremental export streams. Typically left as the default unless Estuary Support or the connector logs indicate otherwise.\",\"exclusiveMinimum\":0,\"maximum\":1000,\"title\":\"Incremental Export Streams' Page Size\",\"type\":\"integer\"***,\"title\":\"Advanced\",\"type\":\"object\"***,\"ApiToken\":***\"properties\":***\"credentials_title\":***\"const\":\"Email & API Token\",\"default\":\"Email & API Token\",\"title\":\"Credentials Title\",\"type\":\"string\"***,\"username\":***\"title\":\"Email\",\"type\":\"string\"***,\"password\":***\"secret\":true,\"title\":\"API Token\",\"type\":\"string\"***,\"required\":[\"username\",\"password\"],\"title\":\"ApiToken\",\"type\":\"object\"***,\"DeprecatedOAuthCredentials\":***\"properties\":***\"credentials_title\":***\"const\":\"Deprecated OAuth Credentials\",\"default\":\"Deprecated OAuth Credentials\",\"title\":\"Credentials Title\",\"type\":\"string\"***,\"client_id\":***\"secret\":true,\"title\":\"Client Id\",\"type\":\"string\"***,\"client_secret\":***\"secret\":true,\"title\":\"Client Secret\",\"type\":\"string\"***,\"access_token\":***\"secret\":true,\"title\":\"Access Token\",\"type\":\"string\"***,\"required\":[\"client_id\",\"client_secret\",\"access_token\"],\"title\":\"OAuth\",\"type\":\"object\",\"x-oauth2-provider\":\"zendesk\"***,\"RotatingOAuth2Credentials\":***\"properties\":***\"credentials_title\":***\"const\":\"OAuth Credentials\",\"default\":\"OAuth Credentials\",\"title\":\"Credentials Title\",\"type\":\"string\"***,\"client_id\":***\"secret\":true,\"title\":\"Client Id\",\"type\":\"string\"***,\"client_secret\":***\"secret\":true,\"title\":\"Client Secret\",\"type\":\"string\"***,\"refresh_token\":***\"secret\":true,\"title\":\"Refresh Token\",\"type\":\"string\"***,\"access_token\":***\"secret\":true,\"title\":\"Access Token\",\"type\":\"string\"***,\"access_token_expires_at\":***\"format\":\"date-time\",\"title\":\"Access token expiration time.\",\"type\":\"string\"***,\"required\":[\"client_id\",\"client_secret\",\"refresh_token\",\"access_token\",\"access_token_expires_at\"],\"title\":\"OAuth\",\"type\":\"object\",\"x-oauth2-provider\":\"zendesk\"***,\"properties\":***\"subdomain\":***\"description\":\"This is your Zendesk subdomain that can be found in your account URL. For example, in https://***MY_SUBDOMAIN***.zendesk.com, MY_SUBDOMAIN is the value of your subdomain.\",\"pattern\":\"^[a-z][a-z0-9-]***2,62***$\",\"title\":\"Subdomain\",\"type\":\"string\"***,\"start_date\":***\"description\":\"UTC date and time in the format YYYY-MM-DDTHH:MM:SSZ. Any data generated before this date will not be replicated. If left blank, the start date will be set to 30 days before the present.\",\"format\":\"date-time\",\"title\":\"Start Date\",\"type\":\"string\"***,\"credentials\":***\"discriminator\":***\"mapping\":***\"Deprecated OAuth Credentials\":\"#/$defs/DeprecatedOAuthCredentials\",\"Email & API Token\":\"#/$defs/ApiToken\",\"OAuth Credentials\":\"#/$defs/RotatingOAuth2Credentials\"***,\"propertyName\":\"credentials_title\"***,\"oneOf\":[***\"$ref\":\"#/$defs/RotatingOAuth2Credentials\"***,***\"$ref\":\"#/$defs/ApiToken\"***],\"title\":\"Authentication\"***,\"advanced\":***\"$ref\":\"#/$defs/Advanced\",\"advanced\":true,\"description\":\"Advanced settings for the connector.\",\"title\":\"Advanced Config\"***,\"required\":[\"subdomain\",\"credentials\"],\"title\":\"EndpointConfig\",\"type\":\"object\"***,\"resourceConfigSchema\":***\"additionalProperties\":false,\"description\":\"ResourceConfig is a common resource configuration shape.\",\"properties\":***\"_meta\":***\"anyOf\":[***\"additionalProperties\":true,\"type\":\"object\"***,***\"type\":\"null\"***],\"default\":null,\"title\":\"Meta\"***,\"name\":***\"description\":\"Name of this resource\",\"title\":\"Name\",\"type\":\"string\"***,\"interval\":***\"default\":\"PT0S\",\"description\":\"Interval between updates for this resource\",\"format\":\"duration\",\"title\":\"Interval\",\"type\":\"string\"***,\"required\":[\"name\"],\"title\":\"ResourceConfig\",\"type\":\"object\"***,\"documentationUrl\":\"[https://go.estuary.dev/source-zendesk-support-native\](https://go.estuary.dev/source-zendesk-support-native/)",\"resourcePathPointers\":[\"/name\"],\"oauth2\":***\"provider\":\"zendesk\",\"accessTokenBody\":\"***\\\"grant_type\\\": \\\"authorization_code\\\", \\\"code\\\": \\\"*** code ***\\\", \\\"client_id\\\": \\\"*** client_id ***\\\", \\\"client_secret\\\": \\\"*** client_secret ***\\\", \\\"redirect_uri\\\": \\\"*** redirect_uri ***\\\", \\\"scope\\\": \\\"read\\\", \\\"expires_in\\\": 172800***\",\"authUrlTemplate\":\"https://*** config.subdomain ***.zendesk.com/oauth/authorizations/new?response_type=code&client_id=***#urlencode*** client_id ***/urlencode***&redirect_uri=***#urlencode*** redirect_uri ***/urlencode***&scope=read&state=***#urlencode*** state ***/urlencode***\",\"accessTokenHeaders\":***\"Content-Type\":\"application/json\"***,\"accessTokenResponseMap\":***\"access_token\":\"/access_token\",\"refresh_token\":\"/refresh_token\",\"access_token_expires_at\":\"***#now_plus*** expires_in ***/now_plus***\"***,\"accessTokenUrlTemplate\":\"https://*** config.subdomain ***.zendesk.com/oauth/tokens\",\"additionalTokenExchangeBody\":***\"expires_in\":172800***,\"protocol\":3032023***\n" into JSON response: unknown field `additionalTokenExchangeBody`, expected one of `provider`, `auth_url_template`, `authUrlTemplate`, `access_token_url_template`, `accessTokenUrlTemplate`, `access_token_method`, `accessTokenMethod`, `access_token_body`, `accessTokenBody`, `access_token_headers_json_map`, `accessTokenHeaders`, `access_token_response_json_map`, `accessTokenResponseMap`, `refresh_token_url_template`, `refreshTokenUrlTemplate`, `refresh_token_method`, `refreshTokenMethod`, `refresh_token_body`, `refreshTokenBody`, `refresh_token_headers_json_map`, `refreshTokenHeaders`, `refresh_token_response_json_map`, `refreshTokenResponseMap` at line 1 column 4640
Before this refactor, the additionalTokenExchangeBody was purely a connector internal concept & wasn't communicated in the connector's Spec response. I think it should remain so too since the connector performs token exchanges, not the runtime.
4fb85cc to
382070d
Compare
A new `with_client_credentials_location` class method has been implemented to specify whether client ids and secrets will go into request headers or forms. To make it chainable with `for_provider`, the latter has been turned into a class method too -- a later commit will replace the redundant definitions with a single one. In the process of adapting `for_provider` to this new scheme it was also determined that it was preferrable to have dynamically created classes report their original OAuth flavour instead of a generic `_OAuthCredentials`. This change required spec updates for six different connectors.
382070d to
7ea8818
Compare
Alex-Bair
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % a couple comments.
If you're alright with it, let's wait and merge this next week when more folks are around to support after Thanksgiving.
|
|
||
| class OAuth2RotatingTokenSpec(OAuth2Spec): | ||
| additionalTokenExchangeBody: dict[str, str | int] | None | ||
| additionalTokenExchangeBody: dict[str, str | int] = Field(default={}, exclude=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice use of exclude=True to prevent additionalTokenExchangeBody from being serialized in the connector's Spec response. Can you also add a comment here explaining why we need to exclude additionalTokenExchangeBody from serialization? A short summary of what I mentioned in this comment would be good IMO.
| @classmethod | ||
| def with_client_credentials_placement( | ||
| cls, | ||
| placement: OAuth2ClientCredentialsPlacement, | ||
| ) -> type[Self]: | ||
| """ | ||
| Returns a subclass with a custom client credentials placement. | ||
| """ | ||
|
|
||
| return type(cls.__name__, (cls,), {"client_credentials_placement": placement}) | ||
|
|
||
| @staticmethod | ||
| def for_provider( | ||
| provider: str, | ||
| ) -> type["AuthorizationCodeFlowOAuth2Credentials"]: | ||
| @classmethod | ||
| def for_provider(cls, provider: str) -> type[Self]: | ||
| """ | ||
| Builds an OAuth2Credentials model for the given OAuth2 `provider`. | ||
| This routine is only available in Pydantic V2 environments. | ||
| """ | ||
| from pydantic import ConfigDict | ||
|
|
||
| class _OAuth2Credentials(AuthorizationCodeFlowOAuth2Credentials): | ||
| model_config = ConfigDict( | ||
| json_schema_extra={"x-oauth2-provider": provider}, | ||
| title="OAuth", | ||
| ) | ||
| return type( | ||
| cls.__name__, | ||
| (cls,), | ||
| { | ||
| "model_config": ConfigDict( | ||
| json_schema_extra={"x-oauth2-provider": provider}, | ||
| title="OAuth", | ||
| ), | ||
| "_you_must_build_oauth2_credentials_for_a_provider": lambda _: None, | ||
| }, | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, I'm not extremely fond of using cast() unless there aren't any better options. cast() doesn't verify the cast is valid. It tells mypy to blindly trust us, and I'd like to keep as many type checking guardrails in place as possible.
What do you think about this approach instead? Double check me, but I think it keeps the expressiveness with the __name__ and __qualname__ replacements:
@classmethod
def for_provider(cls, provider: str) -> type[Self]:
from pydantic import ConfigDict
class _ProviderCredentials(cls):
model_config = ConfigDict(
json_schema_extra={"x-oauth2-provider": provider},
title="OAuth",
)
def _you_must_build_oauth2_credentials_for_a_provider(self) -> ...
_ProviderCredentials.__name__ = cls.__name__
_ProviderCredentials.__qualname__ = cls.__qualname__
return _ProviderCredentials7ea8818 to
4b5042f
Compare
Description:
So far, each OAuth flavour in the CDK had a hardcoded place where OAuth client credentials were supposed to go -- either the headers, or the request body. However, in the process of developing the QuickBooks connector (#3468) we realised its rotating tokens system broke the assumed standard: client ids and secrets were to be concatenated and base64-encoded before being placed in the headers.
This PR adds a new class method to our OAuth models:
with_client_credentials_placement, which allows for a builder pattern-like interface to override what we'd used as the standard so far. This means no pre-existing connector code needed to be updated.Two smaller changes were effected too
OAuth2RotatingTokenSpecgot merged intoOAuth2Specand now the latter supports theadditionalTokenExchangeBodyfieldheadersandformdictionaries has been simplifiedWorkflow steps:
(How does one use this feature, and how has it changed)
Documentation links affected:
(list any documentation links that you created, or existing ones that you've identified as needing updates, along with a brief description)
Notes for reviewers:
Changes have been tested on the following connectors: