-
Notifications
You must be signed in to change notification settings - Fork 18.5k
feat: add new table of end user oauth #28351
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feat/end-user-oauth
Are you sure you want to change the base?
Conversation
Summary of ChangesHello @CourTeous33, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a new database model, Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a new SQLAlchemy model EndUserAuthenticationProvider to store authentication credentials for end-users in tools. The changes look good and follow the existing patterns in the codebase. I've made a few suggestions to improve database portability and consistency by using SQLAlchemy's backend-agnostic functions and adding a missing Python-level default. Overall, a solid addition.
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
|
||
| # id of the authentication provider | ||
| id: Mapped[str] = mapped_column(StringUUID, primary_key=True, default=uuidv7, init=False) | ||
| name: Mapped[str] = mapped_column( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, enforcing name uniqueness for a (provider, end_user_id) pair creates more overhead than value.
Ideally, we should rely on IDs as the primary identifier and treat names merely as user-facing hints. Therefore, strict uniqueness shouldn't be required. If a user creates duplicate names, they should bear the responsibility for any ambiguity.
Enforcing name uniqueness has caused implementation issues in the past, such as with credential management in the EE version. Furthermore, it complicates name generation.
That said, I suggest discussing this with the PM. I’ve skimmed the PRD and found no explicit requirement for name uniqueness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, should we remove the unique constraint at all? Since ID is enough for uniqueness
| sa.Column( | ||
| "id", | ||
| models.types.StringUUID(), | ||
| server_default=sa.text("uuid_generate_v4()"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would break MySQL support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will edit the script at once after all comments got resolved at the new provider
|
/gemini review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a new table, tool_enduser_authentication_providers, for managing end-user authentication credentials, which is a valuable addition. The implementation includes a new Alembic migration and updates to the SQLAlchemy models. However, my review found several critical and high-severity inconsistencies between the model definition in api/models/tools.py and the database schema defined in the migration file. These discrepancies relate to unique constraints, column types, nullability, default values, and database indexes. It is crucial to resolve these mismatches to ensure data integrity, prevent potential runtime errors, and maintain a consistent schema. I have provided specific comments and code suggestions to address each of these issues.
| __table_args__ = ( | ||
| sa.UniqueConstraint("end_user_id", "provider", "name"), | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unique constraint in the model definition is ("end_user_id", "provider", "name"), but the corresponding Alembic migration defines it as ("tenant_id", "provider", "end_user_id", "name"). This inconsistency can lead to unexpected database behavior and ORM errors. The constraint in the migration seems more correct as it includes tenant_id, ensuring uniqueness across the whole system. Please update the model to match the migration.
| __table_args__ = ( | |
| sa.UniqueConstraint("end_user_id", "provider", "name"), | |
| ) | |
| __table_args__ = ( | |
| sa.UniqueConstraint("tenant_id", "provider", "end_user_id", "name"), | |
| ) |
| ), | ||
| sa.Column("tenant_id", models.types.StringUUID(), nullable=False), | ||
| sa.Column("end_user_id", models.types.StringUUID(), nullable=False), | ||
| sa.Column("provider", sa.String(length=256), nullable=False), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The provider column is defined as sa.String(length=256), but the corresponding model EndUserAuthenticationProvider defines it as LongText. This can cause data truncation if a provider name exceeds 256 characters. Please use models.types.LongText() for consistency with the model and to avoid potential data loss.
| sa.Column("provider", sa.String(length=256), nullable=False), | |
| sa.Column("provider", models.types.LongText(), nullable=False), |
| name: Mapped[str] = mapped_column( | ||
| String(256), | ||
| nullable=False, | ||
| default="API KEY 1", | ||
| index=True | ||
| ) | ||
| # id of the tenant | ||
| tenant_id: Mapped[str] = mapped_column(StringUUID, nullable=False) | ||
| # id of the end user | ||
| end_user_id: Mapped[str] = mapped_column(StringUUID, nullable=False, index=True) | ||
| # name of the tool provider | ||
| provider: Mapped[str] = mapped_column(LongText, nullable=False, index=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are several inconsistencies between the model definition and the Alembic migration regarding indexes:
name: The model specifiesindex=True, but the migration does not create a separate index for this column.provider: The model specifiesindex=True, but the migration does not create a separate index.tenant_id: The migration creates an index on this column, but the model does not specifyindex=True.
These discrepancies can lead to performance issues and schema drift. Please ensure the model definition and the migration are synchronized. It's recommended to make the model the source of truth and adjust the migration accordingly.
| # name of the tool provider | ||
| provider: Mapped[str] = mapped_column(LongText, nullable=False, index=True) | ||
| # encrypted credentials for the end user | ||
| encrypted_credentials: Mapped[str] = mapped_column(LongText, nullable=False, default="") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The encrypted_credentials column is defined as non-nullable (nullable=False) with a Mapped[str] type hint, but the migration file defines it as nullable. This is inconsistent. To align with the database schema and avoid potential runtime errors, it should be defined as nullable. The BuiltinToolProvider model uses Mapped[str | None] and nullable=True, which is a good pattern to follow.
| encrypted_credentials: Mapped[str] = mapped_column(LongText, nullable=False, default="") | |
| encrypted_credentials: Mapped[str | None] = mapped_column(LongText, nullable=True, default=None) |
| ) | ||
|
|
||
| # id of the authentication provider | ||
| id: Mapped[str] = mapped_column(StringUUID, primary_key=True, default=uuidv7, init=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The id column uses default=uuidv7, while the migration uses uuid_generate_v4(). Furthermore, other models in this file, like BuiltinToolProvider, use uuid4. For consistency within the codebase and with the database schema, please use a consistent UUID generation method. Using uuid4 would align with other models in this file.
| id: Mapped[str] = mapped_column(StringUUID, primary_key=True, default=uuidv7, init=False) | |
| id: Mapped[str] = mapped_column(StringUUID, primary_key=True, default=lambda: str(uuid4()), init=False) |
| created_at: Mapped[datetime] = mapped_column( | ||
| sa.DateTime, nullable=False, default=datetime.now, init=False | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The created_at column uses default=datetime.now. This can lead to issues with timezones as it may not be UTC-aware. It's better to use a consistent, timezone-aware UTC datetime. The BuiltinToolProvider model uses server_default=func.current_timestamp(), which is a good practice for ensuring database-level timestamping in UTC. Please update this to be consistent and avoid potential timezone bugs.
| created_at: Mapped[datetime] = mapped_column( | |
| sa.DateTime, nullable=False, default=datetime.now, init=False | |
| ) | |
| created_at: Mapped[datetime] = mapped_column( | |
| sa.DateTime, nullable=False, server_default=func.current_timestamp(), init=False | |
| ) |
Important
Fixes #<issue number>.Summary
Added a new table for end user auth:
id: Unique identifier for each authentication entry (auto-generated)
name: A human-readable name for the credentials (defaults to "API KEY 1")
tenant_id: Which tenant/organization this belongs to
end_user_id: Which specific end user owns these credentials
provider: The name of the tool/service provider (e.g., "github", "google", etc.)
encrypted_credentials: The actual credentials stored in encrypted JSON format
credential_type: What kind of authentication this is (defaults to "api-key", but could be "oauth2", etc.)
created_at: When the credentials were first added
updated_at: When they were last modified
expires_at: When the credentials expire (defaults to -1, meaning no expiration)
Screenshots
Checklist
dev/reformat(backend) andcd web && npx lint-staged(frontend) to appease the lint gods