feat(backend): add SQLAlchemy infrastructure for database operations #11419

Swiftyos · 2025-11-20T09:36:54Z

Summary

Adds SQLAlchemy infrastructure to the backend as a foundation for incrementally replacing Prisma for runtime database operations, while keeping Prisma for migration generation.

Changes

Core Infrastructure

backend/data/sqlalchemy.py: Async SQLAlchemy engine with connection pooling
- Engine creation with QueuePool (10 persistent + 5 overflow connections)
- Session factory for dependency injection
- get_session() FastAPI dependency
- Lifecycle management (initialize(), dispose())
backend/data/sqlalchemy_test.py: Comprehensive test suite
- URL conversion, schema extraction, engine creation
- Session factory and dependency injection tests
- All tests passing ✅

Configuration

backend/util/settings.py: SQLAlchemy settings
- Pool size, overflow, timeouts
- Echo mode for debugging
backend/.env.default: Default environment variables

Service Integration

backend/executor/database.py: DatabaseManager lifespan
backend/server/rest_api.py: AgentServer lifespan

Both services now initialize SQLAlchemy on startup and dispose on shutdown.
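
The startup/shutdown pairing described above can be sketched as an async context manager, which is the shape FastAPI's lifespan hook expects. This is a minimal illustration rather than the PR's code: `FakeEngine` and the `app_state` dict are hypothetical stand-ins so the example runs without a database or a web framework.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeEngine:
    """Hypothetical stand-in for an async SQLAlchemy engine."""

    def __init__(self) -> None:
        self.disposed = False

    async def dispose(self) -> None:
        # A real engine would close its pooled connections here.
        self.disposed = True


@asynccontextmanager
async def lifespan(app_state: dict):
    # Startup: create the engine once and share it through app state.
    engine = FakeEngine()
    app_state["db_engine"] = engine
    try:
        yield
    finally:
        # Shutdown: always release pooled connections, even on error.
        await engine.dispose()


async def run_service() -> FakeEngine:
    state: dict = {}
    async with lifespan(state):
        engine = state["db_engine"]
        assert not engine.disposed  # usable while the service runs
    return engine


engine_after_shutdown = asyncio.run(run_service())
```

Putting `dispose()` in a `finally` block means the pool is released even if the service body raises.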

Dependencies

pyproject.toml: Added sqlalchemy[asyncio] and asyncpg

Technical Details

Connection Pool:

10 persistent connections per service
5 overflow connections
30s pool timeout
Pre-ping enabled
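
To make the pool numbers above concrete, here is a small sketch of how they might be grouped and what they imply for connection counts. `PoolSettings` is a hypothetical holder, not the PR's settings class; the dictionary keys match SQLAlchemy's pool-related engine parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolSettings:
    """Hypothetical settings holder; defaults are the values listed above."""

    pool_size: int = 10
    max_overflow: int = 5
    pool_timeout: float = 30.0
    pool_pre_ping: bool = True

    def engine_kwargs(self) -> dict:
        # Keys match SQLAlchemy's pool-related create_engine() parameters.
        return {
            "pool_size": self.pool_size,
            "max_overflow": self.max_overflow,
            "pool_timeout": self.pool_timeout,
            "pool_pre_ping": self.pool_pre_ping,
        }

    @property
    def max_connections(self) -> int:
        # Upper bound on simultaneous connections held by one service.
        return self.pool_size + self.max_overflow


settings = PoolSettings()
connections_for_two_services = 2 * settings.max_connections
```

With SQLAlchemy installed, these would typically be passed as `create_async_engine(url, **settings.engine_kwargs())`. `poolclass` is left out on purpose, since `create_async_engine` already defaults to an asyncio-compatible queue pool. With these defaults each service can hold up to 15 connections, so the two services together could hold up to 30, which is worth checking against the PostgreSQL `max_connections` limit.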

Schema Handling:

Extracts from existing DATABASE_URL
Sets via search_path parameter
Compatible with Prisma configuration
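
As an illustration of the schema handling above, the sketch below pulls the `schema` query parameter out of a Prisma-style URL and turns it into a `search_path` setting. The helper names and the sample URL are hypothetical. `server_settings` is the asyncpg connection argument for session-level settings; passing it through `connect_args` is an assumption about how the PR wires it up.

```python
from urllib.parse import parse_qs, urlsplit


def extract_schema(database_url: str, default: str = "public") -> str:
    """Return the value of the `schema` query parameter, or a default."""
    query = parse_qs(urlsplit(database_url).query)
    values = query.get("schema")
    return values[0] if values else default


def build_connect_args(database_url: str) -> dict:
    # Assumption: the schema is applied by setting search_path per connection.
    schema = extract_schema(database_url)
    return {"server_settings": {"search_path": schema}}


# Hypothetical example URL in the Prisma format.
url = "postgresql://user:pass@localhost:5432/db?schema=platform&connect_timeout=60"
connect_args = build_connect_args(url)
```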

Session Lifecycle:

Automatic transaction management
Commit on success, rollback on error
Connection returned to pool after use
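
The commit/rollback/close ordering above can be demonstrated without a database. In this sketch `FakeSession` is a hypothetical stand-in that only records which calls it receives, and `session_scope` is an illustrative name, not the PR's `get_session()`.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeSession:
    """Hypothetical stand-in that records the calls a real session would get."""

    def __init__(self) -> None:
        self.calls: list[str] = []

    async def commit(self) -> None:
        self.calls.append("commit")

    async def rollback(self) -> None:
        self.calls.append("rollback")

    async def close(self) -> None:
        self.calls.append("close")


@asynccontextmanager
async def session_scope(session: FakeSession):
    try:
        yield session
        await session.commit()  # reached only if the body did not raise
    except Exception:
        await session.rollback()  # undo pending changes, then re-raise
        raise
    finally:
        await session.close()  # always hand the connection back


async def demo() -> tuple[list[str], list[str]]:
    ok = FakeSession()
    async with session_scope(ok):
        pass  # successful unit of work

    failed = FakeSession()
    try:
        async with session_scope(failed):
            raise ValueError("simulated failure")
    except ValueError:
        pass
    return ok.calls, failed.calls


ok_calls, failed_calls = asyncio.run(demo())
```

The `@asynccontextmanager` decorator is what makes `async with` work here. A bare async generator, as used with FastAPI's `Depends`, cannot be entered with `async with` directly.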

Migration Approach

This PR establishes infrastructure only. Both Prisma and SQLAlchemy will coexist during incremental migration:

✅ Infrastructure (this PR)
Next: Proof of concept with new features
Then: Systematic table migration
Finally: Remove Prisma runtime usage

Testing

poetry run pytest backend/backend/data/sqlalchemy_test.py -xvs

All tests passing with coverage of:

URL conversion and schema extraction
Engine and session factory creation
Dependency injection lifecycle
Error handling and rollback

Breaking Changes

None - purely additive. Prisma continues to work unchanged.

Checklist 📋

For code changes:

I have clearly listed my changes in the PR description
I have made a test plan
I have tested my changes according to the test plan:
- I have added tests for the new functionality

netlify · 2025-11-20T09:37:00Z

✅ Deploy Preview for auto-gpt-docs-dev canceled.

Name	Link
🔨 Latest commit	`39839a5`
🔍 Latest deploy log	https://app.netlify.com/projects/auto-gpt-docs-dev/deploys/6920c304a7314d00087de5df

coderabbitai · 2025-11-20T09:37:01Z

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


netlify · 2025-11-20T09:37:08Z

✅ Deploy Preview for auto-gpt-docs canceled.

Name	Link
🔨 Latest commit	`39839a5`
🔍 Latest deploy log	https://app.netlify.com/projects/auto-gpt-docs/deploys/6920c30410b6290008fdeabf

qodo-merge-pro · 2025-11-20T09:37:29Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Possible Misuse
Using QueuePool with create_async_engine may be unnecessary or problematic since async engines manage pooling internally via the underlying driver; verify that specifying QueuePool is supported and won’t lead to unexpected behavior with asyncpg.

    # Connection pool configuration
    poolclass=QueuePool,  # Standard connection pool
    pool_size=config.sqlalchemy_pool_size,  # Persistent connections
    max_overflow=config.sqlalchemy_max_overflow,  # Burst capacity
    pool_timeout=config.sqlalchemy_pool_timeout,  # Wait time for connection
    pool_pre_ping=True,  # Validate connections before use
    # Async configuration

Transaction Semantics
The FastAPI dependency commits after yielding regardless of whether any writes occurred; this can surprise callers that expect explicit commit control. Consider scoping transactions explicitly or documenting that each request auto-commits and ensure read-only routes don’t incur unnecessary commits.

    # Create session (borrows connection from pool)
    async with _session_factory() as session:
        try:
            yield session  # Inject into route handler or context manager

            # If we get here, route succeeded - commit any pending changes
            await session.commit()
        except Exception:
            # Error occurred - rollback transaction

URL Sanitization
Regex-based stripping of schema query params may miss edge cases (ordering, URL encoding, additional params). Consider parsing via urllib.parse to robustly remove only schema while preserving other parameters.

    async_url = prisma_url.replace("postgresql://", "postgresql+asyncpg://")

    # Remove schema parameter (we'll handle via MetaData)
    async_url = re.sub(r"\?schema=\w+", "", async_url)

    # Remove any remaining query parameters that might conflict
    async_url = re.sub(r"&schema=\w+", "", async_url)

    return async_url
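
One way to follow the `urllib.parse` suggestion above is sketched below. It is an illustration only: `to_async_url` and its `drop` parameter are hypothetical names, and whether asyncpg accepts each parameter that survives is a separate question this sketch does not answer.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def to_async_url(prisma_url: str, drop: frozenset = frozenset({"schema"})) -> str:
    """Switch to the asyncpg driver and drop only the listed query parameters."""
    parts = urlsplit(prisma_url)
    if parts.scheme != "postgresql":
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    # Keep every parameter except the ones explicitly dropped, in order.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    ]
    return urlunsplit(
        parts._replace(scheme="postgresql+asyncpg", query=urlencode(kept))
    )
```

Because the query string is parsed rather than pattern-matched, the position of `schema` among the parameters does not matter, and values that are not plain word characters are handled as well.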

deepsource-io · 2025-11-20T09:38:15Z

Here's the code health analysis summary for commits 0edc669..39839a5. View details on DeepSource ↗.

Analysis Summary

Analyzer	Status	Summary	Link
JavaScript	✅ Success		View Check ↗
Python	✅ Success	❗ 20 occurrences introduced 🎯 2 occurrences resolved	View Check ↗

💡 If you’re a repository administrator, you can configure the quality gates from the settings.

AutoGPT-Agent · 2025-11-20T10:11:15Z

Thank you for this well-structured PR that adds SQLAlchemy infrastructure to the backend. The code looks well-designed with comprehensive test coverage and clear documentation.

A few items to address before merging:

Missing checklist: Your PR is missing the required checklist. Even though this is primarily infrastructure code, we still need the checklist filled out. You can mark the testing sections as completed with your test plan since you've clearly tested the SQLAlchemy integration.
Configuration design: The SQLAlchemy configuration in settings.py looks good, but should we add some comments about reasonable values for these settings in different environments (dev/test/prod)?
Documentation: While you mentioned SQLAlchemy_INTEGRATION.md, I don't see it in the diff. Make sure this documentation is included to help other developers understand the migration plan.
Error handling: The error handling in the lifespan hooks looks good, but consider adding more specific error types in your exception handlers where possible for better debugging.

Overall, this is a well-structured foundation for the gradual migration from Prisma to SQLAlchemy. Once you address the checklist issue, this should be ready for merging.

qodo-merge-pro · 2025-11-20T10:59:08Z

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 Security concerns
Sensitive information exposure: Initialization error logs may reveal parts of the database connection string (host, port, db name) in both executor/database.py and server/rest_api.py. While credentials are not logged, internal endpoints can be sensitive; recommend redacting or removing URL details from logs. No other obvious injection or secret exposure found.

⚡ Recommended focus areas for review

Transaction Semantics
get_session() unconditionally commits on successful exit. This may lead to unintended commits for pure read-only operations or when callers expect to manage transactions explicitly. Consider making commit opt-in, supporting read-only sessions, or documenting clearly to avoid accidental writes.

    async def get_session() -> AsyncGenerator[AsyncSession, None]:
        """
        FastAPI dependency that provides database session.

        Usage in routes:
            @router.get("/users/{user_id}")
            async def get_user(
                user_id: int,
                session: AsyncSession = Depends(get_session)
            ):
                result = await session.execute(select(User).where(User.id == user_id))
                return result.scalar_one_or_none()

        Usage in DatabaseManager RPC methods:
            @expose
            async def get_user(user_id: int):
                async with get_session() as session:
                    result = await session.execute(select(User).where(User.id == user_id))
                    return result.scalar_one_or_none()

        Lifecycle:
            1. Request arrives
            2. FastAPI calls this function (or used as context manager)
            3. Session is created (borrows connection from pool)
            4. Session is injected into route handler
            5. Route executes (may commit/rollback)
            6. Route returns
            7. Session is closed (returns connection to pool)

        Error handling:
            - If exception occurs, session is rolled back
            - Connection is always returned to pool (even on error)
        """
        if _session_factory is None:
            raise RuntimeError(
                "SQLAlchemy not initialized. Call initialize() in lifespan context."
            )

        # Create session (borrows connection from pool)
        async with _session_factory() as session:
            try:
                yield session  # Inject into route handler or context manager

                # If we get here, route succeeded - commit any pending changes
                await session.commit()
            except Exception:
                # Error occurred - rollback transaction
                await session.rollback()
                raise
            finally:
                # Always close session (returns connection to pool)
                await session.close()

URL Sanitization
get_database_url() strips all query parameters indiscriminately; if future required parameters (e.g., sslmode) are added to DATABASE_URL, they will be lost. Consider preserving safe/whitelisted params or explicitly migrating supported ones via connect_args.

    def get_database_url() -> str:
        """
        Extract database URL from environment and convert to async format.

        Prisma URL: postgresql://user:pass@host:port/db?schema=platform&connect_timeout=60
        Async URL: postgresql+asyncpg://user:pass@host:port/db

        Returns the async-compatible URL without query parameters (handled via connect_args).
        """
        prisma_url = Config().database_url

        # Replace postgresql:// with postgresql+asyncpg://
        async_url = prisma_url.replace("postgresql://", "postgresql+asyncpg://")

        # Remove ALL query parameters (schema, connect_timeout, etc.)
        # We'll handle these through connect_args instead
        async_url = re.sub(r"\?.*$", "", async_url)

        return async_url

Error Logging Detail
Initialization logs include the tail of database_url (host:port/db) which could leak sensitive info in some deployments. Ensure no credentials or internal hostnames are exposed in logs; consider redaction or omitting the URL altogether.

    if config.enable_sqlalchemy:
        from sqlalchemy.exc import DatabaseError, OperationalError
        from sqlalchemy.exc import TimeoutError as SQLAlchemyTimeoutError

        from backend.data import sqlalchemy as sa

        try:
            engine = sa.create_engine()
            sa.initialize(engine)
            app.state.db_engine = engine
            logger.info(
                f"[{self.service_name}] ✓ SQLAlchemy initialized "
                f"(pool_size={config.sqlalchemy_pool_size}, "
                f"max_overflow={config.sqlalchemy_max_overflow})"
            )
        except OperationalError as e:
            logger.error(
                f"[{self.service_name}] Failed to connect to database during SQLAlchemy initialization. "
                f"Check database connection settings (host, port, credentials). "
                f"Database URL: {config.database_url.split('@')[-1] if '@' in config.database_url else 'N/A'}. "
                f"Error: {e}"
            )
            raise
        except SQLAlchemyTimeoutError as e:
            logger.error(
                f"[{self.service_name}] Database connection timeout during SQLAlchemy initialization. "
                f"Timeout setting: {config.sqlalchemy_connect_timeout}s. "
                f"Check if database is accessible and increase timeout if needed. "
                f"Error: {e}"
            )
            raise
        except DatabaseError as e:
            logger.error(
                f"[{self.service_name}] Database error during SQLAlchemy initialization. "
                f"Check database permissions and configuration. "
                f"Error: {e}"
            )
            raise
        except Exception as e:
            logger.error(
                f"[{self.service_name}] Unexpected error during SQLAlchemy initialization. "
                f"Configuration: pool_size={config.sqlalchemy_pool_size}, "
                f"max_overflow={config.sqlalchemy_max_overflow}, "
                f"pool_timeout={config.sqlalchemy_pool_timeout}s. "
                f"Error: {e}",
                exc_info=True,
            )
            raise
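
A possible shape for the redaction recommended above is sketched here. `describe_for_logs` is a hypothetical helper, and keeping only the scheme and database name is one choice among several; omitting the URL from logs entirely would be stricter.

```python
from urllib.parse import urlsplit


def describe_for_logs(database_url: str) -> str:
    """Return a log-safe description: scheme and database name only."""
    if not database_url:
        return "N/A"
    parts = urlsplit(database_url)
    db_name = parts.path.lstrip("/") or "N/A"
    # Credentials, host, port and query parameters are deliberately left out.
    return f"{parts.scheme}://***/{db_name}"


# Hypothetical URL with an internal hostname and a password.
safe = describe_for_logs(
    "postgresql://user:secret@db.internal:5432/platform?schema=platform"
)
```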

sentry · 2025-11-20T11:00:52Z

autogpt_platform/backend/backend/data/sqlalchemy.py

+    """
+    prisma_url = Config().database_url
+
+    # Replace postgresql:// with postgresql+asyncpg://
+    async_url = prisma_url.replace("postgresql://", "postgresql+asyncpg://")
+
+    # Remove ALL query parameters (schema, connect_timeout, etc.)
+    # We'll handle these through connect_args instead
+    async_url = re.sub(r"\?.*$", "", async_url)
+
+    return async_url


Bug: Empty DATABASE_URL with enable_sqlalchemy=true causes unhandled ArgumentError during startup.
Severity: HIGH | Confidence: High

🔍 Detailed Analysis

When enable_sqlalchemy=true is set and the DATABASE_URL environment variable is not provided, the Config().database_url defaults to an empty string. The get_database_url() function then attempts to create an engine with this empty URL, causing SQLAlchemy to raise an ArgumentError. This ArgumentError is not caught by the existing exception handlers in database.py or rest_api.py, leading to an unhandled exception and application crash during startup.

💡 Suggested Fix

Validate that database_url is not empty before calling create_async_engine(), or add ArgumentError to the list of caught exceptions, or set a sensible fallback for database_url.
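
The first option, validating before engine creation, could look roughly like the sketch below. `DatabaseConfigError` and `require_database_url` are hypothetical names introduced for illustration; the PR may handle this differently.

```python
class DatabaseConfigError(RuntimeError):
    """Hypothetical error type for invalid database configuration."""


def require_database_url(database_url: str) -> str:
    """Fail fast with a clear message instead of a late ArgumentError."""
    url = (database_url or "").strip()
    if not url:
        raise DatabaseConfigError(
            "DATABASE_URL is empty but enable_sqlalchemy is true; "
            "set DATABASE_URL or disable SQLAlchemy."
        )
    return url
```

Calling this at the top of the URL conversion step would surface the misconfiguration with an actionable message before SQLAlchemy is involved.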

🤖 Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: autogpt_platform/backend/backend/data/sqlalchemy.py#L31-L49 Potential issue: When `enable_sqlalchemy=true` is set and the `DATABASE_URL` environment variable is not provided, the `Config().database_url` defaults to an empty string. The `get_database_url()` function then attempts to create an engine with this empty URL, causing SQLAlchemy to raise an `ArgumentError`. This `ArgumentError` is not caught by the existing exception handlers in `database.py` or `rest_api.py`, leading to an unhandled exception and application crash during startup.

Reference_id: 2841823

Swiftyos requested a review from a team as a code owner November 20, 2025 09:36

Swiftyos requested review from Bentlybro and ntindle and removed request for a team November 20, 2025 09:36

github-project-automation bot added this to AutoGPT development kanban Nov 20, 2025

github-project-automation bot moved this to 🆕 Needs initial review in AutoGPT development kanban Nov 20, 2025

github-actions bot added the platform/backend AutoGPT Platform - Back end label Nov 20, 2025

github-actions bot added the size/xl label Nov 20, 2025

Swiftyos marked this pull request as draft November 20, 2025 09:37

qodo-merge-pro bot added the Review effort 3/5 label Nov 20, 2025

Swiftyos added 4 commits November 20, 2025 11:10

added sqlalchemy plumbing

5c8dac4

integrate into database manager and rest api.

0085e5e

add tests

900b0e7

added tests and toggle for sqlalchemy integration

8ae5cbe

Swiftyos force-pushed the swiftyos/sqlalchemy-plumbing branch from 682f01f to 8ae5cbe Compare November 20, 2025 10:10

Swiftyos added 3 commits November 20, 2025 11:21

update poetry lock

5388a32

Added docs and more specific error handling

0c0488e

update asyncpg to version ^0.30.0

b114354

Swiftyos requested a review from majdyz November 20, 2025 10:41

Swiftyos marked this pull request as ready for review November 20, 2025 10:58

qodo-merge-pro bot added the Possible security concern label Nov 20, 2025

sentry bot reviewed Nov 20, 2025

Merge branch 'dev' into swiftyos/sqlalchemy-plumbing

39839a5
