-
Notifications
You must be signed in to change notification settings - Fork 55
PostgreSQL Integration for Distributed Two-View Estimation #844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool! Added some comments (on top of reasonable co-pilot review).
CI seems to fail (more than usual).
@liuzongyue6 let me know when all comments are addressed and you need another review. |
before I re-review:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good improvements, some concerns still remain
- Remove moved test files from tests/ and scripts/ directories - Update PostgresClient with proper connection management - Update TwoViewEstimator with improved database integration - These files were moved/renamed to different locations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR integrates PostgreSQL database support and distributed computing via Dask into the two-view estimation pipeline while introducing SSH tunneling for secure remote worker connections. Key changes include new modules for database connectivity and SSH tunnel management, updates to the two-view estimator for database result storage, and adjustments in configuration and environment files to support PostgreSQL.
Reviewed Changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
scripts/two_view_estimator_dask_postgres.py | Implements Dask-based distributed two-view estimation with database integration and process management. |
scripts/local_scheduler_db_remote_worker_demo.py | Provides a demo for setting up a distributed cluster with SSH tunnels and PostgreSQL. |
gtsfm/two_view_estimator.py | Extends the two-view estimator to support PostgreSQL result storage and schema initialization. |
gtsfm/common/postgres_client.py | Introduces a PostgreSQL client for managing database connections and queries. |
gtsfm/common/dask_db_module_base.py | Creates a base class for Dask modules interacting with the PostgreSQL database. |
Other files (configs, environment, docs) | Update configuration and documentation to support new PostgreSQL and SSH tunneling features. |
Comments suppressed due to low confidence (1)
gtsfm/two_view_estimator.py:456
- The start_time value is passed directly to the result storage function without computing the elapsed computation time. It is recommended to calculate the duration (e.g. time.time() - start_time) before storing to ensure that the recorded 'computation_time' is meaningful.
self.store_computation_results(
PostgreSQL Integration for Distributed Two-View Estimation
Description
This PR introduces a Dask with PostgreSQL database integration for storing computation results and performance monitoring
Key Features
Files Added
gtsfm/common/dask_db_module_base.py
- Base class for databasegtsfm/common/postgres_client.py
- PostgreSQL client with connection managementgtsfm/configs/local_scheduler_postgres_remote_cluster.yaml
- Cluster and Database, related port configurationscripts/local_scheduler_db_remote_worker_demo.py
- Toy example: distributed cluster testscripts/two_view_estimator_dask_postgres.py
- Full two-view estimator test with databasegtsfm/utils/ssh_tunneling.py
- Manages SSH tunnels and Dask scheduler/worker startscripts/README_DASK_LOCAL_REMOTE.md
added for running local scheduler and remote workFiles Modified
gtsfm/two_view_estimator.py
- Added database integration and result storagegtsfm/two_view_estimator/cacher.py
- Added default i1 and i2 parametersenvironment_linux.yml
/environment_linux_cpuonly.yml
- Addedpsycopg2
dependencySetup and Usage
Configuration
Edit
gtsfm/configs/local_scheduler_postgres_remote_cluster.yaml
:Running Tests
Architecture
[Local Machine]
├── Dask Scheduler port 8788, dashboard 8787
├── PostgreSQL Database (port 5432)
└── SSH Tunnels to Remote Workers
[Remote Workers]
├── Dask Workers port 9000
├── SSH Tunnel Connection
└── Database Access via Tunnel
Known Issues