Context:
We often need to transfer massive datasets (GBs to TBs) across institutional boundaries, where firewalls and network constraints are common. Currently, our g3t tooling (push/clone) lacks a high-performance mechanism similar to Globus + rsync.
Brainstorming Ideas:
- Dynamic Transfer Modes:
  - Can we automatically detect large files (or directories with mixed sizes) and switch to an optimized transfer mode?
  - Should there be an option for users to override the default behavior?
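A rough sketch of what auto-detection plus a user override could look like. The 1 GiB / 10 GiB thresholds and the mode names ("standard", "parallel") are placeholders for discussion, not existing g3t behavior:

```python
# Minimal sketch: pick a transfer mode by scanning file sizes under a directory.
from pathlib import Path
from typing import Optional

LARGE_FILE_THRESHOLD = 1 * 1024**3  # hypothetical cutoff: 1 GiB

def choose_transfer_mode(root: str, override: Optional[str] = None) -> str:
    """Pick a transfer mode based on the files under `root`.

    `override` lets the user force a mode regardless of what is detected.
    """
    if override:
        return override
    sizes = [p.stat().st_size for p in Path(root).rglob("*") if p.is_file()]
    if not sizes:
        return "standard"
    # Switch to an optimized mode when any single file is large,
    # or when the directory as a whole is large.
    if max(sizes) >= LARGE_FILE_THRESHOLD or sum(sizes) >= 10 * LARGE_FILE_THRESHOLD:
        return "parallel"
    return "standard"

print(choose_transfer_mode("./data"))               # auto-detected
print(choose_transfer_mode("./data", "standard"))   # user override wins
```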
- Environment Detection:
  - What are our users running? What are the host and destination operating systems, and what tooling is available (Linux, macOS, or Windows with POSIX tools)?
  - Are these transfers from local laptops or cloud instances?
  - How might we tailor the transfer protocol based on the operating environment?
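One way to probe the local environment, assuming all we need initially is the OS family and which candidate transfer tools are on PATH (the tool list here is just an example of what a backend might require):

```python
# Sketch of probing the local environment before choosing a transfer backend.
import platform
import shutil

def detect_environment() -> dict:
    """Report the OS and which candidate transfer tools are available on PATH."""
    tools = ["rsync", "scp", "globus", "aws"]
    return {
        "os": platform.system(),                    # "Linux", "Darwin", or "Windows"
        "posix": platform.system() != "Windows",
        "available_tools": [t for t in tools if shutil.which(t)],
    }

print(detect_environment())
```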
- Handling Network Restrictions:
  - Institutional firewalls often block certain ports; how can we incorporate reverse tunneling, proxies, or even Globus-like endpoints into our solution?
  - What have been the most common network/firewall issues so far?
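A possible pre-flight check along these lines: probe whether the ports a given backend needs are reachable, and fall back to an HTTPS-based or tunneled path when they are blocked. The host name and ports below are illustrative only:

```python
# Sketch of a pre-flight reachability check for transfer ports.
import socket

def port_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: 22 (ssh/rsync) vs. 443 (HTTPS). Port 443 is usually open even behind
# strict institutional firewalls, so it is a natural fallback.
for port in (22, 443):
    print(port, port_reachable("data.example.org", port))
```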
- Cloud-to-Cloud Integration & Optimizations:
  - If both source and destination are cloud-based, explore leveraging native cloud storage features (such as multipart uploads) to enable parallel transfers and reduce overall transfer times.
  - What additional strategies (e.g., parallel transfers, checksum verification) could further optimize these operations?
  - What security implications might we need to address with direct cloud storage transfers?
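For the S3 case specifically, a sketch using boto3's managed multipart copy; the bucket names, keys, part size, and concurrency settings are placeholders:

```python
# Sketch of a cloud-to-cloud copy using S3 multipart transfers via boto3.
import boto3
from boto3.s3.transfer import TransferConfig

config = TransferConfig(
    multipart_threshold=64 * 1024**2,   # switch to multipart above 64 MiB
    multipart_chunksize=64 * 1024**2,
    max_concurrency=16,                 # parallel part transfers
    use_threads=True,
)

s3 = boto3.client("s3")
# copy() performs a managed, multipart, server-side copy -- the data does not
# round-trip through the machine running this script.
s3.copy(
    CopySource={"Bucket": "source-bucket", "Key": "dataset/large-file.bam"},
    Bucket="destination-bucket",
    Key="dataset/large-file.bam",
    Config=config,
)
```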
- User Experience & Feedback:
  - How can we provide clear, real-time progress feedback during transfers?
  - Should error handling and resumption/retry strategies be baked into the transfer mechanism?
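A sketch of what progress reporting plus retry could look like, where `do_transfer` stands in for whichever backend is chosen (a hypothetical callable, not an existing g3t function), and tqdm provides the progress bar:

```python
# Sketch: wrap a transfer callable with a live progress bar and retry/backoff.
import time
from tqdm import tqdm

def transfer_with_feedback(do_transfer, total_bytes: int, retries: int = 3):
    """Run `do_transfer(progress_cb)` with a progress bar and simple retries.

    `do_transfer` is expected to call `progress_cb(bytes_sent)` as it goes.
    """
    for attempt in range(1, retries + 1):
        with tqdm(total=total_bytes, unit="B", unit_scale=True) as bar:
            try:
                do_transfer(bar.update)   # backend reports bytes via callback
                return
            except IOError as exc:
                tqdm.write(f"attempt {attempt} failed: {exc}")
                time.sleep(2 ** attempt)  # exponential backoff before retrying
    raise RuntimeError("transfer failed after retries")
```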