Add import-google-takeout-folder CLI Command to Ente cli #8740
+1,622
−3
Summary
This PR adds a new CLI command to import Google Takeout exports directly to Ente Photos. It preserves metadata (timestamps, location, etc.) from Takeout JSON sidecars and includes built-in duplicate detection.
Motivation
Many users manage their media in headless environments like a NAS or a home server. Currently, importing large Google Takeout exports requires a GUI (Desktop/Web client). This command provides a command-line alternative for these environments, allowing unattended imports.
Key Implementation Details
Duplicate Detection
Duplicate detection is implemented and enabled by default. It follows the logic used in the web/desktop clients:
Content Hash: Computes a BLAKE2b hash of the file content.
Metadata Comparison: Before uploading, the command fetches existing file metadata from the target album and compares hash + title + fileType.
Efficiency: This ensures that interrupted imports can be resumed without re-uploading existing files, and multiple imports from the same source don't create clusters of duplicates.
Chunked Encryption for Large Files
To ensure compatibility with the web client's decryption process for large files, we've implemented EncryptChaCha20poly1305Chunked(). This function encrypts data in 4 MB chunks (matching streamEncryptionChunkSize), which resolves "Failed to fetch" errors during playback of large videos.
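To illustrate the chunking idea only, here is a minimal sketch of splitting a payload into fixed-size chunks and handing each one to an encryptor, with the last chunk flagged as final. It does not reproduce EncryptChaCha20poly1305Chunked(): the names chunkEncryptor and encryptChunked are hypothetical, and the encryptor in main is a placeholder that copies bytes rather than performing any real encryption.

```go
package main

import (
	"errors"
	"fmt"
)

// chunkSize mirrors the 4 MB figure from the description.
const chunkSize = 4 * 1024 * 1024

// chunkEncryptor is a hypothetical callback that encrypts one plaintext
// chunk; final is true only for the last chunk of the stream.
type chunkEncryptor func(chunk []byte, final bool) ([]byte, error)

// encryptChunked splits data into pieces of at most size bytes and passes
// each to enc in order. Empty input yields a single empty final chunk.
func encryptChunked(data []byte, size int, enc chunkEncryptor) ([][]byte, error) {
	if size <= 0 {
		return nil, errors.New("chunk size must be positive")
	}
	var out [][]byte
	for start := 0; ; start += size {
		end := start + size
		final := end >= len(data)
		if final {
			end = len(data)
		}
		ct, err := enc(data[start:end], final)
		if err != nil {
			return nil, fmt.Errorf("chunk at offset %d: %w", start, err)
		}
		out = append(out, ct)
		if final {
			return out, nil
		}
	}
}

func main() {
	// Placeholder "encryptor": copies the chunk unchanged. A real one
	// would apply an authenticated stream cipher to each chunk.
	copyOnly := func(chunk []byte, final bool) ([]byte, error) {
		return append([]byte(nil), chunk...), nil
	}

	// A tiny chunk size keeps the demo readable: 10 bytes in chunks of 4
	// gives three chunks of 4, 4, and 2 bytes.
	chunks, err := encryptChunked(make([]byte, 10), 4, copyOnly)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(chunks), chunkSize)
}
```

Flagging the final chunk matters for streaming decryption, since the reader needs a way to distinguish a complete stream from a truncated one.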
Testing
Tested on macOS with:
Large-scale collection: Imported a collection of ~6,400 files.
Deduplication reliability: Interrupted and restarted the import multiple times at random intervals; verified that the duplicate detection correctly identified and skipped already-uploaded files, resuming the process seamlessly.
Resilience: Encountered and successfully handled transient 500 errors from the storage provider (Wasabi) via built-in retries, with all files eventually being uploaded correctly.
Media types: Verified playback of 100MB+ videos and visibility of diverse JPG/PNG/HEIC files.
Usage
Basic usage
ente import-google-takeout-folder /path/to/takeout "Album Name"
Enable debug output
ente import-google-takeout-folder /path/to/takeout "Album Name" --debug
Questions for Maintainers
FileType Requirement
While implementing this, I noticed the Zod schema for file metadata seems to require a fileType field (0 for image, 1 for video); otherwise, files can remain invisible in the web client. I have implemented this as a requirement in the CLI metadata. Could you clarify whether this is the intended behavior, or whether there's a more flexible way to handle this?