Warning
This tool is in active development; use at your own risk.
Problem: Google Takeout and iCloud archives of photos and videos:
- do not share a common standard for directory and photo naming
- do not represent albums in a standardized way
- do not allow for duplicates to be merged
Solution: A CLI tool that syncs photos, videos and albums from Google Takeout and iCloud archives into a standard directory structure that removes duplicates, standardizes albums and makes long-term archiving easy.
In detail:
- EXIF and supplemental metadata are extracted from photos and videos and used to determine the date and time of each file
- Files are put into directories with the following format:
yyyy/mm/dd/hhmm-ss-{short checksum}.ext
- For each photo or video file:
  - A matching Markdown file is written at the same path with the extension .md
  - This contains YAML frontmatter (the part between the --- lines) with metadata
  - The Markdown part of this file can be edited with notes, and it will not be clobbered on later runs
- Determine date based on EXIF tags or file modification time
- Rename files that have the wrong extension, based on inspecting the bytes of the file
- For each album (Google uses JSON, iCloud uses CSV), a Markdown file is produced
- Input can be a Google Takeout or iCloud archive, as either a zip file or a directory
- Sync photos/videos into existing directories without clobbering if the same file exists already
- Additive only: nothing will be deleted or overwritten
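The extension fix-up mentioned in the list above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the tool's actual Rust implementation. The function names `sniff_extension` and `corrected_name` are hypothetical, and the table covers only a few well-known file signatures.

```python
from pathlib import PurePosixPath
from typing import Optional

# Leading "magic" bytes for a few common formats (illustrative, not exhaustive).
MAGIC_PREFIXES = [
    (b"\xff\xd8\xff", "jpg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
]


def sniff_extension(data: bytes) -> Optional[str]:
    """Return the extension implied by the leading bytes, or None if unknown."""
    for prefix, ext in MAGIC_PREFIXES:
        if data.startswith(prefix):
            return ext
    return None


def corrected_name(name: str, data: bytes) -> str:
    """Return `name` with its extension replaced when the bytes disagree with it."""
    detected = sniff_extension(data)
    if detected is None:
        return name  # unknown format: leave the name alone
    path = PurePosixPath(name)
    current = path.suffix.lower().lstrip(".")
    # Assumption: "jpeg" and "jpg" are treated as the same extension.
    if current == detected or (current == "jpeg" and detected == "jpg"):
        return name
    return str(path.with_suffix("." + detected))


# A JPEG that was saved with a .png extension gets a corrected name.
print(corrected_name("IMG_0001.png", b"\xff\xd8\xff\xe0"))  # IMG_0001.jpg
```

Only the first few bytes are needed for these formats, so a real implementation would not have to read the whole file to make this decision.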
You will need to install Rust and Cargo; follow the instructions on the Rust installation page.
Then build the project from source.
cargo install --git https://github.com/paultuckey/photo-sorter.git photo-sorter
photo-sorter --help
photo-sorter info --debug --root "test" --input "Canon_40D.jpg"
photo-sorter \
sync --debug --dry-run \
--input "input/takeout-20250614T030613Z-1-001.zip" \
--output "output/archive"
photo-sorter sync --debug --input "input/Takeout-small" --output "output/archive-small"
Why use date-based file and directory names? Why include the checksum in the file name?
Time is the most important factor in archiving; date-based directories let you take different actions on different years.
File naming needs a robust, fail-safe scheme that will stay durable over the very long term. Multiple photos can be taken during the same second, so the checksum is used to differentiate them (the main EXIF date tags only have one-second resolution).
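As a rough illustration of the naming scheme, the sketch below builds a path in the yyyy/mm/dd/hhmm-ss-{short checksum}.ext format. It is Python written for this explanation, not the tool's Rust code. The function name `archive_path` and the two checksum values are made up for the example.

```python
from datetime import datetime


def archive_path(taken: datetime, short_checksum: str, ext: str) -> str:
    """Build a relative path of the form yyyy/mm/dd/hhmm-ss-{short checksum}.ext."""
    return f"{taken:%Y/%m/%d/%H%M-%S}-{short_checksum}.{ext}"


# Two photos taken within the same second but with different contents.
# The checksum values here are hypothetical placeholders.
taken = datetime(2025, 6, 14, 3, 6, 13)
first = archive_path(taken, "a1b2c3d", "jpg")
second = archive_path(taken, "9f8e7d6", "jpg")

print(first)   # 2025/06/14/0306-13-a1b2c3d.jpg
print(second)  # 2025/06/14/0306-13-9f8e7d6.jpg
```

Both files land in the same day directory and sort by time, while the checksum suffix keeps their names distinct.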
Why use Markdown files?
Markdown is widely supported and human-readable without any special software. Just as with Obsidian, you can edit the Markdown files with any text editor, or back up the directories to any storage solution.
What format is the short checksum?
It's the first 7 characters of a SHA-256 hash over the bytes of the file. As with a Git short hash, it's a good trade-off between uniqueness and length.
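A minimal sketch of that calculation, in Python rather than the tool's own Rust, might look like the following. It assumes the hash is rendered as lowercase hexadecimal, which the comparison with Git short hashes suggests but the text does not state. The function name `short_checksum` is hypothetical.

```python
import hashlib


def short_checksum(data: bytes, length: int = 7) -> str:
    """Return the first `length` hex characters of the SHA-256 digest of `data`.

    Assumption: the digest is encoded as lowercase hexadecimal.
    """
    return hashlib.sha256(data).hexdigest()[:length]


print(short_checksum(b"abc"))  # ba7816b
```

Seven hexadecimal characters carry 28 bits, which is small in absolute terms but only has to separate files that share the same timestamp down to the second.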
Google is a trademark of Google LLC. iCloud is a trademark of Apple Inc.