JFK Documents Bulk Downloader

A command-line tool for downloading ZIP files from the National Archives JFK Bulk Download page or any other URL containing ZIP file links.

Note: This tool has been tested on macOS. It should work on Windows and Linux as described, but those platforms have not been extensively tested. Feedback and bug reports are welcome!

Features

Downloads all ZIP files or a specified subset
Handles connection errors with retries
Supports resuming interrupted downloads
Shows progress bar for downloads
Verifies downloaded files
Checks for existing files and prompts before overwriting
Configurable input URL and output directory
Parallel downloading capability
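
To illustrate the "resuming interrupted downloads" feature, here is a minimal sketch of one common approach: asking the server for only the missing bytes with an HTTP Range header. The function name `resume_headers` and the `.part` file suffix are hypothetical, and bulk_download.py may implement resuming differently.

```python
import os
import tempfile


def resume_headers(partial_path):
    """Return HTTP headers requesting only the bytes not yet on disk.

    Hypothetical helper for illustration; not taken from bulk_download.py.
    """
    if os.path.exists(partial_path):
        offset = os.path.getsize(partial_path)
        if offset > 0:
            # Standard HTTP Range request: everything from `offset` onward.
            return {"Range": f"bytes={offset}-"}
    # Nothing usable on disk yet, so request the whole file.
    return {}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        part = os.path.join(tmp, "example.zip.part")
        print(resume_headers(part))  # no partial file yet
        with open(part, "wb") as fh:
            fh.write(b"x" * 1024)  # simulate 1 KiB already downloaded
        print(resume_headers(part))
```

A server that honors the header replies with status 206 and the remaining bytes, which can then be appended to the partial file. A server that ignores it replies with 200 and the full file, so a careful client checks the status before appending.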

Installation

Option 1: Using venv (Standard Python)

# Clone the repository
git clone https://github.com/yourusername/jfk-dl.git
cd jfk-dl

# Create and activate virtual environment
# On Windows:
python -m venv venv
venv\Scripts\activate

# On macOS/Linux:
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

Option 2: Using uv (Faster alternative)

# Clone the repository
git clone https://github.com/yourusername/jfk-dl.git
cd jfk-dl

# Install dependencies with uv
# If you don't have uv installed:
# pip install uv

# On all platforms:
uv venv  # Creates a .venv directory by default
uv pip install -r requirements.txt

# Activate the virtual environment
# On Windows: .venv\Scripts\activate
# On macOS/Linux: source .venv/bin/activate

Quick Start

Once installed, you can download all JFK document ZIP files with:

# Activate the virtual environment if needed
# For standard venv:
# On Windows: venv\Scripts\activate
# On macOS/Linux: source venv/bin/activate
#
# For uv:
# On Windows: .venv\Scripts\activate
# On macOS/Linux: source .venv/bin/activate

# Run the downloader with default settings
# (with no arguments at all, the script only displays help)
python bulk_download.py --cowboyup

Usage

python bulk_download.py [OPTIONS]

Options

--url URL: URL containing ZIP files to download (default: https://www.archives.gov/research/jfk/jfkbulkdownload)
--output-dir DIR: Directory to save downloaded files (default: auto-generated based on URL)
--max-files N: Maximum number of files to download (default: 0, download all)
--retry ATTEMPTS: Maximum number of retry attempts (default: 3)
--workers N: Number of parallel downloads (default: 4)
--force: Force download without prompting, even if files exist
--skip-existing: Skip files that already exist without prompting (default: True)
--no-skip-existing: Prompt for each existing file
--smart-check: Skip existing files with a matching size (default: True)
--no-smart-check: Disable smart file size checking
--filter PATTERN: Filter files by filename pattern (e.g., 'doc-*')
--extension EXT: File extension to look for (e.g., 'zip', 'pdf', 'docx') without the dot (default: zip)
--cowboyup: Run with defaults without showing help message (only needed when running with no other arguments)
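
For readers who want to see how the paired flags and defaults listed above fit together, the following is a rough sketch of an equivalent `argparse` setup. It is reconstructed from the option list only; the names `build_parser` and `DEFAULT_URL` are assumptions, and the real parser in bulk_download.py may be organized differently.

```python
import argparse

DEFAULT_URL = "https://www.archives.gov/research/jfk/jfkbulkdownload"


def build_parser():
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="bulk_download.py")
    parser.add_argument("--url", default=DEFAULT_URL)
    # None stands in for "auto-generated based on URL".
    parser.add_argument("--output-dir", default=None)
    # 0 means "download all".
    parser.add_argument("--max-files", type=int, default=0)
    parser.add_argument("--retry", type=int, default=3)
    parser.add_argument("--workers", type=int, default=4)
    parser.add_argument("--force", action="store_true")
    # Paired on/off flags share one destination, defaulting to True.
    parser.add_argument("--skip-existing", dest="skip_existing",
                        action="store_true", default=True)
    parser.add_argument("--no-skip-existing", dest="skip_existing",
                        action="store_false")
    parser.add_argument("--smart-check", dest="smart_check",
                        action="store_true", default=True)
    parser.add_argument("--no-smart-check", dest="smart_check",
                        action="store_false")
    parser.add_argument("--filter", default=None)
    parser.add_argument("--extension", default="zip")
    parser.add_argument("--cowboyup", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--extension", "pdf", "--max-files", "5"])
    print(args)
```

Sharing one `dest` between `--skip-existing` and `--no-skip-existing` means the rest of the program only has to read a single boolean.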

Examples

1. Download all ZIP files from the JFK Archives (default)

## Download 2016 to 2023 bulk files
python bulk_download.py --cowboyup

2. Download files from the 2025 JFK Archives release

## Download the 2025 files; existing files are skipped, which helps because new ones keep being added
python bulk_download.py --url https://www.archives.gov/research/jfk/release-2025 --output-dir data/raw/archive_gov/2025 --extension pdf

## For testing, you can limit to just a few files
python bulk_download.py --url https://www.archives.gov/research/jfk/release-2025 --output-dir data/raw/archive_gov/2025 --extension pdf --max-files 5
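
The skipping behavior in the commands above depends on the smart check, which compares file sizes. A minimal sketch of that idea follows, assuming the remote size is known in advance (for example from a Content-Length response header). The function name `should_skip` is hypothetical, and the script's actual check may differ.

```python
import os
import tempfile


def should_skip(local_path, remote_size):
    """Return True when a local file exists and matches the remote size.

    Hypothetical helper for illustration; `remote_size` is None when the
    server did not report a size, in which case nothing is skipped.
    """
    if remote_size is None or not os.path.isfile(local_path):
        return False
    return os.path.getsize(local_path) == remote_size


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "release.pdf")
        with open(path, "wb") as fh:
            fh.write(b"x" * 2048)
        print(should_skip(path, 2048))  # sizes match
        print(should_skip(path, 4096))  # sizes differ, so download again
```

A size comparison is cheap but cannot detect a corrupted file of the same length; a checksum would be needed for that.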

Additional Options:

Use --max-files 5 to download only the first 5 files
Use --filter "record-*" to download files matching a pattern
Use --workers 8 to increase parallel downloads for faster performance
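
As a rough illustration of how --filter and --max-files could combine, the sketch below applies a shell-style pattern first and then truncates the result. It assumes glob-style matching (as the 'doc-*' and "record-*" examples suggest) and that the filter runs before the limit; `select_files` is a hypothetical name, and the script's own ordering may differ.

```python
from fnmatch import fnmatchcase


def select_files(names, pattern=None, max_files=0):
    """Filter filenames by a glob pattern, then keep at most `max_files`.

    Hypothetical helper for illustration; max_files=0 means no limit.
    """
    selected = [
        name for name in names
        if pattern is None or fnmatchcase(name, pattern)
    ]
    return selected[:max_files] if max_files > 0 else selected


if __name__ == "__main__":
    names = ["record-001.zip", "doc-002.zip", "record-003.zip"]
    print(select_files(names, pattern="record-*"))
    print(select_files(names, max_files=2))
```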

License

MIT
