The JFK Files: Fully OCR'd & Searchable

The JFK files are now in the public domain, offering a trove of historical documents for researchers, journalists, and enthusiasts alike. However, the collection as released is unindexed and lacks a text layer, which makes it difficult to search and to analyze effectively, especially for AI-powered research.
As a leader in OCR (Optical Character Recognition) technology, ABBYY wants to make this research easier. We are providing the JFK files as fully searchable, structured PDFs, freely available to the open-source community. By making these documents machine-readable, we aim to unlock deeper insights, accelerate historical research, and enable advanced AI-driven analysis.

What You Can Do with These Files

With this dataset, you can, for example:

🔍 Perform Full-Text Search – Instantly locate key events, names, and places across thousands of pages.
🏗 Build AI-Powered Research Tools – Leverage Retrieval-Augmented Generation (RAG) to create AI assistants that can answer JFK-related questions.
📊 Run NLP & Machine Learning Analysis – Detect patterns, extract key insights, and apply entity recognition to map relationships.
📜 Enhance Historical Investigations – Cross-reference details, analyze declassified records, and uncover new connections.
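
As a starting point for the full-text search use case, here is a minimal sketch of an inverted index over page text. It assumes you have already extracted the text layer of each PDF page into strings with a PDF library of your choice. The file names, page numbers, and sample sentences below are hypothetical placeholders, not actual records from the dataset.

```python
import re
from collections import defaultdict

# Hypothetical sample data. In practice, each string would be the text layer
# of one page, extracted from the searchable PDFs with a PDF library.
PAGES = {
    ("doc-001.pdf", 1): "Memorandum regarding Lee Harvey Oswald in Mexico City.",
    ("doc-001.pdf", 2): "The Dallas field office forwarded the report.",
    ("doc-002.pdf", 1): "Oswald visited the embassy in Mexico City in 1963.",
}

TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


def build_index(pages):
    """Map each token to the set of (file, page) keys that contain it."""
    index = defaultdict(set)
    for key, text in pages.items():
        for token in tokenize(text):
            index[token].add(key)
    return index


def search(index, query):
    """Return the sorted keys of pages that contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(hits)


INDEX = build_index(PAGES)
print(search(INDEX, "Oswald Mexico"))
# [('doc-001.pdf', 1), ('doc-002.pdf', 1)]
```

The same page-level keys could later serve as retrieval units for a RAG pipeline, although a real system would likely replace exact token matching with ranked or embedding-based retrieval.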

About the Data

These records are sourced from the U.S. National Archives and are part of the public domain:
🔗 JFK Records Collection (National Archives)

⚠ Disclaimer: While these records are public domain, any copyrighted material within them remains the property of the respective copyright owner. These documents are provided for private study, scholarship, or research purposes only and are shared as-is, without warranty of any kind.

Brought to you by ABBYY

The JFK Files were made machine-readable using the Document AI API. Get access here: https://hubs.li/Q039Y11p0

abbyy/JFK-OCR
