
Fully searchable JFK files powered by ABBYY Purpose-Built AI. Making history more accessible for AI research, full-text search, and Retrieval-Augmented Generation (RAG) applications. Open-source and ready for analysis.


abbyy/JFK-OCR


The JFK Files: Fully OCR'd & Searchable

The JFK files are now part of the public domain, offering a trove of historical documents for researchers, journalists, and enthusiasts alike. However, the vast collection remains unindexed, lacks a text layer, and is difficult to search, which makes it hard to analyze effectively, especially for AI-powered research.
As a leader in OCR (Optical Character Recognition) technology, ABBYY is helping to close that gap: we are providing the JFK files as fully searchable, structured PDFs, freely available to the open-source community. By making these documents machine-readable, we aim to unlock deeper insights, accelerate historical research, and enable advanced AI-driven analysis.

What You Can Do with These Files

With this dataset you can, for example:

  • 🔍 Perform Full-Text Search – Instantly locate key events, names, and places across thousands of pages.
  • 🏗 Build AI-Powered Research Tools – Leverage Retrieval-Augmented Generation (RAG) to create AI assistants that can answer JFK-related questions.
  • 📊 Run NLP & Machine Learning Analysis – Detect patterns, extract key insights, and apply entity recognition to map relationships.
  • 📜 Enhance Historical Investigations – Cross-reference details, analyze declassified records, and uncover new connections.
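
As a rough illustration of the full-text search use case, here is a minimal sketch in Python using only the standard library. It assumes the text layer has already been extracted from the PDFs with a PDF library of your choice and stored per page; the file names and sample sentences below are hypothetical placeholders, not content from the collection, and `build_index` and `search` are illustrative helper names rather than part of this repository.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9']+")

def build_index(pages):
    """Map each lowercased word to the set of (document, page) pairs containing it."""
    index = defaultdict(set)
    for (doc, page_no), text in pages.items():
        for word in WORD.findall(text.lower()):
            index[word].add((doc, page_no))
    return index

def search(index, query):
    """Return sorted (document, page) pairs that contain every word in the query."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    # Intersect the page sets so that only pages matching all words remain.
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

# Hypothetical sample text standing in for text extracted from the PDFs.
pages = {
    ("doc_a.pdf", 1): "Memorandum regarding travel to Mexico City.",
    ("doc_a.pdf", 2): "Follow-up cable from the Dallas field office.",
    ("doc_b.pdf", 1): "Report on Mexico City station activity.",
}

index = build_index(pages)
print(search(index, "mexico city"))  # [('doc_a.pdf', 1), ('doc_b.pdf', 1)]
```

An inverted index like this only matches exact words. For a collection of this size, a dedicated search engine or an embedding-based retriever would likely be a better fit for RAG-style applications, but the same per-page structure can feed either.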

About the Data

These records are sourced from the U.S. National Archives and are part of the public domain:
🔗 JFK Records Collection (National Archives)

Disclaimer: While these records are public domain, any copyrighted material within them remains the property of the respective copyright owner. These documents are provided for private study, scholarship, or research purposes only and are shared as-is, without warranty of any kind.

Brought to you by ABBYY

The JFK Files were made machine-readable using the Document AI API. Get access here: https://hubs.li/Q039Y11p0
