Profunc Next Generation

These are some scripts I threw together quickly so I can navigate the 4 GB of PDF data I collected over four years on the RCMP and CSIS, covering the 2010 Winter Olympic Games and the G8/G20 Summit in Toronto. This is very basic Retrieval-Augmented Generation (RAG) work, and I'm sure it can be adapted to provide insights into other document collections.
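For readers new to the idea, here is a rough sketch of the retrieval half of RAG: score text chunks against a question, keep the best matches, and paste them into a prompt for an LLM. It is not the code in this repo. The TF-IDF scoring, the function names (`retrieve`, `build_prompt`), and the sample chunks are all assumptions made up for illustration, and it uses only the Python standard library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(chunks):
    """Return (tf-idf vectors, idf table) for a list of text chunks."""
    counts_per_chunk = [Counter(tokenize(c)) for c in chunks]
    doc_freq = Counter()
    for counts in counts_per_chunk:
        doc_freq.update(counts.keys())
    n = len(chunks)
    # Smoothed inverse document frequency, so no weight is ever zero.
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in counts.items()}
               for counts in counts_per_chunk]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, k=2):
    """Return up to k chunks that share weighted terms with the query."""
    vectors, idf = build_index(chunks)
    q_counts = Counter(tokenize(query))
    q_vec = {t: c * idf[t] for t, c in q_counts.items() if t in idf}
    scored = sorted(((cosine(q_vec, v), i) for i, v in enumerate(vectors)),
                    reverse=True)
    return [chunks[i] for score, i in scored[:k] if score > 0.0]


def build_prompt(query, chunks, k=2):
    """Assemble a prompt that grounds the question in retrieved chunks."""
    context = "\n---\n".join(retrieve(query, chunks, k))
    return ("Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")


# Hypothetical placeholder chunks, not real document contents.
sample_chunks = [
    "Placeholder memo about venue security planning.",
    "Placeholder notes on summit budget in Toronto.",
    "Placeholder cafeteria menu.",
]
```

Calling `build_prompt("Toronto summit budget", sample_chunks, k=1)` would put only the second placeholder chunk into the context. A real setup would more likely use embeddings and a vector store instead of TF-IDF, but the shape of the pipeline is the same.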

The most computationally expensive task is cleaning the data.
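To give a sense of what "cleaning" can involve, here is a minimal sketch that tidies text already extracted from a PDF page. The specific steps (Unicode normalization, rejoining hyphenated words, dropping bare page-number lines, collapsing whitespace) are assumptions about typical PDF debris, not a description of the scripts in this repo, and `clean_page_text` is a hypothetical name.

```python
import re
import unicodedata


def clean_page_text(raw):
    """Tidy one page of extracted PDF text (illustrative steps only)."""
    # Normalize compatibility characters, e.g. the "fi" ligature.
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Drop lines that contain nothing but a page number ("3" or "Page 3").
    page_number = re.compile(r"\s*(page\s+)?\d+\s*", flags=re.IGNORECASE)
    lines = [ln for ln in text.split("\n") if not page_number.fullmatch(ln)]
    text = "\n".join(lines)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Collapse three or more newlines into one paragraph break.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()
```

Run over every page of a 4 GB collection, even cheap regex passes like these add up, and scanned pages that need OCR first cost far more.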

OK, WHAT IS THIS?

This is the future of data-driven journalism. Have you ever wanted an LLM tuned on a pile of ATIP (Access to Information and Privacy) data from the Canadian government, one you can ask questions and get the most unhinged-as-hell responses from?