I found it tedious and time-consuming to search for information within my local files, so I built this project around Retrieval-Augmented Generation (RAG) to simplify the process. Using LangChain, Chroma, and ChatGPT, it lets users query their local files, with relevant document snippets supplied to the model as context.
The following flowchart illustrates the project workflow. The text from input documents is extracted and chunked into smaller passages. Embedding vectors for these chunks are generated and stored in a vector database, in this case, Chroma.
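Conceptually, the indexing step looks like the sketch below. It assumes a classic LangChain setup (import paths differ across LangChain versions), and the chunk sizes and `chroma_db` directory are illustrative, not the project's exact settings:

```python
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma

# Split the extracted text into overlapping chunks so each passage
# fits comfortably in the model's context window.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(raw_text)  # raw_text: text extracted from the uploaded files

# Embed each chunk and persist the vectors in a local Chroma collection.
vectordb = Chroma.from_texts(chunks, OpenAIEmbeddings(), persist_directory="chroma_db")
```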
When a user submits a query, it is converted into a standalone query based on the chat history. Relevant text passages are retrieved from the vector database as context. The standalone query and the retrieved context are sent to the LLM, which generates a response based on the provided context.
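One way to wire up this query flow is LangChain's `ConversationalRetrievalChain`, which condenses the chat history and the new question into a standalone query, retrieves matching chunks, and hands both to the LLM. The sketch below continues the indexing example above; the model name and `k` value are placeholders:

```python
from langchain.chains import ConversationalRetrievalChain
from langchain.chat_models import ChatOpenAI

# Condense (question + history) into a standalone query, retrieve the
# top-k chunks from Chroma, and answer from that retrieved context.
chain = ConversationalRetrievalChain.from_llm(
    llm=ChatOpenAI(model="gpt-3.5-turbo", temperature=0),
    retriever=vectordb.as_retriever(search_kwargs={"k": 4}),
)

result = chain({
    "question": "What does chapter 2 cover?",
    "chat_history": [],  # list of (question, answer) tuples from prior turns
})
print(result["answer"])
```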
The app accepts uploads in multiple file types, including PDF, TXT, and DOCX. Uploaded files are chunked into smaller sections, vectorized, and stored for efficient retrieval at query time.
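Dispatching on the file extension is one straightforward way to handle the different formats. The loader classes below come from LangChain's document loaders (names and locations vary by version, and PDF/DOCX loading needs the `pypdf` and `docx2txt` packages):

```python
from langchain.document_loaders import PyPDFLoader, TextLoader, Docx2txtLoader

def load_document(path: str):
    """Pick a loader by file extension and return LangChain Document objects."""
    if path.lower().endswith(".pdf"):
        loader = PyPDFLoader(path)
    elif path.lower().endswith(".docx"):
        loader = Docx2txtLoader(path)
    else:
        # Fall back to plain text for .txt and similar files.
        loader = TextLoader(path, encoding="utf-8")
    return loader.load()
```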
The project uses Chroma as the vector database for storing the text embeddings. Chroma persists the embedding vectors for the text chunks and supports fast similarity search, so the passages most relevant to a query can be retrieved quickly.
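To sanity-check retrieval independently of the chat flow, you can run a raw similarity search against the store built in the indexing sketch above (the query string and `k` are just examples):

```python
# Embed the query and fetch the k closest chunks from Chroma.
docs = vectordb.similarity_search("What is the refund policy?", k=4)
for doc in docs:
    print(doc.page_content[:200], "\n---")
```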
It is recommended to set up a dedicated Python environment before running the project. Install the required dependencies with:

```bash
pip install -r requirements.txt
```
To start the app, run:

```bash
streamlit run app.py
```
An OpenAI API key is required to use ChatGPT. Save the key in a file named `.streamlit/secrets.toml` as shown below:

```toml
openai_api_key = "YOUR_API_KEY"
```
If the file is missing, the app will prompt you to enter the key each time it runs.
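Inside the app, the lookup might look like the following sketch (not necessarily the app's exact logic): read the key from `st.secrets` when the secrets file exists, and fall back to a manual prompt otherwise.

```python
import streamlit as st

try:
    # Prefer the key stored in .streamlit/secrets.toml.
    openai_api_key = st.secrets["openai_api_key"]
except (KeyError, FileNotFoundError):
    # No secrets file (or no such key): ask the user on each run.
    openai_api_key = st.text_input("OpenAI API key", type="password")
```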