wandbot

A question-answering bot for the Weights & Biases documentation, built with llama-index and OpenAI GPT-4.

Features

Uses retrieval-augmented generation with a FAISS backend to retrieve relevant documents, handle user queries efficiently, and provide accurate, context-aware responses.
Periodic data ingestion with report generation for continuous improvement of the bot. Check out the latest data ingestion report here.
Integrated with Discord and Slack, so it fits into popular collaboration platforms.
Logging and analysis with Weights & Biases Tables for performance monitoring and continuous improvement. Check out the workspace for more details here.
Uses a fallback mechanism for model selection when GPT-4 is unable to generate a response.
Evaluation using a combination of metrics such as retrieval accuracy, string similarity, and model-generated response correctness.
Want to know more about the custom system prompt used by the bot? Check out the full prompt here.
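
The fallback behaviour listed above can be pictured as trying a preferred model first and switching to a backup when the call fails or returns nothing. The sketch below is illustrative only: `answer_with_fallback` and the two stand-in model functions are hypothetical names, not wandbot's actual API, and the real bot may select its fallback model differently.

```python
from typing import Callable, Sequence, Tuple


def answer_with_fallback(
    question: str,
    models: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, model) pair in order; return (name, answer) from the first success."""
    errors = []
    for name, model in models:
        try:
            answer = model(question)
        except Exception as exc:  # a real bot would catch narrower API errors
            errors.append(f"{name}: {exc}")
            continue
        if answer:  # treat an empty response as a failure too
            return name, answer
        errors.append(f"{name}: empty response")
    raise RuntimeError("all models failed: " + "; ".join(errors))


# Stand-in models for demonstration (hypothetical, no network calls).
def primary_model(question: str) -> str:
    raise TimeoutError("primary model unavailable")


def backup_model(question: str) -> str:
    return f"(backup) answer to: {question}"


if __name__ == "__main__":
    used, reply = answer_with_fallback(
        "How do I log a table?",
        [("primary", primary_model), ("backup", backup_model)],
    )
    print(used, "->", reply)
```

Passing the models as an ordered list keeps the policy in one place: adding a third model or changing the order does not touch the retry logic.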

Installation

The project requires Python >=3.10.0,<3.11 and uses Poetry for dependency management. To install the dependencies:

git clone git@github.com:wandb/wandbot.git
pip install poetry
cd wandbot
poetry install --all-extras
# Alternatively, install only the extras for the platform you want to run:
# poetry install --extras discord # for discord
# poetry install --extras slack # for slack
# poetry install --extras api # for api
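
Because the supported Python range is narrow, it can help to check the interpreter before installing. The following is a minimal sketch of such a check; `python_version_supported` is a hypothetical helper, not part of the repository, and Poetry enforces the same constraint itself during installation.

```python
import sys


def python_version_supported(version=None) -> bool:
    """Return True when the version tuple satisfies >=3.10.0,<3.11."""
    if version is None:
        version = sys.version_info[:3]
    return (3, 10, 0) <= tuple(version) < (3, 11)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_version_supported():
        print(f"Python {current} is within the supported range")
    else:
        print(f"Python {current} is outside >=3.10.0,<3.11")
```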

Usage

Data Ingestion

The data ingestion module pulls code and markdown from the Weights & Biases repositories docodile and examples and ingests them into vectorstores for the retrieval-augmented generation pipeline. To ingest the data, run the following command from the root of the repository:

poetry run python -m src.wandbot.ingestion

The data is ingested into the data/cache directory and stored in three directories, among them raw_data and vectorstore, with individual files for each step of the ingestion process. These datasets are also stored as wandb artifacts in the project named by the WANDB_PROJECT environment variable and can be accessed from the wandb dashboard.
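
To make the role of a vectorstore concrete, the sketch below shows the core lookup a vector index performs: embed the query, score it against the stored document vectors, and return the closest matches. It is a toy brute-force version in plain Python, not FAISS and not wandbot's ingestion code; `embed` is a hypothetical bag-of-words stand-in for a real embedding model.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Dict[str, float]:
    """Toy embedding: L2-normalised word counts (stand-in for a real embedding model)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {word: c / norm for word, c in counts.items()} if norm else {}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity of two normalised sparse vectors, i.e. their dot product."""
    return sum(weight * b.get(word, 0.0) for word, weight in a.items())


def top_k(query: str, docs: List[str], k: int = 2) -> List[Tuple[float, str]]:
    """Return the k documents most similar to the query, best first."""
    q = embed(query)
    scored = [(cosine(q, embed(doc)), doc) for doc in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]


if __name__ == "__main__":
    corpus = [
        "log metrics with wandb.log during training",
        "create an artifact to version a dataset",
        "launch a sweep to tune hyperparameters",
    ]
    for score, doc in top_k("how to version a dataset artifact", corpus):
        print(f"{score:.2f}  {doc}")
```

A library such as FAISS replaces the linear scan in `top_k` with an index built once at ingestion time, which is what makes retrieval fast on a large corpus.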

Running the Q&A Bot

You will need to set the following environment variables:

OPENAI_API_KEY
SLACK_APP_TOKEN
SLACK_BOT_TOKEN
SLACK_SIGNING_SECRET
WANDB_API_KEY
DISCORD_BOT_TOKEN
COHERE_API_KEY
WANDBOT_API_URL="http://localhost:8000"
WANDB_TRACING_ENABLED="true"
WANDB_PROJECT="wandbot-dev"
WANDB_ENTITY="wandbot"
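
A missing variable usually surfaces only when one of the applications starts, so a quick check beforehand can save time. The sketch below reports which of the variables listed above are unset or empty; `missing_vars` is a hypothetical helper, not part of wandbot, and it assumes all eleven variables are needed, which may be stricter than necessary if you run only one platform.

```python
import os
from typing import List, Mapping, Optional

# Names taken from the list above.
REQUIRED_VARS = [
    "OPENAI_API_KEY",
    "SLACK_APP_TOKEN",
    "SLACK_BOT_TOKEN",
    "SLACK_SIGNING_SECRET",
    "WANDB_API_KEY",
    "DISCORD_BOT_TOKEN",
    "COHERE_API_KEY",
    "WANDBOT_API_URL",
    "WANDB_TRACING_ENABLED",
    "WANDB_PROJECT",
    "WANDB_ENTITY",
]


def missing_vars(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    if env is None:
        env = os.environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set")
```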

Then run the Q&A bot applications with the following commands:

(poetry run uvicorn wandbot.api.app:app --host="0.0.0.0" --port=8000 > api.log 2>&1) & \
(poetry run python -m wandbot.apps.slack > slack_app.log 2>&1) & \
(poetry run python -m wandbot.apps.discord > discord_app.log 2>&1)

This starts the chatbot applications (the API, the Slack bot, and the Discord bot), so you can interact with the bot and ask questions about the Weights & Biases documentation.

Refer to the run.sh file in the root of the repository for more details on the commands for installing and running the bot.

Evaluation

To evaluate the performance of the Q&A bot, use the provided evaluation script (…). The script uses a separate evaluation dataset, which can be stored as a W&B Artifact.

The script downloads the evaluation dataset from the specified W&B Artifact, runs the Q&A bot on it, and logs the results (retrieval accuracy, average string distance, and chat model accuracy) back to W&B, where they can be viewed on the dashboard.
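
As a rough illustration of two of these metrics, the sketch below computes a retrieval accuracy and an average string distance on toy data. The definitions are assumptions made for the example: the function names are hypothetical, the distance is based on Python's `difflib` similarity ratio, and the actual evaluation script may define both metrics differently. Chat model accuracy needs a model to grade the answers and is left out.

```python
from difflib import SequenceMatcher
from typing import Sequence


def retrieval_accuracy(retrieved: Sequence[Sequence[str]], expected: Sequence[str]) -> float:
    """Fraction of questions whose expected source appears among the retrieved sources."""
    if not expected:
        return 0.0
    hits = sum(1 for docs, gold in zip(retrieved, expected) if gold in docs)
    return hits / len(expected)


def string_distance(a: str, b: str) -> float:
    """1 minus the difflib similarity ratio: 0.0 for identical strings."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def average_string_distance(answers: Sequence[str], references: Sequence[str]) -> float:
    """Mean string distance over paired answers and reference answers."""
    pairs = list(zip(answers, references))
    if not pairs:
        return 0.0
    return sum(string_distance(a, r) for a, r in pairs) / len(pairs)


if __name__ == "__main__":
    retrieved = [["guides/track.md", "guides/sweeps.md"], ["guides/artifacts.md"]]
    expected = ["guides/track.md", "guides/tables.md"]
    print("retrieval accuracy:", retrieval_accuracy(retrieved, expected))

    answers = ["Call wandb.log inside the training loop."]
    references = ["Call wandb.log in your training loop."]
    print("average string distance:", round(average_string_distance(answers, references), 3))
```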

To run the evaluation script, use the following commands:

cd wandbot
poetry run python -m eval

Implementation Overview

Document Embeddings with FAISS
Building the Q&A Pipeline with llama-index
Model Selection and Fallback Mechanism
Deploying the Q&A Bot on FastAPI, Discord and Slack
Logging and Analysis with Weights & Biases Tables
Evaluation of the Q&A Bot

You can track the bot usage in the following project: https://wandb.ai/wandbot/wandbot_public

