Red Hat Developers Hands-On Day 2023, Darmstadt

Event

Room: Helium, Darmstadtium Time: November 29th, 2023 15:30 - 18:00 CET

Instructors:

Rodeina Mohamed rmohamed@redhat.com
Steffen Röcker sroecker@redhat.com

Introduction

Welcome to our hands-on LLM application development workshop! Today you'll learn how to develop a simple chatbot (or "GPT") that can answer based on your own documents and how to deploy it with Podman and on OpenShift.

Software Stack

We will use a bunch of cool open source software to build our bot:

Ollama
Llama.cpp
LlamaIndex
Streamlit

For local experimentation you should have a Python 3.11 virtual env set up already. We will also hand out access to an OpenShift cluster with GPUs and Red Hat OpenShift Data Science (RHODS) where you can use a Jupyter notebook and deploy your streamlit app or connect to a hosted Ollama service.

Basic Concepts

By now you should be familiar with commercial offerings like ChatGPT, Bing, Bard or Claude. If not, don't worry ;)

We're not going to cover GPT architecture today, remember it's going to be hands-on. If you're interested in theory we've got you covered in the Further Readings section below.

Instead let's have a look at our Linuxbot example streamlit app and go through its functionality step by step:

Streamlit and how to easily build interactive apps
Connecting to Ollama
What is Ollama and llama.cpp?
Models - Strange creatures and where to find them
Quantization? How to be GPU poor and local Llama rich
Prompting: System prompt vs user prompt
Tokenization, see OpenAI tokenizer

The basic architecture of our RAG bot will look like this:

See LlamaIndex High-level Concepts for what's needed to query our documents:

Embeddings & Vector Store: see this nice interactive Solara demo of embeddings with retrieval
Context
Retrieval augmented generation (RAG)

Coincidentally few days ago LlamaIndex released RAGs, exactly what we are going to build today: Introducing RAGs: Your Personalized ChatGPT Experience Over Your Data

Since it was released shortly after this example app you should have a look what can improved here. One thing not implemented yet in RAGs are local models. 🦙

A very good introduction to RAG can be found in RAG 101 for Enterprise.

It's time to build

Now that we covered the very basics it's time for learning by doing! First we should modify our example bot and give it some custom data, a different prompt or try out different models. Some of them can have quite the personality.

Have a look at the included notebooks to see examples for text summary and natural SQL query.

Some ideas for experimentation & improvement

We've collected a few ideas and tried to cluster them according to their required skill level:

Beginners:

Create your own bot, some inspiration for GPTs:
Build a simple prompt injection game where the user must guess a secret the GPT tries to hide
Don't generate embeddings in the streamlit app (bad practice) and utilize a real vector database like Chroma, Weaviate, Qdrant, Milvus, Pinecone. Even SQLite or Postgres can be used as vector DB.

Intermediate:

Make complex texts and concepts, e.g legislature, accessible for everyone. See this example.
Replace Ollama with a LiteLLM proxy
Port the streamlit app to Solara

Expert:

Create a multimodal bot that can understand images (LLaVA or BakLLaVA) and speech (Whisper)
Create a loader for Kiwix ZIM files (e.g Wikipedia)
Add Ollama embedding REST API support to LlamaIndex

It's best if you find some other people interested in the same idea and change the table setup accordingly. The instructors will go from team to team. First to setup the infrastructure and then to help you implement your bot. Don't forget to ask your favorite (local) LlaMA.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Darmstadt_v1.md

Darmstadt_v1.md

Red Hat Developers Hands-On Day 2023, Darmstadt

Event

Introduction

Software Stack

Basic Concepts

It's time to build

Some ideas for experimentation & improvement

Further Readings

General Introductions

Tokenization

Embeddings & Vector Databases

RAG

Files

Darmstadt_v1.md

Latest commit

History

Darmstadt_v1.md

File metadata and controls

Red Hat Developers Hands-On Day 2023, Darmstadt

Event

Introduction

Software Stack

Basic Concepts

It's time to build

Some ideas for experimentation & improvement

Further Readings

General Introductions

Tokenization

Embeddings & Vector Databases

RAG