Commit 4669f46

Update genai_cookbook site content with new agent sample app (databricks#32)
* Update genai_cookbook site content with new aget sample app
  Signed-off-by: Prithvi Kannan <[email protected]>
* reference 0.2.0
  Signed-off-by: Prithvi Kannan <[email protected]>
* update genai_cookbook eval content
  Signed-off-by: Prithvi Kannan <[email protected]>
* typo
  Signed-off-by: Prithvi Kannan <[email protected]>
* fix
  Signed-off-by: Prithvi Kannan <[email protected]>
---------
Signed-off-by: Prithvi Kannan <[email protected]>
1 parent 2ed98a5 commit 4669f46

16 files changed (+48 -106 lines changed)

README.md

+2-2
Original file line number | Diff line number | Diff line change
@@ -2,14 +2,14 @@
22

33
Please visit http://ai-cookbook.io for the accompanying documentation for this repo.
44

5-
This repo provides [learning materials](https://ai-cookbook.io/) and [production-ready code](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code) to build a **high-quality RAG application** using Databricks. The [Mosaic Generative AI Cookbook](https://ai-cookbook.io/) provides:
5+
This repo provides [learning materials](https://ai-cookbook.io/) and [production-ready code](https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code) to build a **high-quality RAG application** using Databricks. The [Mosaic Generative AI Cookbook](https://ai-cookbook.io/) provides:
66
- A conceptual overview and deep dive into various Generative AI design patterns, such as Prompt Engineering, Agents, RAG, and Fine Tuning
77
- An overview of Evaluation-Driven development
88
- The theory of every parameter/knob that impacts quality
99
- How to root cause quality issues and detemermine which knobs are relevant to experiment with for your use case
1010
- Best practices for how to experiment with each knob
1111

12-
The [provided code](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code) is intended for use with the Databricks platform. Specifically:
12+
The [provided code](https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code) is intended for use with the Databricks platform. Specifically:
1313
- [Mosaic AI Agent Framework](https://docs.databricks.com/en/generative-ai/retrieval-augmented-generation.html) which provides a fast developer workflow with enterprise-ready LLMops & governance
1414
- [Mosaic AI Agent Evaluation](https://docs.databricks.com/en/generative-ai/agent-evaluation/index.html) which provides reliable, quality measurement using proprietary AI-assisted LLM judges to measure quality metrics that are powered by human feedback collected through an intuitive web-based chat UI
1515

genai_cookbook/10-min-demo/Mosaic-AI-Agents-10-Minute-Demo.ipynb

+4-1
@@ -677,7 +677,7 @@
677677
"\n",
678678
"## Browse the code samples\n",
679679
"\n",
680-
"Open the `./genai-cookbook/rag_app_sample_code` folder that was synced to your Workspace by this notebook. Documentation [here](https://ai-cookbook.io/nbs/6-implement-overview.html).\n",
680+
"Open the `./genai-cookbook/agent_app_sample_code` folder that was synced to your Workspace by this notebook. Documentation [here](https://ai-cookbook.io/nbs/6-implement-overview.html).\n",
681681
"\n",
682682
"## Read the [Generative AI Cookbook](https://ai-cookbook.io)!\n",
683683
"\n",
@@ -706,6 +706,9 @@
706706
},
707707
"notebookName": "Mosaic-AI-Agents-10-Minute-Demo",
708708
"widgets": {}
709+
},
710+
"language_info": {
711+
"name": "python"
709712
}
710713
},
711714
"nbformat": 4,

genai_cookbook/_config.yml

+1-1
@@ -12,7 +12,7 @@ execute:
1212

1313
# Information about where the book exists on the web
1414
repository:
15-
url: https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code
15+
url: https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code
1616
path_to_book: ../genai_cookbook # Optional path to your book, relative to the repository root
1717
branch: main # Which branch of the repository should be used when creating links (optional)
1818

genai_cookbook/nbs/5-hands-on-build-poc.md

+11-26
@@ -11,9 +11,9 @@
1111
1. Completed [start here](./6-implement-overview.md) steps
1212
2. Data from your [requirements](/nbs/5-hands-on-requirements.md#requirements-questions) is available in your [Lakehouse](https://www.databricks.com/blog/2020/01/30/what-is-a-data-lakehouse.html) inside a Unity Catalog [volume](https://docs.databricks.com/en/connect/unity-catalog/volumes.html) <!-- or [Delta Table](https://docs.databricks.com/en/delta/index.html)-->
1313

14-
```{admonition} [Code Repository](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code)
14+
```{admonition} [Code Repository](https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code)
1515
:class: tip
16-
You can find all of the sample code referenced throughout this section [here](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code).
16+
You can find all of the sample code referenced throughout this section [here](https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code).
1717
```
1818

1919
**Expected outcome**
@@ -62,20 +62,9 @@ By default, the POC uses the open source models available on [Mosaic AI Foundati
6262

6363

6464

65-
1. **Open the POC code folder within [`A_POC_app`](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code/A_POC_app) based on your type of data:**
65+
1. **Open the [`agent_app_sample_code`](https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code)**
6666

67-
<br/>
68-
69-
| File type | Source | POC application folder |
70-
|----------------------------|------------------------|------------------------|
71-
| PDF files | UC Volume | [`pdf_uc_volume`](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code/A_POC_app/pdf_uc_volume) |
72-
| Powerpoint files | UC Volume | [`pptx_uc_volume`](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code/A_POC_app/pptx_uc_volume) |
73-
| DOCX files | UC Volume | [`docx_uc_volume`](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code/A_POC_app/docx_uc_volume) |
74-
| JSON files w/ text/markdown/HTML content & metadata | UC Volume | [`json_uc_volume`](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code/A_POC_app/html_uc_volume) |
75-
<!--| HTML content | Delta Table | |
76-
| Markdown or regular text | Delta Table | | -->
77-
78-
If your data doesn't meet one of the above requirements, you can customize the parsing function (`parser_udf`) within `02_poc_data_pipeline` in the above POC directories to work with your file types.
67+
If your data doesn't meet one of the above requirements, you can customize the parsing function (`file_parser`) within `02_data_pipeline` in the above directory to work with your file types.
7968

8069
Inside the POC folder, you will see the following notebooks:
8170

@@ -84,27 +73,23 @@ By default, the POC uses the open source models available on [Mosaic AI Foundati
8473
```
8574

8675
```{tip}
87-
The notebooks referenced below are relative to the specific POC you've chosen. For example, if you see a reference to `00_config` and you've chosen `pdf_uc_volume`, you'll find the relevant `00_config` notebook at [`A_POC_app/pdf_uc_volume/00_config`](https://github.com/databricks/genai-cookbook/blob/main/rag_app_sample_code/A_POC_app/pdf_uc_volume/00_config.py).
76+
The notebooks referenced below are relative to the specific POC you've chosen. For example, if you see a reference to `00_config` and you've chosen `pdf_uc_volume`, you'll find the relevant `00_global_config` notebook at [`00_global_config`](https://github.com/databricks/genai-cookbook/blob/v0.2.0/agent_app_sample_code/00_global_config.py).
8877
```
8978

9079
<br/>
9180

9281
2. **Optionally, review the default parameters**
9382

94-
Open the `00_config` Notebook within the POC directory you chose above to view the POC's applications default parameters for the data pipeline and RAG chain.
83+
Open the `00_global_config` Notebook within the directory to view the POC's applications default parameters for the data pipeline and RAG chain.
9584

9685

9786
```{note}
9887
**Important:** our recommended default parameters are by no means perfect, nor are they intended to be. Rather, they are a place to start from - the next steps of our workflow guide you through iterating on these parameters.
9988
```
10089

101-
3. **Validate the configuration**
102-
103-
Run the `01_validate_config` to check that your configuration is valid and all resources are available. You will see an `rag_chain_config.yaml` file appear in your directory - we will use this in step 4 to deploy the application.
104-
105-
4. **Run the data pipeline**
90+
3. **Run the data pipeline**
10691

107-
The POC data pipeline is a Databricks Notebook based on Apache Spark. Open the `02_poc_data_pipeline` Notebook and press Run All to execute the pipeline. The pipeline will:
92+
The POC data pipeline is a Databricks Notebook based on Apache Spark. Open the `02_data_pipeline` Notebook and press Run All to execute the pipeline. The pipeline will:
10893

10994
1. Load the raw documents from the UC Volume
11095
2. Parse each document, saving the results to a Delta Table
@@ -142,7 +127,7 @@ The notebooks referenced below are relative to the specific POC you've chosen. F
142127
The POC Chain uses MLflow code-based logging. To understand more about code-based logging, visit the [docs](https://docs.databricks.com/generative-ai/create-log-agent.html#code-based-vs-serialization-based-logging).
143128
```
144129

145-
1. Open the `03_deploy_poc_to_review_app` Notebook
130+
1. Open the `03_agent_proof_of_concept` Notebook
146131

147132
2. Run each cell of the Notebook.
148133

@@ -155,7 +140,7 @@ The notebooks referenced below are relative to the specific POC you've chosen. F
155140
4. Modify the default instructions to be relevant to your use case. These are displayed in the Review App.
156141
157142
```python
158-
instructions_to_reviewer = f"""## Instructions for Testing the {RAG_APP_NAME}'s Initial Proof of Concept (PoC)
143+
instructions_to_reviewer = f"""## Instructions for Testing the {AGENT_NAME}'s Initial Proof of Concept (PoC)
159144
160145
Your inputs are invaluable for the development team. By providing detailed feedback and corrections, you help us fix issues and improve the overall quality of the application. We rely on your expertise to identify any gaps or areas needing enhancement.
161146
@@ -170,7 +155,7 @@ The notebooks referenced below are relative to the specific POC you've chosen. F
170155
- Carefully review each document that the system returns in response to your question.
171156
- Use the thumbs up/down feature to indicate whether the document was relevant to the question asked. A thumbs up signifies relevance, while a thumbs down indicates the document was not useful.
172157
173-
Thank you for your time and effort in testing {RAG_APP_NAME}. Your contributions are essential to delivering a high-quality product to our end users."""
158+
Thank you for your time and effort in testing {AGENT_NAME}. Your contributions are essential to delivering a high-quality product to our end users."""
174159
175160
print(instructions_to_reviewer)
176161
```
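
The diff above mentions customizing the parsing function (`file_parser`) in `02_data_pipeline` for file types the sample does not cover. As a rough illustration only, the sketch below shows one shape such a function might take for plain-text and Markdown files. The name `file_parser` comes from the diff, but the signature, the returned keys (`doc_content`, `parser_status`), and the sample path are assumptions made for this example, not the cookbook's actual implementation.

```python
from typing import TypedDict


class ParsedDoc(TypedDict):
    # Hypothetical output schema; the real notebook may use different keys.
    doc_content: str
    parser_status: str


def file_parser(raw_doc_contents_bytes: bytes, doc_path: str) -> ParsedDoc:
    """Parse one raw file into text (hypothetical signature, for illustration)."""
    try:
        if doc_path.lower().endswith((".txt", ".md")):
            # Plain text and Markdown only need decoding.
            text = raw_doc_contents_bytes.decode("utf-8")
            return {"doc_content": text.strip(), "parser_status": "SUCCESS"}
        # Report unsupported types instead of failing the whole pipeline run.
        return {
            "doc_content": "",
            "parser_status": f"ERROR: unsupported file type: {doc_path}",
        }
    except UnicodeDecodeError as exc:
        return {"doc_content": "", "parser_status": f"ERROR: {exc}"}


# Hypothetical UC Volume path, used only to exercise the function.
parsed = file_parser(b"# Title\nHello\n", "/Volumes/catalog/schema/vol/readme.md")
print(parsed["parser_status"])
```

Returning a status string rather than raising keeps one malformed file from stopping the rest of the batch, which is one reasonable design for a pipeline that parses many documents in a single run.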

genai_cookbook/nbs/5-hands-on-curate-eval-set.md

+3-3
@@ -10,9 +10,9 @@
1010

1111
*Time varies based on the quality of the responses provided by your stakeholders. If the responses are messy or contain lots of irrelevant queries, you will need to spend more time filtering and cleaning the data.*
1212

13-
```{admonition} [Code Repository](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code)
13+
```{admonition} [Code Repository](https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code)
1414
:class: tip
15-
You can find all of the sample code referenced throughout this section [here](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code).
15+
You can find all of the sample code referenced throughout this section [here](https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code).
1616
```
1717

1818
#### **Overview & expected outcome**
@@ -49,6 +49,6 @@ Databricks recommends that your Evaluation Set contain at least 30 questions to
4949

5050
2. Inspect the Evaluation Set to understand the data that is included. You need to validate that your Evaluation Set contains a representative and challenging set of questions. Adjust the Evaluation Set as required.
5151

52-
3. By default, your evaluation set is saved to the Delta Table configured in `EVALUATION_SET_FQN` in the [`00_global_config`](https://github.com/databricks/genai-cookbook/blob/main/rag_app_sample_code/00_global_config.py) Notebook.
52+
3. By default, your evaluation set is saved to the Delta Table configured in `EVALUATION_SET_FQN` in the [`00_global_config`](https://github.com/databricks/genai-cookbook/blob/v0.2.0/agent_app_sample_code/00_global_config.py) Notebook.
5353

5454
> **Next step:** Now that you have an evaluation set, use it to [evaluate the POC app's](./5-hands-on-evaluate-poc.md) quality/cost/latency.
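
Two checks implied by this file can be sketched in a few lines: that the table name in `EVALUATION_SET_FQN` is a three-level Unity Catalog name, and that the evaluation set meets the recommended minimum of 30 questions. The helper names, the sample table name, and the sample questions below are hypothetical. The name check is deliberately simplified and does not handle backtick-quoted identifiers.

```python
from __future__ import annotations

MIN_RECOMMENDED_QUESTIONS = 30  # recommendation quoted in this file


def is_three_level_name(fqn: str) -> bool:
    """Return True if `fqn` looks like catalog.schema.table (simplified check)."""
    parts = fqn.split(".")
    return len(parts) == 3 and all(part.strip() for part in parts)


def eval_set_warnings(fqn: str, questions: list[str]) -> list[str]:
    """Collect human-readable warnings about an evaluation set (hypothetical helper)."""
    warnings = []
    if not is_three_level_name(fqn):
        warnings.append(f"'{fqn}' is not a catalog.schema.table name")
    # Count distinct, non-empty questions, ignoring case and surrounding whitespace.
    unique = {q.strip().lower() for q in questions if q.strip()}
    if len(unique) < MIN_RECOMMENDED_QUESTIONS:
        warnings.append(
            f"only {len(unique)} unique questions; "
            f"at least {MIN_RECOMMENDED_QUESTIONS} recommended"
        )
    return warnings


EVALUATION_SET_FQN = "main.rag_demo.eval_set"  # hypothetical value
sample_questions = [f"What does feature {i} do?" for i in range(12)]
for warning in eval_set_warnings(EVALUATION_SET_FQN, sample_questions):
    print("WARNING:", warning)
```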

genai_cookbook/nbs/5-hands-on-evaluate-poc.md

+2-2
@@ -10,9 +10,9 @@
1010

1111
*Time varies based on the number of questions in your evaluation set. For 100 questions, evaluation will take approximately 5 minutes.*
1212

13-
```{admonition} [Code Repository](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code)
13+
```{admonition} [Code Repository](https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code)
1414
:class: tip
15-
You can find all of the sample code referenced throughout this section [here](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code).
15+
You can find all of the sample code referenced throughout this section [here](https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code).
1616
```
1717

1818
### **Overview & expected outcome**

genai_cookbook/nbs/5-hands-on-improve-quality-step-1-generation.md

+1-1
@@ -10,7 +10,7 @@ The following is a step-by-step process to address **generation quality** issues
1010

1111

1212

13-
1. Open the [`B_quality_iteration/01_root_cause_quality_issues`](https://github.com/databricks/genai-cookbook/blob/main/rag_app_sample_code/B_quality_iteration/01_root_cause_quality_issues.py) Notebook
13+
1. Open the [`05_evaluate_poc_quality`](https://github.com/databricks/genai-cookbook/blob/v0.2.0/agent_app_sample_code/05_evaluate_poc_quality.py) Notebook
1414

1515
2. Use the queries to load MLflow traces of the records that retrieval quality issues.
1616

genai_cookbook/nbs/5-hands-on-improve-quality-step-1-retrieval.md

+1-1
@@ -11,7 +11,7 @@ Retrieval quality is arguably the most important component of a RAG application.
1111

1212
Here's a step-by-step process to address **retrieval quality** issues:
1313

14-
1. Open the [`B_quality_iteration/01_root_cause_quality_issues`](https://github.com/databricks/genai-cookbook/blob/main/rag_app_sample_code/B_quality_iteration/01_root_cause_quality_issues.py) Notebook
14+
1. Open the [`05_evaluate_poc_quality`](https://github.com/databricks/genai-cookbook/blob/v0.2.0/agent_app_sample_code/05_evaluate_poc_quality.py) Notebook
1515

1616
2. Use the queries to load MLflow traces of the records that retrieval quality issues.
1717

genai_cookbook/nbs/5-hands-on-improve-quality-step-1.md

+3-3
@@ -13,9 +13,9 @@
1313
- If you followed the previous step, this will be the case!
1414
2. All requirements from previous steps
1515

16-
```{admonition} [Code Repository](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code)
16+
```{admonition} [Code Repository](https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code)
1717
:class: tip
18-
You can find all of the sample code referenced throughout this section [here](https://github.com/databricks/genai-cookbook/tree/main/rag_app_sample_code).
18+
You can find all of the sample code referenced throughout this section [here](https://github.com/databricks/genai-cookbook/tree/v0.2.0/agent_app_sample_code).
1919
```
2020

2121
#### **Overview**
@@ -31,7 +31,7 @@ Each row your evaluation set will be tagged as follows:
3131

3232
The approach depends on if your evaluation set contains the ground-truth responses to your questions - stored in `expected_response`. If you have `expected_response` available, use the first table below. Otherwise, use the second table.
3333

34-
1. Open the [`B_quality_iteration/01_root_cause_quality_issues`](https://github.com/databricks/genai-cookbook/blob/main/rag_app_sample_code/B_quality_iteration/01_root_cause_quality_issues.py) Notebook
34+
1. Open the [`05_evaluate_poc_quality`](https://github.com/databricks/genai-cookbook/blob/v0.2.0/agent_app_sample_code/05_evaluate_poc_quality.py) Notebook
3535
2. Run the cells that are relevant to your use case e.g., if you do or don't have `expected_response`
3636
3. Review the output tables to determine the most frequent root cause in your application
3737
4. For each root cause, follow the steps below to further debug and identify potential fixes:
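
Step 3 in the list above (finding the most frequent root cause) amounts to a tally over the tagged evaluation rows. A minimal sketch, assuming each row carries a `root_cause` label: the field name, the label values, and the sample rows are invented for illustration and are not the output format of the `05_evaluate_poc_quality` notebook.

```python
from collections import Counter

# Hypothetical tagged evaluation rows; real labels come from the notebook's output tables.
rows = [
    {"request_id": "1", "root_cause": "poor_retrieval"},
    {"request_id": "2", "root_cause": "poor_generation"},
    {"request_id": "3", "root_cause": "poor_retrieval"},
    {"request_id": "4", "root_cause": "no_issue"},
    {"request_id": "5", "root_cause": "poor_retrieval"},
]

# Tally each label and list them from most to least frequent.
counts = Counter(row["root_cause"] for row in rows)
for cause, n in counts.most_common():
    print(f"{cause}: {n} of {len(rows)} rows")

# The top entry is the root cause to debug first.
top_cause, top_count = counts.most_common(1)[0]
```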
