Best combination tested so far: multi-qa-mpnet-base-cos-v1 (embeddings) + gpt-3.5-turbo (LLM)
Accuracy: 73%
Answers missing data: 9
Answers missing context: 14
Incorrect answers: 4
Average latency: 4.19 s
Out of the wrong answers:
24 were general questions
3 were growth questions (the lowest)
0 were ranking questions
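For context, here's a minimal sketch of how this best-performing combination can be wired together with sentence-transformers and the OpenAI client. The corpus, prompts, and `top_k` below are placeholders for illustration, not our actual evaluation setup:

```python
# Minimal RAG sketch: multi-qa-mpnet-base-cos-v1 for retrieval,
# gpt-3.5-turbo for answer generation.
from sentence_transformers import SentenceTransformer, util
from openai import OpenAI

embedder = SentenceTransformer("multi-qa-mpnet-base-cos-v1")
client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder documents, not the evaluation corpus.
corpus = ["Exports of product X grew 12% in 2022.", "Product Y ranked first by value."]
corpus_emb = embedder.encode(corpus, convert_to_tensor=True, normalize_embeddings=True)

def answer(question: str, top_k: int = 3) -> str:
    # Embed the question and retrieve the top-k most similar documents.
    q_emb = embedder.encode(question, convert_to_tensor=True, normalize_embeddings=True)
    hits = util.semantic_search(q_emb, corpus_emb, top_k=top_k)[0]
    context = "\n".join(corpus[h["corpus_id"]] for h in hits)
    # Ask the LLM to answer strictly from the retrieved context.
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "Answer only from the provided context."},
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return resp.choices[0].message.content
```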
We're preparing a presentation gathering the results of all approaches in more detail. Next week I'll be improving the RAG + LLM and evaluating the previous multi-layer approach with Pippo.
Recap: accuracy is 0% for all fine-tuned versions because the numbers they return aren't right.
Measuring the absolute percentage error by model:
Taking the 1.1B-parameter model trained for one epoch as the baseline (ft_tiny0), the median absolute % error on value qty decreased with the 50-epoch model (ft_tiny2) and increased when moving up to 7B parameters (ft_llama2). That sort of makes sense: larger models are harder to fine-tune.
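For reference, this is how the median absolute percentage error metric can be computed; the arrays below are made-up numbers for illustration, not our measured results:

```python
import numpy as np

def median_ape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Median absolute percentage error over the predicted quantities."""
    ape = np.abs((y_pred - y_true) / y_true) * 100.0
    return float(np.median(ape))

# Hypothetical quantities, for illustration only:
truth = np.array([1200.0, 530.0, 89.0])
preds = np.array([1100.0, 560.0, 95.0])
print(f"median APE: {median_ape(truth, preds):.1f}%")  # -> median APE: 6.7%
```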
Fine-tuning TinyLlama to produce API calls instead, accuracy is 12%, mostly because the model struggles with HS code numbers. Setting the HS codes aside, query accuracy is 89%. On the HS numbers the mean % error is about 12%, but again, every time the model is queried with the same question it returns a slightly different number.
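To make the split between query accuracy and HS-code error concrete, here's one way the scoring could separate the two; the `hs_code=` field name and the call format are hypothetical, not our actual API schema:

```python
import re

def score_call(generated: str, expected: str) -> tuple[bool, float | None]:
    """Score a generated API call against the reference.

    Returns (query_matches, hs_code_pct_error). The HS code is pulled out
    with a regex so that structural accuracy and numeric error on the code
    are measured separately. The hs_code= field name is a made-up example.
    """
    pattern = r"hs_code=(\d+)"
    gen_code = re.search(pattern, generated)
    exp_code = re.search(pattern, expected)
    # Structural match: the calls must be identical once HS codes are blanked out.
    query_matches = (
        re.sub(pattern, "hs_code=?", generated) == re.sub(pattern, "hs_code=?", expected)
    )
    if gen_code and exp_code:
        g, e = int(gen_code.group(1)), int(exp_code.group(1))
        return query_matches, abs(g - e) / e * 100.0
    return query_matches, None

# Example: right query shape, wrong HS code.
print(score_call("trade(hs_code=901, year=2022)", "trade(hs_code=903, year=2022)"))
```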
RAG Evaluation
Types of questions: general, growth, and ranking (results summarized above).