12th of July Updates #7

@alebjanes

RAG Evaluation

  1. 100 questions
    Types of questions:
    • 60 on general trade
    • 12 on growth/variation
    • 28 on rankings
  2. RAG evaluation results

Best combination tested so far: multi-qa-mpnet-base-cos-v1 (embeddings) + gpt-3.5-turbo (LLM)

  • Accuracy: 73%
    • Answers missing data: 9
    • Answers missing context: 14
    • Incorrect answers: 4
  • Average latency: 4.19 s
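As a rough illustration of how these two headline numbers could be produced, the sketch below derives the accuracy from the failure counts listed above and times a per-question call. The `average_latency` helper and the `answer_fn` callable are hypothetical stand-ins for the actual RAG + LLM pipeline, which is not shown in this update.

```python
import time
from statistics import mean
from typing import Callable, Iterable


def average_latency(answer_fn: Callable[[str], str], questions: Iterable[str]) -> float:
    """Return the mean wall-clock seconds that answer_fn takes per question."""
    timings = []
    for question in questions:
        start = time.perf_counter()
        answer_fn(question)  # hypothetical call into the RAG + LLM pipeline
        timings.append(time.perf_counter() - start)
    return mean(timings)


# Failure counts reported above, out of 100 questions.
TOTAL_QUESTIONS = 100
failures = {"missing data": 9, "missing context": 14, "incorrect": 4}

# Every question that did not fail in one of these ways counts as correct.
accuracy = (TOTAL_QUESTIONS - sum(failures.values())) / TOTAL_QUESTIONS
print(f"accuracy = {accuracy:.0%}")
```

The three failure counts sum to 27, which matches the reported 73% accuracy on 100 questions.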

Of the 27 wrong answers:

  • 24 were general trade questions
  • 3 were growth/variation questions (the lowest)
  • 0 were ranking questions
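Since the three question types have different sizes, the raw counts are easier to compare as per-type error rates. The sketch below is a minimal calculation that assumes the wrong-answer categories map one-to-one onto the question types listed at the top of this update; the dictionary names are illustrative only.

```python
# Question counts per type, and wrong answers per type, as reported in this update.
questions_per_type = {"general trade": 60, "growth/variation": 12, "rankings": 28}
wrong_per_type = {"general trade": 24, "growth/variation": 3, "rankings": 0}

# Error rate = wrong answers / questions asked, computed separately for each type.
error_rates = {
    kind: wrong_per_type[kind] / total
    for kind, total in questions_per_type.items()
}

for kind, rate in error_rates.items():
    print(f"{kind}: {wrong_per_type[kind]}/{questions_per_type[kind]} wrong ({rate:.0%})")
```

Under that assumption, general trade questions fail at 40%, growth/variation at 25%, and rankings at 0%.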

We're preparing a presentation that gathers the results of all approaches in more detail. Next week I'll be improving the RAG + LLM and evaluating the previous multi-layer approach with Pippo.
