# qna-non-rag-metrics-eval
The Q&A evaluation flow evaluates Q&A systems by leveraging state-of-the-art large language models (LLMs) to measure the quality and safety of your responses. Using GPT and a GPT embedding model to assist with measurement aims to achieve higher agreement with human evaluation than traditional mathematical measurements.
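The flow can also be run locally as a batch evaluation with the promptflow SDK. The sketch below is illustrative: the flow folder path and the JSONL test set are placeholders, and it assumes the required Azure OpenAI connection has already been configured for the flow.

```python
# Minimal sketch (assumed paths): run the evaluation flow over a JSONL test set
# with the promptflow SDK. Each JSONL line should carry the fields shown in the
# sample input below.
from promptflow.client import PFClient

pf = PFClient()

run = pf.run(
    flow="./qna-non-rag-metrics-eval",   # assumed local copy of the flow folder
    data="./qa_test_set.jsonl",          # assumed test set, one record per line
    column_mapping={
        "question": "${data.question}",
        "answer": "${data.answer}",
        "context": "${data.context}",
        "ground_truth": "${data.ground_truth}",
        "metrics": "gpt_groundedness,f1_score,ada_similarity,gpt_fluency,"
                   "gpt_coherence,gpt_similarity,gpt_relevance",
    },
)

# Per-line metric values and the aggregated metrics for the run.
print(pf.get_details(run))
print(pf.get_metrics(run))
```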
| Inference type | CLI | VS Code Extension |
|---|---|---|
| Real time | deploy-promptflow-model-cli-example | deploy-promptflow-model-vscode-extension-example |
| Batch | N/A | N/A |
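For real-time inference, the registry model can be deployed to a managed online endpoint with the azure-ai-ml Python SDK. The sketch below uses placeholder endpoint and deployment names; additional configuration (such as the connection to your GPT deployment) is typically required and is covered in the linked examples.

```python
# Minimal sketch (assumed names): deploy the registry model to a managed
# online endpoint with the azure-ai-ml SDK.
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<workspace>",
)

# Create the endpoint, then a deployment that points at the registry model.
endpoint = ManagedOnlineEndpoint(name="qna-eval-endpoint", auth_mode="key")
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="qna-eval-endpoint",
    model="azureml://registries/azureml/models/qna-non-rag-metrics-eval/versions/5",
    instance_type="Standard_DS3_v2",   # matches the recommended SKU listed below
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(deployment).result()
```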
Sample input:

```json
{
  "inputs": {
    "question": "Which camping table holds the most weight?",
    "answer": "The Alpine Explorer Tent is the most waterproof.",
    "context": "From the our product list, the alpine explorer tent is the most waterproof. The Adventure Dining Tabbe has higher weight.",
    "ground_truth": "The Alpine Explorer Tent has the highest rainfly waterproof rating at 3000m",
    "metrics": "gpt_groundedness,f1_score,ada_similarity,gpt_fluency,gpt_coherence,gpt_similarity,gpt_relevance"
  }
}
```
Sample output:

```json
{
  "outputs": {
    "f1_score": 0.5,
    "gpt_coherence": 1,
    "gpt_similarity": 1,
    "gpt_fluency": 1,
    "gpt_relevance": 1,
    "gpt_groundedness": 5,
    "ada_similarity": 0.9317354400079281
  }
}
```
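The gpt_* scores are ratings produced by prompting the LLM, while f1_score and ada_similarity are computed numerically from the answer, the ground truth, and their embeddings. The sketch below shows the conventional definitions of token-overlap F1 and embedding cosine similarity; it is illustrative only and may differ in tokenization and preprocessing details from the flow's own implementation.

```python
# Illustrative sketch of the two non-LLM metrics: token-overlap F1 and
# cosine similarity between embedding vectors (e.g. ada embeddings).
from collections import Counter


def token_f1(answer: str, ground_truth: str) -> float:
    """Token-overlap F1 between the answer and the ground truth."""
    ans_tokens = answer.lower().split()
    gt_tokens = ground_truth.lower().split()
    common = Counter(ans_tokens) & Counter(gt_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(ans_tokens)
    recall = num_same / len(gt_tokens)
    return 2 * precision * recall / (precision + recall)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(x * x for x in b) ** 0.5
    return dot / (norm_a * norm_b)
```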
Version: 5
View in Studio: https://ml.azure.com/registries/azureml/models/qna-non-rag-metrics-eval/version/5
is-promptflow: True
azureml.promptflow.section: gallery
azureml.promptflow.type: evaluate
azureml.promptflow.name: QnA Evaluation
azureml.promptflow.description: Compute the quality of the answer for the given question based on the ground_truth and the context
inference-min-sku-spec: 2|0|14|28 (CPU cores | GPUs | memory in GB | storage in GB)
inference-recommended-sku: Standard_DS3_v2