FactScore evaluator class

**Is your feature request related to a problem? Please describe.**
We would like an evaluator class specific to [FactScore](https://arxiv.org/abs/2305.14251). This dataset asks questions about celebrities and leverages Wikipedia content as an answer key. The evaluation approach for a single question is as follows:
1. Generate an LLM response to a question from the FactScore dataset
2. Decompose that response into individual claims (using an LLM)
3. Calculate the precision relative to the Wikipedia answer key (i.e., the proportion of generated claims that are supported by the Wikipedia content)
4. The precision is the FactScore for that single LLM response
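
For reference, the four steps above could be sketched roughly as follows. This is only an illustration and not a proposed design: `fact_score`, `toy_decompose`, and `toy_is_supported` are hypothetical names, and the two toy helpers are plain string operations standing in for the LLM-based claim decomposition and support check.

```python
from typing import Callable, List, Optional


def fact_score(
    response: str,
    reference: str,
    decompose: Callable[[str], List[str]],
    is_supported: Callable[[str, str], bool],
) -> Optional[float]:
    """Precision of the claims in `response` relative to `reference`.

    `decompose` and `is_supported` are injected so that an LLM-backed
    implementation can replace the toy stand-ins below (assumption:
    both steps can be expressed as simple callables).
    """
    claims = decompose(response)
    if not claims:
        # No claims were extracted, so precision is undefined.
        return None
    supported = sum(1 for claim in claims if is_supported(claim, reference))
    return supported / len(claims)


def toy_decompose(response: str) -> List[str]:
    # Stand-in for LLM claim extraction: one claim per sentence.
    return [s.strip() for s in response.split(".") if s.strip()]


def toy_is_supported(claim: str, reference: str) -> bool:
    # Stand-in for the support check: case-insensitive substring match.
    return claim.lower() in reference.lower()


# Toy data for illustration only, not taken from the FactScore dataset.
reference = "Ada Lovelace was born in London. She worked with Charles Babbage."
response = "Ada Lovelace was born in London. She was born in 1900."

score = fact_score(response, reference, toy_decompose, toy_is_supported)
print(score)  # 1 of 2 claims is supported -> 0.5
```

Passing the decomposition and support check in as callables keeps the precision calculation separate from whichever LLM the evaluator class ends up using.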

**Describe the solution you'd like**
Assigning this to @virenbajaj and trusting his guidance for the design.

**Describe alternatives you've considered**
Loading the FactScore dataset directly with the `load_example_dataset` utility function and going through the above steps manually.

**Additional context**
Refer to the paper linked above for more information.

FactScore evaluator class #196