Evaluation metrics seem to be over-counting/inflating the number of true positives? #28

Open
@keanepotato

Hi Kevin,

Thanks for your contribution to the ABSA task. I'd like to draw your attention to the following code block in utils.py in the InstructABSA folder. It seems that because a matched prediction isn't removed from pred_list, the true-positive count can be inflated. For example, when gt_list contains repeated instances, ['food', 'food'], but pred_list contains only one instance, ['food'], both ground-truth items match the same prediction, so the model is counted as predicting all instances correctly even though it missed the second 'food'. Is this intended?

def get_metrics(self, y_true, y_pred, is_triplet_extraction=False):
    total_pred = 0
    total_gt = 0
    tp = 0
    if not is_triplet_extraction:
        for gt, pred in zip(y_true, y_pred):
            gt_list = gt.split(', ')
            pred_list = pred.split(', ')
            total_pred+=len(pred_list)
            total_gt+=len(gt_list)
            for gt_val in gt_list:
                for pred_val in pred_list:
                    if pred_val in gt_val or gt_val in pred_val:
                        tp+=1
                        break

    else:
        for gt, pred in zip(y_true, y_pred):
            gt_list = gt.split(', ')
            pred_list = pred.split(', ')
            total_pred+=len(pred_list)
            total_gt+=len(gt_list)
            for gt_val in gt_list:
                gt_asp = gt_val.split(':')[0]

                try:
                    gt_op = gt_val.split(':')[1]
                except:
                    continue

                try:
                    gt_sent = gt_val.split(':')[2]
                except:
                    continue

                for pred_val in pred_list:
                    pr_asp = pred_val.split(':')[0]

                    try:
                        pr_op = pred_val.split(':')[1]
                    except:
                        continue
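
To make the concern concrete, here is a minimal sketch of the non-triplet matching loop in isolation, next to one possible fix that consumes each prediction once it has been matched. The helper names `count_tp_original` and `count_tp_consuming` are hypothetical and not part of the repository; the substring test is copied from the snippet above. The matching is greedy (first match wins), which is an assumption on my part and may not be the pairing you want when several predictions overlap with the same ground-truth string.

```python
def count_tp_original(gt_list, pred_list):
    """Mirror of the loop in the snippet above: predictions are never consumed."""
    tp = 0
    for gt_val in gt_list:
        for pred_val in pred_list:
            if pred_val in gt_val or gt_val in pred_val:
                tp += 1
                break
    return tp


def count_tp_consuming(gt_list, pred_list):
    """Hypothetical fix: each prediction can match at most one ground-truth item."""
    remaining = list(pred_list)  # copy so the caller's list is not mutated
    tp = 0
    for gt_val in gt_list:
        for i, pred_val in enumerate(remaining):
            # Same lenient substring test as the original code.
            if pred_val in gt_val or gt_val in pred_val:
                tp += 1
                del remaining[i]  # consume the matched prediction
                break  # safe: we stop iterating right after the deletion
    return tp


gt_list = "food, food".split(", ")
pred_list = "food".split(", ")
print(count_tp_original(gt_list, pred_list))   # 2
print(count_tp_consuming(gt_list, pred_list))  # 1
```

With the original loop, the single predicted 'food' is credited twice, so tp (2) exceeds total_pred (1). With the consuming variant, tp can never exceed the number of predictions.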
