Incorrect val/metrics/MulticlassAUROC/epoch in model training #323

@redwangwangwang

Thanks for this great work! I used your code to fine-tune a model that classifies images into 4 categories. During training, val/metrics/MulticlassAUROC/epoch was very low, only about 0.2, while val/metrics/MulticlassAUROC/step was very high, and the val/metrics/MulticlassAUROC/epoch and val/metrics/MulticlassAUROC/step indicators were normal. I then evaluated the model from the same epoch on the validation set and computed several metrics with sklearn.metrics, including micro and macro AUC, which were good at about 0.9. Is there a problem with the MulticlassAUROC settings used during model training?
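For anyone repeating the offline cross-check described above, here is a minimal pure-Python sketch of macro one-vs-rest AUROC. It also shows that the mean of per-batch values need not match the value computed over all pooled predictions. It assumes, without confirmation from the training code, that the step value is computed per batch and the epoch value over the whole validation set. The helper names and the toy probabilities are hypothetical, and this does not diagnose the 0.2 value reported here.

```python
from typing import List, Sequence


def binary_auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as P(positive score > negative score); ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return float("nan")  # undefined when only one class is present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_auroc(
    probs: Sequence[Sequence[float]], targets: Sequence[int], num_classes: int
) -> float:
    """Unweighted mean of one-vs-rest AUROC over classes where it is defined."""
    per_class: List[float] = []
    for c in range(num_classes):
        auc = binary_auroc([p[c] for p in probs], [int(t == c) for t in targets])
        if auc == auc:  # skip NaN (class absent, or the only class present)
            per_class.append(auc)
    return sum(per_class) / len(per_class)


# Hypothetical softmax outputs for two validation batches, one sample per class.
batch_1 = (
    [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1],
     [0.1, 0.1, 0.7, 0.1], [0.1, 0.1, 0.1, 0.7]],
    [0, 1, 2, 3],
)
# In this batch every sample leans towards class 0, but the within-batch
# ranking for each class is still perfect.
batch_2 = (
    [[0.85, 0.05, 0.05, 0.05], [0.80, 0.10, 0.05, 0.05],
     [0.80, 0.05, 0.10, 0.05], [0.80, 0.05, 0.05, 0.10]],
    [0, 1, 2, 3],
)

# "Step"-style value: AUROC per batch, then averaged across batches.
step_values = [macro_ovr_auroc(p, t, 4) for p, t in (batch_1, batch_2)]
step_mean = sum(step_values) / len(step_values)

# "Epoch"-style value: pool all predictions first, then compute AUROC once.
all_probs = batch_1[0] + batch_2[0]
all_targets = batch_1[1] + batch_2[1]
epoch_auc = macro_ovr_auroc(all_probs, all_targets, 4)

print(f"mean of per-batch AUROC: {step_mean:.3f}")
print(f"AUROC over pooled epoch: {epoch_auc:.3f}")
```

On this toy data the per-batch mean is higher than the pooled value, because scores that rank correctly within a batch can still be ordered wrongly across batches. Comparing the logged epoch value against a pooled computation like this one, or against sklearn.metrics on the same pooled predictions, is one way to narrow down where the discrepancy arises.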

    postprocessing:
      metrics:
        pred:
          - "$lambda x: torch.softmax(x, 1)" # Note: Change to $lambda x: torch.sigmoid(x) for Task 2 and Task 3
        # target:  # Note: Uncomment for Task 2 and Task 3
        #   - "$lambda x: x.long()"
      # criterion: # Note: Uncomment for Task 2 and Task 3
      #   pred:
      #     - "$lambda x: x.squeeze(1)"
      #   target:
      #     - "$lambda x: x.float()"

    criterion:
      _target_: torch.nn.CrossEntropyLoss # Note: Change to torch.nn.BCEWithLogitsLoss for Task 2 and Task 3

    optimizer:
      _target_: torch.optim.Adam
      params: "$@system#model.parameters()"
      # lr: "$((@system#batch_size * @trainer#devices)/256) * 0.1" # Compute LR dynamically for different batch sizes
      # weight_decay: 0.0
      # momentum: 0.9

    scheduler:
      _target_: torch.optim.lr_scheduler.StepLR
      optimizer: "@system#optimizer"
      step_size: 30

    metrics:
      train:
        - _target_: torchmetrics.AveragePrecision
          task: multiclass # Note: Change to `binary` for Task 2 and Task 3 and remove num_classes below
          num_classes: 4
        - _target_: torchmetrics.AUROC
          task: multiclass # Note: Change to `binary` for Task 2 and Task 3 and remove num_classes below
          num_classes: 4

      val: "%#train"
      test: "%#train"

The configuration above shows my relevant settings. I hope to get your reply and help, and thank you again for this work!

Labels: question