Fix MeanAverageRecall: compute mAR@K using top-K detections per image (COCO-compliant) #1967
base: develop
Conversation
Congratulations on the fix! The change so that mAR@K is computed per image is fully aligned with the COCO protocol and improves the accuracy of the metrics. This approach is recommended for robust benchmarks and makes fair comparisons between models easier. A great contribution for the whole community, thank you! Signed: Gabriel.
Excellent work on fixing the mAR@K calculation! This is a critical correction that addresses a fundamental issue in metric computation. The COCO evaluation protocol indeed requires per-image top-K filtering, and this fix ensures proper compliance.
Technical insights:
Implementation notes:
This contribution enhances the library's evaluation accuracy and research reproducibility. Well done! Best regards,
Thank you for the encouraging feedback and detailed notes, Gabriel 🙏
I've added a simple unit test with synthetic data to validate the mAR@K calculation. Test setup:
Result with the original (buggy) implementation:
As shown above, mAR@10 (0.52786) ≠ mAR@100 (0.63622), which is incorrect.
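For reference, here is a minimal, self-contained sketch of the kind of property such a test can rely on. It is not the PR's actual fixture and does not call `MeanAverageRecall`; the helper names (`recall_at_k`, `make_image`) and the assumption that no image contains more than 10 predictions are illustrative only. It uses a single-IoU-threshold recall as a stand-in for full mAR to show that per-image truncation leaves recall@10 equal to recall@100 in that setting, while global truncation generally does not.

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two xyxy boxes."""
    x1, y1 = np.maximum(box_a[:2], box_b[:2])
    x2, y2 = np.minimum(box_a[2:], box_b[2:])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area(box_a) + area(box_b) - inter)

def recall_at_k(images, k, per_image=True):
    """Recall@K at IoU 0.5 over a list of (gt_boxes, pred_boxes, scores) images.

    per_image=True truncates to the top-K predictions of each image (COCO rule);
    per_image=False truncates to the top-K predictions pooled across all images
    (the behaviour this PR removes).
    """
    if not per_image:
        # global keep-set of the K highest scores across all images
        pooled = sorted(
            ((s, i, j) for i, (_, _, sc) in enumerate(images) for j, s in enumerate(sc)),
            reverse=True,
        )[:k]
        keep = {(i, j) for _, i, j in pooled}
    hits, total_gt = 0, 0
    for i, (gt, preds, scores) in enumerate(images):
        order = np.argsort(scores)[::-1]
        if per_image:
            order = order[:k]                      # per-image top-K
        else:
            order = [j for j in order if (i, j) in keep]  # global top-K
        matched = set()
        for j in order:
            for g in range(len(gt)):
                if g not in matched and iou(preds[j], gt[g]) >= 0.5:
                    matched.add(g)
                    break
        hits += len(matched)
        total_gt += len(gt)
    return hits / total_gt

# Two synthetic images, 8 predictions each (16 in total, but <= 10 per image).
rng = np.random.default_rng(42)
def make_image(n_gt=6, n_pred=8):
    gt = np.array([[x, x, x + 10, x + 10] for x in rng.uniform(0, 100, n_gt)])
    good = gt[: n_pred // 2] + rng.normal(0, 1.0, (n_pred // 2, 4))   # near-hits
    bad = np.array([[x, x, x + 10, x + 10]
                    for x in rng.uniform(200, 300, n_pred - n_pred // 2)])  # misses
    return gt, np.vstack([good, bad]), rng.random(n_pred)

images = [make_image(), make_image()]
assert recall_at_k(images, 10) == recall_at_k(images, 100)       # per-image: equal
print(recall_at_k(images, 10, per_image=False),                  # global: typically lower,
      recall_at_k(images, 100, per_image=False))                 # since detections are dropped
```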
Description
This PR fixes the calculation of mAR@K in `MeanAverageRecall` to comply with the COCO evaluation protocol. Previously, the implementation selected the top-K predictions globally across all images, rather than per image.
According to the COCO evaluation protocol, mAR@K should be calculated by considering the top-K highest-confidence detections for each image.
This bug is tracked in issue #1966.
To resolve this, I modified the `_compute` and `_compute_average_recall_for_classes` functions to first filter the statistics by confidence score per image, before concatenating them and computing the confusion matrix. No new dependencies are required for this change.
Type of change
How has this change been tested? Please provide a testcase or example of how you tested the change.
I tested the change by running the metric on a dataset with varying numbers of predictions per image and verified that, for each image, only the top-K predictions (by confidence) were used in the mAR@K calculation.
Any specific deployment considerations
No special deployment considerations are required.
Docs