Test Set Metrics #2000
Replies: 1 comment
-
Hey guys, I think I figured it out for the most part, just not the box plots for nodes. Here is what I've got so far. My training set is 3200 frames, so I used 320 frames (10%) for testing, taken from 16 videos from a completely separate cohort, so different rats and images the model has never seen.

The results don't look too good. I'm not sure if it was the images chosen or something else. It feels like I could pick super easy poses and inflate the test score, so I'm not sure how to go about selecting frames. Compared with the training results, it makes me think the model may have overfitted. As far as I can tell from researching it, the low PCK is related to the levers being predicted: some videos never have them come out, so maybe there are false negatives on frames where the model should expect them, especially since the rest looks good. Not sure.

Honestly, the predictions don't look bad when I watch the videos, so I have no idea how to gauge it. Our videos are 800x600, so ~32 px at its worst is about 4% of the frame width and 5.3% of the height. Again, I'm not sure how to evaluate the model or how I'd report it in a paper, since it looks good but this test says it's not (to me, at least).
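To make the resolution arithmetic and the missing-lever concern concrete, here is a minimal sketch in plain Python. It is not SLEAP's implementation: `error_as_fraction`, `pck`, the 10 px threshold, and the distance values are all hypothetical, and it uses one common definition of PCK (the fraction of keypoints whose error is within a pixel threshold). It only illustrates how the score can shift depending on whether keypoints with no usable label/prediction pair are skipped or counted as misses.

```python
import math


def error_as_fraction(err_px, width_px, height_px):
    """Express a pixel error as a fraction of frame width and frame height."""
    return err_px / width_px, err_px / height_px


def pck(dists_px, threshold_px, count_missing_as_miss=False):
    """Fraction of keypoints with error <= threshold_px.

    NaN marks a keypoint without a usable label/prediction pair
    (e.g. a lever that is not visible). By default these are skipped;
    set count_missing_as_miss=True to count them as failures instead.
    """
    valid = [d for d in dists_px if not math.isnan(d)]
    n_missing = len(dists_px) - len(valid)
    hits = sum(d <= threshold_px for d in valid)
    total = len(valid) + (n_missing if count_missing_as_miss else 0)
    return hits / total if total else float("nan")


# 32 px error on an 800x600 frame, as in the post above.
frac_w, frac_h = error_as_fraction(32, 800, 600)
print(f"{frac_w:.1%} of width, {frac_h:.1%} of height")

# Made-up distances for illustration only; NaN = missing keypoint.
dists = [3.0, 5.5, 2.1, float("nan"), float("nan"), 31.8]
pck_skip = pck(dists, threshold_px=10)
pck_penalize = pck(dists, threshold_px=10, count_missing_as_miss=True)
print(pck_skip, pck_penalize)
```

If the two numbers differ a lot on real data, that would suggest the missing nodes, rather than localization error, are driving the low score.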
-
Hey guys,
Talmo, it was a pleasure finally meeting you in person at the Jackson Lab course. Thanks for always making yourself so available, a true treat for all.
I came back and began putting together a test set of labeled frames. I searched through the discussion board but didn't nail down anything in line with what I was mentioning, namely a way to get output metrics that look like the ones produced from training. Essentially, I am after the same box plot graph and metrics labels.
I did remember that I had ChatGPT help me put something together based on the sleap.nn.evals API. I listed it below. Basically, I see that if I change the inputs before the "metrics = " portion, so that labels_gt and labels_pr are the test labels and the results of a test inference, I could get close to what I am after. Any edits that reproduce the training printout, or any other useful ways to report in a paper how well the model performs, would be greatly appreciated.
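For the per-node box plots, one possible approach is sketched below in plain Python. It does not call the SLEAP API: it assumes the ground-truth and predicted points have already been matched per node and exported as lists of `(x, y)` tuples, with `None` for a missing point. The helper names `node_distances` and `box_stats` and all coordinates are hypothetical.

```python
import math
import statistics


def node_distances(gt_points, pr_points):
    """Euclidean error per matched keypoint; NaN if either point is missing."""
    dists = []
    for gt, pr in zip(gt_points, pr_points):
        if gt is None or pr is None:
            dists.append(float("nan"))
        else:
            dists.append(math.dist(gt, pr))
    return dists


def box_stats(dists):
    """Five-number summary of the valid distances, as used for a box plot.

    Returns None when fewer than two valid distances are available.
    """
    valid = sorted(d for d in dists if not math.isnan(d))
    if len(valid) < 2:
        return None
    q1, median, q3 = statistics.quantiles(valid, n=4, method="inclusive")
    return {"n": len(valid), "min": valid[0], "q1": q1,
            "median": median, "q3": q3, "max": valid[-1]}


# Made-up coordinates for illustration only.
gt = {
    "nose": [(10, 10), (20, 20), (30, 30), (40, 40), (50, 50)],
    "lever": [None, (100, 100), (100, 100)],
}
pr = {
    "nose": [(10, 10), (23, 20), (30, 34), (43, 44), (50, 60)],
    "lever": [None, (100, 130), None],
}

stats = {node: box_stats(node_distances(gt[node], pr[node])) for node in gt}
for node, summary in stats.items():
    print(node, summary)
```

The per-node summaries can be passed to any plotting library to draw the boxes, and reporting the median and quartiles per node alongside a single overall score makes it easier to see whether one node, such as the lever, accounts for most of the error.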
Thanks as always! Cheers.
-Jarryd