Congratulations on the paper—it is truly interesting! I have a few questions regarding the implementation and the reproducibility of the results.
For the Cityscapes dataset, I downloaded the leftImg8bit images along with the gtFine annotations. Then, I used the CityscapesScripts to obtain the trainLabelIds. Is this the correct procedure to set up the data?
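Concretely, this is roughly what I ran to generate the trainId label images (a sketch assuming the pip-installed cityscapesscripts package; the dataset path is a placeholder and the exact entry point may differ between versions):

```python
# Sketch: generating *_labelTrainIds.png with the official cityscapesscripts
# package (pip install cityscapesscripts). The dataset path is a placeholder.
import os

# Point the scripts at the folder that contains leftImg8bit/ and gtFine/.
os.environ["CITYSCAPES_DATASET"] = "/path/to/cityscapes"

from cityscapesscripts.preparation.createTrainIdLabelImgs import main as create_train_ids

# Writes a *_labelTrainIds.png next to every gtFine annotation.
create_train_ids()
```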
Additionally, I have some specific questions about the training script:
In train.py, optimizer.zero_grad() appears to be missing. Is this intentional?
The default learning rate in the code is 0.0001, whereas the paper mentions a learning rate of 0.004. Which one is correct?
When enabling all the losses, the MAV loss becomes extremely large and the training becomes very unstable. Essentially, the model does not learn anything. I have tried using both learning rates mentioned above, as well as including and excluding optimizer.zero_grad(). Why is this happening? Are there specific hyperparameters required for the algorithm to function correctly?
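For reference, the update-step ordering I would expect is the standard one below (a minimal, runnable sketch; the tiny model, loss, and random data are placeholders, not your train.py):

```python
# Minimal sketch of the standard PyTorch update ordering;
# the model, criterion, and data here are stand-ins, not the repository's code.
import torch
import torch.nn as nn

model = nn.Conv2d(3, 19, kernel_size=1)            # stand-in for the real network
criterion = nn.CrossEntropyLoss(ignore_index=255)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

for _ in range(2):                                  # stand-in for the data loader
    images = torch.randn(2, 3, 64, 64)
    labels = torch.randint(0, 19, (2, 64, 64))

    optimizer.zero_grad()        # clear gradients from the previous iteration
    logits = model(images)
    loss = criterion(logits, labels)
    loss.backward()              # accumulate gradients for this batch only
    optimizer.step()             # apply the parameter update
```

Without the optimizer.zero_grad() call, gradients accumulate across iterations, which on its own could produce the kind of instability described above.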
Lastly, if possible, it would be incredibly helpful to have a set of instructions to reproduce the results.
Thanks!
I am also encountering several issues with the current implementation:
The MAV loss consistently takes on a very large value, regardless of the learning rate used.
The test function also seems to mark the unknown labels from the training set as known. If I am not mistaken, this is incorrect, because those labels are treated as unknown by the objectosphere loss and pushed towards zero.
Even when these mislabeled unknowns are ignored, the known/unknown mIoU on the test set using only the contrastive decoder is poor: the best mIoU for unknowns I observed was 0.07. This is with training unknowns labeled -1, knowns labeled 0, and test unknowns labeled 1, computing the mIoU with ignore_index = -1.
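For clarity, this is roughly how I set up the labels for that evaluation (a sketch with placeholder names such as train_unknown_ids and test_unknown_ids, not the repository's code):

```python
# Sketch of the known/unknown label remapping described above.
# `gt`, `train_unknown_ids`, and `test_unknown_ids` are placeholders.
import numpy as np

def remap_for_unknown_miou(gt, train_unknown_ids, test_unknown_ids):
    """Map a Cityscapes trainId map to {-1, 0, 1} for a binary known/unknown mIoU."""
    out = np.zeros_like(gt)                      # knowns -> 0
    out[np.isin(gt, train_unknown_ids)] = -1     # classes held out during training -> ignored
    out[np.isin(gt, test_unknown_ids)] = 1       # true test-time unknowns -> 1
    return out

# The mIoU is then computed over {0, 1} with ignore_index = -1.
```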
Finally, there is a discrepancy between the objectosphere loss in the code and the one described in the paper. The paper specifies that the norm should be squared, but this is not reflected in the code. Additionally, in the code, the unknown component is multiplied by 10.
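For comparison, this is the objectosphere loss as it is usually written with squared norms (a sketch following the paper's description, not the repository's code; the function name, arguments, and the margin xi are placeholders):

```python
# Sketch of the objectosphere loss with squared norms on both branches.
# `features` is a per-pixel feature map and `is_known` a boolean mask.
import torch

def objectosphere_loss(features, is_known, xi=1.0):
    # features: (N, C, H, W), is_known: (N, H, W) boolean mask
    norms = torch.linalg.vector_norm(features, dim=1)        # (N, H, W)
    known_term = torch.clamp(xi - norms, min=0.0) ** 2       # push known norms above the margin xi
    unknown_term = norms ** 2                                 # push unknown norms towards zero
    loss = torch.where(is_known, known_term, unknown_term)
    return loss.mean()
```

The code, by contrast, appears to use the unsquared norm and to scale the unknown term by 10, which changes the relative weighting of the two branches.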