Questions about reproducibility #4

Open
soc12 opened this issue Jun 17, 2024 · 2 comments

Comments

@soc12

soc12 commented Jun 17, 2024

Hi,

Congratulations on the paper—it is truly interesting! I have a few questions regarding the implementation and the reproducibility of the results.

For the Cityscapes dataset, I downloaded the leftImg8bit images along with the gtFine annotations. Then, I used the CityscapesScripts to obtain the trainLabelIds. Is this the correct procedure to set up the data?
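
For context, this is roughly what I ran to generate the trainId label images (assuming the pip-installed cityscapesScripts; the dataset path is a placeholder, and the entry point below is my recollection of the package layout, so please correct me if it differs):

```python
import os

# Placeholder path to the dataset root that contains leftImg8bit/ and gtFine/.
os.environ["CITYSCAPES_DATASET"] = "/path/to/cityscapes"

# createTrainIdLabelImgs converts the gtFine polygon annotations into
# *_gtFine_labelTrainIds.png files using the standard 19-class trainId mapping.
from cityscapesscripts.preparation import createTrainIdLabelImgs

createTrainIdLabelImgs.main()
```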

Additionally, I have some specific questions about the training script:

  • In train.py, optimizer.zero_grad() appears to be missing, so gradients would accumulate across iterations (see the sketch after this list). Is this intentional?
  • The default learning rate in the code is 0.0001, whereas the paper mentions a learning rate of 0.004. Which one is correct?
  • When enabling all the losses, the MAV loss becomes extremely large and the training becomes very unstable. Essentially, the model does not learn anything. I have tried using both learning rates mentioned above, as well as including and excluding optimizer.zero_grad(). Why is this happening? Are there specific hyperparameters required for the algorithm to function correctly?
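
For reference, a standard PyTorch training step zeroes the gradients before every backward pass; without optimizer.zero_grad(), gradients accumulate across iterations, which by itself could explain an exploding loss. A minimal, self-contained sketch with toy stand-ins (not the repository's actual model or losses):

```python
import torch
import torch.nn as nn

# Toy stand-ins so the snippet runs on its own; replace with the repo's model and losses.
model = nn.Conv2d(3, 19, kernel_size=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss(ignore_index=255)

for step in range(10):
    images = torch.randn(2, 3, 64, 64)          # fake batch
    labels = torch.randint(0, 19, (2, 64, 64))  # fake ground truth
    optimizer.zero_grad()   # reset gradients accumulated by the previous step
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```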

Lastly, if possible, it would be incredibly helpful to have a set of instructions to reproduce the results.

Thanks!

@aggelos-michael-papadopoulos

I am also encountering several issues with the current implementation:

The MAV loss consistently takes on a very large value, regardless of the learning rate used. Furthermore, the test function seems to mark the unknown labels from the training set as known. If I am not mistaken, this is incorrect, because those labels are treated as unknown by the objectosphere loss and pushed towards zero. Even when these mislabeled unknowns are ignored, evaluating the test set with only the contrastive decoder for knowns/unknowns yields poor results: the maximum mIoU observed for unknowns was only 0.07. This is with training unknowns labelled -1, knowns labelled 0, and test unknowns labelled 1, using mIoU ignore_index = -1.
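
To be concrete about the metric, this is essentially how I compute the per-class IoU under that label convention (my own sketch, not the repository's test function):

```python
import torch

def iou_known_unknown(pred, target, ignore_index=-1):
    """Per-class IoU over {0: known, 1: unknown}, skipping pixels whose target
    equals ignore_index (the training unknowns labelled -1 above)."""
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]
    ious = {}
    for cls, name in ((0, "known"), (1, "unknown")):
        inter = ((pred == cls) & (target == cls)).sum().float()
        union = ((pred == cls) | (target == cls)).sum().float()
        ious[name] = (inter / union).item() if union > 0 else float("nan")
    return ious
```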

Finally, there is a discrepancy between the objectosphere loss in the code and the one described in the paper. The paper specifies that the norm should be squared, but this is not reflected in the code. Additionally, in the code, the unknown component is multiplied by 10.
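
For comparison, this is the form of the objectosphere loss as I read it from the paper (a sketch of my understanding, with xi denoting the margin on the feature norm; the code's version drops the squares and weights the unknown term by 10, as noted above):

```python
import torch

def objectosphere_loss(features, is_known, xi=1.0):
    """Objectosphere loss as I understand it from the paper (not the repo's code).
    features: (N, D) per-pixel feature vectors, is_known: (N,) boolean mask.
    Known pixels:   max(xi - ||f||, 0)^2  -> pushes feature norms above the margin xi.
    Unknown pixels: ||f||^2               -> pushes feature norms towards zero."""
    norms = features.norm(p=2, dim=1)
    known_term = torch.clamp(xi - norms[is_known], min=0) ** 2
    unknown_term = norms[~is_known] ** 2
    return torch.cat([known_term, unknown_term]).mean()
```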

@Kimiarfaie

Hi, I am facing the same issue with the loss (the feature loss) taking a very large value. Has anyone found a solution to this problem?
