IndexError: Shape mismatch between mask and prediction during NYUV2 probing #2

@peterwisu

First of all, thank you for the amazing work on this project.

I’m running monocular depth estimation on the NYU dataset using the linear probing setup with the iBOT-Base (ViT-B/16) model. I’m using the evaluation benchmark exactly as provided, without any modifications, but the process fails with a shape mismatch error between the predicted depth map and the ground-truth mask:

IndexError: The shape of the mask [8, 1, 448, 448] at index 2 does not match the shape of the indexed tensor [8, 1, 112, 112] at index 2

From what I can tell, the issue seems to come from a mismatch in resolution between the dataset and the probing model. The NYU_geonet dataset class resizes both the RGB image and its depth map to 448×448 before returning them. The iBOT-Base backbone (ViT-B/16) then processes these 448×448 images with a patch size of 16, producing a 28×28 patch grid. These tokens are reshaped into a tensor of shape (B, C, 28, 28) and passed into the DepthProbeModel.
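The token-to-grid step described above can be sketched as follows. This is a minimal illustration rather than the repository's code: the helper name `tokens_to_grid` is hypothetical, and it assumes the backbone returns patch tokens only (no CLS token) with the standard ViT-B embedding width of 768.

```python
import torch


def tokens_to_grid(tokens, image_size=448, patch_size=16):
    """Reshape (B, N, C) patch tokens into a (B, C, H, W) feature map.

    Hypothetical helper: assumes square inputs and no CLS token.
    """
    b, n, c = tokens.shape
    side = image_size // patch_size  # 448 // 16 = 28
    if n != side * side:
        raise ValueError(f"expected {side * side} tokens, got {n}")
    # (B, N, C) -> (B, C, N) -> (B, C, side, side)
    return tokens.transpose(1, 2).reshape(b, c, side, side)


tokens = torch.zeros(2, 28 * 28, 768)  # dummy patch tokens for two images
grid = tokens_to_grid(tokens)
print(tuple(grid.shape))
```

With a 448×448 input and a patch size of 16, the grid side is 448 / 16 = 28, so the feature map is 16 times smaller than the image along each axis.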

Inside that model, the line

x = F.interpolate(x, scale_factor=4, mode="bilinear")

upsamples the feature map by a factor of four, giving a 112×112 output. Then

x = self.conv(x)

predicts 256 depth-bin logits per spatial location, and

depth = self.predict(x)

converts those logits into a continuous depth map using linear normalization. The final prediction therefore has shape (B, 1, 112, 112).
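To make the shape flow concrete, here is a rough sketch of a probe with the same three steps. It is not the actual DepthProbeModel: the class name `ToyDepthProbe`, the 1×1 convolution, the depth range of 0.001 to 10 m, and the reading of "linear normalization" as ReLU followed by division by the per-pixel sum are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToyDepthProbe(nn.Module):
    """Hypothetical stand-in for the probe; all hyperparameters are assumed."""

    def __init__(self, in_dim=768, n_bins=256, min_depth=0.001, max_depth=10.0):
        super().__init__()
        self.conv = nn.Conv2d(in_dim, n_bins, kernel_size=1)
        # Linearly spaced bin centres over the assumed depth range.
        self.register_buffer("bins", torch.linspace(min_depth, max_depth, n_bins))

    def predict(self, logits, eps=0.1):
        # Assumed "linear" normalization: non-negative weights that sum to 1.
        weights = torch.relu(logits) + eps
        weights = weights / weights.sum(dim=1, keepdim=True)
        # Expected depth per pixel: weighted sum of the bin centres.
        return torch.einsum("bkhw,k->bhw", weights, self.bins).unsqueeze(1)

    def forward(self, x):
        x = F.interpolate(x, scale_factor=4, mode="bilinear", align_corners=False)
        x = self.conv(x)        # (B, 256, 112, 112)
        return self.predict(x)  # (B, 1, 112, 112)


probe = ToyDepthProbe()
depth = probe(torch.randn(2, 768, 28, 28))
print(tuple(depth.shape))
```

Under these assumptions the output stays at 112×112, a quarter of the 448×448 input along each axis, because the 16× reduction from patching is only partly undone by the 4× upsampling.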

The evaluation script later applies the ground-truth mask (B, 1, 448, 448) from the dataset to this predicted depth map (B, 1, 112, 112), which raises the IndexError shown above.
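The failure can be reproduced in isolation with dummy tensors of the two shapes from the error message. The helper name `index_with_mask` is hypothetical and only exists to capture the exception text.

```python
import torch

pred = torch.zeros(8, 1, 112, 112)                   # probe output shape
mask = torch.ones(8, 1, 448, 448, dtype=torch.bool)  # mask at dataset resolution


def index_with_mask(pred, mask):
    """Return the masked values, or the error message if the shapes disagree."""
    try:
        return pred[mask]
    except IndexError as err:
        return str(err)


result = index_with_mask(pred, mask)
print(result)
```

Boolean-mask indexing requires the mask to match the indexed tensor along every dimension it covers, so the mismatch surfaces at index 2, the first spatial axis.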

Could you please clarify how this resolution difference is supposed to be handled in the benchmark? Should the ground-truth depth be downsampled to 112×112 before computing the loss, or should the predicted map be upsampled back to 448×448?
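For reference, the two options could look roughly like this. Both are sketches under stated assumptions, not the benchmark's intended behaviour: the function names are hypothetical, and the mask is assumed to mark pixels with depth greater than zero as valid.

```python
import torch
import torch.nn.functional as F


def upsample_prediction(pred, target):
    """Option A: resize the prediction to the ground-truth resolution."""
    return F.interpolate(pred, size=target.shape[-2:], mode="bilinear", align_corners=False)


def downsample_target(target, pred):
    """Option B: resize the ground truth to the prediction resolution.

    Nearest-neighbour sampling avoids blending valid and invalid depth values.
    """
    return F.interpolate(target, size=pred.shape[-2:], mode="nearest")


pred = torch.rand(8, 1, 112, 112) * 10.0    # dummy prediction
target = torch.rand(8, 1, 448, 448) * 10.0  # dummy ground-truth depth

# Option A: mask and compare at 448x448.
pred_up = upsample_prediction(pred, target)
mask = target > 0  # assumed validity rule
err_full = (pred_up[mask] - target[mask]).abs().mean()

# Option B: mask and compare at 112x112.
target_small = downsample_target(target, pred)
mask_small = target_small > 0
err_small = (pred[mask_small] - target_small[mask_small]).abs().mean()
```

Either variant removes the IndexError, but they are not equivalent: Option A scores every ground-truth pixel, while Option B discards most of them, so reported metrics may differ between the two.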
