MaxCNN query #20

Open
@stw32

Hi there,

I have been trying to run the MaxCNN model and was wondering about this section of the code:

def forward(self, x):
    if x.get_device() == 0:
        tmp = torch.zeros(x.shape[0],x.shape[1],128,4,4).cuda()
    else:
        tmp = torch.zeros(x.shape[0],x.shape[1],128,4,4).cpu()
    for i in range(7):
        tmp[:,i] = self.pool1(F.relu(self.conv7(self.pool1(F.relu(self.conv6(F.relu(self.conv5(self.pool1(F.relu(self.conv4(F.relu(self.conv3(F.relu(self.conv2(F.relu(self.conv1(x[:,i])))))))))))))))))
    x = tmp.reshape(x.shape[0], x.shape[1],4*128*4,1)
    x = self.pool(x)
    x = x.view(x.shape[0],-1)
    x = self.fc2(self.fc(x))
    x = self.max(x)
    return x

In particular, the self.pool layer appears to apply a max-pool kernel of size (n_window, 1), i.e. (7, 1), to a reshaped tensor of shape (x.shape[0], x.shape[1], 2048, 1). As a result, the kernel of height 7 works its way down the 2048 rows and takes the maximum of each 7-row receptive field. I was expecting the max-pooling to occur across the frames (i.e. the temporal dimension), but that isn't what happens. I wonder whether this is an error in translating from the original code: the paper describes this model as one that "performs max-pooling over ConvNet outputs across time frames", and the kernel size does seem intended to take the number of frames (n_window) into account, yet the pooling isn't applied across frames.
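
To make the shapes concrete, here is a minimal sketch of the behaviour described above. It assumes self.pool is defined as nn.MaxPool2d((n_window, 1)) with the default stride (the definition is not quoted in this issue), and it uses a random tensor as a stand-in for the ConvNet outputs. The last lines show one possible way to pool across frames instead; this is only an illustration, not necessarily what the original code intended.

```python
import torch
import torch.nn as nn

batch, n_window = 2, 7  # n_window = 7 frames, as in the issue

# Stand-in for the per-frame ConvNet outputs (the real `tmp` is filled by conv1..conv7).
tmp = torch.randn(batch, n_window, 128, 4, 4)

# What the quoted forward() does: MaxPool2d reads the input as (N, C, H, W) =
# (batch, 7, 2048, 1), so the frames sit on the channel axis and the (7, 1)
# kernel slides down the 2048 feature rows of each frame separately.
pool = nn.MaxPool2d((n_window, 1))  # assumed definition of self.pool
current = pool(tmp.reshape(batch, n_window, 4 * 128 * 4, 1))

# One possible alternative: reduce over dim 1, the frame (temporal) axis.
across_frames = tmp.max(dim=1).values.reshape(batch, -1)

print(current.shape)        # torch.Size([2, 7, 292, 1])
print(across_frames.shape)  # torch.Size([2, 2048])
```

If the pooling were done across frames like this, the first FC layer would need 2048 input features rather than the current value.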

I note that the 1st FC layer is defined as follows:

self.fc = nn.Linear(n_window*int(4*4*128/n_window),512)

This takes into account that the 2048 rows aren't divisible by the max-pool kernel height of 7, but it seems a little untidy, which makes me wonder whether this is what was originally intended.
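
The arithmetic behind that input size can be checked with a short sketch. It assumes the pooling layer uses PyTorch's default stride (equal to the kernel size); the variable names other than n_window are made up for illustration.

```python
n_window = 7            # number of frames, as in the issue
features = 4 * 4 * 128  # 2048 flattened ConvNet features per frame

# A max-pool of height k with stride k produces floor((H - k) / k) + 1
# outputs along a dimension of height H.
pooled_rows = (features - n_window) // n_window + 1

# Expression taken from the quoted nn.Linear definition.
fc_in = n_window * int(features / n_window)

# Trailing feature rows that no pooling window covers.
dropped = features - pooled_rows * n_window

print(pooled_rows, fc_in, dropped)  # 292 2044 4
```

So the FC input of 2044 corresponds to 7 frames times 292 pooled rows, with the last 4 of the 2048 rows in each frame never entering any pooling window.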

Many thanks!
