
About the a_k, b, x_a, x_b in the paper #36

Open
@auto-Dog


Hi, I read your paper Attentional Pooling for Action Recognition and think it would be a great fit for the pooling in my network (a C3D network for video recognition). However, in your code I could not find clear traces of the a_k, b, x_a, x_b from the paper, or of the corresponding pooling module. All I can see is the "POSE_ATTENTION_LOGITS" part:

```python
if cfg.NET.USE_POSE_ATTENTION_LOGITS:
    with tf.variable_scope('PoseAttention'):
        # use the pose prediction as an attention map to get the features
        # step 1: split pose logits over channels
        pose_logits_parts = tf.split(
            pose_logits, pose_logits.get_shape().as_list()[-1],
            axis=pose_logits.get_shape().ndims - 1)
```
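
For context, here is my current reading of the paper's rank-1 pooling, written as a small NumPy sketch. This is my own interpretation, not code from your repository; the variable names follow the paper, and the sizes and random values are placeholders just to show the shapes:

```python
import numpy as np

# Second-order pooling scores class k as trace(W_k X^T X), where X is the
# n x f feature map (n spatial locations, f channels). The paper, as I read
# it, approximates W_k with the rank-1 factorization a_k b^T, which gives
#   score_k = b^T X^T X a_k = (X a_k)^T (X b) = x_a^T x_b,
# where x_a = X a_k is the class-specific (top-down) attention map and
# x_b = X b is the class-agnostic (bottom-up) saliency map.

n, f = 32 * 32, 128                  # toy sizes for illustration only
rng = np.random.default_rng(0)
X = rng.standard_normal((n, f))      # flattened conv feature map
a_k = rng.standard_normal((f, 1))    # top-down weights, one per class
b = rng.standard_normal((f, 1))      # bottom-up weights, shared by classes

x_a = X @ a_k                        # n x 1 class-specific attention map
x_b = X @ b                          # n x 1 bottom-up attention map
score_k = float(x_a.T @ x_b)         # scalar score for class k
```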

Could you give me a brief instruction on this, so that I can use your attention pooling module to pool a [bsz, 128, 16, 32, 32] feature into a [bsz, 128, 1, 32, 32] one?
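
In case it helps clarify what I am after, here is a rough NumPy sketch of the kind of temporal pooling I have in mind: an attention map over the T dimension, in the spirit of x_b = X b above. The random b stands in for a learned parameter, and the softmax over T is my own assumption, not something I am claiming the paper does:

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# feats: [bsz, C, T, H, W]; pool over T with a bottom-up attention map.
bsz, C, T, H, W = 2, 128, 16, 32, 32
feats = np.random.randn(bsz, C, T, H, W).astype(np.float32)
b = np.random.randn(C).astype(np.float32)   # would be learned in practice

attn = np.einsum('bcthw,c->bthw', feats, b)         # [bsz, T, H, W] logits
attn = softmax(attn, axis=1)                        # normalize over T
pooled = np.einsum('bcthw,bthw->bchw', feats, attn) # weighted sum over T
pooled = pooled[:, :, None, :, :]                   # [bsz, C, 1, H, W]
```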
