Return correct output shape for MultiHeadAttention #25

Callidior · 2019-09-06T12:00:41Z

In contrast to `MultiHeadSelfAttention`, `MultiHeadAttention` has two inputs but only one output. The current implementation does not override `compute_output_shape`, which by default returns the input shapes unmodified, so the layer reports two output shapes instead of one. It should instead return only the input shape of the decoder.
Otherwise, model construction fails if the sequence lengths of the encoder and decoder differ.
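
For illustration, a minimal sketch of the intended shape logic, written without a Keras dependency. The class name `MultiHeadAttentionShapeSketch` is hypothetical, and the input order `[encoder_shape, decoder_shape]` is an assumption that may not match the actual layer:

```python
class MultiHeadAttentionShapeSketch:
    """Hypothetical stand-in for the layer; the real one would subclass the Keras Layer."""

    def compute_output_shape(self, input_shape):
        # Assumption: input_shape is a list [encoder_shape, decoder_shape].
        encoder_shape, decoder_shape = input_shape
        # The layer has a single output whose shape follows the decoder input,
        # so return that one shape rather than the list of both input shapes.
        return tuple(decoder_shape)


# Encoder and decoder sequence lengths differ (12 vs. 7).
encoder_shape = (None, 12, 64)  # (batch, encoder sequence length, model dimension)
decoder_shape = (None, 7, 64)   # (batch, decoder sequence length, model dimension)

layer = MultiHeadAttentionShapeSketch()
output_shape = layer.compute_output_shape([encoder_shape, decoder_shape])
```

With this override, the reported output shape no longer depends on the encoder's sequence length.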
