In `forward()` of the `MultiHeadAttention` class in `assignment3/cs231n/transformer_layers.py`, the provided `expected_self_attn_output` can only be reproduced by computing: attention weights → dropout → (attention weights after dropout) × value matrix. However, the assignment instructions explicitly specify a different order: attention weights → attention weights × value matrix → dropout. Anyone who follows the instructed order gets a `self_attn_output` that differs from the provided `expected_self_attn_output`, so the check in `Transformer_Captioning.ipynb` is wrong.
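To make the discrepancy concrete, here is a minimal NumPy sketch of the two orderings. The shapes, values, and dropout masks are made up purely for illustration (not taken from the assignment code); the point is just that dropping out the attention weights before the value multiply generally gives a different result than dropping out the product afterward:

```python
import numpy as np

np.random.seed(0)
T, D = 3, 4  # sequence length and head dimension (arbitrary for this demo)

# A row-normalized matrix standing in for the softmax attention weights.
attn = np.random.rand(T, T)
attn = attn / attn.sum(axis=1, keepdims=True)
V = np.random.rand(T, D)  # value matrix

# Inverted-dropout masks (keep probability 0.5, scaled by 1/keep_prob).
p = 0.5
mask_w = (np.random.rand(T, T) > p) / (1 - p)  # mask on the weights
mask_o = (np.random.rand(T, D) > p) / (1 - p)  # mask on the output

# Order that reproduces the notebook's expected output:
# dropout on the attention weights, THEN multiply by V.
out_dropout_first = (attn * mask_w) @ V

# Order the written instructions describe:
# multiply by V, THEN dropout on the product.
out_dropout_last = (attn @ V) * mask_o

# The two orderings almost never agree for random masks.
print(np.allclose(out_dropout_first, out_dropout_last))
```

Applying dropout to the weight matrix zeroes out entire attended positions (a column of contributions), whereas applying it to the product zeroes individual output features, so the two are not equivalent even in expectation per-element realization.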
manuka2 changed the title to "2021 assignment 3 Q2 self attention section: the expected_self_attn_output provided is wrong" on Feb 7, 2022.
Hi there! I've been working on the same problem these days and it's driving me nuts.
Previously I wasn't very familiar with how the Transformer works. Yesterday I read the supplementary materials and revisited the problem, but the error is still 1.0... There hasn't been any reference solution, so I was wondering if you could help me.
By the way, I'm not a student currently taking the class, so there's no need to worry about the Honor Code. I'm just watching the 2017 videos and doing the 2022 assignments.