Dear author,
Thanks for sharing the code. According to Eq. 1 in your paper, the attribute vector, W1, and the visual features are multiplied together. However, in MSDN.py the einsum is computed over self.V, Fs, and W1.
As I understand it, Fs is the visual feature extracted from ResNet-101 (res101). What does self.V represent? I assume it is the attribute vector, but it appears to be randomly initialized rather than taken from GloVe.
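
To make the question concrete, here is a minimal sketch of how I read the computation. It uses NumPy rather than the actual model code; the shapes, the index string, and the variable names are my own assumptions for illustration and are not copied from MSDN.py.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
num_attr, dim_v, dim_f, num_regions, batch = 4, 5, 6, 3, 2

# Stand-in for self.V: one vector per attribute. Here it is random,
# which is what I seem to see in the code; I expected GloVe vectors.
V = rng.normal(size=(num_attr, dim_v))
# Stand-in for W1: maps the attribute space to the visual feature space.
W1 = rng.normal(size=(dim_v, dim_f))
# Stand-in for Fs: visual features per image and region.
Fs = rng.normal(size=(batch, dim_f, num_regions))

# Score of attribute i against region r of image b:
# S[b, i, r] = V[i] @ W1 @ Fs[b, :, r]
S = np.einsum('iv,vf,bfr->bir', V, W1, Fs)

# The same product written with plain matrix multiplication.
S_ref = (V @ W1) @ Fs
```

If this reading is right, the only open point is where V comes from: random initialization as above, or pretrained GloVe vectors for the attribute names.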
Could you clarify how the GloVe vectors are used?
Regards,
Yifan