I am trying to get the attention weights of the model like this:
```python
outputs = self.model(
    vision_x=vision_x,
    lang_x=lang_x,
    attention_mask=attention_mask,
    clear_conditioned_layers=clear_conditioned_layers,
    past_key_values=past_key_values,
    use_cache=(past_key_values is not None),
    output_attentions=True,
)
```
However, the attention weights it returns are a tuple of `None` values.
I stepped into the code and found what might be a bug in the MPT code under "huggingface/modules/transformers_modules/": the `output_attentions` parameter is dropped when `MPTBlock.forward()` is called in blocks.py, so the attention weights are never requested or collected.
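For context, here is a self-contained sketch of the pattern that appears to be missing: the `output_attentions` flag has to be threaded from the model's `forward()` through each block down to the attention layer, otherwise the collected weights stay `None`. The class and parameter names below are illustrative toy stand-ins, not the actual MPT identifiers.

```python
# Toy illustration of threading output_attentions through blocks.
# Names here (ToyAttention, ToyBlock, ToyModel) are hypothetical and
# only demonstrate the pattern; they are not the MPT implementation.
import torch
import torch.nn as nn


class ToyAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, x, output_attentions: bool = False):
        out, weights = self.attn(x, x, x, need_weights=output_attentions)
        return out, weights if output_attentions else None


class ToyBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.attn = ToyAttention(d_model, n_heads)

    def forward(self, x, output_attentions: bool = False):
        # If this flag is not forwarded (the omission described above),
        # the block silently returns None for the attention weights.
        attn_out, weights = self.attn(x, output_attentions=output_attentions)
        return x + attn_out, weights


class ToyModel(nn.Module):
    def __init__(self, d_model: int = 16, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.blocks = nn.ModuleList(
            ToyBlock(d_model, n_heads) for _ in range(n_layers)
        )

    def forward(self, x, output_attentions: bool = False):
        all_attns = () if output_attentions else None
        for block in self.blocks:
            x, weights = block(x, output_attentions=output_attentions)
            if output_attentions:
                all_attns = all_attns + (weights,)
        return x, all_attns


if __name__ == "__main__":
    model = ToyModel()
    hidden, attns = model(torch.randn(1, 8, 16), output_attentions=True)
    # With the flag threaded through, attns holds weight tensors
    # rather than a tuple of None.
    print([a.shape for a in attns])
```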
I tried to fix this bug, but when I run the model again the code reverts to its original version.
Is there any solution to this?