Hi, thanks for developing such a wonderful project.
I found that torch.nn.functional.scaled_dot_product_attention
throws an error when both attn_mask and is_causal are set.
However, the current language_model.py code passes both:
nanoVLM/models/language_model.py
Line 141 in 6ba9082
A simple fix is to build the causal mask yourself, but if there is a better way, I'd like to know.
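For reference, here is a minimal sketch of the "build the causal mask yourself" approach: construct an explicit lower-triangular boolean mask, AND it with the existing attention mask, and pass the combined mask with `is_causal=False`. The helper name `causal_sdpa` and its signature are illustrative, not taken from the repo.

```python
import torch
import torch.nn.functional as F

def causal_sdpa(q, k, v, attn_mask=None):
    # SDPA raises an error if attn_mask is set together with is_causal=True,
    # so we materialize the causal mask ourselves instead.
    L, S = q.size(-2), k.size(-2)
    causal = torch.tril(torch.ones(L, S, dtype=torch.bool, device=q.device))
    if attn_mask is not None:
        # A position may attend only where BOTH masks allow it
        # (boolean masks in SDPA: True = keep, False = mask out).
        causal = causal & attn_mask
    return F.scaled_dot_product_attention(
        q, k, v, attn_mask=causal, is_causal=False
    )
```

For self-attention (where query and key lengths match), this should produce the same result as calling SDPA with `is_causal=True` and no mask.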