
DL-DIY potential project ideas

  • read the paper and understand the method
  • take a pretrained model from HuggingFace and test the approach (see the sketch below)
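
A minimal sketch of the second idea, assuming the `transformers` library and the `bert-base-uncased` checkpoint (any encoder that returns attentions would work): load a pretrained model and collect the per-layer attention matrices that a post hoc method like Attention Rollout operates on.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # assumption: any HuggingFace encoder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_attentions=True)
model.eval()

inputs = tokenizer("Attention flow is a post hoc explanation method.",
                   return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions is a tuple with one tensor per layer,
# each of shape (batch, num_heads, seq_len, seq_len).
attentions = torch.stack(outputs.attentions)  # (num_layers, batch, heads, seq, seq)
print(attentions.shape)
```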

Attention flow

This repository contains implementations of the Attention Rollout and Attention Flow algorithms, which are post hoc methods for obtaining more explanatory attention weights.

Attention Rollout and Attention Flow recursively compute the token attentions in each layer of a given model, using the embedding attentions as input. They differ in the assumptions they make about how attention weights in lower layers affect the flow of information to the higher layers, and in whether the token attentions are computed relative to each other or independently.
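
As an illustration, here is a minimal sketch of the Attention Rollout recursion: head-averaged attention matrices with an identity term mixed in to account for residual connections, multiplied layer by layer. It assumes an `attentions` tensor of shape `(num_layers, heads, seq_len, seq_len)`, e.g. the stacked attentions from the sketch above with the batch dimension squeezed out; it is not the repository's implementation, just an outline of the idea.

```python
import torch

def attention_rollout(attentions):
    """Sketch of Attention Rollout.

    attentions: tensor of shape (num_layers, heads, seq_len, seq_len)
    returns: tensor of shape (num_layers, seq_len, seq_len) with the
    rolled-out attention of each layer's tokens over the input tokens.
    """
    # Average over heads and mix in the identity to account for residual
    # connections; each row still sums to one.
    att = attentions.mean(dim=1)
    eye = torch.eye(att.size(-1))
    att = 0.5 * att + 0.5 * eye

    # Recursively propagate attention through the lower layers.
    rollout = [att[0]]
    for layer in range(1, att.size(0)):
        rollout.append(att[layer] @ rollout[-1])
    return torch.stack(rollout)

# Example, reusing the attentions from the sketch above (batch size 1):
# rolled = attention_rollout(attentions.squeeze(1))
```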

Here is the paper introducing these methods: Quantifying Attention Flow in Transformers (Abnar & Zuidema, ACL 2020).

Related projects: