Description
As it stands, only the attention primitives `Attention` and `GlobalAttention` are TorchScript-ed (or, for that matter, TorchScript-able) during inference. For better runtimes and memory allocation, more of the network's modules (especially in the Evoformer) should be made compatible with TorchScript. In my estimation, the biggest obstacle to this goal is the inference-time chunking functionality, which currently makes heavy use of function pointers that TorchScript does not support.
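To make the hurdle concrete, here is a minimal sketch of the higher-order chunking pattern in question. The function `chunk_apply` below is hypothetical (a simplified stand-in, not the repo's actual chunking utility): it takes a function value `fn` and applies it to fixed-size slices of the input, which is exactly the kind of callable-as-argument code that `torch.jit.script` cannot compile directly. I have kept it framework-free so the pattern itself is clear.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
U = TypeVar("U")

def chunk_apply(
    fn: Callable[[Sequence[T]], List[U]],
    inputs: Sequence[T],
    chunk_size: int,
) -> List[U]:
    """Apply fn to fixed-size chunks of inputs and concatenate the results.

    This mirrors the shape of inference-time chunking: rather than running
    fn over the whole input at once, we trade compute for memory by slicing.
    TorchScript cannot script a function like this one, because fn is a
    first-class function value passed as an argument.
    """
    out: List[U] = []
    for start in range(0, len(inputs), chunk_size):
        # Each chunk is processed independently; peak memory is bounded
        # by chunk_size rather than len(inputs).
        out.extend(fn(inputs[start : start + chunk_size]))
    return out

# Example: double every element, two at a time.
doubled = chunk_apply(lambda xs: [x * 2 for x in xs], [1, 2, 3, 4, 5], 2)
```

Possible TorchScript-friendly workarounds include replacing the callable with a `torch.jit.interface`-annotated module, or specializing the chunking logic per module so no function value crosses the scripted boundary; either way, the generic higher-order version above would need restructuring.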