Open
Description
In the paper, using CCM can improve performance. In the open review system, the authors comment "While it is true that CCM is a single linear layer, it can still strongly modify how information is provided to the downstream discriminator as its weights stay fixed. "
I've read the description of CCM many times and still can't understand why random projection is so important? Even if we don't fuse features across layers. Is there related literature or more insight? Thank you everyone.
Metadata
Metadata
Assignees
Labels
No labels