Open
Description
🚀 The feature, motivation and pitch
The Falcon H1 MLP block applies multipliers to the gate projection output and to the down projection output. Liger Kernel does not currently support this.
Can we add support for a SwiGLU MLP Liger kernel with gate and down projection multipliers?
class FalconH1MLP(nn.Module):
    def __init__(self, config: FalconH1Config):
        super().__init__()
        self.config = config
        self.hidden_size = config.hidden_size
        self.intermediate_size = config.intermediate_size
        self.gate_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=config.mlp_bias)
        self.up_proj = nn.Linear(self.hidden_size, self.intermediate_size, bias=config.mlp_bias)
        self.down_proj = nn.Linear(self.intermediate_size, self.hidden_size, bias=config.mlp_bias)
        self.act_fn = ACT2FN[config.hidden_act]
        self.gate_multiplier, self.down_multiplier = config.mlp_multipliers

    def forward(self, x):
        y = self.up_proj(x) * self.act_fn(self.gate_proj(x) * self.gate_multiplier)
        y = self.down_proj(y) * self.down_multiplier
        return y
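For reference, here is a rough plain-PyTorch sketch of one possible workaround, not a Liger API: if both multipliers are constant scalars, they can be folded into the gate_proj and down_proj parameters so that an unmodified SwiGLU forward gives the same output. The names ReferenceMLP, PlainSwiGLUMLP and fold_multipliers are hypothetical, and SiLU is assumed for config.hidden_act. Folding changes the parameterization, so gradients with respect to the folded weights are scaled differently; this is probably only suitable for inference or as a numerical reference, and a real kernel would likely need to take the multipliers as arguments.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ReferenceMLP(nn.Module):
    """Stand-in for the FalconH1MLP forward above (assumes SiLU activation)."""

    def __init__(self, hidden_size, intermediate_size,
                 gate_multiplier, down_multiplier, bias=False):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=bias)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=bias)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=bias)
        self.gate_multiplier = gate_multiplier
        self.down_multiplier = down_multiplier

    def forward(self, x):
        y = self.up_proj(x) * F.silu(self.gate_proj(x) * self.gate_multiplier)
        return self.down_proj(y) * self.down_multiplier


class PlainSwiGLUMLP(nn.Module):
    """Ordinary SwiGLU MLP without multipliers (hypothetical helper)."""

    def __init__(self, hidden_size, intermediate_size, bias=False):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=bias)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=bias)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=bias)

    def forward(self, x):
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))


@torch.no_grad()
def fold_multipliers(ref):
    """Return a PlainSwiGLUMLP whose weights absorb the two multipliers."""
    w = ref.gate_proj.weight
    folded = PlainSwiGLUMLP(
        ref.gate_proj.in_features,
        ref.gate_proj.out_features,
        bias=ref.gate_proj.bias is not None,
    ).to(device=w.device, dtype=w.dtype)
    # Parameter names match, so the weights (and biases) copy over directly.
    folded.load_state_dict(ref.state_dict())
    # (g * W) x + g * b == g * (W x + b), so scale weight and bias together.
    for param in folded.gate_proj.parameters():
        param.mul_(ref.gate_multiplier)
    for param in folded.down_proj.parameters():
        param.mul_(ref.down_multiplier)
    return folded
```

Calling fold_multipliers on a ReferenceMLP and comparing both modules on the same input with torch.allclose is a quick way to check the equivalence for a given set of multipliers.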
Alternatives
No response
Additional context
No response