Another MLP implementation along with multiplier support #937

@Tcc0403

🚀 The feature, motivation and pitch

Currently, LigerMLP modules fuse only the SwiGLU/GeGLU elementwise computation and leave the matmuls untouched. These elementwise operations (including the multiplier support from #936) could instead be fused into the matmuls' epilogues. We can benchmark this approach and decide whether to adopt it.
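To make "fused into the epilogue" concrete, here is a minimal, dependency-free sketch in plain Python (not a Triton kernel, and not Liger's implementation). The helper names `dot`, `unfused`, and `fused_epilogue` are hypothetical, weights use the `nn.Linear` layout `[out_features][in_features]`, and the multiplier is left out because its exact placement is not specified here. In the fused version, the activation and the multiply by `up_states` are applied to each accumulator before it is stored, so the `gate_states` buffer is never materialized.

```python
import math

def silu(v):
    # SiLU activation: v * sigmoid(v)
    return v / (1.0 + math.exp(-v))

def dot(a, b):
    # Plain left-to-right accumulation, standing in for a matmul inner loop
    acc = 0.0
    for ak, bk in zip(a, b):
        acc += ak * bk
    return acc

def unfused(x, w_gate, up_states):
    # Step 1: full matmul, intermediate gate_states is written out
    gate_states = [[dot(row, w_row) for w_row in w_gate] for row in x]
    # Step 2: separate elementwise pass reads gate_states back
    return [
        [silu(g) * u for g, u in zip(g_row, u_row)]
        for g_row, u_row in zip(gate_states, up_states)
    ]

def fused_epilogue(x, w_gate, up_states):
    out = []
    for row, u_row in zip(x, up_states):
        out_row = []
        for w_row, u in zip(w_gate, u_row):
            acc = dot(row, w_row)
            # Epilogue: transform the accumulator before the store,
            # so no intermediate gate_states buffer exists
            out_row.append(silu(acc) * u)
        out.append(out_row)
    return out

# Toy inputs, chosen only for illustration
x = [[1.0, 2.0], [0.5, -1.0]]
w_gate = [[0.1, 0.2], [-0.3, 0.4], [0.5, 0.5]]
up_states = [[1.0, 2.0, 3.0], [-1.0, 0.5, 2.0]]

a = unfused(x, w_gate, up_states)
b = fused_epilogue(x, w_gate, up_states)
print(all(math.isclose(p, q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)))
```

Both functions compute the same values; the difference a real kernel would exploit is the memory traffic saved by skipping the intermediate write and read.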

TL;DR

Instead of

gate_states = self.gate_proj(x)
up_states = self.up_proj(x)
intermediate_states = LigerSiLUMulFunction.apply(gate_states, up_states)
return self.down_proj(intermediate_states)

There are some other approaches worth exploring:

  1. Fuse the activation (and multiplier) into gate_proj(x):

     up_states = self.up_proj(x)
     intermediate_states = LigerFusedLinearActMultiplierFunction.apply(x, self.gate_proj.weight, gate_multiplier, up_states)
     return self.down_proj(intermediate_states)

  2. Stack the gate and up projections into one matmul, then split the result inside the activation function:

     gate_up_states = self.gate_up_proj(x)
     intermediate_states = LigerSplitStatesActMultiplierFunction.apply(
         gate_up_states,
         config.hidden_act,
         gate_multiplier,
     )
     return self.down_proj(intermediate_states)

  3. Dual GEMM with the activation (and multiplier):

     intermediate_states = LigerDualGemmActMulFunction.apply(
         x,
         self.gate_proj.weight,
         self.up_proj.weight,
         config.hidden_act,
         gate_multiplier,
     )
     return self.down_proj(intermediate_states)
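As a sanity check for the stacked gate/up variant, the sketch below uses plain PyTorch ops (no Liger kernels) to confirm that one matmul over concatenated weights followed by a split matches two separate projections. The sizes, the helper names `separate_swiglu` and `stacked_swiglu`, and the `[gate | up]` stacking order are assumptions made for illustration; the multiplier is omitted because its placement is not specified here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
hidden_size, intermediate_size = 8, 16  # toy sizes, for illustration only

# float64 keeps rounding differences between the two paths negligible
gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False, dtype=torch.float64)
up_proj = nn.Linear(hidden_size, intermediate_size, bias=False, dtype=torch.float64)

# Stack along the output dimension so a single GEMM yields [gate | up]
gate_up_proj = nn.Linear(hidden_size, 2 * intermediate_size, bias=False, dtype=torch.float64)
with torch.no_grad():
    gate_up_proj.weight.copy_(torch.cat([gate_proj.weight, up_proj.weight], dim=0))

def separate_swiglu(x):
    # Baseline: two matmuls, then SiLU(gate) * up
    return F.silu(gate_proj(x)) * up_proj(x)

def stacked_swiglu(x):
    # One matmul, then split the last dimension back into gate and up
    gate, up = gate_up_proj(x).chunk(2, dim=-1)
    return F.silu(gate) * up

x = torch.randn(4, hidden_size, dtype=torch.float64)
print(torch.allclose(separate_swiglu(x), stacked_swiglu(x)))
```

This only establishes numerical equivalence of the formulations. Whether the stacked or dual-GEMM variant is actually faster depends on the kernel and shapes, and would need benchmarking.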
