assert attn_channel_group_size % num_key_value_groups == 0 and (attn_channel_group_size // num_key_value_groups) % head_size == 0

When applying MultiPruner to qwen2.5-0.5b, the assertion above forces attn_channel_group_size to be at least 448. The model has num_attention_heads: 14, num_key_value_heads: 2 (so num_key_value_groups = 14 / 2 = 7), and head_size: 64, and the smallest value that satisfies both conditions is 7 * 64 = 448. Since the attention layer has 14 * 64 = 896 channels in total, this leaves only two groups, which seems very coarse for pruning. Could having so few groups significantly affect the results? Are there alternative approaches?
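For reference, here is a small sketch of the arithmetic behind the question. It assumes that num_key_value_groups is num_attention_heads // num_key_value_heads and that the groups must evenly partition the num_attention_heads * head_size attention channels. The helper name valid_attn_group_sizes is hypothetical and not part of MultiPruner.

```python
def valid_attn_group_sizes(num_attention_heads, num_key_value_heads, head_size):
    """Hypothetical helper: list (group_size, num_groups) pairs that pass the
    assertion and, by assumption, evenly divide the attention width."""
    # Assumption: query heads per key/value head, as in grouped-query attention.
    num_key_value_groups = num_attention_heads // num_key_value_heads
    # Assumption: the groups partition all query channels.
    total_channels = num_attention_heads * head_size
    # Smallest size that can satisfy both conditions of the assertion.
    step = num_key_value_groups * head_size

    sizes = []
    for size in range(step, total_channels + 1, step):
        passes_assert = (
            size % num_key_value_groups == 0
            and (size // num_key_value_groups) % head_size == 0
        )
        if passes_assert and total_channels % size == 0:
            sizes.append((size, total_channels // size))
    return sizes


# qwen2.5-0.5b values quoted above
print(valid_attn_group_sizes(14, 2, 64))  # [(448, 2), (896, 1)]
```

Under these assumptions, 448 and 896 are the only sizes that pass, giving two groups or one group respectively.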