-
-
Notifications
You must be signed in to change notification settings - Fork 11.6k
[Core] Rename PassConfig flags as per RFC #27995 #29646
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
👋 Hi! Thank you for contributing to the vLLM project. 💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels. Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run You ask your reviewers to trigger select CI tests on top of Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging. To run CI, PR reviewers can either: Add If you have any questions, please reach out to us on Slack at https://slack.vllm.ai. 🚀 |
|
This pull request has merge conflicts that must be resolved before it can be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request refactors PassConfig flags to be more descriptive, which is a great improvement for clarity. The changes are mostly mechanical renames across the test suite and configuration files. However, I've found a critical issue in the backward compatibility logic for the deprecated flags in vllm/config/compilation.py. The current implementation incorrectly handles cases where old flags are set to False, which would unintentionally enable features. I've provided a suggestion to fix this to ensure correct behavior.
vllm/config/compilation.py
Outdated
| if self.enable_fusion is not None: | ||
| logger.warning_once( | ||
| "enable_fusion is deprecated and will be removed in a future release. " | ||
| "Use fuse_norm_quant and fuse_act_quant instead." | ||
| ) | ||
| if not self.fuse_norm_quant: | ||
| self.fuse_norm_quant = True | ||
| if not self.fuse_act_quant: | ||
| self.fuse_act_quant = True | ||
|
|
||
| if self.enable_attn_fusion is not None: | ||
| logger.warning_once( | ||
| "enable_attn_fusion is deprecated and will be removed in " | ||
| "a future release. Use fuse_attn_quant instead." | ||
| ) | ||
| if not self.fuse_attn_quant: | ||
| self.fuse_attn_quant = True | ||
|
|
||
| if self.enable_noop is not None: | ||
| logger.warning_once( | ||
| "enable_noop is deprecated and will be removed in a future release. " | ||
| "Use eliminate_noops instead." | ||
| ) | ||
| if not self.eliminate_noops: | ||
| self.eliminate_noops = True | ||
|
|
||
| if self.enable_sequence_parallelism is not None: | ||
| logger.warning_once( | ||
| "enable_sequence_parallelism is deprecated and will be " | ||
| "removed in a future release. Use enable_sp instead." | ||
| ) | ||
| if not self.enable_sp: | ||
| self.enable_sp = True | ||
|
|
||
| if self.enable_async_tp is not None: | ||
| logger.warning_once( | ||
| "enable_async_tp is deprecated and will be removed in " | ||
| "a future release. Use fuse_gemm_comms instead." | ||
| ) | ||
| if not self.fuse_gemm_comms: | ||
| self.fuse_gemm_comms = True | ||
|
|
||
| if self.enable_fi_allreduce_fusion is not None: | ||
| logger.warning_once( | ||
| "enable_fi_allreduce_fusion is deprecated and will be " | ||
| "removed in a future release. Use fuse_allreduce_rms instead." | ||
| ) | ||
| if not self.fuse_allreduce_rms: | ||
| self.fuse_allreduce_rms = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current backward compatibility logic for deprecated flags is incorrect. If an old flag is explicitly set to False (e.g., enable_fusion=False), the corresponding new flag(s) will be set to True. This is because the check if not self.new_flag: will be true for the default False value, and then it's set to True regardless of the old flag's value.
This can lead to unexpected behavior and bugs where features are enabled when they should be disabled.
The logic should be to transfer the boolean value of the old flag to the new flag(s) when the old flag is provided. This ensures that False values are respected.
if self.enable_fusion is not None:
logger.warning_once(
"enable_fusion is deprecated and will be removed in a future release. "
"Use fuse_norm_quant and fuse_act_quant instead."
)
self.fuse_norm_quant = self.enable_fusion
self.fuse_act_quant = self.enable_fusion
if self.enable_attn_fusion is not None:
logger.warning_once(
"enable_attn_fusion is deprecated and will be removed in "
"a future release. Use fuse_attn_quant instead."
)
self.fuse_attn_quant = self.enable_attn_fusion
if self.enable_noop is not None:
logger.warning_once(
"enable_noop is deprecated and will be removed in a future release. "
"Use eliminate_noops instead."
)
self.eliminate_noops = self.enable_noop
if self.enable_sequence_parallelism is not None:
logger.warning_once(
"enable_sequence_parallelism is deprecated and will be "
"removed in a future release. Use enable_sp instead."
)
self.enable_sp = self.enable_sequence_parallelism
if self.enable_async_tp is not None:
logger.warning_once(
"enable_async_tp is deprecated and will be removed in "
"a future release. Use fuse_gemm_comms instead."
)
self.fuse_gemm_comms = self.enable_async_tp
if self.enable_fi_allreduce_fusion is not None:
logger.warning_once(
"enable_fi_allreduce_fusion is deprecated and will be "
"removed in a future release. Use fuse_allreduce_rms instead."
)
self.fuse_allreduce_rms = self.enable_fi_allreduce_fusionThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codex Review
Here are some automated review suggestions for this pull request.
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
vllm/config/compilation.py
Outdated
| if not self.fuse_norm_quant: | ||
| self.fuse_norm_quant = True | ||
| if not self.fuse_act_quant: | ||
| self.fuse_act_quant = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honor false deprecated pass flags
In PassConfig.__post_init__, the deprecated flag handling treats any specified legacy flag as opt-in: inside the enable_fusion block the new fuse_norm_quant/fuse_act_quant flags are forced to True regardless of whether the deprecated flag was explicitly set to False, and the same pattern is repeated for the other deprecated flags. A user setting -O.pass_config.enable_fusion=false (or enable_async_tp=false, etc.) now gets the corresponding fusions/async TP enabled anyway, which inverts the requested behavior and can trigger unwanted compilation paths. The mapping should respect False values, not just non-None.
Useful? React with 👍 / 👎.
vllm/config/compilation.py
Outdated
| if self.enable_fusion is not None: | ||
| logger.warning_once( | ||
| "enable_fusion is deprecated and will be removed in a future release. " | ||
| "Use fuse_norm_quant and fuse_act_quant instead." | ||
| ) | ||
| if not self.fuse_norm_quant: | ||
| self.fuse_norm_quant = self.enable_fusion | ||
| if not self.fuse_act_quant: | ||
| self.fuse_act_quant = self.enable_fusion | ||
|
|
||
| if self.enable_attn_fusion is not None: | ||
| logger.warning_once( | ||
| "enable_attn_fusion is deprecated and will be removed in " | ||
| "a future release. Use fuse_attn_quant instead." | ||
| ) | ||
| if not self.fuse_attn_quant: | ||
| self.fuse_attn_quant = self.enable_attn_fusion | ||
|
|
||
| if self.enable_noop is not None: | ||
| logger.warning_once( | ||
| "enable_noop is deprecated and will be removed in a future release. " | ||
| "Use eliminate_noops instead." | ||
| ) | ||
| if not self.eliminate_noops: | ||
| self.eliminate_noops = self.enable_noop | ||
|
|
||
| if self.enable_sequence_parallelism is not None: | ||
| logger.warning_once( | ||
| "enable_sequence_parallelism is deprecated and will be " | ||
| "removed in a future release. Use enable_sp instead." | ||
| ) | ||
| if not self.enable_sp: | ||
| self.enable_sp = self.enable_sequence_parallelism | ||
|
|
||
| if self.enable_async_tp is not None: | ||
| logger.warning_once( | ||
| "enable_async_tp is deprecated and will be removed in " | ||
| "a future release. Use fuse_gemm_comms instead." | ||
| ) | ||
| if not self.fuse_gemm_comms: | ||
| self.fuse_gemm_comms = self.enable_async_tp | ||
|
|
||
| if self.enable_fi_allreduce_fusion is not None: | ||
| logger.warning_once( | ||
| "enable_fi_allreduce_fusion is deprecated and will be " | ||
| "removed in a future release. Use fuse_allreduce_rms instead." | ||
| ) | ||
| if not self.fuse_allreduce_rms: | ||
| self.fuse_allreduce_rms = self.enable_fi_allreduce_fusion |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a utility to vllm/config/utils.py that takes the old name, the new name and the removal version? We do this often enough that a utility like this would be nice to have.
i.e. in utills:
def handle_deprecated(config: ConfigT, old_name, new_name, removal_version):
old_val = getattr(config, old_name)
if old_val is None:
return
logger.warning(
"%s is deprecated and will be removed in %s. Use %s instead",
old_name, removal_version, new_name,
)
setattr(config, new_name, old_val)and then here:
handle_deprecated(self, "enable_noop", "eliminate_noops", "v0.13.0")
...Signed-off-by: arpitkh101 <[email protected]>
15411fa to
0c4eb98
Compare
Signed-off-by: arpitkh101 <[email protected]>
ece81a1 to
289336c
Compare
Purpose
Implements the PassConfig flag renaming from issue #27995.
Changes: