Skip to content

Conversation

@arpitkh101
Copy link

@arpitkh101 arpitkh101 commented Nov 28, 2025

Purpose

Implements the PassConfig flag renaming from issue #27995.

Changes:

  • Renamed all PassConfig flags to be more descriptive (e.g., enable_fusion → fuse_norm_quant + fuse_act_quant)
  • Added deprecation warnings for old flag names
  • Old flags still work for backward compatibility
  • Updated all internal code and tests to use new names

@github-actions
Copy link

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors.

You ask your reviewers to trigger select CI tests on top of fastcheck CI.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

If you have any questions, please reach out to us on Slack at https://slack.vllm.ai.

🚀

@mergify mergify bot added the v1 label Nov 28, 2025
@mergify
Copy link

mergify bot commented Nov 28, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @arpitkh101.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Nov 28, 2025
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors PassConfig flags to be more descriptive, which is a great improvement for clarity. The changes are mostly mechanical renames across the test suite and configuration files. However, I've found a critical issue in the backward compatibility logic for the deprecated flags in vllm/config/compilation.py. The current implementation incorrectly handles cases where old flags are set to False, which would unintentionally enable features. I've provided a suggestion to fix this to ensure correct behavior.

Comment on lines 190 to 238
if self.enable_fusion is not None:
logger.warning_once(
"enable_fusion is deprecated and will be removed in a future release. "
"Use fuse_norm_quant and fuse_act_quant instead."
)
if not self.fuse_norm_quant:
self.fuse_norm_quant = True
if not self.fuse_act_quant:
self.fuse_act_quant = True

if self.enable_attn_fusion is not None:
logger.warning_once(
"enable_attn_fusion is deprecated and will be removed in "
"a future release. Use fuse_attn_quant instead."
)
if not self.fuse_attn_quant:
self.fuse_attn_quant = True

if self.enable_noop is not None:
logger.warning_once(
"enable_noop is deprecated and will be removed in a future release. "
"Use eliminate_noops instead."
)
if not self.eliminate_noops:
self.eliminate_noops = True

if self.enable_sequence_parallelism is not None:
logger.warning_once(
"enable_sequence_parallelism is deprecated and will be "
"removed in a future release. Use enable_sp instead."
)
if not self.enable_sp:
self.enable_sp = True

if self.enable_async_tp is not None:
logger.warning_once(
"enable_async_tp is deprecated and will be removed in "
"a future release. Use fuse_gemm_comms instead."
)
if not self.fuse_gemm_comms:
self.fuse_gemm_comms = True

if self.enable_fi_allreduce_fusion is not None:
logger.warning_once(
"enable_fi_allreduce_fusion is deprecated and will be "
"removed in a future release. Use fuse_allreduce_rms instead."
)
if not self.fuse_allreduce_rms:
self.fuse_allreduce_rms = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

The current backward compatibility logic for deprecated flags is incorrect. If an old flag is explicitly set to False (e.g., enable_fusion=False), the corresponding new flag(s) will be set to True. This is because the check if not self.new_flag: will be true for the default False value, and then it's set to True regardless of the old flag's value.

This can lead to unexpected behavior and bugs where features are enabled when they should be disabled.

The logic should be to transfer the boolean value of the old flag to the new flag(s) when the old flag is provided. This ensures that False values are respected.

        if self.enable_fusion is not None:
            logger.warning_once(
                "enable_fusion is deprecated and will be removed in a future release. "
                "Use fuse_norm_quant and fuse_act_quant instead."
            )
            self.fuse_norm_quant = self.enable_fusion
            self.fuse_act_quant = self.enable_fusion

        if self.enable_attn_fusion is not None:
            logger.warning_once(
                "enable_attn_fusion is deprecated and will be removed in "
                "a future release. Use fuse_attn_quant instead."
            )
            self.fuse_attn_quant = self.enable_attn_fusion

        if self.enable_noop is not None:
            logger.warning_once(
                "enable_noop is deprecated and will be removed in a future release. "
                "Use eliminate_noops instead."
            )
            self.eliminate_noops = self.enable_noop

        if self.enable_sequence_parallelism is not None:
            logger.warning_once(
                "enable_sequence_parallelism is deprecated and will be "
                "removed in a future release. Use enable_sp instead."
            )
            self.enable_sp = self.enable_sequence_parallelism

        if self.enable_async_tp is not None:
            logger.warning_once(
                "enable_async_tp is deprecated and will be removed in "
                "a future release. Use fuse_gemm_comms instead."
            )
            self.fuse_gemm_comms = self.enable_async_tp

        if self.enable_fi_allreduce_fusion is not None:
            logger.warning_once(
                "enable_fi_allreduce_fusion is deprecated and will be "
                "removed in a future release. Use fuse_allreduce_rms instead."
            )
            self.fuse_allreduce_rms = self.enable_fi_allreduce_fusion

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 195 to 198
if not self.fuse_norm_quant:
self.fuse_norm_quant = True
if not self.fuse_act_quant:
self.fuse_act_quant = True

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Honor false deprecated pass flags

In PassConfig.__post_init__, the deprecated flag handling treats any specified legacy flag as opt-in: inside the enable_fusion block the new fuse_norm_quant/fuse_act_quant flags are forced to True regardless of whether the deprecated flag was explicitly set to False, and the same pattern is repeated for the other deprecated flags. A user setting -O.pass_config.enable_fusion=false (or enable_async_tp=false, etc.) now gets the corresponding fusions/async TP enabled anyway, which inverts the requested behavior and can trigger unwanted compilation paths. The mapping should respect False values, not just non-None.

Useful? React with 👍 / 👎.

Comment on lines 190 to 260
if self.enable_fusion is not None:
logger.warning_once(
"enable_fusion is deprecated and will be removed in a future release. "
"Use fuse_norm_quant and fuse_act_quant instead."
)
if not self.fuse_norm_quant:
self.fuse_norm_quant = self.enable_fusion
if not self.fuse_act_quant:
self.fuse_act_quant = self.enable_fusion

if self.enable_attn_fusion is not None:
logger.warning_once(
"enable_attn_fusion is deprecated and will be removed in "
"a future release. Use fuse_attn_quant instead."
)
if not self.fuse_attn_quant:
self.fuse_attn_quant = self.enable_attn_fusion

if self.enable_noop is not None:
logger.warning_once(
"enable_noop is deprecated and will be removed in a future release. "
"Use eliminate_noops instead."
)
if not self.eliminate_noops:
self.eliminate_noops = self.enable_noop

if self.enable_sequence_parallelism is not None:
logger.warning_once(
"enable_sequence_parallelism is deprecated and will be "
"removed in a future release. Use enable_sp instead."
)
if not self.enable_sp:
self.enable_sp = self.enable_sequence_parallelism

if self.enable_async_tp is not None:
logger.warning_once(
"enable_async_tp is deprecated and will be removed in "
"a future release. Use fuse_gemm_comms instead."
)
if not self.fuse_gemm_comms:
self.fuse_gemm_comms = self.enable_async_tp

if self.enable_fi_allreduce_fusion is not None:
logger.warning_once(
"enable_fi_allreduce_fusion is deprecated and will be "
"removed in a future release. Use fuse_allreduce_rms instead."
)
if not self.fuse_allreduce_rms:
self.fuse_allreduce_rms = self.enable_fi_allreduce_fusion
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a utility to vllm/config/utils.py that takes the old name, the new name and the removal version? We do this often enough that a utility like this would be nice to have.

i.e. in utills:

def handle_deprecated(config: ConfigT, old_name, new_name, removal_version):
   old_val = getattr(config, old_name)
    if old_val is None:
        return
    logger.warning(
        "%s is deprecated and will be removed in %s. Use %s instead",
        old_name, removal_version, new_name,
    )
    setattr(config, new_name, old_val)

and then here:

        handle_deprecated(self, "enable_noop", "eliminate_noops", "v0.13.0")
        ...

Signed-off-by: arpitkh101 <[email protected]>
@arpitkh101 arpitkh101 force-pushed the rename-passconfig-flags branch from 15411fa to 0c4eb98 Compare November 29, 2025 04:22
@mergify mergify bot removed the needs-rebase label Nov 29, 2025
@arpitkh101 arpitkh101 force-pushed the rename-passconfig-flags branch from ece81a1 to 289336c Compare November 29, 2025 05:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants