📝 Walkthrough

Configuration values updated in attention sparsity settings: the target sparse ratio is reduced from 0.9 to 0.5 for both prefill and decode modes, and the maximum sequence length is reduced from 65536 to 16384.
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks: ✅ 3 passed
🧹 Nitpick comments (1)
modelopt/torch/sparsity/attention_sparsity/config.py (1)
Line 404: `max_seqlen` in `SKIP_SOFTMAX_CALIB` now diverges from `CalibrationConfig`'s default.

`SKIP_SOFTMAX_CALIB` sets `max_seqlen = 16384`, but `CalibrationConfig.max_seqlen` still defaults to `32768` (line 187). A user constructing `CalibrationConfig()` directly will get a different effective ceiling than a user relying on `SKIP_SOFTMAX_CALIB`. Consider aligning the two, or add a comment to `SKIP_SOFTMAX_CALIB` noting the deliberate divergence.

Additionally, for models commonly used with sequences > 16384 tokens (e.g., 32K/128K-context variants), the exponential threshold model will be extrapolating beyond its calibrated range, which may degrade calibration quality at those lengths.
💡 Aligning `CalibrationConfig.max_seqlen` default with `SKIP_SOFTMAX_CALIB`
```diff
 max_seqlen: int = ModeloptField(
-    default=32768,
+    default=16384,
     title="Maximum sequence length",
     description="Maximum sequence length for calibration (length bins auto-generated as powers of 2).",
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modelopt/torch/sparsity/attention_sparsity/config.py` at line 404, SKIP_SOFTMAX_CALIB sets "max_seqlen = 16384" but CalibrationConfig.max_seqlen defaults to 32768, causing inconsistent ceilings; update them to match or document the intentional divergence. Fix by either (A) changing CalibrationConfig.max_seqlen default to 16384 to align with SKIP_SOFTMAX_CALIB, or (B) updating the SKIP_SOFTMAX_CALIB entry to set max_seqlen = CalibrationConfig.max_seqlen (or add a comment on why 16384 is intentionally lower), and add a comment on the extrapolation risk for contexts >16384 so callers know calibration may degrade; reference the identifiers SKIP_SOFTMAX_CALIB and CalibrationConfig.max_seqlen when applying the change.
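A minimal sketch of option (B) above — deriving the preset value from the config default so the two can never drift apart. The class and dict here are simplified, hypothetical stand-ins for the real `CalibrationConfig` and `SKIP_SOFTMAX_CALIB` (the actual code uses `ModeloptField`, which this sketch does not reproduce):

```python
# Hypothetical, simplified stand-ins for the real ModelOpt definitions,
# illustrating how to keep SKIP_SOFTMAX_CALIB and CalibrationConfig in sync.
from dataclasses import dataclass, fields


@dataclass
class CalibrationConfig:
    # Default ceiling for calibration; length bins are auto-generated
    # as powers of 2 up to this value.
    max_seqlen: int = 16384  # aligned with SKIP_SOFTMAX_CALIB (was 32768)


def _default_max_seqlen() -> int:
    """Read the dataclass default so the preset references a single source of truth."""
    (field,) = [f for f in fields(CalibrationConfig) if f.name == "max_seqlen"]
    return field.default


# Option (B): the preset derives its ceiling from the config default
# instead of hard-coding 16384 in two places.
SKIP_SOFTMAX_CALIB = {
    "target_sparse_ratio": {"prefill": 0.5, "decode": 0.5},
    "max_seqlen": _default_max_seqlen(),
}

# The two ceilings now agree by construction.
assert SKIP_SOFTMAX_CALIB["max_seqlen"] == CalibrationConfig().max_seqlen
```

With this shape, changing the `CalibrationConfig` default automatically updates the preset, removing the class of divergence flagged in the review.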
Summary by CodeRabbit

- Set default skip softmax attention sparsity settings (in `SKIP_SOFTMAX_CALIB`) to correct values: `max_seqlen` 16384 and prefill/decode target sparsity ratios of 0.5.