@VimalWill

This PR adds a check that prevents hoisting of bit-extension (i.e., dequantization) operations; this check was previously missing from ConstExpr.cpp. Without it, dequantization operations were incorrectly hoisted by HoistIntoGlobalsPass.cpp.

Regards,
Vimal William

@VimalWill requested a review from benvanik as a code owner October 26, 2025 03:31
@VimalWill changed the title from "fixed constexpr to avoid hoisting dequant operations" to "[GlobalOptimization]fixed constexpr to avoid hoisting dequant operations" Oct 26, 2025
Comment on lines 465 to 469
    // Check 4: Does hoisting this value significantly increase the size of the
    // module?
    if (doesHoistingIncreaseSizeSignificantly(
            info, constExprMaxSizeIncreaseThreshold)) {
      return decision->disableHoist();
Contributor

Wouldn't this already be handled by this?

Author

In some cases the size-based heuristic does not catch a hoisted dequantization (e.g., for tensors like tensor<3x3x3x32xi8>), so the widened result ends up stored as an inflated global constant. Adding a semantic condition ensures that such bit-extend (i8→f32) operations are excluded from hoisting, which avoids unnecessary runtime memory overhead and keeps quantized models compact.

PS: I found that it is hoisting a dequantization (3x3x3x32xi8 -> 3x3x3x32xf32).
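
To make the arithmetic behind this concrete, here is a minimal standalone sketch. It is not the code in ConstExpr.cpp: `TensorInfo`, `byteSize`, `sizeHeuristicBlocksHoist`, `isBitExtendLike`, and the 1 MiB value of `kAssumedThreshold` are all hypothetical names and numbers chosen for illustration. It only shows how a check based on absolute size growth could let a 3x3x3x32 i8 -> f32 widening through, while a check on the element type would flag it.

```cpp
#include <cstdint>

// Hypothetical stand-in for a ranked tensor type (not an MLIR/IREE type).
struct TensorInfo {
  int64_t numElements;       // product of the static dimensions
  unsigned elementBitWidth;  // e.g. 8 for i8, 32 for f32
};

// Storage size in bytes, rounding each element up to whole bytes.
constexpr int64_t byteSize(const TensorInfo &t) {
  return t.numElements * static_cast<int64_t>((t.elementBitWidth + 7) / 8);
}

// Sketch of a size-based check: block hoisting only when the result is
// larger than the input by more than maxIncreaseBytes.
constexpr bool sizeHeuristicBlocksHoist(const TensorInfo &in,
                                        const TensorInfo &out,
                                        int64_t maxIncreaseBytes) {
  return byteSize(out) - byteSize(in) > maxIncreaseBytes;
}

// Sketch of a semantic check: same element count, wider element type.
constexpr bool isBitExtendLike(const TensorInfo &in, const TensorInfo &out) {
  return in.numElements == out.numElements &&
         out.elementBitWidth > in.elementBitWidth;
}

// The tensor from this thread: 3*3*3*32 = 864 elements.
constexpr TensorInfo kWeightsI8{3 * 3 * 3 * 32, 8};
constexpr TensorInfo kWeightsF32{3 * 3 * 3 * 32, 32};

// Assumed threshold for illustration only; the real default may differ.
constexpr int64_t kAssumedThreshold = 1024 * 1024;

// 864 bytes as i8 versus 3456 bytes as f32: a 4x increase, but only
// 2592 bytes in absolute terms, so the size check does not fire.
static_assert(byteSize(kWeightsI8) == 864, "i8 size");
static_assert(byteSize(kWeightsF32) == 3456, "f32 size");
static_assert(!sizeHeuristicBlocksHoist(kWeightsI8, kWeightsF32,
                                        kAssumedThreshold),
              "size check misses the widening");
static_assert(isBitExtendLike(kWeightsI8, kWeightsF32),
              "semantic check catches the widening");
```

Under these assumptions the hoisted constant is four times larger than the quantized weights, yet the absolute growth stays far below the threshold, which is the case the semantic condition is meant to cover.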

Author

@Groverkss, what are your thoughts on this?
