nikita-savelyevv commented Nov 20, 2025

Changes

Added optimized OpenVINO weight compression for the fp8e4m3 data type.

| Model | Memory Before (MiB) | Memory After (MiB) | Time Before (sec) | Time After (sec) |
|---|---|---|---|---|
| Llama-3.2-1B | 2328.03 | 2394.14 (+2.84%) | 63.52 | 14.03 (-77.92%) |
| Phi-4-mini | 5608.48 | 5197.70 (-7.33%) | 187.34 | 28.05 (-85.03%) |
| Llama-3.1-8B | 9918.52 | 8443.87 (-14.86%) | 399.14 | 48.48 (-87.86%) |
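
For readers who prefer speedup factors and absolute memory deltas to percentages, the measurements above can be re-expressed with a few lines of Python. This is only a sketch over the numbers reported in this PR; `MEASUREMENTS` and `summarize` are illustrative names and are not part of NNCF.

```python
# Values copied from the table in this PR description:
# (model, memory_before_mib, memory_after_mib, time_before_sec, time_after_sec)
MEASUREMENTS = [
    ("Llama-3.2-1B", 2328.03, 2394.14, 63.52, 14.03),
    ("Phi-4-mini", 5608.48, 5197.70, 187.34, 28.05),
    ("Llama-3.1-8B", 9918.52, 8443.87, 399.14, 48.48),
]


def summarize(rows):
    """Map each model to (speedup_factor, memory_delta_mib).

    speedup_factor   = time_before / time_after (higher is better)
    memory_delta_mib = memory_after - memory_before (negative means less memory)
    """
    return {
        model: (time_before / time_after, mem_after - mem_before)
        for model, mem_before, mem_after, time_before, time_after in rows
    }


if __name__ == "__main__":
    for model, (speedup, mem_delta) in summarize(MEASUREMENTS).items():
        print(f"{model}: {speedup:.2f}x faster, {mem_delta:+.2f} MiB memory")
```

On these three models the speedup factor grows with model size, from about 4.5x for Llama-3.2-1B to about 8.2x for Llama-3.1-8B, while memory changes by +66.11 MiB, -410.78 MiB and -1474.65 MiB respectively.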

Reason for changes

UX improvement: fp8e4m3 weight compression takes roughly 78% to 88% less time on the models measured above.

Tests

Extended existing tests.

https://github.com/openvinotoolkit/nncf/actions/runs/19767009608

github-actions bot added the NNCF OpenVINO label Nov 20, 2025
nikita-savelyevv force-pushed the ns/f8e4m3-optimized-compression branch from bd421c7 to d5cc3f8 November 28, 2025 11:50
nikita-savelyevv marked this pull request as ready for review November 28, 2025 16:04
nikita-savelyevv requested a review from a team as a code owner November 28, 2025 16:04