
Conversation

@namgyu-youn (Contributor) commented Jan 10, 2026

Summary:
Calibration-based PTQ methods like AWQ and GPTQ have been using TransformerEvalWrapper, which wraps an ao model to run lm-eval. Instead, we want to run lm-eval directly via HFLM. The change can be summarized as:

# Old
from torchao._models._eval import TransformerEvalWrapper

TransformerEvalWrapper(
    model=model,
    tokenizer=tokenizer,
    max_seq_length=max_seq_length,
).run_eval(
    tasks=tasks,
    limit=calibration_limit,
)
# New (direct lm-eval)
from lm_eval import evaluator
from lm_eval.models.huggingface import HFLM

evaluator.simple_evaluate(
    HFLM(pretrained=model, tokenizer=tokenizer),
    tasks=tasks,
    limit=calibration_limit,
    batch_size=1,
)
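
For reference, a small usage sketch of inspecting the dict returned by simple_evaluate, assuming lm-eval's standard result layout; the "wikitext" task name is only an illustrative assumption, not what the script necessarily passes:

```python
# Usage sketch: capture and inspect the dict returned by simple_evaluate.
# Assumptions: lm-eval's standard result layout; "wikitext" is an example task name.
from lm_eval import evaluator
from lm_eval.models.huggingface import HFLM

results = evaluator.simple_evaluate(
    HFLM(pretrained=model, tokenizer=tokenizer),
    tasks=["wikitext"],
    limit=calibration_limit,
    batch_size=1,
)
# results["results"] maps each task name to its metric dict
for task_name, metrics in results["results"].items():
    print(task_name, metrics)
```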

Test plan:

python quantize_and_upload.py --model_id meta-llama/Llama-3.2-1B --quant AWQ-INT4

pytorch-bot bot commented Jan 10, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3617

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla bot added the CLA Signed label Jan 10, 2026
@namgyu-youn (Contributor, Author) commented:

@pytorchbot label "topic: deprecation"

@pytorch-bot bot added the topic: deprecation label Jan 10, 2026
@jerryzh168 (Contributor) commented Jan 10, 2026

Oh, does this work with gemma3 as well? I remember I needed to make some modifications before: 2a98f58

For using lm_eval as a calibration data source, you would:
1. Run lm_eval on your model with calibration tasks.
2. Collect the inputs during that run using `MultiTensorInputRecorder`.
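
A minimal sketch of those two steps, assuming a Llama-style HF model; a plain forward pre-hook stands in for `MultiTensorInputRecorder` here, so the hook and layer path are illustrative assumptions rather than the torchao API:

```python
# Sketch: let lm-eval drive the model over calibration tasks while a forward
# pre-hook records the inputs seen by the first decoder layer. In the real GPTQ
# flow, MultiTensorInputRecorder would do this recording.
import torch
from lm_eval import evaluator
from lm_eval.models.huggingface import HFLM

recorded_inputs = []

def record_layer_inputs(module, args):
    # args are the positional inputs to the hooked module
    recorded_inputs.append(
        tuple(a.detach().cpu() if torch.is_tensor(a) else a for a in args)
    )

# Layer path assumes a Llama-style HF model; adjust per architecture.
handle = model.model.layers[0].register_forward_pre_hook(record_layer_inputs)

evaluator.simple_evaluate(
    HFLM(pretrained=model, tokenizer=tokenizer),
    tasks=["wikitext"],          # calibration task(s); example name
    limit=calibration_limit,     # number of calibration samples
    batch_size=1,
)
handle.remove()

# recorded_inputs now holds the inputs observed during the lm-eval run,
# i.e. the calibration data a GPTQ-style quantizer would consume.
```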
Contributor:

code examples?

@namgyu-youn (Contributor, Author) Jan 10, 2026

I didn't consider example code because: 1) I'm not familiar with the GPTQ workflow, and 2) the GPTQ API migration is a work in progress at #3517. How about focusing on the migration instead?

@namgyu-youn (Contributor, Author):

Updated the code example as a temporary fill-in; please take a look.

@namgyu-youn (Contributor, Author) commented Jan 10, 2026

Oh, does this work with gemma3 as well? I remember I needed to make some modifications before: 2a98f58

Not yet, but isn't that an lm-eval issue? Should we update this opaque wrapper whenever an architecture issue happens?
