Llava Models #852
Does the LLaVA part work?
https://github.com/intel/intel-extension-for-transformers/tree/main/intel_extension_for_transformers/transformers/modeling/llava_models
If so, are the models optimized for Intel devices, and are there any examples?
Thanks for building this library. I have found token generation performance to be very good compared to OpenVINO.
Great work!
Thanks

Comments
Thanks for using the library! Regards,
It's for multi-modal training, but optimization is WIP.
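For anyone wanting a runnable starting point in the meantime, here is a minimal inference sketch using the upstream Hugging Face LLaVA classes rather than this repo's `llava_models` code; the `llava-hf/llava-1.5-7b-hf` checkpoint and the image path are illustrative assumptions, not anything this thread confirms.

```python
# Minimal LLaVA inference sketch via Hugging Face transformers (>= 4.36).
# Assumptions: "llava-hf/llava-1.5-7b-hf" checkpoint, a local image file.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"  # one public LLaVA 1.5 checkpoint
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

image = Image.open("example.jpg")  # placeholder: any local image works
prompt = "USER: <image>\nWhat is shown in this image? ASSISTANT:"

# Cast pixel values to fp16 to match the model's dtype.
inputs = processor(text=prompt, images=image, return_tensors="pt").to(
    model.device, torch.float16
)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.decode(output_ids[0], skip_special_tokens=True))
```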
@kevinintel how do you optimize the LLaVA model and use it?
Someone has tried low-bit quantization for LLaVA (AWQ: https://arxiv.org/pdf/2306.00978.pdf), and we will try to quantize it as well.
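As a hedged illustration of what "low-bit" means here, below is a plain-PyTorch round-to-nearest (RTN) baseline for 4-bit weight-only quantization; the linked AWQ paper improves on exactly this by rescaling salient channels before rounding. All names are illustrative, not an API from this repo.

```python
# RTN baseline: quantize a weight matrix to signed 4-bit integers with one
# scale per output channel, then dequantize to measure the rounding error.
import torch

def rtn_quantize_dequantize(w: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Per-output-channel symmetric round-to-nearest quantization of `w`."""
    qmax = 2 ** (bits - 1) - 1                 # e.g. 7 for signed 4-bit
    # One scale per row; clamp avoids division by zero for all-zero rows.
    scale = w.abs().amax(dim=1, keepdim=True).clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)  # int grid [-8, 7]
    return q * scale                           # dequantized fp weights

w = torch.randn(4096, 4096)
w_q = rtn_quantize_dequantize(w)
print("mean abs rounding error:", (w - w_q).abs().mean().item())
```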
Thanks @kevinintel
delock pushed a commit to delock/intel-extension-for-transformers that referenced this issue on Dec 16, 2023.
Hi, support for quantization of multimodal models is currently planned, and any updates will be communicated here.
We can optimize LLaVA in intel/neural-compressor#1797.
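A sketch of what that might look like, assuming Intel Neural Compressor 2.x's `PostTrainingQuantConfig`/`quantization.fit` weight-only path; whether that path covers LLaVA end to end is precisely what intel/neural-compressor#1797 tracks, so treat this as the generic INC pattern, not a confirmed LLaVA recipe.

```python
# Assumed pattern: post-training weight-only quantization with Intel Neural
# Compressor 2.x. The checkpoint name and save path are placeholders.
import torch
from transformers import LlavaForConditionalGeneration
from neural_compressor import PostTrainingQuantConfig, quantization

model = LlavaForConditionalGeneration.from_pretrained(
    "llava-hf/llava-1.5-7b-hf", torch_dtype=torch.float32
)

# RTN-style weight-only PTQ; no calibration dataloader needed for RTN.
conf = PostTrainingQuantConfig(approach="weight_only")
q_model = quantization.fit(model=model, conf=conf)
q_model.save("./llava-1.5-7b-int4")
```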