llava conversion fixes #399
Conversation
```python
return {
    "projector_hidden_act": config.activation.hf_name,
    "multimodal_projector_bias": config.add_linear_biases,
    # Not in LlavaConfig, but needed for consistency check in LlavaBaseModelConverter.
```
Why remove this? It's essential to ensure compatibility.
This is to ensure compatibility with what?
As stated in the comment, this is not in LlavaConfig, and it caused issues when trying to load Apriel-1.5, where it would set the default value for this parameter and fail in the assertion here: https://github.com/ServiceNow/Fast-LLM/pull/399/files#diff-319643f77a4055995eb8f844aee095266ba3b15fa11f52e16acd89386058e51bL314
The projector intermediate size needs to match the LM hidden size, which is not guaranteed on the Fast-LLM side. The entry is not in the final output; it's there specifically for the assertion in https://github.com/ServiceNow/Fast-LLM/pull/399/files#diff-319643f77a4055995eb8f844aee095266ba3b15fa11f52e16acd89386058e51bL314. A failing assertion points to an actual error elsewhere.
What do you mean by "load Apriel-1.5"? Shouldn't that go through import?
Hmm, I guess this could be due to the bug you fixed above, where the intermediate size was set incorrectly on import?
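For context, the pattern being discussed looks roughly like this. This is a minimal sketch with hypothetical names (`export_projector_config`, `finalize_llava_config`, `intermediate_size`), not the actual Fast-LLM converter code:

```python
# Hypothetical sketch of the consistency-check pattern discussed above;
# function and key names are illustrative, not taken from Fast-LLM.
def export_projector_config(config) -> dict:
    return {
        "projector_hidden_act": config.activation.hf_name,
        "multimodal_projector_bias": config.add_linear_biases,
        # Not in LlavaConfig; carried along only for the consistency check.
        "projector_intermediate_size": config.intermediate_size,
    }

def finalize_llava_config(projector_cfg: dict, text_cfg: dict) -> dict:
    # The projector's output size must equal the language model's hidden
    # size, which is not guaranteed on the Fast-LLM side. The helper entry
    # is stripped here, so it never reaches the exported LlavaConfig.
    assert projector_cfg.pop("projector_intermediate_size") == text_cfg["hidden_size"]
    return {**projector_cfg, "text_config": text_cfg}
```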
```diff
 return [
     *cls.embeddings_converter_class.get_converters(
-        config.embeddings, "vision_encoder.embeddings", "model.vision_tower"
+        config.embeddings, "vision_encoder.embeddings", "vision_tower"
```
Why these changes? The current names are required for LlavaForConditionalGeneration and confirmed to work. The model prefix is explicitly needed for LlavaForConditionalGeneration (https://github.com/huggingface/transformers/blob/main/src/transformers/models/llava/modeling_llava.py#L316), and the language model is a MistralModel, which takes no model prefix.
Hmm, indeed, it's strange.
Without all these changes, we're not able to load https://huggingface.co/ServiceNow-AI/Apriel-1.5-15b-Thinker/tree/main in fast-llm. The weights in that model somehow match this different format, with language_model.model...
It must have something to do with _checkpoint_conversion_mapping:
https://github.com/huggingface/transformers/blob/main/src/transformers/models/llava/modeling_llava.py#L306-L311
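For reference, at the time of writing, the mapping at those lines looks roughly like this (copied from the linked file; it may change upstream):

```python
_checkpoint_conversion_mapping = {
    "^language_model.model": "model.language_model",
    "^vision_tower": "model.vision_tower",
    "^multi_modal_projector": "model.multi_modal_projector",
    "^language_model.lm_head": "lm_head",
}
```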
My understanding is that there are two equivalent ways to see the model. It can either be a LlavaForConditionalGeneration with a MistralModel text model, or a LlavaModel with a MistralForCausalLM. Main exports in the first format, but the dev branch seems to use the second one, though it still uses LlavaForConditionalGeneration as the architecture (maybe _checkpoint_conversion_mapping addresses the mismatch?).
I'd think the first option is more appropriate, but I could be wrong. Maybe we could just support both cases.
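A minimal sketch of how such a regex-based key remap behaves, assuming the mapping shown above; this is not the transformers loading code itself, and the inner key names in the examples are illustrative:

```python
import re

# Mapping as shown above; patterns are anchored regexes applied to
# state-dict keys when loading an old-format checkpoint.
MAPPING = {
    "^language_model.model": "model.language_model",
    "^vision_tower": "model.vision_tower",
    "^multi_modal_projector": "model.multi_modal_projector",
    "^language_model.lm_head": "lm_head",
}

def remap_key(key: str) -> str:
    # Apply the first pattern that matches; leave unmatched keys unchanged.
    for pattern, replacement in MAPPING.items():
        new_key, n = re.subn(pattern, replacement, key)
        if n:
            return new_key
    return key

# Old (dev-branch / Apriel-1.5) layout -> layout exported by current main:
print(remap_key("language_model.model.layers.0.self_attn.q_proj.weight"))
# -> model.language_model.layers.0.self_attn.q_proj.weight
print(remap_key("language_model.lm_head.weight"))
# -> lm_head.weight
```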
I see. From what I understand, this _checkpoint_conversion_mapping is something they added for backward compatibility. So indeed, I think you're right that the first option is the right one, but our Apriel-1.5 checkpoint uses the older format.
How should we support both cases? Shall we create a new format called llava_legacy or something?
That would work.
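Something like the following could work as a starting point. This is a purely hypothetical sketch of selecting HF weight prefixes per format; none of these names or keys come from the actual Fast-LLM codebase:

```python
# Hypothetical: pick the HF weight-name prefixes for each checkpoint format.
def hf_prefixes(checkpoint_format: str) -> dict[str, str]:
    if checkpoint_format == "llava":
        # Layout exported by current transformers main
        # (LlavaModel wrapped in LlavaForConditionalGeneration).
        return {
            "vision_tower": "model.vision_tower",
            "projector": "model.multi_modal_projector",
            "language_model": "model.language_model",
            "lm_head": "lm_head",
        }
    if checkpoint_format == "llava_legacy":
        # Older layout used by the Apriel-1.5 checkpoint.
        return {
            "vision_tower": "vision_tower",
            "projector": "multi_modal_projector",
            "language_model": "language_model.model",
            "lm_head": "language_model.lm_head",
        }
    raise ValueError(f"Unknown checkpoint format: {checkpoint_format}")
```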
✨ Description
Small changes to be able to load Apriel-1.5-15B:
- Remove `projector_intermediate_size` from the llava_hybrid and llava converters (use the text config's hidden size, like in llava)
- Support the `gelu` activation

With these changes, I am able to load Apriel-1.5-15B in fast-llm.
This also points to the issue that these problems were not caught by the conversion tests.