Description
Hello Emu2 Team,
First and foremost, thank you for your incredible work on Emu2 and for open-sourcing this powerful model. It's a fantastic contribution to the multimodal research community.
I am currently studying your work in depth to better understand its foundations. For reproducibility and a thorough analysis of the model's properties, knowing the exact starting checkpoint is crucial.
In your paper (arXiv:2312.13286, Sec. 2.1), you state that the Multimodal Modeling component was initialized from a LLaMA-33B model. As far as I know, Meta's official LLaMA-1 release included 7B, 13B, 30B, and 65B models (the 30B checkpoint is often informally referred to as 33B, since it has roughly 32.5B parameters). This leaves me with a question:
Could you please clarify which specific pre-trained checkpoint was used for the LLaMA-33B initialization?
For instance, was it the LLaMA-1-30B checkpoint? Or perhaps a community-finetuned version like Vicuna-33B?
Knowing the precise origin of the base LLM would be immensely helpful for any researcher looking to build upon or reproduce aspects of your training methodology. I've looked through the repository configs but couldn't pinpoint this specific detail.
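In case it clarifies what I'm after, here is a rough sketch of the comparison I have in mind. The repo id `BAAI/Emu2` and the flat attribute names are my assumptions (the actual config may nest the LM settings under a sub-config), so please treat this as illustrative only:

```python
from transformers import AutoConfig

# Known LLaMA-1-30B ("33B") backbone dimensions from Meta's release.
LLAMA_30B_DIMS = {
    "hidden_size": 6656,
    "num_hidden_layers": 60,
    "num_attention_heads": 52,
}

# Assumption: the public checkpoint lives at "BAAI/Emu2" on the Hugging Face Hub
# and its remote-code config exposes the LM dimensions at the top level.
cfg = AutoConfig.from_pretrained("BAAI/Emu2", trust_remote_code=True)

for key, expected in LLAMA_30B_DIMS.items():
    actual = getattr(cfg, key, None)
    status = "match" if actual == expected else "differs/missing"
    print(f"{key}: Emu2={actual}, LLaMA-1-30B={expected} -> {status}")
```

Even if the dimensions line up, that only confirms a 30B-scale backbone; it cannot distinguish the base LLaMA-1-30B weights from a fine-tuned derivative such as Vicuna-33B, since they share the same architecture, which is why your confirmation would settle the question.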
Thank you for your time and for creating such an inspiring project!