
[Question] Clarification on the LLaMA initialization checkpoint for Emu2 #99

@panli0

Description


Hello Emu2 Team,
First and foremost, thank you for your incredible work on Emu2 and for open-sourcing this powerful model. It's a fantastic contribution to the multimodal research community.
I am currently digging deep into your work to better understand its foundations. For the purpose of reproducibility and a thorough analysis of the model's properties, understanding the exact starting checkpoint is crucial.
In your paper (arXiv:2312.13286, Sec. 2.1), you state that the Multimodal Modeling component was initialized with a LLaMA-33B model. As far as I know, Meta's official LLaMA-1 release included 7B, 13B, 30B, and 65B models, so the "33B" label leaves me with a question for my research setup:
Could you please clarify which specific pre-trained checkpoint was used for the LLaMA-33B initialization?
For instance, was it the LLaMA-1-30B checkpoint? Or perhaps a community-finetuned version like Vicuna-33B?
Knowing the precise origin of the base LLM would be immensely helpful for any researcher looking to build upon or reproduce aspects of your training methodology. I've looked through the repository configs but couldn't pinpoint this specific detail.
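For concreteness, this is the kind of sanity check I tried on my side. It is only a rough sketch: the Hugging Face repo id and attribute names below are my own assumptions, not something confirmed by your release, and the LLaMA-1-30B dimensions (hidden_size 6656, 60 layers, 52 attention heads) are taken from Meta's published model card rather than your configs.

```python
from transformers import AutoConfig

# Rough sketch (my assumptions, not from the Emu2 repo): pull the released
# config and compare the language-model dimensions against the publicly
# documented LLaMA-1-30B shape (hidden_size=6656, 60 layers, 52 heads).
# The repo id and attribute names may not match the actual release exactly.
cfg = AutoConfig.from_pretrained("BAAI/Emu2", trust_remote_code=True)

for key in ("hidden_size", "num_hidden_layers", "num_attention_heads"):
    print(key, getattr(cfg, key, "not found"))
```

Of course, matching dimensions alone cannot distinguish between checkpoints that share the same architecture (e.g., LLaMA-1-30B vs. a fine-tuned derivative like Vicuna-33B), which is exactly why I'm asking for confirmation.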
Thank you for your time and for creating such an inspiring project!
