Why is the load_from attribute in the train config for Sana-Sprint set to a pre-trained SCM model? #291

@PeiqinSun

Hi, thank you for the excellent open-source work. While reproducing Sana-Sprint, I noticed that the training config sets the student's load_from attribute to a checkpoint that already produces good results in one step. Is this intended? I expected the student to be initialized from the teacher, and the current setting seems inconsistent with the description in the paper. To understand your training process better, should I change load_from so that the student is initialized from the teacher?
Looking forward to your response.
Config: https://github.com/NVlabs/Sana/blob/main/configs/sana_sprint_config/1024ms/SanaSprint_1600M_1024px_allqknorm_bf16_scm_ladd.yaml#L23
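
To make the question concrete, here is a minimal sketch of the change I am considering. It is only an illustration: I am assuming that load_from sits under a top-level `model` section once the YAML is parsed, the helper name `with_student_init` is hypothetical, and both checkpoint paths are placeholders rather than real files.

```python
import copy


def with_student_init(config: dict, checkpoint: str) -> dict:
    """Return a copy of `config` whose student is initialized from `checkpoint`.

    Assumption: load_from lives under a top-level `model` section. Adjust the
    key path if the real YAML is laid out differently.
    """
    updated = copy.deepcopy(config)  # leave the original config untouched
    updated.setdefault("model", {})["load_from"] = checkpoint
    return updated


# Hypothetical stand-in for the parsed YAML; both paths are placeholders.
released = {"model": {"load_from": "path/to/scm_checkpoint.pth"}}
teacher_init = with_student_init(released, "path/to/teacher_checkpoint.pth")
print(teacher_init["model"]["load_from"])
```

If this is roughly the right edit, the only difference between the two runs would be the checkpoint that the student starts from.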
