[feature] Load saved model to fine tune #48

@supreme-gg-gg

User Story

Instead of always starting from a base model from Hugging Face (hf) or Unsloth, allow fine-tuning an already-tuned model. This sounds odd, but in some use cases you might want to first do instruction tuning with SFT for domain adaptation, then do DPO for preference alignment. The two stages require different datasets and different setups. This is not possible right now, since you cannot load a saved model to fine-tune.
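To make the SFT-then-DPO flow concrete, here is a minimal sketch of what loading a saved model for a second stage could look like. It is not this project's implementation: `classify_model_source`, `load_for_further_tuning`, and `run_dpo_stage` are hypothetical names, and the sketch assumes the first stage saved its output with `save_pretrained()` (tokenizer included) and that `transformers`, `peft`, and `trl` are installed.

```python
from pathlib import Path


def classify_model_source(name_or_path: str) -> str:
    """Return 'adapter', 'checkpoint', or 'hub_id' for a model reference (hypothetical helper)."""
    path = Path(name_or_path)
    if path.is_dir():
        # PEFT writes adapter_config.json when a LoRA adapter is saved.
        if (path / "adapter_config.json").is_file():
            return "adapter"
        # save_pretrained() on a full model writes config.json.
        if (path / "config.json").is_file():
            return "checkpoint"
        raise ValueError(f"{name_or_path} is a directory but holds no model config")
    # Anything that is not a local directory is treated as a Hub ID.
    return "hub_id"


def load_for_further_tuning(name_or_path: str):
    """Load model and tokenizer so a second training stage can start from them."""
    # Imported lazily so the helper above works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    if classify_model_source(name_or_path) == "adapter":
        from peft import AutoPeftModelForCausalLM

        # is_trainable=True keeps the saved adapter weights unfrozen.
        model = AutoPeftModelForCausalLM.from_pretrained(name_or_path, is_trainable=True)
    else:
        model = AutoModelForCausalLM.from_pretrained(name_or_path)
    # Assumption: the tokenizer was saved in the same directory as the model.
    tokenizer = AutoTokenizer.from_pretrained(name_or_path)
    return model, tokenizer


def run_dpo_stage(sft_path: str, preference_dataset, output_dir: str):
    """Second stage: DPO on top of a previously saved SFT model."""
    from trl import DPOConfig, DPOTrainer

    model, tokenizer = load_for_further_tuning(sft_path)
    trainer = DPOTrainer(
        model=model,
        args=DPOConfig(output_dir=output_dir),
        train_dataset=preference_dataset,  # needs prompt / chosen / rejected columns
        processing_class=tokenizer,  # older TRL releases call this argument `tokenizer`
    )
    trainer.train()
    trainer.save_model(output_dir)
```

The source check runs before any heavy import, so a UI or config layer could use it to decide whether to offer "continue from saved model" at all.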

Alternative

We could also implement trainers like ORPO, which does instruction tuning and preference alignment in one step and doesn't need a reference model:
https://huggingface.co/docs/trl/main/en/orpo_trainer
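For comparison, a rough sketch of the single-step ORPO alternative. `check_preference_records` and `run_orpo` are hypothetical names; the sketch assumes a TRL version that exports `ORPOConfig` and `ORPOTrainer` from the top-level package (the import path may differ in other releases) and a dataset in the plain-string `prompt` / `chosen` / `rejected` format.

```python
REQUIRED_COLUMNS = ("prompt", "chosen", "rejected")


def check_preference_records(records):
    """Raise ValueError if a record lacks a non-empty prompt/chosen/rejected string."""
    for i, record in enumerate(records):
        for column in REQUIRED_COLUMNS:
            value = record.get(column)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"record {i}: missing or empty '{column}'")
    return len(records)


def run_orpo(base_model: str, records, output_dir: str):
    """Single-stage ORPO training straight from a base model."""
    # Imported lazily so the validation helper works without these packages.
    from datasets import Dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from trl import ORPOConfig, ORPOTrainer  # assumption: top-level export in your TRL version

    check_preference_records(records)
    model = AutoModelForCausalLM.from_pretrained(base_model)
    tokenizer = AutoTokenizer.from_pretrained(base_model)
    trainer = ORPOTrainer(
        model=model,  # no ref_model argument: ORPO does not use a reference model
        args=ORPOConfig(output_dir=output_dir),
        train_dataset=Dataset.from_list(records),
        processing_class=tokenizer,  # older TRL releases call this argument `tokenizer`
    )
    trainer.train()
    trainer.save_model(output_dir)
```

Because ORPO starts from the base model, it sidesteps the need to load a saved model, but it also means only one preference-style dataset is used rather than separate SFT and DPO datasets.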

Acceptance Criteria

Find a use case for this, test it, and submit a validation showing the performance improvement.

Labels

question, training
