User Story
Instead of always picking a base model from HF or Unsloth, allow loading an already-tuned model. This may sound odd, but in some use cases you want to first do instruction tuning with SFT for domain adaptation, and then run DPO for preference alignment; each stage requires a different dataset and a different setup. This is not possible today because a saved model cannot be loaded for further fine-tuning. A sketch of the second stage is below.
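As a rough illustration of the requested flow, here is a minimal sketch of the DPO stage picking up from a saved SFT checkpoint, assuming TRL's `DPOTrainer`. The checkpoint paths and the dataset name are placeholders, and argument names vary slightly across TRL versions:

```python
# Sketch: resume from a saved SFT checkpoint and run DPO on top of it.
# Paths/dataset are placeholders; TRL argument names vary by version.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Load the already SFT-tuned model instead of a base HF/Unsloth checkpoint.
model = AutoModelForCausalLM.from_pretrained("./outputs/sft-checkpoint")  # hypothetical path
tokenizer = AutoTokenizer.from_pretrained("./outputs/sft-checkpoint")

# Preference dataset with "prompt"/"chosen"/"rejected" columns, as DPO expects.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

training_args = DPOConfig(output_dir="./outputs/dpo", beta=0.1)

trainer = DPOTrainer(
    model=model,
    ref_model=None,  # TRL clones `model` as the frozen reference when this is None
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` in older TRL releases
)
trainer.train()
trainer.save_model("./outputs/dpo-final")
```

The point of the request is the first two lines: `from_pretrained` should accept the output of an earlier SFT run, not only a hub base model.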
Alternative
We could instead implement trainers like ORPO, which does instruction tuning and preference alignment in a single step and does not need a reference model:
https://huggingface.co/docs/trl/main/en/orpo_trainer
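For comparison, a minimal sketch of this alternative using TRL's `ORPOTrainer` from the linked docs. The model and dataset names are placeholders, and argument names may differ across TRL versions:

```python
# Sketch: ORPO combines instruction tuning and preference alignment in one
# trainer, with no reference model. Names below are placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

# A base model is fine here, since ORPO needs only a single training stage.
model = AutoModelForCausalLM.from_pretrained("unsloth/llama-3-8b")  # placeholder
tokenizer = AutoTokenizer.from_pretrained("unsloth/llama-3-8b")

# ORPO also trains on "prompt"/"chosen"/"rejected" preference pairs.
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

training_args = ORPOConfig(output_dir="./outputs/orpo", beta=0.1)  # beta weights the odds-ratio loss term

trainer = ORPOTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` in older TRL releases
)
trainer.train()
```

This sidesteps the checkpoint-loading limitation entirely, at the cost of needing a single dataset that carries both instruction and preference signal.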
Acceptance Criteria
Identify a concrete use case, test it end to end, and submit a validation demonstrating the performance improvement.