Description
Feature request
Oumi should provide a way to organize and compare experiments across different model architectures or architecture revisions. Hyperparameter tracking is supported through Weights & Biases (wandb) and TensorBoard, but these are insufficient when the model architecture itself changes (core underlying layers or components).
Motivation / references
As a user of Oumi, I often run many experiments with different models or different architecture revisions (changing attention implementations, activation functions, normalization layers, etc.).
Right now, there is no easy way to track or compare performance between runs when modifying these aspects. Hyperparameters are tracked and the model summary can be written to the logs, but these mechanisms aren't sufficient to organize and study model revisions that differ in layer types, parameter counts, and other structural properties.
From wolffrost:
This is more a general question about managing artifact versions and best practices. Let me just give some context for my experiments. I started with a gpt2 causal model because I wanted to be able to compare with a known baseline and make sure my general model was behaving as expected. Then I added multihead attention. This morning I switched to RMSNorm and flash attention. Generally speaking, each version of the model requires a different trained model. How do people manage the different versions of the models with the different trained artifacts? I'm trying to balance "go fast" with "be able to regress and compare". I'm using wandb, so I need to distinguish those logs as well. I'm probably looking for a free lunch here that doesn't exist 🙂
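Until there is first-class support for this, one possible stopgap is to log a small architecture descriptor into each run's wandb config, group, and tags so runs can be grouped and filtered in the W&B UI. This is only a rough sketch using the public wandb API; the descriptor fields (`base`, `attention`, `norm`, `param_count`) and the project name are made up for illustration and are not part of Oumi or wandb.

```python
# Sketch of a manual workaround: record architecture metadata alongside
# hyperparameters so runs with different architectures stay comparable.
import wandb

# Hypothetical architecture descriptor; adjust fields to whatever you vary.
arch = {
    "base": "gpt2",
    "attention": "flash",       # e.g. "eager", "sdpa", "flash"
    "norm": "rmsnorm",          # e.g. "layernorm", "rmsnorm"
    "param_count": 124_000_000,
}

run = wandb.init(
    project="arch-experiments",  # hypothetical project name
    group=f'{arch["base"]}-{arch["attention"]}-{arch["norm"]}',
    tags=[arch["attention"], arch["norm"]],
    config=arch,                 # appears as filterable columns in the W&B UI
)

# ... training loop ...
wandb.log({"train/loss": 0.0})   # placeholder metric
run.finish()
```

The same descriptor could also be written to a small JSON file next to each checkpoint, so trained artifacts stay matched to the architecture revision that produced them.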
Your contribution
N/A