[Feature] Organize and compare model revisions #1481

@jgreer013

Feature request

Oumi should provide a way for people to organize and compare experiments across different model architectures or revisions. Hyperparameter tracking is supported through Weights & Biases (wandb) and TensorBoard, but these are insufficient when modifying the model architecture itself (core underlying layers or components).

Motivation / references

As a user of Oumi, I often run multiple experiments with many different models or different versions of an architecture (changing attention variants, activation functions, normalization, etc.).

Right now, there's no easy way to compare or track performance between runs when making these kinds of architectural changes. While hyperparameters are tracked and the model summary can be output to logs, these mechanisms aren't sufficient to properly organize and study model revisions when you change other aspects like layer types, parameter counts, etc.

From wolffrost:
This is more a general question about managing artifact versions and best practices. Let me just give some context for my experiments. I started with a gpt2 causal model because I wanted to be able to compare with a known baseline and make sure my general model was behaving as expected. Then I added multihead attention. This morning I switched to RMSNorm and flash attention. Generally speaking, each version of the model requires a different trained model. How do people manage the different versions of the models with the different trained artifacts? I'm trying to balance "go fast" with "be able to regress and compare". I'm using wandb so I need to distinguish those logs as well. I'm probably looking for a free lunch here that doesn't exist 🙂
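
To make the request concrete, here is a minimal sketch of one way runs could be grouped by architecture revision. It is not an existing Oumi feature. The helper names `architecture_fingerprint` and `diff_architectures`, and the dictionary layout used to describe a model, are all hypothetical and chosen only for illustration. The idea is to hash a canonical description of the architecture so that every run and trained artifact from the same revision shares one stable identifier, and to diff two descriptions to see what changed between revisions.

```python
import hashlib
import json


def architecture_fingerprint(arch: dict) -> str:
    """Return a short, stable hash of an architecture description (hypothetical helper)."""
    # Sort keys so that dict ordering does not change the hash.
    canonical = json.dumps(arch, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def diff_architectures(old: dict, new: dict) -> dict:
    """Return {key: (old_value, new_value)} for every top-level key that differs."""
    keys = sorted(set(old) | set(new))
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Hypothetical revisions, loosely following the progression described above.
baseline = {"base": "gpt2", "attention": "multihead", "norm": "LayerNorm"}
revision = {"base": "gpt2", "attention": "flash", "norm": "RMSNorm"}

baseline_id = architecture_fingerprint(baseline)
revision_id = architecture_fingerprint(revision)
changes = diff_architectures(baseline, revision)

print(f"baseline={baseline_id} revision={revision_id}")
for key, (before, after) in changes.items():
    print(f"{key}: {before} -> {after}")
```

Under this assumption, the fingerprint could be passed as the `group` argument to `wandb.init` or used as a checkpoint directory name, so that logs and trained artifacts from the same revision stay together. Whether Oumi should derive the description from the model automatically is left open.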

Your contribution

N/A

Metadata

Assignees

No one assigned

    Labels

    Feature; design (discussion for major changes or additions); enhancement (new feature or request); help wanted (extra attention is needed)
