Add wandb play and resume option #3426

renezurbruegg · 2025-09-10T12:38:41Z

Description

This ports my old Orbit PR for loading and resuming checkpoints from wandb. It lets the user play or resume a specified wandb run directly.

Usage:
For example, to resume (or play) training from the run at https://wandb.ai/<user_name>/<project_name>/runs/<run_id>:

 python scripts/reinforcement_learning/rsl_rl/play.py --task Isaac-Velocity-Flat-Cassie-Play-v0 --num_envs 4 --headless --logger wandb \
 --wandb_run_id <run_id> --wandb_username <user_name> --log_project_name <project_name>

This downloads the newest checkpoint (or the one at iteration XX when --wandb_checkpoint_iteration XX is given) and uses it for play or resuming.

If you are using an extension template, make sure to carry the changes over to your cli_args.py.

Checklist

I have run the pre-commit checks with ./isaaclab.sh --format
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
I have updated the changelog and the corresponding version in the extension's config/extension.toml file
I have added my name to the CONTRIBUTORS.md or my name already exists there

Mayankm96 · 2025-09-10T13:02:11Z

scripts/reinforcement_learning/rsl_rl/cli_args.py

@@ -38,6 +39,30 @@ def add_rsl_rl_args(parser: argparse.ArgumentParser):
        "--log_project_name", type=str, default=None, help="Name of the logging project when using wandb or neptune."
    )

+    arg_group.add_argument(


Could we make a separate group for wandb instead of adding it to rsl-rl group?
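
One way to read this suggestion, as a minimal sketch: the wandb-specific flags get their own argparse group instead of living in the rsl-rl group. The helper name `add_wandb_args`, the group title, the help texts for the run id and username, and the `-1` default for the checkpoint iteration are assumptions for illustration; only the flag names come from this PR.

```python
import argparse


def add_wandb_args(parser: argparse.ArgumentParser) -> None:
    """Register wandb checkpoint-loading flags in a dedicated group (hypothetical helper)."""
    wandb_group = parser.add_argument_group("wandb", description="Arguments for loading checkpoints from wandb.")
    wandb_group.add_argument("--wandb_run_id", type=str, default=None, help="Id of the wandb run to load from.")
    wandb_group.add_argument("--wandb_username", type=str, default=None, help="wandb user that owns the run.")
    wandb_group.add_argument(
        "--wandb_checkpoint_iteration",
        type=int,
        default=-1,  # assumed sentinel meaning "use the latest checkpoint"
        help="Select which wandb checkpoint iteration to load. If not provided, the latest checkpoint will be used.",
    )
```

With a separate group, `--help` lists these flags under their own "wandb" heading, and other training scripts could call the same helper without pulling in the rsl-rl arguments.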

Mayankm96 · 2025-09-10T13:03:00Z

source/isaaclab/isaaclab/utils/wandb.py

+# SPDX-License-Identifier: BSD-3-Clause
+
+
+import contextlib


This should be moved to isaaclab_rl directory. Core has nothing to do with logging :)

Mayankm96 · 2025-09-10T13:04:04Z

source/isaaclab_rl/isaaclab_rl/rsl_rl/rl_cfg.py

@@ -196,6 +196,9 @@ class RslRlBaseRunnerCfg:
    ``{time-stamp}_{run_name}``.
    """

+    run_id: str | None = None


Hmm why does this need to be part of rl config? It isn't used by rsl-rl wrapper.

Mayankm96 · 2025-09-10T13:05:19Z

scripts/reinforcement_learning/rsl_rl/cli_args.py

+        help="Select which wandb checkpoint iteration to load. If not provided, the latest checkpoint will be used.",
+    )
+    arg_group.add_argument(
+        "--wandb_username",


Should this be the wandb entity? Or is that the same concept?

Mayankm96 · 2025-09-10T13:05:53Z

source/isaaclab/isaaclab/utils/wandb.py

+    Example:
+        model_path = get_model_checkpoint(run_id="my_run_id", project="my_project", checkpoint=100, wandb_username="my_username")
+        This will download the model checkpoint from https://wandb.ai/my_username/my_project/runs/my_run_id and save it
+        to models_tmp/my_project/my_run_id/model_100.pt


I would put this in a Python code-block.
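
A sketch of what that could look like, assuming the docstrings are rendered with Sphinx so that a `.. code-block:: python` directive applies. The function body is deliberately left out, and the summary line is an assumption; the example call and paths are the ones from the diff above.

```python
from typing import Optional


def get_model_checkpoint(
    run_id: str,
    project: str = "isaaclab",
    checkpoint: int = -1,
    wandb_username: Optional[str] = None,
    tmp_folder_dir: str = "models_tmp",
) -> str:
    """Download a model checkpoint from a wandb run.

    Example:

    .. code-block:: python

        model_path = get_model_checkpoint(
            run_id="my_run_id", project="my_project", checkpoint=100, wandb_username="my_username"
        )

    This downloads the checkpoint from https://wandb.ai/my_username/my_project/runs/my_run_id
    and saves it to ``models_tmp/my_project/my_run_id/model_100.pt``.
    """
    # Docstring-only sketch: the download logic is not part of this example.
    raise NotImplementedError
```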

Mayankm96 · 2025-09-10T13:06:21Z

source/isaaclab/isaaclab/utils/wandb.py

+
+
+def get_model_checkpoint(
+    run_id: str, project="isaaclab", checkpoint: int = -1, wandb_username=None, tmp_folder_dir: str = "models_tmp"


Suggested change

-    run_id: str, project="isaaclab", checkpoint: int = -1, wandb_username=None, tmp_folder_dir: str = "models_tmp"
+    run_id: str, project: str = "isaaclab", checkpoint: int = -1, wandb_username=None, tmp_folder_dir: str = "models_tmp"

Mayankm96 · 2025-09-10T13:07:19Z

source/isaaclab/isaaclab/utils/wandb.py

+    models = []
+    # List all available files in the run
+    for file in wdb_run.files():
+        if "model" in file.name and file.name.endswith(".pt"):


This is very specific to rsl-rl's naming of checkpoints. If the plan is to support only rsl-rl, then I suggest moving this to the rsl-rl sub-package in isaaclab_rl.

Would suggest putting this print after the model is found so you also get to know which model iteration you downloaded.
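
To make the naming dependency concrete, here is a minimal sketch of the selection step. It assumes checkpoints follow the `model_<iteration>.pt` pattern seen in the docstring example (`model_100.pt`) and that `checkpoint=-1` means "latest"; the helper name `select_checkpoint` is hypothetical and works on plain file names rather than wandb file objects. It also prints only after a checkpoint has been chosen, so the message can include the iteration.

```python
import re

# Assumed rsl-rl style naming: "model_<iteration>.pt"
_CHECKPOINT_PATTERN = re.compile(r"model_(\d+)\.pt$")


def select_checkpoint(file_names: list[str], checkpoint: int = -1) -> str:
    """Return the file for the requested iteration, or the latest one if ``checkpoint`` is negative."""
    by_iteration = {}
    for name in file_names:
        match = _CHECKPOINT_PATTERN.search(name)
        if match:
            by_iteration[int(match.group(1))] = name
    if not by_iteration:
        raise FileNotFoundError("No checkpoint named model_<iteration>.pt found in the run.")
    # Compare iterations numerically: as strings, "model_50.pt" would sort after "model_100.pt".
    iteration = max(by_iteration) if checkpoint < 0 else checkpoint
    if iteration not in by_iteration:
        raise FileNotFoundError(f"Iteration {iteration} not found. Available: {sorted(by_iteration)}")
    chosen = by_iteration[iteration]
    print(f"Selected checkpoint iteration {iteration}: {chosen}")
    return chosen
```

Because the pattern is tied to one library's file names, a helper like this fits more naturally next to the rsl-rl wrapper than in a library-agnostic utility module.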

Mayankm96 · 2025-09-10T13:07:49Z

source/isaaclab/isaaclab/utils/wandb.py

+    if wandb_username is None:
+        wandb_username = os.environ.get("WANDB_USERNAME")
+
+    print("Downloading model from wandb...", f"{wandb_username}/{project}/{run_id}")


Suggested change

-    print("Downloading model from wandb...", f"{wandb_username}/{project}/{run_id}")
+    print(f"Downloading model from wandb: {wandb_username}/{project}/{run_id}")

Mayankm96 · 2025-09-10T13:08:53Z

scripts/reinforcement_learning/rsl_rl/cli_args.py

+    arg_group.add_argument(
+        "--wandb_username",
+        type=str,
+        default=None,


Would this make sense?

Suggested change

-        default=None,
+        default=os.environ["WANDB_USERNAME"],
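
One caveat with this form: `os.environ["WANDB_USERNAME"]` raises `KeyError` when the variable is unset, so the parser could not even be built for users who do not export it. A sketch of the same idea with `os.environ.get`, which falls back to `None`; the helper name `build_parser` and the help text are hypothetical.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose --wandb_username default comes from the environment (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--wandb_username",
        type=str,
        # .get() returns None when WANDB_USERNAME is unset instead of raising KeyError.
        default=os.environ.get("WANDB_USERNAME"),
        help="wandb user that owns the run. Defaults to the WANDB_USERNAME environment variable.",
    )
    return parser
```

An explicit `--wandb_username` on the command line still overrides the environment value, and the later `if wandb_username is None` check in the loader keeps working when neither is set.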

Mayankm96

Thanks for this feature. Added a few comments.

Also, is this using the latest wandb version, or does it work with the default wandb version in IsaacLab (which is 0.12)? I think the minimum version I needed was 0.19.
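
If a minimum version does turn out to be required, a guard along these lines could fail early with a clear message. This is only a sketch: the `0.19` threshold is the recollection from this comment, not a verified requirement, the helper names are hypothetical, and the parser assumes plain numeric major and minor components.

```python
from importlib import metadata
from typing import Optional


def parse_major_minor(version: str) -> tuple[int, int]:
    """Parse the leading "major.minor" of a version string such as "0.19.1"."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def meets_minimum(installed: str, minimum: str = "0.19") -> bool:
    """Return True if ``installed`` is at least ``minimum`` (assumed threshold)."""
    return parse_major_minor(installed) >= parse_major_minor(minimum)


def installed_wandb_version() -> Optional[str]:
    """Return the installed wandb version, or None if wandb is not installed."""
    try:
        return metadata.version("wandb")
    except metadata.PackageNotFoundError:
        return None
```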

ooctipus · 2025-09-13T02:30:35Z

This seems like a nice feature!! Thanks for the PR. I agree with Mayank:

Unless we can make it general to all RL libraries, it makes more sense to put it in the rsl-rl sub-package at isaaclab_rl/rsl_rl, or to open a PR directly against rsl_rl if the ETH folks like it.
Different RL libraries also expect different keys for the same wandb utility.

renezurbruegg added 4 commits September 10, 2025 14:36

Add wandb loading

6fd6fec

add wandb run id to rl cfg

2231ba6

update args and run script to resume from wandb

76acc8c

minor fixes

cb80aa1

renezurbruegg requested review from ooctipus, Mayankm96 and ClemensSchwarke as code owners September 10, 2025 12:38

renezurbruegg changed the title ~~Feature/resume wandb~~ Add wandb play and resume option Sep 10, 2025

formatter

6cea958

Mayankm96 reviewed Sep 10, 2025


Mayankm96 requested changes Sep 10, 2025


Merge branch 'main' into feature/resume_wandb

c120b06

github-actions bot added the documentation, enhancement, and isaac-lab labels Sep 11, 2025

Mayankm96 removed the documentation label Sep 11, 2025
