Logging of episodic returns in PPO implementations #508

In `ppo.py` and `ppo_atari.py`, episodic information is logged [as follows](https://github.com/vwxyzjn/cleanrl/blob/1ed80620842b4cdeb1edc07e12825dff18091da9/cleanrl/ppo.py#L210):

```
            if "final_info" in infos:
                for info in infos["final_info"]:
                    if info and "episode" in info:
                        print(f"global_step={global_step}, episodic_return={info['episode']['r']}")
                        writer.add_scalar("charts/episodic_return", info["episode"]["r"], global_step)
                        writer.add_scalar("charts/episodic_length", info["episode"]["l"], global_step)
```

If more than one episode is truncated or terminated at the same global step, this code ends up logging only one of them, because they are all assigned to the same `global_step`. Is this the intended behavior?

To log them all, we could do something like this:

```
            if "final_info" in infos:
                for i, info in enumerate(infos["final_info"]):
                    if info and "episode" in info:
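                        # Offset by the environment index so that episodes ending at the same global_step get distinct steps
                        logging_step = global_step - args.num_envs + i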
                        print(f"logging_step={logging_step}, episodic_return={info['episode']['r']}")
                        writer.add_scalar("charts/episodic_return", info["episode"]["r"], logging_step)
                        writer.add_scalar("charts/episodic_length", info["episode"]["l"], logging_step)
```
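As a quick sanity check on the offset idea, here is a minimal sketch. It assumes that `global_step` advances by `num_envs` on every iteration of the rollout loop; `per_env_logging_steps` is a hypothetical helper written for illustration, not part of the repository.

```python
def per_env_logging_steps(global_step, num_envs):
    """Return one distinct logging step per environment index.

    Assumption: global_step has already been advanced by num_envs for the
    current iteration, so the offsets fill the preceding window of steps.
    """
    return [global_step - num_envs + i for i in range(num_envs)]


num_envs = 4
steps_iter1 = per_env_logging_steps(4, num_envs)  # first iteration
steps_iter2 = per_env_logging_steps(8, num_envs)  # second iteration

# Steps are unique within an iteration and do not overlap across iterations.
assert len(set(steps_iter1)) == num_envs
assert not set(steps_iter1) & set(steps_iter2)
```

Under that assumption, each environment gets its own x-value and consecutive iterations never collide.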

Alternatively, if we insist on not defining `logging_step`, we should log the mean return, the standard deviation of the return, and the number of terminated or truncated episodes at each `global_step`, so as not to bias logging in favor of one of the many parallel environments.
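A rough sketch of that aggregation could look like the following. The helper `summarize_finished_episodes` and the output key names are hypothetical, chosen only for illustration. It assumes `final_infos` has the same shape as `infos["final_info"]` in the snippets above: one entry per environment, either empty or a dict with an `"episode"` entry.

```python
import numpy as np


def summarize_finished_episodes(final_infos):
    """Aggregate all episodes that finished at the current step.

    Returns None when no episode finished, otherwise a dict of scalars.
    """
    finished = [info["episode"] for info in final_infos if info and "episode" in info]
    if not finished:
        return None
    # "r" and "l" may be Python scalars or size-1 arrays; .item() handles both.
    returns = np.array([np.asarray(ep["r"], dtype=np.float64).item() for ep in finished])
    lengths = np.array([np.asarray(ep["l"], dtype=np.float64).item() for ep in finished])
    return {
        "episodic_return_mean": float(returns.mean()),
        "episodic_return_std": float(returns.std()),  # population std (ddof=0)
        "episodic_length_mean": float(lengths.mean()),
        "num_finished_episodes": len(finished),
    }


# Possible use inside the training loop (tag names are assumptions):
# stats = summarize_finished_episodes(infos["final_info"])
# if stats is not None:
#     for name, value in stats.items():
#         writer.add_scalar(f"charts/{name}", value, global_step)
```

This keeps a single `global_step` axis while still reflecting every environment that finished at that step.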

Similar issues may also be present in other (PPO) files.
