
Question about roll_timesteps in Trajectory.forward_test #55

@xuzhuoran0106

Description


Hi, this is impressive work introducing diffusion models into end-to-end autonomous driving (E2E-AD). I would like to ask a few questions.

Referenced core code:

import numpy as np
import torch

step_num = 2
step_ratio = 20 / step_num  # 10.0
roll_timesteps = (np.arange(0, step_num) * step_ratio).round()[::-1].copy().astype(np.int64)
roll_timesteps = torch.from_numpy(roll_timesteps).to(device)

for k in roll_timesteps[:]:
  ...
  timesteps = k  # k takes the values 10, then 0
  ...
  img = self.diffusion_scheduler.step(model_output=x_start, timestep=k, sample=img).prev_sample
  ...

In the code above, roll_timesteps is actually [10, 0], so across the two calls to self.diffusion_scheduler.step, the timestep argument takes the values 10 and then 0.
Given that self.diffusion_scheduler.num_train_timesteps is 1000 and self.diffusion_scheduler.num_inference_steps is also 1000, the timestep interval at inference is 1.
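As a quick check, the roll_timesteps computation from the snippet above can be reproduced in isolation to confirm it yields [10, 0]:

```python
import numpy as np

# Reproduce the roll_timesteps computation from the referenced code.
step_num = 2
step_ratio = 20 / step_num  # 10.0
# arange gives [0, 1]; scaling gives [0., 10.]; reversing gives [10, 0]
roll_timesteps = (np.arange(0, step_num) * step_ratio).round()[::-1].copy().astype(np.int64)
print(roll_timesteps)  # [10  0]
```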
The code in DDIM.step for calculating prev_timestep is

# 1. get previous step value (=t-1)
prev_timestep = timestep - self.config.num_train_timesteps // self.num_inference_steps

The prev_timestep for the first denoising step (with k=10) is 9, so the denoised sample is at timestep 9.
The second denoising step (with k=0) then directly returns the predicted 'pose_reg' result.
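Plugging the two timesteps into the quoted formula makes the behavior concrete (a minimal sketch, using the config values stated above rather than an actual scheduler instance):

```python
# Illustrate the quoted DDIM prev_timestep formula for both denoising steps,
# assuming num_train_timesteps = num_inference_steps = 1000 as stated above.
num_train_timesteps = 1000
num_inference_steps = 1000

for timestep in (10, 0):
    prev_timestep = timestep - num_train_timesteps // num_inference_steps
    print(timestep, "->", prev_timestep)
# 10 -> 9   (first step denoises from t=10 to t=9)
# 0  -> -1  (second step falls through to the model prediction)
```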

My question: the denoising step is executed 2 times, but the timesteps are not consecutive. For example, consecutive timesteps would be 10->9->8 in DDPM, or 10->8->6->4... in DDIM with a 'timestep_spacing' interval.
Could you explain why the forward denoising is not a standard DDIM process? How much impact does it have on the results?
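For comparison, a sketch of what a standard DDIM schedule would look like for 2 inference steps over 1000 training steps (this mirrors the usual "leading" spacing; the exact spacing mode is an assumption, not taken from the repository):

```python
import numpy as np

# Standard DDIM "leading" timestep spacing (assumed), for comparison:
# 2 inference steps over 1000 training steps would be evenly spaced,
# unlike the hard-coded ratio of 20 / step_num in the referenced code.
num_train_timesteps = 1000
num_inference_steps = 2
step_ratio = num_train_timesteps // num_inference_steps  # 500
timesteps = (np.arange(0, num_inference_steps) * step_ratio).round()[::-1].astype(np.int64)
print(timesteps)  # [500   0]
```

So a standard 2-step DDIM schedule would visit timesteps [500, 0], whereas the code above visits [10, 0], which is what makes the schedule non-standard.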
@LegendBC
