
Memory leak during newEpisode in data pre-processing #16

Open
GengzeZhou opened this issue Oct 26, 2023 · 3 comments

Comments

@GengzeZhou

Hi Shizhe,

Thanks for your great work. I have observed a memory leak when calling the newEpisode function in MatterSim while running the data pre-processing code:

import math

import numpy as np
import torch
from PIL import Image

# build_simulator, build_feature_extractor and VIEWPOINT_SIZE come from
# the repo's pre-processing script.
def process_features(proc_id, out_queue, scanvp_list, args):
    print('start proc_id: %d' % proc_id)

    # Set up the simulator
    sim = build_simulator(args.connectivity_dir, args.scan_dir)

    # Set up PyTorch CNN model
    torch.set_grad_enabled(False)
    model, img_transforms, device = build_feature_extractor(args.model_name, args.checkpoint_file)

    for scan_id, viewpoint_id in scanvp_list:
        # Loop all discretized views from this location
        images = []
        for ix in range(VIEWPOINT_SIZE):
            if ix == 0:
                sim.newEpisode([scan_id], [viewpoint_id], [0], [math.radians(-30)])
            elif ix % 12 == 0:
                sim.makeAction([0], [1.0], [1.0])
            else:
                sim.makeAction([0], [1.0], [0])
            state = sim.getState()[0]
            assert state.viewIndex == ix

            image = np.array(state.rgb, copy=True) # in BGR channel
            image = Image.fromarray(image[:, :, ::-1]) #cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            images.append(image)

        images = torch.stack([img_transforms(image).to(device) for image in images], 0)
        fts, logits = [], []
        for k in range(0, len(images), args.batch_size):
            b_fts = model.forward_features(images[k: k+args.batch_size])
            b_logits = model.head(b_fts)
            b_fts = b_fts.data.cpu().numpy()
            b_logits = b_logits.data.cpu().numpy()
            fts.append(b_fts)
            logits.append(b_logits)
        fts = np.concatenate(fts, 0)
        logits = np.concatenate(logits, 0)

        out_queue.put((scan_id, viewpoint_id, fts, logits))

    out_queue.put(None)

My machine's memory (64 GB) is gradually consumed as new viewpoints are loaded, and previously allocated memory is never released. The same issue was raised in the Matterport3D simulator's official repo, but no solution has been provided yet.
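For reference, this is how I measure the growth inside each worker — a minimal stdlib-only sketch (ru_maxrss is reported in KiB on Linux and in bytes on macOS):

```python
import resource
import sys

def peak_rss_mb():
    # Peak resident set size of the current process, in MiB.
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        rss /= 1024  # macOS reports bytes, not KiB
    return rss / 1024.0

# e.g. logged once per (scan_id, viewpoint_id) iteration:
#   print('proc %d: peak RSS %.1f MiB' % (proc_id, peak_rss_mb()))
```

Logging this per iteration shows RSS climbing monotonically across episodes instead of plateauing.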

The issue is not resolved even if I rebuild the simulator and manually trigger garbage collection inside the for loop:

    # transform, ln_vision, visual_encoder and device are defined in the
    # enclosing scope; gc is imported at module level.
    for scan_id, viewpoint_id in scanvp_list:
        # Rebuild the simulator from scratch on every iteration
        sim = build_simulator(args.connectivity_dir, args.scan_dir)

        # Loop all discretized views from this location
        images = []
        for ix in range(VIEWPOINT_SIZE):
            if ix == 0:
                sim.newEpisode([scan_id], [viewpoint_id], [0], [math.radians(-30)])
            elif ix % 12 == 0:
                sim.makeAction([0], [1.0], [1.0])
            else:
                sim.makeAction([0], [1.0], [0])
            state = sim.getState()[0]
            assert state.viewIndex == ix

            image = np.array(state.rgb, copy=True) # in BGR channel
            image = Image.fromarray(image[:, :, ::-1]) #cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            images.append(image)

        images = torch.stack([transform(image).to(device) for image in images], 0)
        fts = []
        for k in range(0, len(images), args.batch_size):
            with torch.cuda.amp.autocast(dtype=torch.float16):
                b_fts = ln_vision(visual_encoder(images[k: k+args.batch_size]))
            b_fts = b_fts.data.cpu().numpy()
            fts.append(b_fts)
        fts = np.concatenate(fts, 0)

        # free memory
        del sim
        gc.collect()

Therefore I believe the cause is a memory leak inside MatterSim itself. Do you have any suggestions on this issue?
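As a stopgap, I am experimenting with recycling the worker processes so the OS reclaims whatever the native library leaks. A minimal sketch of the pattern (the per-task function below is hypothetical; in the real script it would build the simulator and extract one viewpoint's features):

```python
import multiprocessing as mp
import os

def extract_one(scanvp):
    # Hypothetical per-task body: in the real script this would build
    # the simulator, render the viewpoint and return its features.
    # Whatever native memory leaks here dies with the worker process.
    scan_id, viewpoint_id = scanvp
    return (scan_id, viewpoint_id, os.getpid())

def run_recycled(scanvp_list, num_workers=4):
    # maxtasksperchild=1 makes every worker exit after a single task,
    # so the OS reclaims any leaked memory before the next task runs.
    ctx = mp.get_context("fork")  # fork keeps this runnable as a script
    with ctx.Pool(processes=num_workers, maxtasksperchild=1) as pool:
        return pool.map(extract_one, scanvp_list, chunksize=1)
```

The process-respawn overhead is noticeable, so it only masks the leak rather than fixing it.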

@goodstudent9
Don't render images from the simulator: set the rendering variable ("Render***", sorry, I forget the exact spelling) to false.
If you do this and only fetch angle and connectivity information from the sim, there is no memory leak.
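Roughly like this — the setter names below follow the Matterport3DSimulator README from memory (I believe setRenderingEnabled is the flag I meant; treat the exact spellings as assumptions):

```python
import math

try:
    import MatterSim  # Matterport3D simulator Python bindings
except ImportError:   # not installed in every environment
    MatterSim = None

def build_geometry_only_sim(connectivity_dir, scan_dir):
    # Simulator that exposes poses and connectivity but never renders RGB.
    if MatterSim is None:
        raise RuntimeError("MatterSim bindings are not available")
    sim = MatterSim.Simulator()
    sim.setNavGraphPath(connectivity_dir)
    sim.setDatasetPath(scan_dir)
    sim.setRenderingEnabled(False)  # the key switch: skip image rendering
    sim.setDiscretizedViewingAngles(True)
    sim.setCameraVFOV(math.radians(60))
    sim.setBatchSize(1)
    sim.initialize()
    return sim
```

With rendering disabled, getState() still returns headings and navigable viewpoints, which is enough for navigation logic that uses precomputed features.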

@jj023721

jj023721 commented Oct 17, 2024 via email

@GengzeZhou
Author

@goodstudent9 Thanks for your reply. The point here is that I want to render RGB images at arbitrary resolutions during navigation, and also when saving visual features, which is where the memory leak is observed.
However, your answer suggests the leak lives in the simulator's image-rendering path. This makes sense, because all transformer-based VLN methods (DUET, HAMT, RecBERT, BEVBERT) preload visual features in their code, so they avoid this problem during training.
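For reference, the preloading pattern those methods rely on can be sketched roughly as follows (the one-.npy-per-scan layout is hypothetical; mmap_mode='r' keeps the arrays on disk until individual rows are read, so resident memory stays bounded):

```python
import numpy as np

def save_scan_features(path, fts):
    # Offline step: write the (num_viewpoints x feature_dim) array once.
    np.save(path, fts)

def load_scan_features(path):
    # Online step: memory-map instead of rendering; the OS pages rows
    # in on demand and can evict them again under memory pressure.
    return np.load(path, mmap_mode="r")

# Usage sketch (hypothetical layout):
#   fts = load_scan_features("features/<scan_id>.npy")
#   view_ft = np.asarray(fts[view_index])  # copies one row into RAM
```

This avoids touching the simulator's rendering path at train time entirely.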
