- In the field of 3D object detection for autonomous driving, LiDAR-Camera (LC) fusion is the top-performing sensor configuration.
- Still, LiDAR is relatively high cost, which hinders adoption of this technology for consumer automobiles.
- Alternatively, camera and radar are commonly deployed on vehicles already on the road today, but performance of Camera-Radar (CR) fusion falls behind LC fusion.
- In this work, we propose CRKD to bridge the performance gap between LC and CR detectors with a novel cross-modality knowledge distillation (KD) framework.
- We use the Bird's-Eye-View (BEV) representation as the shared feature space to enable effective knowledge distillation.
- To accommodate the unique cross-modality KD path, we propose four distillation losses to help the student learn crucial features from the teacher model.
- We present extensive evaluations on the nuScenes dataset to demonstrate the effectiveness of the proposed CRKD framework.
+ High-definition (HD) maps provide environmental information for autonomous driving systems and are essential for safe planning.
+ While existing methods with single-frame input achieve impressive performance for online vectorized HD map construction, they still struggle with complex scenarios and occlusions.
+ We propose MemFusionMap, a novel temporal fusion model with enhanced temporal reasoning capabilities for online HD map construction.
+ Specifically, we contribute a working memory fusion module that improves the model's memory capacity to reason across a history of frames. We also design a novel temporal overlap heatmap to explicitly inform the model about the temporal overlap information and vehicle trajectory in the Bird's Eye View space. By integrating these two designs, MemFusionMap significantly outperforms existing methods while also maintaining a versatile design for scalability.
+ We conduct extensive evaluation on open-source benchmarks and demonstrate a maximum improvement of 5.4% in mAP over state-of-the-art methods.
\item We propose a novel design of maintaining a temporal overlap heatmap, providing a strong cue for the model to reason across a history of frames and also implicitly encoding valuable insights of the vehicle's trajectory.
- We propose a novel cross-modality KD framework to enable LC-to-CR distillation in the BEV feature space. With the transferred knowledge
- from an LC teacher detector, the CR student detector can outperform existing baselines without additional cost during inference.
+ We propose a simple yet effective model to fuse working memory features in BEV space for online vectorized HD map construction.
+ MemFusionMap focuses on improving the network's temporal reasoning capability while also maintaining a versatile design for scalability and compatibility.
  </p>
  <p>
- We design four KD modules to address the notable discrepancies between different sensors to realize effective cross-modality KD.
- As we operate KD in the BEV space, the proposed loss designs can be applied to other KD configurations.
- Our improvement also includes adding a gated network to the baseline model for adaptive fusion.
+ We propose a novel design of maintaining a temporal overlap heatmap, providing a strong cue for the model to reason across a history of frames and also implicitly encoding valuable insights of the vehicle's trajectory.
  </p>
  <p>
- We conduct extensive evaluation on nuScenes to demonstrate the effectiveness of CRKD.
- CRKD can improve the mAP and NDS of student detectors by 3.5% and 3.2% respectively.
- Since our method focuses on a novel KD path with distinctively large modality gap, we provide thorough study and analysis to support our design choices.
+ We conduct extensive evaluation on nuScenes and Argoverse2 to demonstrate the effectiveness of MemFusionMap.
+ The proposed method significantly outperforms the state-of-the-art method, achieving a maximum improvement of 5.4% in mAP.
  </div>
  </div>
  </div>
  <!-- add an image docs/static/images/Overall_Diagram.svg -->