EA-BEV: Attention-enhanced Multimodal Fusion Method for 3D Object Detection
DOI:
https://doi.org/10.70695/AA1202502A18
Keywords:
BEV; 3D Object Detection; Multimodal Fusion
Abstract
To address the low accuracy of object detection in autonomous driving, we propose EA-BEV, an attention-enhanced multimodal fusion method for 3D object detection. In the image processing network, the method incorporates a self-attention mechanism that extracts deep features effectively and mitigates the insufficient image feature extraction caused by blurred semantic information. In the point cloud processing network, we design a high-order convolutional spatial attention mechanism that significantly enhances the network's ability to model and express the non-linear deep features of point clouds, thereby improving the global descriptive capability of point cloud information. In comparative experiments on the nuScenes dataset, EA-BEV achieves an mAP of 76.2% and an NDS of 74.4%, demonstrating a clear advantage in 3D object detection accuracy.
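As an illustration of the general self-attention idea mentioned above, the following is a minimal sketch of scaled dot-product self-attention over a handful of feature vectors. It is not the EA-BEV image branch: the function names `softmax` and `self_attention`, the toy `features` values, and the use of identity query/key/value projections (a real network would learn these) are all assumptions made for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    tokens: list of N feature vectors, each a list of d floats.
    Returns (attended, weights): attended[i] is a weighted mix of all
    tokens, and weights[i][j] is how strongly token i attends to token j.
    Assumption: learned projections are omitted to keep the sketch short.
    """
    d = len(tokens[0])
    scale = 1.0 / math.sqrt(d)
    attended, weights = [], []
    for q in tokens:
        # Similarity of this query to every token (including itself).
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in tokens]
        w = softmax(scores)
        # Weighted sum of the token vectors, one output channel at a time.
        mixed = [sum(wj * v[c] for wj, v in zip(w, tokens)) for c in range(d)]
        attended.append(mixed)
        weights.append(w)
    return attended, weights


# Hypothetical toy features: tokens 0 and 1 are similar, token 2 is not.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
attended, weights = self_attention(features)
```

In this toy input, each output vector is a convex combination of all input vectors, so a location with weak local evidence can borrow context from similar locations elsewhere. That is the general motivation for using self-attention when local image semantics are blurred.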