EA-BEV: Attention-enhanced Multimodal Fusion Method for 3D Object Detection

Authors

  • Yehui Ding, North China University of Technology
  • Yuntao Shi, North China University of Technology

DOI:

https://doi.org/10.70695/AA1202502A18

Keywords:

BEV; 3D Object Detection; Multimodal Fusion

Abstract

To address the low accuracy of object detection in autonomous driving, we propose EA-BEV, an attention-enhanced multimodal fusion method for 3D object detection. In the image processing network, the method incorporates a self-attention mechanism that extracts deep features effectively and mitigates the insufficient image feature extraction caused by blurred semantic information. In the point cloud processing network, we design a high-order convolutional spatial attention mechanism that significantly enhances the network's ability to model and express nonlinear deep features of point clouds, thereby improving the global descriptive capability of the point cloud information. In comparative experiments on the nuScenes dataset, EA-BEV achieves an mAP of 76.2% and an NDS of 74.4%, demonstrating a clear advantage in the precision of 3D object detection.
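
For readers unfamiliar with how the two reported metrics relate, the following is a minimal sketch, assuming the paper uses the standard nuScenes detection score, in which NDS combines mAP with five mean true-positive error terms (translation, scale, orientation, velocity, attribute). The function names and the example error values are hypothetical and are not taken from the paper.

```python
def nds(map_score, tp_errors):
    """nuScenes detection score (standard definition, assumed here).

    map_score: mean average precision in [0, 1].
    tp_errors: the five mean true-positive errors
               (mATE, mASE, mAOE, mAVE, mAAE).
    """
    if len(tp_errors) != 5:
        raise ValueError("expected exactly five TP error terms")
    # Each error is clipped to 1 and turned into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    # mAP carries half of the total weight, the five TP scores the other half.
    return (5.0 * map_score + sum(tp_scores)) / 10.0


def implied_mean_tp_score(map_score, nds_value):
    """Average TP score implied by a reported (mAP, NDS) pair."""
    return (10.0 * nds_value - 5.0 * map_score) / 5.0


# Reported values from the abstract: mAP = 76.2%, NDS = 74.4%.
mean_tp = implied_mean_tp_score(0.762, 0.744)

# Hypothetical check: five identical errors of (1 - mean_tp)
# reproduce the reported NDS under the assumed definition.
reconstructed = nds(0.762, [1.0 - mean_tp] * 5)
```

Under this assumption, the reported pair implies an average true-positive score of about 0.726 across the five error terms; the individual error values are not given in the abstract and cannot be recovered from these two numbers alone.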

Author Biographies

  • Yehui Ding, North China University of Technology

    Yehui Ding is currently pursuing a Ph.D. degree in Control Science and Engineering at the School of Electrical and Control Engineering, North China University of Technology, Beijing, China. His research interests focus on environmental perception for autonomous driving, including computer vision, object detection, and multimodal perception.

  • Yuntao Shi, North China University of Technology

    Yuntao Shi is a professor and an advisor to master's and doctoral students at the School of Electrical and Control Engineering, North China University of Technology, Beijing, China. His research interests include cloud computing, the industrial internet, and environmental perception for autonomous driving.

Published

2025-06-30

How to Cite

Ding, Y., & Shi, Y. (2025). EA-BEV: Attention-enhanced Multimodal Fusion Method for 3D Object Detection. Innovative Applications of AI, 2(2), 61-69. https://doi.org/10.70695/AA1202502A18