SAFETYLIT WEEKLY UPDATE

We compile citations and summaries of about 400 new articles every week.
RSS Feed

HELP: Tutorials | FAQ
CONTACT US: Contact info

Search Results

Journal Article

Citation

Jiang H, Ren J, Li A. Sensors (Basel) 2024; 24(11).

Copyright

(Copyright © 2024, MDPI: Multidisciplinary Digital Publishing Institute)

DOI

10.3390/s24113267

PMID

38894060

PMCID

PMC11174701

Abstract

To enhance the accuracy of detecting objects in front of intelligent vehicles in urban road scenarios, this paper proposes a dual-layer voxel feature fusion augmentation network (DL-VFFA). It aims to address the issue of objects misrecognition caused by local occlusion or limited field of view for targets. The network employs a point cloud voxelization architecture, utilizing the Mahalanobis distance to associate similar point clouds within neighborhood voxel units. It integrates local and global information through weight sharing to extract boundary point information within each voxel unit. The relative position encoding of voxel features is computed using an improved attention Gaussian deviation matrix in point cloud space to focus on the relative positions of different voxel sequences within channels. During the fusion of point cloud and image features, learnable weight parameters are designed to decouple fine-grained regions, enabling two-layer feature fusion from voxel to voxel and from point cloud to image. Extensive experiments on the KITTI dataset demonstrate the significant performance of DL-VFFA. Compared to the baseline network Second, DL-VFFA performs better in medium- and high-difficulty scenarios. Furthermore, compared to the voxel fusion module in MVX-Net, the voxel feature fusion results in this paper are more accurate, effectively capturing fine-grained object features post-voxelization. Through ablative experiments, we conducted in-depth analyses of the three voxel fusion modules in DL-VFFA to enhance the performance of the baseline detector and achieved superior results.


Language: en

Keywords

object detection; multimodal fusion; channel attention mechanism; voxel feature fusion

NEW SEARCH


All SafetyLit records are available for automatic download to Zotero & Mendeley
Print