WANG H X, YANG Y P, WANG J Y, et al. Dynamic V-SLAM based on YOLOv5 combined with multi-view geometry[J]. Journal of Henan Polytechnic University (Natural Science), 2024, 43(6): 129-138.
doi: 10.16186/j.cnki.1673-9787.2023070027
Funding: National Natural Science Foundation of China (41807209)
Received: 2023-07-18
Revised: 2024-02-26
Published: 2024-09-24
Dynamic V-SLAM based on YOLOv5 combined with Multi-view Geometry
WANG Hongxing1, YANG Yaping1, WANG Jingyuan2, ZHANG Boyang3
1. School of Physics and Electronic Information Engineering, Henan Polytechnic University, Jiaozuo 454000, Henan, China; 2. Miami College, Henan University, Kaifeng 475004, Henan, China; 3. School of Civil Engineering, Henan Polytechnic University, Jiaozuo 454000, Henan, China
Abstract: Objectives The traditional visual SLAM (simultaneous localization and mapping) system is easily disturbed by moving objects in dynamic environments and therefore cannot achieve accurate localization and mapping. Methods To address this problem, a dynamic V-SLAM algorithm combining YOLOv5 with multi-view geometry was proposed on the basis of ORB_SLAM2. First, an object detection module was added to the front end of the visual SLAM system; this module used YOLOv5, a deep-learning object detection network, together with a multi-view geometry method to identify and segment dynamic and static objects. Second, based on the detection results of this module, dynamic feature points were discarded in the tracking thread, and only static feature points were used for inter-frame matching and pose estimation; the progressive sample consensus (PROSAC) algorithm was additionally employed to eliminate mismatched feature points and obtain the pose estimation model. Finally, keyframes with dynamic information removed were used to construct a dense point cloud map. To evaluate the effectiveness of the improved algorithm, experiments were conducted mainly on dynamic scenes from the TUM (Technical University of Munich) dataset. Results The results showed that, in the image feature matching experiments, the proposed algorithm had higher computational efficiency than ORB feature coarse matching with the random sample consensus (RANSAC) algorithm. In the trajectory tracking experiments, the proposed algorithm improved positioning accuracy by an average of 96.14% over the ORB_SLAM2 system and by an average of 94.52% over the ORB_SLAM3 system. In the point cloud mapping experiments, the proposed algorithm constructed dense point cloud maps consistent with the actual scene in three scenes with different motion states. Conclusions The improved V-SLAM algorithm had high reliability and accuracy in indoor dynamic scenarios.
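The core of the tracking-thread modification described in the abstract, discarding feature points that fall inside detected dynamic objects, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the keypoint and detection representations and the `dynamic_classes` set are assumptions made here for clarity.

```python
# Minimal sketch of dynamic feature-point culling: a keypoint whose
# coordinates fall inside the bounding box of a dynamic-class detection
# (e.g. "person") is discarded before inter-frame matching and pose
# estimation; keypoints on static objects are kept.

def cull_dynamic_points(keypoints, detections, dynamic_classes=frozenset({"person"})):
    """keypoints: list of (x, y); detections: list of (label, x1, y1, x2, y2)."""
    dynamic_boxes = [(x1, y1, x2, y2)
                     for label, x1, y1, x2, y2 in detections
                     if label in dynamic_classes]

    def is_static(pt):
        x, y = pt
        return not any(x1 <= x <= x2 and y1 <= y <= y2
                       for x1, y1, x2, y2 in dynamic_boxes)

    return [pt for pt in keypoints if is_static(pt)]

# Example: two keypoints; the first lies inside a "person" box and is
# culled, the second lies inside a "chair" box, which is static, so it
# is kept.
kps = [(50, 60), (300, 200)]
dets = [("person", 0, 0, 100, 100), ("chair", 250, 150, 400, 300)]
static = cull_dynamic_points(kps, dets)  # -> [(300, 200)]
```

In the actual system the boxes would come from YOLOv5 detections refined by the multi-view geometry check, so that movable-but-currently-static objects are not culled unnecessarily.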
Key words: V-SLAM; YOLOv5; multi-view geometry; PROSAC; dynamic scene
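The abstract contrasts PROSAC with RANSAC for rejecting mismatched features: instead of sampling correspondences uniformly, PROSAC draws samples from a progressively growing pool of the best-scoring matches, so good hypotheses tend to appear earlier. A toy sketch of that sampling idea on a line-fitting problem is given below; it is a simplified illustration under assumed scoring and pool-growth schedules, not the paper's implementation.

```python
import random

def prosac_fit_line(matches, scores, iters=200, thresh=1.0, seed=0):
    """PROSAC-style line fit: matches are (x, y) points, scores their
    match quality. Samples are drawn from a pool of the top-ranked
    matches that grows by one each iteration (simplified schedule)."""
    rng = random.Random(seed)
    order = sorted(range(len(matches)), key=lambda i: -scores[i])
    ranked = [matches[i] for i in order]
    best_model, best_inliers = None, -1
    pool = 2                                  # start with the top-2 matches
    for _ in range(iters):
        i, j = rng.sample(range(pool), 2)
        (x1, y1), (x2, y2) = ranked[i], ranked[j]
        if x1 == x2:
            continue                          # degenerate vertical sample
        a = (y2 - y1) / (x2 - x1)             # slope from the 2-point sample
        b = y1 - a * x1
        inliers = sum(abs(y - (a * x + b)) < thresh for x, y in matches)
        if inliers > best_inliers:
            best_model, best_inliers = (a, b), inliers
        pool = min(pool + 1, len(matches))    # progressively enlarge the pool
    return best_model, best_inliers

# Example: ten points on y = 2x + 1 with high scores, two outliers with
# low scores. The first sample already comes from the two best matches.
pts = [(x, 2 * x + 1) for x in range(10)] + [(3, 40), (7, -5)]
qual = [1.0] * 10 + [0.1, 0.1]
model, n_inliers = prosac_fit_line(pts, qual)  # -> (2.0, 1.0), 10
```

Because the quality ordering puts the two outliers last, the correct model is hypothesized in the very first iteration, which is the efficiency gain over uniform RANSAC sampling that the feature matching experiments measure.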