Date: 2025-07-23
SUN S M, TANG Y H, TAI T, et al. Lightweight image segmentation method for transmission line insulators based on MobileNetV2[J]. Journal of Henan Polytechnic University (Natural Science), 2025, 44(5): 35-42.
Lightweight image segmentation method for transmission line insulators based on MobileNetV2
Sun Shiming¹, Tang Yuanhe¹, Tai Tong¹, Wei Xueyun¹, Fang Wei²
1. NARI-TECH Nanjing Control Systems Co., Ltd., Nanjing 211106, Jiangsu, China; 2. School of Computer Science, Nanjing University of Information Science & Technology, Nanjing 210044, Jiangsu, China
Abstract: Objectives: To address low insulator segmentation accuracy in aerial images from transmission line inspection, the limited computing power of edge devices, large model parameter counts, and insufficient real-time performance, a lightweight transmission line insulator segmentation network, ISNet, based on MobileNetV2 is proposed. Methods: First, the lightweight MobileNetV2 is used as the encoder backbone to extract multi-scale features from the input image. Second, a new diverse feature aggregation module (DFAM) is proposed, which aggregates diverse spatial position information and high-level semantic information through convolutional layers with different convolution kernels together with channel attention and spatial attention mechanisms. Finally, a multi-level symmetric decoder (MSD) is designed to fuse the output features of the encoder at the same level with those of the preceding decoder stage, generating the final prediction map. Results: Experiments show that the proposed method performs well on an aerial-image insulator segmentation dataset. ISNet reaches an mIoU of 90.9%, which is 5.2% and 1.2% higher than DeepLabV3plus and SegFormer, respectively, and an mPA of 93.6%, which is 5.2% and 0.8% higher than DeepLabV3plus and SegFormer, respectively. In addition, on a single NVIDIA RTX 3090 GPU, ISNet reaches an inference speed of 71.2 F/s (frames per second) with only 3.1 M parameters and 21.2 G floating-point operations (FLOPs) at an input size of 1024 × 1024, outperforming current mainstream semantic segmentation methods. Conclusions: ISNet achieves the best segmentation accuracy while also improving the model's lightweight design and real-time performance.
Keywords: MobileNetV2; semantic segmentation; insulator; transmission line inspection; deep learning; computer vision
DOI: 10.16186/j.cnki.1673-9787.2024070039
Funding: National Natural Science Foundation of China (42475149); Science and Technology Information Project of NARI-TECH Nanjing Control Systems Co., Ltd. (2023h581)
Received: 2024/07/09
Revised: 2024/10/02
Published: 2025/07/23
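Note on the metrics: for readers unfamiliar with the two accuracy figures quoted in the abstract, the following is a minimal sketch of how mIoU (mean intersection over union) and mPA (mean pixel accuracy) are commonly computed from a per-class confusion matrix. It is not the authors' evaluation code. The function name `segmentation_metrics`, the two-class toy labels (0 = background, 1 = insulator), and the choice to skip classes that never occur are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def segmentation_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> Tuple[float, float]:
    """Return (mIoU, mPA) for two flattened sequences of class labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # conf[t][p] counts pixels whose true class is t and predicted class is p.
    conf: List[List[int]] = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        conf[t][p] += 1

    ious: List[float] = []
    accs: List[float] = []
    for c in range(num_classes):
        tp = conf[c][c]                        # correctly labelled pixels of class c
        n_true = sum(conf[c])                  # pixels whose ground truth is c
        n_pred = sum(row[c] for row in conf)   # pixels predicted as c
        union = n_true + n_pred - tp
        # Assumption: classes that never occur are left out of the averages.
        if union > 0:
            ious.append(tp / union)
        if n_true > 0:
            accs.append(tp / n_true)

    if not ious:
        raise ValueError("no labelled pixels to evaluate")
    return sum(ious) / len(ious), sum(accs) / len(accs)


# Hypothetical toy data: 8 pixels, 0 = background, 1 = insulator.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]
miou, mpa = segmentation_metrics(y_true, y_pred, num_classes=2)
print(f"mIoU = {miou:.3f}, mPA = {mpa:.3f}")  # mIoU = 0.600, mPA = 0.750
```

In the toy example each class has 3 correctly labelled pixels out of 4, and a union of 5 pixels, so the per-class IoU is 3/5 and the per-class pixel accuracy is 3/4. Because IoU also penalises false positives, mIoU is never higher than mPA for the same predictions, which is consistent with the abstract reporting a lower mIoU (90.9%) than mPA (93.6%).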