Author: ZHAO Yanru, NIU Dongjie, SUN Donghong, YANG Huimeng | Time: 2023-11-10
ZHAO Y R, NIU D J, SUN D H, et al. Video person re-identification based on convolutional neural network and Transformer[J]. Journal of Henan Polytechnic University (Natural Science), 2023, 42(6): 149-156.
doi:10.16186/j.cnki.1673-9787.2021120013
Received: 2021/12/03
Revised: 2022/02/25
Published: 2023/11/25
Video person re-identification based on convolutional neural network and Transformer
ZHAO Yanru, NIU Dongjie, SUN Donghong, YANG Huimeng
School of Mechanical and Power Engineering, Henan Polytechnic University, Jiaozuo 454000, Henan, China
Abstract: Extracting person features with a convolutional neural network alone performs poorly in video person re-identification. To address this problem, a network model named ResTNet (ResNet and Transformer Network), which combines a convolutional neural network with a Transformer, was proposed. In ResTNet, the ResNet50 network was used to obtain local features, and the output of its middle layer was fed to the Transformer as prior knowledge. In the Transformer branch, the size of the feature map was progressively reduced and the receptive field was expanded, so that the relationships among local features were fully explored to generate global pedestrian features, while the shifted-window method reduced the computational cost of the model. On the large-scale MARS dataset, Rank-1 and mAP reached 86.8% and 80.3%, respectively, which were 3.8% and 3.3% higher than the baseline. Excellent performance was also achieved on the two small-scale datasets. The Transformer model was thus successfully applied to video person re-identification, and extensive experiments on several large datasets showed that the proposed ResTNet could enhance the robustness of recognition and effectively improve the accuracy of person re-identification.
Key words: video person re-identification; convolutional neural network; Transformer; local feature; global feature
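The abstract reports results as Rank-1 and mAP. As a reading aid, the following minimal sketch shows how these two retrieval metrics are commonly computed from a query-to-gallery distance matrix. It is an illustration under stated assumptions, not the authors' evaluation code: the function name `rank1_and_map` and the toy data are hypothetical, and the camera-based and junk-sample filtering used in standard re-identification protocols is omitted.

```python
def rank1_and_map(dist, query_ids, gallery_ids):
    """Return (Rank-1, mAP) for a list-of-lists distance matrix.

    dist[q][g] is the distance between query q and gallery item g
    (smaller means more similar). Hypothetical helper for illustration.
    """
    rank1_hits = []
    average_precisions = []
    for row, qid in zip(dist, query_ids):
        # Sort gallery indices from the nearest to the farthest item.
        order = sorted(range(len(row)), key=lambda j: row[j])
        matches = [gallery_ids[j] == qid for j in order]
        if not any(matches):
            continue  # a query with no correct gallery item is skipped
        # Rank-1: is the nearest gallery item the correct identity?
        rank1_hits.append(1.0 if matches[0] else 0.0)
        # Average precision: mean of precision at each correct match.
        hits = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)
        average_precisions.append(sum(precisions) / len(precisions))
    if not rank1_hits:
        raise ValueError("no query has a matching gallery item")
    n = len(rank1_hits)
    return sum(rank1_hits) / n, sum(average_precisions) / n


# Toy data (assumed): 2 queries, 3 gallery items.
toy_dist = [[0.1, 0.9, 0.4],
            [0.2, 0.8, 0.3]]
toy_query_ids = [1, 2]
toy_gallery_ids = [1, 2, 1]
rank1, mean_ap = rank1_and_map(toy_dist, toy_query_ids, toy_gallery_ids)
print(f"Rank-1 = {rank1:.3f}, mAP = {mean_ap:.3f}")
```

In this toy case the first query retrieves both of its correct items first, while the second query finds its only correct item at rank 3, so Rank-1 counts one hit out of two and mAP averages the two per-query precisions.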