WANG X W, SHEN Y F, XING Q J, et al. Two-branch video action recognition method based on high-efficiency parameter fine-tuning [J]. Journal of Henan Polytechnic University (Natural Science), 2025, 44(4): 21-28.
doi: 10.16186/j.cnki.1673-9787.2025020018
Received: 2025/02/18
Revised: 2025/05/11
Published: 2025/06/19
Two-branch video action recognition method based on high-efficiency parameter fine-tuning
Wang Xiaowei1, Shen Yanfei2, Xing Qingjun2
1.Big Data Center, Physical Education College of Zhengzhou University, Zhengzhou 450052, Henan, China; 2.College of Sports Engineering, Beijing Sport University, Beijing 100084, China
Abstract: Objectives Video-based intelligent sports analysis has significant practical value for personalized training and customized sports analysis. Existing video action analysis frameworks rely on the “pre-training then fine-tuning” paradigm to transfer image pre-trained models to video temporal modeling. However, as model size and pre-training scale continue to grow, full-parameter updating through direct fine-tuning incurs high computational costs, and large-scale image-based architectures alone cannot effectively model the spatiotemporal features of video. Methods Therefore, a two-branch video action recognition framework named TBN (two-branch network), built on large-scale image pre-trained models, was proposed. The architecture adopted a spatiotemporally decoupled two-branch structure in which static background features and dynamic motion features were processed through distinct computational pathways. During transfer, the pre-trained weights remained frozen, and parameter-efficient transfer from the image pre-trained model to video temporal modeling was achieved by training only the small number of newly added parameters in the Prompt and Adapter components. In addition, to address the limitations of existing benchmark datasets in high-speed motion scenarios, a large-scale sports dataset named Kinetics-Sports was constructed. The dataset comprised 42 sports categories (including basketball, ice skating, hurdling, etc.), providing a more rigorous benchmark for motion analysis. Results Experiments on the Kinetics-Sports, UCF101, and HMDB51 datasets showed that the proposed method achieved recognition accuracies of 97.8%, 78.0%, and 74.2%, respectively, outperforming state-of-the-art approaches on the corresponding datasets. Furthermore, the framework required only 12 M parameters and exhibited lower computational complexity than prevailing mainstream algorithms. Conclusions The proposed model achieved a more favorable balance between accuracy and efficiency, improving both the accuracy of sports action recognition and the computational efficiency of inference. It thereby provided an efficient solution for video transfer learning with prevailing large-scale vision models.
Key words: video action recognition; pre-trained model; high-efficiency parameter fine-tuning; two-branch network; spatiotemporal modeling
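To make the parameter-efficient transfer idea described in the abstract concrete, the following is a minimal illustrative sketch, not the authors' TBN implementation: an image-pretrained backbone is frozen, and only a small set of newly added parameters (learnable prompt tokens, a lightweight adapter, and a classification head) is trained. All module names, dimensions, and the backbone used here are illustrative assumptions.

```python
# Minimal sketch (assumed, not the authors' code) of prompt + adapter tuning
# over a frozen pretrained encoder, as described in the abstract.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        self.act = nn.GELU()

    def forward(self, x):
        return x + self.up(self.act(self.down(x)))

class PromptedFrozenBackbone(nn.Module):
    """Freezes a pretrained encoder, prepends learnable prompt tokens,
    and applies a trainable adapter before classification."""
    def __init__(self, encoder, dim, num_prompts=8, num_classes=42):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():        # freeze pretrained weights
            p.requires_grad = False
        self.prompts = nn.Parameter(torch.randn(1, num_prompts, dim) * 0.02)
        self.adapter = Adapter(dim)
        self.head = nn.Linear(dim, num_classes)    # e.g. 42 sports categories

    def forward(self, tokens):                     # tokens: (B, N, dim)
        b = tokens.size(0)
        x = torch.cat([self.prompts.expand(b, -1, -1), tokens], dim=1)
        x = self.encoder(x)                        # frozen forward pass
        x = self.adapter(x).mean(dim=1)            # trainable adapter + pooling
        return self.head(x)

if __name__ == "__main__":
    dim = 256
    # Stand-in for an image-pretrained transformer backbone (assumption).
    encoder = nn.TransformerEncoder(
        nn.TransformerEncoderLayer(d_model=dim, nhead=8, batch_first=True),
        num_layers=2,
    )
    model = PromptedFrozenBackbone(encoder, dim)
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable params: {trainable} / {total}")  # only prompts/adapter/head train
    out = model(torch.randn(2, 16, dim))               # e.g. 16 frame/patch tokens
    print(out.shape)                                    # (2, 42)
```

In this sketch, only the prompt tokens, adapter, and head receive gradients, which is the mechanism by which the abstract's framework keeps the trainable parameter count small while adapting an image-pretrained model to video recognition.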