Contributed by: REN Liyong; HE Yongbin; HE Xi; YU Yongbin; LIU Siyi | Date: 2020-11-11
REN L Y, HE Y B, HE X, et al. Research on environmental sounds recognition based on MFCC, MODGDF and SVM[J]. Journal of Henan Polytechnic University (Natural Science), 2020, 39(6): 127-132.
Research on environmental sounds recognition based on MFCC, MODGDF and SVM
REN Liyong1, HE Yongbin1, HE Xi1, YU Yongbin1, LIU Siyi2
1. School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, Sichuan, China; 2. Sichuan Changjiang Vocational College, Chengdu 610106, Sichuan, China
Abstract: Environmental sound recognition is an important and challenging research topic in machine learning; it helps intelligent systems identify environmental sounds in audio data. This paper proposes a new environmental sound recognition method that combines Mel frequency cepstral coefficients (MFCC) and the modified group delay function (MODGDF) as feature parameters, and then classifies these features with a multi-class support vector machine (SVM) to recognize environmental sounds in audio data. On the DCASE 2018 dataset, the proposed method outperforms the DCASE 2018 baseline system, improving overall recognition accuracy by 25.8%.
Key words: environmental sound recognition; Mel frequency cepstral coefficient; modified group delay function; support vector machine
Foundation item: National Natural Science Foundation of China (61550110248)
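The abstract names the feature pipeline but gives no formulas, so the following is only a minimal sketch of how a MODGDF feature vector might be computed for one frame and concatenated with an MFCC vector. It assumes the commonly used definition of the modified group delay function (group delay computed from the spectra of x[n] and n·x[n], divided by a cepstrally smoothed magnitude spectrum raised to 2γ, then compressed with exponent α). The function names, the defaults `alpha=0.4`, `gamma=0.9`, `lifter=8`, and the naive DFT used to avoid external dependencies are all illustrative assumptions, not the authors' implementation. The MFCC vector is assumed to come from elsewhere, and the SVM stage is omitted.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) DFT; stands in for an FFT routine to stay dependency-free."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse DFT matching dft() above."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def smoothed_magnitude(mag, lifter):
    """Cepstrally smooth a magnitude spectrum by keeping only low quefrencies."""
    log_mag = [math.log(max(m, 1e-10)) for m in mag]  # floor avoids log(0)
    cep = idft(log_mag)
    n = len(cep)
    # Keep a symmetric low-quefrency window so the smoothed spectrum stays real.
    liftered = [c.real if (t < lifter or t > n - lifter) else 0.0
                for t, c in enumerate(cep)]
    return [math.exp(v.real) for v in dft(liftered)]


def modgdf(frame, alpha=0.4, gamma=0.9, lifter=8):
    """Modified group delay of one frame (assumed definition; see lead-in)."""
    x_spec = dft(frame)
    y_spec = dft([t * s for t, s in enumerate(frame)])  # spectrum of n * x[n]
    smooth = smoothed_magnitude([abs(v) for v in x_spec], lifter)
    out = []
    for xv, yv, sv in zip(x_spec, y_spec, smooth):
        tau = (xv.real * yv.real + xv.imag * yv.imag) / (sv ** (2 * gamma))
        out.append(math.copysign(abs(tau) ** alpha, tau))  # sign-preserving compression
    return out[: len(frame) // 2 + 1]  # non-negative frequencies only


def dct2(values, n_coeffs):
    """Unnormalized DCT-II, used here to decorrelate the group delay spectrum."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n_coeffs)]


def fuse_features(mfcc_vector, modgdf_vector):
    """Joint feature vector: plain concatenation of the two feature sets."""
    return list(mfcc_vector) + list(modgdf_vector)
```

A quick sanity check on the sketch: a unit impulse delayed by d samples has a flat magnitude spectrum and a constant group delay of d, so `modgdf(frame, alpha=1.0)` on such a frame should return values close to d at every frequency bin. The fused vectors from `fuse_features` would then be the input to a multi-class SVM.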