Vocal effort mode detection based on the ESN-RBF framework
Contributed by: CHAO Hao; DONG Liang  Date: 2019-07-04

Authors: CHAO Hao, DONG Liang

Affiliation: College of Computer Science and Technology, Henan Polytechnic University

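Abstract: Frame-based spectral features cannot describe the temporal correlation and dynamic variation inherent in speech during vocal effort detection. To address this problem, a vocal effort detection method combining an echo state network and a radial basis function network is proposed. First, the sequence of acoustic observation features is fed into the echo state network, and the node states of its reservoir are used to encode the input sequence of observation vectors, mapping the frame-based acoustic observation vector sequence into a high-dimensional coding space. Next, a radial basis function network is used to fit the probability density function of the encoded vectors for each vocal effort mode. Finally, the minimum-error-rate Bayes decision rule is used to determine the vocal effort mode. In a vocal effort detection experiment on a test set of 5,000 isolated words, the method achieved a recognition accuracy of 79.5%. The results show that the proposed method effectively captures the correlation between speech frames and overcomes the limitation of the inter-frame independence assumption.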

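Funding: National Natural Science Foundation of China (61502150, 61403128); Key Scientific Research Project of Higher Education Institutions of Henan Province (19A520004); Young Backbone Teacher Research Project of Higher Education Institutions of Henan Province (2015GGJS-068); Fundamental Research Funds for the Universities of Henan Province (NSFRF1616)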

Keywords: vocal effort detection; echo state network; reservoir; radial basis function; support vector machine

DOI: 10.16186/j.cnki.1673-9787.2019.4.16

Classification number: TN912.3

Vocal effort detection based on ESN-RBF framework

CHAO Hao, DONG Liang

College of Computer Science and Technology, Henan Polytechnic University

Abstract: Frame-based spectral features cannot describe the inherent temporal correlation and dynamic change information in speech phenomena for vocal effort (VE) detection. In view of this, a vocal effort detection method based on the ESN-RBF framework was proposed. The acoustic observation sequence was fed to an echo state network (ESN), and the reservoir of this network was used to map the acoustic observation sequence to a vector in a high-dimensional coding space. Then, a radial basis function (RBF) network was employed to fit the probability density function of each VE mode using the vectors in the high-dimensional coding space. Finally, the minimum-error-rate Bayes decision was employed to determine the vocal effort mode. Experiments were conducted on a test set of 5,000 isolated words, and the proposed method achieved 79.5% average recognition accuracy. The results showed that the proposed method could effectively capture the correlation information between speech frames and overcome the defect of the inter-frame independence assumption.
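
The three stages named in the abstract (reservoir encoding, RBF density fitting, Bayes decision) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract gives no network sizes, weight ranges, or training details, so every function name, the spectral radius of 0.9, the use of the final reservoir state as the code, and the equal-weight Gaussian RBF mixture standing in for the trained RBF network are assumptions made here for illustration only.

```python
import numpy as np


def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Draw fixed random input and recurrent weights for an echo state network."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w


def encode(frames, w_in, w):
    """Run the reservoir over a (T, n_in) frame sequence and return the final state.

    Assumption: the last reservoir state is used as the sequence code.
    """
    x = np.zeros(w.shape[0])
    for u in frames:
        x = np.tanh(w_in @ u + w @ x)
    return x


def rbf_log_density(z, centers, width):
    """Log of an equal-weight mixture of isotropic Gaussian RBFs placed at `centers`.

    Assumption: a simple stand-in for a trained RBF network density model.
    """
    d = centers.shape[1]
    sq = np.sum((centers - z) ** 2, axis=1)
    log_k = -0.5 * sq / width**2 - 0.5 * d * np.log(2 * np.pi * width**2)
    m = np.max(log_k)  # log-mean-exp for numerical stability
    return m + np.log(np.mean(np.exp(log_k - m)))


def bayes_decide(z, class_centers, priors, width=1.0):
    """Minimum-error-rate Bayes decision: maximize log prior + log class density."""
    scores = {c: np.log(priors[c]) + rbf_log_density(z, ctrs, width)
              for c, ctrs in class_centers.items()}
    return max(scores, key=scores.get)
```

In use, each training utterance of a given VE mode would be passed through `encode`, the resulting codes stacked into that mode's `centers` array, and a test utterance assigned the mode returned by `bayes_decide`. Because the code depends on the whole frame sequence through the recurrent state, it carries inter-frame information that a per-frame feature does not.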
