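A Cover Song Identification Model Based on the Fusion of Deep Learning and Hand-Crafted Features
Cite this article:YANG Mei, CHEN Ning. A Cover Song Identification Model Based on the Fusion of Deep Learning and Hand-Crafted Features[J]. Researches in Medical Education, 2018, 44(5): 752-759.
Authors:YANG Mei  CHEN Ning
Affiliation:School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China (both authors)
Funding:National Natural Science Foundation of China (61271349)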
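Abstract:In cover song identification, hand-crafted features are highly customizable, but their shallow linear structure struggles to represent the nonlinear long-term structure of music. Feature extraction algorithms based on deep learning, which analyze the nonlinear dynamics of music, can compensate for this weakness. Building on a study of the complementarity between the two, this paper proposes a cover song identification algorithm that fuses hand-crafted and deep features. The algorithm uses a deep learning model and a hand-crafted algorithm to extract, respectively, the pitch class profile feature and the melody feature of a song. The similarities based on these two features are then combined into a similarity vector and fed to an improved SVM model, and the probability that the input song pair is a cover pair is taken as the fused similarity. To evaluate the algorithm, tests were run on two public datasets (covers80 and covers1212). The experimental results show that the algorithm achieves a higher identification rate and classification accuracy than algorithms based on a single feature and algorithms based on similarity fusion.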

Keywords:feature fusion  deep learning  cover song identification  SVM
Received:2017/7/15
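
The fusion step described in the abstract (two per-feature similarities combined into a vector, passed to an SVM, with the output probability used as the fused similarity) can be sketched as follows. This is a minimal illustration, not the authors' method: the abstract does not say how their SVM is "improved", so the sketch assumes a plain linear SVM trained by subgradient descent plus a simplified Platt-style sigmoid to turn the decision value into a probability. All function names and the toy similarity values are hypothetical.

```python
import math

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=2000):
    """Full-batch subgradient descent on the L2-regularized hinge loss.

    X: list of similarity vectors; y: labels, +1 (cover pair) or -1 (not).
    """
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [lam * wi for wi in w], 0.0
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:  # this pair violates the margin
                grad_w = [g - yi * xj / n for g, xj in zip(grad_w, xi)]
                grad_b -= yi / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_sigmoid(scores, y, lr=0.1, epochs=2000):
    """Simplified Platt-style calibration: p = sigmoid(a * score + c),
    fitted by gradient descent on the log loss."""
    a, c, n = 1.0, 0.0, len(scores)
    for _ in range(epochs):
        grad_a = grad_c = 0.0
        for s, yi in zip(scores, y):
            err = sigmoid(a * s + c) - (1.0 if yi == 1 else 0.0)
            grad_a += err * s / n
            grad_c += err / n
        a -= lr * grad_a
        c -= lr * grad_c
    return a, c

def fused_similarity(model, sim_vector):
    """Probability that a track pair is a cover pair, used as the fused score."""
    w, b, a, c = model
    return sigmoid(a * (dot(w, sim_vector) + b) + c)

# Hypothetical training pairs: [pitch-class-profile similarity, melody similarity].
# Some cover pairs score high on only one feature, which is where fusion helps.
X = [[0.90, 0.80], [0.85, 0.30], [0.30, 0.90], [0.70, 0.70],   # cover pairs
     [0.20, 0.10], [0.10, 0.30], [0.30, 0.20], [0.25, 0.25]]   # non-cover pairs
y = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(X, y)
a, c = fit_sigmoid([dot(w, x) + b for x in X], y)
model = (w, b, a, c)

p_cover = fused_similarity(model, [0.80, 0.75])
p_other = fused_similarity(model, [0.15, 0.20])
print(f"likely cover pair: {p_cover:.3f}, unlikely cover pair: {p_other:.3f}")
```

In practice the calibration would be fitted on held-out pairs rather than on the training set, and a library SVM with probability output could replace the hand-written trainer.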

Cover Song Identification Based on Fusion of Deep Learning and Manual Design Features
YANG Mei and CHEN Ning.Cover Song Identification Based on Fusion of Deep Learning and Manual Design Features[J].Researches in Medical Education,2018,44(5):752-759.
Authors:YANG Mei and CHEN Ning
Institution:School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China (both authors)
Abstract:Because a cover version may differ from the original in many respects, such as timbre, tempo, structure, key, arrangement, and even the language of the vocals, automatically identifying all cover versions of a given original is challenging. Most conventional cover song identification (CSI) schemes adopt hand-crafted features, which are highly customizable and effective. However, their shallow processing and linear mapping cannot precisely describe the complex dynamic characteristics of music. To address this problem, deep-learning architectures have recently been introduced into some music feature extraction algorithms, with good results. The performance of deep-learning-based schemes, however, depends heavily on the size of the training set, and they easily fall into local optima. In this paper, after analyzing the complementarity between hand-crafted and deep-learning features experimentally, we propose a feature fusion model. First, a deep learning model is trained to extract the deep pitch class profile (DPCP) feature, while a hand-crafted model is used to extract the main melody (MLD) feature. Next, the DPCP-based and MLD-based similarity scores are calculated via Dmax and used to construct a similarity function. The two similarity scores are also combined into a similarity vector, which is fed to an improved support vector machine (SVM) to obtain the probability that the input track pair is a reference/cover pair. Finally, the proposed model is compared, in terms of the receiver operating characteristic (ROC) curve and the area under the curve (AUC), with state-of-the-art CSI schemes based on a single feature and on multiple features. The experimental results show that the proposed scheme outperforms the CSI schemes based on the hand-crafted feature alone and on the deep-learning feature alone, and that it captures both the common and the complementary properties of the two kinds of feature.
Keywords:feature fusion  deep learning  cover song identification  SVM
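
The abstract evaluates the schemes by ROC curve and AUC. As a rough illustration of those two measures only, the sketch below computes ROC points by sweeping a threshold over fused similarity scores and then computes the AUC two ways: by trapezoidal integration of the curve, and as the fraction of (cover, non-cover) pairs ranked correctly. The scores and labels are made up for the example and are not results from the paper; the function names are hypothetical.

```python
def roc_points(scores, labels):
    """(false positive rate, true positive rate) at each distinct threshold.

    labels: 1 for a true cover pair, 0 otherwise.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= thr and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def auc_rank(scores, labels):
    """AUC as the probability that a random cover pair outscores a random
    non-cover pair (ties count as one half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Made-up fused similarity scores for six track pairs.
scores = [0.90, 0.80, 0.35, 0.60, 0.30, 0.10]
labels = [1, 1, 1, 0, 0, 0]

points = roc_points(scores, labels)
print("ROC points:", points)
print("AUC (trapezoid):", auc_trapezoid(points))
print("AUC (rank):", auc_rank(scores, labels))
```

Both estimates agree here because one cover pair (score 0.35) is ranked below one non-cover pair (score 0.60), so 8 of the 9 cross-class comparisons are correct.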