A Collaborative Filtering Algorithm with Improved Similarity - Henan Polytechnic University Publishing Center

Journal (Natural Science Edition), 2019, Issue 02

A Collaborative Filtering Algorithm with Improved Similarity

Contributed by: Yu Jinxia; Zang Liming; Wang Junfeng; Tang Yongli

Date: 2019-04-16


Authors: Yu Jinxia (于金霞); Zang Liming (臧利明); Wang Junfeng (王俊峰); Tang Yongli (汤永利)

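Abstract: Traditional similarity measures derive the relationship between users from rating information. Because these methods consider user similarity only in terms of users' ratings, the results are one-sided and are strongly affected by sparse data sets, which lowers recommendation accuracy. To address the data sparsity problem in conventional collaborative filtering recommendation algorithms, a new calculation scheme is proposed: a user similarity weight coefficient is introduced, and the Pearson correlation coefficient is weighted and then combined with the Jaccard similarity method. The improved algorithm takes into account both the proportion of items that two users have rated in common and the magnitude of the users' ratings, improving the similarity measure that is central to collaborative filtering. Experiments on two public data sets, MovieLens and Book-Crossing, show that the improved algorithm reduces the mean absolute error by up to 5.2%, effectively reducing the impact of sparse data on the recommendation results and significantly improving the accuracy of the recommendation system.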
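The abstract names the ingredients of the similarity measure (a weight coefficient, the Pearson correlation coefficient, and the Jaccard similarity) but not the exact formula. The sketch below shows one plausible way to combine them, assuming a simple product; the function names, the `weight` parameter, and the product form are illustrative assumptions, not the authors' published scheme.

```python
from math import sqrt


def jaccard(ratings_u, ratings_v):
    """Share of co-rated items among all items rated by either user."""
    items_u, items_v = set(ratings_u), set(ratings_v)
    union = items_u | items_v
    return len(items_u & items_v) / len(union) if union else 0.0


def pearson(ratings_u, ratings_v):
    """Pearson correlation over co-rated items; 0.0 when it is undefined."""
    common = sorted(set(ratings_u) & set(ratings_v))
    if len(common) < 2:
        return 0.0
    xs = [ratings_u[i] for i in common]
    ys = [ratings_v[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a user with constant ratings has no defined correlation
    return cov / sqrt(var_x * var_y)


def combined_similarity(ratings_u, ratings_v, weight=1.0):
    """ASSUMPTION: weighted Pearson multiplied by Jaccard.

    The paper's actual weight coefficient and combination rule are not
    given in the abstract; `weight` is a hypothetical placeholder.
    """
    return weight * pearson(ratings_u, ratings_v) * jaccard(ratings_u, ratings_v)
```

With made-up ratings `u = {"a": 5, "b": 3, "c": 1}` and `v = {"a": 4, "b": 3, "c": 2, "d": 5}`, the two users agree perfectly on their three co-rated items (Pearson 1.0) but share only three of four rated items (Jaccard 0.75), so the combined value is 0.75. This is how the Jaccard factor discounts correlations that rest on little overlap.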

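Funding: National Cryptography Development Fund (MMJJ20170122); Henan Province Basic and Frontier Technology Research Program (142300410147); Key Scientific Research Projects of Henan Higher Education Institutions (12A520021, 16A520013)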

Keywords: recommendation system; collaborative filtering; similarity; mean absolute error

DOI: 10.16186/j.cnki.1673-9787.2019.2.18

CLC number: TP391.3

Abstract: Traditional similarity calculation methods derive the correlation between users from rating information alone. Because these methods consider only users' ratings, the resulting similarities are one-sided, and sparse data sets degrade the accuracy of the recommendations. To address the data sparsity problem in conventional collaborative filtering recommendation algorithms, a collaborative filtering algorithm with improved similarity is proposed. The scheme introduces a user similarity weight coefficient, weights the Pearson correlation coefficient, and combines it with the Jaccard similarity method, thereby improving the similarity metric, a key component of collaborative filtering. Experiments on two public data sets (MovieLens and Book-Crossing) show that the improved algorithm reduces the mean absolute error by up to 5.2%. The improved algorithm thus lessens the influence of sparse data on the recommendation results and significantly improves the accuracy of the recommendation system.
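The evaluation metric in the abstract, mean absolute error (MAE), is the average absolute difference between predicted and actual ratings. The sketch below computes it, along with a relative reduction between two MAE values. It assumes the reported 5.2% is a relative reduction, which the abstract does not state explicitly; the function names are illustrative.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and actual ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def relative_reduction(baseline_mae, improved_mae):
    """ASSUMPTION: the improvement is reported relative to the baseline MAE."""
    return (baseline_mae - improved_mae) / baseline_mae
```

A lower MAE means predictions are closer to the ratings users actually gave, so a reduction in MAE corresponds to higher recommendation accuracy.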

Keywords: recommendation system; collaborative filtering; similarity; mean absolute error
