Contributed by: LIU Yongli, CHANG Ran | Date: 2024-05-15
刘永利, 常冉.基于ECM的多视图模糊聚类算法[J].河南理工大学学报(自然科学版),2024,43(3):154-160.
LIU Y L, CHANG R. Multi-view fuzzy clustering algorithm based on ECM[J]. Journal of Henan Polytechnic University (Natural Science), 2024, 43(3): 154-160.
基于ECM的多视图模糊聚类算法
刘永利, 常冉
河南理工大学 计算机科学与技术学院,河南 焦作 454000
摘要: 目的 传统聚类算法多属于单视图聚类的范畴,在数据结构形式日趋复杂的今天,单视图聚类越来越难以对数据集进行全面而准确的知识表达。特别地,虽然证据C-均值聚类算法的数据结构揭示能力比较突出,但是囿于单视图的算法设计,其对于数据集的综合描述能力较为薄弱。 方法 为解决该问题,提出一种基于证据C-均值聚类的多视图模糊聚类算法。该算法在信念函数的理论框架下形成凭证分区,然后计算各特征在不同视图下的权重,并将该权重赋予不同视角下的各个分区,从而生成最终的聚类结果。一方面扩展了硬划分、模糊划分和可能性划分的概念,可同时继承证据C-均值聚类算法和多视图模糊聚类的优点,挖掘不同视图下的有价值信息,另一方面能够根据视图重要程度自动分配权重,据此提高聚类准确率。 结果 为验证算法的聚类效果,在4个多视图数据集上与其他5种算法进行了对比实验,实验内容包括聚类准确率、聚类效率和参数分析3部分。实验结果表明,所提算法在准确率、F度量和标准化互信息3个量化指标上表现较好,说明在聚类准确率方面优于对比算法;在聚类效率上,除去在个别数据集上因迭代次数过多导致聚类时间略长外,总体接近于对比算法中的最优表现。 结论 这些表现进一步证明了所提算法在处理多视图数据集时的有效性。
关键词:聚类;多视图;特征;权重;准确率
doi:10.16186/j.cnki.1673-9787.2021110037
Funding: National Natural Science Foundation of China (61872126)
Received: 2021/11/11
Revised: 2022/05/30
Published: 2024/05/15
Multi-view fuzzy clustering algorithm based on ECM
LIU Yongli, CHANG Ran
School of Computer Science and Technology, Henan Polytechnic University, Jiaozuo 454000, Henan, China
Abstract: Objectives Traditional clustering algorithms are mostly single-view methods. As data structures grow increasingly complex, single-view clustering finds it ever harder to represent the knowledge in a dataset comprehensively and accurately. In particular, although the evidential C-means (ECM) clustering algorithm is notably good at revealing data structure, its single-view design leaves it relatively weak at describing a dataset as a whole. Methods To address this issue, a multi-view fuzzy clustering algorithm based on evidential C-means clustering was proposed. The algorithm formed credal partitions within the theoretical framework of belief functions, calculated the weight of each feature under the different views, and assigned these weights to the partitions obtained under the respective views, thereby generating the final clustering result. On the one hand, it extended the concepts of hard, fuzzy, and possibilistic partitions, inherited the advantages of both the evidential C-means clustering algorithm and multi-view fuzzy clustering, and mined valuable information from the different views. On the other hand, it automatically allocated weights according to the importance of each view, thereby improving clustering accuracy. Results To verify the clustering performance of the proposed algorithm, comparative experiments were conducted on four multi-view datasets against five other algorithms. The experiments covered three parts: clustering accuracy, clustering efficiency, and parameter analysis. The proposed algorithm performed well on three quantitative metrics (accuracy, F-measure, and normalized mutual information), indicating that its clustering accuracy was superior to that of the comparison algorithms. In terms of clustering efficiency, apart from slightly longer clustering times on a few datasets caused by a large number of iterations, its overall performance was close to the best among the comparison algorithms. Conclusions These results further demonstrated the effectiveness of the proposed algorithm in handling multi-view datasets.
Key words: clustering; multi-view; feature; weight; accuracy
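
The abstract describes forming a credal partition per view and weighting the views before producing a final result. The sketch below illustrates only that general idea and is not the authors' algorithm: a credal partition gives each object a mass on every subset of clusters (the empty set stands for "outlier"), and here the per-view masses are merged by a plain weighted average, which is an assumption made for illustration. The function names, the cluster labels `a` and `b`, the mass values, and the view weights are all hypothetical.

```python
def combine_views(masses, weights):
    """Merge per-view mass functions by weighted average.

    Assumption for illustration only; the paper's actual update
    rules are not given in the abstract.
    """
    total = sum(weights)
    focal_sets = set().union(*(m.keys() for m in masses))
    return {
        subset: sum(w * m.get(subset, 0.0) for m, w in zip(masses, weights)) / total
        for subset in focal_sets
    }


def pignistic(mass):
    """Pignistic probability of each single cluster.

    The mass of a subset is shared equally among its clusters, and the
    result is renormalized by the mass not assigned to the empty set.
    """
    outlier = mass.get(frozenset(), 0.0)
    clusters = set().union(*mass.keys())
    return {
        c: sum(v / len(s) for s, v in mass.items() if c in s) / (1.0 - outlier)
        for c in sorted(clusters)
    }


# Hypothetical masses of one object in two views over clusters {a, b}.
view1 = {frozenset(): 0.1, frozenset({"a"}): 0.6,
         frozenset({"b"}): 0.1, frozenset({"a", "b"}): 0.2}
view2 = {frozenset(): 0.0, frozenset({"a"}): 0.2,
         frozenset({"b"}): 0.4, frozenset({"a", "b"}): 0.4}

# Hypothetical view weights: view 1 is treated as more important.
combined = combine_views([view1, view2], [0.75, 0.25])
bet = pignistic(combined)
hard_label = max(bet, key=bet.get)  # hard decision derived from the credal partition
print(combined)
print(bet, hard_label)
```

A hard partition corresponds to all mass on one singleton, and a fuzzy partition to mass on singletons only, which is the sense in which a credal partition generalizes both.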