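A Clustering Analysis Algorithm Based on Projected Clusters
Source journal: Journal of Central South University (Natural Science Edition) (中南大学学报(自然科学版)), 2004, Issue 1
Authors: 赖邦传, 陈晓红, 周辉
Pages: 112-116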
Keywords (from the Chinese): multidimensional data; data mining; cluster analysis; projected cluster
Key words: high-dimensional data; data mining; clustering; projected cluster
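Abstract (from the Chinese): The concept of the projected cluster is applied to make explicit the relationship between clusters and dimensions in multidimensional data, turning the clustering problem into a projected cluster problem. Sampling is combined with PAM, and clustering is carried out by computing the distances between data objects and between clusters with the Manhattan distance. A concrete algorithm is also given and compared experimentally with the k-medoids algorithm. The experimental results demonstrate the effectiveness of the algorithm.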
Abstract: Projected clusters are used to analyze the relationship between a cluster and its dimensions in high-dimensional data, and clustering is realized by solving the projected cluster problem and combining sampling with PAM. During clustering, the Manhattan distance is used to compute the distance between data objects and between clusters. A corresponding fast algorithm is developed based on projected clusters. The algorithm was compared experimentally with the k-center algorithm, and the results confirmed its validity.
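
The abstract names the building blocks (sampling, PAM, Manhattan distance, and clusters tied to a subset of dimensions) but not the algorithm's details. The following is only a minimal sketch of how those pieces might fit together, not the authors' method. All function names (`manhattan`, `pam`, `sampled_pam`) and the `dims` parameter are hypothetical, and the swap search shown is a plain textbook PAM run on a random sample. The sketch does not attempt to select each cluster's dimensions; `dims` only shows how a distance could be restricted to them.

```python
import random


def manhattan(a, b, dims=None):
    """Manhattan (L1) distance between points a and b.

    If dims is given, only those coordinate indices are used. This is an
    assumed way to measure distance inside a projected cluster.
    """
    idx = range(len(a)) if dims is None else dims
    return sum(abs(a[i] - b[i]) for i in idx)


def total_cost(points, medoids):
    """Sum of each point's distance to its nearest medoid."""
    return sum(min(manhattan(p, m) for m in medoids) for p in points)


def pam(points, k, rng):
    """Textbook PAM swap search: accept any swap that lowers the cost."""
    medoids = rng.sample(points, k)
    best = total_cost(points, medoids)
    improved = True
    while improved:
        improved = False
        for i in range(k):
            for cand in points:
                if cand in medoids:
                    continue
                trial = medoids[:i] + [cand] + medoids[i + 1:]
                cost = total_cost(points, trial)
                if cost < best:
                    medoids, best, improved = trial, cost, True
    return medoids


def sampled_pam(points, k, sample_size, seed=0):
    """Run PAM on a random sample, then assign every point to a medoid."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    medoids = pam(sample, k, rng)
    labels = [min(range(k), key=lambda j: manhattan(p, medoids[j]))
              for p in points]
    return medoids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    medoids, labels = sampled_pam(data, k=2, sample_size=6)
    print(medoids, labels)
```

Running PAM on a sample rather than on the full data set keeps the quadratic swap search cheap. The final pass assigns every point to its nearest medoid, so the sample size trades clustering quality against running time.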