Your conditions: 姜智涵
  • An Entropy-Based Spectral Clustering Algorithm for Mixed-Attribute Data

    Subjects: Computer Science >> Integration Theory of Computer Science | Submitted: 2018-04-24 | Cooperative journal: 《计算机应用研究》 (Application Research of Computers)

    Abstract: Traditional clustering algorithms can handle only a single attribute type and therefore cope poorly with mixed-type data. Most existing clustering algorithms for mixed-type data are sensitive to initialization and cannot handle clusters of arbitrary shape. This paper proposes an entropy-based spectral clustering algorithm for mixed-type data. First, a new similarity measure is proposed: the numerical attributes are used to build a Gaussian kernel matrix, as in standard spectral clustering, and the categorical attributes are used to build an entropy-based influence-factor matrix. The two matrices are combined into a new similarity matrix that replaces the traditional one and avoids feature transformation and parameter tuning between the numerical and categorical data. The new similarity matrix is then applied in the spectral clustering algorithm, so that data of arbitrary shape can be handled, to obtain the final clustering result. Experiments on UCI data sets show that the algorithm handles the clustering of mixed-attribute data effectively, with high stability and good robustness.
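The abstract does not give the formulas for the similarity measure, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that (a) the numerical part is a standard Gaussian kernel with a hypothetical bandwidth `sigma`, (b) the categorical part is a matching score in which each attribute is weighted by its share of the total Shannon entropy, and (c) the two matrices are combined by a plain element-wise average. All function names are illustrative.

```python
import math
from collections import Counter


def gaussian_similarity(numeric_rows, sigma=1.0):
    """Gaussian kernel matrix over the numerical attributes."""
    n = len(numeric_rows)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2
                          for a, b in zip(numeric_rows[i], numeric_rows[j]))
            sim[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return sim


def attribute_entropy(values):
    """Shannon entropy (in bits) of one categorical attribute."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def entropy_weighted_match(categorical_rows):
    """Matching similarity with entropy-based attribute weights.

    ASSUMPTION: weight of an attribute = its entropy / total entropy.
    Expects at least one categorical attribute per row.
    """
    columns = list(zip(*categorical_rows))
    entropies = [attribute_entropy(col) for col in columns]
    total = sum(entropies)
    if total > 0:
        weights = [h / total for h in entropies]
    else:
        # Every attribute is constant: fall back to uniform weights.
        weights = [1.0 / len(columns)] * len(columns)
    n = len(categorical_rows)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sim[i][j] = sum(
                w for w, a, b in zip(weights, categorical_rows[i],
                                     categorical_rows[j]) if a == b)
    return sim


def combined_similarity(numeric_rows, categorical_rows, sigma=1.0):
    """ASSUMPTION: combine the two matrices by an element-wise average."""
    num = gaussian_similarity(numeric_rows, sigma)
    cat = entropy_weighted_match(categorical_rows)
    n = len(num)
    return [[0.5 * (num[i][j] + cat[i][j]) for j in range(n)]
            for i in range(n)]
```

The resulting matrix is symmetric with ones on the diagonal, so it could be passed to any spectral clustering routine that accepts a precomputed affinity matrix, for example scikit-learn's `SpectralClustering` with `affinity="precomputed"`.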