Collective of algorithms with weights for clustering heterogeneous data | Vestnik Tomskogo gosudarstvennogo universiteta. Upravlenie, vychislitelnaja tehnika i informatika – Tomsk State University Journal of Control and Computer Science. 2013. № 2(23).

Collective of algorithms with weights for clustering heterogeneous data

The paper considers the problem of clustering heterogeneous data. Heterogeneous data are understood here as data that contain different kinds of structure: sphere-like and strip-like clusters, various geometric figures, and so on. To improve grouping quality for such data, we suggest using an ensemble of different clustering algorithms. An algorithm is included in the ensemble on the assumption that it produces better results for a specific type of structure. It is further assumed that the experiment is designed so that the algorithms work independently and each algorithm runs on independently chosen sets of parameters (learning conditions). To construct the final decision, the behavior of each algorithm in the ensemble is taken into account, and a weight is assigned to the algorithm on that basis. A probabilistic model of ensemble clustering with latent classes and algorithm weights is introduced. Using the model, an expression for an upper bound on the probability of classification error is derived. A method of selecting the weights that minimizes this bound is suggested. The procedure for constructing the ensemble and finding the weights is implemented in a corresponding algorithm. The efficiency of the suggested method is demonstrated by Monte Carlo simulation.
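As a rough illustration of how weighted votes from several clustering runs might be combined into one partition, the sketch below builds a weighted co-association matrix and links objects whose weighted vote exceeds a threshold. It is a minimal sketch under stated assumptions, not the algorithm of the paper: the co-association device, the function names weighted_coassociation and consensus_labels, the 0.5 threshold, and the example weights are all hypothetical. In particular, the weights here are picked by hand, whereas the paper selects them by minimizing its error bound, which the abstract does not reproduce.

```python
import numpy as np

def weighted_coassociation(partitions, weights):
    """Weighted average of pairwise 'same cluster' indicators over all runs."""
    partitions = np.asarray(partitions)           # shape (runs, objects)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()             # normalize weights to sum to 1
    # same[l, i, j] is True when run l puts objects i and j in one cluster
    same = partitions[:, :, None] == partitions[:, None, :]
    return np.tensordot(weights, same.astype(float), axes=1)

def consensus_labels(coassoc, threshold=0.5):
    """Link pairs whose weighted vote exceeds the threshold (assumed value)
    and return the connected components as the consensus partition."""
    n = coassoc.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:                              # depth-first traversal
            i = stack.pop()
            for j in np.flatnonzero(coassoc[i] > threshold):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Three hypothetical runs on six objects; cluster labels are arbitrary per run.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],   # same grouping as the first run, labels renamed
    [0, 0, 1, 1, 2, 2],   # a run that disagrees on some pairs
]
weights = [0.4, 0.4, 0.2]  # hand-picked for illustration only

coassoc = weighted_coassociation(partitions, weights)
labels = consensus_labels(coassoc)
print(labels)
```

Because the co-association matrix compares objects pairwise, it does not depend on how each run names its clusters, which is why the first two runs reinforce each other even though their labels differ.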


Keywords

cluster analysis, collective decision making, algorithms with weights, probability of classification error

Authors

Name: Berikov Vladimir B.
Organization: Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences (Novosibirsk)
E-mail: berikov@math.nsc.ru
