The generalizing ability of algorithms by the measure of compactness | Vestnik Tomskogo gosudarstvennogo universiteta. Upravlenie, vychislitelnaja tehnika i informatika – Tomsk State University Journal of Control and Computer Science. 2018. № 42. DOI: 10.17223/19988605/42/5

The generalizing ability of algorithms by the measure of compactness

To estimate the generalizing ability of recognition algorithms, a measure of compactness is proposed. A training sample $E_0 = \{S_1, \ldots, S_m\}$ is assumed to be given, partitioned into disjoint classes $K_1, \ldots, K_l$, $l \ge 2$. The objects of $E_0$ are described by a set of features of different types, $X(n) = (x_1, \ldots, x_n)$. The compactness value depends on the dimension and composition of the feature set, the number of noise objects to be removed, and the number of objects-standards in the minimal coverage of $E_0$. The compactness measure on the sample $E_0$ in the feature set $X(k) \subset X(n)$ ($k < n$) is calculated as

$$F(X(k), X) = \frac{m - Sh(X, X(k))}{m} \cdot \frac{m - Sh(X, X(k))}{CF},$$

where $CF$ is the number of objects-standards of the minimal coverage of the sample from which the $Sh(X, X(k))$ noise objects have been removed.

Let $S_k \in K_i$, $\rho(S_k, S_r) = \min_{S_j \in CK_i} \rho(S_k, S_j)$, and let $Z = |\{S_\mu \in K_i \mid \rho(S_k, S_\mu) < \rho(S_k, S_r)\}|$ be the number of objects in the hypersphere centered at $S_k$. The object $S_r \in CK_i$ is considered a noise object if the condition on $\frac{ZZ - 1}{|K_i|}$ and $\frac{1}{m - |K_i|}$ holds, where $ZZ = |\{S_\mu \in K_i \mid \rho(S_r, S_k) < \rho(S_\mu, S_k) < \rho(S_\eta, S_k)\}|$, $\ldots < \min |K_i|$, and $\rho(S_\eta, S_k) = \min_{S_j \in CK_i \setminus \{S_r\}} \rho(S_j, S_k)$. The value $ZZ$ is the number of representatives of the class $K_i$ added to the hypersphere centered at $S_k \in K_i$ after the noise object $S_r$ is removed.

To find informative sets $\{X(k) \mid X(k) \subset X(n)\}$, two criteria are proposed. Neither criterion explicitly uses the number of objects-standards of the minimal coverage $CF$. The generalizing ability of the algorithms was estimated by cross-validation on the initial and the informative feature sets. The highest values were obtained on the sets selected according to the criterion

$$R(E_0, \rho) = \frac{\sum_{i=1}^{l} m_i \Theta_i}{m} \to \max,$$

where $m_i$ is the number of objects of $K_i$ after the noise objects are removed and $\Theta_i$ is the compactness calculated from the minimal number of disjoint groups of objects of class $K_i$ under the metric $\rho$.
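The hypersphere count $Z$ and the compactness measure can be illustrated with a short Python sketch. It rests on several assumptions that are not stated in the abstract: the metric $\rho$ is taken to be Euclidean (the paper works with features of different types and does not name the metric here), $CK_i$ is read as the set of objects outside $K_i$, and the measure is read as $F = \frac{m - Sh}{m} \cdot \frac{m - Sh}{CF}$. The function names `hypersphere_count` and `compactness` are hypothetical, and the values of `sh` and `cf` are supplied by the caller rather than computed.

```python
from math import dist  # Euclidean distance, an assumed stand-in for the metric rho


def hypersphere_count(objects, labels, k):
    """Z for the object with index k: objects of its own class that lie
    strictly closer than the nearest object of any other class."""
    center, cls = objects[k], labels[k]
    # rho(S_k, S_r): distance to the nearest object outside the class of S_k
    radius = min(dist(center, x) for x, y in zip(objects, labels) if y != cls)
    # S_k itself is counted, since its distance 0 is below the radius
    return sum(1 for x, y in zip(objects, labels)
               if y == cls and dist(center, x) < radius)


def compactness(m, sh, cf):
    """F = ((m - Sh) / m) * ((m - Sh) / CF) under the reading assumed above."""
    if not 0 <= sh < m or cf <= 0:
        raise ValueError("expected 0 <= sh < m and cf > 0")
    kept = m - sh  # objects remaining after noise removal
    return (kept / m) * (kept / cf)


# Toy data: three objects of class 0 and two objects of class 1.
objects = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = [0, 0, 0, 1, 1]
print(hypersphere_count(objects, labels, 0))  # 3
print(compactness(m=10, sh=2, cf=4))          # 1.6
```

In this reading the second factor, `kept / cf`, is the average number of objects covered by one object-standard, so fewer objects-standards and fewer removed noise objects both raise the measure.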
The set of admissible values of $R(E_0, \rho)$ belongs to $(0, 1]$ and can be interpreted in terms of fuzzy logic. A direct correlation is shown between the cross-validation estimates and the average number of objects attracted by a single object-standard of the minimal coverage of the training sample. It is concluded that the compactness measure $F(X(k), X)$ can serve as an indicator of generalizing ability. This measure is recommended for evaluating the quality of recognition algorithms in data mining.
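The selection criterion $R(E_0, \rho)$ can be sketched in the same way, assuming it reads as the sum of $m_i \Theta_i$ over the classes divided by $m$. The abstract does not give the formula for the per-class compactness $\Theta_i$, so the sketch below takes those values as inputs and assumes each lies in $(0, 1]$. The function name `criterion_r` is hypothetical.

```python
def criterion_r(cleaned_class_sizes, class_compactness, m):
    """R = sum_i(m_i * Theta_i) / m.

    cleaned_class_sizes: m_i, objects left in each class after noise removal
    class_compactness:   Theta_i for each class, assumed to lie in (0, 1]
    m:                   size of the original training sample
    """
    if len(cleaned_class_sizes) != len(class_compactness):
        raise ValueError("one compactness value is needed per class")
    if m <= 0:
        raise ValueError("m must be positive")
    weighted = sum(m_i * theta
                   for m_i, theta in zip(cleaned_class_sizes, class_compactness))
    return weighted / m


# Two classes, 10 objects in total, one noise object removed from the first class.
print(criterion_r([4, 5], [1.0, 0.5], m=10))  # 0.65
```

Under these assumptions the criterion reaches its upper bound of 1 only when no noise objects are removed and every class has compactness 1, which matches the stated range $(0, 1]$.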

Keywords

measure of compactness, noise objects, informative features, objects-standards

Authors

Name: Ignatiev Nikolay A.
Organization: National University of Uzbekistan
E-mail: ignatev@rambler.ru
