Regularizers on sets of generalized estimates
This paper proposes a method for constructing ensembles of recognition algorithms based on stacking, with regularizers that improve the generalization ability of the models. The primary focus is on preventing overfitting through majorizing functions that adjust the margins (offsets) of objects from the class boundary. The study investigates the conditions required for correct separation of objects when training both the base algorithms and the meta-algorithm. A hierarchical agglomerative feature grouping algorithm is proposed, which forms latent features based on intra-class similarity and inter-class difference. It is shown that margin regularization and the transformation of quantitative features into nominal ones improve recognition accuracy. The results show that an appropriate choice of parameters for the majorizing functions minimizes the accuracy gap between the base algorithms and the meta-algorithm.
Key advantages of the proposed method:
- flexible feature selection for the meta-algorithm based on a greedy strategy;
- unification of measurement scales through feature transformation;
- robustness to overfitting due to margin regularization.
The author declares no conflict of interest.
Keywords
algorithm ensembles,
stacking,
regularization,
majorizing functions
Authors
| Ignatev Nikolay A. | National University of Uzbekistan named after Mirzo Ulugbek | n_ignatev@rambler.ru |