A Robustified Metalearning Procedure for Regression Estimators

Kalina, Jan; Neoral, A.

doi:https://dx.doi.org/10.18267/pr.2019.los.186.61

SYSNO ASEP	0510554
Document Type	C - Proceedings Paper (int. conf.)
R&D Document Type	Conference Paper
Title	A Robustified Metalearning Procedure for Regression Estimators
Author(s)	Kalina, Jan (UIVT-O) ; Neoral, A. (CZ)
Source Title	The 13th International Days of Statistics and Economics Conference Proceedings. - Slaný : Melandrium, 2019 / Löster T. ; Pavelka T. - ISBN 978-80-87990-18-6
Pages	pp. 617-626
Number of pages	10 pp.
Publication form	Online - E
Event	International Days of Statistics and Economics /13./
Event date	05.09.2019 - 07.09.2019
Event location	Prague
Country	CZ - Czech Republic
Event type	WRD
Language	eng - English
Country	CZ - Czech Republic
Keywords	model choice ; computational statistics ; robustness ; variable selection
Subject RIV	IN - Informatics, Computer Science
OECD category	Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Institutional support	UIVT-O - RVO:67985807
UT WOS	000589182000062
DOI	10.18267/pr.2019.los.186.61
Annotation	Metalearning is a methodology for selecting and recommending a suitable algorithm or method for a new dataset by exploiting a database of training datasets. While metalearning is potentially beneficial for the analysis of economic data, it is unstable and sensitive to outlying measurements (outliers) as well as to measurement errors. The aim of this paper is to robustify the metalearning process. First, we prepare theoretical tools that exploit the idea of implicit weighting, inspired by the least weighted squares estimator. These include a robust coefficient of determination, a robust version of the mean square error, and a simple rule for outlier detection in linear regression. We then perform a metalearning study for recommending the best linear regression estimator for a new dataset (not included in the training database). The prediction of the optimal estimator is learned over a set of 20 real datasets with economic motivation, and least squares is compared with several (highly) robust estimators. We also investigate the effect of variable selection on the metalearning results. If both the training and the validation data are considered after a proper robust variable selection, the metalearning performance improves remarkably, especially if a robust prediction error is used.
Workplace	Institute of Computer Science
Contact	Tereza Šírová, sirova@cs.cas.cz, Tel.: 266 053 800
Year of Publishing	2020
Electronic address	https://msed.vse.cz/msed_2019/sbornik/toc.html