A Comparison of Regularization Techniques for Shallow Neural Networks Trained on Small Datasets

Tumpach, Jiří; Kalina, Jan; Holeňa, Martin

Number of the records: 1

SYSNO ASEP	0546161
Document Type	C - Proceedings Paper (int. conf.)
R&D Document Type	Conference Paper
Title	A Comparison of Regularization Techniques for Shallow Neural Networks Trained on Small Datasets
Author(s)	Tumpach, Jiří (UIVT-O) ; Kalina, Jan (UIVT-O) ; Holeňa, Martin (UIVT-O)
Number of authors	3
Source Title	Proceedings of the 21st Conference Information Technologies – Applications and Theory (ITAT 2021). - Aachen : Technical University & CreateSpace Independent Publishing, 2021 / Brejová B. ; Ciencialová L. ; Holeňa M. ; Mráz F. ; Pardubská D. ; Plátek M. ; Vinař T. - ISSN 1613-0073
Pages	s. 94-103
Number of pages	10 s.
Publication form	Online - E
Action	ITAT 2021: Information Technologies - Applications and Theory /21./
Event date	24.09.2021 - 28.09.2021
Event location	Heľpa
Country	SK - Slovakia
Event type	EUR
Language	eng - English
Country	DE - Germany
Keywords	artificial neural networks ; regularization ; robustness ; optimization
Subject RIV	IN - Informatics, Computer Science
OECD category	Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
R&D Projects	GA18-18080S GA ČR - Czech Science Foundation (CSF)
	GA19-05704S GA ČR - Czech Science Foundation (CSF)
Institutional support	UIVT-O - RVO:67985807
EID SCOPUS	85116716777
Annotation	Neural networks are frequently used as regression models. Their training is usually difficult when the model is subject to a small training dataset with numerous outliers. This paper investigates the effects of various regularisation techniques that can help with this kind of problem. We analysed the effects of the model size, loss selection, L2 weight regularisation, L2 activity regularisation, Dropout, and Alpha Dropout. We collected 30 different datasets, each of which has been split by ten-fold cross-validation. As an evaluation metric, we used cumulative distribution functions (CDFs) of L1 and L2 losses to aggregate results from different datasets without a considerable amount of distortion. Distributions of the metrics are shown, and thorough statistical tests were conducted. Surprisingly, the results show that Dropout models are not suited for our objective. The most effective approach is the choice of model size and L2 types of regularisations.
Workplace	Institute of Computer Science
Contact	Tereza Šírová, sirova@cs.cas.cz, Tel.: 266 053 800
Year of Publishing	2022
Electronic address	https://ics.upjs.sk/~antoni/ceur-ws.org/Vol-0000/paper38.pdf
