Number of the records: 1
A Comparison of Regularization Techniques for Shallow Neural Networks Trained on Small Datasets
SYSNO ASEP: 0546161
Document Type: C - Proceedings Paper (int. conf.)
R&D Document Type: Conference Paper
Title: A Comparison of Regularization Techniques for Shallow Neural Networks Trained on Small Datasets
Author(s):
    Tumpach, Jiří (UIVT-O) ORCID, SAI
    Kalina, Jan (UIVT-O) RID, SAI, ORCID
    Holeňa, Martin (UIVT-O) SAI, RID
Number of authors: 3
Source Title: Proceedings of the 21st Conference Information Technologies – Applications and Theory (ITAT 2021). - Aachen : Technical University & CreateSpace Independent Publishing, 2021 / Brejová B. ; Ciencialová L. ; Holeňa M. ; Mráz F. ; Pardubská D. ; Plátek M. ; Vinař T. - ISSN 1613-0073
Pages: pp. 94-103
Number of pages: 10
Publication form: Online - E
Action: ITAT 2021: Information Technologies - Applications and Theory /21./
Event date: 24.09.2021 - 28.09.2021
Event location: Heľpa
Country: SK - Slovakia
Event type: EUR
Language: eng - English
Country: DE - Germany
Keywords: artificial neural networks ; regularization ; robustness ; optimization
Subject RIV: IN - Informatics, Computer Science
OECD category: Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
R&D Projects:
    GA18-18080S GA ČR - Czech Science Foundation (CSF)
    GA19-05704S GA ČR - Czech Science Foundation (CSF)
Institutional support: UIVT-O - RVO:67985807
EID SCOPUS: 85116716777
Annotation: Neural networks are frequently used as regression models. Their training is usually difficult when the model is subject to a small training dataset with numerous outliers. This paper investigates the effects of various regularisation techniques that can help with this kind of problem. We analysed the effects of the model size, loss selection, L2 weight regularisation, L2 activity regularisation, Dropout, and Alpha Dropout. We collected 30 different datasets, each of which has been split by ten-fold cross-validation. As an evaluation metric, we used cumulative distribution functions (CDFs) of L1 and L2 losses to aggregate results from different datasets without a considerable amount of distortion. Distributions of the metrics are shown, and thorough statistical tests were conducted. Surprisingly, the results show that Dropout models are not suited for our objective. The most effective approach is the choice of model size and L2 types of regularisations.
Workplace: Institute of Computer Science
Contact: Tereza Šírová, sirova@cs.cas.cz, Tel.: 266 053 800
Year of Publishing: 2022
Electronic address: https://ics.upjs.sk/~antoni/ceur-ws.org/Vol-0000/paper38.pdf