Number of records: 1
Random-Forest-Based Analysis of URL Paths
ASEP SYSNO: 0478626
ASEP document type: C - Conference paper (international conference)
RIV classification: D - Article in proceedings
Title: Random-Forest-Based Analysis of URL Paths
Author(s): Puchýř, J. (CZ); Holeňa, Martin (UIVT-O) SAI, RID
Source document: Proceedings ITAT 2017: Information Technologies - Applications and Theory. - Aachen & Charleston : Technical University & CreateSpace Independent Publishing Platform, 2017 / Hlaváčová J. - ISSN 1613-0073 - ISBN 978-1974274741
Page range: pp. 129-135
Number of pages: 7
Publication form: Online - E
Event: ITAT 2017. Conference on Theory and Practice of Information Technologies - Applications and Theory /17./
Event dates: 22.09.2017 - 26.09.2017
Event location: Martinské hole
Country: SK - Slovakia
Event type: EUR
Document language: eng - English
Country of publication: DE - Germany
Keywords: malicious URLs detection ; classification ; random forest
RIV field: IN - Informatics
OECD field: Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
CEP: GA17-01251S GA ČR - Czech Science Foundation (Grantová agentura ČR)
Institutional support: UIVT-O - RVO:67985807
EID SCOPUS: 85045771719
Annotation: One of the key sources of spreading malware are malicious web sites - either tricking the user into installing malware that imitates legitimate software or, in the case of various exploit kits, initiating malware installation even without any user action. The most common technique against such web sites is blacklisting. However, it provides little to no information about new sites never seen before. Therefore, there has been important research into predicting malicious web sites based on their features. This work-in-progress paper presents a light-weight prediction method using solely lexical features of the site URL and classification by random forests. To this end, three possibilities of feature extraction have been elaborated and investigated on real-world data sets with respect to precision and recall. The obtained results indicate that there is almost never a significant difference between the considered methods, and that in spite of the limitation to the lexical features of the site URL, they have an impressive performance in terms of area under the precision-recall curve for the path parts of URLs.
Workplace: Institute of Computer Science (Ústav informatiky)
Contact: Tereza Šírová, sirova@cs.cas.cz, Tel.: 266 053 800
Year of collection: 2018
Electronic address: http://ceur-ws.org/Vol-1885/129.pdf