Number of records: 1

Random-Forest-Based Analysis of URL Paths

  1.
    SYSNO ASEP: 0478626
    ASEP document type: C - Conference paper (international conference)
    RIV classification: D - Article in proceedings
    Title: Random-Forest-Based Analysis of URL Paths
    Author(s): Puchýř, J. (CZ)
    Holeňa, Martin (UIVT-O) SAI, RID
    Source document: Proceedings ITAT 2017: Information Technologies - Applications and Theory. - Aachen & Charleston : Technical University & CreateSpace Independent Publishing Platform, 2017 / Hlaváčová J. - ISSN 1613-0073 - ISBN 978-1974274741
    Page range: pp. 129-135
    Number of pages: 7 pp.
    Publication form: Online - E
    Event: ITAT 2017. Conference on Theory and Practice of Information Technologies - Applications and Theory /17./
    Event date: 22.09.2017 - 26.09.2017
    Event location: Martinské hole
    Country: SK - Slovakia
    Event type: EUR
    Document language: eng - English
    Country of publication: DE - Germany
    Keywords: malicious URLs detection ; classification ; random forest
    RIV research field: IN - Informatics
    OECD field: Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
    CEP: GA17-01251S GA ČR - Czech Science Foundation
    Institutional support: UIVT-O - RVO:67985807
    EID SCOPUS: 85045771719
    Annotation: One of the key sources of spreading malware is malicious web sites - either tricking the user into installing malware that imitates legitimate software or, in the case of various exploit kits, initiating malware installation even without any user action. The most common technique against such web sites is blacklisting. However, it provides little to no information about new sites never seen before. Therefore, there has been important research into predicting malicious web sites based on their features. This work-in-progress paper presents a light-weight prediction method using solely lexical features of the site URL and classification by random forests. To this end, three possibilities of feature extraction have been elaborated and investigated on real-world data sets with respect to precision and recall. The obtained results indicate that there is almost never a significant difference between the considered methods, and that in spite of the limitation to the lexical features of the site URL, they have an impressive performance in terms of area under the precision-recall curve for the path parts of URLs.
    Workplace: Institute of Computer Science
    Contact: Tereza Šírová, sirova@cs.cas.cz, Tel.: 266 053 800
    Collection year: 2018
    Electronic address: http://ceur-ws.org/Vol-1885/129.pdf
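
The annotation mentions lexical features extracted from the path part of a URL. As a rough illustration only, the sketch below computes a few such features with the Python standard library. The function name `path_lexical_features` and the chosen feature set are assumptions made for this example; the record does not say which three feature-extraction variants the paper investigates, and the random forest classification step is not shown.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def path_lexical_features(url: str) -> dict:
    """Compute illustrative lexical features of a URL's path part.

    The feature set is hypothetical and is not taken from the paper.
    """
    path = urlsplit(url).path
    # Non-empty path segments, e.g. "/a/bc/123.php" -> ["a", "bc", "123.php"]
    segments = [s for s in path.split("/") if s]
    length = len(path)

    # Shannon entropy (in bits) of the character distribution in the path
    counts = Counter(path)
    entropy = 0.0
    if length:
        entropy = -sum(
            (c / length) * math.log2(c / length) for c in counts.values()
        )

    return {
        "path_length": length,
        "segment_count": len(segments),
        "digit_count": sum(ch.isdigit() for ch in path),
        "longest_segment": max((len(s) for s in segments), default=0),
        "char_entropy": entropy,
    }


if __name__ == "__main__":
    # example.com is a placeholder domain, not data from the paper
    print(path_lexical_features("http://example.com/a/bc/123.php"))
```

Feature dictionaries like this one could be converted to numeric vectors and passed to any random forest implementation; which implementation the authors used is not stated in this record.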