Number of the records: 1  

Random-Forest-Based Analysis of URL Paths

  1.
    0478626 - ÚI 2018 RIV DE eng C - Conference Paper (international conference)
    Puchýř, J. - Holeňa, Martin
    Random-Forest-Based Analysis of URL Paths.
    Proceedings ITAT 2017: Information Technologies - Applications and Theory. Aachen & Charleston: Technical University & CreateSpace Independent Publishing Platform, 2017 - (Hlaváčová, J.), pp. 129-135. CEUR Workshop Proceedings, V-1885. ISBN 978-1974274741. ISSN 1613-0073.
    [ITAT 2017. Conference on Theory and Practice of Information Technologies - Applications and Theory /17./. Martinské hole (SK), 22.09.2017-26.09.2017]
    R&D Projects: GA ČR GA17-01251S
    Institutional support: RVO:67985807
    Keywords : malicious URLs detection * classification * random forest
    OECD category: Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
    http://ceur-ws.org/Vol-1885/129.pdf

    One of the key sources of malware spread is malicious web sites, which either trick users into installing malware that imitates legitimate software or, in the case of various exploit kits, initiate malware installation without any user action. The most common defence against such web sites is blacklisting. However, it provides little to no information about new sites never seen before. Therefore, there has been substantial research into predicting malicious web sites based on their features. This work-in-progress paper presents a lightweight prediction method that uses solely lexical features of the site URL and classification by random forests. To this end, three approaches to feature extraction have been elaborated and investigated on real-world data sets with respect to precision and recall. The results indicate that there is almost never a significant difference between the considered methods and that, despite the limitation to lexical features of the site URL, they achieve an impressive performance in terms of area under the precision-recall curve for the path parts of URLs.
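As a rough illustration of what "lexical features of the path part of a URL" can mean, the sketch below computes a handful of simple statistics from a URL path. The abstract does not say which three feature extraction approaches the paper compares, so the function name `path_lexical_features` and every feature in it are hypothetical choices made for this example, not the authors' method.

```python
import math
from collections import Counter
from urllib.parse import urlsplit


def path_lexical_features(url: str) -> dict:
    """Compute illustrative (assumed) lexical features of a URL's path part."""
    path = urlsplit(url).path                      # e.g. "/a1/login.php"
    segments = [s for s in path.split("/") if s]   # non-empty path segments
    n = len(path)
    counts = Counter(path)
    # Shannon entropy of the character distribution; 0.0 for an empty path.
    entropy = (
        -sum((c / n) * math.log2(c / n) for c in counts.values()) if n else 0.0
    )
    return {
        "path_length": n,
        "num_segments": len(segments),
        "num_digits": sum(ch.isdigit() for ch in path),
        # Characters that are neither alphanumeric nor the "/" separator.
        "num_special": sum(not ch.isalnum() and ch != "/" for ch in path),
        "max_segment_length": max((len(s) for s in segments), default=0),
        "char_entropy": entropy,
    }


# The query string ("?id=7") is not part of the path and is ignored here.
features = path_lexical_features("http://example.com/a1/login.php?id=7")
```

Feature vectors of this kind could then be passed to a random forest implementation, for example scikit-learn's `RandomForestClassifier`, and scored by the area under the precision-recall curve. That pairing is likewise an assumption about tooling, not something stated in the record.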
    Permanent Link: http://hdl.handle.net/11104/0274765

     
    File  Download  Size  Commentary  Version  Access
    a0478626.pdf  1367.6 KB  Publisher’s postprint  require
     