Number of records: 1

Study of cache performance in distributed environment for data processing

  1.
    SYSNO ASEP: 0433186
    ASEP document type: C - Conference paper (international conference)
    RIV classification: D - Article in proceedings
    Title: Study of cache performance in distributed environment for data processing
    Author(s): Makatun, D. (CZ)
    Lauret, J. (US)
    Šumbera, Michal (UJF-V)
    Total number of authors: 3
    Source document: Journal of Physics Conference Series, 15th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT2013). - Bristol : IOP Publishing Ltd, 2014 - ISSN 1742-6588
    Page range: 012016
    Number of pages: 8
    Publication form: Print - P
    Event: 15th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT2013)
    Event date: 16.05.2013-21.05.2013
    Event location: Beijing
    Country: CN - China
    Event type: WRD
    Document language: eng - English
    Country of publication: GB - United Kingdom
    Keywords: STAR collaboration
    RIV subject field: BG - Nuclear, atomic and molecular physics, accelerators
    CEP project: GA13-20841S, GA ČR - Czech Science Foundation (Grantová agentura ČR)
    Institutional support: UJF-V - RVO:61389005
    UT WOS: 000339627300016
    DOI: 10.1088/1742-6596/523/1/012016
    Abstract: Processing data in a distributed environment has found application in many fields of science (Nuclear and Particle Physics (NPP), astronomy, and biology, to name only a few). Efficiently transferring data between sites is an essential part of such processing. The implementation of caching strategies in data transfer software and tools, such as the Reasoner for Intelligent File Transfer (RIFT) being developed in the STAR collaboration, can significantly decrease network load and waiting time by reusing the knowledge of data provenance as well as data placed in the transfer cache to further expand the availability of sources for files and data-sets. Though a great variety of caching algorithms is known, a study is needed to evaluate which one can deliver the best performance in data access under realistic demand patterns. Records of access to the complete data-sets of NPP experiments were analyzed and used as input for computer simulations. A series of simulations was run to estimate the possible cache hits and cache hits per byte for known caching algorithms. The simulations were done for caches of different sizes within the interval 0.001 - 90% of the complete data-set and a low-watermark within 0-90%. Records of data access were taken from several experiments and within different time intervals in order to validate the results. In this paper, we discuss the different data caching strategies, from canonical algorithms to hybrid cache strategies, present the results of our simulations for the diverse algorithms, and debate and identify the choice for the best algorithm in the context of Physics Data analysis in NPP. While the results of those studies have been implemented in RIFT, they can also be used when setting up a cache in any other computational work-flow (Cloud processing, for example) or managing data storage with partial replicas of the entire data-set.
    Workplace: Nuclear Physics Institute (Ústav jaderné fyziky)
    Contact: Markéta Sommerová, sommerova@ujf.cas.cz, Tel.: 266 173 228
    Year of collection: 2015
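
The abstract describes replaying recorded access traces through simulated caches of varying size and low-watermark, and counting cache hits and cache hits per byte. The record does not give the paper's simulator or its exact metric definitions, so the following is only a minimal sketch under stated assumptions: the eviction policy is plain LRU (one of the canonical algorithms), "cache hits per byte" is read here as the fraction of requested bytes served from cache, and the function name `simulate_lru`, the trace format, and the clean-up rule (evict down to the low-watermark once the cache would overflow) are all hypothetical.

```python
from collections import OrderedDict


def simulate_lru(trace, capacity, low_watermark=0.0):
    """Replay an access trace through an LRU cache (illustrative sketch).

    trace         -- iterable of (file_id, size_in_bytes) requests (assumed format)
    capacity      -- cache size in bytes
    low_watermark -- fraction of capacity to evict down to when the cache overflows
    """
    cache = OrderedDict()  # file_id -> size, least recently used first
    used = 0
    hits = requests = 0
    hit_bytes = total_bytes = 0

    for file_id, size in trace:
        requests += 1
        total_bytes += size

        if file_id in cache:
            hits += 1
            hit_bytes += size
            cache.move_to_end(file_id)  # mark as most recently used
            continue

        if size > capacity:
            continue  # file can never fit; it bypasses the cache

        if used + size > capacity:
            # Evict least recently used files until usage drops to the
            # low-watermark, or further if that is needed to fit the new file.
            target = min(low_watermark * capacity, capacity - size)
            while cache and used > target:
                _, evicted_size = cache.popitem(last=False)
                used -= evicted_size

        cache[file_id] = size
        used += size

    return {
        "hit_ratio": hits / requests if requests else 0.0,
        "byte_hit_ratio": hit_bytes / total_bytes if total_bytes else 0.0,
    }


if __name__ == "__main__":
    # Made-up toy trace, not data from the paper.
    demo = [("a", 10), ("b", 50), ("a", 10), ("c", 50), ("a", 10), ("b", 50)]
    for lw in (0.0, 0.5):
        print(lw, simulate_lru(demo, capacity=100, low_watermark=lw))
```

Sweeping `capacity` and `low_watermark` over a grid and repeating the run for other eviction policies would mirror the kind of comparison the abstract outlines, though the actual study may differ in its details.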