SigHunt: horizontal gene transfer finder optimized for eukaryotic genomes

  1. 0422473 - UBO-W 2015 RIV GB eng J - Journal article
    Jaron, K. S. - Moravec, J. C. - Martínková, Natália
    Bioinformatics. Vol. 30, No. 8 (2014), pp. 1081-1086. ISSN 1367-4803
    CEP grant: GA ČR(CZ) GAP506/12/1064
    Institutional support: RVO:68081766
    Keywords: fungus Aspergillus fumigatus * Cryptosporidium parvum * sequence * evolution * identification * islands * ecology
    RIV field code: EB - Genetics and molecular biology
    Impact factor: 4.981, year: 2014

    Motivation: Genomic islands are DNA fragments incorporated into a genome through horizontal gene transfer (also called lateral gene transfer), often with functions novel for a given organism. While methods for their detection are well researched in prokaryotes, the complexity of eukaryotic genomes makes direct utilization of these methods unreliable, and so labour-intensive phylogenetic searches are employed instead.
    Results: We present a surrogate method that investigates nucleotide base composition of the DNA sequence in a eukaryotic genome and identifies putative genomic islands. We calculate a genomic signature as a vector of tetranucleotide (4-mer) frequencies utilizing a sliding window approach. Extending the neighbourhood of the sliding window, we establish a local kernel density estimate of the 4-mer frequency. We score the number of 4-mer frequencies in the sliding window that deviate from the credibility interval of their local genomic density using a newly developed discrete interval accumulative score (DIAS). To further improve the effectiveness of DIAS, we select informative 4-mers in a range of organisms using the tetranucleotide quality score (TES) developed herein. We show that the SigHunt method is computationally efficient and able to detect genomic islands in eukaryotic genomes that represent non-ameliorated integration. Thus, it is suited to scanning for change in organisms with different DNA composition.
    Availability: Source code and scripts, freely available for download at http://www.iba.muni.cz/index-en.php?pg=research--data-analysistools--sighunt, are implemented in C and R and are platform-independent.
    Permanent link: http://hdl.handle.net/11104/0228726
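
The sliding-window idea described in the abstract can be illustrated with a minimal Python sketch. It is not the SigHunt implementation (which the record says is written in C and R) and it is not the published DIAS formula. As a simplifying assumption, the local kernel density estimate is replaced by an empirical percentile interval taken over neighbouring windows, and the score is simply the count of 4-mers whose frequency falls outside that interval. All function names, the window and step sizes, and the 2.5%/97.5% bounds are hypothetical choices made for illustration.

```python
from itertools import product

# All 256 tetranucleotides in a fixed order, so every signature vector
# uses the same index for the same 4-mer.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_freqs(seq):
    """Relative frequency of each overlapping 4-mer in seq.

    4-mers containing characters other than A, C, G, T are skipped.
    """
    counts = dict.fromkeys(KMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:
            counts[kmer] += 1
            total += 1
    return [counts[k] / total if total else 0.0 for k in KMERS]


def sliding_signatures(seq, window=5000, step=1000):
    """Genomic signature (4-mer frequency vector) for each sliding window."""
    seq = seq.upper()
    return [tetra_freqs(seq[s:s + window])
            for s in range(0, len(seq) - window + 1, step)]


def interval(values, lower=0.025, upper=0.975):
    """Empirical percentile interval (assumed stand-in for a
    credibility interval derived from a kernel density estimate)."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered[int(lower * (n - 1))], ordered[int(upper * (n - 1))]


def deviation_counts(signatures, neighbours=10):
    """For each window, count the 4-mers whose frequency lies outside
    the interval observed in up to `neighbours` windows on each side."""
    scores = []
    for w, sig in enumerate(signatures):
        start = max(0, w - neighbours)
        end = min(len(signatures), w + neighbours + 1)
        local = [signatures[j] for j in range(start, end) if j != w]
        if not local:  # a single window has no neighbourhood to compare with
            scores.append(0)
            continue
        score = 0
        for k in range(len(sig)):
            lo, hi = interval([s[k] for s in local])
            if sig[k] < lo or sig[k] > hi:
                score += 1
        scores.append(score)
    return scores
```

Under these assumptions, calling `deviation_counts(sliding_signatures(genome))` on a sequence string yields one integer per window, and windows with unusually high counts would be the candidates for closer inspection. The selection of informative 4-mers (TES) mentioned in the abstract is not modelled here.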