Deakin University

Greedily assemble tandem repeats for next generation sequences

journal contribution
posted on 2019-01-01, 00:00 authored by Yongqing Jiang, Jinhua Lu, Jingyu Hou, Wanlei Zhou
Eukaryotic genomes contain large intronic and intergenic regions in which repetitive sequences are abundant. These repetitive sequences make it difficult to assign short reads from next generation sequencing to genomic positions, so they are often excluded from analysis, and valuable genomic information is lost. Here we present a method, the tandem repeat assembler (TRA), that assembles repetitive sequences by constructing contigs directly from paired-end reads. Using an experimentally acquired data set for human chromosome 14, tandem repeats longer than 200 bp were assembled. Alignment of the contigs to the human genome reference (GRCh38) showed that 84.3% of tandem repetitive regions were correctly covered. For tandem repeats, the method outperformed state-of-the-art assemblers, generating contigs with a correct N50 of up to 512 bp.
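
The N50 figure quoted above is a standard assembly statistic: the length of the shortest contig in the smallest set of longest contigs that together hold at least half of the assembled bases. The sketch below computes the standard N50 only. The function name `n50` and the sample lengths are illustrative assumptions, and this record does not say how the authors selected "correct" contigs before computing the statistic.

```python
def n50(lengths):
    """Return the standard N50 of a collection of contig lengths (0 if empty).

    Hypothetical helper for illustration; not taken from the TRA software.
    """
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half = sum(ordered) / 2                  # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half:                  # reached 50% of the total
            return length
    return 0                                 # only reached for empty input


# Made-up contig lengths in bp, not data from the paper.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(example_lengths))
```

A "correct N50" would presumably apply the same calculation to only those contigs that align correctly to the reference, but that filtering step is an assumption here.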

History

Journal: International Journal of High Performance Computing and Networking

Volume: 15

Issue: 1/2

Pagination: 1-11

Location: [London, Eng.]

ISSN: 1740-0562

eISSN: 1740-0570

Language: eng

Publication classification: C1 Refereed article in a scholarly journal

Publisher: Inderscience Publishers
