Next generation sequencing under de novo genome assembly. (English) Zbl 1335.92029

Summary: Next generation sequencing (NGS) is an important process that allows vast raw sequence datasets to be organized inexpensively compared with traditional sequencing systems and methods. Several aspects of NGS, such as template preparation, sequencing and imaging, and genome alignment and assembly, outline the workflow of genome sequencing and alignment. The de Bruijn graph (dBG) is an important mathematical tool that graphically analyzes how orientations are constructed in groups of nucleotides. Basically, a dBG describes the formation of genome segments in a circular, iterative fashion. Some pivotal dBG-based de novo algorithms and software packages, such as T-IDBA, Oases, IDBA-tran, Euler, Velvet, ABySS, AllPaths, SOAPdenovo and SOAPdenovo2, are illustrated in this paper. Overlap layout consensus (OLC) graph-based algorithms also play a vital role in NGS assembly. Some important OLC-based algorithms, such as MIRA3, CABOG, Newbler, Edena, Mosaik and SHORTY, are described in this paper. Experiments have shown that greedy graph-based algorithms and software packages are also vital for proper assembly of genome datasets. A few such algorithms, namely SSAKE, SHARCGS and VCAKE, help to perform proper genome sequencing.
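
The de Bruijn graph idea mentioned in the summary can be illustrated with a minimal sketch. It is not taken from the paper or from any of the listed assemblers. The function names `build_de_bruijn` and `walk_unambiguous`, the toy reads and the choice k = 4 are all assumptions made for illustration, and the sketch ignores sequencing errors, reverse complements and coverage.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that follow it in some read."""
    edges = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its prefix to its suffix.
            edges[kmer[:-1]].add(kmer[1:])
    return edges


def walk_unambiguous(edges, start):
    """Extend a contig from `start` while the current node has exactly one successor."""
    contig = start
    node = start
    seen = {start}
    while len(edges.get(node, ())) == 1:
        (nxt,) = edges[node]
        if nxt in seen:  # stop on a cycle instead of looping forever
            break
        contig += nxt[-1]
        seen.add(nxt)
        node = nxt
    return contig


# Hypothetical overlapping reads, chosen so that the graph is a simple path.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = build_de_bruijn(reads, k=4)
contig = walk_unambiguous(graph, "ATG")
print(contig)
```

On these toy reads every node has a single successor, so the walk spells out one contig. Real dBG assemblers must additionally resolve branches, bubbles and repeats, which this sketch does not attempt.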

MSC:

92C40 Biochemistry, molecular biology
92D20 Protein sequences, DNA sequences
05C90 Applications of graph theory
Full Text: DOI
