Velvet: algorithms for de novo short read assembly using de Bruijn graphs
- PMID: 18349386
- PMCID: PMC2336801
- DOI: 10.1101/gr.074492.107
Abstract
We have developed a new set of algorithms, collectively called "Velvet," to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words (k-mers) that is ideal for high-coverage, very short read (25–50 bp) data sets. Applying Velvet to very short reads and paired-end information only, one can produce contigs of significant length, up to 50-kb N50 length in simulations of prokaryotic data and 3-kb N50 on simulated mammalian BACs. When applied to real Solexa data sets without read pairs, Velvet generated contigs of approximately 8 kb in a prokaryote and 2 kb in a mammalian BAC, in close agreement with our simulated results without read-pair information. Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies.
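The two central notions in the abstract, a de Bruijn graph built from k-mers and the N50 contig length, can be illustrated with a minimal sketch. This is not Velvet's implementation: it uses the textbook form in which nodes are (k-1)-mers and each k-mer contributes one edge, and it ignores reverse complements, sequencing errors, and coverage. The function names `build_de_bruijn` and `n50` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Textbook de Bruijn graph (an assumption, not Velvet's node layout).

    Each k-mer in a read adds an edge from its (k-1)-mer prefix
    to its (k-1)-mer suffix. Reverse complements are ignored.
    """
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k across the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    # Return plain dict with sorted successor lists for stable output.
    return {node: sorted(succ) for node, succ in graph.items()}


def n50(contig_lengths):
    """Largest length L such that contigs of length >= L
    together cover at least half of the total assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # no contigs
```

Because overlapping reads share k-mers, they collapse onto the same nodes and edges, which is why the representation stays compact at high coverage. For example, the reads `ACGTC` and `CGTCA` with `k=3` yield a single path through four nodes even though the reads overlap in four bases.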