Data released on February 03, 2017

Supporting data for "MinION nanopore sequencing of environmental metagenomes: a synthetic approach"

Brown, B, L; Watson, M; Minot, S, S; Rivera, M, C; Franklin, R, B (2017): Supporting data for "MinION nanopore sequencing of environmental metagenomes: a synthetic approach" GigaScience Database. http://dx.doi.org/10.5524/100278

Environmental metagenomic analysis is typically accomplished by assigning taxonomy and/or function from whole genome sequencing (WGS) or 16S amplicon sequences. Both of these approaches are limited by read length and other technical and biological factors. A nanopore-based sequencing platform, MinION™, produces reads that are ≥10000 bp in length, potentially providing for more precise assignment, thereby alleviating some of the limitations inherent in determining metagenome composition from short reads. We tested the ability of sequence data produced by MinION (R7.3 flow cells) to correctly assign taxonomy in single bacterial species runs and in three types of low complexity synthetic communities: a mixture of DNA using equal mass from four species, a community with one relatively rare (1%) and three abundant (33% each) components, and a mixture of genomic DNA from 20 bacterial strains of staggered representation. Taxonomic composition of the low-complexity communities was assessed by analyzing the MinION sequence data with three different bioinformatic approaches: Kraken, MG-RAST, and One Codex. Long read sequences generated from libraries prepared from single strains using the SQK–MAP005 kit and chemistry, run on the original MinION device, yielded as few as 224 to as many as 3,497 bidirectional high-quality (2D) reads with an average overall study length of 6,000 bp. For the single-strain analyses, assignment of reads to the correct genus by different methods ranged from 53.1% to 99.5%, assignment to the correct species ranged from 23.9% to 99.5%, and the majority of mis-assigned reads were to closely related organisms. A synthetic metagenome sequenced with the same setup yielded 714 high quality 2D reads of approximately 5,500 bp that were up to 98% correctly assigned to the species level. 
Synthetic metagenomes from MinION libraries generated using the SQK–MAP006 kit and chemistry yielded 899-3,497 2D reads with lengths averaging 5,700 bp, with up to 98% assignment accuracy at the species level. The observed community proportions for “equal” and “rare” synthetic libraries were close to the known proportions, deviating by 0.1–10% across all tests. For a 20-species mock community with staggered contributions, a sequencing run detected all but 3 species (each included at <0.05% of DNA in the total mixture); 91% of reads were assigned to the correct species, 93% of reads were assigned to the correct genus, and >99% of reads were assigned to the correct family.

Related manuscripts:

doi:10.1093/gigascience/gix007

Accessions (data included in GigaDB):

BioProject: PRJEB8672
BioProject: PRJEB8716

Metagenomic

Samples:

Sample ID: ERS671266
Taxonomic ID: 1235509
Common Name: synthetic metagenome
Alternative names: Equal
Alternative accession-BioSample: SAMEA3283728
Description: Pooled DNA of E. coli; P. fluorescens; S. elongatus; and M. aeruginosa (250 ng each). Library made using V5 kit.

Sample ID: ERS1427193
Taxonomic ID: 1235509
Common Name: synthetic metagenome
Alternative names: equal_v5
Alternative accession-BioSample: unknown
Description: Pooled DNA of E. coli; P. fluorescens; S. elongatus; and M. aeruginosa (250 ng each). Library made using V5 kit.

Sample ID: ERS1427194
Taxonomic ID: 1235509
Common Name: synthetic metagenome
Alternative names: equal_v6
Alternative accession-BioSample: unknown
Description: Pooled DNA of E. coli; P. fluorescens; S. elongatus; and M. aeruginosa (250 ng each). Library made using V6 kit.

Sample ID: ERS1427197
Taxonomic ID: 1235509
Common Name: synthetic metagenome
Alternative names: Rare_v6
Alternative accession-BioSample: unknown
Description: Pooled DNA of E. coli; P. fluorescens; S. elongatus (330 ng each); plus M. aeruginosa (10 ng). Library made using V6 kit.

Sample ID: ERS1427199
Taxonomic ID: 1235509
Common Name: synthetic metagenome
Alternative names: Staggered
Alternative accession-BioSample: unknown
Description: Genomiphi (V3) treated DNA pool from twenty species of bacteria (cat #HM-783D BEI Resources ATCC). Library made using V6 kit.

Sample ID: ERS1427192
Taxonomic ID: 562
Common Name: E. coli
Scientific Name: Escherichia coli
Alternative names: Ecoli
Alternative accession-BioSample: unknown
Description: DNA extracted from a log-phase pure culture of E. coli

Sample ID: ERS1427195
Taxonomic ID: 1126
Scientific Name: Microcystis aeruginosa
Alternative names: Maeru
Alternative accession-BioSample: unknown
Description: DNA extracted from a log-phase (possibly not axenic according to ATCC) culture of Microcystis aeruginosa

Sample ID: ERS1427196
Taxonomic ID: 294
Scientific Name: Pseudomonas fluorescens
Alternative names: Pfluor
Alternative accession-BioSample: unknown
Description: DNA extracted from a log-phase pure culture of Pseudomonas fluorescens

Sample ID: ERS1427198
Taxonomic ID: 32046
Scientific Name: Synechococcus elongatus
Alternative names: Selong
Alternative accession-BioSample: unknown
Description: DNA extracted from a log-phase pure culture of Synechococcus elongatus

Displaying 1-9 of 9 Sample(s).
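
The "Equal" and "Rare" sample descriptions give input DNA masses per species, and the abstract quotes the corresponding community proportions (equal shares, and 33% each plus one 1% component). A minimal sketch of that arithmetic follows, assuming the proportions are simple mass fractions of the pooled DNA; the helper name `mass_fractions` and the dictionary layout are illustrative and not part of the dataset.

```python
def mass_fractions(masses_ng):
    """Return each component's share of the total pooled DNA mass."""
    total = sum(masses_ng.values())
    return {name: mass / total for name, mass in masses_ng.items()}


# Input masses (ng) copied from the sample descriptions for the
# "Equal" (250 ng each) and "Rare" (330 ng x 3 plus 10 ng) pools.
equal_mix = {
    "E. coli": 250,
    "P. fluorescens": 250,
    "S. elongatus": 250,
    "M. aeruginosa": 250,
}
rare_mix = {
    "E. coli": 330,
    "P. fluorescens": 330,
    "S. elongatus": 330,
    "M. aeruginosa": 10,
}

for label, mix in (("Equal", equal_mix), ("Rare", rare_mix)):
    for species, fraction in mass_fractions(mix).items():
        print(f"{label:5s} {species:15s} {fraction:.1%}")
```

These are mass fractions only; converting them to expected read or genome-copy fractions would also require genome sizes, which this page does not list.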

Files:

Sample ID   File Type        File Format    Size       Release Date
ERS1427192  BLAST            BLAST results  924.5 KB   2017-01-27
ERS1427192  other            UNKNOWN        3.79 KB    2017-01-27
ERS1427192  Genome sequence  FASTA          5.83 MB    2017-01-27
ERS1427193  BLAST            BLAST results  694.13 KB  2017-01-27
ERS1427193  other            UNKNOWN        6.14 KB    2017-01-27
ERS1427193  Genome sequence  FASTA          4.17 MB    2017-01-27
ERS1427194  BLAST            BLAST results  366.43 MB  2017-01-27
ERS1427194  Genome sequence  FASTA          6.93 MB    2017-01-27
ERS1427194  other            UNKNOWN        8.29 KB    2017-01-27
ERS1427195  BLAST            BLAST results  455.73 KB  2017-01-27
Displaying 1-10 of 31 File(s).
