Table 2 Contigs and dataset statistics for the two MetaG workflows

From: Integrative meta-omics in Galaxy and beyond

| Dataset | Workflow | Contigs (MEGAHIT) | Unbinned contigs (%) | N50/L50 | Longest contig |
| --- | --- | --- | --- | --- | --- |
| Bioreactor, small dataset with 10 MAGs | MetaG | 11,386 | 6 | 38,958/118 | 391,662 |
| | Optimized MetaG | | 5 | | |
| | Co-assembly | 11,296 | | 28,943/145 | 351,556 |
| | Individual assemblies: | | | | |
| | Sample-1 | 4003 | | 44,326/58 | 391,662 |
| | Sample-2 | 12,098 | | 27,635/128 | 391,715 |
| Comp, large dataset with 253 MAGs | MetaG | 1,923,986 | 11 | 2309/93,659 | 797,197 |
| | Optimized MetaG | | 20 | | |
| | Co-assembly | 2,331,350 | | 2474/102,387 | 1,098,235 |
| | Individual assemblies: | | | | |
| | Sample-1 | 310,224 | | 2109/14,283 | 625,541 |
| | Sample-2 | 511,518 | | 2530/24,745 | 715,289 |
| | Sample-3 | 450,745 | | 2083/20,003 | 872,994 |
| | Sample-4 | 532,077 | | 2820/21,306 | 862,734 |
| | Sample-5 | 303,656 | | 2548/13,484 | 497,688 |
| | Sample-6 | 223,130 | | 2523/9460 | 1,098,235 |

  1. Contigs were analyzed with CoverM and metaQuast. For the optimized MetaG workflow, which includes both co-assembly and individual assemblies, the percentage of unbinned contigs is reported as the average after dereplication. Both a small (bioreactor) and a large (in-house complementary; comp) dataset are included to stress-test the analysis pipelines.
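
For readers unfamiliar with the N50/L50 column, the following is a minimal sketch of the standard definitions: N50 is the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length, and L50 is the number of contigs in that set. The function name `n50_l50` and the toy lengths are illustrative assumptions, not part of the published workflows; values reported by metaQuast may differ if the tool applies its own contig-length filters.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of contig lengths in base pairs.

    Illustrative helper only; not taken from the MetaG workflows.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk through contigs from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first contig that pushes the running sum to half of the
        # assembly length defines N50; its rank defines L50.
        if running >= half_total:
            return length, count


# Hypothetical toy assembly, not data from the table.
toy_lengths = [100, 80, 60, 40, 20]
print(n50_l50(toy_lengths))  # (80, 2)
```

Read this way, the bioreactor MetaG entry 38,958/118 means that the 118 longest contigs, each at least 38,958 bp long, account for at least half of the assembled bases.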