Table 3 Genome statistics

From: Genome sequence of the sulfur-oxidizing Bathymodiolus thermophilus gill endosymbiont

| Attribute | Value | % |
|---|---|---|
| Genome size (bp) a | 3,088,407 | 100 |
| DNA coding (bp) | 2,621,999 | 84.9 |
| DNA G + C (bp) | 1,164,329 | 37.7 |
| DNA scaffolds | 1281 | 100 |
| Total genes | 3097 | 100 |
| Protein-coding genes | 3045 | 98.3 |
| RNA genes | 46 | 1.5 |
| Pseudo genes | 6 | 0.2 |
| Genes in internal clusters | - | - |
| Genes with function prediction b | 2051 | 67.4 |
| Genes assigned to COGs | 1659 | 54.5 |
| Genes with Pfam domains | 1984 | 65.2 |
| Genes with signal peptides c | 337 | 11.1 |
| Genes with transmembrane helices | 626 | 20.6 |
| CRISPR repeats | 10 | |

  1. a All 1281 scaffolds are >200 bp. Of these, 478 (37.3%) are scaffolds >1000 bp, comprising 2,726,561 bp (88.3% of all base pairs).
  2. b Genes with function prediction are all 3045 protein-coding genes minus the 994 genes annotated as “hypothetical proteins” that have no COG category or fall into the COG categories “unknown function” or “general function prediction only”, and that have no Pfam domain or only a Pfam “domain of unknown function”.
  3. c Includes genes for which a signal peptide was predicted by at least two of the three tools used. Percentages of genes with function prediction, COGs, Pfam domains, signal peptides, and transmembrane helices were calculated against a total of 3045 protein-coding genes.
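
The arithmetic behind the footnotes can be checked with a short script. This is only a sketch for verifying the table, not part of the original analysis. The counts are copied from the table and footnotes, while the variable names and the `pct` helper are illustrative assumptions.

```python
# Counts copied from Table 3 and its footnotes; all names are illustrative.
GENOME_SIZE_BP = 3_088_407
TOTAL_GENES = 3097
PROTEIN_CODING = 3045
HYPOTHETICAL_NO_FUNCTION = 994   # footnote b
SCAFFOLDS = 1281
SCAFFOLDS_OVER_1KB = 478         # footnote a
BP_IN_SCAFFOLDS_OVER_1KB = 2_726_561


def pct(count, total):
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


# Footnote b: function prediction = protein-coding minus hypothetical proteins.
with_function = PROTEIN_CODING - HYPOTHETICAL_NO_FUNCTION

# Footnote c: these rows use protein-coding genes as the denominator.
per_protein_coding = {
    "function prediction": pct(with_function, PROTEIN_CODING),
    "COGs": pct(1659, PROTEIN_CODING),
    "Pfam domains": pct(1984, PROTEIN_CODING),
    "signal peptides": pct(337, PROTEIN_CODING),
    "transmembrane helices": pct(626, PROTEIN_CODING),
}

# Gene-category rows use total genes as the denominator.
per_total_genes = {
    "protein-coding": pct(PROTEIN_CODING, TOTAL_GENES),
    "RNA": pct(46, TOTAL_GENES),
    "pseudo": pct(6, TOTAL_GENES),
}

# Footnote a: share of scaffolds >1000 bp, by count and by base pairs.
scaffold_share = pct(SCAFFOLDS_OVER_1KB, SCAFFOLDS)
bp_share = pct(BP_IN_SCAFFOLDS_OVER_1KB, GENOME_SIZE_BP)

if __name__ == "__main__":
    print(with_function, per_protein_coding, per_total_genes)
    print(scaffold_share, bp_share)
```

Note the two denominators: the gene-category rows are relative to all 3097 genes, whereas the annotation rows are relative to the 3045 protein-coding genes, as footnote c states.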