Table 6. Minimal annotation standards and guidelines accepted at the 2010 NCBI genome annotation workshop¹

From: Solving the Problem: Genome Annotation Standards before the Data Deluge

1. A complete prokaryotic genome should have:

a. a set of ribosomal RNAs (at least one each of 5S, 16S, and 23S)

b. a set of tRNAs (at least one for each amino acid)

c. protein-coding genes at expected density (not all named ‘hypothetical protein’ and all core genes annotated)
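The three requirements above can be sketched as a simple pre-submission check. This is a minimal illustration, not an NCBI tool: the feature representation (a list of dicts with `type` and `product` keys), the product naming patterns (`16S ribosomal RNA`, `tRNA-Ala`), and the function name `check_minimal_annotation` are all assumptions made for the example. Gene density and core-gene coverage are not checked, because the table gives no numeric thresholds for them.

```python
# Hypothetical feature representation: {"type": "rRNA", "product": "16S ribosomal RNA"}
STANDARD_AMINO_ACIDS = {
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
}
REQUIRED_RRNAS = {"5S", "16S", "23S"}


def check_minimal_annotation(features):
    """Return a list of problems; an empty list means the minimal set was found."""
    rrnas, trnas, cds_products = set(), set(), []
    for feature in features:
        kind = feature.get("type")
        product = feature.get("product", "")
        if kind == "rRNA":
            # Assumes the product starts with the rRNA size, e.g. "23S ribosomal RNA".
            words = product.split()
            if words:
                rrnas.add(words[0])
        elif kind == "tRNA":
            # Assumes products are named "tRNA-Xxx" with a three-letter amino acid code.
            if product.startswith("tRNA-"):
                trnas.add(product[len("tRNA-"):])
        elif kind == "CDS":
            cds_products.append(product)

    problems = []
    missing_rrnas = REQUIRED_RRNAS - rrnas
    if missing_rrnas:
        problems.append("missing rRNA: " + ", ".join(sorted(missing_rrnas)))
    missing_trnas = STANDARD_AMINO_ACIDS - trnas
    if missing_trnas:
        problems.append("missing tRNA for: " + ", ".join(sorted(missing_trnas)))
    if not cds_products:
        problems.append("no protein-coding genes annotated")
    elif all(p.strip().lower() == "hypothetical protein" for p in cds_products):
        problems.append("all protein-coding genes are named 'hypothetical protein'")
    return problems
```

Calling `check_minimal_annotation(features)` on a parsed genome returns human-readable messages that could be recorded as documented exceptions (item 4) when the gap is real rather than an annotation error.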

2. Annotations should follow INSDC submission guidelines:

Annotation standards should follow feature table format and submission guidelines (GenBank/ENA/DDBJ - Table 1)

a. prior to genome submission, a submitted BioProject record with a registered locus_tag prefix is required, and the genome record should contain the BioProject ID. All proper features should have genes and locus_tags.

b. the genome submission should be valid according to feature table documentation and follow the standards
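The BioProject and locus_tag requirements in item 2a lend themselves to a mechanical check. The sketch below assumes a hypothetical in-memory record (a dict with a `bioproject` key and a `features` list) and assumes locus_tags take the form `<prefix>_<identifier>`; the function name `check_locus_tags` and the sample values are made up for illustration, and the check does not replace validation against the INSDC feature table documentation.

```python
def check_locus_tags(record, registered_prefix):
    """Return a list of problems with the BioProject ID and locus_tags of a record."""
    problems = []
    if not record.get("bioproject"):
        problems.append("genome record has no BioProject ID")

    expected_start = registered_prefix + "_"
    for index, feature in enumerate(record.get("features", [])):
        kind = feature.get("type", "?")
        tag = feature.get("locus_tag")
        if not tag:
            problems.append(f"feature {index} ({kind}) has no locus_tag")
        elif not tag.startswith(expected_start):
            problems.append(
                f"feature {index} ({kind}): locus_tag '{tag}' "
                f"does not start with '{expected_start}'"
            )
    return problems


# Placeholder values for illustration only; not a real BioProject or prefix.
record = {
    "bioproject": "PRJNA000000",
    "features": [
        {"type": "gene", "locus_tag": "ABC_0001"},
        {"type": "CDS", "locus_tag": "ABC_0001"},  # a CDS shares its gene's locus_tag
    ],
}
print(check_locus_tags(record, "ABC"))
```

Duplicate tags are deliberately not flagged, since a gene and the features annotated on it carry the same locus_tag.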

3. Methodologies and SOPs (Standard Operating Procedures):

Information about SOPs and additional metadata can be provided in a structured comment, with more specific information about experimental or inference support provided on annotated features (see Table 2).

4. Exceptions:

Exceptions (unusual annotations or annotations not within expected ranges; see Table 1) should be documented on the genome record, and strong supporting evidence should be provided.

5. Pseudogenes:

Annotated pseudogenes should follow the accepted formats (see Table 4).

6. Additional/enriched annotations:

Additional (enriched) annotations should follow INSDC guidelines, and be documented as above (SOPs and evidence).

7. Catalog of reputable annotation guidelines, software, and pipelines:

This non-exhaustive list of reliable software, sources, and databases for the production of microbial genome annotation is a useful community resource that aids in producing high-quality genome annotation (Table 1).

8. Validation checks and annotation measures:

Validation checks should be done prior to the submission of a new genome record. NCBI has already provided numerous tools to validate and ensure correctness of annotation, and additional checks and reports will be put in place to ensure minimal standards are met (see Table 1).

¹ Guidelines were created for complete genomes (all replicons closed to single contigs). In some cases the minimal set of annotations will not be found on draft genomes, but the guidelines for annotation still apply.