High quality draft genomic sequence of Arenimonas donghaensis DSM 18148T
Standards in Genomic Sciences volume 10, Article number: 59 (2015)
Arenimonas donghaensis is the type species of the genus Arenimonas, which belongs to the family Xanthomonadaceae within the Gammaproteobacteria. In this study, a total of five type strains of Arenimonas were sequenced. The draft genomic information of A. donghaensis DSM 18148T is described and compared with the other four Arenimonas genomes. The genome of A. donghaensis DSM 18148T is 2,977,056 bp in size, distributed in 51 contigs, and contains 2685 protein-coding genes and 49 RNA genes.
Arenimonas donghaensis DSM 18148T (= HO3-R19T = KACC 11381T) was isolated from seashore sand and belongs to the family Xanthomonadaceae. So far, the genus Arenimonas contains seven species: Arenimonas donghaensis (type species), Arenimonas malthae, Arenimonas oryziterrae, Arenimonas composti, Arenimonas metalli, Arenimonas daejeonensis and Arenimonas daechungensis. These bacteria were isolated from seashore sand, oil-contaminated soil, rice rhizosphere, compost, an iron mine, compost and the sediment of a eutrophic reservoir, respectively. The species A. composti was previously classified as Aspromonas composti.
Arenimonas strains share the following characteristics: they are Gram-staining-negative, aerobic, rod-shaped, non-spore-forming, oxidase-positive, non-indole-producing and non-nitrate-reducing. They contain iso-C16:0 and iso-C15:0 as the major fatty acids, phosphatidylglycerol and phosphatidylethanolamine as the major polar lipids and Q-8 as the major respiratory quinone, and they have a relatively high DNA G + C content (63.9–70.8 mol %) [1–7].
To provide genome information and determine genomic differences among Arenimonas species, we sequenced the genomes of A. donghaensis DSM 18148T, A. composti KCTC 12666T, A. malthae CCUG 53596T, A. metalli CF5-1T and A. oryziterrae KCTC 22247T. In this study, we report the genomic features of A. donghaensis DSM 18148T and compare them with those of its close relatives.
Classification and features
Strain A. donghaensis DSM 18148T shares 93.1–95.7 % 16S rRNA gene sequence identity with the other six type strains of Arenimonas species: A. malthae CC-JY-1T (DQ239766) (95.7 %), A. daejeonensis T7-07T (AM229325) (95.7 %), A. metalli CF5-1T (HQ698842) (94.6 %), A. oryziterrae YC6267T (EU376961) (94.3 %), A. composti TR7-09T (AM229324) (94.3 %) and A. daechungensis CH15-1T (JN033774) (93.1 %). A 16S rRNA gene-based neighbor-joining phylogenetic tree of the related strains was constructed using MEGA 5.05 software (Fig. 1).
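The identity values above are percentages of matching positions between two aligned 16S rRNA gene sequences. The following is a minimal sketch of that calculation, assuming the sequences are already aligned and that columns containing a gap in either sequence are skipped (one common convention; the text does not state which was used). The function name and the toy fragments are hypothetical and are not taken from the study.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real 16S rRNA gene data).
seq_a = "ACGTTGCA-ACGT"
seq_b = "ACGTAGCATACGA"
identity = percent_identity(seq_a, seq_b)
print(f"{identity:.1f} % identity")
```

Other gap treatments (for example, counting gaps as mismatches) would give slightly different values, so identities from different tools are only comparable when the convention is the same.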
Cells of A. donghaensis DSM 18148T are Gram-negative, aerobic, non-spore-forming, straight or slightly curved rods that are motile by means of a single polar flagellum. Colonies are yellowish white, translucent and convex on R2A agar after 3 d of cultivation (Fig. 2). The API ID 32 GN and Biolog GN2 MicroPlate systems (bioMérieux) were used to investigate sole carbon source utilization; β-hydroxybutyric acid, L-alaninamide, L-glutamic acid and glycyl-L-glutamic acid were utilized by strain DSM 18148T (Table 1).
The major fatty acids of A. donghaensis DSM 18148T are iso-branched types, such as iso-C16:0, iso-C15:0 and iso-C17:1 ω9c. The major isoprenoid quinone of this bacterium is Q-8. Diphosphatidylglycerol (DPG), phosphatidylglycerol (PG) and phosphatidylethanolamine (PE) are the major polar lipids of this strain.
Genome sequencing information
Genome project history
The genome sequencing project of A. donghaensis DSM 18148T was started in April 2013 and finished within two months. The resulting high-quality draft genome of A. donghaensis DSM 18148T has been deposited at DDBJ/EMBL/GenBank under accession number AVCJ00000000. The version described in this study is the first version, AVCJ01000000. The genome sequencing project information is summarized in Table 2.
Growth conditions and genomic DNA preparation
A. donghaensis DSM 18148T was cultivated aerobically in LB medium at 28 °C for 3 d. The DNA was extracted, concentrated and purified using the QIAamp kit according to the manufacturer’s instructions (Qiagen, Germany).
Genome sequencing and assembly
The whole-genome sequence of A. donghaensis DSM 18148T was determined using the Illumina HiSeq 2000 with the paired-end library strategy (300 bp insert size) at Shanghai Majorbio Bio-pharm Technology Co., Ltd. (Shanghai, China). A total of 9,571,421 reads with an average read length of 93 bp (885.9 Mb of data) were obtained. The detailed methods of library construction and sequencing can be found on Illumina’s official website. Using SOAPdenovo v1.05, these reads were assembled into 51 contigs (>200 bp) with a genome size of 2,977,056 bp and an average coverage of 332.4×.
The draft sequence of strain A. donghaensis DSM 18148T was submitted to the NCBI Prokaryotic Genome Annotation Pipeline for annotation according to the draft WGS annotation guidelines provided on the pipeline website. This annotation pipeline combines the GeneMarkS+ algorithm with a similarity-based gene detection approach to call genes. The automatically predicted gene functions were then manually curated through BlastX analysis against the NCBI protein database with an E-value cutoff of 1e-20.
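The manual check described above relies on keeping only BLAST hits below an E-value cutoff. As an illustration only, the sketch below filters BLAST tabular output (the 12-column `-outfmt 6` layout, in which the E-value is the 11th column) and keeps the best hit per query. It is an assumption that the results were handled in this format, and the cutoff stated in the text is read here as 1e-20. The function name and all gene and protein identifiers are hypothetical.

```python
def best_hits(lines, max_evalue=1e-20):
    """Return {query: (subject, evalue)} with the lowest E-value hit per query.

    Expects BLAST tabular lines (outfmt 6): qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue > max_evalue:
            continue  # hit is weaker than the cutoff
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


# Hypothetical example rows; identifiers and scores are made up.
example = [
    "gene_0001\tprotA\t85.0\t200\t30\t0\t1\t600\t1\t200\t3e-45\t180",
    "gene_0001\tprotB\t40.0\t150\t90\t2\t1\t450\t5\t154\t1e-08\t55",
    "gene_0002\tprotC\t35.0\t100\t65\t1\t1\t300\t10\t109\t2e-05\t40",
]
hits = best_hits(example)
for query, (subject, evalue) in hits.items():
    print(query, subject, evalue)
```

Queries without any hit below the cutoff (here `gene_0002`) are simply absent from the result, which corresponds to leaving them as hypothetical proteins in this simplified view.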
The draft genome of A. donghaensis DSM 18148T is 2,977,056 bp in length, with a G + C content of 68.7 % (Fig. 3 and Table 3), and is distributed in 51 contigs (>200 bp). Of the 2735 predicted genes, 2685 (98.17 %) are protein-coding genes, 49 (1.79 %) are RNA genes and 1 (0.04 %) is a pseudogene. A total of 472 (17.26 %) CDSs were assigned putative functions, while the remaining ones were annotated as hypothetical proteins. Protein function classification (Table 4) was performed by searching all predicted coding sequences of strain DSM 18148T against the Clusters of Orthologous Groups protein database using the BlastP algorithm with an E-value cutoff of 1e-10. A more detailed summary of the genome properties of this strain is provided in Table 3.
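The percentages in this paragraph can be reproduced from the gene counts by dividing each count by the 2735 predicted genes. The short sketch below does this arithmetic; it uses only numbers stated in the text, and the variable and function names are illustrative. Note that the 17.26 % reported for CDSs with putative functions also corresponds to the total gene count as the denominator.

```python
TOTAL_GENES = 2735  # predicted genes reported in the text

counts = {
    "protein-coding genes": 2685,
    "RNA genes": 49,
    "pseudogenes": 1,
    "CDSs with putative functions": 472,
}


def share(count: int, total: int = TOTAL_GENES) -> float:
    """Percentage of the total, rounded to two decimals as in the text."""
    return round(100.0 * count / total, 2)


for label, count in counts.items():
    print(f"{label}: {count} ({share(count)} %)")
```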
Insights from the genome sequences
Strain A. donghaensis DSM 18148T can use only a few sole carbon sources and cannot assimilate glucose or other sugars. Genome analysis using Kyoto Encyclopedia of Genes and Genomes (KEGG) orthology and pathway assignment revealed that this strain has a complete TCA cycle but lacks hexokinase, which catalyzes the first step of glycolysis, as well as the glucose-6-phosphate dehydrogenase, gluconolactonase and 6-phosphogluconate dehydrogenase that are responsible for the oxidative phase of the pentose phosphate pathway. This is in agreement with the experimental result that this bacterium can use only a few sole carbon sources.
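The pathway check described above amounts to asking which enzymes required for a pathway step are absent from the annotation. The sketch below illustrates that logic with plain enzyme names taken from the text. It is not the KEGG assignment procedure itself, and the function name and the toy annotation set are hypothetical.

```python
# Enzymes named in the text, grouped by the pathway step they belong to.
REQUIRED = {
    "glycolysis (first step)": ["hexokinase"],
    "oxidative pentose phosphate pathway": [
        "glucose-6-phosphate dehydrogenase",
        "gluconolactonase",
        "6-phosphogluconate dehydrogenase",
    ],
}


def missing_enzymes(annotated, required=REQUIRED):
    """Map each pathway to the required enzymes not found in the annotation."""
    present = {name.strip().lower() for name in annotated}
    return {
        pathway: [enzyme for enzyme in enzymes if enzyme not in present]
        for pathway, enzymes in required.items()
    }


# Toy annotation (hypothetical): a few TCA cycle enzymes only.
toy_annotation = ["Citrate synthase", "Aconitate hydratase", "Isocitrate dehydrogenase"]
for pathway, absent in missing_enzymes(toy_annotation).items():
    print(pathway, "->", absent if absent else "complete")
```

In practice, matching on KEGG orthology identifiers rather than free-text product names would be more robust, since product names vary between annotation pipelines.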
The general features of the five sequenced Arenimonas genomes are summarized in Table 5. Ortholog clustering analysis was performed for the five Arenimonas genomes using OrthoMCL with a match cutoff of 50 % and an E-value cutoff of 1e-5. These five Arenimonas strains share 1014 genes, which are classified into 21 COG functional categories. The major categories are energy production and conversion (8.7 %), amino acid transport and metabolism (8.7 %), coenzyme transport and metabolism (5.8 %), lipid transport and metabolism (5.1 %), translation, ribosomal structure and biogenesis (12.4 %), replication, recombination and repair (5.2 %), cell wall/membrane/envelope biogenesis (5.9 %), posttranslational modification, protein turnover, chaperones (6.3 %), general function prediction only (8.4 %), function unknown (7.3 %) and signal transduction mechanisms (5.3 %) (Fig. 4 and Table 6).
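Once genes have been grouped into ortholog clusters, the shared (core) genes, the strain-specific genes and the pan-genome size reported in this section follow from simple set operations on cluster membership. The sketch below shows this bookkeeping on toy data. It assumes each cluster has already been reduced to the set of genomes it occurs in and that unclustered singletons are represented as one-genome clusters; it does not reproduce OrthoMCL itself. All cluster identifiers and the function name are hypothetical.

```python
def summarize_clusters(clusters, genomes):
    """Return (core cluster ids, strain-specific ids per genome, pan-genome size).

    clusters: dict mapping cluster id -> set of genome names with a member.
    genomes:  list of all genome names in the comparison.
    """
    all_genomes = set(genomes)
    # Core: clusters with at least one member in every genome.
    core = {cid for cid, members in clusters.items() if members >= all_genomes}
    # Strain-specific: clusters found in exactly one genome.
    specific = {genome: set() for genome in genomes}
    for cid, members in clusters.items():
        if len(members) == 1:
            (only,) = members
            if only in specific:
                specific[only].add(cid)
    # Pan-genome size: every distinct cluster counts once.
    return core, specific, len(clusters)


# Toy data (hypothetical): three genomes, four clusters.
genomes = ["donghaensis", "composti", "malthae"]
clusters = {
    "c1": {"donghaensis", "composti", "malthae"},
    "c2": {"donghaensis", "composti"},
    "c3": {"donghaensis"},
    "c4": {"malthae"},
}
core, specific, pan_size = summarize_clusters(clusters, genomes)
print("core:", sorted(core))
print("specific:", {g: sorted(ids) for g, ids in specific.items()})
print("pan-genome size:", pan_size)
```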
A. donghaensis DSM 18148T has 601 strain-specific genes, which may contribute to the species-specific features of this bacterium. Among them, 359 are classified into 20 COG functional categories, mainly transcription (6.3 %), general function prediction only (8.5 %), function unknown (7.3 %) and signal transduction mechanisms (9.0 %). The remaining 242 unique genes (40.3 %) are not classified into any COG category (Fig. 4 and Table 7). In addition, the five Arenimonas strains have a pan-genome size of 7501 genes. The nucleotide diversity (π) was calculated using MAUVE v2.3 and DnaSP v5. The five Arenimonas genomes have a π value of 0.18, which corresponds to an approximate genus-wide nucleotide sequence identity of 82 %.
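Nucleotide diversity (π) is the average number of nucleotide differences per site between two sequences drawn from the sample, so 1 − π gives the approximate average pairwise identity (1 − 0.18 = 0.82). The sketch below computes π as the mean pairwise difference over an alignment. It is a simplified illustration, not the DnaSP implementation: gapped columns are skipped pair by pair, which is an assumption, and the toy sequences and function names are hypothetical.

```python
from itertools import combinations


def pairwise_difference(a: str, b: str) -> float:
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    compared = 0
    diffs = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def nucleotide_diversity(seqs) -> float:
    """Mean pairwise difference per site over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


# Toy alignment (hypothetical, not genome data).
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
pi = nucleotide_diversity(toy)
print(f"pi = {pi:.3f}, approximate identity = {100 * (1 - pi):.1f} %")
```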
Clustered regularly interspaced short palindromic repeats (CRISPRs) mediate resistance to foreign genetic material and thus inhibit horizontal gene transfer. Screening the five Arenimonas genomes with the online CRISPRfinder program identified only one CRISPR system (on contig 41), in the genome of A. composti KCTC 12666T. This CRISPR array is 5331 bp long, with 29 bp direct repeat (DR) sequences separated by 87 spacers.
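The reported array dimensions allow a rough estimate of the mean spacer length. The sketch below assumes that the array begins and ends with a direct repeat, so that 87 spacers are flanked by 88 repeats; this layout is an assumption and is not stated in the text. Under it, the mean spacer length is about 32 bp. The function name is illustrative.

```python
def mean_spacer_length(array_length: int, repeat_length: int, n_spacers: int) -> float:
    """Mean spacer length for an array of the form repeat-(spacer-repeat) x n."""
    n_repeats = n_spacers + 1  # assumption: the array starts and ends with a repeat
    spacer_total = array_length - n_repeats * repeat_length
    if spacer_total <= 0:
        raise ValueError("repeats alone exceed the array length")
    return spacer_total / n_spacers


# Values reported for the A. composti KCTC 12666T CRISPR array.
estimate = mean_spacer_length(array_length=5331, repeat_length=29, n_spacers=87)
print(f"mean spacer length: {estimate:.1f} bp")
```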
Fifteen available genome sequences of the family Xanthomonadaceae were chosen for genome-based phylogenetic analysis, including the five Arenimonas genomes sequenced in this study. In total, 1014 core protein sequences were extracted using the clustering tool OrthoMCL with default parameters. The neighbor-joining (NJ) phylogenetic tree showed that the five Arenimonas species clustered into the same branch (Fig. 5), which is in accordance with the 16S rRNA gene-based phylogeny (Fig. 1).
As in A. donghaensis DSM 18148T, the TCA cycle is complete and hexokinase is absent in all five Arenimonas strains. The set of proteins responsible for the oxidative phase of the pentose phosphate pathway is also incomplete in all five strains, which may partly explain why these strains can use only a few sole carbon sources.
To the best of our knowledge, this report provides the first genomic information for the genus Arenimonas. The genome-based phylogeny is in agreement with the 16S rRNA gene-based phylogeny, indicating the usefulness of genomic information for bacterial taxonomic classification. Analysis of the genomes shows a correlation between genotype and phenotype, especially for the utilization of sole carbon sources.
KACC: Korean Agricultural Culture Collection
DSM: German Collection of Microorganisms and Cell Cultures
Kwon SW, Kim BY, Weon HY, Baek YK, Go SJ. Arenimonas donghaensis gen. nov., sp. nov., isolated from seashore sand. Int J Syst Evol Microbiol. 2007;57:954–8.
Young CC, Kämpfer P, Ho MJ, Busse HJ, Huber BE, Arun AB, et al. Arenimonas malthae sp. nov., a gammaproteobacterium isolated from an oil-contaminated site. Int J Syst Evol Microbiol. 2007;57:2790–3.
Aslam Z, Park JH, Kim SW, Jeon CO, Chung YR. Arenimonas oryziterrae sp. nov., isolated from a field of rice (Oryza sativa L.) managed under a no-tillage regime, and reclassification of Aspromonas composti as Arenimonas composti comb. nov. Int J Syst Evol Microbiol. 2009;59:2967–72.
Chen F, Shi Z, Wang G. Arenimonas metalli sp. nov., isolated from an iron mine. Int J Syst Evol Microbiol. 2012;62:1744–9.
Jin L, Kim KK, An KG, Oh HM, Lee ST. Arenimonas daejeonensis sp. nov., isolated from compost. Int J Syst Evol Microbiol. 2012;62:1674–8.
Huy H, Jin L, Lee YK, Lee KC, Lee JS, Yoon JH, et al. Arenimonas daechungensis sp. nov., isolated from the sediment of a eutrophic reservoir. Int J Syst Evol Microbiol. 2013;63:484–9.
Jin L, Kim KK, Im WT, Yang HC, Lee ST. Aspromonas composti gen. nov., sp. nov., a novel member of the family Xanthomonadaceae. Int J Syst Evol Microbiol. 2007;57:1876–80.
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–9.
SOAPdenovo v1.05. [http://soap.genomics.org.cn/]
Prokaryotic Genome Annotation Pipeline. [http://www.ncbi.nlm.nih.gov/genome/annotation_prok/]
Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, et al. The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 2001;29:22–8.
Kanehisa M, Goto S, Sato Y, Kawashima M, Furumichi M, Tanabe M. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 2014;42:D199–205.
Li L, Stoeckert Jr CJ, Roos DS. OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes. Genome Res. 2003;13:2178–89.
Medini D, Donati C, Tettelin H, Masignani V, Rappuoli R. The microbial pan-genome. Curr Opin Genet Dev. 2005;15:589–94.
Darling AE, Mau B, Perna NT. ProgressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010;5(6):e11147.
Librado P, Rozas J. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics. 2009;25:1451–2.
Labrie SJ, Samson JE, Moineau S. Bacteriophage resistance mechanisms. Nat Rev Micro. 2010;8:317–27.
CRISPRfinder program online. [http://crispr.u-psud.fr/Server/]
Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol. 2008;26:541–7.
Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms. Proposal for the domains Archaea and Bacteria. Proc Natl Acad Sci U S A. 1990;87:4576–9.
Garrity GM, Bell JA, Lilburn T. Phylum XIV. Proteobacteria phyl. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s Manual of Systematic Bacteriology, vol. 2, Part B. 2nd ed. New York: Springer; 2005. p. 1.
Validation of publication of new names and new combinations previously effectively published outside the IJSEM. List no. 106. Int J Syst Evol Microbiol. 2005;55:2235–8. http://dx.doi.org/10.1099/ijs.0.64108-0
Garrity GM, Bell JA, Lilburn T. Class III. Gammaproteobacteria class. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s Manual of Systematic Bacteriology, vol. 2, Part B. 2nd ed. New York: Springer; 2005. p. 1.
Saddler GS, Bradbury JF. Order III. Xanthomonadales ord. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s Manual of Systematic Bacteriology, vol. 2, Part B. 2nd ed. New York: Springer; 2005. p. 63.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–9.
This work was supported by the National Natural Science Foundation of China (31470226).
The authors declare that they have no competing interests.
FC performed the genomic analysis and wrote the draft manuscript. HW and YC performed the comparative genomic analysis. XL helped the bioinformatics analysis. GW organized the study and revised the manuscript. All authors read and approved the manuscript.
Chen, F., Wang, H., Cao, Y. et al. High quality draft genomic sequence of Arenimonas donghaensis DSM 18148T. Stand in Genomic Sci 10, 59 (2015). https://doi.org/10.1186/s40793-015-0055-4
- Arenimonas donghaensis
- Comparative genomics
- Genome sequence