High quality draft genomic sequence of Arenimonas donghaensis DSM 18148T

Chen, Fang; Wang, Hui; Cao, Yajing; Li, Xiangyang; Wang, Gejiao

doi:10.1186/s40793-015-0055-4

Extended genome report
Open access
Published: 26 August 2015

Standards in Genomic Sciences volume 10, Article number: 59 (2015)

Abstract

Arenimonas donghaensis is the type species of the genus Arenimonas, which belongs to the family Xanthomonadaceae within the Gammaproteobacteria. In this study, five type strains of Arenimonas were sequenced. The draft genome of A. donghaensis DSM 18148^T is described and compared with the other four Arenimonas genomes. The genome of A. donghaensis DSM 18148^T is 2,977,056 bp in size, distributed in 51 contigs, and contains 2685 protein-coding genes and 49 RNA genes.

Introduction

Arenimonas donghaensis DSM 18148^T (= HO3-R19^T = KACC 11381^T), a member of the family Xanthomonadaceae, was isolated from seashore sand [1]. To date, the genus Arenimonas contains seven species: Arenimonas donghaensis (type species) [1], Arenimonas malthae [2], Arenimonas oryziterrae [3], Arenimonas composti [3], Arenimonas metalli [4], Arenimonas daejeonensis [5] and Arenimonas daechungensis [6]. These bacteria were isolated from seashore sand [1], oil-contaminated soil [2], rice rhizosphere [3], compost [3], an iron mine [4], compost [5] and the sediment of a eutrophic reservoir [6], respectively. The species A. composti [3] was previously classified as Aspromonas composti [7].

Arenimonas strains share the following characteristics: they are Gram-staining-negative, aerobic, rod-shaped, non-spore-forming, oxidase-positive, non-indole-producing and non-nitrate-reducing; they contain iso-C_16:0 and iso-C_15:0 as the major fatty acids, phosphatidylglycerol and phosphatidylethanolamine as the major polar lipids and Q-8 as the major respiratory quinone; and they have a relatively high DNA G + C content (63.9–70.8 mol %) [1–7].

To provide genomic information and identify genomic differences among Arenimonas species, we sequenced the genomes of A. donghaensis DSM 18148^T, A. composti KCTC 12666^T, A. malthae CCUG 53596^T, A. metalli CF5-1^T and A. oryziterrae KCTC 22247^T. In this study, we report the genomic features of A. donghaensis DSM 18148^T and compare them with those of its close relatives.

Organism information

Classification and features

Strain A. donghaensis DSM 18148^T shares 93.1–95.7 % 16S rRNA gene identities with the other six type strains of Arenimonas species, A. malthae CC-JY-1^T (DQ239766) (95.7 %), A. daejeonensis T7-07^T (AM229325) (95.7 %), A. metalli CF5-1^T (HQ698842) (94.6 %), A. oryziterrae YC6267^T (EU376961) (94.3 %), A. composti TR7-09^T (AM229324) (94.3 %) and A. daechungensis CH15-1^T (JN033774) (93.1 %). A 16S rRNA gene based neighbor-joining phylogenetic tree of the related strains was obtained using MEGA 5.05 software [8] (Fig. 1).
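The pairwise identity values above are percentages of matching positions between aligned 16S rRNA gene sequences. The following is a minimal sketch of that calculation, assuming the two sequences are already aligned and that gapped columns are excluded; the text does not state how gaps were treated, and the function name and the toy fragments are illustrative, not real 16S rRNA data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # assumption: gapped columns are ignored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy, made-up fragments (hypothetical; not taken from any accession)
toy_a = "ACGTACGTAC-GTACGTACGT"
toy_b = "ACGTACGAACGGTACGTACGT"
print(round(percent_identity(toy_a, toy_b), 1))  # 19 of 20 ungapped columns match
```

Other gap conventions (for example, counting gaps as mismatches) would give slightly different values for the same alignment.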

Cells of A. donghaensis DSM 18148^T are Gram-negative, aerobic, non-spore-forming, straight or slightly curved rods that are motile by means of a single polar flagellum. Colonies are yellowish white, translucent and convex on R2A agar after 3 d of cultivation (Fig. 2). The API ID 32 GN and Biolog GN2 MicroPlate systems (bioMérieux) were used to investigate sole carbon source utilization; strain DSM 18148^T could utilize β-hydroxybutyric acid, L-alaninamide, L-glutamic acid and glycyl-L-glutamic acid (Table 1).

Table 1 Classification and general features of A. donghaensis strain DSM 18148^T according to the MIGS recommendations [21]

The major fatty acids of A. donghaensis DSM 18148^T are iso-branched types, such as iso-C_16:0, iso-C_15:0 and iso-C_17:1 ω9c [1]. The major isoprenoid quinone of this bacterium is Q-8 [1]. Diphosphatidylglycerol (DPG), phosphatidylglycerol (PG) and phosphatidylethanolamine (PE) are the major polar lipids of this strain [1].

Genome sequencing information

Genome project history

The genome sequencing project of A. donghaensis DSM 18148^T was started in April 2013 and finished within two months. The resulting high-quality draft genome of A. donghaensis DSM 18148^T has been deposited at DDBJ/EMBL/GenBank under accession number AVCJ00000000. The version described in this study is the first version, AVCJ01000000. The genome sequencing project information is summarized in Table 2.

Table 2 Project information

Growth conditions and genomic DNA preparation

A. donghaensis DSM 18148^T was cultivated aerobically in LB medium at 28 °C for 3 d. The DNA was extracted, concentrated and purified using the QIAamp kit according to the manufacturer’s instructions (Qiagen, Germany).

Genome sequencing and assembly

The whole-genome sequence of A. donghaensis DSM 18148^T was determined using the Illumina HiSeq 2000 [9] with a paired-end library strategy (300 bp insert size) at Shanghai Majorbio Bio-pharm Technology Co., Ltd. [10] (Shanghai, China). A total of 9,571,421 reads with an average read length of 93 bp (885.9 Mb of data) were obtained. The detailed methods of library construction and sequencing can be found at Illumina’s official website [9]. Using SOAPdenovo v1.05 [11], these reads were assembled into 51 contigs (>200 bp) with a genome size of 2,977,056 bp and an average coverage of 332.4×.

Genome annotation

The draft sequence of strain A. donghaensis DSM 18148^T was submitted to the NCBI Prokaryotic Genome Annotation Pipeline [12] for annotation according to the draft WGS annotation guidelines at this website. This annotation pipeline combines the GeneMarkS+ algorithm with a similarity-based gene detection approach to call genes. The functions of the automatically predicted genes were manually curated through BlastX analysis against the NCBI protein database with an E-value cutoff of 1e-20.

Genome properties

The whole genome of A. donghaensis DSM 18148^T is 2,977,056 bp in length, with a G + C content of 68.7 % (Fig. 3 and Table 3), and is distributed in 51 contigs (>200 bp). Of the 2735 predicted genes, 2685 (98.17 %) are protein-coding genes, 49 (1.79 %) are RNA genes and 1 (0.04 %) is a pseudogene. A total of 472 (17.26 %) CDSs were assigned putative functions, while the remaining ones were annotated as hypothetical proteins. The protein function classification is shown in Table 4; it was obtained by searching all the predicted coding sequences of strain DSM 18148^T against the Clusters of Orthologous Groups protein database [13] using the BlastP algorithm with an E-value cutoff of 1e-10. A more detailed summary of the genome properties of this strain is provided in Table 3.
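The percentages quoted for the gene categories can be checked with simple arithmetic. The short sketch below recomputes them from the counts given in the text, taking the 2735 predicted genes as the denominator; the variable names are illustrative.

```python
total_genes = 2735  # predicted genes reported for DSM 18148^T

counts = {
    "protein-coding genes": 2685,
    "RNA genes": 49,
    "pseudogenes": 1,
}

# The three categories should account for every predicted gene.
assert sum(counts.values()) == total_genes

# Share of each category, as a percentage of all predicted genes.
shares = {name: round(100 * n / total_genes, 2) for name, n in counts.items()}

# The 17.26 % quoted for the 472 CDSs is also relative to all 2735 genes.
putative_share = round(100 * 472 / total_genes, 2)

for name, pct in shares.items():
    print(f"{name}: {pct} %")
print(f"CDSs with putative functions: {putative_share} %")
```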

Table 3 Genome statistics

Table 4 Number of genes associated with general COG functional categories

Insights from the genome sequences

Strain A. donghaensis DSM 18148^T can use only a few sole carbon sources and cannot assimilate glucose or other sugars [1]. Orthology and pathway assignment analysis with the Kyoto Encyclopedia of Genes and Genomes (KEGG) [14] revealed that this strain has a complete TCA cycle but lacks hexokinase, which catalyzes the first step of glycolysis, as well as the glucose-6-phosphate dehydrogenase, gluconolactonase and 6-phosphogluconate dehydrogenase that are responsible for the oxidative phase of the pentose phosphate pathway. This agrees with the experimental observation that this bacterium can use only a few sole carbon sources.
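The kind of pathway-completeness check described above can be pictured as a set difference between the enzymes a pathway step requires and the enzymes found in the annotation. The sketch below is only an illustration under that assumption: the enzyme names come from the paragraph above, while the `annotated` set is a hypothetical toy example, not the actual annotation of DSM 18148^T, and the authors' KEGG workflow is not reproduced here.

```python
# Enzymes named in the text, grouped by the pathway step they belong to.
REQUIRED = {
    "glycolysis, first step": {"hexokinase"},
    "pentose phosphate pathway, oxidative phase": {
        "glucose-6-phosphate dehydrogenase",
        "gluconolactonase",
        "6-phosphogluconate dehydrogenase",
    },
}


def missing_enzymes(annotated, required=REQUIRED):
    """Return, for each pathway step, the required enzymes absent from the annotation."""
    return {step: sorted(enzymes - annotated) for step, enzymes in required.items()}


# Hypothetical toy annotation containing only a few TCA cycle enzymes.
annotated = {"citrate synthase", "aconitate hydratase", "isocitrate dehydrogenase"}

result = missing_enzymes(annotated)
for step, missing in result.items():
    status = "complete" if not missing else "missing: " + ", ".join(missing)
    print(f"{step}: {status}")
```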

The general features of the five sequenced Arenimonas genomes are summarized in Table 5. Ortholog clustering analysis of the five Arenimonas genomes was performed using OrthoMCL [15] with a match cutoff of 50 % and an E-value cutoff of 1e-5. These five Arenimonas bacteria share 1014 genes, which are classified into 21 COG functional categories. The major categories are energy production and conversion (8.7 %), amino acid transport and metabolism (8.7 %), coenzyme transport and metabolism (5.8 %), lipid transport and metabolism (5.1 %), translation, ribosomal structure and biogenesis (12.4 %), replication, recombination and repair (5.2 %), cell wall/membrane/envelope biogenesis (5.9 %), posttranslational modification, protein turnover, chaperones (6.3 %), general function prediction only (8.4 %), function unknown (7.3 %) and signal transduction mechanisms (5.3 %) (Fig. 4 and Table 6).
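Once orthologs have been clustered, the shared (core) genes are those whose cluster has at least one member in every genome. The sketch below illustrates that selection step on toy data; the cluster identifiers, genome labels and gene names are hypothetical, and the exact counting convention behind the figure of 1014 shared genes is not stated in the text.

```python
def core_clusters(clusters, genomes):
    """Return the ids of clusters that contain at least one gene from every genome.

    clusters: dict mapping cluster id -> list of (genome, gene_id) pairs
    genomes:  iterable of all genome labels in the comparison
    """
    all_genomes = set(genomes)
    core = []
    for cluster_id, members in clusters.items():
        present = {genome for genome, _gene in members}
        if present == all_genomes:
            core.append(cluster_id)
    return sorted(core)


# Hypothetical toy input with three genomes labelled A, B and C.
toy_genomes = ["A", "B", "C"]
toy_clusters = {
    "cl1": [("A", "a1"), ("B", "b1"), ("C", "c1")],  # present in all genomes
    "cl2": [("A", "a2"), ("B", "b2")],               # missing from C
    "cl3": [("A", "a3"), ("A", "a4"), ("B", "b3"), ("C", "c2")],  # paralogs in A
    "cl4": [("C", "c3")],                            # found only in C
}

print(core_clusters(toy_clusters, toy_genomes))
```

Requiring exactly one member per genome instead would restrict the result to single-copy core clusters, which is a common choice when the core set is used for phylogeny.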

Table 5 General features of the five Arenimonas genomes

Table 6 Number of genes in the core genome of the five analyzed Arenimonas genomes associated with general COG functional categories

A. donghaensis DSM 18148^T has 601 strain-specific genes, which may contribute to the species-specific features of this bacterium. Among them, 359 are classified into 20 COG functional categories, mainly transcription (6.3 %), general function prediction only (8.5 %), function unknown (7.3 %) and signal transduction mechanisms (9.0 %). The remaining 242 unique genes (40.3 %) are not classified into any COG category (Fig. 4 and Table 7). In addition, the five Arenimonas strains have a pan-genome [16] size of 7501 genes. The nucleotide diversity (π) was calculated using MAUVE v2.3 [17] and DnaSP v5 [18]. The five Arenimonas genomes have a nucleotide diversity (π) value of 0.18, which corresponds to an approximate genus-wide nucleotide sequence similarity of 82 %.
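Nucleotide diversity (π) is commonly defined as the average number of nucleotide differences per site between all pairs of sequences, so a value of 0.18 corresponds to 1 − 0.18 = 0.82, or about 82 % of sites being identical on average. The following is a minimal sketch of that definition on a toy alignment, assuming columns with gaps or ambiguous bases are skipped; it is not the MAUVE and DnaSP workflow used in the study, and the sequences are made up.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Average pairwise differences per site over columns with only A, C, G or T."""
    seqs = [s.upper() for s in aligned_seqs]
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    # Assumption: columns containing gaps or ambiguous bases are excluded.
    sites = [i for i in range(length) if all(s[i] in "ACGT" for s in seqs)]
    if not sites:
        raise ValueError("no usable alignment columns")
    pairs = list(combinations(seqs, 2))
    differences = sum(sum(a[i] != b[i] for i in sites) for a, b in pairs)
    return differences / (len(pairs) * len(sites))


# Hypothetical toy alignment of three sequences, ten columns each.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAT"]
pi = nucleotide_diversity(toy_alignment)
print(f"pi = {pi:.4f}, mean pairwise identity = {1 - pi:.4f}")
```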

Table 7 Number of strain-specific genes of A. donghaensis DSM 18148^T associated with general COG functional categories

Clustered regularly interspaced short palindromic repeats (CRISPRs) mediate resistance to foreign genetic material and thus inhibit horizontal gene transfer [19]. Screening the five Arenimonas genomes with the online CRISPRfinder program [20] revealed only one CRISPR system, located on contig 41 of the A. composti KCTC 12666^T genome. This CRISPR array is 5331 bp long, with 29 bp direct repeat (DR) sequences separated by 87 spacers.
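The three figures reported for the CRISPR array (total length, repeat length and spacer count) allow a rough estimate of the average spacer length. The sketch below assumes that the array begins and ends with a direct repeat, so that 87 spacers are flanked by 88 repeats; that layout is an assumption and is not stated in the text.

```python
array_length = 5331   # bp, reported length of the CRISPR array
repeat_length = 29    # bp, reported direct repeat length
n_spacers = 87        # reported number of spacers

# Assumption: the array starts and ends with a repeat, giving one more repeat than spacers.
n_repeats = n_spacers + 1

repeat_total = n_repeats * repeat_length      # bp occupied by repeats
spacer_total = array_length - repeat_total    # bp left for spacers
mean_spacer_length = spacer_total / n_spacers

print(f"repeats: {repeat_total} bp, spacers: {spacer_total} bp")
print(f"mean spacer length: {mean_spacer_length:.1f} bp")
```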

Fifteen available genome sequences of the family Xanthomonadaceae were chosen for genome-based phylogenetic analysis, including the five Arenimonas genomes sequenced in this study. In total, 1014 core protein sequences were extracted using the clustering tool OrthoMCL with default parameters [15]. The neighbor-joining (NJ) phylogenetic tree showed that the five Arenimonas species clustered on the same branch (Fig. 5), in accordance with the 16S rRNA gene-based phylogeny (Fig. 1).

As in A. donghaensis DSM 18148^T, the TCA cycle is complete and hexokinase is absent in all five Arenimonas strains. The set of proteins responsible for the oxidative phase of the pentose phosphate pathway is also incomplete in all five strains, which may partly explain why they can use only a few sole carbon sources.

Conclusions

To the best of our knowledge, this report provides the first genomic information for the genus Arenimonas. The genome-based phylogeny agrees with the 16S rRNA gene-based phylogeny, indicating the usefulness of genomic information for bacterial taxonomic classification. Analysis of the genome shows a certain correlation between genotype and phenotype, especially in the utilization of sole carbon sources.

Abbreviations

KACC: Korean Agricultural Culture Collection
DSMZ: German Collection of Microorganisms and Cell Cultures
DPG: Diphosphatidylglycerol
PG: Phosphatidylglycerol
PE: Phosphatidylethanolamine

References

1. Kwon SW, Kim BY, Weon HY, Baek YK, Go SJ. Arenimonas donghaensis gen. nov., sp. nov., isolated from seashore sand. Int J Syst Evol Microbiol. 2007;57:954–8.
2. Young CC, Kämpfer P, Ho MJ, Busse HJ, Huber BE, Arun AB, et al. Arenimonas malthae sp. nov., a gammaproteobacterium isolated from an oil-contaminated site. Int J Syst Evol Microbiol. 2007;57:2790–3.
3. Aslam Z, Park JH, Kim SW, Jeon CO, Chung YR. Arenimonas oryziterrae sp. nov., isolated from a field of rice (Oryza sativa L.) managed under a no-tillage regime, and reclassification of Aspromonas composti as Arenimonas composti comb. nov. Int J Syst Evol Microbiol. 2009;59:2967–72.
4. Chen F, Shi Z, Wang G. Arenimonas metalli sp. nov., isolated from an iron mine. Int J Syst Evol Microbiol. 2012;62:1744–9.
5. Jin L, Kim KK, An KG, Oh HM, Lee ST. Arenimonas daejeonensis sp. nov., isolated from compost. Int J Syst Evol Microbiol. 2012;62:1674–8.
6. Huy H, Jin L, Lee YK, Lee KC, Lee JS, Yoon JH, et al. Arenimonas daechungensis sp. nov., isolated from the sediment of a eutrophic reservoir. Int J Syst Evol Microbiol. 2013;63:484–9.
7. Jin L, Kim KK, Im WT, Yang HC, Lee ST. Aspromonas composti gen. nov., sp. nov., a novel member of the family Xanthomonadaceae. Int J Syst Evol Microbiol. 2007;57:1876–80.
8. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28:2731–9.
9. Illumina. [http://www.illumina.com.cn/]
10. Majorbio. [http://www.majorbio.com/]
11. SOAPdenovo v1.05. [http://soap.genomics.org.cn/]
12. Prokaryotic Genome Annotation Pipeline. [http://www.ncbi.nlm.nih.gov/genome/annotation_prok/]
13. Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, et al. The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res. 2001;29:22–8.
14. Kanehisa M, Goto S, Sato Y, Kawashima M, Furumichi M, Tanabe M. Data, information, knowledge and principle: back to metabolism in KEGG. Nucleic Acids Res. 2014;42:D199–205.
15. Li L, Stoeckert Jr CJ, Roos DS. OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes. Genome Res. 2003;13:2178–89.
16. Medini D, Donati C, Tettelin H, Masignani V, Rappuoli R. The microbial pan-genome. Curr Opin Genet Dev. 2005;15:589–94.
17. Darling AE, Mau B, Perna NT. ProgressiveMauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010;5(6):e11147.
18. Librado P, Rozas J. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics. 2009;25:1451–2.
19. Labrie SJ, Samson JE, Moineau S. Bacteriophage resistance mechanisms. Nat Rev Micro. 2010;8:317–27.
20. CRISPRfinder program online. [http://crispr.u-psud.fr/Server/]
21. Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol. 2008;26:541–7.
22. Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms. Proposal for the domains Archaea and Bacteria. Proc Natl Acad Sci U S A. 1990;87:4576–9.
23. Garrity GM, Bell JA, Lilburn T. Phylum XIV. Proteobacteria phyl. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s Manual of Systematic Bacteriology, vol. 2, Part B. 2nd ed. New York: Springer; 2005. p. 1.
24. Validation of publication of new names and new combinations previously effectively published outside the IJSEM. List no. 106. Int J Syst Evol Microbiol. 2005;55:2235–8. http://dx.doi.org/10.1099/ijs.0.64108-0
25. Garrity GM, Bell JA, Lilburn T. Class III. Gammaproteobacteria class. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s Manual of Systematic Bacteriology, vol. 2, Part B. 2nd ed. New York: Springer; 2005. p. 1.
26. Saddler GS, Bradbury JF. Order III. Xanthomonadales ord. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s Manual of Systematic Bacteriology, vol. 2, Part B. 2nd ed. New York: Springer; 2005. p. 63.
27. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–9.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (31470226).

Author information

Authors and Affiliations

State Key Laboratory of Agricultural Microbiology, College of Life Sciences and Technology, Huazhong Agricultural University, Wuhan, 430070, P. R. China
Fang Chen, Hui Wang, Yajing Cao, Xiangyang Li & Gejiao Wang

Corresponding author

Correspondence to Gejiao Wang.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

FC performed the genomic analysis and wrote the draft manuscript. HW and YC performed the comparative genomic analysis. XL helped with the bioinformatics analysis. GW organized the study and revised the manuscript. All authors read and approved the manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Chen, F., Wang, H., Cao, Y. et al. High quality draft genomic sequence of Arenimonas donghaensis DSM 18148^T. Stand in Genomic Sci 10, 59 (2015). https://doi.org/10.1186/s40793-015-0055-4

Received: 22 September 2014
Accepted: 05 August 2015
Published: 26 August 2015
DOI: https://doi.org/10.1186/s40793-015-0055-4
