- Short genome report
- Open Access
Genome sequence of the soil bacterium Corynebacterium callunae type strain DSM 20147T
Standards in Genomic Sciences volume 10, Article number: 5 (2015)
Corynebacterium callunae DSM 20147T is a member of the genus Corynebacterium, which contains Gram-positive, non-spore-forming bacteria with a high G + C content. C. callunae was isolated during a screening for l-glutamic acid-producing bacteria and belongs to the aerobic and non-haemolytic corynebacteria. As this is a type strain in a subgroup of industrially relevant bacteria, for many of which complete genome sequences are also available, knowledge of its complete genome sequence might enable genome comparisons to identify production-relevant genetic loci. This project, describing the 2.84 Mbp long chromosome and the two plasmids, pCC1 (4.11 kbp) and pCC2 (85.02 kbp), with their 2,647 protein-coding and 82 RNA genes, will aid the Genomic Encyclopedia of Bacteria and Archaea project.
Strain DSM 20147T is the type strain in a subgroup of industrially relevant bacteria. It was originally isolated during a screening for l-glutamic acid-producing microorganisms and was classified as a member of the genus Corynebacterium. This genus comprises Gram-positive bacteria with a high G + C content. It currently contains 126 validly published members (species and subspecies), 4 of which are synonyms of other species within the genus, 27 of which were later reclassified as members of 7 other genera, and 1 of which was abolished in an erratum [2–11]. The remaining 93 were isolated from diverse backgrounds such as soil, sea, or ripening cheese, but also from human clinical samples and animals.
Within this diverse genus, C. callunae has been found to be a producer of l-glutamic acid, like one of the most prominent representatives of the corynebacteria, C. glutamicum. The biological context of this species is, unfortunately, essentially unknown, as it was first described in a patent application that mentions neither the geographic location nor the exact habitat of the strain. Based on the name and the habitats of its close relatives C. glutamicum, C. deserti, and C. efficiens, the most likely habitat of C. callunae is soil around heather plants. While the biotechnological uses and capabilities of this subgroup within the genus Corynebacterium have been studied in detail, especially for C. glutamicum, the ability of all these strains to secrete considerable amounts of l-glutamic acid is still not well understood in the context of the environment.
C. callunae DSM 20147T harbors two cryptic plasmids: pCC1 (4,109 bp), which encodes a Rep protein that shows similarity to those of the corynebacterial plasmids pAG3 and pBL1, and pCC2 (85,023 bp), whose Rep protein has possible orthologs in many other corynebacteria. Aside from this, DSM 20147T is an alkaline-tolerant bacterium that grows well at pH 5.0–9.0 (optimum pH 6–8). Here we present a summary classification and a set of features for C. callunae DSM 20147T, together with the description of the genomic sequencing and annotation.
Classification and features
A representative genomic 16S rRNA sequence of C. callunae DSM 20147T was compared to the Ribosomal Database Project database, confirming the initial taxonomic classification. C. callunae shows highest similarity to C. glutamicum and C. deserti (97% each).
Figure 1 shows the phylogenetic neighborhood of C. callunae in a 16S rRNA based tree. C. callunae forms a subgroup that also contains the species C. glutamicum ATCC 13032T, C. deserti GIMN1.010T, and C. efficiens YS-314T.
C. callunae DSM 20147T is a Gram-positive, rod-shaped bacterium, which is 1–2 μm long and 0.4–0.6 μm wide (Figure 2). It is described as non-motile, which is consistent with a complete lack of genes associated with ‘cell motility’ (functional category N in the COGs table). Growth of DSM 20147T was shown at temperatures of 25–37°C, with optimal l-glutamic acid production at 25–35°C. Carbon sources utilized by strain DSM 20147T include dextrose, fructose, galactose, inulin, inositol, maltose, mannitol, mannose, raffinose, salicin, sucrose and trehalose. DSM 20147T tested positive for citrate, catalase and urease, but shows no nitrate reduction activity. Details on the chemotaxonomy are largely missing, but can be inferred from the close relatives C. glutamicum, C. efficiens, and C. deserti. Based on these relatives, meso-diaminopimelic acid is expected to be the major diamino acid of the cell wall, with arabinose and galactose as the main sugars (chemotype IV). Short-chain mycolic acids (32–36 carbon atoms) are also certain to be present, as all necessary genes were found. The major cellular fatty acids are expected to be hexadecanoic acid (C16:0, 40–50%) and octadecenoic acid (C18:1 ω9c, 40–50%), with small amounts of octadecanoic acid (C18:0, ~1%) and possibly others. MK-9(H2) is thought to be the major menaquinone, although MK-8(H2) might also be present in significant amounts. Phosphatidylinositol, diphosphatidylglycerol, and phosphatidylglycerol as well as their glycosides are expected to be the main components of the polar lipids (Table 1).
Genome sequencing and annotation
Genome project history
Due to its phylogenetic position in the near neighborhood of industrially relevant species of the genus Corynebacterium, C. callunae was selected for sequencing as part of a project to define production-relevant loci in corynebacteria. Although the project is not part of GEBA, sequencing of the type strain will nonetheless aid the GEBA effort. The genome project is deposited in the Genomes OnLine Database and the complete genome sequence is deposited in GenBank. Sequencing, finishing and annotation were performed at the CeBiTec. A summary of the project information is shown in Table 2.
Growth conditions and DNA isolation
C. callunae DSM 20147T was grown aerobically in CASO bouillon (Carl Roth GmbH, Karlsruhe, Germany) at 30°C. DNA was isolated from ~10⁸ cells using the protocol described by Tauch et al.
Genome sequencing and assembly
Two libraries were prepared: a WGS library using the Illumina-Compatible Nextera DNA Sample Prep Kit (Epicentre, WI, U.S.A.) and a 6 k MatePair library using the Nextera Mate Pair Sample Preparation Kit, both according to the manufacturer's protocol. Both libraries were sequenced in a 2× 250 bp paired read run on the MiSeq platform, yielding 1,747,266 total reads and providing 99.51× coverage of the genome. Reads were assembled using the Newbler assembler v2.8 (Roche). The initial Newbler assembly consisted of 29 contigs in four scaffolds. Analysis of the four scaffolds revealed two to be extrachromosomal elements (plasmids pCC1 and pCC2), one to make up the chromosome, and the remaining one to contain the seven copies of the RRN operon.
The Phred/Phrap/Consed software package [30–33] was used for sequence assembly and quality assessment in the subsequent finishing process. Gaps between contigs were closed by manual editing in Consed (for repetitive elements).
Gene prediction and annotation were done using PGAP. Genes were identified using GeneMark, GLIMMER, and Prodigal. For annotation, BLAST searches against the NCBI Protein Clusters Database were performed, and the annotation was enriched by searches against the Conserved Domain Database and subsequent assignment of coding sequences to COGs. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE, Infernal, RNAmmer, Rfam, TMHMM, and SignalP.
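The relation between read count, read length and depth of coverage can be sketched as follows. This is a minimal illustration, not the authors' calculation: it assumes that coverage is computed against the total genome size (chromosome plus both plasmids, 2,928,683 bp), which the text does not state, and the function names are hypothetical.

```python
# Back-of-the-envelope check of the sequencing depth reported in the text.
# Assumption: coverage refers to the whole genome (chromosome + plasmids).
TOTAL_READS = 1_747_266       # reads from both libraries combined
GENOME_SIZE_BP = 2_928_683    # total genome size reported in the text
MAX_READ_LENGTH_BP = 250      # nominal read length of the 2x 250 bp run
REPORTED_COVERAGE = 99.51     # fold coverage reported in the text


def coverage(n_reads: int, mean_read_length: float, genome_size: int) -> float:
    """Mean depth of coverage: sequenced bases divided by genome size."""
    return n_reads * mean_read_length / genome_size


def implied_read_length(n_reads: int, depth: float, genome_size: int) -> float:
    """Mean usable read length that would produce the given depth."""
    return depth * genome_size / n_reads


# Upper bound if every read contributed its full nominal length.
upper_bound = coverage(TOTAL_READS, MAX_READ_LENGTH_BP, GENOME_SIZE_BP)
# Mean length per read implied by the reported coverage.
mean_length = implied_read_length(TOTAL_READS, REPORTED_COVERAGE, GENOME_SIZE_BP)

print(f"upper bound at full read length: {upper_bound:.1f}x")
print(f"implied mean usable read length: {mean_length:.1f} bp")
```

Under these assumptions the reported coverage is lower than the full-length upper bound, which would be expected if reads were trimmed or filtered before assembly; the text itself does not describe such a step.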
The genome (2,928,683 bp in total) includes one circular chromosome of 2,839,551 bp (52.39% G + C content) and two plasmids of 4,109 bp (54.42% G + C content) and 85,023 bp (54.38% G + C content; Figure 3). For chromosome and plasmids, a total of 2,729 genes were predicted, 2,647 of which are protein-coding genes. Of these, 2,085 (76.40% of all predicted genes) were assigned a putative function; the remainder were annotated as hypothetical proteins. 1,937 protein-coding genes belong to 314 paralogous families in this genome, corresponding to a gene content redundancy of 41.52%. The properties and the statistics of the genome are summarized in Tables 3, 4 and 5.
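The figures in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text; the variable and function names are illustrative. It derives the chromosome size from the total genome size and the two plasmid sizes, the number of RNA genes from the gene counts, and shows which denominator reproduces the reported 76.40%.

```python
# Consistency check of the genome statistics reported in the text.
TOTAL_GENOME_BP = 2_928_683
PLASMIDS_BP = {"pCC1": 4_109, "pCC2": 85_023}
TOTAL_GENES = 2_729
PROTEIN_CODING_GENES = 2_647
GENES_WITH_FUNCTION = 2_085


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole` as a percentage, rounded to two decimals."""
    return round(100 * part / whole, 2)


# Chromosome size is whatever remains after subtracting both plasmids.
chromosome_bp = TOTAL_GENOME_BP - sum(PLASMIDS_BP.values())
# Genes that are not protein coding are counted as RNA genes.
rna_genes = TOTAL_GENES - PROTEIN_CODING_GENES

print(f"chromosome: {chromosome_bp:,} bp")
print(f"RNA genes: {rna_genes}")
print(f"with function, of all genes: {percent(GENES_WITH_FUNCTION, TOTAL_GENES)}%")
print(f"with function, of protein-coding genes: "
      f"{percent(GENES_WITH_FUNCTION, PROTEIN_CODING_GENES)}%")
```

The derived chromosome size agrees with the 2.84 Mbp given in the abstract, and the RNA gene count matches the 82 RNA genes stated there. The 76.40% value is reproduced when the total gene count is used as the denominator.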
Insights from the genome sequence
The complete genome sequence of C. callunae has already been mined for biotechnological purposes: it was used to determine the core genome of the C. glutamicum - C. efficiens - C. callunae subgroup and thereby to define the chassis genome for C. glutamicum. Comparison of the three genomes using EDGAR reveals that the core genome of this group comprises just 1,873 genes. The number of genes found only in C. callunae is also relatively small (366), especially when compared to the number of singletons found in the other two species (926 and 773 in C. glutamicum and C. efficiens, respectively; Figure 4). As C. callunae was shown to produce l-glutamate in amounts comparable to C. glutamicum, it might be considered a potential candidate for future genome reduction efforts, since its chromosome is already considerably smaller than those of C. glutamicum and C. efficiens (2.84 Mbp versus 3.21 Mbp and 3.15 Mbp, respectively). Such an approach is aided by the observation that many of the singletons are clustered in just three regions (I: H924_02045-H924_02230, 37 genes, 25.2 kbp; II: H924_03630-H924_03880, 50 genes, 52.5 kbp; III: H924_07070-H924_07380, 61 genes, 48.2 kbp), which together constitute ~4.4% of the genome size. As at least regions II and III are likely prophages, loss of these regions should be neutral or even beneficial, as demonstrated for C. glutamicum.
One central prerequisite for future rational strain development is the genetic accessibility of the prospective strain. Knowledge of the complete genome sequence of C. callunae helps to overcome at least two of the main obstacles: the construction of plasmids usable as vectors and the removal of elements that hinder DNA transfer. For the former, knowledge of the sequences of the two plasmids pCC1 and pCC2 allows the use of plasmid-tagging approaches via a counter-selectable marker to cure them, should conventional approaches like heat-shock curing fail. Once the strain is cured, the plasmid sequences help to identify the minimal gene set necessary for replication in order to build shuttle vectors, as demonstrated for pCC1. For the latter, the genome sequence helps to identify restriction-modification systems. A preliminary analysis revealed the presence of at least 4 such systems, one of which is located in the potential prophage region II. Removal of such systems has been shown to significantly enhance the stability of introduced foreign DNA, thus facilitating genetic engineering approaches.
The complete genome sequence of C. callunae is the third genome sequence of the C. glutamicum - C. deserti - C. efficiens - C. callunae subgroup of l-glutamic acid-producing corynebacteria within the genus Corynebacterium. Knowledge of the complete genome sequence has already contributed to identifying the core genome of this group. With a size of 2.84 Mbp and a total of 2,647 protein-coding genes, the genome of C. callunae is by far the smallest within this group. Therefore, this bacterium might be an ideal choice for future development of a platform strain, as the otherwise high degree of similarity of its genome content to that of the well-studied C. glutamicum would allow an easy transfer of knowledge to the new host. Furthermore, knowledge of the complete genome sequence also facilitates the identification of possible targets to improve the accessibility to genetic engineering (like restriction-modification systems) and to enhance genome stability (like phages and transposases).
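The share of the genome covered by the three singleton-rich regions can be recomputed from the numbers given above. The sketch below is illustrative only: the `Region` class is hypothetical, the chromosome size is derived from the reported total genome size minus the two plasmids, and since the text does not say whether "genome size" refers to the chromosome or to the whole genome, both are computed. The gene counts per region may include genes that are not singletons, so the singleton share is an upper bound.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One singleton-rich region as described in the text."""
    name: str
    genes: int
    size_kbp: float


REGIONS = [
    Region("I", 37, 25.2),
    Region("II", 50, 52.5),
    Region("III", 61, 48.2),
]
GENOME_KBP = 2_928.683                         # chromosome + both plasmids
CHROMOSOME_KBP = GENOME_KBP - 4.109 - 85.023   # derived, not quoted
SINGLETONS = 366                               # genes found only in C. callunae

total_kbp = sum(region.size_kbp for region in REGIONS)
total_genes = sum(region.genes for region in REGIONS)

share_of_chromosome = 100 * total_kbp / CHROMOSOME_KBP
share_of_genome = 100 * total_kbp / GENOME_KBP
# Upper bound: assumes every gene in the three regions is a singleton.
max_singleton_share = 100 * total_genes / SINGLETONS

print(f"regions I-III: {total_genes} genes, {total_kbp:.1f} kbp")
print(f"share of chromosome: {share_of_chromosome:.2f}%")
print(f"share of whole genome: {share_of_genome:.2f}%")
print(f"at most {max_singleton_share:.1f}% of all singletons")
```

Relative to the chromosome, the three regions come out close to the ~4.4% quoted in the text; relative to the whole genome the value is slightly lower.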
MP prepared and wrote the manuscript, AA performed library preparation and sequencing, HB and KN performed electron microscopy, JK coordinated the study, and CR assembled and analyzed the genome sequence.
CeBiTec: Center for Biotechnology
GEBA: Genomic Encyclopedia of Bacteria and Archaea
Lee WH, Good RC: Amino Acid Synthesis. International Minerals & Chemical Corporation; 1963:1–14.
Wu C-Y, Zhuang L, Zhou S-G, Li F-B, He J: Corynebacterium humireducens sp. nov., an alkaliphilic, humic acid-reducing bacterium isolated from a microbial fuel cell. Int J Syst Evol Microbiol 2011, 61:882–887. 10.1099/ijs.0.020909-0
Akasaka H, Akimov VN, Anderson RC, Ariskina EV, Austin B, Behrendt U, Benno Y, Benson DR, Bernard KA, Berry AM, Biavati B, Buczolits S, Busse H-J, Butler WR, Carro L, Cavaletti L, Chen W-F, Collins MD, Costa MSd, Cui X-L, Denner EBM, Dewhirst FE, Donadio S, Dorofeeva LV, Euzéby JP, Evtushenko LI, Fernández-Garayzábal JF, Franco C, Funke G, Garrity GM: The Actinobacteria. 2nd edition. New York: Springer Verlag; 2012.
Aravena-Román M, Spröer C, Siering C, Inglis T, Schumann P, Yassin AF: Corynebacterium aquatimens sp. nov., a lipophilic Corynebacterium isolated from blood cultures of a patient with bacteremia. Syst Appl Microbiol 2012, 35:380–384. 10.1016/j.syapm.2012.06.008
Validation List No. 148: List of new names and new combinations previously effectively, but not validly, published. Int J Syst Evol Microbiol 2012, 62:2549–2554.
Zhou Z, Yuan M, Tang R, Chen M, Lin M, Zhang W: Corynebacterium deserti sp. nov., isolated from desert sand. Int J Syst Evol Microbiol 2012, 62:791–794. 10.1099/ijs.0.030429-0
Frischmann A, Knoll A, Hilbert F, Zasada AA, Kämpfer P, Busse H-J: Corynebacterium epidermidicanis sp. nov., isolated from skin of a dog. Int J Syst Evol Microbiol 2012, 62:2194–2200. 10.1099/ijs.0.036061-0
Wiertz R, Schulz SC, Müller U, Kämpfer P, Lipski A: Corynebacterium frankenforstense sp. nov. and Corynebacterium lactis sp. nov., isolated from raw cow milk. Int J Syst Evol Microbiol 2013, 63:4495–4501. 10.1099/ijs.0.050757-0
Shin N-R, Jung M-J, Kim M-S, Roh SW, Nam Y-D, Bae J-W: Corynebacterium nuruki sp. nov., isolated from an alcohol fermentation starter. Int J Syst Evol Microbiol 2011, 61:2430–2434. 10.1099/ijs.0.027763-0
Hoyles L, Ortman K, Cardew S, Foster G, Rogerson F, Falsen E: Corynebacterium uterequi sp. nov., a non-lipophilic bacterium isolated from urogenital samples from horses. Vet Microbiol 2013, 165:469–474. 10.1016/j.vetmic.2013.03.025
Oren A, Garrity GM: List of new names and new combinations previously effectively, but not validly, published. Int J Syst Evol Microbiol 2013, 63:3931–3934.
Cole JR, Wang Q, Cardenas E, Fish J, Chai B, Farris RJ, Kulam-Syed-Mohideen AS, McGarrell DM, Marsh T, Garrity GM, Tiedje JM: The ribosomal database project: improved alignments and new tools for rRNA analysis. Nucleic Acids Res 2009, 37:D141-D145. 10.1093/nar/gkn879
Bruno WJ, Socci ND, Halpern AL: Weighted neighbor joining: a likelihood-based approach to distance-based phylogeny reconstruction. Mol Biol Evol 2000, 17:189–197. 10.1093/oxfordjournals.molbev.a026231
Cole JR, Chai B, Farris RJ, Wang Q, Kulam-Syed-Mohideen AS, McGarrell DM, Bandela AM, Cardenas E, Garrity GM, Tiedje JM: The ribosomal database project (RDP-II): introducing myRDP space and quality controlled public data. Nucleic Acids Res 2007, 35:D169-D172. 10.1093/nar/gkl889
Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Thomson N, Allen MJ, Angiuoli SV, Ashburner M, Axelrod N, Baldauf S, Ballard S, Boore J, Cochrane G, Cole J, Dawyndt P, De Vos P, DePamphilis C, Edwards R, Faruque N, Feldman R, Gilbert J, Gilna P, Glockner FO, Goldstein P, Guralnick R, Haft D, Hancock D: The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol 2008, 26:541–547. 10.1038/nbt1360
Woese CR, Kandler O, Wheelis ML: Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci U S A 1990, 87:4576–4579. 10.1073/pnas.87.12.4576
Garrity GM, Holt JG: The Road Map to the Manual. In Bergey's Manual of Systematic Bacteriology. Volume 1. 2nd edition. Edited by: Garrity GM, Boone DR, Castenholz RW. New York: Springer; 2001:119–169.
Stackebrandt E, Rainey FA, Ward-Rainey NL: Proposal for a New hierarchic classification system, actinobacteria classis nov. Int J Syst Bacteriol 1997, 47:479–491. 10.1099/00207713-47-2-479
Euzéby JP, Tindall BJ: Nomenclatural type of orders: corrections necessary according to Rules 15 and 21a of the Bacteriological Code (1990 Revision), and designation of appropriate nomenclatural types of classes and subclasses. Request for an opinion. Int J Syst Evol Microbiol 2001, 51:725–727.
Zhi XY, Li WJ, Stackebrandt E: An update of the structure and 16S rRNA gene sequence-based definition of higher ranks of the class Actinobacteria, with the proposal of two new suborders and four new families and emended descriptions of the existing higher taxa. Int J Syst Evol Microbiol 2009, 59:589–608. 10.1099/ijs.0.65780-0
Buchanan RE: Studies in the nomenclature and classification of the bacteria: II. The primary subdivisions of the schizomycetes. J Bacteriol 1917, 2:155–164.
Skerman VBD, McGowan V, Sneath PHA: Approved lists of bacterial names. Int J Syst Bacteriol 1980, 30:225–420. 10.1099/00207713-30-1-225
Lehmann KB, Neumann RO: Lehmann's Medizin, Handatlanten. X Atlas und Grundriss der Bakteriologie und Lehrbuch der speziellen bakteriologischen Diagnostik. 4th edition. München: J.F. Lehmann; 1907.
Bernard KA, Wiebe D, Burdz T, Reimer A, Ng B, Singh C, Schindle S, Pacheco AL: Assignment of Brevibacterium stationis (ZoBell and Upham 1944) Breed 1953 to the genus Corynebacterium, as Corynebacterium stationis comb. nov., and emended description of the genus Corynebacterium to include isolates that can alkalinize citrate. Int J Syst Evol Microbiol 2010, 60:874–879. 10.1099/ijs.0.012641-0
Lehmann KB, Neumann RO: Atlas und Grundriss der Bakteriologie und Lehrbuch der speziellen bakteriologischen Diagnostik. München: J.F. Lehmanns Verlag; 1896.
Yamada K, Komagata K: Taxonomic studies on coryneform bacteria. V. Classification of coryneform bacteria. J Gen Appl Microbiol 1972, 18:417–431. 10.2323/jgam.18.417
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. The gene ontology consortium. Nat Genet 2000, 25:25–29. 10.1038/75556
Liolios K, Chen IM, Mavromatis K, Tavernarakis N, Hugenholtz P, Markowitz VM, Kyrpides NC: The Genomes On Line Database (GOLD) in 2009: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res 2009, 38:D346-D354.
Tauch A, Kassing F, Kalinowski J, Pühler A: The erythromycin resistance gene of the Corynebacterium xerosis R-plasmid pTP10 also carrying chloramphenicol, kanamycin, and tetracycline resistances is capable of transposition in Corynebacterium glutamicum. Plasmid 1995, 33:168–179. 10.1006/plas.1995.1018
Ewing B, Green P: Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 1998, 8:186–194.
Ewing B, Hillier L, Wendl MC, Green P: Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 1998, 8:175–185. 10.1101/gr.8.3.175
Gordon D: Viewing and editing assembled sequences using Consed. Curr Protoc Bioinform 2003, 11(2):1–43.
Gordon D, Abajian C, Green P: Consed: a graphical tool for sequence finishing. Genome Res 1998, 8:195–202. 10.1101/gr.8.3.195
Prokaryotic Genomes Automatic Annotation Pipeline (PGAAP). [http://www.ncbi.nlm.nih.gov/books/NBK174280/]
Borodovsky M, Mills R, Besemer J, Lomsadze A: Prokaryotic gene prediction using GeneMark and GeneMark.hmm. Curr Protoc Bioinform 2003, 4(5):1–16.
Delcher AL, Harmon D, Kasif S, White O, Salzberg SL: Improved microbial gene identification with GLIMMER. Nucleic Acids Res 1999, 27:4636–4641. 10.1093/nar/27.23.4636
Hyatt D, Chen GL, Locascio PF, Land ML, Larimer FW, Hauser LJ: Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinform 2010, 11:119. 10.1186/1471-2105-11-119
Klimke W, Agarwala R, Badretdin A, Chetvernin S, Ciufo S, Fedorov B, Kiryutin B, O'Neill K, Resch W, Resenchuk S, Schafer S, Tolstoy I, Tatusova T: The national center for biotechnology Information's protein clusters database. Nucleic Acids Res 2009, 37:D216-D223. 10.1093/nar/gkn734
Marchler-Bauer A, Anderson JB, Chitsaz F, Derbyshire MK, DeWeese-Scott C, Fong JH, Geer LY, Geer RC, Gonzales NR, Gwadz M, He S, Hurwitz DI, Jackson JD, Ke Z, Lanczycki CJ, Liebert CA, Liu C, Lu F, Lu S, Marchler GH, Mullokandov M, Song JS, Tasneem A, Thanki N, Yamashita RA, Zhang D, Zhang N, Bryant SH: CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res 2009, 37:D205-D210. 10.1093/nar/gkn845
Lowe TM, Eddy SR: tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 1997, 25:955–964. 10.1093/nar/25.5.0955
Eddy SR: A memory-efficient dynamic programming algorithm for optimal alignment of a sequence to an RNA secondary structure. BMC Bioinform 2002, 3:18. 10.1186/1471-2105-3-18
Lagesen K, Hallin P, Rodland EA, Staerfeldt HH, Rognes T, Ussery DW: RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 2007, 35:3100–3108. 10.1093/nar/gkm160
Griffiths-Jones S, Moxon S, Marshall M, Khanna A, Eddy SR, Bateman A: Rfam: annotating non-coding RNAs in complete genomes. Nucleic Acids Res 2005, 33:D121-D124.
Krogh A, Larsson B, von Heijne G, Sonnhammer EL: Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 2001, 305:567–580. 10.1006/jmbi.2000.4315
Bendtsen JD, Nielsen H, von Heijne G, Brunak S: Improved prediction of signal peptides: SignalP 3.0. J Mol Biol 2004, 340:783–795. 10.1016/j.jmb.2004.05.028
Unthan S, Baumgart M, Radek A, Herbst M, Siebert D, Brühl N, Bartsch A, Bott M, Wiechert W, Marin K, Hans S, Kramer R, Seibold G, Frunzke J, Kalinowski J, Rückert C, Wendisch VF, Noack S: Chassis organism from Corynebacterium glutamicum - a top-down approach to identify and delete irrelevant gene clusters. Biotechnol J in press. http://dx.doi.org/10.1002/biot.201400041
Blom J, Albaum SP, Doppmeier D, Puhler A, Vorholter FJ, Zakrzewski M, Goesmann A: EDGAR: a software framework for the comparative analysis of prokaryotic genomes. BMC Bioinform 2009, 10:154. 10.1186/1471-2105-10-154
Baumgart M, Unthan S, Rückert C, Sivalingam J, Grünberger A, Kalinowski J, Bott M, Noack S, Frunzke J: Construction of a prophage-free variant of Corynebacterium glutamicum ATCC 13032 - a platform strain for basic research and industrial biotechnology. Appl Environ Microbiol 2013, 79:6006–6015. 10.1128/AEM.01634-13
Jäger W, Schäfer A, Pühler A, Labes G, Wohlleben W: Expression of the Bacillus subtilis sacB gene leads to sucrose sensitivity in the gram-positive bacterium Corynebacterium glutamicum but not in Streptomyces lividans. J Bacteriol 1992, 174:5462–5465.
Venkova-Canova T, Pátek M, Nešvera J: Characterization of the cryptic plasmid pCC1 from Corynebacterium callunae and its use for vector construction. Plasmid 2004, 51:54–60. 10.1016/j.plasmid.2003.09.002
Christian Rückert acknowledges funding through a grant by the Federal Ministry for Education and Research (0316017A) within the BioIndustry2021 initiative. We acknowledge support for the Article Processing Charge by the Deutsche Forschungsgemeinschaft and the Open Access Publication Funds of Bielefeld University Library.
The authors declare that they have no competing interests.