Draft genome sequence of Fermentimonas caenicola strain SIT8, isolated from the human gut
Standards in Genomic Sciences volume 13, Article number: 8 (2018)
We report the properties of a draft genome sequence of the bacterium Fermentimonas caenicola strain SIT8 (= CSUR P1560). This strain was isolated from the fecal flora of a healthy 28-month-old Senegalese boy. Strain SIT8 is a facultatively anaerobic Gram-negative bacillus. Here we describe the features of this organism, together with the complete genome sequence and annotation. The 2,824,451-bp long genome (1 chromosome but no plasmid) contains 2354 protein-coding and 46 RNA genes, including four rRNA genes.
Fermentimonas caenicola strain SIT8 (= CSUR P1560) was isolated from the stool of a healthy 28-month-old Senegalese boy as part of a study aiming at cultivating all species within the human gastro-intestinal microbiota. It is a Gram-negative, facultatively anaerobic, indole-negative bacillus. Initially, we had named this bacterium “Lascolabacillus massiliensis” as it exhibited unique features among members of the family Porphyromonadaceae. However, concomitantly with our work, Hahnke et al. formally described the genus Fermentimonas in 2016. To date, this genus contains only one species, F. caenicola, the type strain of which, ING2-E5BT, exhibits a 100% 16S rRNA sequence identity with strain SIT8. As a consequence, strain SIT8 belongs to the species F. caenicola. Strain ING2-E5BT was isolated from a mesophilic laboratory-scale biogas reactor. To the best of our knowledge, we report here the first isolation of F. caenicola from the fecal flora of a human being.
Herein, we present a set of features for F. caenicola strain SIT8 together with the description of the complete genomic sequence and annotation.
Classification and features
Fermentimonas caenicola strain SIT8 was isolated from the stool of a healthy 28-month-old Senegalese boy (Table 1). The patient’s parents gave informed signed consent, and the agreements of the National Ethics Committee of Senegal and the ethics committee of the IFR48 (Marseille, France, agreement numbers 11–017 and 09–022) were obtained. Strain SIT8 was initially grown after 10 days of culture in a medium enriched with 5% sheep blood and sterile-filtered sheep rumen, in an aerobic atmosphere at 37 °C. The bacterium was sub-cultured on 5% sheep blood-enriched Columbia agar (bioMérieux, Marcy l’Etoile, France) and grew in 24 h at 37 °C in both aerobic and anaerobic conditions.
Using our systematic matrix-assisted laser desorption-ionization time-of-flight screening on a MicroFlex spectrometer (Bruker Daltonics, Bremen, Germany), strain SIT8 exhibited no significant score, suggesting that it was not a member of any known species (Fig. 1). We added the spectrum from strain SIT8 to our database (Fig. 1). Strain SIT8 exhibited a 100% 16S rRNA sequence identity with Fermentimonas caenicola strain ING2-E5BT (GenBank accession KP233810), the phylogenetically closest species with a validly published name in nomenclature (Fig. 2). The 16S rRNA sequence of strain SIT8 has been deposited in GenBank under number LN827535.
Growth at different temperatures (29, 37 and 55 °C) was tested. Growth of the strain was tested in 5% sheep blood-enriched Columbia agar (bioMérieux) and Tryptic Soy agar (Becton–Dickinson, Le Pont-de-Claix, France) under anaerobic and microaerophilic conditions using the GENbag anaer and GENbag microaer systems, respectively (bioMérieux), and under aerobic conditions, with or without 5% CO2. Growth was tested for salt tolerance, with 0–5, 50 and 100% (w/v) NaCl. The pH range for growth was tested at pH 6.5 and 8.5 using Tryptic Soy agar. Phenotypic tests were performed using API ZYM, API 20NE and API 50CH strips (bioMérieux). In vitro susceptibility to antibiotics was determined using the disk-diffusion method on 5% sheep blood-enriched Mueller–Hinton agar (bioMérieux).
Electron microscopy was performed with detection Formvar coated grids which were deposited on a 40 μL bacterial suspension drop and incubated at 37 °C for 30 min, followed by a 10 s incubation on ammonium molybdate 1%. Grids were then observed using a Morgagni 268D transmission electron microscope (Philips) at an operating voltage of 60 kV.
Different growth temperatures (29 °C, 37 °C, 55 °C), pH and salinity were determined. Growth was obtained at 29 and 37 °C, with optimal growth at 37 °C, at pH 6.5–8.5 and at NaCl concentration of 0 to 5 g/L. Strain growth was observed in both aerobic and anaerobic conditions and with or without 5% CO2. Colonies were pale grey and 1.5 mm in diameter on 5% sheep blood-enriched Columbia agar (bioMérieux). A motility test was negative. Cells were Gram-negative, rod-shaped, polymorphic (Fig. 3), unable to form spores and exhibited a mean diameter of 0.35 μm (range 0.3–0.4 μm) and a mean length of 3.8 μm (range 1–8.8 μm) (Fig. 4).
Strain SIT8 exhibited neither catalase nor oxidase activities. Using an API ZYM strip (bioMérieux), positive reactions were observed for alkaline phosphatase, acid phosphatase, and N-acetyl-β-glucosaminidase. Negative reactions were noted for esterase, esterase-lipase, lipase, leucine arylamidase, β-glucosidase, β-galactosidase, α-mannosidase, α-fucosidase, cystine arylamidase, valine arylamidase, trypsin, α-chymotrypsin, α-glucosidase, α-galactosidase, β-glucuronidase, and Naphthol-AS-BI-phosphohydrolase.
Using an API 50 CH strip (bioMérieux), positive reactions were observed after 48 h of incubation for the fermentation of D-arabinose, D-galactose, D-glucose, D-mannose, N-acetylglucosamine, amygdalin, arbutin, salicin, D-cellobiose, D-maltose, D-lactose, D-trehalose, D-melezitose, starch, glycogen, gentiobiose, D-turanose, and potassium 5-ketogluconate. Negative reactions were observed for the fermentation of glycerol, erythritol, L-arabinose, D-ribose, D-xylose, L-xylose, D-adonitol, methyl-β-D-xylopyranoside, D-fructose, L-sorbose, L-rhamnose, dulcitol, inositol, D-mannitol, D-sorbitol, methyl-αD-xylopyranoside, methyl-αD-glucopyranoside, D-melibiose, D-saccharose, inulin, D-raffinose, xylitol, D-lyxose, D-tagatose, D-fucose, L-fucose, D-arabitol, L-arabitol, potassium gluconate, and potassium 2-ketogluconate.
Using an API 20NE strip (bioMérieux), a positive reaction was obtained only for esculin hydrolysis while negative reactions were observed for nitrate reduction, urease, indole production, arginine dihydrolase, glucose fermentation, arabinose, mannose, mannitol, N-acetyl-glucosamine, maltose, gluconate, caprate, adipate, malate, citrate, phenyl-acetate assimilation, and gelatin hydrolysis.
Strain SIT8 was susceptible to penicillin, amoxicillin, amoxicillin/clavulanic acid, ticarcillin, ceftriaxone, cefalotin, imipenem, gentamicin, trimethoprim/sulfamethoxazole, erythromycin, doxycycline, metronidazole, vancomycin, rifampicin, ciprofloxacin, nitrofurantoin, and colistin, but resistant to kanamycin.
Cellular fatty acid methyl ester analysis was performed by GC/MS. Two samples were prepared with approximately 30 mg of bacterial biomass per tube harvested from several culture plates. Fatty acid methyl esters were prepared as described by Sasser. GC/MS analyses were carried out as described before. Briefly, fatty acid methyl esters were separated using an Elite 5-MS column and monitored by mass spectrometry (Clarus 500 - SQ 8 S, Perkin Elmer, Courtaboeuf, France). Spectral database search was performed using MS Search 2.0 operated with the Standard Reference Database 1A (NIST, Gaithersburg, USA) and the FAMEs mass spectral database (Wiley, Chichester, UK).
Hexadecanoic acid is the most abundant fatty acid (45%). 9-Octadecenoic acid and 9,12-Octadecadienoic acid are also abundant unsaturated fatty acids (23 and 20% respectively) (Additional file 1: Table S1).
Genome sequencing information
Genome project history
The strain was selected for sequencing on the basis of its 16S rRNA similarity, phylogenetic position, and phenotypic differences with the other members of the family Porphyromonadaceae, and is part of a culturomics study of the human microbiome. It is the second published genome from the F. caenicola species. Table 2 shows the project information and its association with MIGS version 2.0 compliance. The genome GenBank accession number is CTEJ01000000. The genome consists of 2 scaffolds.
Growth conditions and DNA preparation
Strain SIT8 (CSUR P1560) was sub-cultured on 5% sheep blood-enriched Columbia agar (bioMérieux) and grew in 24 h at 37 °C in an anaerobic atmosphere. Eight Petri dishes were harvested and resuspended in 4 × 100 μL of G2 buffer (EZ1 DNA Tissue kit, Qiagen). A first mechanical lysis was performed with glass powder on the Fastprep-24 device (MP Biomedicals, Santa Ana, California, USA) using 2 × 20-second cycles. DNA was then treated with 2.5 μg/μL lysozyme (30 min at 37 °C) and extracted using the BioRobot EZ1 Advanced XL (Qiagen). DNA was then concentrated and purified with the QIAamp kit (Qiagen). DNA concentration was 70.7 ng/μL as determined by the Genios Tecan fluorometer, using the Quant-it Picogreen kit (Invitrogen).
Genome sequencing and assembly
The genomic DNA of F. caenicola strain SIT8 was sequenced on a MiSeq sequencer (Illumina Inc., San Diego, CA, USA) with the Mate-Pair strategy. The gDNA was barcoded in order to be mixed with 9 other projects with the Nextera Mate-Pair sample prep kit (Illumina).
The gDNA was quantified by a Qubit assay with the high sensitivity kit (Life Technologies, Carlsbad, CA, USA) to 82.6 ng/μL. The Mate-Pair library was prepared with 1.5 μg of gDNA using the Nextera mate pair Illumina guide. The gDNA was simultaneously fragmented and tagged with a Mate-Pair junction adapter. The fragmentation pattern was validated on an Agilent 2100 BioAnalyzer (Agilent Technologies, Santa Clara, CA, USA) with a DNA 7500 labchip. DNA fragments ranged in size from 1.5 kb up to 11 kb with an optimal size at 4.33 kb. No size selection was performed and 662 ng of tagmented fragments were circularized. The circularized DNA was mechanically sheared to small fragments with an optimum at 1200 bp on the Covaris device S2 in T6 tubes (Covaris, Woburn, MA, USA). The library profile was visualized on a High Sensitivity Bioanalyzer LabChip (Agilent Technologies) and the final library concentration was measured at 61.4 nmol/L.
The libraries were normalized at 2 nM and pooled. After a denaturation step and dilution at 15 pM, the pool of libraries was loaded. Automated cluster generation and sequencing were performed in a single 39-h run in a 2 × 251-bp format.
Total information of 7.84 Gb was obtained from an 884 K/mm2 cluster density with a cluster passing quality control filters of 92.7% (15,478,025 passing filter paired reads). Within this run, the index representation for F. caenicola strain SIT8 was determined to be 13.25%. The 2,050,529 paired reads were trimmed and then assembled in 2 scaffolds.
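The index representation quoted above can be checked from the two read counts given in the same paragraph. The short Python sketch below is only an illustration of that arithmetic; the function and variable names are ours and are not part of the sequencing pipeline.

```python
# Read counts quoted in the text above.
passing_filter_pairs = 15_478_025  # paired reads passing filter in the whole run
strain_pairs = 2_050_529           # paired reads assigned to strain SIT8


def index_representation(sample_pairs: int, run_pairs: int) -> float:
    """Return the percentage of passing-filter pairs assigned to one sample."""
    return 100.0 * sample_pairs / run_pairs


share = index_representation(strain_pairs, passing_filter_pairs)
print(f"Index representation of strain SIT8: {share:.2f}%")
```

Rounded to two decimals, the computed share agrees with the 13.25% reported for strain SIT8.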
Open Reading Frames were predicted using Prodigal with default parameters, but the predicted ORFs were excluded if they spanned a sequencing gap region. The predicted bacterial protein sequences were searched against the GenBank database and the Clusters of Orthologous Groups databases using BLASTP. The tRNAscan-SE tool was used to find tRNA genes, whereas ribosomal RNAs were found by using RNAmmer and BLASTn against the GenBank database. Signal peptides and numbers of transmembrane helices were predicted using SignalP and TMHMM, respectively. ORFans were identified if their BLASTP E-value was lower than 1e-03 for alignment lengths greater than 80 amino acids. If alignment lengths were smaller than 80 amino acids, we used an E-value of 1e-05. Such parameter thresholds have already been used in previous works to define ORFans. Artemis was used for data management, and DNA Plotter was used for visualization of genomic features. The Mauve alignment tool was used for multiple genomic sequence alignment. To identify putative orthologues and estimate the pan/core-genome composition, comparative genomic analysis was carried out between the two F. caenicola strains SIT8 and ING2-E5BT using bidirectional Best Blast from the BLASTClust algorithm, and then specific genes were checked by tBLASTN. We estimated the mean level of nucleotide sequence similarity at the genome level using the digital DNA-DNA hybridization and the genome-to-genome distance calculator Web server as previously reported.
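The length-dependent E-value thresholds described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the pipeline actually used: we assume that a BLASTP hit counts as significant when its E-value is below the threshold for its alignment length, that an ORF with no significant hit is treated as an ORFan, and that an alignment of exactly 80 amino acids (a case the text does not specify) falls under the stricter 1e-05 threshold. All function names are hypothetical.

```python
from typing import Iterable, Tuple

# A hit is represented as (E-value, alignment length in amino acids).
Hit = Tuple[float, int]


def evalue_cutoff(alignment_length: int) -> float:
    """Threshold from the text: 1e-03 above 80 aa, 1e-05 otherwise.

    Treating exactly 80 aa as the short case is an assumption.
    """
    return 1e-03 if alignment_length > 80 else 1e-05


def is_significant(evalue: float, alignment_length: int) -> bool:
    """A hit is significant when its E-value is below the cutoff."""
    return evalue < evalue_cutoff(alignment_length)


def is_orfan(hits: Iterable[Hit]) -> bool:
    """Assumed reading: an ORF is an ORFan if no hit is significant."""
    return not any(is_significant(e, length) for e, length in hits)


if __name__ == "__main__":
    print(is_orfan([]))              # no hits at all
    print(is_orfan([(1e-04, 120)]))  # long alignment, below 1e-03
    print(is_orfan([(1e-04, 60)]))   # short alignment, not below 1e-05
```

In practice the hits would be parsed from tabular BLASTP output; the tuples above stand in for that step.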
The genome of strain SIT8 is 2,824,451-bp long with a 37% G + C content (Table 3; Fig. 5). Of the 2400 predicted genes, 2354 are protein-coding genes, and 46 encode RNAs. Four rRNA genes (one 16S rRNA, one 23S rRNA and two 5S rRNA) and 42 predicted tRNA genes were identified in the genome. A total of 1668 genes (69.5%) were assigned a putative function. Twenty-eight genes were identified as ORFans (1.7%). The remaining genes were annotated as hypothetical proteins (732 genes, 30.5%). The properties and the statistics of the genome are summarized in Table 3.
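As a quick consistency check, the gene counts and percentages quoted above can be recomputed from one another. The Python sketch below only re-derives figures stated in the text; the variable names are ours.

```python
# Gene counts quoted in the text above.
protein_coding = 2354
rrna_genes = 4       # one 16S, one 23S and two 5S rRNA genes
trna_genes = 42

rna_genes = rrna_genes + trna_genes
total_genes = protein_coding + rna_genes

with_function = 1668  # genes assigned a putative function
hypothetical = 732    # genes annotated as hypothetical proteins


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, to one decimal."""
    return round(100.0 * part / whole, 1)


print(f"RNA genes: {rna_genes}, total genes: {total_genes}")
print(f"Putative function: {percent(with_function, total_genes)}%")
print(f"Hypothetical proteins: {percent(hypothetical, total_genes)}%")
```

The functional and hypothetical percentages are expressed relative to the 2400 predicted genes, which is the denominator that reproduces the reported 69.5% and 30.5%.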
The distribution of genes into COGs functional categories is presented in Table 4.
Insights from the genome sequence
To date, one genome from the Fermentimonas genus has been published. Here, we compared the genome sequence of F. caenicola strains SIT8 (GenBank accession number CTEJ01000000) and ING2-E5BT (GenBank accession number NZ_LN515532).
The draft genome of strain SIT8 (2.87 Mb) has a larger size than that of strain ING2-E5BT (2.85 Mb). The G + C contents of strains SIT8 and ING2-E5BT are comparable (37% vs 37.3%, respectively). The gene content of strain SIT8 is lower than that of strain ING2-E5BT (2400 vs 2455, respectively). The ratio of genes per Mb of strain SIT8 is lower than that of strain ING2-E5BT (836 vs 861, respectively).
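The genes-per-Mb ratios follow directly from the gene counts and genome sizes given in this paragraph. A minimal Python sketch of the calculation, using only those figures (names are ours):

```python
def genes_per_mb(gene_count: int, genome_size_mb: float) -> int:
    """Gene density, rounded to the nearest whole gene per Mb."""
    return round(gene_count / genome_size_mb)


# Gene counts and genome sizes (in Mb) as quoted in the paragraph above.
sit8_density = genes_per_mb(2400, 2.87)
ing2_density = genes_per_mb(2455, 2.85)

print(f"SIT8: {sit8_density} genes/Mb, ING2-E5BT: {ing2_density} genes/Mb")
```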
The distribution of genes into COGs functional categories is comparable between strains SIT8 and ING2-E5BT (Table 4). The genomic comparison identified a pangenome of 2681 genes and a core genome of 2096 genes. Strains SIT8 and ING2-E5BT harboured 273 and 312 specific genes, respectively. Functional annotation of the unique genes from strain SIT8 revealed that 48.35% are assigned to COGs functional categories against 52.56% for strain ING2-E5BT (Additional file 1: Table S2). The COG functional classification of the specific genes from strain SIT8 showed that 10.62% play a role in cell wall/membrane biogenesis and 6.59% in inorganic ion transport and metabolism (Additional file 1: Table S2). In contrast, 16.99% of specific genes from strain ING2-E5BT are involved in replication, recombination and repair and 6.73% in carbohydrate transport and metabolism (Additional file 1: Table S2).
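For two genomes, the pangenome is the union of the two gene sets and the core genome is their intersection, so the pangenome size equals the core size plus the two strain-specific counts. The Python sketch below illustrates this with a toy example and then checks the reported figures. The orthologous-group identifiers (og1, og2, ...) are hypothetical placeholders, and the sketch assumes genes have already been clustered into orthologous groups; it does not reproduce the BLAST-based clustering itself.

```python
from typing import Set, Tuple


def pan_core(genes_a: Set[str], genes_b: Set[str]) -> Tuple[Set[str], Set[str], Set[str], Set[str]]:
    """Return (pangenome, core genome, specific to A, specific to B)."""
    pan = genes_a | genes_b
    core = genes_a & genes_b
    return pan, core, genes_a - genes_b, genes_b - genes_a


# Toy example with hypothetical orthologous-group identifiers.
strain_a = {"og1", "og2", "og3"}
strain_b = {"og2", "og3", "og4", "og5"}
pan, core, only_a, only_b = pan_core(strain_a, strain_b)
print(len(pan), len(core), len(only_a), len(only_b))

# Consistency check on the counts reported in the text.
core_size, sit8_specific, ing2_specific = 2096, 273, 312
pan_size = core_size + sit8_specific + ing2_specific
print(f"Pangenome size from reported counts: {pan_size}")
```

The sum of the reported core and strain-specific counts matches the reported pangenome of 2681 genes.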
Strains SIT8 and ING2-E5BT share a mean 95.5% dDDH value.
Abbreviations
- COGs: Clusters of Orthologous Groups
- CSUR: Collection de Souches de l’Unité des Rickettsies
- FAME: Fatty Acid Methyl Ester
- GC/MS: Gas Chromatography/Mass Spectrometry
- GGDC: Genome-to-Genome Distance Calculator
- MALDI-TOF MS: Matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry
- ORF: Open Reading Frame
- URMITE: Unité de Recherche sur les Maladies Infectieuses et Tropicales Emergentes
Krieg NR. Family IV. Porphyromonadaceae fam. nov. In: Krieg NR, Staley JT, Brown DR, Hedlund BP, Paster BJ, Ward NL, Ludwig W, Whitman WB (eds), Bergey’s manual of systematic bacteriology, Second Edition, Volume 4, Springer, New York, 2011, p. 61.
Hahnke S, Langer T, Koeck DE, Klocke M. Description of Proteiniphilum saccharofermentans sp. nov., Petrimonas mucosa sp. nov. and Fermentimonas caenicola gen. nov., sp. nov., isolated from mesophilic laboratory-scale biogas reactors, and emended description of the genus Proteiniphilum. Int J Syst Evol Microbiol. 2016;66:1466–75.
Beye M, Bakour S, Traore SI, Raoult D, Fournier P-E. “Lascolabacillus massiliensis”: a new species isolated from the human gut. New Microbes New Infect. 2016;11:91–2.
Seng P, Abat C, Rolain JM, Colson P, Lagier J-C, Gouriet F, et al. Identification of rare pathogenic bacteria in a clinical microbiology laboratory: impact of matrix-assisted laser desorption ionization-time of flight mass spectrometry. J Clin Microbiol. 2013;51:2182–94.
Sasser M. Bacterial identification by gas chromatographic analysis of fatty acids methyl esters (GC-FAME). Newark: Microbial ID Inc; 2006.
Dione N, Sankar SA, Lagier JC, Khelaifia S, Michele C, Armstrong N, et al. Genome sequence and description of Anaerosalibacter massiliensis sp. nov. New Microbes New Infect. 2016;10:66–76.
Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol. 2008;26:541–7.
Hyatt D, Chen G-L, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics. 2010;11:119.
Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank. Nucleic Acids Res. 2005;33:D34–8.
Tatusov RL, Galperin MY, Natale DA, Koonin EV. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 2000;28:33–6.
Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25:955–64.
Lagesen K, Hallin P, Rødland EA, Stærfeldt H-H, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007;35:3100–8.
Nielsen H, Engelbrecht J, Brunak S, von Heijne G. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 1997;10:1–6.
Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol. 2001;305:567–80.
Carver T, Harris SR, Berriman M, Parkhill J, McQuillan JA. Artemis: an integrated platform for visualization and analysis of high-throughput sequence-based experimental data. Bioinformatics. 2012;28:464–9.
Carver T, Thomson N, Bleasby A, Berriman M, Parkhill J. DNAPlotter: Circular and linear interactive genome visualization. Bioinformatics. 2009;25:119–20.
Darling ACE, Mau B, Blattner FR, Perna NT. Mauve: Multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 2004;14:1394–403.
Alva V, Nam S-Z, Söding J, Lupas AN. The MPI bioinformatics toolkit as an integrative platform for advanced protein sequence and structure analysis. Nucleic Acids Res. 2016;44:W410–5.
Auch AF, Klenk H-P, Göker M. Standard operating procedure for calculating genome-to-genome distances based on high-scoring segment pairs. Stand Genomic Sci. 2010;2:142–8.
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013;30:2725–9.
Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci U S A. 1990;87:4576–9.
Euzéby J. Validation list no. 143. List of new names and new combinations previously effectively, but not validly, published. Int J Syst Evol Microbiol. 2012;62:1–4.
Krieg NR, Ludwig W, Euzéby J, Whitman WB. Phylum XIV. Bacteroidetes phyl. nov. In: Krieg NR, Staley JT, Brown DR, Hedlund BP, Paster BJ, Ward NL, Ludwig W, Whitman WB, editors. Bergey’s manual of systematic bacteriology, second edition, vol. 4. New York: Springer; 2011. p. 25.
Krieg NR. Class I. Bacteroidia class. nov. In: Krieg NR, Staley JT, Brown DR, Hedlund BP, Paster BJ, Ward NL, Ludwig W, Whitman WB (eds), Bergey’s manual of systematic bacteriology, Second Edition, Volume 4, Springer, New York, 2011, p. 25.
Krieg NR. Order I. Bacteroidales ord. nov. In: Krieg NR, Staley JT, Brown DR, Hedlund BP, Paster BJ, Ward NL, Ludwig W, Whitman WB (eds), Bergey’s manual of systematic bacteriology, Second Edition, Volume 4, Springer, New York, 2011, p. 25.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. Nat Genet. 2000;25:25–9.
This work was funded by the Mediterranee-Infection foundation.
The authors declare that they have no competing interests.