

Complete genome sequence of Enterobacter sp. IIT-BT 08: A potential microbial strain for high rate hydrogen production


Enterobacter sp. IIT-BT 08 belongs to Phylum: Proteobacteria, Class: Gammaproteobacteria, Order: Enterobacteriales, Family: Enterobacteriaceae. The organism was isolated from the leaves of a local plant near the Kharagpur railway station, Kharagpur, West Bengal, India. It has been extensively studied for fermentative hydrogen production because of its high hydrogen yield. To support further enhancement of hydrogen production through strain development, complete genome sequence analysis was carried out. Sequence analysis revealed that the genome was linear, 4.67 Mbp long and had a GC content of 56.01%. The genome encodes 4,393 protein-coding and 179 RNA genes. Additionally, a putative pathway of hydrogen production was suggested based on the presence of the formate hydrogen lyase complex and other related genes identified in the genome. Thus, in the present study we describe the specific properties of the organism and the generation, annotation and analysis of its genome sequence, and discuss the putative pathway of hydrogen production by this organism.


Hydrogen has great promise in contributing substantially to the renewable energy demands of the future. It is considered a dream fuel because it is renewable, does not evolve greenhouse gases, has the highest energy content per unit mass of any known fuel (143 GJ t−1), is easily converted to electricity by fuel cells and, upon combustion, gives water as the only byproduct [1]. Moreover, hydrogen is the third most abundant element on Earth. However, finding simple, inexpensive ways to extract hydrogen and produce it in a pure gaseous form is a crucial step toward making the “hydrogen economy” a reality. Considering this, hydrogen production using microbes is thought to be a promising technique to produce economical, abundant hydrogen without utilizing fossil fuels. Many microbial species have been reported to produce hydrogen [2]. Among them, Enterobacter sp. IIT-BT 08 (MTCC 5373, DSM 24603) was reported as a high-rate hydrogen producer [3]. It is a Gram-negative, facultative anaerobe that can grow and produce hydrogen from a wide range of simple sugars and complex polysaccharides [4]. In the past decade, the group at the Bioprocess Engineering Laboratory at IIT Kharagpur, India, has worked extensively on this organism using various fermentative approaches and established it as one of the highest-yielding hydrogen producers [5]. The novelty of the organism lies in the amount of hydrogen (2.2 mol H2 mol−1 glucose) it can produce at ambient temperature (37 °C) and atmospheric pressure compared with other closely related species reported in the literature. In addition, a high rate of continuous hydrogen production has been reported using immobilized Enterobacter sp. IIT-BT 08 with waste as substrate in 20 L and 800 L reactors [5]. Therefore, whole genome sequencing of this promising strain was undertaken to determine the genes responsible for the high-rate hydrogen production.
In this report we present a summary of the properties and features of the Enterobacter sp. IIT-BT 08 genome and also suggest a putative pathway for hydrogen production.

Classification and features

Enterobacter sp. IIT-BT 08 was isolated from the leaves of a local plant near the Kharagpur railway station, Kharagpur, West Bengal, India [4]. The bacterium is a Gram-negative, small, motile, catalase-positive rod [4,6,7] belonging to the family Enterobacteriaceae (Table 1). To characterize the strain, a set of standard tests was carried out according to Bergey’s Manual, and the results showed that the strain belongs to the genus Enterobacter. 16S rRNA gene sequencing by the Microbial Type Culture Collection (MTCC), Chandigarh, further confirmed the strain identity. The genetic complexity of the organism is illustrated in the phylogenetic tree of the 16S rRNA gene region (Figure 1). Initially the strain was classified as Enterobacter cloacae IIT-BT 08; however, whole genome sequencing revealed sequence variation among the six 16S rRNA gene copies of the strain. We presume that this variation may have been the source of the initial mis-identification of the strain. Currently, without a complete set of type strain genome sequences available for a more detailed taxonomic identification, the name of the strain has been changed to Enterobacter sp. IIT-BT 08.

Figure 1.

Phylogenetic tree highlighting the position of “Enterobacter sp. IIT-BT 08 (•)” relative to other type and non-type strains within the Enterobacteriaceae. Strains shown are those within the Enterobacteriaceae having corresponding NCBI genome project IDs. The tree was constructed with Mega4 software using the neighbor-joining algorithm on Jukes-Cantor distances, with 1,000 bootstrap replicates. Acetobacterium woodii strain DSM 1030 (♦) and Desulfocaldus sp. (■) were used as the outgroup. The scale bar represents 0.1 substitutions per nucleotide position. Numbers at the nodes are the bootstrap values.
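
The Jukes-Cantor distance named in the caption corrects the observed proportion of differing sites, p, for multiple substitutions at the same site: d = −(3/4) ln(1 − 4p/3). The following is a minimal Python sketch of that formula only, not the Mega4 implementation; the function names and the choice to skip gapped columns are illustrative assumptions.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') in either sequence are skipped
    (an assumption made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The logarithm is undefined once p reaches the saturation value 3/4.
        raise ValueError("p-distance >= 0.75: Jukes-Cantor distance is undefined")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)
```

A matrix of such pairwise distances between aligned 16S rRNA gene sequences is the kind of input a neighbor-joining algorithm works from.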

Table 1. Classification and general features of Enterobacter sp. IIT-BT 08 according to the MIGS recommendation [8]

Genome sequencing information

Genome project history

Enterobacter sp. IIT-BT 08 is a promising hydrogen producer and can utilize waste as substrate for hydrogen production [4]. It was therefore considered essential to sequence the whole genome of the organism to determine the genes that contribute to hydrogen production. Complete genome information was also critical to facilitate genetic engineering of the organism for further enhancement of its hydrogen production potential. For these reasons, the group applied for the Community Sequencing Program-2010 (CSP-2010) offered by DoE-JGI.

One of the DOE missions is to address the critical question of depleting energy reserves by creating a new generation of biological research enabled by the genome revolution. This organism therefore appeared relevant to this mission and was selected for sequencing. The genome sequence was completed on May 21, 2012. Quality assurance was done by the DSMZ (Braunschweig, DE); finishing and annotation were completed at the Joint Genome Institute. A summary of the project information and its association with MIGS version 2.0 compliance [8] is shown in Table 2.

Table 2. Genome sequencing project information

Growth conditions and DNA isolation

For genomic DNA isolation, Enterobacter sp. IIT-BT 08 was cultivated overnight in nutrient broth at 37 °C and 200 rpm in a gyratory incubator shaker. DNA isolation was carried out by the Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ). For DNA isolation, the strain was grown in DSMZ medium 381 (Luria-Bertani Medium) at 37 °C. DNA was isolated from 1–1.5 g of cell paste using the Jetflex Genomic DNA Purification Kit (Genomed_600100) following the manufacturer’s recommendations for Gram-positive bacteria (which were more efficient than the conditions recommended for Gram-negative cells). The identity of the DNA was confirmed via 16S rRNA gene sequencing, and its quality was analyzed following the recommendations of the sequencing center (JGI), including pulsed-field gel electrophoresis.

Genome sequencing and assembly

The draft genome of Enterobacter sp. IIT-BT 08 was generated at the DOE Joint Genome Institute (JGI) using Illumina data [22]. For this genome, JGI constructed and sequenced an Illumina short-insert paired-end library with an average insert size of 231 ± 59 bp, which generated 24,130,984 reads, and an Illumina long-insert paired-end library with an average insert size of 8,267 ± 2,204 bp, which generated 13,553,468 reads, totaling 5,653 Mbp of Illumina data (unpublished, Feng Chen). All general aspects of library construction and sequencing performed at the JGI can be found at the JGI website. The initial draft assembly contained 21 contigs in 3 scaffolds. The initial draft data was assembled with Allpaths, version 39750, and the consensus was computationally shredded into 10 Kbp overlapping fake reads (shreds). The Illumina draft data was also assembled with Velvet, version 1.1.05 [23], and the consensus sequences were computationally shredded into 1.5 Kbp overlapping fake reads (shreds). The Illumina draft data was assembled again with Velvet using the shreds from the first Velvet assembly to guide the next assembly. The consensus from the second Velvet assembly was shredded into 1.5 Kbp overlapping fake reads. The fake reads from the Allpaths assembly and both Velvet assemblies and a subset of the Illumina CLIP paired-end reads were assembled using parallel phrap, version 4.24 (High Performance Software, LLC). Possible mis-assemblies were corrected with manual editing in Consed [24–26]. Gap closure was accomplished using repeat resolution software (Wei Gu, unpublished) and sequencing of bridging PCR fragments with Sanger and/or PacBio (unpublished, Cliff Han) technologies. For improved high quality draft and noncontiguous finished projects, one round of manual/wet lab finishing may have been completed. Primer walks, shatter libraries, and/or subsequent PCR reads may also be included for a finished project.
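
The "shredding" step above cuts an assembled consensus into fixed-length, overlapping pseudo-reads so that a second assembler can take them as input. The sketch below illustrates the idea in Python; it is not the JGI tool. The function name is hypothetical, and the overlap length is a free parameter here because the text gives only the shred lengths (10 Kbp and 1.5 Kbp).

```python
def shred(consensus: str, shred_len: int, overlap: int) -> list[str]:
    """Cut a consensus sequence into overlapping fixed-length fake reads.

    shred_len: length of each shred (e.g. 10_000 or 1_500 as in the text).
    overlap:   bases shared by consecutive shreds (assumed; not given in the text).
    The final shred may be shorter than shred_len.
    """
    if shred_len <= 0 or not 0 <= overlap < shred_len:
        raise ValueError("require shred_len > 0 and 0 <= overlap < shred_len")
    step = shred_len - overlap
    shreds = []
    start = 0
    while start < len(consensus):
        shreds.append(consensus[start:start + shred_len])
        if start + shred_len >= len(consensus):
            break  # the end of the consensus has been covered
        start += step
    return shreds


# Toy example with a 10-base "consensus", 4-base shreds and a 2-base overlap.
toy_shreds = shred("ABCDEFGHIJ", shred_len=4, overlap=2)
```

Because consecutive shreds share bases, the downstream assembler can rejoin them and merge them with reads and shreds from the other assemblies.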
No additional sequencing reactions or shatter libraries were needed; 6 PCR PacBio consensus sequences were completed to close gaps and to raise the quality of the final sequence. The total estimated size of the genome is 4.7 Mb and the final assembly is based on 5,653 Mbp of Illumina draft data, which provides an average 1,203× coverage of the genome.
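
The coverage figure follows directly from the numbers reported in this section: total sequenced bases divided by the estimated genome size. The short Python check below uses only values stated in the text; the variable names are illustrative, and the mean read length is a derived quantity that the source does not state.

```python
# Values reported in the text.
short_insert_reads = 24_130_984   # Illumina short-insert paired-end library
long_insert_reads = 13_553_468    # Illumina long-insert paired-end library
total_bases_mbp = 5_653           # total Illumina data, in Mbp
genome_size_mb = 4.7              # estimated genome size, in Mb

# Average fold coverage = total sequenced bases / genome size.
coverage = total_bases_mbp / genome_size_mb

# Derived, not stated in the source: implied mean read length in bp.
total_reads = short_insert_reads + long_insert_reads
mean_read_length_bp = total_bases_mbp * 1_000_000 / total_reads

print(f"average coverage: {coverage:.0f}x")
print(f"total reads: {total_reads:,}")
print(f"implied mean read length: {mean_read_length_bp:.0f} bp")
```

The quotient rounds to the 1,203× reported above.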

Genome annotation

Genes were identified using Prodigal [27] as part of the Oak Ridge National Laboratory genome annotation pipeline, followed by a round of manual curation using the JGI GenePRIMP pipeline [28]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) non-redundant database and the UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. Additional gene prediction analysis and functional annotation were performed within the Integrated Microbial Genomes Expert Review (IMG-ER) platform [29].

Genome properties

The genome of Enterobacter sp. IIT-BT 08 consists of one linear chromosome of 4,672,040 bp (Figure 2). The average G+C content for the genome is 56.01% (Table 3). There are 78 tRNA genes and 6 rRNA operons, each consisting of a 16S, 23S, and 5S rRNA gene. There are 4,393 predicted protein-coding regions and 43 pseudogenes in the genome. A total of 3,881 protein-coding genes (85.64%) have been assigned a predicted function, while the rest have been designated as hypothetical proteins. The numbers of genes assigned to each COG functional category are listed in Table 4. About 2% of the annotated genes were not assigned to COGs and have an unknown function.
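
G+C content is the percentage of G and C bases among the unambiguous bases of a sequence. The Python sketch below shows how such a value could be computed from a raw sequence string; the function name and the decision to ignore ambiguity codes such as N are assumptions of this sketch, not a description of the IMG pipeline.

```python
def gc_content(sequence: str) -> float:
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguity codes such as N are ignored (assumption)
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Toy example: 4 of 8 bases are G or C.
toy_gc = gc_content("GGCCAATT")
```

Applied to the full 4,672,040 bp chromosome sequence, a calculation of this kind yields the genome-wide figure reported in Table 3.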

Figure 2.

Graphical linear map of the genome. From left to right: genes on the forward strand (colored by COG category), genes on the reverse strand (colored by COG category), RNA genes (tRNAs green, rRNAs red, other RNAs black), GC content, GC skew.

Table 3. Nucleotide content and gene count levels of the genome
Table 4. Number of genes associated with the general COG functional categories

Biohydrogen production pathway

The complete genome sequence of the organism provides a preliminary picture of the genes involved in the hydrogen production pathway. The genome revealed the presence of formate hydrogen lyase (EntIIITBT8_2511) and its maturation protein HycH (EntIIITBT8_2678), the NiFe hydrogenase III small and large subunits (EntIIITBT8_2679, EntIIITBT8_2681), their maturation operons and the FeS cluster-containing hydrogenase components 1 and 2 (EntIIITBT8_0331, EntIIITBT8_2684). A complete list of the genes predicted to be involved in the hydrogen production pathway is given in Table 5. The whole genome information suggests that hydrogen production in Enterobacter sp. IIT-BT 08 is carried out through the formate hydrogen lyase (FHL) complex, which consists of formate dehydrogenase (FDH-H), hydrogenase (Hyd-3) and the electron transfer mediators [30].

Table 5. Preliminary genes involved in the hydrogen production pathway according to the MIGS recommendations [8]

However, the hypothetical pathway must still be verified with wet lab experiments. Based on previously reported literature, formate dehydrogenase and hydrogenase 3 may together form a membrane protein complex that is responsible for hydrogen production in facultative anaerobes [30–32]. Rossmann et al. suggested that in facultative anaerobes hydrogen production is determined by the concentration of formate in the cell, which in turn determines the formation of the FHL complex [32]. A putative model (Figure 3) has been suggested based on the biochemistry of the reactions involved in the pathway [34]. Formate dehydrogenase is suggested to catalyze the oxidation of formate into carbon dioxide. The electrons released in the process are transferred to Hyd-3, encoded by hycABCDEFGH, to generate molecular hydrogen under anaerobic conditions [33]. The model suggests a plausible scheme of electron transfer from FdhF to the catalytic subunit hycE via the hycBCFG subunits. Among these, hycB and hycF have been determined to be [4Fe-4S] ferredoxin-type electron transfer proteins [35]. On the other hand, hycE and hycG share homology with NADH ubiquinone oxidoreductase (NUO) subunits of the mitochondria and chloroplast [35]. In the model, hycC and hycD have been suggested to act as transmembrane proteins.
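
The two steps described above can be written as half-reactions. The block below gives the standard stoichiometry of the formate hydrogen lyase reaction for orientation; it is general biochemistry, not a result derived from this genome.

```latex
\begin{align*}
\text{FDH-H:}\quad & \mathrm{HCOO^- \longrightarrow CO_2 + H^+ + 2\,e^-} \\
\text{Hyd-3:}\quad & \mathrm{2\,H^+ + 2\,e^- \longrightarrow H_2} \\
\text{Net (FHL):}\quad & \mathrm{HCOO^- + H^+ \longrightarrow CO_2 + H_2}
\end{align*}
```

One mole of formate therefore yields at most one mole of hydrogen, with the two electrons from formate oxidation passed to Hyd-3 through the electron transfer subunits.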

Figure 3.

Putative mechanism of hydrogen production by Enterobacter sp. IIT-BT 08 based on the genes identified in the genome. Figure is adapted from [35].

Electron acceptors such as oxygen or nitrate generally inhibit the expression of the FHL complex, whereas its biosynthesis is controlled by the concentration of formate in the cell [32]. Further, it has been suggested that the trace elements selenium and molybdenum are involved at the active site of FDH-H, while nickel is a component of the Hyd-3 active site [30,36]. Accordingly, it has been suggested that the FHL complex can be induced by regulating the presence of formate and metal ions at slightly acidic pH under anaerobic conditions.

Transcription of the FHL complex is under the control of several genes, including fhlA, which codes for the FHL activator protein FHLA, a tetramer that binds to the upstream region of the DNA encoding the FHL complex and promotes the transcription of the FHL complex [34,37]. Moreover, hycA codes for the FHL repressor protein that binds to FHLA or to the FHLA-formate complex. Since fhlA and hycA control the transcription of the FHL complex, it is theoretically possible to control the specific FHL activity and the specific hydrogen production rate by manipulating these genes or their genetic controls [38].


The genome of Enterobacter sp. IIT-BT 08 was sequenced and annotated by the DOE Joint Genome Institute. The genomic properties of the organism were analyzed using various IMG tools, and, based on the genome sequence, a putative pathway of hydrogen production based on formate hydrogen lyase complex was discussed.


  1. Das D, Khanna N, Veziroglu TN. Recent developments in biohydrogen production processes. Chem Indus Chem Eng Quarterly 2008; 14:57–67.
  2. Nandi R, Sengupta S. Microbial production of hydrogen: an overview. Crit Rev Microbiol 1998; 24:61–84.
  3. Kumar N, Monga PS, Biswas AK, Das D. Modeling and simulation of clean fuel production by Enterobacter cloacae IIT-BT 08. Int J Hydrogen Energy 2000; 25:945–952.
  4. Kumar N, Das D. Enhancement of hydrogen production by Enterobacter cloacae IIT-BT 08. Process Biochem 2000; 35:589–593.
  5. Das D. Advances in biohydrogen production processes: An approach towards commercialization. Int J Hydrogen Energy 2009; 34:7349–7357.
  6. Kumar N, Das D. Studies on molecular hydrogen production by Enterobacter cloacae IIT-BT 08. Paper presented at 9th European Congress on Biotechnology, Brussels, Belgium, 11–15 July, 1999.
  7. Kumar N, Das D. The production of pollution free gaseous fuel. Patent application filed. 191/Cal/99, India.
  8. Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Thomson N, Allen MJ, Angiuoli SV, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol 2008; 26:541–547.
  9. Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA 1990; 87:4576–4579.
  10. Garrity GM, Bell JA, Lilburn T. Phylum XIV. Proteobacteria phyl. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 2, Part B, Springer, New York, 2005, p. 1.
  11. Validation of publication of new names and new combinations previously effectively published outside the IJSEM. List no. 106. Int J Syst Evol Microbiol 2005; 55:2235–2238.
  12. Garrity GM, Bell JA, Lilburn T. Class III. Gammaproteobacteria class. nov. In: Garrity GM, Brenner DJ, Krieg NR, Staley JT (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 2, Part B, Springer, New York, 2005, p. 1.
  13. Williams KP, Kelly DP. Proposal for a new class within the phylum Proteobacteria, Acidithiobacillia classis nov., with the type order Acidithiobacillales, and emended description of the class Gammaproteobacteria. Int J Syst Evol Microbiol 2013; 63:2901–2906.
  14. Garrity GM, Holt JG. Taxonomic Outline of the Archaea and Bacteria. In: Garrity GM, Boone DR, Castenholz RW (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 1, Springer, New York, 2001, p. 155–166.
  15. Skerman VBD, McGowan V, Sneath PHA. Approved lists of bacterial names. Int J Syst Bacteriol 1980; 30:225–420.
  16. Rahn O. New principles for the classification of bacteria. Zentralbl Bakteriol Parasitenkd Infektionskr Hyg 1937; 96:273–286.
  17. Judicial Commission. Conservation of the family name Enterobacteriaceae, of the name of the type genus, and designation of the type species OPINION NO. 15. Int Bull Bacteriol Nomencl Taxon 1958; 8:73–74.
  18. Hormaeche E, Edwards PR. A proposed genus Enterobacter. Int Bull Bacteriol Nomencl Taxon 1960; 10:71–74.
  19. Sakazaki R. Genus VII. Enterobacter Hormaeche and Edwards 1960, 72; Nom. cons. Opin. 28, Jud. Comm. 1963, 38. In: Buchanan RE, Gibbons NE (eds), Bergey’s Manual of Determinative Bacteriology, Eighth Edition, The Williams and Wilkins Co., Baltimore, 1974, p. 324–325.
  20. OPINION. 28 Rejection of the Bacterial Generic Name Cloaca Castellani and Chalmers and Acceptance of Enterobacter Hormaeche and Edwards as a Bacterial Generic Name with Type Species Enterobacter cloacae (Jordan) Hormaeche and Edwards. Int Bull Bacteriol Nomencl Taxon 1963; 13:38.
  21. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000; 25:25–29.
  22. Bennett S. Solexa Ltd. Pharmacogenomics 2004; 5:433–438.
  23. Zerbino DR, Birney E. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 2008; 18:821–829.
  24. Ewing B, Green P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 1998; 8:186–194.
  25. Ewing B, Hillier L, Wendl MC, Green P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 1998; 8:175–189.
  26. Gordon D, Abajian C, Green P. Consed: a graphical tool for sequence finishing. Genome Res 1998; 8:195–202.
  27. Anonymous. Prodigal Prokaryotic Dynamic Programming Genefinding Algorithm. Oak Ridge National Laboratory and University of Tennessee 2009.
  28. Pati A, Ivanova N, Mikhailova N, Ovchinikova G, Hooper SD, Lykidis A, Kyrpides NC. GenePRIMP: A Gene Prediction Improvement Pipeline for microbial genomes.
  29. Markowitz V, Mavromatis K, Ivanova N, Chen IM, Chu K, Kyrpides N. Expert Review of Functional Annotations for Microbial Genomes. Bioinformatics 2009; 25:2271–2278.
  30. Sauter M, Böhm R, Böck A. Mutational analysis of the operon (hyc) determining hydrogenase 3 formation in Escherichia coli. Mol Microbiol 1992; 6:1523–1532.
  31. Zinoni F, Birkmann A, Stadtman TC, Böck A. Nucleotide sequence and expression of the selenocysteine-containing polypeptide of formate dehydrogenase (formate-hydrogen-lyase-linked) from Escherichia coli. Proc Natl Acad Sci USA 1986; 83:4650–4654.
  32. Rossmann RG, Sawers G, Böck A. Mechanism of regulation of the formate-hydrogen lyase pathway by oxygen, nitrate, and pH: definition of the formate regulon. Mol Microbiol 1991; 5:2807–2814.
  33. Leonhartsberger S, Korsa I, Böck A. A Review The molecular biology of formate metabolism in enterobacteria. J Mol Microbiol Biotechnol 2002; 4:269–276.
  34. Leonhartsberger S, Korsa I, Böck A. The molecular biology of formate metabolism in enterobacteria. J Mol Microbiol Biotechnol 2002; 4:269–276.
  35. Böhm R, Sauter M, Böck A. Nucleotide sequence and expression of an operon in Escherichia coli coding for formate hydrogenlyase components. Mol Microbiol 1990; 4:231–243.
  36. Jacobi A, Rossmann R, Böck A. The hyp operon gene products are required for the maturation of catalytically active hydrogenase isoenzymes in Escherichia coli. Arch Microbiol 1992; 158:444–451.
  37. Schlensog V, Böck A. Identification and sequence analysis of the gene encoding the transcriptional activator of the formate hydrogenlyase system of Escherichia coli. Mol Microbiol 1990; 4:1319–1327.
  38. Yoshida A, Nishimura T, Kawaguchi H, Inui M. Enhanced Hydrogen Production from Formic Acid by Formate Hydrogen Lyase-Overexpressing Escherichia coli Strains. Appl Environ Microbiol 2005; 71:6762–6768.



The work was conducted by the U.S. Department of Energy Joint Genome Institute and is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Authors (DD and NK) are also thankful to MNRE for their financial assistance. NK also gratefully acknowledges Department of Biotechnology (DBT), Government of India, for senior research fellowship. The authors from IITKgp, India submitted the JGI-CSP project, analyzed the data and wrote the manuscript. The authors from DSMZ confirmed the strain identity and extracted high quality genomic DNA for sequencing. The authors from DoE-JGI, WC, USA, and LLNL, Livermore CA USA carried out the sequencing and annotation of the genome.

Author information

Correspondence to Debabrata Das.

Keywords


  • Enterobacter sp. IIT-BT 08
  • genome sequence
  • facultative anaerobe
  • biohydrogen