Genome sequence of the organohalide-respiring Dehalogenimonas alkenigignens type strain (IP3-3T)

Dehalogenimonas alkenigignens IP3-3T is a strictly anaerobic, mesophilic, Gram negative staining bacterium that grows by organohalide respiration, coupling the oxidation of H2 to the reductive dehalogenation of polychlorinated alkanes. Growth has not been observed with any non-polyhalogenated alkane electron acceptors. Here we describe the features of strain IP3-3T together with genome sequence information and its annotation. The 1,849,792 bp high-quality-draft genome contains 1936 predicted protein-coding genes, 47 tRNA genes, a single large subunit rRNA (23S-5S) locus, and a single, orphan, small subunit rRNA (16S) locus. The genome contains 29 predicted reductive dehalogenase genes, a large majority of which lack cognate genes encoding membrane anchoring proteins.


Introduction
Strain IP3-3T (=JCM 17062, =NRRL B-59545) is the type strain of the species Dehalogenimonas alkenigignens [1]. Currently, two pure cultures of D. alkenigignens have been described, namely, D. alkenigignens strains IP3-3T and SBP-1 [1]. Both strains were isolated from chlorinated alkane- and alkene-contaminated groundwater collected at a Superfund Site near Baton Rouge, Louisiana (USA) [1]. Construction of 16S rRNA gene libraries indicated that bacteria closely related or identical to D. alkenigignens were present at high relative abundance in the groundwater where strains IP3-3T and SBP-1 were first isolated [1].
Strains of D. alkenigignens possess the unique trait of growing via organohalide respiration, a process in which halogenated organic compounds are utilized as terminal electron acceptors. In particular, they are able to reductively dehalogenate a variety of polychlorinated alkanes that are of environmental concern on account of their potential to cause adverse health effects and their widespread occurrence as soil and groundwater pollutants [1][2][3][4]. In this report, we present a summary classification and a set of features for D. alkenigignens IP3-3T together with the description of the draft genomic sequence and annotation.

Classification and features
Dehalogenimonas alkenigignens is a member of the order Dehalococcoidales, class Dehalococcoidia, of the phylum Chloroflexi (Table 1). Based on 16S rRNA gene sequences, the closest related type strains are Dehalogenimonas lykanthroporepellens BL-DC-9T [1,5] and Dehalococcoides mccartyi 195T [6], with sequence identities of 96.2 % and 90.6 %, respectively [1]. Figure 1 shows the phylogenetic neighborhood of D. alkenigignens strain IP3-3T in a 16S rRNA gene-based phylogenetic dendrogram. The sequence of the lone 16S rRNA gene copy in the draft genome is identical to the previously published 16S rRNA gene sequence (JQ994266).
The cells of D. alkenigignens IP3-3T are Gram negative staining, non-spore forming, irregular cocci to disk-shaped with a diameter of 0.4-1.1 μm [1] (Fig. 2). The strain was isolated in liquid medium using a dilution-to-extinction approach. Growth of the strain was not observed on agar plates even after long-term (2 months) incubation [1]. The temperature range for growth of strain IP3-3T is between 18 °C and 42 °C with an optimum between 30 °C and 34 °C [1]. The pH range for growth is 6.0 to 8.0 with an optimum of 7.0 to 7.5 [1]. The strain grows in the presence of <2 % (w/v) NaCl and is resistant to ampicillin and vancomycin at concentrations of 1.0 and 0.1 g/l, respectively [1].

Genome sequencing information
Genome project history
D. alkenigignens IP3-3T was chosen for genome sequencing because it is the type strain of the species and because of the importance of organohalide respiration in the field of environmental biotechnology and bioremediation. A summary of the project information is shown in Table 2. The D. alkenigignens strain IP3-3T genome project is deposited in the Genomes OnLine Database [7] and the genome sequence is available from GenBank.
Growth conditions and genomic DNA preparation
D. alkenigignens strain IP3-3T (=JCM 17062, =NRRL B-59545) was cultured in liquid anaerobic basal medium [1] supplemented with 2 mM 1,2-dichloropropane. Cells were harvested from 9.9 L culture medium by centrifugation after at least 50 % of the starting 1,2-dichloropropane was dehalogenated. Total DNA was extracted using a GenElute Bacterial Genomic DNA kit (Sigma-Aldrich) following the manufacturer's recommended protocol. Eluted DNA was concentrated using ethanol precipitation, air

Genome sequencing and assembly
The genome of D. alkenigignens IP3-3T was sequenced using a combination of Illumina [8] and 454 technologies [9]. The 454 Titanium standard data and the 454 paired-end data were assembled with gsAssembler ver. 2.6 (Roche). Illumina data were assembled with CLC Genomics Workbench ver. 6.5.1 (CLCbio). The resulting scaffolds and contigs were integrated using CodonCode Aligner ver. 3.7.1 (CodonCode Corporation). In addition, Illumina sequencing reads were mapped to the final contigs to correct misassemblies and base errors. The final assembly generated one scaffold comprising two contigs and representing 1,849,792 bp, based on 655.71× coverage of 454 and Illumina sequencing data.

Genome annotation
Genes were identified using Prodigal [10] as part of the JGI's microbial annotation pipeline [11] followed by a round of manual curation using the JGI GenePRIMP pipeline [12]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information nonredundant database and the UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. These data sources were combined to assert a product description for each predicted protein. Non-coding genes and miscellaneous features were predicted using tRNAscan-SE [13], RNAmmer [14], Rfam [15], TMHMM [16], ARAGORN [17], bSECISearch [18], and SignalP [19]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes - Expert Review platform [20].

Genome properties
The draft genome of D. alkenigignens strain IP3-3T has a total length of 1,849,792 bp with 55.88 % G + C content (Table 3 and Fig. 3). Of the 1988 genes predicted, 1936 were protein-coding genes and 52 were RNAs. The majority of the protein-coding genes (74.9 %) were assigned a putative function, and the remainder were annotated as hypothetical proteins. The distribution of the predicted protein-coding genes into COG functional categories is presented in Table 4.
Although the genomes of D. alkenigignens IP3-3T, D. lykanthroporepellens BL-DC-9T [21], and Dehalococcoides mccartyi strains [22][23][24] contain similar numbers of rRNA and tRNA encoding genes, they lack overall synteny and differ in their GC content, gene density, and percentage of sequence that encodes proteins. BLAST comparisons of the protein sets of D. alkenigignens IP3-3T and D. lykanthroporepellens BL-DC-9T revealed that the two strains have 1154 protein-coding genes in common (bidirectional best hits, 20-95 % identity at the predicted protein level). The same comparisons indicated that D. alkenigignens IP3-3T contains 782 protein-coding genes with no homologs in D. lykanthroporepellens BL-DC-9T, whereas the latter contains 566 protein-coding genes with no homologs in D. alkenigignens IP3-3T. Genome-specific genes identified in D. alkenigignens IP3-3T and D. lykanthroporepellens BL-DC-9T included those encoding transposases, restriction endonucleases, acetyltransferases, permeases, reductases, hydrogenases, and dehalogenases. Some of these strain-specific genes were associated with IS elements.
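The bidirectional best-hit criterion used in the comparison above can be illustrated with a minimal Python sketch. It is not the pipeline used for this genome; the function names, the dictionary layout of the hit tables, and the toy identifiers (A1, B1, and so on) are assumptions made only for illustration. Two proteins are paired when each is the top-scoring hit of the other.

```python
# Hypothetical hit tables: query id -> list of (subject id, bit score).
# In practice these would be parsed from tabular BLAST output.

def best_hits(hits):
    """Return the top-scoring subject for every query that has any hit."""
    return {q: max(subs, key=lambda h: h[1])[0] for q, subs in hits.items() if subs}

def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Pair proteins that are each other's best hit in both search directions."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)

# Toy data (invented identifiers and scores).
a_vs_b = {"A1": [("B1", 900.0), ("B2", 120.0)],
          "A2": [("B2", 300.0)],
          "A3": [("B1", 80.0)]}
b_vs_a = {"B1": [("A1", 890.0), ("A3", 75.0)],
          "B2": [("A1", 110.0), ("A2", 310.0)]}

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
print(pairs)  # A3 has no partner because B1's best hit is A1

# Consistency check on the counts reported in the text for strain IP3-3T:
# shared genes plus strain-specific genes equals all protein-coding genes.
assert 1154 + 782 == 1936
```

In this sketch, proteins left unpaired (A3 here) correspond to the genes counted as having no homolog in the other genome.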
The genome of D. alkenigignens IP3-3T contains 47 tRNA genes, including those for all 20 standard amino acids as well as the less common amino acid selenocysteine. Consistent with the presence of a selC gene (DEALK_t00110) encoding a selenocysteine-inserting tRNA (tRNASec), D. alkenigignens strain IP3-3T also contains genes that are putatively involved in synthesis of selenocysteine (DEALK_04960-04970) and a GTP-dependent selenocysteine-specific elongation factor (DEALK_04950) that forms a quaternary complex with selenocysteine-tRNASec and the selenocysteine inserting sequence (SECIS), a hairpin loop found immediately downstream of the UGA codon in selenoprotein-encoding mRNA [26]. This complex facilitates reading through the UGA codon and incorporation of selenocysteine instead of translation termination [27]. Also consistent with the presence of the genes encoding the synthesis and incorporation of selenocysteine, D. alkenigignens strain IP3-3T contains multiple genes encoding putative selenocysteine-containing proteins, including a selenophosphate synthase (DEALK_04975) and a formate dehydrogenase (DEALK_19115), that have internal in-frame UGA stop codons followed by putative SECIS elements [18].
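As a rough illustration of what an internal in-frame UGA codon means, the following Python sketch lists in-frame TGA codons that occur before the final codon of a coding sequence. It is only a first-pass filter under simple assumptions: the function name and the toy sequences are hypothetical, and the sketch does not detect SECIS elements, which tools such as bSECISearch [18] address.

```python
def internal_inframe_tga(cds):
    """Return 0-based codon indices of in-frame TGA codons before the last codon.

    `cds` is a DNA or RNA coding sequence read in frame from its start codon.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3          # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons[:-1]) if codon == "TGA"]

# Toy sequences (invented): the first has an internal in-frame TGA,
# the second contains "TGA" only out of frame.
print(internal_inframe_tga("ATGGCTTGAGGTCATTAA"))
print(internal_inframe_tga("ATGATGAAATAA"))
```

A candidate selenoprotein gene would additionally need a plausible SECIS hairpin downstream of the UGA codon, as described above.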
A number of microorganisms accumulate low molecular weight organic compounds commonly referred to as "compatible solutes" that help the microorganisms survive osmotic stress but do not interfere with metabolism [28]. Ectoine is a compatible solute of many mesophilic bacteria capable of survival at high salt concentrations [28], while mannosylglycerate is a compatible solute accumulated by many thermophilic organisms [29]. Homologs of a gene encoding a bifunctional mannosylglycerate synthase (mgsD) are found in Dehalococcoides mccartyi strains (e.g., DET1363) and D. lykanthroporepellens BL-DC-9T (Dehly_0877), an unusual occurrence for mesophilic bacteria [21,29]. Comparative analysis revealed that D. alkenigignens IP3-3T contains a homologous gene (DEALK_12650, 55-70 % protein identity). This expands the range of mesophilic species containing genes putatively involved in the biosynthesis of mannosylglycerate. D. alkenigignens IP3-3T, however, lacks the operon (ectABC) encoding putative homologs of the enzymes involved in ectoine biosynthesis and regulation that is present in D. lykanthroporepellens BL-DC-9T (Dehly_1306, Dehly_1307, Dehly_1308). The presence of these ectoine biosynthesis genes in D. lykanthroporepellens BL-DC-9T but not in D. alkenigignens IP3-3T may account for the ability of the former to reductively dechlorinate polychlorinated alkanes at higher NaCl concentrations than were observed for D. alkenigignens IP3-3T [1].

Reductive dehalogenases
Genes encoding the enzymes characterized to date that are involved in catalyzing the reductive dehalogenation of chlorinated solvents are organized in rdhAB operons encoding a ~500 aa protein (RdhA) that functions as a reductive dehalogenase and a ~90 aa hydrophobic protein with transmembrane helices (RdhB) that is thought to anchor the RdhA to the cytoplasmic membrane [30][31][32][33][34][35][36][37][38][39][40][41]. D. alkenigignens IP3-3T contains several loci related to rdhA and/or rdhB genes that are scattered throughout the genome and together account for 2.38 % of the genome. The multiple rdhA and rdhB ORFs of D. alkenigignens IP3-3T have 32-97 % and 32-43 % identities at the predicted protein level, respectively. The closest homologs for most of the D. alkenigignens IP3-3T rdhA ORFs (Table 5) are found among Dehalogenimonas lykanthroporepellens BL-DC-9T, Dehalococcoides mccartyi strains, or uncultured bacteria. A twin-arginine motif followed by a stretch of hydrophobic amino acids was identified in the N-terminus of a large majority (27 of 29) of the predicted RdhA sequences (Table 5). Consistent with the presence of the twin-arginine sequence in the N-terminus of most of its RdhA sequences, D. alkenigignens IP3-3T contains an operon encoding proteins that constitute a putative twin-arginine translocation (TAT) system (DEALK_04830-04860). This specialized system is involved in the secretion of folded proteins across the bacterial inner membrane into the periplasmic space [42,43]. Dehalogenimonas lykanthroporepellens BL-DC-9T also contains an operon encoding an analogous TAT system that is related to the TAT system of D. alkenigignens IP3-3T (55-86 % protein identity). Two conserved motifs each containing three or four cysteine residues, a feature associated with binding iron-sulfur clusters [44], were identified near the C-terminus of 28 of the 29 predicted RdhA sequences of D. alkenigignens IP3-3T.
The first of these motifs had a consistent number of cysteine residues and a consistent number of amino acids between the cysteine residues (CX2CX2CX3C), while the second motif was variable (Table 5). If a "full-length" rdhA is predicted to encode a protein containing a twin-arginine sequence in the N-terminus, two iron-sulfur cluster binding motifs in the C-terminus, and an intervening sequence of ~450 aa, then D. alkenigignens IP3-3T contains 27 such genes, a number appreciably larger than the 17 such genes found in Dehalogenimonas lykanthroporepellens BL-DC-9T [21]. One of the non-full-length rdhA genes (DEALK_17180) contains a predicted internal stop codon that putatively prevents complete translation of what would otherwise be a 458 aa protein containing two iron-sulfur cluster binding motifs. rdhA genes with internal stop codons have been reported previously among the genomes of other organohalide-respiring strains of the genera Dehalococcoides [24] and Dehalobacter [45,46].
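The motif criteria described above lend themselves to a simple pattern search. The Python sketch below is illustrative only: the regular expression for the first iron-sulfur motif follows the CX2CX2CX3C spacing given in the text, whereas the bare "RR" test, the 50-residue N-terminal window, the function name, and the toy protein are assumptions introduced here and are not the criteria used to compile Table 5. The variable second motif is not modeled.

```python
import re

# First iron-sulfur cluster binding motif, CX2CX2CX3C (X = any residue).
FES1 = re.compile(r"C.{2}C.{2}C.{3}C")
# Simplified stand-in for a twin-arginine signal: two consecutive arginines.
TWIN_ARG = re.compile(r"RR")

def scan_rdha(protein, n_window=50):
    """Report a twin-arginine pair near the N-terminus and CX2CX2CX3C start positions."""
    seq = protein.upper()
    has_tat = TWIN_ARG.search(seq[:n_window]) is not None
    fes1_starts = [m.start() for m in FES1.finditer(seq)]
    return {"twin_arginine_n_term": has_tat, "fes1_starts": fes1_starts}

# Toy protein (invented): RR near the N-terminus, one CX2CX2CX3C near the C-terminus.
toy = "MSRRDFLKAGAAAL" + "G" * 40 + "CAACGGCTTTC" + "GG"
print(scan_rdha(toy))
```

A stricter screen would also require hydrophobic residues after the arginine pair and a second cysteine-rich motif downstream of the first.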
Within D. alkenigignens IP3-3T, only three of the rdhA ORFs (DEALK_11290, DEALK_17200, and DEALK_19050) have a cognate rdhB (Table 6). Two additional rdhB genes (DEALK_00250 and DEALK_05730) appear to be orphans with no cognate rdhA ORF. In at least one locus (DEALK_00250), it appears that transposon insertion has truncated the rdhA gene (annotated as pseudogene DEALK_00260). The predicted RdhB sequences of strain IP3-3T each contain two or three transmembrane helices (Table 6), similar to the features of the predicted RdhB sequences of Dehalogenimonas lykanthroporepellens BL-DC-9T and Dehalococcoides mccartyi strains [21,22,24,47]. The predicted RdhB sequences of D. alkenigignens IP3-3T are most closely related to the RdhB of D. lykanthroporepellens strain BL-DC-9T, Dehalococcoides mccartyi strain GY, and an uncultured bacterium designated as Dehalogenimonas sp. WBC-2 [48] (45-96 % identity at the protein level, Table 6). As was observed for D. lykanthroporepellens BL-DC-9T [21], genes putatively involved in the regulation of rdhAB operons in Dehalococcoides mccartyi strains (e.g., MarR-type or two-component transcriptional regulators [22,24]) were not present in a majority of the rdhA loci of D. alkenigignens IP3-3T. Thus, it appears that regulation of rdh gene expression within the genus Dehalogenimonas may generally differ from that of the genus Dehalococcoides.
The predicted RdhA protein encoded by the rdhAB operon comprising DEALK_17200-17210 shares 95 % identity with the 1,2-dichloropropane reductive dehalogenases (dcpAs) recently identified in Dehalococcoides mccartyi strains KS and RC and 92 % identity with the related dcpA in D. lykanthroporepellens BL-DC-9T [39]. The putative membrane anchoring protein encoded by the rdhB (DEALK_17210) adjacent to the dcpA gene is also related (92-96 % identity at the protein level) to the RdhB proteins cognate to the dcpA genes of D. lykanthroporepellens BL-DC-9T and Dehalococcoides mccartyi strains KS and RC [39]. Interestingly, the putative dcpA gene present in D. alkenigignens IP3-3T had mismatches with all four primers/probes that were reported [39] for use in PCR or qPCR for detection and quantification of this gene (one mismatch with dcpA-360F, three mismatches with dcpA-1257F, and two mismatches each with dcpA-1426R and dcpA-1449R).
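Mismatch counts of the kind reported above can be obtained by comparing each primer with its best-matching site on the target gene. The Python sketch below shows one simple way to do this; the function names and the toy template and primer sequences are invented for illustration and are not the dcpA primers of [39]. Reverse primers are reverse-complemented before comparison, and degenerate bases are not handled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def mismatches(primer, site):
    """Count positions at which two equal-length sequences differ."""
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    return sum(p != s for p, s in zip(primer.upper(), site.upper()))

def min_mismatches(primer, template, reverse=False):
    """Slide the primer along the template and return the fewest mismatches found."""
    probe = revcomp(primer) if reverse else primer.upper()
    target = template.upper()
    if len(probe) > len(target):
        raise ValueError("primer is longer than template")
    windows = range(len(target) - len(probe) + 1)
    return min(mismatches(probe, target[i:i + len(probe)]) for i in windows)

# Toy example (invented sequences).
template = "TTGACCGTAGGCATTA"
print(min_mismatches("ACCGTTGG", template))                # forward primer
print(min_mismatches("TAATGCCT", template, reverse=True))  # reverse primer
```

For real primers, mismatches near the 3' end generally matter most for amplification, so the position of each mismatch would be worth reporting alongside the count.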
The presence of insertion sequence elements adjacent to some rdhA/rdhB loci in D. alkenigignens IP3-3T indicates their acquisition from an unknown host. Previous studies of D. lykanthroporepellens BL-DC-9T and Dehalococcoides strains have also suggested horizontal transfer of reductive dehalogenase genes [21,49,50]. Additionally, the genomic region downstream of the ssrA gene (DEALK_tm00010) in D. alkenigignens IP3-3T shares some synteny with the mobile genetic elements reported for vinyl chloride reductases in Dehalococcoides strains [49]. A 22 bp direct repeat of the 3' end of the ssrA gene adjacent to one of the rdhA loci in D. alkenigignens IP3-3T (DEALK_11430) suggests that integration at the ssrA gene may have played a role in shaping the genome of D. alkenigignens IP3-3T.
It remains to be determined whether D. alkenigignens IP3-3T rdhA genes lacking a downstream rdhB ORF encode functional reductive dehalogenases and whether or how they are membrane-bound. It is possible that a noncontiguous rdhB (e.g., the orphan DEALK_05730) could complement one or more of the strain IP3-3T rdhA genes lacking a downstream rdhB ORF. Alternatively, some of these genes may encode reductive dehalogenases that are not membrane-bound or that are bound by an unknown mechanism. The finding that many of the D. lykanthroporepellens BL-DC-9T rdhA genes lacking cognate rdhB genes are simultaneously transcribed during the reductive dechlorination of 1,2-dichloroethane, 1,2-dichloropropane, and 1,2,3-trichloropropane [51] suggests that rdhA genes lacking a cognate rdhB may serve a purpose. An enzyme involved in the reductive dehalogenation of tetrachloroethene by Sulfurospirillum multivorans (basonym Dehalospirillum multivorans [52,53]) was found in the cytoplasmic fraction [54], suggesting that some reductive dehalogenases are either loosely membrane-bound or soluble entities. The same may be the case for the majority of reductive dehalogenases of D. alkenigignens IP3-3T.

Conclusions
Genomic analysis of D. alkenigignens IP3-3T revealed the presence of components associated with synthesis of selenocysteine-containing proteins as well as numerous reductive dehalogenase homologous genes not previously studied. As with the related species D. lykanthroporepellens but in contrast to other dechlorinating genera, a large majority of the reductive dehalogenase homologous genes in D. alkenigignens IP3-3T lack apparent cognate genes encoding membrane anchoring components. The sequences of these diverse genes may aid future studies aimed at elucidating the strain's mechanisms for transforming polychlorinated alkanes. The absence of genes encoding enzymes involved in ectoine biosynthesis in the genome of D. alkenigignens IP3-3T may account for the strain's inability to dehalogenate chlorinated alkanes at the higher NaCl concentrations tolerated by strains of the related species D. lykanthroporepellens.