Complete genome sequence of Microbulbifer sp. CCB-MM1, a halophile isolated from Matang Mangrove Forest, Malaysia

Microbulbifer sp. CCB-MM1 is a halophile isolated from estuarine sediment of Matang Mangrove Forest, Malaysia. Based on 16S rRNA gene sequence analysis, strain CCB-MM1 potentially represents a new species of the genus Microbulbifer. Here we describe its features and present its complete genome sequence with annotation. The genome is 3.86 Mb in size with a GC content of 58.85%, harbouring 3313 protein-coding genes and 92 RNA genes. A total of 71 genes associated with carbohydrate active enzymes were found using dbCAN. The ectoine biosynthetic genes, the ectABC operon and ask_ect, were detected using antiSMASH 3.0. The cell shape determination genes, the mreBCD operon, rodA and rodZ, were annotated, congruent with the rod-coccus cell cycle of strain CCB-MM1. In addition, a putative regulatory gene of the mreBCD operon, bolA, was detected, which might be associated with the regulation of the rod-coccus cell cycle observed in the strain.


Introduction
Microbulbifer sp. CCB-MM1 is a halophile isolated from an estuarine sediment sample taken from Matang Mangrove Forest, Malaysia. The genus Microbulbifer was proposed by González [1] with the description of Microbulbifer hydrolyticus, which was isolated from marine pulp mill effluent. Members of Microbulbifer are typically found in high-salinity environments including marine sediment [2], salt marsh [3], coastal soil [4] and mangrove soil [5]. They are known for their capability to degrade a great variety of polysaccharides including cellulose [1,5], xylan [1,5,6], chitin [1,5,6], agar [3,6] and alginate [7]. Microbulbifer strains are therefore potential sources of carbohydrate active enzymes of biotechnological interest. One species, Microbulbifer mangrovi, has been reported to degrade more than 10 different polysaccharides [7].
Polysaccharides have a broad range of industrial applications. The most common storage polysaccharide, starch, can be used as a food additive [8], as an excipient [9] and as a substrate in fermentation processes to produce bioethanol [10]. Structural polysaccharides such as cellulose, chitosan and chitin, on the other hand, can be used to develop high-performance materials owing to their renewability, biodegradability, biological inertness and low cost [11][12][13]. However, polysaccharides from natural sources are often not suitable for direct application. Chemical modifications of the reactive groups (carboxyl, hydroxyl, amido and acetamido groups) on the polysaccharide backbone are required to alter their chemical and physical properties to suit the intended application [14]. In recent years, research has favored enzymatic methods using carbohydrate active enzymes [15]. This alternative offers the advantages of substrate specificity, stereospecificity and environmental friendliness [16]. Hence, the discovery of novel carbohydrate active enzymes is of great biotechnological interest, and Microbulbifer strains are potential sources of these enzymes. Therefore, we sequenced the genome of Microbulbifer sp. CCB-MM1 with the primary objective of identifying potential carbohydrate active enzyme coding genes. The genome insights will serve as a baseline for downstream analyses including enzyme activity assays and functional elucidation of these genes.
To date, seven Microbulbifer genomes are publicly available from GenBank, namely Microbulbifer agarilyticus S89 (NZ_AFPJ00000000.1) [17], Microbulbifer variabilis ATCC 700307T (NZ_AQYJ00000000.1), Microbulbifer elongatus HZ11 (NZ_JELR00000000.1) [18], Microbulbifer sp. ZGT114 (LQBR00000000.1), Microbulbifer thermotolerans DAU221 (CP014864.1) [19], Microbulbifer sp. Q7 (LROY00000000.1) and Microbulbifer sp. WRN-8 (LRFG00000000.1). All of these genomes are draft assemblies except that of Microbulbifer thermotolerans DAU221. Here we present the complete genome of Microbulbifer sp. CCB-MM1 and some insights from comparative analysis with the seven other Microbulbifer genomes.

Classification and features
Microbulbifer sp. strain CCB-MM1 was isolated from mangrove sediment obtained from Matang Mangrove Forest. Isolation followed a previously described method [20] using H-ASWM (2.4% artificial sea water, 0.5% tryptone, 10 mM HEPES, pH 7.6) [21]. CCB-MM1 is a Gram-negative, aerobic, non-spore-forming and halophilic bacterium (Table 1). Its shape appears to be associated with its growth phase: it is rod-shaped at exponential phase (Fig. 1a) and coccus-shaped at stationary phase (Fig. 1b). The rod-shaped cells range from approximately 1.3 to 2.5 μm in length and are approximately 0.3 μm in width, while the coccus cells are approximately 0.6 μm in diameter. Colonies on agar plates are white, circular and raised with an entire edge.

Genome sequencing information
Genome project history
The genome of CCB-MM1 was sequenced in October 2015. Whole genome sequencing and annotation were done by the Centre for Chemical Biology (Universiti Sains Malaysia). The complete genome sequence is available in GenBank under the accession number CP014143. The project information is summarized in Table 2.
Table footnote (evidence codes): IDA, inferred from direct assay; TAS, traceable author statement (i.e., a direct report exists in the literature); NAS, non-traceable author statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from http://www.geneontology.org/GO.evidence.shtml of the Gene Ontology project [75].

Growth conditions and genomic DNA preparation
CCB-MM1 was cultured aerobically in 100 mL of H-ASWM overnight (16 h) at 30 °C with shaking. Genomic DNA was extracted using a modified phenol-chloroform method [30]. The integrity of the extracted genomic DNA was assessed by gel electrophoresis on a 0.7% agarose gel, and quantification was done using a NanoDrop 2000 Spectrophotometer (Thermo Scientific, USA).

Genome sequencing and assembly
The whole genome of CCB-MM1 was sequenced on the PacBio RS II platform with P6-C4 chemistry (Pacific Biosciences, USA). Two SMRT Cells were used and 2,674,097,380 pre-filter polymerase read bases were obtained, corresponding to approximately 692X coverage of the genome. The reads were assembled using the HGAP3 protocol [31] on SMRT Portal v2.3.0, with reads longer than 25,000 bp used as seed reads. The assembly yielded a circular chromosome of 3,864,326 bp with an average base coverage of 431X and 100% base calling. The assembled sequence was polished twice using the resequencing protocol until the consensus concordance reached 100%.
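The reported raw coverage follows directly from the read-base total and the chromosome size given above. The short Python sketch below reproduces that arithmetic; the function and variable names are illustrative and are not part of the sequencing or assembly software.

```python
def fold_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Return fold coverage as total sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp


# Values taken from the text; names are illustrative only.
polymerase_read_bases = 2_674_097_380  # pre-filter polymerase read bases
chromosome_size_bp = 3_864_326         # assembled circular chromosome

raw_coverage = fold_coverage(polymerase_read_bases, chromosome_size_bp)
print(f"Pre-filter coverage: about {raw_coverage:.0f}X")
```

The lower average base coverage of 431X reported for the assembly is expected, since only reads that pass filtering and map to the consensus contribute to it.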

Genome properties
CCB-MM1 contains a single circular chromosome and no plasmid. The size of the chromosome is 3,864,326 bp with an overall G + C content of 58.85% (Table 3). The complete genome consists of 3313 ORFs, 79 tRNA, 12 rRNA and 1 tmRNA genes. Of the 3313 predicted ORFs, 2030 can be assigned a functional prediction and 2563 can be assigned to COG functional categories (Table 4). The circular map of the genome, generated using the CGView Comparison Tool [44], is depicted in Fig. 3.
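As a rough illustration of how these summary figures relate to one another, the Python sketch below computes G + C content for a nucleotide string and tallies the gene counts quoted above. The `gc_content` helper and the toy sequence are assumptions made for illustration; they are not the annotation pipeline used in this study.

```python
def gc_content(sequence: str) -> float:
    """Return G + C content in percent, counting only unambiguous A/C/G/T bases."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / (gc + at)


# Toy sequence for demonstration only (not from the CCB-MM1 genome).
toy_gc = gc_content("ATGCGC")

# Gene counts as reported in the text.
orfs, trna, rrna, tmrna = 3313, 79, 12, 1
rna_genes = trna + rrna + tmrna            # total RNA genes
with_function = 100.0 * 2030 / orfs        # percent of ORFs with functional prediction
with_cog = 100.0 * 2563 / orfs             # percent of ORFs assigned to COG categories

print(f"RNA genes: {rna_genes}")
print(f"ORFs with functional prediction: {with_function:.1f}%")
print(f"ORFs assigned to COGs: {with_cog:.1f}%")
```

Applied to the full chromosome sequence, the same function would be expected to return a value close to the reported 58.85%.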

Comparative genomics
There are seven genomes of Microbulbifer strains publicly available in GenBank to date. To assess the relatedness between CCB-MM1 and the publicly available Microbulbifer genomes, ANI values between the genomes were calculated using a method based on MUMmer alignment [45]. Based on the results ( [...]
Table footnote: The total is based on the total number of protein coding genes in the annotated genome.
[...] scan using HMM profiles downloaded from dbCAN (version: dbCAN-fam-HMMs.txt.v4) with an e-value cutoff of 1e-18 and a coverage cutoff of 0.35. A total of 71 carbohydrate-active genes were detected, and further analysis of these genes using SignalP predicted that 25 of them contain signal peptides. As shown in Table 6, we found 29 genes associated with GH families including GH3, GH5, GH13, GH16, GH20, GH23, GH31, GH38, GH103 and GH130; however, we found no genes associated with PL families in the genome. Annotation of the GH genes revealed that the CCB-MM1 genome possesses genes encoding cellulase (GH5), alpha-amylase, pullulanase (GH13) and beta-glucanase (GH16), which are of potential interest for biotechnological applications. While a gene coding for beta-hexosaminidase, one of the chitinolytic enzymes [48], is present in the genome of CCB-MM1, no gene coding for chitinase was detected. This suggests that CCB-MM1 lacks the ability to degrade chitin, although further assays are required to confirm the phenotype.
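The two thresholds mentioned for the dbCAN scan (e-value 1e-18 and coverage 0.35) can be pictured as a simple filter over HMM domain hits. The Python sketch below is a minimal illustration under stated assumptions: the `DomainHit` record, the gene names and the hit values are hypothetical, coverage is assumed to mean the fraction of the HMM profile spanned by the alignment, and the strictness of the comparisons (`<` and `>`) is a guess rather than the documented behaviour of any dbCAN tool.

```python
from dataclasses import dataclass

EVALUE_CUTOFF = 1e-18    # threshold quoted in the text
COVERAGE_CUTOFF = 0.35   # threshold quoted in the text


@dataclass
class DomainHit:
    """One HMM domain hit; the fields are hypothetical, for illustration only."""
    gene: str
    family: str
    evalue: float
    hmm_from: int     # first aligned position on the HMM profile
    hmm_to: int       # last aligned position on the HMM profile
    hmm_length: int   # total length of the HMM profile

    @property
    def coverage(self) -> float:
        # Assumption: coverage is the aligned fraction of the HMM profile.
        return (self.hmm_to - self.hmm_from + 1) / self.hmm_length


def passes_cutoffs(hit: DomainHit) -> bool:
    """Keep a hit only if it satisfies both the e-value and coverage thresholds."""
    return hit.evalue < EVALUE_CUTOFF and hit.coverage > COVERAGE_CUTOFF


# Made-up hits, not data from the CCB-MM1 genome.
hits = [
    DomainHit("gene_A", "GH5", 1e-40, 1, 250, 300),   # strong, well covered
    DomainHit("gene_B", "GH13", 1e-10, 1, 280, 300),  # e-value too weak
    DomainHit("gene_C", "GH16", 1e-25, 10, 60, 300),  # coverage too low
]

kept = [hit.gene for hit in hits if passes_cutoffs(hit)]
print(kept)
```

Requiring both conditions removes short partial matches as well as weak full-length ones, which is the usual motivation for combining an e-value cutoff with a coverage cutoff.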

Rod-coccus cell cycle
Members of Microbulbifer have been reported to exhibit a rod-coccus cell cycle in association with different growth phases [49]. This cell cycle was also observed in CCB-MM1. In the CCB-MM1 genome, we found genes that are known to be involved in determining and maintaining the rod shape of bacteria, including mreBCD [50] (AUP74_00016, AUP74_00017 and AUP74_00018), rodA [51] (AUP74_01706) and rodZ [52] (AUP74_01850). BLAST analysis showed that these genes are present in all other Microbulbifer genomes. In addition, we detected the general stress response gene bolA in all Microbulbifer genomes. It has been demonstrated that overexpression of bolA in E. coli inhibits cell elongation and reduces transcription of the mreBCD operon [53]. The mreB gene and its product, an actin homolog, have been studied for their functions in several bacterial species. This protein lies beneath the cell surface, forming actin-like cables that guide the synthesis of the longitudinal cell wall [54]. While MreB is not essential in E. coli [55], it is essential in Streptomyces coelicolor [56], Rhodobacter sphaeroides [57] and Bacillus subtilis [58]. In E. coli, depletion of MreB caused cells to change from a rod-like to a spherical shape, but these cells were able to survive [59]. In contrast, spherical B. subtilis cells eventually lyse. For CCB-MM1, the spherical cells do not lyse but grow into rods again after being transferred into fresh medium. We infer that the mreB gene may have important functions in determining Microbulbifer cell shape, and that the rod-coccus cycle of Microbulbifer is likely regulated by BolA through inhibition of mreB transcription when triggered by stress.

Secondary metabolites, ectoine
Ectoine and hydroxyectoine are compatible solutes found primarily in halophilic bacteria. When triggered by osmotic stress, bacteria produce and accumulate them intracellularly to balance the osmotic pressure [60]. Apart from osmotic stress, they also act as protectants against temperature stress [61]. A cluster of genes responsible for the biosynthesis of ectoine [62] has been identified in the CCB-MM1 genome using antiSMASH 3.0 [42]. These genes encode aspartate kinase (Ask_Ect) (AUP74_00280), L-ectoine synthase (EctC) (AUP74_00281), diaminobutyrate-2-oxoglutarate transaminase (EctB) (AUP74_00282), L-2,4-diaminobutyric acid acetyltransferase (EctA) (AUP74_00283) and an HTH transcriptional regulator (AUP74_00284). The lack of the ectoine hydroxylase gene, ectD, in the CCB-MM1 genome suggests that the strain can synthesize ectoine but not hydroxyectoine. Using BLASTP, we found a similar gene cluster in the other Microbulbifer genomes except that of Microbulbifer variabilis ATCC 700307T. While the reason for the absence of these genes in Microbulbifer variabilis ATCC 700307T is unknown, our findings suggest that Microbulbifer utilizes only ectoine instead of an ectoine/hydroxyectoine mixture. The transcriptional regulator of the ectoine operon, EctR, found in Methylophaga thalassica belongs to the MarR family [63]. The HTH transcriptional regulator (AUP74_00284) in CCB-MM1 also contains the conserved domain of the MarR family. This implies that the HTH transcriptional regulator is likely the transcriptional regulator of the ectoine operon in Microbulbifer. Ectoine has attracted considerable biotechnological interest due to its stabilizing effects, which extend from proteins [64] and nucleic acids [65] to whole cells [66]. Such properties allow it to be used as a cell protectant in skin care products [66], as a protein stabilizer [67] and, in medical applications, as a cryoprotectant in the cryopreservation of human cells [68].

Conclusion
In this study we presented the complete genome sequence of Microbulbifer sp. CCB-MM1, with a genome size of 3.86 Mb and a G + C content of 58.85%. We discussed some insights into its phenotypic characteristics from the genomic perspective, covering carbohydrate active enzymes, the rod-coccus cell cycle and the secondary metabolite ectoine. The genome sequence provides valuable information for the functional elucidation of novel enzymes, for both biotechnological applications and fundamental research.