Strategies for high-altitude adaptation revealed from high-quality draft genome of non-violacein producing Janthinobacterium lividum ERGS5:01

A light pink coloured bacterial strain, ERGS5:01, isolated from glacial stream water of the Sikkim Himalaya was affiliated to Janthinobacterium lividum based on 16S rRNA gene sequence identity and phylogenetic clustering. Whole genome sequencing was performed to confirm the taxonomy of the strain, as it lacked the typical violet pigmentation of the genus, and to decipher its survival strategy in the high-elevation aquatic ecosystem. PacBio RSII sequencing generated a genome of 5,168,928 bp with 4575 protein-coding genes and 118 RNA genes. Whole genome-based multilocus sequence analysis (MLSA) clustering, an in silico DNA-DNA hybridisation (DDH) similarity value of 95.1% and an average nucleotide identity (ANI) value of 99.25% established the identity of strain ERGS5:01 (MCC 2953) as a non-violacein producing J. lividum. Genome comparisons across the genus Janthinobacterium revealed an open pan-genome, with scope for the addition of new orthologous clusters to complete the genomic inventory. The genome provided the genetic basis for tolerance to freezing and frequent freeze-thaw cycles, and for industrially important enzymes. Extended insight into the genome provided clues to crucial genes associated with adaptation to the harsh aquatic ecosystem of high altitude.


Introduction
The genus Janthinobacterium was separated from the genus Chromobacterium (mesophilic, fermentative bacteria producing purple and violet colonies) to accommodate nonfermentative, psychrophilic bacteria producing violet colonies [1]. Hence, the most common feature of this genus is the production of the violet pigment violacein by psychrophilic bacteria [1]. However, there have also been reports of partly pigmented and non-pigmented bacteria within this genus [1,2]. In the present study, a light pink coloured bacterial strain, ERGS5:01, isolated from a glacial stream water sample in the Sikkim Himalaya was affiliated to Janthinobacterium lividum by 16S rRNA gene sequence identity and phylogeny. The lack of typical violet pigmentation prompted us to establish its taxonomic identity using whole genome sequencing. Multilocus sequence analysis (MLSA) using multiple concatenated housekeeping genes was applied to investigate the phylogenetic position of the strain within Janthinobacterium. This method has been widely used to resolve the taxonomic position of closely related prokaryotic species within a genus [3]. The availability of whole genome sequences of multiple strains further allowed in silico DNA-DNA hybridisation (DDH) and average nucleotide identity (ANI) analyses to confirm the taxonomic position of the strain with higher certainty [4].
The genus Janthinobacterium occurs widely, in soil, aquatic sites, marine habitats and high altitude environments, and has a notable ability to survive in and colonise new environments [5,6]. With the revolution in microbial genomics and analyses such as the pan-genome, it has become practical to compare many strains of a species or genus to obtain a complete inventory of genes [7]. We used the genome sequence of strain ERGS5:01 and other strains to study the genomic diversity within this genus. The bacterial strain was isolated from an aquatic ecosystem of a high altitude region (4718 masl) [8]. Organisms in such environments endure temperature fluctuations, strong ultraviolet-B radiation and low nutrient availability [9,10]. Bacterial cold-associated adaptive traits for withstanding such harsh conditions include proteins required to maintain the molecular central dogma and membrane fluidity at low temperature [11]. Other associated proteins are those that respond to osmotic, oxidative and cold stress [12]. The copy number of these proteins has often been reported to increase in cold-active organisms, raising the number of active sites to offset the lowered enzymatic rates at low temperatures [13]. In the present study, we present an extended genomic insight into strain ERGS5:01 to explore its taxonomic position and to identify proteins potentially important for its survival in the harsh environment of the high altitude aquatic ecosystem.

Classification and features
The East Rathong glacier falls in Survey of India toposheet no. 78A/2 within the Khangchendzonga National Park area in the Sikkim Himalaya. It lies between 27°33′ and 27°36′ N latitude and 88°04′ and 88°08′ E longitude in the West district of the state of Sikkim in India [14]. During the isolation of psychrotrophs for bioprospection, the aerobic chemoheterotrophic bacterial strain ERGS5:01 was isolated from a glacial stream located in the ablation zone of the East Rathong glacier at an altitude of 4718 masl [8]. The bacterium was isolated on ABM agar plates [peptone (0.5%, w/v), yeast extract (0.2%, w/v) and agar (2%, w/v)] [15] by incubating at 10°C for 15 days. ERGS5:01 is a gram-negative, aerobic bacterium with optimum growth at 10°C. The strain produced light pink colonies after 72 h of incubation at 15°C, 10°C and 4°C. The colonies were round, convex and entire. The bacterium could grow at a temperature range of 4-28°C, a NaCl concentration range of 1% to 4% and a pH range of 3-10 (Table 1). Scanning electron microscopy revealed the cells to be short rods with an average length of 0.8 to 1.1 μm (Fig. 1).

Extended feature descriptions
16S rRNA gene analysis
A sequence identity search of the 16S rRNA gene sequence (1341 bp; NCBI Accession No. KT766048) of strain ERGS5:01 against the database of type strains available in NCBI [16] showed the closest sequence identity, 99%, with J. lividum PAMC 25724. A Neighbor-Joining tree constructed using the Jukes-Cantor model of sequence evolution with 1000 bootstrap replications in Molecular Evolutionary Genetics Analysis (MEGA) version 7.0 [17] also clustered strain ERGS5:01 with J. lividum PAMC 25724 (Fig. 2).
Biochemical profiling, extracellular enzyme assay, freezing and freeze-thaw tolerance
The strain ERGS5:01 was tested for various biochemical activities such as catalase, oxidase, triple sugar iron, citrate utilisation, urease, indole, MR-VP, motility and carbohydrate utilisation (KB009 HiCarbohydrate™ kit, HiMedia). The strain was observed as gram-negative short rods, motile, non-fermentative, positive for oxidase, catalase and urease, and negative in the MR-VP test. Of the 35 sugars tested, the strain could utilise xylose, maltose, fructose, dextrose, raffinose, trehalose, o-nitrophenyl-β-D-galactoside and esculin, while it could not utilise lactose, galactose, melibiose, sucrose, L-arabinose, mannose, inulin, sodium gluconate, glycerol, dulcitol, inositol, sorbitol, mannitol, adonitol, arabitol, erythritol, α-methyl-D-glucoside, ribose, rhamnose, cellobiose, melezitose, α-methyl-D-mannoside, xylitol, D-arabinose, malonate and sorbose. The extracellular enzymatic activities of strain ERGS5:01, namely amylase, lipase, protease and cellulase, were analysed using standard plate assays at 10°C. The strain showed positive results for amylase, lipase and protease activities. Survival percentage for freezing and frequent freeze-thaw cycle tolerance was tested by the colony count method, taking the count on day 0 as 100%, as described by Shivaji et al. [15]. For freeze tolerance, 27 tubes of 1 ml culture were grown to stationary phase in ABM broth, and 24 of them were placed at −20°C. At each time point (1, 3, 5, 7, 9, 11, 13 and 15 days of freezing), three tubes were removed and thawed for 1 h at 10°C, and 100 μl was serially diluted in 900 μl of 0.9% saline. Three unfrozen tubes served as the zero time point. The diluted culture was plated on ABM agar and incubated for 3-5 days at 10°C. The mean of triplicate colony counts was used to determine the survival percentage, taking the cell count on day 0 as 100%. For freeze-thaw cycle tolerance, a procedure similar to that described for freezing tolerance was followed, but freezing and thawing were performed in continuous cycles (1, 3, 5, 7, 9, 11, 13 and 15 cycles).
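The survival calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the function names, the plated volume of 0.1 ml and all colony counts are hypothetical, and only the rule "mean of triplicates relative to the day 0 count taken as 100%" comes from the text.

```python
from statistics import mean


def cfu_per_ml(colony_count, dilution_factor, plated_volume_ml=0.1):
    """Back-calculate viable cells per ml from a plate count.

    plated_volume_ml is an assumption; the text does not state the plated volume.
    """
    return colony_count * dilution_factor / plated_volume_ml


def survival_percentage(day0_counts, timepoint_counts):
    """Mean count at a time point relative to the mean day-0 count (taken as 100%)."""
    baseline = mean(day0_counts)
    if baseline <= 0:
        raise ValueError("day-0 count must be positive")
    return 100.0 * mean(timepoint_counts) / baseline


# Hypothetical triplicate plate counts at a 10^6-fold dilution, for illustration only.
day0 = [cfu_per_ml(c, 10**6) for c in (210, 190, 200)]
day5 = [cfu_per_ml(c, 10**6) for c in (150, 160, 140)]
print(f"Survival after 5 days of freezing: {survival_percentage(day0, day5):.1f}%")
```

Each 100 μl into 900 μl transfer is a tenfold dilution step, so the dilution factor passed to the hypothetical `cfu_per_ml` helper is 10 raised to the number of steps.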

Genome sequencing information
Genome project history
The whole genome of strain ERGS5:01 was sequenced owing to its lack of the violet pigmentation typical of the genus Janthinobacterium and its ability to survive in the harsh aquatic ecosystem of a high altitude region. The work was carried out as part of a project to understand the genetic basis of survival of psychrotrophs from the East Rathong Glacier in the Sikkim Himalaya and their bioprospection. The sequencing was completed at CSIR-Institute of Himalayan Bioresource Technology, Palampur using the PacBio RS II platform (Microsynth AG, Switzerland). The draft genome has been deposited in GenBank under the accession MAQB00000000, and the version described in this paper is MAQB02000000.
The project summary with minimum information about a genome sequence [18] is shown in Table 2.

Growth conditions and genomic DNA preparation
The strain ERGS5:01 was routinely grown at 10°C on ABM agar. Genomic DNA from the strain was extracted using the GenElute™ Bacterial Genomic DNA Kit (Sigma-Aldrich, US). The quality and quantity of the genomic DNA obtained were evaluated using 1% agarose gel electrophoresis and a Qubit 2.0 Fluorometer (Invitrogen, USA).

Genome sequencing and assembly
Genomic DNA (10 μg) was sheared using g-TUBE™ (Covaris, US), and a DNA library with 10 kb insert size was prepared using the PacBio SMRTbell library preparation kit v1.0 [6]. The prepared library was quantified using a Qubit 2.0 Fluorometer (Invitrogen, USA). Sequencing was performed using the PacBio RSII system (Pacific Biosciences, US) as described previously [19,20]. The generated subreads were assembled de novo using the RS hierarchical genome assembly process protocol version 3.0 (HGAP.3) in SMRT Analysis version 2.3.0 (Pacific Biosciences, US).

Genome annotation
Annotation of the high-quality draft genome was performed using the JGI Prokaryotic Automatic Annotation Pipeline [21], with additional analysis and manual review done within the IMG platform [22,23]. The functions of the predicted protein-coding genes and genes with Pfam domains were assigned using the Interpro platform [24]. Genes were assigned to COGs by searching against the COG database (from the NCBI conserved domain database [25]) using rpsblast with an E-value threshold of 0.0001. BLASTclust with thresholds of 70% covered length and 30% sequence identity was used to obtain the number of genes in internal clusters [26]. Signal peptides and transmembrane helices were predicted using SignalP [27] and TMHMM [28], respectively. The CRISPR database was used to identify CRISPR repeats in the genome [29].

Genome properties
The genome of strain ERGS5:01 was assembled into 16 contigs with a total length of 5,168,928 bp and a G + C content of 60.48% (N50 contig length of 3,372,370 bp with an average reference coverage of 38.09X). A total of 4693 genes were predicted, of which 4575 were protein-coding genes and 118 were RNA genes (25 rRNAs, 90 tRNAs and three non-coding RNAs); 600 pseudogenes were also identified (Table 3). The circular chromosomal map of the draft genome, generated using ClicO FS, an online service based on Circos [30], is presented in Fig. 4. From the COG database, 2559 genes were assigned to biological functions, and 3160 genes (67.33%) were assigned to protein families. Table 3 summarises the genome properties and statistics, and Table 4 presents the distribution of genes into COG functional categories.

Insights from the genome sequence and comparative genomics
The strain ERGS5:01 appeared sister to J. lividum PAMC 25724 based on its 16S rRNA gene sequence identity and phylogeny (Fig. 2). However, 16S rRNA gene sequence identities with other species of Janthinobacterium were also above the threshold value (> 98.7%) recommended for species identity by Meier-Kolthoff et al. [31] (Table 5). The insufficiency of 16S rRNA genes in resolving species for many genera [3] led us to further explore the phylogenetic position of strain ERGS5:01 using six housekeeping genes, namely rpoB, aroE, gmk, recA, gyrB and tpi. These genes were retrieved from the whole genome sequences of strain ERGS5:01 and 20 other Janthinobacterium strains. Multiple alignments were performed using MAFFT, statistics for each locus were summarised using MEGA 7, and a phylogenetic tree of the six concatenated housekeeping genes was constructed using the maximum likelihood method based on the JTT matrix-based model in MEGA 7 [32].
The neighbour-joining tree constructed with the six concatenated housekeeping genes for MLSA agreed with the tree generated by the maximum likelihood method described above (Additional file 1: Figure S1). The MLSA clustering revealed monophyly of strain ERGS5:01 and J. lividum PAMC 25724 (Cluster II), coherent with the 16S rRNA phylogeny (Fig. 5). This group formed a sister clade to the other J. lividum strains (Cluster I) with strong bootstrap support of 98% (Fig. 5). Such distinct separation among J. lividum strains prompted us to carry out exhaustive automatic BLAST searches as well as manual searches for the vioABCDE operon genes in the genomes available from J. lividum strains. Interestingly, the genes for production of the distinctive pigment violacein (from which the genus Janthinobacterium derives its name) were absent in both strains of cluster II, whereas all the strains in cluster I contained the vioABCDE operon (Fig. 5). Hence, the separation of the two clusters among J. lividum strains reflected possession of the vioABCDE operon. We then performed whole genome sequence-based in silico DDH using the online genome-to-genome calculator with the GGDC 2.0 BLAST+ model [33], and ANI with the Perl script [34], comparing the nucleotide FASTA sequence of each genome against the genome of strain ERGS5:01 as the reference. The observed DDH value was 95.15% and the ANI value was 99.25% between strain ERGS5:01 and PAMC 25724 (Table 5 and Additional file 2: Table S1). Both values exceed the cut-off values for the species boundary [33,34], and hence the results were consistent with the MLSA clustering of strain ERGS5:01 with PAMC 25724. In a recent study, strain PAMC 25724 was reported as a strain of the species J. lividum [35]; it belongs to a validly published species, and its culture is available from the Polar and Alpine Microbial Collection under accession number 25724.
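The species-boundary comparison above amounts to checking two similarity values against their cut-offs. The sketch below illustrates this with the values reported for strain ERGS5:01 versus PAMC 25724. The cut-offs of 70% for in silico DDH and 95% for ANI are assumptions based on commonly cited thresholds; the text refers to [33,34] for the cut-offs without stating them, and the function name is hypothetical.

```python
# Assumed, commonly cited species-boundary cut-offs (not stated in the text).
DDH_CUTOFF_PERCENT = 70.0
ANI_CUTOFF_PERCENT = 95.0


def within_species_boundary(ddh_percent, ani_percent,
                            ddh_cutoff=DDH_CUTOFF_PERCENT,
                            ani_cutoff=ANI_CUTOFF_PERCENT):
    """Return True when both DDH and ANI meet or exceed their cut-offs."""
    return ddh_percent >= ddh_cutoff and ani_percent >= ani_cutoff


# Values reported in the text for strain ERGS5:01 versus J. lividum PAMC 25724.
print(within_species_boundary(ddh_percent=95.15, ani_percent=99.25))
```

Requiring both criteria to pass is a conservative design choice for this illustration; a single failing value would flag the pair for closer inspection.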
Fig. 5 Multilocus sequence analysis (MLSA) clustering-based phylogenetic tree of six concatenated housekeeping genes derived from the whole genome sequences of Janthinobacterium strains. The tree was constructed using the maximum likelihood method based on the JTT matrix-based model in MEGA7. Bootstrap values over 50% (1000 replications) are shown at each node. All positions containing gaps and missing data were eliminated. Among J. lividum, two clusters were formed; cluster I showed the presence of violacein-producing genes whereas cluster II lacked them.

The genome-wide amino-acid analysis of all 27 psychrotolerant Janthinobacterium strains revealed broad similarities in usage profiles, with Ala, Leu, Gly and Val as the most frequently used amino acids. The ultra-fast computational pipeline Bacterial Pan Genome Analysis Tool (BPGA) [46] was used to assess all 27 genomes for comprehensive pan-genome studies based on the power law model. The pan-genome curve fits a power law function with an exponent of 0.447968, indicating that the pan-genome of the genus Janthinobacterium is open (Additional file 5: Figure S3). An exponent greater than zero, and hence an open pan-genome, corresponds to an incomplete gene inventory with scope for the addition of new orthologous clusters [7,47]. The pan-genome (complete gene family set) comprised 21,349 orthologous gene clusters, of which 1066 (~5%) formed the core genome. The core genome represents the gene families shared by all 27 Janthinobacterium genomes. All the strains reported under the genus were psychrotolerant. Interestingly, the core genome included various categories of genes associated with cold adaptation (namely two-component histidine kinase, cold-shock proteins, cold-active chaperone, DNA repair, carbon storage/starvation, membrane/cell wall alteration and oxidative stress) (Additional file 6: Table S3).
Among the core genes of the 27 strains, ~95% could be assigned to COG categories (Additional file 7: Table S4). The highest percentage of these genes (17.5%) was associated with signal transduction mechanisms (Additional file 7: Table S4). Likewise, a recent report on Pseudoalteromonas haloplanktis TAC125 discussed the role of a major stimulus signalling transduction cascade, the TCS histidine kinases, in bacterial adaptation to cold and deep water [48]. The total number of accessory genes observed was 20,283, which includes the species-specific unique genes ranging from 0 to 2860 genes (Additional file 5: Figure S3). The open pan-genome, together with the variation in the number of unique genes among strains, strongly supports high genomic diversity within the genus Janthinobacterium. Notably, strain CG23_2, which showed the maximum number of unique genes, also has the largest genome in the genus Janthinobacterium [45]. All reported strains are psychrotolerant and have a diverse habitat range, consistent with the diversity in genomic structure revealed by the pan-genome analysis.
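To illustrate what a power law fit of the pan-genome curve involves, the following sketch fits pan-genome size = a · N^b by least squares on log-transformed values and classifies the pan-genome as open when b > 0, as stated above. It is an assumption-laden illustration rather than the BPGA implementation: the log-log regression is a swapped-in simplification of whatever fitting routine the tool uses, the function names are hypothetical, and the data points are synthetic, generated from the reported exponent (0.447968) and scaled so that 27 genomes give 21,349 clusters.

```python
import math


def fit_power_law(genome_counts, pan_sizes):
    """Fit pan_size = a * n**b by ordinary least squares on log-log values."""
    xs = [math.log(n) for n in genome_counts]
    ys = [math.log(s) for s in pan_sizes]
    k = len(xs)
    mean_x = sum(xs) / k
    mean_y = sum(ys) / k
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def pan_genome_is_open(exponent):
    """Per the text, an exponent greater than zero indicates an open pan-genome."""
    return exponent > 0


# Synthetic curve for illustration only; real input would be the observed
# pan-genome size after adding each successive genome.
REPORTED_EXPONENT = 0.447968
TOTAL_CLUSTERS = 21349
N_GENOMES = 27
scale = TOTAL_CLUSTERS / N_GENOMES ** REPORTED_EXPONENT
counts = list(range(1, N_GENOMES + 1))
synthetic_sizes = [scale * n ** REPORTED_EXPONENT for n in counts]

a, b = fit_power_law(counts, synthetic_sizes)
print(f"a = {a:.1f}, b = {b:.6f}, open = {pan_genome_is_open(b)}")
```

The reported counts are also internally consistent: 21,349 total clusters minus 1066 core clusters gives the 20,283 accessory genes, and 1066 / 21,349 is approximately 5%.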

Extended genomic insights into adaptation to the high altitude aquatic environment
Exhaustive data mining across the genome of strain ERGS5:01 was carried out to identify genes potentially responsible for its survival in the aquatic high altitude environment. Multiple copies of genes for cold adaptation and other stress response proteins were observed, as discussed below.
Two-component systems (TCS) histidine kinase and signal transduction pathways
TCS are widespread in bacteria and are used for monitoring and adapting to changes in their extracellular or intracellular environment. Various chemical and physical stimuli, including pH, temperature and oxidative stress, induce differential expression of TCSs in bacteria [49]. Furthermore, the two-component histidine kinase system has been reported to play a role in bacterial survival in the cold [50,51]. The genome of strain ERGS5:01 contained 58 copies of such TCS genes. A report on blockage of the cold-sensitive secretion pathway in E. coli revealed the critical role of the signal peptide/secretion route for growth at low temperature in an aquatic environment [50]. Flagellin-specific chaperone (FliS), which binds to flagellin and facilitates bacterial transport, was also observed in strain ERGS5:01. This observation further supports the presence of the signal transduction and secretory pathways essential for survival in cold aquatic conditions.
Pigmentation
Pigmentation of bacteria is reported to play an important role in cold and radiation adaptation [7]. The strain ERGS5:01 lacked the usual violet pigment of the genus, and accordingly no violacein-producing genes (vioA, vioB, vioC, vioD and vioE) were observed in the genome. However, it produced a light pink pigment, which prompted us to explore the genome for genes involved in the carotenoid/terpenoid biosynthesis pathway. Two copies of phytoene synthase genes and one copy each of phytoene desaturase, phytoene dehydrogenase, lycopene beta-cyclase, octaprenyl diphosphate synthase and dimethyl alanine transferase were observed. The presence of carotenoid/terpenoid biosynthesis pathway genes may help this strain tolerate UV-B radiation, maintain homeostasis during temperature fluctuations and adapt to the harsh conditions of glacial ecosystems [9]. Genes such as UvrD helicase (5 copies), UvrABC helicase (1 copy) and UvrB/UvrC (1 copy) were also observed, which may protect against UV damage.
Oxidative stress response
High exposure to UV radiation damages bacteria surviving in extreme high altitude conditions by generating free radicals [52]. An increase in oxidative stress in Pseudomonas fluorescens MTCC 667 grown at low temperature was reported, as indicated by enhanced levels of free radicals and of enzyme activities (namely superoxide dismutase and catalase) [53]. Elevated activity of another antioxidant enzyme, thioredoxin reductase, was also reported in Listeria monocytogenes growing at 10°C as compared to a reference culture grown at 37°C [54]. We observed numerous copies of putative oxidase genes in strain ERGS5:01, which can lead to the production of large quantities of intrinsic H2O2 and other reactive oxygen species. The genome includes multiple copies of the thioredoxin (10), peroxiredoxin (5), alkyl hydroperoxide reductase (3) and organic hydroperoxide reductase (2) genes, and one copy each of the thioredoxin reductase, superoxide dismutase and catalase-peroxidase genes.
DNA repair and cold-shock chaperones
It has been well reviewed and demonstrated that cold-shock proteins (CSPs) are strongly induced in bacteria in response to a rapid decrease in growth temperature [55-57]. CSPs are involved in RNA metabolism, where they prevent secondary structure formation and facilitate degradation of structured RNA, hence functioning as RNA chaperones. Two copies of CSP genes were observed in strain ERGS5:01, which potentially assist in tuning RNA metabolism during cold adaptation. Increased expression of the HtpG and GroEL genes has been observed in response to low temperatures in the cyanobacterial strain Synechococcus sp. PCC 7942 [58]. Similarly, strain ERGS5:01 possesses a single copy each of the HtpG, GroEL and GroES genes, which may be involved in acclimation to low temperatures.
Horizontal gene transfer (HGT) supporting adaptation to cold
Numerous bacteria, unlike eukaryotes, have acquired a significant portion of their DNA from distantly related organisms [59]. Such acquisitions have been reported to be prevalent in prokaryotic genomes with a low frequency of recombination and have greatly increased genomic diversity, enabling bacteria to adapt to and colonise extreme and hostile conditions [60]. This phenomenon prompted us to investigate the occurrence of any such horizontally acquired genes in strain ERGS5:01 that may help in its adaptation to extreme conditions. All genes of strain ERGS5:01 were queried against a locally constructed database of 19 other Janthinobacterium genomes using BLAST with an E-value threshold of 1e-15. The BLAST results provided a list of 12 genes with no match against any available Janthinobacterium genome, indicating possible HGT (Table 6). The G + C compositions of these genes were also indicative of HGT, as they varied from the usual 60% G + C of the genus Janthinobacterium (Table 6) [60,61]. The HGT-acquired genes include two copies of the glycosyl transferase family, which participates in peptidoglycan biosynthesis and is involved in providing the protective shell around bacterial cell membranes and in cell elongation and cell division [62]. This enzyme has been reported to have increased expression in Shewanella oneidensis at low temperature [63] and might be considered one of the crucial genes in strain ERGS5:01 for cold adaptation. Another important gene, encoding the tellurium resistance protein TerC, a general stress response protein, was also observed. Although some other Janthinobacterium genomes also possess genes encoding TerC, the gene observed in strain ERGS5:01 had a different amino-acid composition that was more closely related to Variovorax paradoxus, with a sequence similarity of 84%.
Further studies are necessary to ascertain their specific role(s) for cold adaptation in strain ERGS5:01.
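The two lines of evidence used above for HGT candidates (no BLAST match in the other Janthinobacterium genomes, and a G + C content that departs from the genomic background) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it assumes BLAST results in the standard 12-column tabular format (-outfmt 6, E-value in the eleventh column), uses the E-value threshold of 1e-15 and the genomic G + C content of 60.48% from the text, and introduces a hypothetical deviation threshold of 5 percentage points. All function names, gene identifiers and sequences are invented for the example.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases of a nucleotide sequence."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def genes_without_hits(all_gene_ids, blast_tabular_lines, max_evalue=1e-15):
    """Return the query genes with no hit at or below max_evalue.

    Expects BLAST tabular output (outfmt 6): query ID in column 1,
    E-value in column 11.
    """
    genes_with_hits = set()
    for line in blast_tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if float(fields[10]) <= max_evalue:
            genes_with_hits.add(fields[0])
    return [gene for gene in all_gene_ids if gene not in genes_with_hits]


def deviates_from_genome_gc(sequence, genome_gc=60.48, min_difference=5.0):
    """Flag a gene whose G + C content differs from the genome average.

    min_difference (percentage points) is a hypothetical threshold.
    """
    return abs(gc_content(sequence) - genome_gc) >= min_difference


# Invented example data, for illustration only.
blast_lines = [
    "geneA\tsubject1\t98.0\t300\t2\t0\t1\t300\t1\t300\t1e-50\t500",
    "geneB\tsubject2\t40.0\t80\t30\t2\t1\t80\t5\t84\t1e-05\t45",
]
candidates = genes_without_hits(["geneA", "geneB", "geneC"], blast_lines)
print(candidates)
print(deviates_from_genome_gc("ATATATATGC"))
```

In this example geneB is retained as a candidate because its only hit is weaker than the threshold, and geneC because it has no hit at all.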

Conclusion
The Sikkim Himalaya possesses untapped microbial resources with tremendous scope for bioprospection [64]. Strain ERGS5:01 is one such light pink pigmented bacterium, identified as J. lividum. The taxonomic identity of the strain remained uncertain as it lacked the violet pigmentation typical of the genus Janthinobacterium. Whole genome sequencing of the strain was performed owing to the discordance between its unusual pigmentation and its taxonomy, and to its survival in a harsh aquatic ecosystem. A high-quality draft genome of 5.17 Mb was generated and deposited at GenBank under accession No. MAQB00000000. MLSA clustering allowed better phylogenetic resolution, while genome-based in silico DDH and ANI supported the clustering and confirmed the identity of the strain as a non-violacein producing J. lividum. Further, strain ERGS5:01 was studied for biochemical and physiological features related to its adaptation strategies, such as freeze and freeze-thaw tolerance. The comparative pan-genome analysis revealed an open pan-genome, with scope for the addition of new orthologous clusters to complete the gene inventory of Janthinobacterium, and the variation in the number of unique genes among strains strongly supported high genomic diversity within this genus. The genome of strain ERGS5:01 provided a genetic basis for its tolerance to freezing and frequent freeze-thaw cycles and for the presence of industrially important enzymes. Extended genomic insights further provided a glimpse of crucial genes likely to be associated with strategies for adapting to the harsh environment of high elevation.