Draft genome sequence of Mycobacterium rufum JS14T, a polycyclic-aromatic-hydrocarbon-degrading bacterium from petroleum-contaminated soil in Hawaii

Mycobacterium rufum JS14T (=ATCC BAA-1377T, CIP 109273T, JCM 16372T, DSM 45406T), the type strain of the species Mycobacterium rufum sp. nov., belonging to the family Mycobacteriaceae, was isolated from polycyclic aromatic hydrocarbon (PAH)-contaminated soil in Hilo (HI, USA) because of its capability of degrading PAHs. Here, we describe the first genome sequence of strain JS14T, together with brief phenotypic characteristics. The genome is composed of 6,176,413 bp with 69.25 % G + C content and contains 5810 protein-coding genes and 54 RNA genes. The genome information on M. rufum JS14T will provide a better understanding of the complexity of bacterial catabolic pathways for degradation of specific chemicals.


Introduction
Polycyclic aromatic hydrocarbons, defined as organic molecules consisting of two or more fused aromatic rings in linear, angular, or cluster arrangement, mostly result from coke production, petroleum refining, fossil fuel combustion, and waste incineration [1]. Although the physical and chemical properties of PAHs vary depending on the number of rings, the characteristics such as hydrophobicity, recalcitrance, and mutagenic and carcinogenic potentials have been considered the main factors for the toxic effects on environmental ecosystems and human beings [1,2].
For removal of PAHs from contaminated environments, the bioremediation process based on microbial activities has attracted interest and has been actively studied [3]. Various bacteria, such as Sphingomonas spp., Pseudomonas spp., Rhodococcus spp., Burkholderia spp., and Mycobacterium spp., have been investigated regarding whether they can metabolize PAHs. In particular, several Mycobacterium species have been reported to effectively degrade high-molecular-weight PAHs [4,5]. Moreover, genomic studies on these bacterial species have contributed to the understanding of whole regulatory mechanisms of bacterial PAH degradation, for example for M. vanbaalenii PYR-1 [6], M. gilvum Spyr1 [7], and M. gilvum PYR-GCK [8] as well as the most recently reported M. aromaticivorans JS19b1T [9].
M. rufum JS14T (=ATCC BAA-1377T, CIP 109273T, JCM 16372T, DSM 45406T) is the type strain of the species Mycobacterium rufum sp. nov. [10]. This bacterium was isolated from petroleum-contaminated soil at a former oil gasification company site in Hilo (HI, USA). The bacterium was identified because of its PAH degradation activities, especially toward the four-ring-fused compound fluoranthene [11]. Although the PAH-degrading ability has been demonstrated through metabolic and proteomic assays [12], genetic studies on the whole bacterial system with a PAH degradation pathway have not been conducted. Here, we present a brief summary of the characteristics of this strain and a genetic description of its genome sequence.

Classification and features
The 16S ribosomal RNA gene sequence of M. rufum JS14T was compared with those from other Mycobacterium species using NCBI BLAST [13]. The highest similarity was found with M. chlorophenolicum PCP-1 (99 % identity) [14,15], followed by M. gilvum Spyr1 (99 % identity) [7], M. gilvum PYR-GCK (99 % identity) [8], M. vanbaalenii PYR-1 (98 % identity) [16], and M. fluoranthenivorans FA4T (97 % identity) [17]. Species identified by the BLAST search and represented by full-length 16S rRNA gene sequences were included in the phylogenetic analysis. The phylogenetic tree was generated by the neighbor-joining method [18] with 1000 bootstrap replicates. The consensus phylogenetic neighborhood of M. rufum JS14T within the genus Mycobacterium is shown in Fig. 1.
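As an illustration of the two quantities used in this comparison, the following minimal sketch computes a percent identity between two already-aligned sequences and draws one bootstrap replicate by resampling alignment columns. It is not the BLAST, ClustalW, or MEGA implementation: BLAST reports identity over its own local alignment, and the function names and the toy sequences here are assumptions made only for this example.

```python
import random
from typing import List


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where both sequences carry a residue.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def bootstrap_columns(alignment: List[str], rng: random.Random) -> List[str]:
    """Return one bootstrap replicate: alignment columns resampled with replacement."""
    n_cols = len(alignment[0])
    chosen = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[c] for c in chosen) for seq in alignment]


# Toy aligned sequences (hypothetical, not real 16S rRNA data).
toy = ["ACGTACGT", "ACGTACGA", "ACG-ACGT"]
replicate = bootstrap_columns(toy, random.Random(0))
```

In a bootstrap analysis, a tree would be rebuilt from each of the 1000 replicates, and the support for a branch is the fraction of replicate trees that contain it.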
M. rufum JS14T is a non-motile, aerobic, Gram-positive bacterium belonging to the family Mycobacteriaceae [10]. The cell shape is medium-to-long thin rods, and cell size is approximately 1.0-2.0 μm in length with the width of 0.4-0.6 μm as shown in Fig. 2. Generally, large, round, raised, smooth orange-pigmented colonies form within 7 days [10]. As one of the rapidly growing members of the genus Mycobacterium, the strain grows optimally at 28 °C, reduces nitrate, but does not tolerate salinity (over 2.5 % NaCl, w/v) [10]. Strain JS14T shows positive reactions in tests for catalase, α-glucosidase, aesculin hydrolysis, and urease, but negative reactions regarding β-glucuronidase, β-galactosidase, N-acetyl-β-glucosaminidase, gelatin hydrolysis, alkaline phosphatase, and pyrrolidonyl arylamidase activities [10]. Substrate oxidation was noticed for Tween 40, Tween 80, D-gluconic acid, D-glucose, D-fructose, D-xylose, D-mannose, D-psicose, trehalose, dextrin, glycogen, and D-mannitol, but not for α-/β-cyclodextrin, D-galactose, α-D-lactose, maltose, sucrose, mannan, or maltotriose [10]. When cultured in the minimal medium (per liter: 8.8 g of Na2HPO4·2H2O, 3.0 g of KH2PO4, 1.0 g of NH4Cl, 0.5 g of NaCl, 1. [...] concentration of 40 mg/L), M. rufum JS14T showed an effective degrading action on the added compound by utilizing it completely during 10 days as a sole source of carbon and energy [11].
Fig. 1 Phylogenetic tree showing the position of M. rufum JS14T [10] (shown in boldface with an asterisk) relative to the other species within the genus Mycobacterium. In this genus, species carrying the full length of 16S rRNA gene sequence were selected from the NCBI database [45]. The collected nucleotide sequences were aligned using ClustalW [46], and the phylogenetic tree was constructed using software MEGA version 6 [47] by the neighbor-joining method with 1000 bootstrap replicates [18].
Table footnote (fragment): [...] not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [42].

Genome project history
M. rufum JS14T was selected for sequencing because of its effective ability to degrade PAHs, as a model organism for bacteria that degrade recalcitrant organic pollutants. The genome sequencing was performed in September 2014, and the Whole Genome Shotgun project was deposited in the DDBJ/EMBL/GenBank databases under the accession number JROA00000000. The version described in this study is the first version, labeled JROA00000000.1. The sequencing project information and its compliance with the Minimum Information about a Genome Sequence (MIGS) version 2.0 [22] are described in Table 2.

Genome sequencing and assembly
The genome of M. rufum JS14T was sequenced using the single-molecule real-time (SMRT) DNA sequencing platform on the Pacific Biosciences RS II sequencer with P5 polymerase-C3 sequencing chemistry (Pacific Biosciences, Menlo Park, CA) [23]. A 20-kb insert SMRT-bell library was prepared from the sheared genomic DNA and loaded onto two SMRT cells. During the single 180-min run, 300,584 reads comprising 1,020,750,498 bases were generated. Reads of less than 100 bp or with low accuracy (below 0.8) were removed. After filtering, 111,515 reads comprising 823,795,879 bases remained, with a read quality of 0.831. All post-filtered reads were assembled de novo using the RS hierarchical genome assembly process, version 3.3, in SMRT analysis software, version 2.2.0 (Pacific Biosciences) [24], which resulted in 4 contigs corresponding to 4 scaffolds, with 113.03-fold coverage. The maximal contig length and the N50 contig length were both 5,760,162 bp. The whole genome was found to be 6,176,413 bp long.
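The read filter and the N50 statistic described above can be sketched as follows. This is a minimal illustration, not the SMRT analysis implementation. The read tuples are assumed to be (length, accuracy) pairs, and the lengths of the three smaller contigs are hypothetical placeholders chosen only so that the four contigs sum to the reported 6,176,413 bp; only the largest contig length is taken from the text.

```python
from typing import Iterable, List, Tuple

Read = Tuple[int, float]  # (read length in bp, read accuracy between 0 and 1)


def filter_reads(reads: Iterable[Read],
                 min_length: int = 100,
                 min_accuracy: float = 0.8) -> List[Read]:
    """Drop reads shorter than min_length bp or with accuracy below min_accuracy."""
    return [(length, acc) for length, acc in reads
            if length >= min_length and acc >= min_accuracy]


def n50(contig_lengths: Iterable[int]) -> int:
    """Length of the contig at which the running sum, taken from the longest
    contig downward, first reaches half of the total assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs given")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length
    return lengths[-1]  # not reached for positive lengths


# Largest contig from the text; the other three lengths are hypothetical.
contigs = [5_760_162, 200_000, 116_251, 100_000]
assert sum(contigs) == 6_176_413
print(n50(contigs))  # 5760162
```

Because the largest contig alone covers more than half of the assembly, the N50 equals the maximal contig length regardless of how the remaining 416,251 bp are divided among the other three contigs.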

Genome annotation
The protein-coding sequences were predicted by the NCBI Prokaryotic Genome Annotation Pipeline, version 2.8 (rev. 447580) [25]. Additional gene prediction and functional annotation were performed in the Rapid Annotation using Subsystems Technology server [26] and the Integrated Microbial Genomes-Expert Review (IMG-ER) pipeline [27], respectively.

Genome properties
The genome size of M. rufum JS14T was found to be 6,176,413 bp with an average G + C content of 69.25 %. The genome was predicted to contain a total of 5864 genes, comprising 5810 protein-coding genes and 54 RNA genes (6 rRNAs, 47 tRNAs, and 1 ncRNA). Of these, 4498 genes were assigned to putative functions, and 3669 genes (approximately 62.57 %) were assigned to the COG functional categories. The genome statistics are presented in Table 3, and a circular map of the genome is shown in Fig. 3. The gene distribution within the COG functional categories is presented in Table 4.
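The gene counts above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers reported in this section; the variable names are chosen for illustration. It shows that the quoted 62.57 % corresponds to the COG-assigned genes as a fraction of all 5864 predicted genes.

```python
# Counts reported in the text.
protein_coding = 5810
rna_genes = {"rRNA": 6, "tRNA": 47, "ncRNA": 1}
cog_assigned = 3669

# Total genes = protein-coding genes + RNA genes.
total_rna = sum(rna_genes.values())
total_genes = protein_coding + total_rna

# Share of all predicted genes assigned to COG categories, in percent.
cog_percent = round(100 * cog_assigned / total_genes, 2)

print(total_rna, total_genes, cog_percent)  # 54 5864 62.57
```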

Insights from the genome sequence
Regarding the specific degradation capability toward the four-aromatic-ring-fused compound, fluoranthene [10-12], the genome of M. rufum JS14T was found to contain corresponding genes encoding proteins for the aromatic-compound degradation.
Fig. 3 A graphical circular map of the M. rufum JS14T genome. The circular map was generated using the BLAST Ring Image Generator software [68]. From the inner circle to the outer circle: genetic regions, GC content (black), and GC skew (purple/green), respectively.
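The GC content and GC skew tracks named in the Fig. 3 caption are conventionally computed per window along the sequence, with GC skew defined as (G - C)/(G + C). The following minimal sketch shows that calculation; it is not the BLAST Ring Image Generator code, and the window size, step, and toy sequence are assumptions for illustration only.

```python
from typing import Iterator, Tuple


def gc_content(seq: str) -> float:
    """Percent G + C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def gc_skew(seq: str) -> float:
    """GC skew, (G - C) / (G + C); 0.0 if the window has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if (g + c) else 0.0


def windows(seq: str, size: int, step: int) -> Iterator[Tuple[int, str]]:
    """Yield (start, subsequence) for full-length windows along seq."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


# Toy sequence (hypothetical), two non-overlapping windows of 4 bp.
toy = "GGGCCCAT"
track = [(start, gc_content(w), gc_skew(w)) for start, w in windows(toy, 4, 4)]
```

For a real genome the window would typically span thousands of bases, and the sign of the skew is what the two colors of such a track distinguish.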
Research on PAH-degrading bacteria holds great promise for biotechnological applications in the decontamination of pollutants [10]. In this regard, understanding PAH degradation by indigenous microbes is important for evaluating the ecological effects of these microbes [31]. On the Hawaiian Islands, PAH contamination has occurred through various activities such as the petroleum industry, waste incineration, and fossil fuel combustion, and even via natural causes such as volcanic activity [10]. Mycobacterium is a well-known genus capable of mineralizing PAHs [12]. In view of the delicate Hawaiian island ecosystem, several native bacteria belonging to the genus Mycobacterium were isolated, and M. rufum JS14T is one of them [10]. One of the native isolates from the petroleum-contaminated Hawaiian soil in Hilo (HI, USA), M. aromaticivorans JS19b1T [10], is known to have rapid degrading capabilities toward various PAHs such as fluorene, phenanthrene, pyrene, and fluoranthene [10,11,29]. Similarly, M. rufum JS14T was found to be an effective degrader of the four-aromatic-ring-fused compound fluoranthene, but it showed no degrading capacity toward other high-molecular-weight PAHs (e.g., pyrene, benzo[a]pyrene) or toward low-molecular-weight PAHs (e.g., fluorene, phenanthrene) [11,12]. The gene annotation profiles for the genome of M. rufum JS14T may provide important clues to the identity of the whole metabolic pathway for fluoranthene degradation. Like a recent functional pangenome analysis of PAH-degrading members of the genus Mycobacterium [32], our data can also help to explain the complexity of bacterial catabolic pathways for degradation of specific chemicals, from the standpoint of microbial ecology.
Table footnote: The total is based on the total number of protein-coding genes in the annotated genome.

Conclusions
M. rufum JS14T was isolated from PAH-contaminated soil of a former oil gasification company site in Hilo (HI, USA) and was designated as a novel species that was named Mycobacterium rufum (ru'fum. L. neut. adj. rufum ruddy or red, pertaining to the colony pigmentation of the type strain) [10]. In this study, we presented the genome sequence of the strain. This genetic information may provide new insights that will help to extend the application potential of bacterial bioremediation of various toxic compounds and to elucidate the features of metabolic degradation pathways for PAHs.
Table footnote (fragment): Comparison of the selected five genome sequences was conducted using function profile categories in the IMG-ER pipeline [27], [...]