Complete genome sequence of lytic bacteriophage RG-2014 that infects the multidrug resistant bacterium Delftia tsuruhatensis ARB-1

A lytic bacteriophage, RG-2014, that infects the biofilm-forming, multidrug resistant bacterium Delftia tsuruhatensis strain ARB-1 was isolated from a full-scale municipal wastewater treatment plant. The phage was isolated in order to develop phage-based therapeutic approaches against Delftia tsuruhatensis strain ARB-1. Strain ARB-1 belongs to the Comamonadaceae family of the Betaproteobacteria class. RG-2014 has a burst size of 150 ± 9 PFU/cell, a latent period of 10 min and an eclipse period of <5 min. The phage was found to be a dsDNA virus belonging to the Podoviridae family. It has an isometric, icosahedral capsid with a diameter of 85 nm. The complete genome of the isolated phage was sequenced and determined to be 73.8 kbp in length with a G + C content of 59.9%. Significant similarities in gene homology and order were observed between Delftia phage RG-2014 and E. coli phage N4, indicating that it is a member of the N4-like phage group.


Introduction
The occurrence and spread of antibiotic resistant bacteria (ARB) in the environment are regarded as environmental challenges of the highest concern in the twenty-first century. ARB are becoming common, and the Centers for Disease Control and Prevention in the United States estimates that more than 23,000 patients die annually from ARB infections in the US alone [1]. With diminishing opportunities to discover new drugs to combat ARB infections, there is an urgent need to develop alternative therapeutic methods. Phage therapy has been regarded as an alternative to synthesizing new antibiotics [2].
The genus Delftia resides in the Comamonadaceae family of the Betaproteobacteria class, and its members are Gram-negative, short rod-shaped bacteria. Delftia species are widely distributed in the environment and have significant biodegradation capability [3,4]. Delftia tsuruhatensis, a recently described species closely related to Delftia acidovorans, has been reported to cause biofouling of bioreactor membranes [5], reverse osmosis membrane filters [6] and heating systems [7]. In addition, D. tsuruhatensis has been reported to be the causative agent of catheter-related nosocomial human infections [8,9]. Previously, we isolated a multidrug resistant D. tsuruhatensis strain, ARB-1, from a municipal wastewater treatment plant, along with a lytic bacteriophage that infects it. We demonstrated phage-based therapy to combat biofouling caused by D. tsuruhatensis strain ARB-1 with the newly isolated lytic phage as the therapeutic agent [10].
Here, we report the complete genome sequence of the lytic phage specific to D. tsuruhatensis ARB-1, which we named RG-2014 (it does not infect Delftia Cs1-4 or Delftia acidovorans SPH-1; our unpublished results) [10]. The RG-2014 sequence is annotated and analyzed in order to explore its potential application as an antibiofilm bio-agent. Because the host of RG-2014 is multidrug resistant, using the phage as a control agent may be an especially appropriate application. The present study is not part of a larger genomic survey.

Classification and features
The lytic bacteriophage RG-2014 belongs to the Podoviridae family in the order Caudovirales. It is a double-stranded DNA virus that forms clear plaques 1-2 mm in diameter when infecting the multidrug resistant bacterium Delftia tsuruhatensis strain ARB-1.
A sample of sludge was obtained from a local wastewater treatment plant, the Central Valley Water Reclamation Facility in Salt Lake City, UT, USA. A lytic phage infecting D. tsuruhatensis ARB-1 was isolated from this sample following a previously described protocol [11,12]. To remove bacteria and debris, the sample was sequentially filtered through 0.45 and 0.2 μm filter membranes [10]. The resulting phage-containing liquid was spotted (without further concentration) on an R2A agar (0.5 g/L proteose peptone, 0.5 g/L yeast extract, 0.3 g/L K2HPO4, 0.05 g/L MgSO4·7H2O, pH 7) plate containing a lawn of D. tsuruhatensis ARB-1 [10]. Following incubation of the plates at 37°C overnight, a clear plaque was picked, followed by the isolation of a second well-separated single plaque on a fresh D. tsuruhatensis ARB-1 lawn.
As shown in Fig. 1(a), the head of the phage RG-2014 virion has a diameter of 85 nm and displays a hexagonal outline, implying that it likely possesses icosahedral symmetry. The transmission electron micrograph also shows that the virion has a very short tail, indicating that it is a member of the Podoviridae family of viruses. Figure 1(b) shows a micrograph with RG-2014 phage particles attached to the pili of a D. tsuruhatensis cell; it is not known whether such pili serve as a receptor for this phage. Table 1 gives the classification and general features of phage RG-2014. The genome of the phage is linear double-stranded DNA (dsDNA) that is about 70 kb in length as measured by its mobility during pulsed-field gel electrophoresis (Fig. 1(c)).
A one step growth curve was performed with the phage RG-2014 following previously described protocols [10]. The burst size, latent and eclipse period were found to be 150 ± 9 PFU/cell, 10-min, and <5-min, respectively, at 37°C [10].
The complete genome sequence of the phage RG-2014 was determined. The analysis of the genome clearly shows that it is a member of the N4-like phage group (see below). Grose and Casjens [11] showed that the evolution of the major capsid proteins (MCPs) of virulent tailed phages parallels that of the nucleotide sequence of the whole phage genome. A phylogeny of the MCPs of selected N4-like phages and other tailed phages shows that the major capsid protein of phage RG-2014 (identified by its similarity to that of E. coli phage N4, accession no. EF056009) falls robustly within the N4-like phage group (Fig. 2).

Genome sequencing information
Genome project history
Phage RG-2014 was isolated in February of 2011, with D. tsuruhatensis strain ARB-1 as its host. The genome sequencing and analysis of phage RG-2014 were completed in December of 2016. It is the first genome sequence reported for a lytic phage infecting D. tsuruhatensis. The purified phage DNA was sequenced with a MiSeq Bench-top DNA sequencer (Illumina, CA) in the High-throughput Genomic Core Facility at the University of Utah. A summary of the phage RG-2014 genome sequencing information is presented below and in Table 2.

Growth conditions and genomic DNA preparation
Phage RG-2014 virions were purified from infected D. tsuruhatensis ARB-1 lysates. Briefly, 0.5 L of cells were grown to 1 × 10^8 cells per mL in R2A medium at 37°C with shaking at 150 RPM [10]. The culture was then infected with five RG-2014 phages per cell, followed by incubation for 12 h. After the culture had cleared, indicating cell lysis, cell debris was removed by centrifugation for 30 min at 5500×g. Phage virions were then pelleted by centrifugation overnight (>12 h) at 8890×g at 4°C, and the pellet was resuspended. Purified phage virions were obtained by CsCl step gradient centrifugation as described by Earnshaw et al. [12]. The purified phages were stored in SM buffer with gelatin until further use.
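For orientation, the number of phage particles implied by the infection step follows from the stated culture volume, cell density and multiplicity of infection. The worked figures below assume that the whole 0.5 L culture was infected at the stated density; they are derived here for illustration and are not reported values.

```latex
N_{\text{cells}} = 500\ \text{mL} \times 1\times10^{8}\ \text{cells/mL} = 5\times10^{10}\ \text{cells},
\qquad
N_{\text{phage}} = 5\ \text{phage/cell} \times 5\times10^{10}\ \text{cells} = 2.5\times10^{11}\ \text{PFU}
```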
The purified RG-2014 virion preparation was used for phage DNA extraction according to the protocol described by Casjens and Gilcrease [13]. Briefly, 400 μL of the CsCl purified phage particles was mixed with 75 μL of lysis buffer (5 μL of 20% SDS, 50 μL of 1 M Tris-Cl, 20 μL of 0.5 M EDTA, pH 8) and incubated at 65°C for 15 min. Then 50 μL of 5 M potassium acetate was added to the sample, which was incubated on ice for 1 h. The sample was then centrifuged at 8000×g for 15 min at 4°C, and the supernatant was carefully transferred into a new 1.5 mL micro-centrifuge tube. After adding 0.9 mL of absolute ethanol to the supernatant and inverting several times, the DNA precipitate was collected by winding it onto the tip of a sterile Pasteur pipette. The DNA precipitate was transferred into a new micro-centrifuge tube, washed with 70% ethanol by inverting a few times, and subsequently pelleted by centrifugation in a microfuge. The DNA pellet was allowed to dry at room temperature for 10-20 min and resuspended in 100 μL of TE buffer (10 mM Tris-Cl pH 7.5 and 1 mM EDTA pH 8.0). About 0.1 μg of the phage DNA was mixed with 5 μL of loading dye and separated by 1% agarose pulsed-field gel electrophoresis (PFGE), with a 1-25 s pulse ramp and a voltage of 6.0 V/cm at an angle of 120° for 24 h at a constant temperature of 14°C on a CHEF DR III system (Bio-Rad, USA). After completion of electrophoresis the gel was stained with ethidium bromide (Molecular Probes, USA) and visualized with a CHEM DOC gel documentation system (Bio-Rad, USA).

Genome sequencing and assembly
Approximately 8 million paired-end reads with an average length of 300 bp were generated using a MiSeq Bench-top DNA sequencer (Illumina, CA). The reads were interleaved and trimmed based on a Phred score of 28 and a minimum post-trimming average length of 290 bp using the CLC Genomics Workbench 7.0.4 (CLC Bio, Denmark). The trimmed reads were assembled de novo with the CLC Genomics Workbench 7.0.4 using the following criteria: word size, 20 bp; automatic bubble size, 50 bp; minimum contig length, 200 bp, as described in Bhattacharjee et al. [10]. The termini of the virion chromosome were determined by dideoxynucleotide Sanger sequencing [14] with virion DNA as the template and the following primers, which direct sequencing runs off the two ends: right end, 5′-TGCTTCATGATCTTCAGTCC-3′, and left end, 5′-GAAGGCATCAGCATGTTCAG-3′.
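As a rough check on the figures in this section, the error rate implied by the Phred threshold and the nominal sequencing depth can be worked out directly. The sketch below assumes the standard Phred definition, Q = -10 log10(p), and uses the read count and read length given above together with the 73,882 bp genome length reported in this paper. The function names are illustrative and are not part of the CLC Genomics Workbench. The coverage value is an upper bound, since it assumes that every read is phage-derived and survives trimming.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score to the probability of a wrong base call."""
    return 10 ** (-q / 10)


def expected_coverage(n_reads: int, read_length: int, genome_length: int) -> float:
    """Mean depth of coverage if every read maps once to the genome."""
    return n_reads * read_length / genome_length


if __name__ == "__main__":
    # Values taken from the text: Q28 trimming threshold, ~8 million reads
    # of ~300 bp, and a 73,882 bp genome.
    error_probability = phred_to_error_probability(28)
    depth = expected_coverage(8_000_000, 300, 73_882)
    print(f"Q28 error probability per base: {error_probability:.5f}")
    print(f"Upper-bound mean coverage: {depth:,.0f}x")
```

A depth of this order is far more than is needed to assemble a genome of this size, which is consistent with the stringent trimming settings used.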

Genome annotation
Glimmer [15] was used to identify the open reading frames and GeneMarkS [16] for predicting genes. The predicted genes were used to search the NCBI non-redundant database, the conserved domain database, the Cluster of Orthologous Groups database and the InterPro database and were annotated using Blast2GO 2.5.0 [17]. Automated annotation performed by Blast2GO 2.5.0 was manually curated by individually analyzing each predicted gene using BLAST against the NCBI nr database with a minimum e-value cutoff of 10^-3 [18]. ARAGORN [19] and tRNAscan-SE [20] were used for detection of transfer RNA genes. The complete annotated genome sequence is available in GenBank under the accession number KM879221.
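The e-value cutoff used during manual curation can be illustrated with a small filter over BLAST hits. This is a minimal sketch that assumes the hits were exported in the BLAST+ tabular format with its twelve default columns; the paper does not state which output format was used. The function name and the example rows are hypothetical and contain no data from this study.

```python
import csv
import io

# Column order of BLAST+ tabular output with the default fields.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(tabular_text: str, max_evalue: float = 1e-3) -> dict:
    """Return the highest-bitscore hit per query among hits with evalue <= max_evalue."""
    best = {}
    reader = csv.DictReader(io.StringIO(tabular_text), fieldnames=FIELDS, delimiter="\t")
    for row in reader:
        evalue, bitscore = float(row["evalue"]), float(row["bitscore"])
        if evalue > max_evalue:
            continue  # hit is not significant at the chosen cutoff
        current = best.get(row["qseqid"])
        if current is None or bitscore > current[2]:
            best[row["qseqid"]] = (row["sseqid"], evalue, bitscore)
    return best


# Hypothetical rows with made-up identifiers and scores, for illustration only.
EXAMPLE = "\n".join([
    "gene_01\tsubject_A\t62.5\t400\t150\t0\t1\t400\t1\t400\t1e-50\t180",
    "gene_01\tsubject_B\t35.0\t120\t78\t0\t1\t120\t5\t124\t0.5\t30",
    "gene_02\tsubject_C\t28.0\t90\t65\t0\t1\t90\t1\t90\t2.0\t25",
])

if __name__ == "__main__":
    for query, hit in best_hits(EXAMPLE).items():
        print(query, hit)
```

A query such as the hypothetical gene_02, whose only hit exceeds the cutoff, is absent from the result and would be a candidate for the hypothetical protein category.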

Genome properties
The complete genome of lytic phage RG-2014 is 73,882 bp long, including 450 bp direct terminal repeats, and has a G + C content of 59.9% (we note that the genomes of other N4-like phages, when they have been examined, invariably have terminal repeats of several hundred bp). The annotation includes 88 putative protein coding ORFs and no tRNAs (Table 3). Predicted proteins were classified in COG functional categories [21,22] using the WebMGA web server for metagenome analysis [23]. The number of predicted genes and the relative percentage of phage genes associated with the 25 general functional COG categories are described in Table 4. Twenty-eight (31.8%) of the 88 genes in the RG-2014 phage genome were assigned a putative function based on significant sequence similarity to genes of known function in the NCBI database. Twenty-one (23.8%) genes encode putative proteins that were assigned to the conserved hypothetical protein category.
Additionally, 40 predicted genes (44.3%) had no similarity to genes in the current database, and their products were classified as hypothetical proteins (Table 5). Annotation using the CDD on the NCBI server was also performed and is presented in Table 6.
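The G + C content quoted above is simply the percentage of G and C bases in the genome sequence. A minimal sketch of the calculation is given below; the function names are illustrative, and the short sequence in the usage note is a toy input rather than phage data. Applying the same functions to the GenBank record would require downloading the sequence separately.

```python
def read_fasta_sequence(text: str) -> str:
    """Concatenate the sequence lines of a single-record FASTA string."""
    return "".join(line.strip() for line in text.splitlines() if not line.startswith(">"))


def gc_content(sequence: str) -> float:
    """Return the G + C percentage of a DNA sequence, ignoring non-ACGT symbols."""
    seq = sequence.upper()
    counts = {base: seq.count(base) for base in "ACGT"}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * (counts["G"] + counts["C"]) / total


if __name__ == "__main__":
    toy_record = ">toy\nGGCCA\nTGCNN\n"  # toy input, not the RG-2014 genome
    print(f"G + C content: {gc_content(read_fasta_sequence(toy_record)):.1f}%")
```

Ambiguous bases such as N are excluded from the denominator, so they do not bias the percentage.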

Insights from the genome sequence
The phylogenetic tree of MCPs in Fig. 2 indicates that phage RG-2014 is most closely related to the group of phages typified by Escherichia coli phage N4 (NC_008720) [13,24–28]. In addition, their hosts, E. coli K-12 and D. tsuruhatensis strain ARB-1, belong to the same phylum, Proteobacteria. Table 1 summarizes the classification and general features of the phage RG-2014. BLAST searches using the Delftia phage RG-2014 genome as a probe were undertaken to confirm this relationship. Genome comparisons with E. coli phage N4 (NC_008720) were performed, and significant similarities in gene homology and order were observed between phages RG-2014 and N4 (Table 5 and Fig. 3). The phage RG-2014 genome shows the mosaicism that is typical of tailed phages, with (for example) some regions displaying close relatedness to phage N4 (Fig. 3). Mosaicism in bacteriophage genomes is a well-known phenomenon in which regions of high similarity are interspersed with less related or unrelated regions. These mosaic patterns in bacteriophage genomes corroborate the theory that horizontal gene transfer plays a significant role in phage evolution [29–31].
E. coli phage N4 does not depend upon its host's RNA polymerase to transcribe its early and middle genes, but instead encodes its own set of two RNAPs. These are encoded ... (Table 5).
Most of the N4-like phages have been shown to harbor between 1 and 3 genes encoding tRNAs. Paepe et al. [33] and Bailey-Bechet et al. [34] suggested that virulent phages harbor more tRNA genes than temperate phages to ensure optimal translation, leading to faster replication. However, the phage RG-2014 genome lacks transfer RNA genes, suggesting that the phage is highly adapted to its host D. tsuruhatensis ARB-1 with regard to codon usage, allowing it to translate its genes efficiently without the need to synthesize its own tRNAs [24]. To support this finding, the average codon usage bias was calculated for the phage RG-2014 and for D. tsuruhatensis CM13 (NZ_CP017420), a close representative of the host D. tsuruhatensis ARB-1. The calculation was performed using the CodonO web server (http://sysbio.cvm.msstate.edu/CodonO/) [35]. D. tsuruhatensis CM13 (NZ_CP017420) and phage RG-2014 had similar average codon usage biases of 0.440141 and 0.406048, respectively, suggesting that the phage is adapted to its host.
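To make the idea of a codon usage bias score concrete, the sketch below computes an entropy-based bias for a single coding sequence: 0 means that all synonymous codons are used equally, and 1 means that only one codon is used per amino acid. It is written in the spirit of entropy-based measures of synonymous codon usage and is not claimed to reproduce the exact calculation performed by CodonO; the function name is illustrative, and the standard genetic code is assumed.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Number of synonymous codons per amino acid (stop codons excluded).
DEGENERACY = Counter(aa for aa in CODON_TABLE.values() if aa != "*")


def entropy_codon_bias(cds: str) -> float:
    """Entropy-based codon usage bias of one in-frame coding sequence (0 to 1)."""
    cds = cds.upper()
    usage = defaultdict(Counter)  # amino acid -> codon -> count
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[pos:pos + 3]
        aa = CODON_TABLE.get(codon)
        # Skip stops, ambiguous codons, and amino acids with one codon (Met, Trp).
        if aa is None or aa == "*" or DEGENERACY[aa] == 1:
            continue
        usage[aa][codon] += 1
    total = sum(sum(codons.values()) for codons in usage.values())
    if total == 0:
        raise ValueError("no informative codons found")
    bias = 0.0
    for aa, codons in usage.items():
        n = sum(codons.values())
        entropy = -sum((k / n) * math.log2(k / n) for k in codons.values())
        max_entropy = math.log2(DEGENERACY[aa])
        # Weight each amino acid by its share of the informative codons.
        bias += (n / total) * (max_entropy - entropy) / max_entropy
    return bias


if __name__ == "__main__":
    # Toy sequences: uniform alanine codon usage versus a single alanine codon.
    print(entropy_codon_bias("GCTGCCGCAGCG"))
    print(entropy_codon_bias("GCTGCTGCT"))
```

Averaging such a score over all annotated genes of a phage and of its host gives two numbers that can be compared in the way described above, although similar scores alone do not show that the same codons are preferred.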
There are two known types of virion assembly gene arrangement in the N4-like phages. In the first, exemplified by phage N4, a single contiguous gene cluster encodes all of the known structural and lysis proteins except the head decoration protein (N4 gene 17). In the second, typified by Pseudomonas phage LIT1, several tail genes are present inside the replication gene cluster [25,36]. Phage RG-2014 carries a set of homologous genes, including the separate decoration protein gene (RG-2014 gene 24), that have the phage N4 type organization. By homology to those of N4 [36], RG-2014 genes 24, 68, 69, 71-78, 83 and 85 encode virion structural proteins.
Phage RG-2014 makes clear plaques and carries no genes that encode proteins (such as integrase or protelomerase) that might suggest a temperate lifestyle. In addition, we also recently showed that the database of bacterial genome sequences has grown to a point where relatives of essentially all known temperate phages can be found as prophages present in the reported genome sequences of their hosts [37]. Thus, absence of closely related homologous genes (the MCP gene was used in that study) in closely related host genomes of the same bacterial family is strong evidence that a phage is virulent; related prophages would be found to encode such a gene if the phage in question were temperate. In fact no genes that are closely related to MCP of the phage RG-2014 are present in the current bacterial sequence database. The closest MCP gene relatives in prophages are from the distantly related bacterial genera Mesorhizobium, Pantoea and Acinetobacter whose encoded homologous proteins are only 47-56% identical to the amino acid sequence of phage RG-2014 MCP. The latter gene matches are found (when the sequence contigs are sufficiently large for such a determination) to be present in rather distantly related prophages that have other similarities to the N4-like phages including a prophage encoded vRNAP, suggesting that there are currently undescribed temperate phages that are very distantly related to the N4-like phage group (our unpublished observation). Nonetheless, among the 143 currently available genomes from the Comamonadaceae bacterial family (including eight Delftia genomes) the best-encoded protein matches have only 22% identity to the phage RG-2014 MCP. We conclude that phage RG-2014 is virulent.

Table footnote: The total is based on the total number of protein coding genes in the annotated genome.

Fig. 3 ... [38]. Genomes were aligned using Easyfig [38]. The functions of genes in phage N4 are shown above and predicted functions of RG-2014 genes are indicated below the maps.
The N4-like phage group is clearly well separated from the other known tailed bacteriophages [11,28], but the taxonomic status of different phages within the group remains less understood. Unlike some other tailed phage types, the N4-like phages include members that infect a wide range of bacterial hosts in the Alphaproteobacteria, Betaproteobacteria and Gammaproteobacteria classes [25,28]. Fig. 4 shows a dotplot of a diverse sample of N4-like phage genomes that illuminates several aspects of the phages in this group (no diagonal lines are present when the comparison is with other tailed phage types; data not shown). First, phage RG-2014 is not particularly closely related to any of the other currently known N4-like phages; its closest, but nonetheless rather distant, relatives are Achromobacter phages JWDelta, JWAlpha and øAxp-1. We note that these four phages infect members of the Betaproteobacteria. A second conclusion that can be drawn from Fig. 4 is that genome similarity within this group of phages generally parallels the relatedness of their hosts. The various subtypes of the N4-like phage group (separated by thick red lines in the figure) are usually restricted to a single host genus; the one current exception to this rule is the relatively close relationship between Vibrionaceae phage VPB47 and Pseudoalteromonadaceae phage pYD6-A. It thus appears that recent "jumping" of these phages between taxonomically distant hosts is not common. On the other hand, more than one N4-like phage subtype can infect a given host genus; for example, Escherichia and Erwinia N4-like phages are clearly present as two subtypes (e.g. the Escherichia N4/EcP1 and Erwinia Ea9-2/S6 pairs). More distant host relationships are complex. Very weak diagonal similarity lines are present when the Escherichia (phage N4 subtype), Erwinia and Achromobacter N4-like phages are compared. These could tentatively correspond to members of the proposed Enquatravirinae subfamily [28].

Fig. 4 Dotplot of N4-like phage genomes. Phage genomes were arranged in the same orientation and a dot plot was constructed with Gepard [39] using a word length setting of 11. The phages in the figure include the current extant diversity among the N4-like phages; those that are not included are very similar to one of the phages that is included (their sequences are all in GenBank and can be retrieved by searching with their names). In the plot, thin red lines separate the phage genomes, and thick red lines separate the most clearly delineated subtypes. At the right, the genus (red text), family (black text) and class (blue text) of each phage's host bacteria are indicated; vertical very thick red lines on the right indicate phages that infect the same host genus, and very thick blue lines mark host families.
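The principle behind a word-based dot plot such as Fig. 4 can be illustrated with a naive exact-match search: every position where the two sequences share an identical word of the chosen length contributes one dot, and runs of dots along a diagonal indicate collinear similarity. The sketch below is a simplified illustration of that idea only. It is not the algorithm of the dot-plot program cited in the Fig. 4 caption, it considers the forward strand only, and the function name and toy sequences are hypothetical.

```python
from collections import defaultdict


def word_matches(seq_a: str, seq_b: str, word_length: int = 11):
    """Return sorted (i, j) pairs where seq_a[i:i+w] equals seq_b[j:j+w]."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    index = defaultdict(list)  # word -> start positions in seq_b
    for j in range(len(seq_b) - word_length + 1):
        index[seq_b[j:j + word_length]].append(j)
    matches = []
    for i in range(len(seq_a) - word_length + 1):
        for j in index.get(seq_a[i:i + word_length], ()):
            matches.append((i, j))  # one dot at column i, row j
    return sorted(matches)


if __name__ == "__main__":
    # Toy sequences: seq_b contains seq_a shifted by two bases.
    seq_a = "AAACCCGGGTTT"
    seq_b = "GGAAACCCGGGTTTCC"
    for dot in word_matches(seq_a, seq_b):
        print(dot)
```

Consecutive dots in which both coordinates increase by one lie on a single diagonal, which is how the similarity lines between related genomes arise. A longer word length suppresses chance matches at the cost of missing more diverged regions.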

Conclusions
The D. tsuruhatensis infecting phage RG-2014 belongs to the Podoviridae viral family. The phage RG-2014 genome sequence shows significant synteny and sequence similarity to E. coli bacteriophage N4 and other members of the N4-like group of tailed phages; this clearly demonstrates phage RG-2014's membership in this group. Our analysis confirms that phages in the virulent N4-like group are widely present in the wild. The members of the N4-like group infect bacterial hosts in several classes within the Proteobacteria phylum. Their virulent nature, widespread distribution and efficient infection suggest that members of this group will be useful in many bacterial control situations.