Draft genome sequences of three fungal-interactive Paraburkholderia terrae strains, BS007, BS110 and BS437

Here, we report the draft genome sequences of three fungal-interactive Paraburkholderia terrae strains, denoted BS110, BS007 and BS437. Phylogenetic analyses showed that the three strains belong to clade II of the genus Burkholderia, which was recently renamed Paraburkholderia. This novel genus primarily contains environmental species, encompassing non-pathogenic plant- as well as fungal-interactive species. The genome of strain BS007 consists of 11,025,273 bp, whereas those of strains BS110 and BS437 have 11,178,081 and 11,303,071 bp, respectively. Analyses of the three annotated genomes revealed the presence of (1) a large suite of substrate capture systems, and (2) a suite of genetic systems required for adaptation to microenvironments in soil and the mycosphere. Thus, genes encoding traits that potentially confer fungal interactivity were found, such as type 4 pili, type 1, 2, 3, 4 and 6 secretion systems, biofilm formation (PGA, alginate and pel) and glycerol uptake systems. Furthermore, the three genomes contained a highly conserved five-gene cluster that had previously been shown to be upregulated upon contact with fungal hyphae. Moreover, a considerable number of prophage-like and CRISPR spacer sequences were found, alongside genetic systems responsible for secondary metabolite production. Overall, the three Paraburkholderia terrae strains possess the genetic repertoire necessary for adaptation to diverse soil niches, including those influenced by soil fungi.
Electronic supplementary material: The online version of this article (10.1186/s40793-017-0293-8) contains supplementary material, which is available to authorized users.


Introduction
The genus Burkholderia was proposed in 1993 by Yabuuchi et al. [1]. Following this, continuing emendation of the genus has occurred, mainly as a result of the addition of new species. Recent molecular and phylogenetic analysis of the genus divided it into two clades, with clade I containing the pathogenic Burkholderia spp. and clade II mainly environmental bacteria. The latter clade was reclassified as a novel genus, named Paraburkholderia [2,3]. This genus encompasses a suite of highly diverse and environmentally adaptable bacteria that are able to occupy various ecological niches, ranging from soil [4,5] to plants and humans [6]. Members of the genus Paraburkholderia are also known to harbor some of the largest genomes among all known bacteria [7,8].
Paraburkholderia terrae strain BS001, which was isolated as a co-migrator in soil with the saprotrophic fungus Lyophyllum sp. strain Karsten [9], has been extensively described, and it is used here as a reference organism. P. terrae strain BS110 was isolated from the mycosphere of the ectomycorrhizal fungus Laccaria proxima [5,9] and also showed comigration capacity with the aforementioned fungus. The other two Paraburkholderia terrae strains (BS007, BS437) were similarly isolated as mycosphere dwellers or comigrators, from soils collected in Gieterveen and Wageningen, the Netherlands, respectively [5,9]. Being avid mycosphere inhabitants, all these Paraburkholderia strains might play essential roles in the ecology of soil fungi and so in (degradative) ecosystem functions. Several studies have been performed to address such interactions and understand the mechanisms involved. An in-depth study of the genome of P. terrae strain BS001 revealed its remarkable genetic potential, including genetic systems that presumably enable it to interact with saprotrophic fungi like Lyophyllum sp. strain Karsten [5,8]. Moreover, the strain BS001 genome was found to contain numerous regions of genomic plasticity that are typified by different plasmid- and prophage-like genes [8]. We took this finding as a token of the remarkable ability of P. terrae to adapt, via horizontal gene transfer, to fluctuating local challenges, including the presence of fungal counterparts. The strategies that are presumably used in this fungal interactivity include (but are not limited to): (i) biofilm formation on fungal surfaces [9,10], (ii) a type-3 secretion system (T3SS) with a subtle role in cellular migration along, and adherence to, fungal hyphae [10,11] and (iii) chemotaxis towards growing fungal hyphae and subsequent adherence to fungal surfaces [10]. In a recent study, it was shown that P. terrae strain BS001 differentially expresses genes involved in chemotaxis, flagellar motility and metabolic and stress response mechanisms in response to fungal hyphae [12].
Given that the three novel P. terrae strains BS110, BS437 and BS007 were isolated by virtue of their capacity to interact with soil fungi, we hypothesized that their physiological responses to fungi, as reflected in their genomic make-up, might be similar to each other and akin to those of the well-studied strain BS001. To further explore this tenet, analyses of sequenced genomes constitute a necessary first step. Here, we present a summary of the draft genome sequences, and their annotation, of the three novel P. terrae strains. Furthermore, we examine the traits that allow hypotheses to be built with respect to the ecological relevance of these strains in the mycosphere, coupled to analyses of phenotypes. Based on these characteristics, we shed light on the potential strategies that these strains may use in the interplay with their fungal counterparts.

Organism information
Classification and features
P. terrae BS110 and BS007 were isolated from the base of fruiting bodies of the ectomycorrhizal fungus Laccaria proxima, sampled in Gieterveen, the Netherlands [9]. Like the reference strain BS001, strain BS437 was isolated as a comigrator with Lyophyllum sp. strain Karsten (in this case it was isolated from soil from Droevendaal, Wageningen, the Netherlands). The collected samples were treated as previously described [5,9]. Briefly, for isolation of P. terrae BS110 and BS007, mycosphere samples were carefully collected from soil adhering to the dense L. proxima hyphae just below the fruiting body. Strains BS001 and BS437 were isolated as 'winners' of microbiome co-migration experiments [5,9]. All isolated Paraburkholderia strains were grown on LB medium at 28°C. Phylogenetic analyses based on alignment of seven concatenated core genome genes (aroE, dnaE, groEL, gyrB, mutL, recA, and rpoB) (Fig. 1) showed that P. terrae strains BS110, BS007 and BS437 clustered within the Paraburkholderia genus (akin to the former Burkholderia clade II), as reported previously for strain BS001 [8]. Based on these analyses, our four P. terrae strains were also found to be closely related to Paraburkholderia phytofirmans and P. xenovorans.
Gram staining of freshly-grown cells of P. terrae strains BS007, BS110 and BS437 revealed all three strains to be Gram-negative. Transmission electron microscopy of freshly-grown cultures showed that each strain population consisted mainly of single cells that were rod-shaped (cell lengths 1 to 2 μm), with predominantly polar flagella (Fig. 2).
The growth of all strains was tested at different temperatures (4, 12, 15, 18, 24, 37, 42 and 50°C). For all strains, the temperature range that allowed the formation of detectable CFUs on plates was 15-37°C, with optimum growth being recorded at 28°C within 3 days. The pH tolerance of the strains was tested by assessing colony growth of each strain on R2A plates at different pH values (4.0, 5.0, 6.0, 7.0, 8.0, 9.0 and 10.0) at 28°C. All strains were able to grow in the pH range 5.0-10.0, with optimum growth at pH 6.0-7.0. No growth was recorded at pH 4.0. Salt tolerance assays were done by placing cells on R2A plates supplemented with different NaCl concentrations (0, 0.5, 1.0, 2.0, 2.5, 5.0 and 10%), and incubating for up to five days, with regular observation of colony formation. Strains BS007, BS110 and BS437 were able to grow at up to 1% NaCl in the R2A medium, being strongly inhibited at 2% NaCl. Hence, all three strains tested are quite salt-sensitive.
The capacities of the strains to utilize an array of carbon sources were tested using BIOLOG GN2 assays (Biolog Inc., Hayward, CA). The results revealed that most strains are able to utilize a suite of different carbonaceous compounds (Tables 1, 2, and 3) (as in Nazir et al. [5]). Some of the carbonaceous compounds could only be utilized by some, but not all, strains. That is, strains BS007 and BS110 (but not BS437) could utilize d-trehalose, phenyl ethylamine, 2,3-butanediol and gentiobiose. The compound d-cellobiose was utilized only by strains BS007 and BS437, while γ-hydroxybutyric acid was utilized only by strains BS110 and BS437. There was also substrate specificity, in that some compounds could only be utilized by one strain each. For instance, strain BS007 utilized itaconic acid, whereas d-serine and α-d-lactose were uniquely utilized by strain BS110, and d-melibiose, β-methyl-d-glucoside and α-ketoglutaric acid by strain BS437.
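The differential utilization pattern described above can be cross-checked with simple set operations. The following is a minimal sketch, assuming only the differential substrates named in the text (compounds used by all three strains are left out); the function names are illustrative and not part of any BIOLOG software.

```python
# Differential substrates only, transcribed from the paragraph above.
# Substrates utilized by all three strains are deliberately omitted.
utilization = {
    "BS007": {"d-trehalose", "phenyl ethylamine", "2,3-butanediol",
              "gentiobiose", "d-cellobiose", "itaconic acid"},
    "BS110": {"d-trehalose", "phenyl ethylamine", "2,3-butanediol",
              "gentiobiose", "γ-hydroxybutyric acid", "d-serine",
              "α-d-lactose"},
    "BS437": {"d-cellobiose", "γ-hydroxybutyric acid", "d-melibiose",
              "β-methyl-d-glucoside", "α-ketoglutaric acid"},
}


def unique_substrates(strain, table):
    """Substrates utilized by `strain` and by no other strain in `table`."""
    others = set().union(*(subs for name, subs in table.items() if name != strain))
    return table[strain] - others


def shared_by_exactly(strains, table):
    """Substrates utilized by every strain in `strains` and by none of the rest."""
    inside = set.intersection(*(table[s] for s in strains))
    outside = set().union(*(subs for name, subs in table.items() if name not in strains))
    return inside - outside


for strain in sorted(utilization):
    print(strain, sorted(unique_substrates(strain, utilization)))
```

Encoding the pattern this way makes it easy to confirm that each strain-specific compound listed in the text is indeed absent from the other two strains.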

Genome sequencing information
Genome project history
P. terrae BS110 and BS007 were isolated from the base of fruiting bodies of Laccaria proxima, in Gieterveen, the Netherlands, and strain BS437 was isolated, as a comigrator with Lyophyllum sp. strain Karsten, from Droevendaal, Wageningen, the Netherlands. The three strains were selected for sequencing, as they showed migration proficiency in soil along with the fungus Lyophyllum sp. strain Karsten, similar to the closely related P. terrae strain BS001 [5]. Moreover, there is a current lack of knowledge on the mechanisms behind the behavior of such fungal-interactive P. terrae strains. Sequencing of the draft genomes was completed in 2012, and the sequences of strains BS007, BS110 and BS437 have been deposited for public release at NCBI under the accession numbers NFVE00000000, NFVD00000000 and NFVC00000000, respectively. A summary of the project information is shown in Table 4.

Fig. 1 Phylogenetic tree of selected Burkholderia and Paraburkholderia strains based on 16S rRNA gene sequences (a) and on alignment of seven concatenated core genes (aroE, dnaE, groEL, gyrB, mutL, recA, and rpoB) (b). Evolutionary distances were computed with MEGA7 using the maximum-likelihood method. The bootstrap values above 50% (from 1000 replicates) are indicated at the nodes. P. terrae strains BS007, BS110 and BS437 were all found to belong to clade II. Clade I mainly consists of pathogenic Burkholderia species, while clade II, mainly consisting of environmental strains, was assigned to the new genus Paraburkholderia. See Sawana et al. [3]

Growth conditions and genomic DNA preparation
All strains were grown aerobically on LB medium at 28°C (180 rpm, shaking, overnight). The genomic DNA of the overnight cultures was then extracted using a modified (Powersoil) DNA isolation kit (MOBio Laboratories Inc., Carlsbad, CA, USA). The modification consisted of adding glass beads to the cultures to spur mechanical cell lysis. This extraction method is a rapid way to produce highly pure DNA from bacterial cultures. The extracted gDNAs were purified with the Wizard DNA cleanup system (Promega, Madison, USA). The quality and quantity of the extracted DNAs were assessed using electrophoresis in 1% agarose.

Genome sequencing and assembly
The genomic DNAs of P. terrae strains BS110, BS007 and BS437 were sequenced on the Illumina HiSeq2000 platform by LCG Genomics (Berlin, Germany). The libraries for the strains were prepared using Illumina TruSeq libraries with Covaris-sheared DNA or TruSeq® Nano DNA Library Prep. Totals of approximately 18, 16 and 17 million paired reads were produced for the P. terrae BS007, BS110 and BS437 strains, respectively. Illumina's CASAVA data analysis software was used for further processing, such as adapter trimming and quality trimming using the fastX toolkit. K-mer error correction analysis was done using Quake version 0.3; the K-mer corrected paired reads numbered 16, 15 and 15 million for BS007, BS110 and BS437, respectively. Genome assembly was then carried out using Velvet version 1.2.05, by LCG Genomics (sequencing statistics are provided in Additional file 1: Table S1). Totals of 788, 658 and 843 contigs were formed following assembly, for strains BS007, BS110 and BS437, respectively.
The 16S rRNA genes were extracted and added as a separate scaffold. The extraction of 16S rRNA genes was done using SortMeRNA and assembly using SPAdes version 3.9.0.
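The read counts reported above for the raw and K-mer corrected data can be turned into approximate retention fractions. This is a minimal sketch that uses only the rounded figures given in the text (in millions of read pairs), so the resulting fractions are rough estimates rather than exact pipeline statistics; the variable and function names are illustrative.

```python
# Approximate paired-read counts in millions, as reported in the text.
# These are rounded values, so the fractions below are estimates only.
raw_reads = {"BS007": 18, "BS110": 16, "BS437": 17}
corrected_reads = {"BS007": 16, "BS110": 15, "BS437": 15}


def retention(raw, kept):
    """Fraction of read pairs remaining after K-mer error correction."""
    return {strain: kept[strain] / raw[strain] for strain in raw}


fractions = retention(raw_reads, corrected_reads)
for strain, fraction in sorted(fractions.items()):
    print(f"{strain}: about {fraction:.0%} of read pairs retained")
```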

Genome annotation
The sequence information of the P. terrae BS007, BS110 and BS437 genomes was submitted to the MicroScope platform that is hosted at Genoscope [13] for analysis. The gene annotation editor in MicroScope was used; it includes the use of TrEMBL, SwissProt alignments, the PubMed and InterPro databases and SignalP. The MicroScope platform is also integrated with a metabolic profiling platform that includes the PkGDB database, as well as MicroCyc that is designed to extract genomic and metabolic data from the Pathway Genome Databases, KEGG and the secondary metabolite detection program antiSMASH [13].

Genomic properties
The genome of strain BS007 has an estimated size of 11,025,273 bp, with 61.89% G + C content, that of strain BS110 11,178,081 bp (61.84% G + C), and that of strain BS437 11,303,071 bp (61.84% G + C) (Fig. 3). The three genomes contain 10,411 (86.83%), 10,288 (85.85%) and 10,610 (86.03%) protein-encoding regions, respectively. The properties and statistics of the genomes are summarized in Table 5, and the numbers of genes associated with general COG functional categories in Table 6.
Comparative genomics-based analyses of the pan and core genomes of strains BS007, BS110 and BS437 revealed that these, across the three strains, comprised 17,404 coding regions, whereas the core genome [...] (Additional file 1: Table S2).
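The distinction between pan and core genome used above can be illustrated with set operations. This is a toy sketch only: the gene-family identifiers are hypothetical placeholders, not data from the three genomes, and real analyses first cluster proteins into families with dedicated tools before any such comparison.

```python
# Hypothetical gene-family identifiers, for illustration only.
# They do not correspond to real gene families in the three genomes.
gene_families = {
    "BS007": {"fam1", "fam2", "fam3", "fam5"},
    "BS110": {"fam1", "fam2", "fam4", "fam6"},
    "BS437": {"fam1", "fam2", "fam3", "fam7"},
}


def pan_genome(genomes):
    """All gene families present in at least one genome (set union)."""
    return set().union(*genomes.values())


def core_genome(genomes):
    """Gene families present in every genome (set intersection)."""
    return set.intersection(*genomes.values())


def strain_specific(genomes, strain):
    """Gene families found in `strain` and in no other genome."""
    others = set().union(*(fams for name, fams in genomes.items() if name != strain))
    return genomes[strain] - others


print("pan:", len(pan_genome(gene_families)))
print("core:", sorted(core_genome(gene_families)))
print("BS110-specific:", sorted(strain_specific(gene_families, "BS110")))
```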

Insights into the genome sequences
Each of the genomes of P. terrae strains BS007, BS110 and BS437 was found to contain genes predicted to encode highly diverse primary and secondary metabolisms, as previously found in strain BS001 [8]. For example, numerous sets of genes were predicted to be involved in carbohydrate metabolism (Additional file 1: Table S3). Also, genes for predicted uptake systems were abundantly present across the three strains. Remarkably, the glycerol uptake and glycerol kinase genes glpK and glpD were found consistently across all three strains. These genes had 100% homology with the same genes found in strain BS001. Secondary metabolite analyses showed that the three strains contain 14, 16 and 17 gene clusters encoding these (strains BS007, BS110 and BS437, respectively; Additional file 1: Table S4). In each strain, one gene cluster was found for nonribosomal peptide synthetase (NRPS) and a hybrid NRPS and polyketide synthase (PKS). Remarkably, the NRPS-PKS encoding systems of strains BS007 and BS110 had the same length (12,267 bp) as well as peptide monomer composition (val-mal-gly). In contrast, the strain BS437 system was shorter (length 9398 bp) and had a reduced peptide monomer composition (mal-gly). Remarkably, we found an additional NRPS gene cluster, uniquely, in the genome of strain BS110 (Additional file 1: Table S4). Next to these gene clusters, others encoding bacteriocin, terpene, ectoine, phosphonate and aryl polyene production were also found in all three strains (Additional file 1: Table S4).

a Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project. If the evidence is IDA, then the property was directly observed for a live isolate by one of the authors or an expert mentioned in the acknowledgement.
In addition, sets of plant-interactive genes were detected in all three genomes. In particular, those for production of indole acetic acid from tryptophan, as well as of 1-aminocyclopropane-1-carboxylate deaminase (ACC deaminase), were found. We also found the nodulation genes nodI, nodJ, nodN and nodW across all three genomes, next to (uniquely) nodV in strain BS110 (Additional file 1: Table S5). Similar sets of genes have previously been found in strain BS001 and these were implicated in a putative 'rhizosphere phase' of this strain [8]. Together, the data indicated the presence of genes for a convergent suite of traits with ecological relevance across the three strains.
The ability of bacteria to produce exopolysaccharides is critical in biofilm formation, and the biofilm (extra-matrix) poly-β-1,6-N-acetyl-D-glucosamine (PGA) system has been shown to be an important component of Paraburkholderia biofilms [16]. PGA-encoding genes were previously found in the strain BS001 genome [8]. Other exopolysaccharide production systems, such as those for alginate, pel and psl, have been identified in Pseudomonas aeruginosa [17]. The analysis of the genomes of the three novel strains uncovered several such systems in all strains. Specifically, complete PGA systems (pgaA, pgaB, pgaC and pgaD), next to two genes of the pel (pelB and pelD) system, were found. In Pseudomonas aeruginosa, the pel (pelA-F) system produces a biofilm matrix, a glucose-rich polysaccharide polymer that has essential structural and protective roles [18]. The analysis also found several alginate production system genes (algA, algB, algC, algD, algP, algU and kinB) in all strains. The exception was algE1, which was only found in the strain BS007 genome. In contrast, we did not find any gene from the psl exopolysaccharide production system (Table 7).
Furthermore, complete sets of T3SS-encoding genes were found in all three genomes (Table 7). A phylogenetic tree based on eight (concatenated) conserved genes (SctS, SctR, SctQ, SctV, SctU, SctJ, SctN and SctT) of the T3SS showed that all systems belong to the Hrp-2 type of the T3SS (Figs. 5 and 6). It has been suggested that this type is required for the establishment of interaction with fungi [19,20]. Moreover, copies (sometimes partial) of other secretion systems, i.e. the T1SS, T2SS, T4SS and T6SS, were discovered in the three genomes (Additional file 1: Table S6). This genomic evidence indicates that the three P. terrae strains are highly versatile in a range of (potentially host-related) niches in soil.
We previously found that, upon physical contact with the soil fungus Lyophyllum sp. strain Karsten, a five-gene cluster in P. terrae strain BS001 becomes highly expressed [12]. This gene cluster was hypothesized to be involved in energy generation coupled to an oxidative stress response, with four of the five genes being highly upregulated [12]. The five-gene cluster includes an alkyl hydroperoxidase AhpD family core domain containing protein, a cupin domain containing protein, a LysR family transcriptional regulator, a putative nucleoside-diphosphate sugar epimerase and a conserved exported protein of unknown function [12]. Our current genome analyses revealed that the complete gene cluster was present in all of the newly sequenced genomes (Additional file 1: Table S7). A synteny assessment of the respective clusters of the strain BS007, BS110 and BS437 genomes with that of strain BS001 showed synteny and high levels of homology across all clusters (94%-100%) (Fig. 7).

Figure caption fragment: ... (2) and (3) represent primary/automatic annotations), (iv) GC skew ((G − C)/(G + C)) and (v) color code representing rRNA (blue), tRNA (green), miscellaneous RNA (orange), transposable elements (pink) and pseudogenes (grey).

Presence of bacteriophage-related sequences
We finally analyzed the three genomes for the presence of prophage-like sequences, as prophages endow bacteria with traits that may advance their evolutionary fitness (following a lysogenic conversion). Thus, phenotypic plasticity of the host bacteria (i.e. with respect to virulence factors, auxiliary metabolic genes, and traits affecting biofilm formation) is fostered [21-23]. The analyses showed that the genomes of P. terrae BS110, BS007 and BS437 all contain considerable amounts of prophage-like sequences (9.9%, 11.8% and 11.3%, respectively), with strain BS437 being able to produce phage progeny [34]. We then analyzed the three genomes for the presence of CRISPR-Cas spacer sequences. CRISPR-Cas systems provide so-called adaptive immunity to bacteria, serving as a heritable record of past infections with phages or other extraneous elements [24]. Using the (web-based) CRISPRFinder program [25], we found CRISPR sequences to be present in all three strains; 21, 22 and 15 such sequences were found in strains BS007, BS110 and BS437, respectively. This finding indicated that the host strains had been exposed to numerous infections by extrachromosomal elements (e.g. phages).
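To give a sense of scale, the prophage-like percentages above can be converted into approximate sequence lengths using the genome sizes reported under Genomic properties. This is a back-of-the-envelope sketch: the percentages are rounded to one decimal in the text and the genome sizes are draft estimates, so the results should be read as rough figures of slightly more than 1 Mb per genome.

```python
# Draft genome sizes (bp) and prophage-like fractions (%) as reported in the text.
genome_size_bp = {"BS007": 11_025_273, "BS110": 11_178_081, "BS437": 11_303_071}
prophage_percent = {"BS007": 11.8, "BS110": 9.9, "BS437": 11.3}


def prophage_bp(size_bp, percent):
    """Approximate length of prophage-like sequence, rounded to the nearest bp."""
    return round(size_bp * percent / 100)


estimates = {
    strain: prophage_bp(genome_size_bp[strain], prophage_percent[strain])
    for strain in genome_size_bp
}
for strain, bp in sorted(estimates.items()):
    print(f"{strain}: roughly {bp / 1e6:.2f} Mb of prophage-like sequence")
```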

Conclusions
The genome analyses reported here of the fungal-interactive Paraburkholderia terrae strains BS110, BS007 and BS437 revealed that all genomes were large in size, encompassing a suite of metabolic, nutrient capture and 'interactivity' genes. The repertoire of genetic systems found probably encompasses traits that allow adaptation to niches in the soil as influenced by organisms such as fungi, as well as plants. Moreover, potential defense systems were also found. Thus, all genomes harbored highly diverse primary and secondary metabolite systems. Furthermore, they also contained sets of genes for type-4 pili, biofilm formation (PGA, alginate and pel), secretion systems (T1SS, T2SS, T3SS, T4SS and T6SS) and glycerol uptake systems; such systems potentially enable them to reap the ecological benefits conferred by fungal hyphae in soil. A five-gene cluster that had been found to be highly upregulated upon physical contact with Lyophyllum sp. strain Karsten in strain BS001 was consistently found in all three strains. This allowed the hypothesis that this gene cluster may confer a fitness advantage to the organisms in the early stages of contact with fungal mycelium in soil. Finally, our analyses also highlight the presence of a considerable amount of prophage-like sequences, complete or incomplete, in the P. terrae genomes. The significance of these prophage sequences for the host cells and their effects on the ecological functioning and adaptability of the hosts is still under investigation.

Fig. 5 Phylogenetic tree of selected type-3 secretion systems (T3SS). The tree was generated based on alignment of eight conserved genes of the T3SS (SctS, SctR, SctQ, SctV, SctU, SctJ, SctN, and SctT). Evolutionary distance was computed with MEGA7 using a maximum-likelihood method. The bootstrap values above 50% (from 1000 replicates) are indicated at the nodes. The T3SSs of P. terrae strains BS007, BS110 and BS437 belong to the Hrp-2 type, as previously reported for BS001 [8]. Different types of T3SSs were described in Abby and Rocha [19]

Fig. 7 Synteny comparison of the five-gene cluster among strains. The corresponding genes are indicated by the colored boxes. Comparison percentages were generated using BLAST+ 2.4.0 (tBLASTx with cutoff value 10^-3) and figures were created with the Easyfig program [33]