Draft genome sequences of Bradyrhizobium shewense sp. nov. ERR11T and Bradyrhizobium yuanmingense CCBAU 10071T

The type strain of the prospective Bradyrhizobium shewense sp. nov., ERR11T, was isolated from a nodule of the leguminous tree Erythrina brucei native to Ethiopia. The type strain Bradyrhizobium yuanmingense CCBAU 10071T was isolated from the nodules of Lespedeza cuneata in Beijing, China. The genomes of ERR11T and CCBAU 10071T were sequenced by DOE-JGI and deposited at the DOE-JGI genome portal as well as at the European Nucleotide Archive. The genome of ERR11T is 9,163,226 bp in length and has 102 scaffolds, containing 8548 protein-coding and 86 RNA genes. The CCBAU 10071T genome is arranged in 108 scaffolds, is 8,201,522 bp long, and contains 7776 protein-coding and 85 RNA genes. Both genomes contain symbiotic genes that are homologous to the genes found in the complete genome sequence of strain USDA 110T (name identifier 10.1601/nm.24498). The genes encoding nodulation and nitrogen fixation in ERR11T showed high sequence similarity with homologous genes found in the draft genome of the peanut-nodulating Bradyrhizobium arachidis LMG 26795T. The nodulation genes nolYA-nodD2D1YABCSUIJ-nolO-nodZ of ERR11T and CCBAU 10071T are organized in a similar way to the homologous genes identified in the genomes of USDA 110T, Bradyrhizobium ottawaense USDA 4 and strain CCBAU 05525 (name identifier 10.1601/nm.1462). The genomes harbor hupSLCFHK and hypBFDE genes, which encode hydrogenase, an enzyme that helps rhizobia take up hydrogen released by the N2-fixation process, as well as the denitrification genes napEDABC and norCBQD for nitrate and nitric oxide reduction, respectively. The genome of ERR11T also contains nosRZDFYLX genes encoding nitrous oxide reductase.
Based on multilocus sequence analysis of housekeeping genes, the novel species, which contains eight strains, formed a unique group close to the B. ottawaense branch. Genome Average Nucleotide Identity (ANI) calculated between the genome sequence of ERR11T and closely related sequences revealed that strains belonging to the B. ottawaense branch (USDA 4 and CCBAU 15615) were the closest to strain ERR11T, with 95.2% ANI. Type strain ERR11T showed the highest predicted DDH value with CCBAU 15615 (58.5%), followed by USDA 4 (53.1%). Nevertheless, the ANI and DDH values obtained between ERR11T and CCBAU 15615 or USDA 4 were below the cutoff values (ANI ≥ 96.5%; DDH ≥ 70%) for strains belonging to the same species, suggesting that ERR11T represents a new species. Therefore, based on the phylogenetic analysis and the ANI and DDH values, we formally propose the creation of B. shewense sp. nov. with strain ERR11T (HAMBI 3532T = LMG 30162T) as the type strain. Electronic supplementary material: The online version of this article (10.1186/s40793-017-0283-x) contains supplementary material, which is available to authorized users.


Introduction
Biological nitrogen fixation is a vital process in ecosystem functioning, supplying nitrogen for plant growth. Legume plants form a nitrogen-fixing symbiotic association with soil bacteria known as rhizobia. The symbiotic association results in the formation of nodules, which shelter the rhizobia and serve as the site of nitrogen fixation, on the roots or stems of host legumes [1]. The rhizobia belong to the Alphaproteobacteria and Betaproteobacteria [2]. The alphaproteobacterial genus Bradyrhizobium was first described by Jordan as comprising slow-growing rhizobia [3]. Since then, 33 distinct rhizobial species belonging to the genus Bradyrhizobium have been formally described [4]. In addition, unique Bradyrhizobium groups isolated from diverse legume species might represent new species [5][6][7][8][9][10][11].
In rhizobial taxonomic studies, polyphasic approaches combining phenotypic features, analysis of the 16S rRNA genetic marker, and DNA-DNA hybridization (DDH) were used for years as standard criteria for the description of new bacterial species. Nevertheless, 16S rRNA gene sequence divergence, particularly in the genus Bradyrhizobium, is too low to differentiate closely related species [5,12,13]. Bacterial strains in the same species can be delineated at ≥70% DDH relatedness [14,15], yet this method is vulnerable to variable laboratory results that lead to inconsistent classification of the same species [16]. To resolve the issues related to the traditional wet-lab DDH technique, a digital DDH method was proposed that calculates DDH from genome sequences for bacterial classification studies [17][18][19].
Multilocus sequence analysis (MLSA) of housekeeping protein-coding genes has become a common practice in bacterial taxonomic studies. The method offers high resolution and has therefore been used in rhizobial taxonomic studies for species identification and for differentiating strains at the species level [5,13,20,21]. Recently, the genome-wide average nucleotide identity (ANI) method has been used successfully for classification of various bacterial species [22][23][24]. According to Richter and Rosselló-Móra [25] and Kim et al. [23], the ANI cutoff value that corresponds to the traditional 70% DNA-DNA relatedness cutoff for species delineation is in the range 95-96%, depending on the nature of the bacterial genome sequences. A more extensive ANI calculation was carried out by Varghese et al. [24], who included a large number of genome sequences. Based on that study, a 96.5% ANI value is the minimum threshold that corresponds to the 70% DNA-DNA relatedness cutoff for strains (genomes) belonging to the same species. For the 96.5% ANI cutoff to apply in species description, the alignment fraction (AF) between the genomes should be 0.6 or above (i.e. the alignment covers at least 60% of the gene content of a pair of genomes) [24].
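The cutoff values cited above can be expressed as a small decision rule. The following is a minimal sketch and not part of the original analysis: the function name `same_species` is hypothetical, requiring ANI and DDH to agree is a conservative assumption made here, and the AF value in the example is invented for illustration. Only the thresholds (ANI ≥ 96.5%, AF ≥ 0.6, DDH ≥ 70%) come from the text.

```python
# Thresholds cited in the text: 96.5% ANI with AF >= 0.6 [24]; 70% DDH [14,15].
ANI_CUTOFF = 96.5
AF_CUTOFF = 0.6
DDH_CUTOFF = 70.0


def same_species(ani, af, ddh):
    """Return True only if a genome pair meets all three cutoffs.

    ani: average nucleotide identity in percent
    af:  alignment fraction between 0 and 1
    ddh: predicted (digital) DDH in percent
    Combining ANI and DDH with a logical AND is an assumption of this sketch.
    """
    # The ANI cutoff is only applied when enough of the genomes align.
    if af < AF_CUTOFF:
        return False
    return ani >= ANI_CUTOFF and ddh >= DDH_CUTOFF


# ANI and DDH values as reported for ERR11T vs. USDA 4; AF = 0.8 is assumed.
print(same_species(ani=95.2, af=0.8, ddh=53.1))  # False
```

With the reported ANI and DDH values, the pair falls below both cutoffs regardless of the assumed AF.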
In Ethiopia, the endemic multipurpose legume tree E. brucei [26] is used for the production of firewood and as shade for coffee plantations [27], and it also improves soil fertility [28]. Crotalaria spp. [29] and Indigofera spp. [30] are among the diverse perennial herb and shrub legumes found in Ethiopia [31]. Crotalaria spp. [29] are used for green manuring, as a fallow before the main crop or for intercropping with cereal plants in order to improve soil nitrogen fertility. Some Crotalaria spp. [29] can be used as food and feed [32][33][34]. Indigofera spp. [30] are used as fodder for livestock, particularly in dryland areas, as the species are resistant to water stress [35]. A group of rhizobial strains belonging to the genus Bradyrhizobium was isolated from nodules of the legume tree E. brucei [26] and the shrub legumes Crotalaria spp. [29] and Indigofera spp. [30] growing in Ethiopia. These bacteria formed a unique branch, distinct from other known species of the genus Bradyrhizobium, in phylogenetic trees constructed from sequence analysis of housekeeping genes [5]. To describe this group as a new Bradyrhizobium species using the genome-wide ANI and digital DDH methods, a representative strain, Bradyrhizobium sp. ERR11 (hereafter Bradyrhizobium shewense sp. nov. ERR11 T), was selected for genome sequencing. The sequencing was done under the DOE-JGI 2014 Genomic Encyclopedia of Type Strains, Phase III, a project designed for sequencing soil- and plant-associated and newly described type strains [36]. The main purposes of this study were therefore (1) to present the classification and general features of Bradyrhizobium shewense sp. nov. and (2) to report the genome sequence and annotation of the type strain ERR11 T. In addition, the genome sequence and annotation of the reference type strain B. yuanmingense CCBAU 10071 T [37], sequenced for this study, are reported.

Classification and features
The strain ERR11 T is the type strain of the newly proposed B. shewense sp. nov. This novel species includes strains isolated from nodules of E. brucei [26], Indigofera spp. [30] and Crotalaria spp. [29] growing in Ethiopia. Previously, the strains were identified as a unique group using recA, glnII, and rpoB single-gene sequence analysis and on the phylogenetic tree constructed from concatenated recA-glnII-rpoB gene sequences. On the phylogenetic tree, the strains in the novel group formed their own cluster exclusive of validly published species, and consequently this group was designated Bradyrhizobium genosp ETH1 [5]. To define the current taxonomic position of the novel rhizobial species, we reconstructed a phylogenetic tree from concatenated recA-glnII-rpoB sequences, including additional, recently published reference sequences from the public database. In this phylogenetic tree, the bacterial grouping was consistent with our previous tree produced from concatenated recA, rpoB and glnII gene sequences [5]. The novel species formed a distinct group close to a B. ottawaense branch that contains strains isolated from the nodules of soybean (Glycine max) [38] grown in Ottawa, Canada [39] (Fig. 1). The average similarity of the recA-glnII-rpoB gene sequences (1411 bp) between the type strain ERR11 T and other strains in the novel species was in the range 99-100% (data not shown). The closest species was B. ottawaense [39], followed by B. liaoningense [40]. The similarity between strains in the novel group and strains in the closest species was 96%, and they all showed 95% average gene sequence similarity with strains in B. liaoningense (Table 5). The type strain ERR11 T showed 94-95% recA-glnII-rpoB gene sequence similarity with the type strains of neighboring branches: B. yuanmingense CCBAU 10071 T [37], Bradyrhizobium daqingense CGMCC 1.10947 T [41], B. arachidis LMG 26795 T [42,43] and Bradyrhizobium subterraneum 58 2-1 T [44].
Minimum Information about the Genome Sequence is provided in Table 1 and the Additional file 1: Table S1. The type strain ERR11 T is a rod-shaped Gram-negative strain and has a dimension of 1.0-2.3 μm length and 0.7-1.0 μm width (Fig. 2). The species includes slow-growing bacteria, forming creamy, raised, smooth margin colonies of 1-2 mm in diameter after 7-10 days of incubation on YEM agar plates at 28°C. The bacteria are able to grow at 15°C-30°C temperature, in 0.0-0.5% NaCl concentrations and in the pH range 5-10. The type strain ERR11 T and all other strains in the novel group were not able to grow at pH 4, at 4°C and 35°C, and in the 1-5% NaCl range (Additional file 1: Table S1). The carbon source utilization pattern of the type strain ERR11 T and other strains was tested as previously described [22] using Biolog GN2 plates with 95 carbon sources, following the manufacturer's guideline [45]. Concisely, bacterial colonies grown on YEM agar were transferred and incubated on R2A media. Bacterial suspension was made by transferring colonies from R2A media into 0.5% (w/v) saline solution. Then, each of the wells of the Biolog GN2 Microplate was filled with 150 μl of the suspension. The results were recorded as positive when the wells turned purple after 4, 24, 48 h or 96 h incubation at 28°C [46]. The carbon source utilization characteristics are presented in Additional file 2: Table S2.
Table 1 footnote: Evidence codes - IDA Inferred from Direct Assay, TAS Traceable Author Statement (i.e., a direct report exists in the literature), NAS Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [101]
Fig. 2 Gram stain and dimensions of B. shewense sp. nov. ERR11 T and B. yuanmingense CCBAU 10071 T
In general, the test strains showed a positive reaction for 66 of the carbon sources and a negative reaction for 29 of the carbon sources (Additional file 2: Table S2). Although the carbon utilization patterns differed only minimally among the test strains and between them and the reference strain B. yuanmingense CCBAU 10071 T [37], only the test strains responded positively to adonitol, xylitol, and cis-aconitic acid. Type strain CCBAU 10071 T and other strains in B. yuanmingense were first described as a distinct species using phenotypic features, SDS-PAGE analysis of whole-cell proteins, DNA-DNA hybridization and 16S rRNA gene sequence analyses [37]. In agreement with the previous study, the recA-glnII-rpoB sequence analysis in this study placed the strains belonging to B. yuanmingense on a distinct branch in Fig. 1. B. yuanmingense CCBAU 10071 T is motile and Gram-negative. The rod-shaped cells (Fig. 2) have dimensions of approximately 0.5 μm in width and 1.5-2.0 μm in length. The strain is slow-growing, forming colonies about 1-2 mm in diameter after 7 days of incubation at 28°C on YMA. The optimum growth temperature reported was between 25°C and 30°C [37]. The organism grows best at pH 6.5-7.5, and no growth was recorded at pH 5.0 or pH 10.0, at 10°C or 40°C, or with 1.0% NaCl in YEMA [37]. Minimum Information about the Genome Sequence (MIGS) of CCBAU 10071 T is provided in Table 1.

Symbiotaxonomy
The symbiotic properties of the strains in B. shewense sp. nov. were examined in our previous study [5]. The strains recovered from nodules of Indigofera spp. [30] and Crotalaria spp. [29] formed an effective symbiotic association with the original host plants and also with soybean plants [5]. The type strain ERR11 T and other strains were tested again in this study for nodulation and nitrogen fixation ability on E. brucei [26], Indigofera arrecta [47] and Crotalaria juncea [48] as well as on the food legumes soybean and peanut (Arachis hypogaea) [49]. All the sterilization and germination methods for I. arrecta [47] and C. juncea [48] seeds were as described previously [5]. Seeds from E. brucei [26], soybean and peanut were sterilized by soaking in 70% alcohol for 3 min and in a sodium hypochlorite solution for 3 min, followed by rinsing with 5-6 changes of sterilized water. E. brucei [26] seeds were germinated at room temperature (about 25°C) on 0.75% water agar or by wrapping in a sterilized paper towel. The soybean and peanut seeds were germinated at 28°C on 0.75% water agar. The symbiotic characteristics of B. shewense sp. nov. strains are presented in Additional file 1: Table S1. The results show that the type strain ERR11 T and other strains obtained from E. brucei [26], Crotalaria spp. [29] and Indigofera spp. [30] were able to form effective nodules on C. juncea plants [48]. Strains from E. brucei [26], including the type strain, were unable to form effective symbiotic associations with soybean plants. B. yuanmingense CCBAU 10071 T was isolated from the nodules of the legume Lespedeza cuneata [50] in Beijing, China. In addition to its original host, the strain was also able to form an ineffective symbiotic association with Medicago sativa [51] and Melilotus albus [37,52].

Genome project history
Type strains ERR11 T and CCBAU 10071 T were sequenced at the DOE-JGI as part of the Genomic Encyclopedia of Bacterial and Archaeal Type Strains, Phase III: the genomes of soil and plant-associated and newly described type strains sequencing project. Plant- and soil-associated bacteria were selected for sequencing so that their environmental and agricultural importance could be better understood from the sequence information. The sequencing project was also designed to produce genome sequence data that can be used for bacterial classification studies and for the description of new species using ANI and Genome-to-Genome-Distance values [36]. In our previous MLSA, the type strain ERR11 T, together with other strains, formed a distinct phylogenetic group that did not include any known Bradyrhizobium species and most likely represents a new species [5]. Therefore, the aim of sequencing the ERR11 T genome was to describe the group as a new species by comparing the genome sequence data of ERR11 T with the genome sequences of other Bradyrhizobium species present in public databases. For this purpose, the type strain CCBAU 10071 T [37] was also sequenced in this study to be used as a reference in our genome sequence comparison. The ERR11 T genome project is deposited at the DOE-JGI genome portal [53] as well as at the European Nucleotide Archive [54] under accession numbers FMAI01000001-FMAI01000102. The genome sequence of CCBAU 10071 T is also available at the DOE-JGI genome portal [53] and at the European Nucleotide Archive [55] under accession numbers FMAE01000001-FMAE01000108. Sequencing, assembly, finishing, and annotation were performed by the DOE-JGI [53]. The genome project information is provided in Table 2.

Growth conditions and genomic DNA preparation
The growth conditions and DNA isolation methods were as previously described [22]. In brief, the strains ERR11 T =HAMBI 3532 T and CCBAU 10071 T = LMG 21827 T were grown on YEM agar plates at 28°C for 7-10 days and a pure colony of the cultures was transferred and grown in YEM broth till the culture reached late-logarithmic phase. The genomic DNAs were extracted from cell pellets following the CTAB DNA extraction protocol of the DOE-JGI [56].

Genome sequencing and assembly
Strains ERR11 T and CCBAU 10071 T were sequenced at the DOE-JGI using Illumina technology [57]. An Illumina standard shotgun library was constructed and sequenced using the Illumina HiSeq 2000 platform, which produced 7,620,202 reads totaling 1150.7 Mb for ERR11 T and 9,923,442 reads totaling 1498.4 Mb for CCBAU 10071 T. Details regarding the general aspects of library construction and sequencing methods can be found at the DOE-JGI website [53]. Artifacts from Illumina sequencing and library preparation were removed by passing all raw Illumina sequence data through DUK, a filtering program developed by DOE-JGI [58]. The filtered Illumina reads were first assembled using Velvet (version 1.2.07) [59], and 1-3 kb simulated paired-end reads were created from the Velvet contigs using wgsim (version 0.3.0) [60]. The Illumina reads were then assembled with the simulated read pairs using Allpaths-LG (version r46652) [61]. The final draft genome assembly for strain ERR11 T is 9.2 Mb in size and contains 107 contigs in 102 scaffolds; the assembly for CCBAU 10071 T contains 109 contigs in 108 scaffolds with a total size of 8.2 Mb. The final assembly was based on 1399.7 Mb Illumina data and 225.2× input read coverage for strain ERR11 T, and on 1399.7 Mb Illumina data and 279.9× input read coverage for strain CCBAU 10071 T.

Genome annotation
Genes were first predicted by the Prodigal [62] program in the DOE-JGI annotation pipeline [63], followed by a round of manual curation using GenePRIMP [64]. The predicted CDSs were translated and functionally annotated by searching against the NCBI non-redundant database and the UniProt, TIGRFam, Pfam, KEGG, COG, and InterPro databases. The tRNAScanSE tool [65] was used to identify tRNA genes, and ribosomal RNA genes were predicted by searches against the ribosomal RNA genes in the SILVA database [66]. Other non-coding RNAs, such as the RNA components of the protein secretion complex and RNase P, were identified by searching the genomes for the corresponding Rfam profiles using INFERNAL [67]. Additional gene prediction and functional annotation of the predicted genes were accomplished using the Integrated Microbial Genomes (IMG) platform [68] developed by DOE-JGI [69].

Genome properties
The genome of ERR11 T consists of 102 scaffolds with a total size of 9,163,226 bp and a 63.2% G + C content. Of a total of 8634 genes, 8548 were protein-coding genes and 86 were RNA-encoding genes. The genome of CCBAU 10071 T is arranged in 108 scaffolds and has a size of 8,201,522 bp with a 63.8% G + C content; of the 7861 predicted genes, 7776 were protein-coding genes and 85 were RNA-encoding genes (Table 3). The majority of the protein-coding genes of ERR11 T (72.8%) and CCBAU 10071 T (72.6%) were assigned functions, and the remaining 2266 (26.3%) and 2073 (26.4%) genes were without a functional prediction for ERR11 T and CCBAU 10071 T, respectively. About 62% of the CDSs of ERR11 T and 63% of the CDSs of CCBAU 10071 T were assigned to COG functional categories. The distribution of the genes assigned to COG functional categories is presented in Table 4.
Table footnote: The total is based on the total number of protein coding genes in the genome

Insights from the genome sequence
Genome wide comparative analysis
The strains belonging to B. shewense sp. nov. formed their own group close to the B. ottawaense branch on the phylogenetic tree reconstructed from concatenated recA-glnII-rpoB gene sequences (Fig. 1). Comparative analysis of the genome sequences of type strain ERR11 T and relatively close reference strains was therefore done for a detailed taxonomic study of the unique group and to describe it as a novel species. Among the reference genomes presented in the Fig. 1 [37] sequenced in this study was also included in the comparative analyses.
To evaluate the similarity between the genomes, we calculated genome-wide ANI by averaging the nucleotide identity of orthologous genes identified as bidirectional best hits, as previously described [22,24]. Based on this method, 96.5% ANI and 0.6 AF were set as the threshold values for strains in the same species [24]. In addition, DDH values were predicted between the genomes using the Genome-to-Genome Distance Calculator (GGDC) [78,79]. This program computes the distance between genomes using three different formulas: 1, high-scoring segment pairs (HSPs)/total length; 2, identities/HSP length; 3, identities/total length. Formula 2 has proved to be robust and is the recommended method for comparing draft genomes [80].
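The three ratios listed above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions, not the GGDC implementation: the function name `ggdc_ratios` and the input numbers are hypothetical, and the conversion of these ratios into distances and DDH estimates performed by the actual program is not reproduced here.

```python
def ggdc_ratios(hsp_length, identities, total_length):
    """Return the three ratios named in the text as a dict.

    hsp_length:   summed length of high-scoring segment pairs (bp)
    identities:   identical base pairs within those HSPs
    total_length: total genome length used for normalization (bp)
    """
    if hsp_length <= 0 or total_length <= 0:
        raise ValueError("lengths must be positive")
    return {
        "formula_1": hsp_length / total_length,   # HSPs / total length
        "formula_2": identities / hsp_length,     # identities / HSP length
        "formula_3": identities / total_length,   # identities / total length
    }


# Invented numbers, for illustration only.
ratios = ggdc_ratios(hsp_length=6_000_000, identities=5_400_000,
                     total_length=9_000_000)
for name, value in ratios.items():
    print(name, round(value, 3))
```

Only formula 2 is independent of the total genome length, which is one way to see why it is less sensitive to the incompleteness of draft assemblies.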
The ANI values and estimated DDH results are presented in Table 5. The soybean-nodulating strain USDA 4 was previously classified as B. japonicum USDA 4 based on sequence analysis of the 16S rRNA gene and the internally transcribed spacer region of the 5′-23S rRNA gene [73]. However, the ANI value between the type strain B. japonicum USDA 6 T and USDA 4 was 90.2% and the DDH value between the two was 39.0%, suggesting that USDA 4 does not belong to the B. japonicum species. The strains USDA 4 and CCBAU 15615 were tightly grouped with strains in B. ottawaense on the phylogenetic tree in Fig. 1. Both USDA 4 and CCBAU 15615 shared 99% recA-glnII-rpoB sequence identity with B. ottawaense OO99 T and B. ottawaense OO100 [39]. Even though the reference strains OO99 T and OO100 were not sequenced and not included in our ANI calculation, the recA-glnII-rpoB sequence analysis strongly indicates that both USDA 4 and CCBAU 15615 belong to B. ottawaense. The ANI values between type strain  1%) is below the threshold of 70%, which is a commonly used value for species delineation [78,79]. In agreement with the recA-glnII-rpoB gene sequence analysis, both the ANI and DDH results revealed that the closest strains to ERR11 T were strains belonging to the B. ottawaense group (CCBAU 15615 and USDA 4). Nevertheless, both the ANI and DDH values between ERR11 T and CCBAU 15615 or USDA 4 were below the cutoff values for strains of the same species, suggesting that ERR11 T belongs to a novel group. Shared orthologous protein clusters between the genomes of ERR11 T and the closest reference strains USDA 4 and CCBAU 83689 were identified using the OrthoVenn program [81] as described previously [22]. The orthologous clusters are shown in a Venn diagram (Fig. 3). The numbers of protein clusters identified in ERR11 T, USDA 4 and CCBAU 83689 were 6850, 5897 and 6923, respectively.
In the genome of ERR11 T , 99 of the clusters were identified as unique protein clusters without homologs in the other genomes. In USDA 4 and CCBAU 83689, 44 and 77 protein clusters respectively, were also identified as unique clusters with no detectable homologous with other genomes. Of the total proteins used in the analysis 1456, 2028, 1102 were single copy gene clusters in ERR11 T , USDA 4 and CCBAU 83689, respectively. Of the clusters, in total 5310 homologous protein clusters were shared in common by all of the three genomes. Strain ERR11 T shares about 76.7% (6560) of its proteins with USDA 4 and 64.4% (5501) clusters with CCBAU 83689. Based on the pairwise comparison, ERR11 T shared the highest number with strain USDA 4 with 1250 protein clusters and ERR11 T shared only 191 protein clusters with CCBAU 83689. This result is in accordance with the phylogenetic tree (Fig. 1), ANI and DDH results ( Table 5), supporting that strain USDA 4 (in the B. ottawaense species group) is more closely related to ERR11 T compared to strain CCBAU 83689 (in B. liaoningense).
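The cluster counts reported for ERR11 T can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes that the 1250 and 191 clusters are those shared exclusively with USDA 4 and with CCBAU 83689, respectively, and that the percentages use the 8548 protein-coding genes of ERR11 T as the denominator. Both assumptions are inferred from the numbers rather than stated in the text.

```python
# Cluster counts reported in the text for ERR11T.
core = 5310            # clusters shared by all three genomes
with_usda4 = 1250      # assumed: shared only with USDA 4
with_ccbau = 191       # assumed: shared only with CCBAU 83689
unique = 99            # clusters found only in ERR11T
protein_coding = 8548  # assumed denominator for the percentages


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


total_clusters = core + with_usda4 + with_ccbau + unique
shared_usda4 = core + with_usda4
shared_ccbau = core + with_ccbau

print(total_clusters)                                       # 6850
print(shared_usda4, percent(shared_usda4, protein_coding))  # 6560 76.7
print(shared_ccbau, percent(shared_ccbau, protein_coding))  # 5501 64.4
```

Under these assumptions the totals and percentages agree with the values given in the text (6850 clusters; 76.7% and 64.4%).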

Comparative analysis of genes linked to symbiosis and denitrification
Symbiotic genes
The nodulation genes (nod, nol, noe) for the synthesis of the backbone of LCO Nod factors and their substituent groups, and the genes coding for nitrogen fixation (nif, fix), are required in the rhizobia-legume symbiosis [70,72]. To search for the symbiotic genes in ERR11 T and CCBAU 10071 T, the genomes were compared against the completely sequenced genomes of USDA 110 T and USDA 6 T using the Genome Gene Best Homologs package of the IMG-ER program [69]. In addition, the symbiotic genes were compared against other draft Bradyrhizobium genomes: LMG 26795 T, CGMCC 1.10947 T, CGMCC 1.10948 T, USDA 4, and CCBAU 05525. To examine the arrangement of the symbiotic genes, the genome of ERR11 T and the reference genomes were aligned using the progressive Mauve alignment method [82]. A summary of the symbiotic genes identified in ERR11 T and CCBAU 10071 T, their locations in the genomes and their resemblance to genes in the reference genomes is given in Additional file 3: Table S3. The main nodulation genes (nolYA-nodD2D1YABCSUIJ-nolO-nodZ) are located in scaffolds Ga0061098_1039 and Ga0061099_1014 in the genomes of ERR11 T and CCBAU 10071 T, respectively. The result of the Mauve alignment (Fig. 4) shows that these genes are homologous and organized in the same region (module), similarly to the genes found in the genomes of USDA 110 T, USDA 4, and CCBAU 05525. Additional nodulation genes of ERR11 T are scattered in scaffolds Ga0061098_1005 (nodWV, nodM, noeL, nolXWTUV), Ga0061098_1016 (nodU), Ga0061098_1006 (nodT) and Ga0061098_1031 (noeE, noeI). These genes were also identified in the genome of CCBAU 10071 T in Ga0061099_1013 and Ga0061099_1018, Ga0061099_1014, Ga0061099_1005 and Ga0061099_1022, respectively.
In the genome of ERR11 T , the genes coding for the nitrogen-fixing nitrogenase complex [83] are mainly located in scaffolds Ga0061098_1005 (nifDKENX-nifT-nifB-nifZ-nifHQV-fixBCX), and Ga0061098_1039 (fixR-nifA-fixA). The nif/fix genes in the genome of CCBAU 10071 T are distributed in scaffolds Ga0061099_1013 (nifDKENX), Ga0061099_1041 (nifT-nifB-nifZ), Ga0061099_1036 (nif HQV-fixBCX), and Ga0061099_1014 (fixR-nifA-fixA). The fix genes (fixK2-fixJL-fixNOPGHIS), which are required for creating microoxic respiration for the rhizobia during symbiosis, are also conserved in the genomes of ERR11 T (in scaffold Ga0061098_1024) and CCBAU 10071 T (in scaffold Ga0061099_10014) in a similar fashion as the homologous genes found in USDA110 T . Generally, the nodulation and nitrogen fixation genes of ERR11 T and CCBAU 10071 T showed 70.0-100% sequence similarity with homologous genes found in the reference genomes of USDA110 T , USDA6 T , USDA 4, CCBAU 23303 T , CGMCC 1.10947 T , and CCBAU 05525 (Additional file 3: Table S3). The nodulation and nitrogen fixation genes of ERR11 T mostly showed the highest sequence similarities (>90%) specifically with homologous genes found in the genome of peanut-nodulating strain LMG 26795 T , suggesting that these strains may have a similar origin of symbiotic genes.
Nitrogen fixation in symbiosis is an ATP-dependent energy intensive reaction, where energy is released in the form of H 2 as a result of the reduction of N 2 by nitrogenase. The rhizobia which have hydrogen-uptake systems are capable of recycling the released H 2 in the rhizobia-legume symbiosis [84]. This way some rhizobia increase the energy efficiency in symbiosis and consequently the nitrogen-fixation and legume productivity. The hydrogenase uptake complex is coded by clusters of hupNCUVSLCDFGHIJK, hypABFCDE, and hoxXA genes [70,72,85,86]. Clusters of hupSLCFHK and hypBFDE genes were identified in the genomes of ERR11 T in scaffold Ga0061098_1005 and in CCBAU 10071 T in scaffold Ga0061099_1013 (Additional file 3: Table S3). The composition of hydrogenase genes in the clusters hup, hyp and hox and their expression can be different between rhizobial species and are also missing in some rhizobia [70,72,84]. Rhizobia with the functional hydrogenase uptake system, such as strain USDA 110 T contained a complete set of hup-hyp-hox genes [72]. In the genomes of ERR11 T and CCBAU 10071 T , some of the genes are missing or incomplete. Therefore, further study and complete sequencing may confirm if the hydrogenase uptake system is functional in these strains.

Denitrifying genes
Denitrification is a process by which NO 3 − and NO 2 − are reduced to N 2 when NO 3 − or NO 2 − is used by microorganisms as a final electron acceptor for respiration as an  [87]. Thus, denitrification result in nitrogen losses from terrestrial and aquatic ecosystems and also contribute to the production of a potent greenhouse gas, N 2 O. The denitrification is common among the bacteria in the Proteobacteria class and also in Archaea [88]. Symbiotic nitrogen-fixing rhizobia, particularly species belonging to Bradyrhizobium were reported to be involved in the denitrification process in low oxygen environments [89]. All or some of the genes for NO 3 − , NO 2 − , NO and N 2 O reductions were found in several rhizobial species investigated thus far [90] and emission of N 2 O by symbiotic rhizobia inside the root nodules was reported [91,92]. Nitrogen-fixing USDA 110 T is known to denitrify as free living and also in the symbiotic condition in root nodules of soybean [89,93]. Strain USDA 110 T requires napEDABC, nirK, norCBQD, and nosRZDFYLX gene clusters for NO 3 − , NO 2 − , NO and N 2 O reductase, respectively [90]. In the genome of ERR11 T , the napCBADE, norDQBCE, nosRZDYL cluster of genes are present in the scaffolds Ga0061098_1005, Ga0061098_1001, and Ga0061098_1006, respectively. The gene for nitrite reductase (nir) was not found in ERR11 T . Therefore, the nitrite reductase activity may be lacking in ERR11 T and denitrification in this strain may depend only on nitrate, nitric oxide, and nitrous oxide reductase reactions. The genome of CCBAU 10071 T harbors only denitrifying genes napEDABC and norCBQD for nitrate and nitric oxide reduction, respectively (Additional file 3: Table S3). 
Further experimental study with appropriate methods and techniques can help to understand better the presence of denitrification enzyme activities in the type strains ERR11 T and CCBAU 10071 T and to confirm if the type strains are involved in the denitrification process and N 2 O emission.
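The gene inventory described in this section can be summarized as a lookup from denitrification steps to gene clusters. The following is a minimal sketch for illustration: the names `STEPS`, `GENOMES` and `missing_steps` are hypothetical, and each cluster family is reduced to a short label (nap, nir, nor, nos). Absence of a cluster from a draft assembly does not prove that the gene is absent from the organism.

```python
# Denitrification steps and the gene cluster families the text links to them.
STEPS = [
    ("nitrate reduction", "nap"),
    ("nitrite reduction", "nir"),
    ("nitric oxide reduction", "nor"),
    ("nitrous oxide reduction", "nos"),
]

# Cluster families reported as present in each draft genome.
GENOMES = {
    "ERR11T": {"nap", "nor", "nos"},
    "CCBAU 10071T": {"nap", "nor"},
}


def missing_steps(present):
    """List the steps whose gene cluster family is not in `present`."""
    return [step for step, gene in STEPS if gene not in present]


for strain, present in GENOMES.items():
    print(strain, missing_steps(present))
```

For ERR11 T this reports only nitrite reduction as missing, and for CCBAU 10071 T it reports nitrite and nitrous oxide reduction, matching the description in the text.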

Conclusion
In this study, we present the genome sequences of B. shewense sp. nov. strain ERR11 T and the type strain B. yuanmingense CCBAU 10071 T. The draft genome sizes of ERR11 T and CCBAU 10071 T are about 9.2 Mbp and 8.2 Mbp, respectively. Type strain CCBAU 10071 T was selected for sequencing to be used as a reference for our comparative genomic analysis. The genomes of the type strains ERR11 T and CCBAU 10071 T carry genes for nodulation, nitrogen fixation and the hydrogen-uptake system as well as genes for denitrification. The nod genes nolY-nolA-nodD2-nodD1YABCSUIJ-nolO-nodZ in the genomes of ERR11 T and CCBAU 10071 T are organized similarly to the homologous genes identified in the genomes of USDA 110 T, USDA 4, and CCBAU 05525. The nodulation and nitrogen fixation genes of ERR11 T share high sequence similarity with those of the peanut-nodulating type strain B. arachidis LMG 26795 T [42,43]. The denitrification genes nap, nor and nos of ERR11 T and nap and nor of CCBAU 10071 T are homologous to the genes found in the genome of USDA 110 T, a known denitrifying rhizobium, indicating that ERR11 T and CCBAU 10071 T may be involved in the reduction of nitrate, nitric oxide, or nitrous oxide. Based on the phylogenetic analyses of recA-glnII-rpoB sequences, the strains (ERR2A, ERR2B, ERR11, ERR13, CIR42, CSR10B, IAR8 and AURI6) belonging to the novel species formed a unique group within the genus Bradyrhizobium. To verify this result, comparative genomic analyses based on ANI calculation and DDH methods were done. The results from both ANI and DDH supported the result of the phylogenetic analysis: the genome of the type strain ERR11 T showed 95.2% ANI and 53.1% DDH similarity with the closest reference strain USDA 4. These values are lower than the 96.5% ANI and 70% DDH cutoff values set for strains of the same species. These results support the conclusion that B. shewense sp. nov. should be considered a new Bradyrhizobium species.
Therefore, based on the phylogenetic analysis, the ANI and DDH results and the phenotypic characteristics, we formally propose the creation of B. shewense sp. nov., which contains the type strain ERR11 T (= HAMBI 3532 T = LMG 30162 T). The type strain forms an effective nitrogen-fixing symbiosis with E. brucei [26], I. arrecta [47] and peanut.
Description of Bradyrhizobium shewense sp. nov.