High-quality permanent draft genome sequence of Ensifer sp. PC2, isolated from a nitrogen-fixing root nodule of the legume tree (Khejri) native to the Thar Desert of India

Ensifer sp. PC2 is an aerobic, motile, Gram-negative, non-spore-forming rod that was isolated from a nitrogen-fixing nodule of the tree legume Prosopis cineraria (L.) Druce (Khejri), which is a keystone species that grows in arid and semi-arid regions of the Indian Thar Desert. Strain PC2 exists as a dominant saprophyte in alkaline soils of Western Rajasthan. It is fast growing, well-adapted to arid conditions and is able to form an effective symbiosis with several annual crop legumes as well as species of mimosoid trees and shrubs. Here we describe the features of Ensifer sp. PC2, together with genome sequence information and its annotation. The 8,458,965 bp high-quality permanent draft genome is arranged into 171 scaffolds of 171 contigs containing 8,344 protein-coding genes and 139 RNA-only encoding genes, and is one of the rhizobial genomes sequenced as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) project proposal.

Electronic supplementary material: The online version of this article (doi:10.1186/s40793-016-0157-7) contains supplementary material, which is available to authorized users.


Introduction
The genus Prosopis (family Leguminosae, sub-family Mimosoideae [1]) comprises about 44 species that are widely distributed in the world's semi-arid regions, mostly in North and South America with a few species found in Africa and south west Asia [2][3][4]. Several species have been widely introduced throughout the world over the last 200 years [5]. Prosopis may have evolved from P. africana (Guill. & Perr.) Taub., in which various character traits and small genome size (392-490 Mbp) indicate that it is a primitive species [2]. According to Burkart [2], Prosopis is an old genus that diverged early into several principal lineages, with some of these lineages producing more recent episodes of speciation. This is supported by a recent molecular dating analysis that places the divergence of the New World Prosopis Sections during the Oligocene (33.9 to 23.03 Mya) [6], which is remarkably ancient considering that the subfamily Mimosoideae originated between 42-50 Mya [7]. Section Prosopis consists of three species, Prosopis cineraria (L.) Druce, P. farcta (Banks et Sol.) Eig. and P. koelziana Burkart, which are native to North Africa and Asia [6].
P. cineraria is endemic to arid and semi-arid regions of the Indian Thar Desert and is designated as the state tree of Rajasthan [8]. It symbolizes the sacred mythological "Kalpa Vriksh" (wish tree) of the desert and is historically important, as it has been worshiped since ancient times by many rural communities in these arid regions. P. cineraria is a multipurpose tree used as food, fodder, shelter and medicine by the local inhabitants. It is an important component of agroforestry, agrisilvicultural and silvopastoral systems in the alkaline soil of the Thar Desert. The tree is extremely drought and salt tolerant, having a deep root system (>100 metres) that helps in acquiring nutrients and moisture from deeper soil layers. It produces green pods that are rich in nutrients and antioxidants and are eaten as a vegetable in the hot summer [9]. P. cineraria is a good candidate for rehabilitation of dry, marginal or degraded lands of low fertility and/or high salinity. It plays a vital role as a soil binder in the stabilization of sand dunes and enriches poor desert soil by fixing atmospheric nitrogen in association with its rhizobial microsymbionts [10][11][12][13].
Prosopis is a promiscuous genus, being nodulated by a wide range of taxonomically diverse rhizobia. Mesquite (Torr.) in the Sonoran Desert, California is nodulated by diverse strains of fast- and slow-growing rhizobia [14]. Mesorhizobium chacoense CECT 5336 T is a microsymbiont of Prosopis alba Griseb. growing in the Chaco Arido region in Argentina [15], whereas in Spain this species is nodulated by strains of Ensifer medicae, E. meliloti and Rhizobium giardinii [16]. In Africa, the introduced Prosopis species P. chilensis (Molina) Stuntz, P. cineraria, P. juliflora (Sw.) DC. and P. pallida (Willd.) Kunth are reported to nodulate with strains of Ensifer arboris, E. kostiense, E. saheli and E. terangae [17,18], and P. juliflora also forms effective symbioses with strains of Mesorhizobium plurifarium [19] and Rhizobium etli [20]. Nodulation of P. cineraria growing in its native range was first described by Basak and Goyal [10]. Recently, P. cineraria and other native legumes growing in the alkaline soils of the Thar Desert have been reported to nodulate with a dominant novel group of Ensifer strains (PC2, TW10, TP13, RA9, TV3 and TF7) that are closely related to African and Australian Ensifer strains on the basis of 16S rRNA sequence similarity, but form a distinct, well-separated cluster [21,22].
The indigenous rhizobia of wild tree legumes growing in such arid and harsh environments have superior tolerance to abiotic factors such as salt stress, elevated temperatures and drought and can be used as inoculants for wild as well as crop legumes cultivated in reclaimed desert lands [10]. Because of its ability to nodulate the keystone species P. cineraria as well as crop legumes such as Vigna radiata (L.) R.Wilczek and V. unguiculata (L.) Walp. [21], strain PC2 was selected as part of the DOE Joint Genome Institute 2010 Genomic Encyclopedia for Bacteria and Archaea-Root Nodule Bacteria (GEBA-RNB) sequencing project [23]. Here we present a summary classification and a set of general features for Ensifer sp. strain PC2, together with a description of its genome sequence and annotation.

Classification and features
Ensifer sp. PC2 is a motile, Gram-negative strain in the order Rhizobiales of the class Alphaproteobacteria. The rod shaped form (Fig. 1 Left and Center) has dimensions of approximately 0.3-0.5 μm in width and 1.25-1.5 μm in length. It is fast growing, forming colonies within 3-4 days when grown on half strength Lupin Agar (½LA) [24], tryptone-yeast extract agar (TY) [25] or a modified yeast-mannitol agar (YMA) [26] at 28°C. Colonies on ½LA are white, opaque, slightly domed and slightly mucoid with smooth margins (Fig. 1 Right). Figure 2 shows the phylogenetic relationship of Ensifer sp. PC2 in a 16S rRNA sequence based tree. This strain is most similar to Ensifer saheli LMG 7837 T based on the 16S rRNA gene alignment, with a sequence identity of 99.41 % over 1,366 bp, as determined using the EzTaxon-e database, which contains the sequences of validly published type strains [27]. The PC2 16S rRNA gene sequence has 100 % sequence identity with that of another Indian Thar Desert rhizobial strain, Ensifer sp. TW10, isolated from a nodule of the perennial legume Tephrosia wallichii [22]. Minimum Information about the Genome Sequence for PC2 is provided in Table 1 and Additional file 1: Table S1.
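As a rough illustration of what the reported 16S rRNA identity means, the sketch below computes percent identity over the ungapped columns of a pairwise alignment and back-calculates the number of mismatches implied by 99.41 % over 1,366 bp. The function name, the toy sequences and the gap-handling rule are assumptions made for illustration; EzTaxon-e may count gaps and ambiguous bases differently, and the mismatch count is inferred here rather than reported in the text.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gaps are written as '-' and gapped columns are excluded
    from both the numerator and the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


# Toy alignment (hypothetical sequences, not the PC2 16S rRNA gene).
toy_identity = percent_identity("ACGT-ACGTA", "ACGTTACGAA")

# Mismatches implied by the reported identity over the reported length.
reported_identity = 99.41   # percent, from the text
aligned_length = 1366       # bp, from the text
implied_mismatches = round(aligned_length * (1 - reported_identity / 100))
recomputed_identity = 100 * (aligned_length - implied_mismatches) / aligned_length

print(f"toy identity: {toy_identity:.2f} %")
print(f"implied mismatches: {implied_mismatches}")
print(f"recomputed identity: {recomputed_identity:.2f} %")
```

Under these assumptions the reported figure corresponds to only a handful of differing positions between PC2 and E. saheli LMG 7837 T across the compared region.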

Symbiotaxonomy
Ensifer sp. strain PC2 is able to nodulate and fix nitrogen with both mimosoid and papilionoid legume hosts. It is interesting to note that Ensifer sp. PC2 is able to nodulate and fix nitrogen with Acacia saligna (Labill.) Wendl., a promiscuous legume tree that mainly nodulates with species of in its native southwestern Australia range [28]. PC2 also effectively nodulates the Central American mimosoid tree Leucaena leucocephala (Lam.) de Wit. PC2 appears to be a relatively promiscuous strain that has potential to be used as an inoculant for crop legume species such as Vigna radiata (L.) R.Wilczek and V. unguiculata (L.) Walp. The symbiotic characteristics of Ensifer sp. strain PC2 on a range of selected hosts are summarised in Additional file 2: Table S2.

Genome project history
This organism was selected for sequencing on the basis of its environmental and agricultural relevance to issues in global carbon cycling, alternative energy production, and biogeochemical importance, and is part of the Genomic Encyclopedia of Bacteria and Archaea, The Root Nodulating Bacteria chapter project at the U.S. Department of Energy, Joint Genome Institute. The genome project is deposited in the Genomes OnLine Database [29] and a high-quality permanent draft genome sequence is deposited in IMG [30]. Sequencing, finishing and annotation were performed by the JGI [31]. A summary of the project information is shown in Table 2.

Growth conditions and genomic DNA preparation
Ensifer sp. PC2 was streaked onto TY solid medium [25,32] and grown at 28°C for three days to obtain well grown, well separated colonies, then a single colony was selected and used to inoculate 5 ml TY broth medium. The culture was grown for 48 h on a gyratory shaker (200 rpm) at 28°C. Subsequently 1 ml was used to inoculate 60 ml TY broth medium and grown on a gyratory shaker (200 rpm) at 28°C until an OD600nm of 0.6 was reached. DNA was isolated from 60 ml of cells using a CTAB bacterial genomic DNA isolation method [http://jgi.doe.gov/collaborate-with-jgi/pmo-overview/protocols-sample-preparation-information/]. Final concentration of the DNA was 0.5 mg ml⁻¹.

Fig. 2 (caption fragment) (The species name "Sinorhizobium chiapanecum" has not been validly published.) Azorhizobium caulinodans ORS 571 T was used as an outgroup. All sites were informative and there were no gap-containing sites. Phylogenetic analyses were performed using MEGA, version 6 [44]. The tree was built using the Maximum-Likelihood method with the General Time Reversible model [45]. Bootstrap analysis [46] with 500 replicates was performed to assess the support of the clusters. Type strains are indicated with a superscript T. Strains with a genome sequencing project registered in GOLD [29] are in bold font and the GOLD ID is provided after the GenBank accession number, where this is available. Finished genomes are indicated with an asterisk.

Genome sequencing and assembly
The draft genome of Ensifer sp. PC2 was generated at the JGI using the Pacific Biosciences (PacBio) technology. A PacBio SMRTbell™ library was constructed and sequenced on the PacBio RS platform, which generated 403,200 filtered subreads totalling 1.1 Gbp. All general aspects of library construction and sequencing performed at the JGI can be found on the JGI website [http://jgi.doe.gov/]. The raw reads were assembled using HGAP (version: 2.0.12.0.1) [33]. The final draft assembly contained 171 contigs in 171 scaffolds, totalling 8.5 Mbp in size. The input read coverage was 181.5x.
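The sequencing and assembly figures above can be related to one another with simple arithmetic. The following minimal sketch derives the mean filtered subread length, the mean contig length and the fold coverage that the filtered subreads alone would give. The variable names are illustrative, and 1.1 Gbp is a rounded value, so the results are approximate. The coverage computed this way need not equal the reported input read coverage of 181.5x, which was presumably calculated by the assembly pipeline from a different read set or length basis (an assumption; the text does not say).

```python
# Values taken from the text.
filtered_subreads = 403_200
filtered_bases = 1.1e9          # bp, rounded in the source (1.1 Gbp)
assembly_size = 8_458_965       # bp
contig_count = 171

# Derived quantities (approximate because filtered_bases is rounded).
mean_subread_length = filtered_bases / filtered_subreads
mean_contig_length = assembly_size / contig_count
fold_coverage = filtered_bases / assembly_size

print(f"mean filtered subread length: {mean_subread_length:,.0f} bp")
print(f"mean contig length: {mean_contig_length:,.0f} bp")
print(f"coverage from filtered subreads: {fold_coverage:.1f}x")
```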

Genome annotation
Genes were identified using Prodigal [34] as part of the DOE-JGI genome annotation pipeline [35,36]. The predicted CDSs were translated and used to search the National Center for Biotechnology Information nonredundant database, UniProt, TIGRFam, Pfam, KEGG, COG, and InterPro databases. The tRNAScanSE tool [37] was used to find tRNA genes, whereas ribosomal RNA genes were found by searches against models of the ribosomal RNA genes built from SILVA [38]. Other non-coding RNAs such as the RNA components of the protein secretion complex and the RNase P were identified by searching the genome for the corresponding Rfam profiles using INFERNAL [39]. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes platform [40] developed by the Joint Genome Institute, Walnut Creek, CA, USA [41].

Genome properties
The genome is 8,458,965 nucleotides with 61.32 % GC content (Table 3).
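For readers who want to reproduce this kind of summary from an assembly file, the sketch below shows one way to compute GC content from FASTA-formatted contigs, and sums the two gene categories reported in the abstract. The helper names and the toy FASTA records are hypothetical, ambiguous bases such as N are simply ignored, and the gene total assumes that no gene category other than protein-coding and RNA-only genes is counted. It is not the method used by the JGI pipeline.

```python
from typing import Dict, Iterable


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {record_id: sequence} (sequences upper-cased)."""
    records: Dict[str, list] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]   # record id is the first word of the header
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {key: "".join(parts) for key, parts in records.items()}


def gc_percent(sequences: Iterable[str]) -> float:
    """GC percentage over all A/C/G/T bases; other symbols are ignored."""
    gc = 0
    total = 0
    for seq in sequences:
        for base in seq:
            if base in "GC":
                gc += 1
                total += 1
            elif base in "AT":
                total += 1
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return 100.0 * gc / total


# Toy contigs (hypothetical data, not the PC2 assembly).
toy_fasta = ">contig_1\nGGCCAT\nGC\n>contig_2\nATGCNN\n"
toy_gc = gc_percent(parse_fasta(toy_fasta).values())

# Gene counts reported in the abstract.
protein_coding_genes = 8_344
rna_only_genes = 139
total_reported_genes = protein_coding_genes + rna_only_genes

print(f"toy GC content: {toy_gc:.2f} %")
print(f"genes reported (protein-coding + RNA-only): {total_reported_genes}")
```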

Conclusion
Based on the 16S rRNA gene alignment, Ensifer sp. PC2 is most closely related to Ensifer sp. TW10 and Ensifer sp. WSM1721, two strains isolated from perennial legumes growing in arid climates and alkaline soils in India and Australia, respectively [21,42]. Ensifer fredii strains isolated from Chinese soybean were also superdominant in sampling sites with alkaline-saline soils [43], which suggests that the biogeographic distribution of several Ensifer spp. is linked to their adaptation to alkaline soils. Further, this suggests that the symbiotic associations formed by promiscuous legumes, such as Prosopis, are likely to vary depending on which rhizobial genera are best adapted to the edaphic conditions in which the host is growing. The ability of PC2 to fix nitrogen with both P. cineraria (L.) Druce and the crop legumes Vigna radiata (L.) R.Wilczek and V. unguiculata (L.) Walp. makes it a valuable inoculant strain for use in arid, alkaline regions such as the Thar Desert. Analysis of the sequenced PC2 genome and comparison with the genomes of sequenced Ensifer spp. and other rhizobia will provide insights into the molecular basis of the patterns seen in rhizobial biogeographic distributions and associations with plant hosts, and into the molecular determinants of rhizobial tolerance to arid and alkaline environments.