Complete genome sequence of the nitrogen-fixing bacterium Azospirillum humicireducens type strain SgZ-5T

Azospirillum humicireducens strain SgZ-5T, belonging to the order Rhodospirillales and the family Rhodospirillaceae, was isolated from a microbial fuel cell inoculated with paddy soil. Previous work showed that strain SgZ-5T can fix atmospheric nitrogen, a trait involved in plant growth promotion. Here we present the complete genome of A. humicireducens SgZ-5T, which consists of a circular chromosome and six plasmids, with a total genome size of 6,834,379 bp and an average GC content of 67.55%. Genome annotation predicted 5969 protein-coding genes and 85 RNA genes, including 14 rRNA and 67 tRNA genes. Genomic analysis identified a complete set of genes potentially involved in nitrogen fixation and its regulation. The genome also harbors numerous genes that are likely responsible for phytohormone production. We anticipate that the A. humicireducens SgZ-5T genome will contribute insights into the plant growth-promoting properties of Azospirillum strains.

Electronic supplementary material: The online version of this article (10.1186/s40793-018-0322-2) contains supplementary material, which is available to authorized users.


Introduction
Bacteria that live in the plant rhizosphere and possess a large array of potential mechanisms to enhance plant growth are considered plant growth-promoting rhizobacteria (PGPR) [1][2][3]. Azospirillum is a well-characterized genus of PGPR owing to its capacity to fix atmospheric nitrogen [4,5]. Although the exact contribution of biological nitrogen fixation by Azospirillum to plant growth promotion is debated [2], agricultural applications of the genus continue to be developed [6,7]. Another main characteristic of Azospirillum proposed to explain plant growth promotion is its ability to produce phytohormones [8,9].
At present, there are 17 species within the genus Azospirillum [10]. The nitrogen-fixing bacterium A. humicireducens SgZ-5T, the focus of this study, was initially isolated from the anode biofilm of a microbial fuel cell (MFC). A soil sample collected from a paddy field in Guangzhou City, Guangdong Province, China (23.18° N, 113.36° E) was used as the inoculum for the MFC. In a previous report [11], the nitrogen-fixing capability of strain SgZ-5T was confirmed by an acetylene-reduction assay and identification of a nifH gene. Furthermore, this strain can grow under anaerobic conditions via the oxidation of various organic compounds coupled to the reduction of humus [11], indicating its potential for use in the plant rhizosphere. Here, we describe the physiological features together with the whole genome sequence of A. humicireducens SgZ-5T.
A phylogenetic tree was constructed by aligning the 16S rRNA gene sequences of strain SgZ-5T and the type strains of the genus Azospirillum in MEGA 5 using the neighbour-joining method [12]. The phylogenetic position of strain SgZ-5T is shown in Fig. 2: A. humicireducens groups within the genus Azospirillum and forms a distinct subclade together with A. lipoferum, which is widely used as a biofertilizer in agricultural production [13,14]. The 16S rRNA gene of strain SgZ-5T is 98% similar to that of A. lipoferum NCIMB 11861T. Since the nifH gene is highly conserved among nitrogen-fixing Proteobacteria [15], a nifH-based phylogenetic tree was constructed to identify the relationship of A. humicireducens to other species within the genus Azospirillum and related genera (Additional file 1). The phylogenetic reconstruction indicated a close relationship between the nifH gene of A. humicireducens SgZ-5T and that of Azospirillum sp. B510.
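As a rough illustration of how a similarity figure such as the 98% reported for the 16S rRNA gene can be derived from an alignment, the sketch below computes percent identity between two pre-aligned sequences. The function name and the toy sequences are hypothetical, and skipping gap-containing columns is an assumption; the original analysis was carried out in MEGA 5, which may treat gaps and ambiguous bases differently.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Illustrative helper only (not the authors' procedure).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0  # columns without a gap in either sequence
    matches = 0   # columns where both bases are identical
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real 16S rRNA sequences)
toy_a = "ACGTACGTAC"
toy_b = "ACGTACGTAT"
print(percent_identity(toy_a, toy_b))
```

For full-length 16S rRNA genes, the two sequences would first need to be aligned with a dedicated tool; the function above only scores an existing alignment.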

Genome sequencing information
Genome project history
A. humicireducens SgZ-5T was selected for genome sequencing on the basis of its biotechnological potential in agricultural applications as a PGPR likely harboring multiple plant growth-promoting properties (PGPP) [11]. The complete genome sequences have been deposited at GenBank/EMBL/DDBJ under the accession numbers CP015285.1 and CP028902-CP028907. Project information is available from the Genomes Online Database under number Gp0150267 at the Joint Genome Institute. Table 2 summarizes the project information and its association with the Minimum Information about a Genome Sequence (MIGS) [16].

Growth conditions and genomic DNA preparation
A. humicireducens SgZ-5T was routinely cultured at 30°C in NB medium containing (per liter) 5 g peptone, 3 g beef extract and 5 g NaCl. For genome sequencing, total genomic DNA was extracted from 10 mL of overnight culture using a DNA extraction kit (Aidlab) following the manufacturer's instructions. The genomic DNA was quantified and quality-checked using a Qubit fluorometer (Invitrogen, CA, USA) with the Qubit dsDNA BR Assay kit and by 0.7% agarose gel electrophoresis with a λ-HindIII digest DNA marker.

Genome sequencing and assembly
Complete genome sequencing was performed on an Illumina HiSeq 2500 system using three DNA libraries (a paired-end library with an insert size of 491 bp, and two mate-pair libraries with insert sizes of 2.5 and 6.9 kb). After filtering out low-quality reads and reads containing Illumina PCR adapters, a total of 1967 Mb of clean data were obtained from 2052 Mb of raw data. Subsequently, all reads were assembled de novo into a circular contig with 259-fold genomic coverage using SOAPdenovo v.2.04 [17]. Detailed genome sequencing project information is shown in Table 2.

Genome properties
The genome properties and statistics are summarized in Table 4, and a graphical map is presented in Fig. 3. Furthermore, 4550 genes (75.2%) were assigned to 21 COG functional categories. The distribution of genes into the different COG functional categories is provided in Table 5. Six Azospirillum genomes of characterized strains (including A. humicireducens) are compared in Table 6. Almost all of these genomes consist of 6-8 replicons, with a total size of 6.5-7.6 Mb, an average GC content of 67.5-70.7%, and a total gene count in the range of 5951 to 6982 [3,6,26,27]. Furthermore, the main features of the A. humicireducens SgZ-5T genome are close to those of the A. lipoferum 4B genome.
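The 75.2% figure can be checked against the gene counts given in the abstract. The short calculation below is only a sanity check: it assumes that the total gene count is the sum of the 5969 protein-coding and 85 RNA genes, and the variable names are illustrative.

```python
# Gene counts as reported in the abstract and in this section
protein_coding = 5969
rna_genes = 85
cog_assigned = 4550

# Assumption: total genes = protein-coding genes + RNA genes
total_genes = protein_coding + rna_genes

# Percentage of genes with a COG assignment under two candidate denominators
pct_of_total = round(100 * cog_assigned / total_genes, 1)
pct_of_protein_coding = round(100 * cog_assigned / protein_coding, 1)

print(total_genes, pct_of_total, pct_of_protein_coding)
```

Under this assumption the total-gene denominator reproduces the reported 75.2%, whereas using protein-coding genes alone gives a slightly higher value, which suggests that the percentage is expressed relative to all annotated genes.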

Insights into the genome sequence
Nitrogen fixation is the major proposed mechanism by which Azospirillum affects plant growth [2,4]. Genomic analysis of A. humicireducens SgZ-5T revealed a complete set of genes encoding enzymes involved in nitrogen fixation (Table 7). The main genes involved in this process are the nif genes, among which the nifDK genes (A6A40_02900 and A6A40_02895), annotated as nitrogenase molybdenum-iron proteins, and the nifH gene (A6A40_02905), encoding the dinitrogenase reductase protein, were identified. In the upstream region of the nifHDK operon, the nifEN genes (A6A40_02875 and A6A40_02870), involved in synthesis of the molybdenum-iron cofactor of nitrogenase, are clustered into a single operon together with nifX (A6A40_02865). Furthermore, the genome of A. humicireducens SgZ-5T has nifUSVW genes (A6A40_02235, A6A40_02230, A6A40_02225 and A6A40_02215), which are separated from the structural nifENX operon by about 160 kb.
The organization of the nitrogen fixation gene cluster in A. humicireducens SgZ-5T is presented in Fig. 4. Except for the separately transcribed nifA (A6A40_09040), nifB (A6A40_09050) and nifZ genes (A6A40_09070 and A6A40_09075), all the nif genes reside in the nitrogen fixation gene cluster of 176.7 kb. In addition, an operon containing the fixABCX genes (A6A40_02185, A6A40_02190, A6A40_02195 and A6A40_02220), responsible for electron transfer to nitrogenase, is located upstream of this gene cluster. In all nitrogen-fixing bacteria of the genus Azospirillum studied so far, however, the fixABCX operon is regulated by the transcriptional activator NifA [5]. Furthermore, the draTG genes (A6A40_02920 and A6A40_02925), implicated in the posttranslational regulation of nitrogenase activity, were found downstream of, and divergently oriented with respect to, the nifHDK genes. On the whole, the nitrogen fixation gene cluster of A. humicireducens SgZ-5T is in agreement with that in A. brasilense, A. lipoferum and Azospirillum sp. B510 [6,26,28,29], suggesting that the nitrogen fixation process demands the coordinated action of various genes.
Since tryptophan is a main precursor for the biosynthesis of indole-3-acetic acid (IAA), a well-known phytohormone [30], the genes in A. humicireducens SgZ-5T related to the production of this amino acid were analyzed (Additional file 2). The genome harbors three genes, trpE, trpG and trpEG (A6A40_04380, A6A40_04655 and A6A40_05775), each encoding anthranilate synthase, the key enzyme in tryptophan biosynthesis. Together with trpG, the genes trpD (A6A40_04650) and trpC (A6A40_04645) form a gene cluster of 2.4 kb. In addition to anthranilate synthase, this trpGDC gene cluster encodes anthranilate phosphoribosyltransferase and indole-3-glycerol phosphate synthase, which play a role in the synthesis of tryptophan used in multiple biological processes, including IAA biosynthesis [31]. The same trpGDC cluster was previously found in A. brasilense [32]. Although the ipdC gene, related to the indole-3-pyruvate pathway for the biosynthesis of IAA [30], was not found in the A. humicireducens SgZ-5T genome, an alternative pathway might exist in SgZ-5T. In the genome, A6A40_22745 and A6A40_22755 were assigned as candidates for the iaaM and iaaH genes, respectively. These two genes were also found in the Azospirillum sp. B510 genome and are known to be involved in the indole-3-acetamide (IAM) pathway for IAA biosynthesis, catalyzing the decarboxylation of tryptophan into IAM and the hydrolysis of IAM to produce IAA [6,30].
The A. humicireducens SgZ-5T genome also contains a terpene gene cluster of 24.0 kb consisting of 23 genes (A6A40_04945, A6A40_04950, A6A40_04955, …, A6A40_05055) (Additional file 3). This gene cluster encodes a series of proteins involved in the biosynthesis of terpenoid secondary metabolites. Among them, A6A40_05010 was identified as the crtB gene, encoding phytoene synthase, which is involved in the biosynthesis of carotenoids. Similar genes in this gene cluster were previously observed in the A. lipoferum 4B genome [7,26]. Furthermore, some phytohormones, including gibberellins and abscisic acid, of which over 120 types have been found in plants, fungi, and bacteria, are synthesized through the terpenoid pathway [2]. Therefore, A. humicireducens SgZ-5T shows attractive potential for agricultural application as a PGPR likely harboring multiple PGPP.

Conclusion
We report here an inventory of the genomic features of the nitrogen-fixing bacterium A. humicireducens SgZ-5T. The genome sequence of strain SgZ-5T revealed further genetic elements involved in nitrogen fixation and its regulation, as well as in the production of phytohormones. We anticipate that knowledge of this genome will contribute to new insights into the mechanisms of plant growth stimulation through genomic comparisons among available complete genomes of Azospirillum strains.

Additional files
Additional file 1: Phylogenetic tree based on partial nifH gene sequences showing the position of A. humicireducens SgZ-5T relative to other species within the genus Azospirillum and related genera. The strains and the corresponding GenBank accession numbers of their nifH genes are indicated in parentheses. The sequences were aligned using Clustal W, and the neighbor-joining tree was constructed based on the Kimura 2-parameter distance model using MEGA 5. Bootstrap values above 50% were obtained from 1000 bootstrap replications.
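To make the Kimura 2-parameter model mentioned above more concrete, the following sketch computes the pairwise distance d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q), where P and Q are the proportions of transitional and transversional differences between two aligned sequences. It is a minimal illustration rather than the MEGA 5 implementation; the function name is hypothetical, and columns containing gaps or ambiguous bases are skipped by assumption.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip gaps and ambiguous characters (assumed handling)
        if a not in BASES or b not in BASES:
            continue
        sites += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1    # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if sites == 0:
        raise ValueError("no comparable sites in the alignment")

    p = transitions / sites
    q = transversions / sites
    term_ts = 1 - 2 * p - q
    term_tv = 1 - 2 * q
    if term_ts <= 0 or term_tv <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term_ts) - 0.25 * math.log(term_tv)


# Toy example (hypothetical sequences): one transition in ten sites
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

A matrix of such pairwise distances is the input that a neighbor-joining algorithm would then use to build the tree.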