- Extended genome report
- Open Access
Genome sequencing and analysis of Ralstonia solanacearum phylotype I strains FJAT-91, FJAT-452 and FJAT-462 isolated from tomato, eggplant, and chili pepper in China
Standards in Genomic Sciences volume 12, Article number: 29 (2017)
Ralstonia solanacearum is an extremely destructive pathogen able to cause disease in a wide range of host plants. Here we report the draft genome sequences of the strains FJAT-91, FJAT-452 and FJAT-462, isolated from tomato, eggplant, and chili pepper, respectively, in China. In addition to the genome annotation, we performed a search for type-III secreted effectors in these strains, providing a detailed annotation of their presence and distinctive features compared to the effector repertoire of the reference phylotype I strain (GMI1000). In this analysis, we found that each strain has a unique effector repertoire, encoding both strain-specific effector variants and variations shared among all three strains. Our study, based on strains isolated from different hosts within the same geographical location, provides insight into effector repertoires sufficient to cause disease in different hosts, and may contribute to the identification of host specificity determinants for R. solanacearum.
Ralstonia solanacearum is often considered one of the most destructive bacterial pathogens, causing bacterial wilt disease in more than 250 plant species worldwide. The pathogenicity of R. solanacearum relies heavily on the injection of proteins into plant cells through a type-III secretion system (T3SS). The versatility of R. solanacearum strains correlates with the presence of a larger number of T3SS substrates, called type-III effectors (T3Es), encoded in their genomes, in comparison to other bacterial pathogens. T3Es are important virulence factors required by most Gram-negative pathogens to manipulate plant cells and cause disease [3, 4]. Bacteria from a single R. solanacearum strain can inject more than 70 T3Es (termed Rips, for Ralstonia injected proteins) into plant cells [2, 5]. Studies conducted in Pseudomonas syringae and Xanthomonas axonopodis strains indicate that T3E repertoires are highly variable among strains of these species, and led to the hypothesis that T3E composition may shape the host range of bacterial pathogens [6, 7]. Although the genome sequences and T3E repertoires have been defined for several R. solanacearum strains, repertoire comparisons have so far failed to identify host specificity determinants, which may suggest that genome sequences from additional strains infecting different hosts are required for this analysis. Additionally, the diversity in the geographical origins of sequenced strains hinders this comparative analysis, since additional environmental factors, such as temperature, light, and humidity, may have a significant impact on the requirement of effectors for a successful infection. In this project, we sequenced and annotated the genomes of the R. solanacearum strains FJAT-91, FJAT-452 and FJAT-462, isolated from tomato (Solanum lycopersicum), eggplant (Solanum melongena), and chili pepper (Capsicum annuum), respectively, in the Fujian province (China).
In addition, we performed a search for T3Es in these strains, providing a detailed annotation of their presence and distinctive features compared to the effector repertoire of the reference phylotype I strain (GMI1000). To our knowledge, this is the first report of genome sequences combined with T3E repertoire analysis performed in strains isolated from different hosts with the same geographical origin.
Classification and features
Ralstonia solanacearum belongs to the order Burkholderiales of the class Betaproteobacteria. It is an aerobic, Gram-negative bacterium, naturally present in soil, water, infected plants or plant debris. It has a worldwide distribution, with a higher incidence in tropical and subtropical regions, but is also present in temperate areas. R. solanacearum is the causal agent of bacterial wilt disease in multiple host plants, characterized by a sudden wilt of the whole plant. The strains sequenced in this study, FJAT-91, FJAT-452 and FJAT-462, were isolated from naturally infected tomato (Solanum lycopersicum), eggplant (Solanum melongena), and chili pepper (Capsicum annuum) plants, respectively, in the Fujian province (China). Plants showing typical wilting symptoms were collected and surface-sterilized, and the tissue was homogenized with sterile water before plating serial dilutions to determine the causal agent [8, 9]. Sequence analysis determined that they belong to the R. solanacearum species complex. The pathogenicity of FJAT-91 has been confirmed, and the strain has been used as a positive control for pathogenicity assays in tomato plants in previous studies. All three isolated strains displayed the typical physiological features of strains from the R. solanacearum species complex, showing aerobic growth in laboratory conditions, and were able to form 3–4 mm colonies within 2 days at 28 °C when grown on a rich laboratory medium containing tetrazolium chloride and high glucose content. For all three strains, colonies were irregular in shape and mucoid, and displayed a pink area in the middle of the colony and a large white edge (Fig. 1). Sequence analysis of PCR-amplified fliC, hrpB and pehA genes indicated that these strains belong to phylotype I (represented by the reference strain GMI1000; Fig. 2), mostly formed by Asian strains. The classification and general features of the three strains are summarized in Tables 1, 2 and 3, and a phylogenetic tree is shown in Fig. 2.
Genome sequencing information
Genome project history
This sequencing project was started in 2015; assembly and annotation were performed in 2016. Assembled draft genome sequences for the strains FJAT-91, FJAT-452 and FJAT-462 have been deposited in GenBank (Table 4). Raw genomic reads have been deposited in the Sequence Read Archive with accession numbers SRP091690, SRR4431158, SRR4431159 and SRR4428740.
Growth conditions and genomic DNA preparation
R. solanacearum strains were grown in rich medium (10 g/l bactopeptone, 1 g/l yeast extract and 1 g/l casamino acids). Genomic DNA was extracted from bacterial cultures grown to stationary phase for 18 h at 28 °C with shaking at 220 rpm (OD600 = 1) using the Blood & Cell Culture DNA Mini kit (Qiagen), following the manufacturer’s instructions for Gram-negative bacteria. DNA concentration and quality were measured using a Qubit 2.0 Fluorometer (Invitrogen).
Genome sequencing and assembly
For each genome, we prepared a paired-end library with an average insert size of 470 bp and sequenced 250 bp from both ends using an Illumina HiSeq 2500. The number of raw read bases was greater than 300 million (>50x genome coverage) for each sequenced strain. The raw sequencing data were first preprocessed to remove adapter sequences, low-quality regions, and short sequences (less than 20 nucleotides) with Cutadapt and SolexaQA. The remaining clean reads were assembled de novo into contigs and scaffolds using SOAPdenovo2 and GapCloser v1.12. Contigs and scaffolds were further assembled into chromosome, plasmid and scaffolds with CONTIGuator, using the GMI1000 genome as the reference. The resulting FJAT-91, FJAT-452 and FJAT-462 genomes are 4,620,128 bp, 5,334,434 bp and 5,083,617 bp, respectively (Table 5), close to the genome length of the R. solanacearum reference strain GMI1000 (5,810,922 bp).
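The ">50x" figure above follows from dividing the number of sequenced bases by the genome length. The sketch below illustrates that arithmetic only; it uses the 300 million base lower bound and the genome lengths quoted in the text, and the function and constant names are illustrative rather than part of the original pipeline.

```python
# Rough depth-of-coverage estimate: total sequenced bases / genome length.
# All names here are illustrative; the numbers are the ones quoted in the text.

GMI1000_LENGTH_BP = 5_810_922   # reference genome length
MIN_RAW_BASES = 300_000_000     # lower bound on raw read bases per strain

ASSEMBLY_LENGTH_BP = {
    "FJAT-91": 4_620_128,
    "FJAT-452": 5_334_434,
    "FJAT-462": 5_083_617,
}


def fold_coverage(total_bases: int, genome_length_bp: int) -> float:
    """Return the average number of reads covering each genome position."""
    if genome_length_bp <= 0:
        raise ValueError("genome length must be positive")
    return total_bases / genome_length_bp


for strain, length_bp in ASSEMBLY_LENGTH_BP.items():
    vs_reference = fold_coverage(MIN_RAW_BASES, GMI1000_LENGTH_BP)
    vs_assembly = fold_coverage(MIN_RAW_BASES, length_bp)
    print(f"{strain}: at least {vs_reference:.1f}x vs GMI1000, "
          f"{vs_assembly:.1f}x vs its own assembly")
```

Because the reference genome is the longest of the four, coverage computed against GMI1000 is the most conservative estimate.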
Genome annotation was performed using Prokka (v1.11) with the option for non-coding RNA (ncRNA) search. The COG database and Pfam v30.0 were used for functional annotation of genes. T3Es in the three newly sequenced strains were identified and annotated in two steps. First, 52, 62 and 60 of the T3Es from the R. solanacearum species complex were identified in FJAT-91, FJAT-452 and FJAT-462, respectively, based on Prokka annotations. Second, known T3E protein sequences were used as queries to search the assembled genome sequences of the three strains using BLAST with stringent cutoffs: e-value < 1e-30, identity > 60%, and coverage of the query T3E protein sequence of over 50% or at least 100 aa. As a result, 72, 78 and 75 T3Es were identified in FJAT-91, FJAT-452 and FJAT-462, respectively. These two sets of T3E genes were merged to generate the final lists of T3E genes in the three genomes. To identify sequence variations within T3E genes between the three strains and the reference strain, the clean reads from the three newly sequenced strains were mapped to the reference genome GMI1000 using BWA (v0.7.12). SNPs and INDELs were identified using Samtools (v0.1.19) and vcftools (v0.1.12) and were further annotated using SnpEff (v4.0).
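The two-step effector identification described above can be sketched as a filter on BLAST hits followed by a set union with the annotation-based calls. This is a minimal illustration, not the authors' script: the `Hit` record, the function names, and the example hits (`effA` to `effE`) are hypothetical, and the length criterion is interpreted here as "aligned query span of at least 100 aa", which is an assumption about the wording in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAST hit, reduced to the fields the cutoffs need (hypothetical record)."""
    query: str       # name of the known T3E protein used as query
    identity: float  # percent identity of the alignment
    evalue: float    # BLAST e-value
    qstart: int      # 1-based start of the alignment on the query
    qend: int        # 1-based end of the alignment on the query
    qlen: int        # full length of the query protein (aa)


def passes_cutoffs(hit: Hit, max_evalue: float = 1e-30, min_identity: float = 60.0,
                   min_coverage: float = 0.5, min_aligned_aa: int = 100) -> bool:
    """Apply the e-value, identity and coverage/length cutoffs quoted in the text."""
    aligned = hit.qend - hit.qstart + 1
    coverage = aligned / hit.qlen
    long_enough = coverage > min_coverage or aligned >= min_aligned_aa
    return hit.evalue < max_evalue and hit.identity > min_identity and long_enough


def merge_t3e_calls(annotation_names, blast_hits):
    """Union of annotation-based T3E names and names supported by a passing hit."""
    blast_names = {hit.query for hit in blast_hits if passes_cutoffs(hit)}
    return set(annotation_names) | blast_names


# Made-up hits, for illustration only.
hits = [
    Hit("effA", 95.0, 1e-80, 1, 300, 320),   # passes every cutoff
    Hit("effB", 55.0, 1e-50, 1, 200, 220),   # fails: identity not above 60%
    Hit("effC", 70.0, 1e-35, 10, 129, 400),  # low coverage, but 120 aa aligned
    Hit("effD", 90.0, 1e-10, 1, 250, 260),   # fails: e-value too high
]
merged = merge_t3e_calls({"effA", "effE"}, hits)
print(sorted(merged))
```

Keeping the cutoffs as keyword arguments makes it easy to see how the final effector list changes when a threshold is relaxed or tightened.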
The genome of R. solanacearum strain FJAT-91 has 329 scaffolds and the average GC content of the genome is 60.6% (Table 5). A total of 6522 genes (6457 CDSs and 65 ncRNAs) were predicted. Of the protein-coding genes, 2544 (39.4%) had functions assigned while 3913 were considered hypothetical (Table 5). Of the CDSs, 42.03% could be assigned to one COG functional category and 36.56% contained one or more conserved Pfam-A domains (Table 6).
The genome of R. solanacearum strain FJAT-452 has 309 scaffolds and the average GC content of the genome is 62.33% (Table 5). A total of 6729 genes (6658 CDSs and 71 ncRNAs) were predicted. Of the protein-coding genes, 3075 (46.19%) had functions assigned while 3583 were considered hypothetical (Table 5). Of the CDSs, 49.01% could be assigned to one COG functional category and 44.28% contained one or more conserved Pfam-A domains (Table 6).
The genome of R. solanacearum strain FJAT-462 has 358 scaffolds and the average GC content of the genome is 61.45% (Table 5). A total of 6758 genes (6696 CDSs and 62 ncRNAs) were predicted. Of the protein-coding genes, 2855 (42.64%) had functions assigned while 3841 were considered hypothetical (Table 5). Of the CDSs, 45.49% could be assigned to one COG functional category and 39.93% contained one or more conserved Pfam-A domains (Table 6).
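The gene counts above can be cross-checked with simple arithmetic: CDSs plus ncRNAs should equal the gene total, assigned plus hypothetical should equal the CDS total, and the quoted percentage is assigned CDSs over all CDSs. The following sketch runs that check on the counts given in the text; the dictionary layout and function name are illustrative.

```python
# Counts quoted in the text for each strain; the layout is illustrative.
ANNOTATION = {
    "FJAT-91": {"genes": 6522, "cds": 6457, "ncrna": 65,
                "assigned": 2544, "hypothetical": 3913},
    "FJAT-452": {"genes": 6729, "cds": 6658, "ncrna": 71,
                 "assigned": 3075, "hypothetical": 3583},
    "FJAT-462": {"genes": 6758, "cds": 6696, "ncrna": 62,
                 "assigned": 2855, "hypothetical": 3841},
}


def percent_assigned(stats: dict) -> float:
    """Verify that the counts add up, then return % of CDSs with an assigned function."""
    if stats["cds"] + stats["ncrna"] != stats["genes"]:
        raise ValueError("CDSs + ncRNAs does not equal the gene total")
    if stats["assigned"] + stats["hypothetical"] != stats["cds"]:
        raise ValueError("assigned + hypothetical does not equal the CDS total")
    return 100.0 * stats["assigned"] / stats["cds"]


for strain, stats in ANNOTATION.items():
    print(f"{strain}: {percent_assigned(stats):.2f}% of CDSs have an assigned function")
```

A check of this kind is a cheap way to catch transcription errors when counts are copied between tables and running text.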
Insights from the genome sequence
Comparative analysis of virulence-related genes
T3E proteins are essential virulence factors in most Gram-negative bacterial pathogens, such as R. solanacearum [2, 5], although they can also be perceived by resistant hosts as invasion signals, leading to the development of plant defense responses. The expression of genes encoding T3Es and structural components of the T3SS is activated after the perception of plant signals, and coordinated by a well-studied signaling pathway. We analyzed the presence of genes involved in plant sensing and virulence regulation in the newly sequenced strains, and found that all the major regulators are present in the three strains (Table 7). These genes displayed a high percentage of similarity when compared to their homologs in the GMI1000 reference strain, ranging from 98.97 to 100% at the DNA level and from 99.19 to 100% at the amino acid level (Table 7).
The composition of T3E repertoires is thought to shape the host range of specific strains. In this regard, we have identified over 70 T3Es in each strain based on comparisons with effector sequences in public databases (Table 8). Comparisons with the reference GMI1000 strain suggest that the FJAT-91 strain lacks the T3E genes ripAG, ripS4, ripM, ripP3, hyp16, ripAI and ripY; the FJAT-452 strain lacks the T3E genes ripP3, hyp16 and ripM; and the FJAT-462 strain lacks the T3E genes ripAI, ripS4, ripP3, hyp16, ripM and ripAM. On the other hand, several T3E genes that are not present in GMI1000 were found in the three newly sequenced strains, including ripBE (in FJAT-462), ripS7 (in FJAT-452), hyp7 (in FJAT-452 and FJAT-462) and ripAL and ripF2 (in all three strains). The presence of most new T3E genes was confirmed by sequence analysis of PCR-amplified fragments from the three strains; the sequences were 100% identical among the strains and very similar or identical (78.84–100%) to their closest orthologs from other sequenced strains (Fig. 3). However, the hyp7 gene from FJAT-462 has a 1206 bp insertion, annotated as a transposase, 180 bp downstream of the start codon (Fig. 3). By comparing the sequences of the T3E genes that are shared by the three newly sequenced strains and the reference strain GMI1000, we identified 652, 798 and 692 variant sites in T3E sequences of FJAT-91, FJAT-452 and FJAT-462, respectively (Table 9). These variations were classified into 7 types: missense variant, synonymous variant, frameshift variant, inframe deletion, inframe insertion, stop codon gain, and stop codon loss. Among them, 351 variations are shared by the three newly sequenced strains (Fig. 4). For example, the effector ripA1 has both missense and synonymous variants, ripAZ1 has a frameshift variant, and ripX has an inframe deletion in all three strains (Fig. 5).
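The presence/absence comparison described above reduces to set operations on gene names. The sketch below encodes only the gene lists stated in the text (relative to GMI1000) and derives which differences are shared by all three strains and which are strain-specific; the variable and function names are illustrative.

```python
# T3E genes present in GMI1000 but reported as absent from each strain (from the text).
ABSENT_VS_GMI1000 = {
    "FJAT-91": {"ripAG", "ripS4", "ripM", "ripP3", "hyp16", "ripAI", "ripY"},
    "FJAT-452": {"ripP3", "hyp16", "ripM"},
    "FJAT-462": {"ripAI", "ripS4", "ripP3", "hyp16", "ripM", "ripAM"},
}

# T3E genes absent from GMI1000 but reported in each strain (from the text).
GAINED_VS_GMI1000 = {
    "FJAT-91": {"ripAL", "ripF2"},
    "FJAT-452": {"ripS7", "hyp7", "ripAL", "ripF2"},
    "FJAT-462": {"ripBE", "hyp7", "ripAL", "ripF2"},
}


def shared_by_all(table: dict) -> set:
    """Genes listed for every strain in the table."""
    return set.intersection(*table.values())


def unique_to(strain: str, table: dict) -> set:
    """Genes listed for one strain and for no other strain in the table."""
    others = set().union(*(genes for name, genes in table.items() if name != strain))
    return table[strain] - others


print("Absent from all three strains:", sorted(shared_by_all(ABSENT_VS_GMI1000)))
print("Gained in all three strains:", sorted(shared_by_all(GAINED_VS_GMI1000)))
for strain in ABSENT_VS_GMI1000:
    print(strain,
          "| absent only here:", sorted(unique_to(strain, ABSENT_VS_GMI1000)),
          "| gained only here:", sorted(unique_to(strain, GAINED_VS_GMI1000)))
```

The same two helpers could be applied to per-strain sets of variant sites to obtain the shared and strain-specific counts summarized in a Venn-style figure, provided the variant tables are available.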
Besides T3Es, R. solanacearum employs several additional virulence factors to achieve infection, such as exopolysaccharide (EPS). The signaling cascade leading to the production of EPS involves several different regulatory components. We analyzed the presence of genes involved in the regulation of EPS production, and found that all the major regulators are present in the three strains (Table 7). These genes displayed a high percentage of similarity when compared to their homologs in the GMI1000 reference strain, with most genes ranging from 98.35 to 100% at the DNA level (98.26–100% at the amino acid level), with the exception of phcB, which shows a lower similarity in the FJAT-91 and FJAT-452 strains (86.41% at the DNA level in both strains) (Table 7). Other genes encoding putative virulence factors, such as egl (encoding an endoglucanase) and pehB (encoding an exo-poly-α-d-galacturonosidase), were also present in the three strains, with >99% similarity at the DNA and amino acid level compared to GMI1000 (Table 7).
Earlier studies on the T3E repertoires of different plant pathogens suggested that T3E composition might shape the host range [6, 7]. In this study, we sequenced and analysed the genomes of three R. solanacearum strains isolated from different host plants with a similar geographical origin (Fujian province, China). Our analysis indicates that each of these strains has a unique effector repertoire (Table 7). In contrast to what we observed for T3E genes, all the analysed genes involved in the perception of plant signals and the regulation of virulence factors were present in all strains, and displayed a high degree of similarity between the newly sequenced strains and the GMI1000 reference strain (Table 7), suggesting that the mechanism of perception of plant signals does not differ significantly among bacteria infecting different plant species.
In addition to their presence or absence in specific strains, T3E genes may undergo several types of mutations that change or disrupt their coding sequence. As a consequence, the encoded proteins may lose the original function, become unstable, or gain a new function. This allelic diversification may be imposed by the host defense system, and allows pathogens to avoid perception by the immune system of resistant host plants, in a phenomenon called pathoadaptation. We identified alterations in effector sequences that were conserved in the three sequenced strains (Fig. 4). These sequence modifications may be due to the geographical distribution of these strains in comparison with the GMI1000 reference strain, originally isolated from French Guiana (South America), and may have functional relevance in the subversion of host functions in specific environmental conditions. Similarly, it is noteworthy that the FJAT-91 strain lacks 7 T3Es compared to GMI1000, while both are able to cause disease in tomato plants. Comparative analyses using the same tomato cultivars in controlled conditions will determine whether (i) these effectors are really dispensable to infect tomato, (ii) these effectors are dispensable in specific tomato cultivars, (iii) these effectors trigger immunity in specific tomato cultivars, or (iv) the environmental conditions in the FJAT-91 isolation site are more favourable to R. solanacearum infection, rendering their virulence activities unnecessary. The strain-specific absence of T3E genes or strain-specific loss-of-function variants (Fig. 4) may be caused by adaptation of these strains to specific hosts. Similarly, the transposase insertion in hyp7 (specific to FJAT-462; Fig. 3) is likely to alter or abolish the function of the encoded T3E in this strain, and may suggest that this T3E is not needed (or its alteration is actually required) to cause disease in chili pepper plants.
Additional functional characterization will be required to determine whether these effectors induce immune responses in eggplant or chili pepper, and may allow the identification of novel sources of resistance against R. solanacearum. Our analysis shows that these unique effector repertoires are sufficient to cause disease in different hosts within a similar geographical location, allowing us to reduce the impact of environmental conditions in the analysis of the requirement of T3Es to cause infection. This information, together with the increasing number of sequenced R. solanacearum strains, constitutes one more step towards the identification of host specificity determinants for R. solanacearum.
BLAST: Basic local alignment search tool
COG: Clusters of orthologous groups
SNP: Single nucleotide polymorphism
T3SS: Type-III secretion system
Mansfield J, Genin S, Magori S, Citovsky V, Sriariyanum M, Ronald P, Dow M, Verdier V, Beer SV, Machado MA, et al. Top 10 plant pathogenic bacteria in molecular plant pathology. Mol Plant Pathol. 2012;13(6):614–29.
Peeters N, Carrere S, Anisimova M, Plener L, Cazale AC, Genin S. Repertoire, unified nomenclature and evolution of the Type III effector gene set in the Ralstonia solanacearum species complex. BMC Genomics. 2013;14:859.
Macho AP. Subversion of plant cellular functions by bacterial type-III effectors: beyond suppression of immunity. New Phytol. 2016;210(1):51–7.
Macho AP, Zipfel C. Targeting of plant pattern recognition receptor-triggered immunity by bacterial type-III secretion system effectors. Curr Opin Microbiol. 2015;23:14–22.
Deslandes L, Genin S. Opening the Ralstonia solanacearum type III effector tool box: insights into host cell subversion mechanisms. Curr Opin Plant Biol. 2014;20:110–7.
Baltrus DA, Nishimura MT, Romanchuk A, Chang JH, Mukhtar MS, Cherkis K, Roach J, Grant SR, Jones CD, Dangl JL. Dynamic evolution of pathogenicity revealed by sequencing and comparative genomics of 19 Pseudomonas syringae isolates. PLoS Pathog. 2011;7(7):e1002132.
Hajri A, Brin C, Hunault G, Lardeux F, Lemaire C, Manceau C, Boureau T, Poussier S. A "repertoire for repertoire" hypothesis: repertoires of type three effectors are candidate determinants of host specificity in Xanthomonas. PLoS One. 2009;4(8):e6632.
Lin H, Che J, Liu B, Zheng X, Xiao R. Genetic Diversity Analysis of Ralstonia solanacearum Based on BOX-PCR and REP-PCR. J Agri Biotechnol. 2011;19(6):1099–109.
Elphinstone JG. The current bacterial wilt situation: a global overview. In: Allen C, Prior P, Hayward AC, editors. Bacterial Wilt Disease and the Ralstonia solanacearum Species Complex. St Paul: APS Press; 2005. p. 9–28.
Zheng XF, Zhu YJ, Liu B, Zhou Y, Che JM, Lin NQ. Relationship Between Ralstonia solanacearum Diversity and Severity of Bacterial Wilt Disease in Tomato Fields in China. J Phytopathol. 2014;162(9):607–16.
Martin M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal. 2011;17(1):10–2.
Cox MP, Peterson DA, Biggs PJ. SolexaQA: At-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics. 2010;11:485.
Luo RB, Liu BH, Xie YL, Li ZY, Huang WH, Yuan JY, He GZ, Chen YX, Pan Q, Liu YJ, et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience. 2012;1(1):1–6.
Salanoubat M, Genin S, Artiguenave F, Gouzy J, Mangenot S, Arlat M, Billault A, Brottier P, Camus JC, Cattolico L, et al. Genome sequence of the plant pathogen Ralstonia solanacearum. Nature. 2002;415(6871):497–502.
Seemann T. Prokka: rapid prokaryotic genome annotation. Bioinformatics. 2014;30(14):2068–9.
Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, Krylov DM, Mazumder R, Mekhedov SL, Nikolskaya AN, et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003;4:41.
Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, et al. Pfam: the protein families database. Nucleic Acids Res. 2014;42(Database issue):D222–230.
Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL. BLAST+: architecture and applications. BMC Bioinformatics. 2009;10:421.
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60.
Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011;27(21):2987–93.
Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, Handsaker RE, Lunter G, Marth GT, Sherry ST, et al. The variant call format and VCFtools. Bioinformatics. 2011;27(15):2156–8.
Cingolani P, Patel VM, Coon M, Nguyen T, Land SJ, Ruden DM, Lu X. Using Drosophila melanogaster as a Model for Genotoxic Chemical Mutational Studies with a New Program, SnpSift. Front Genet. 2012;3:35.
Cook DE, Mesarich CH, Thomma BP. Understanding plant immunity as a surveillance system to detect invasion. Annu Rev Phytopathol. 2015;53:541–63.
Valls M, Genin S, Boucher C. Integrated Regulation of the Type III Secretion System and Other Virulence Determinants in Ralstonia solanacearum. PLoS Pathog. 2006;2(8):e82.
Garg RP, Huang J, Yindeeyoungyeon W, Denny TP, Schell MA. Multicomponent Transcriptional Regulation at the Complex Promoter of the Exopolysaccharide I Biosynthetic Operon of Ralstonia solanacearum. J Bacteriol. 2000;182(23):6659–66.
Ma W, Dong FF, Stavrinides J, Guttman DS. Type III effector diversification via both pathoadaptation and horizontal transfer in response to a coevolutionary arms race. PLoS Genet. 2006;2(12):e209.
Tamura K, Nei M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol. 1993;10(3):512–26.
Kumar S, Stecher G, Tamura K. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol Biol Evol. 2016;33(7):1870–4.
Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Thomson N, Allen MJ, Angiuoli SV, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol. 2008;26(5):541–7.
Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci U S A. 1990;87(12):4576–9.
Garrity GM, Bell JA, Lilburn T. Class I. Alphaproteobacteria class. nov. In: Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s Manual® of Systematic Bacteriology: Volume Two The Proteobacteria Part C The Alpha-, Beta-, Delta-, and Epsilonproteobacteria. Boston: Springer US; 2005. p. 1–574.
Editor L. Validation List Number 107. List of new names and new combinations previously effectively, but not validly, published. Int J Syst Evol Microbiol. 2006;56:1–6.
Garrity GM, Bell JA, Lilburn T. Class II. Betaproteobacteria class. nov. In: Brenner DJ, Krieg NR, Staley JT, editors. Bergey’s Manual® of Systematic Bacteriology: Volume Two The Proteobacteria Part C The Alpha-, Beta-, Delta-, and Epsilonproteobacteria. Boston: Springer US; 2005. p. 575–922.
Editor L. Validation List No. 57. Validation of the publication of new names and new combinations previously effectively published outside the IJSB. Int J Syst Bacteriol. 1996;46:625–6.
Yabuuchi E, Kosako Y, Yano I, Hotta H, Nishiuchi Y. Transfer of two Burkholderia and an Alcaligenes species to Ralstonia gen. nov.: Proposal of Ralstonia pickettii (Ralston, Palleroni and Doudoroff 1973) comb. nov., Ralstonia solanacearum (Smith 1896) comb. nov. and Ralstonia eutropha (Davis 1969) comb. nov. Microbiol Immunol. 1995;39(11):897–904.
Denny T, Hayward A, Schaad N, Jones J, Chun W. II Gram-negative bacteria, F. Ralstonia, Laboratory guide for identification of plant pathogenic bacteria. 2001. p. 151–74.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25(1):25–9.
We thank Suomeng Dong and Qinghe Chen for their help in providing the isolates sequenced in this work.
This work was supported by funds from the Chinese Academy of Sciences. CCM is sponsored by a CAS-TWAS President’s Fellowship for International PhD Students. RL is supported by the One Hundred Talent-Program of the Chinese Academy of Sciences. APM is supported by the Chinese 1000 Talents Program.
Conceived the project: HZ, RL, APM. Prepared and sequenced libraries: WJ, AC. Assembled and annotated the genome: YS. Performed effector gene annotation: YS. Validated sequencing data and PCR-based analysis: CCM, KW. Analyzed and interpreted results: YS, KW, CCM, HZ, RL, APM. Wrote the manuscript with input from all the authors: YS, KW, RL, APM. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.