Genome sequence of the model plant pathogen Pectobacterium carotovorum SCC1

Bacteria of the genus Pectobacterium are economically important plant pathogens that cause soft rot disease on a wide variety of plant species. Here, we report the genome sequence of Pectobacterium carotovorum strain SCC1, a Finnish soft rot model strain isolated from a diseased potato tuber in the early 1980s. The genome of strain SCC1 consists of one circular chromosome of 4,974,798 bp and one circular plasmid of 5524 bp. In total, 4451 genes were predicted, of which 4349 are protein-coding and 102 are RNA genes.


Introduction
Pectobacterium species are economically important plant pathogens that cause soft rot and blackleg disease on a range of plant species across the world [1,2]. The main virulence mechanism employed by Pectobacterium is the secretion of vast amounts of plant cell wall-degrading enzymes [1,3]. Because they macerate plant tissue effectively to acquire nutrients, Pectobacterium species are considered classical examples of necrotrophic plant pathogens. Among the Pectobacterium species, P. carotovorum has the widest host range, and potato is the most important crop affected in temperate regions [1,4]. P. carotovorum strain SCC1 was isolated from a diseased potato tuber in Finland in the early 1980s [5]. It is highly virulent on model plant hosts such as tobacco (Nicotiana tabacum) and thale cress (Arabidopsis thaliana) as well as on the original host, potato (Solanum tuberosum). For three decades, the strain has been used as a model strain in the study of virulence mechanisms of Pectobacterium as well as in the study of plant defense mechanisms against necrotrophic plant pathogens (e.g. [6-13]). Here we describe the annotated genome sequence of P. carotovorum strain SCC1.

Organism information
Classification and features
P. carotovorum strain SCC1 is a Gram-negative, motile, non-sporulating, and facultatively anaerobic bacterium that belongs to the order Enterobacterales within the class Gammaproteobacteria. Cells of strain SCC1 are rod-shaped, with a length of approximately 2 μm in the exponential growth phase (Fig. 1). Strain SCC1 is pathogenic, causing soft rot disease in plants. It was originally isolated from a diseased potato tuber in Finland in 1982 [5]. It also provokes maceration symptoms on the model plants Arabidopsis, tobacco, and tomato (Solanum lycopersicum), and is used as a soft rot model in research.
Strain SCC1 has previously been described as belonging to P. carotovorum subsp. carotovorum based on biochemical properties such as its ability to grow at 37°C and in 5% NaCl, its sensitivity to erythromycin, its ability to assimilate lactose, melibiose and raffinose but not sorbitol, and its inability to produce reducing sugars from sucrose and acid from α-methyl glucoside [14]. A phylogenetic tree based on seven housekeeping genes (dnaN, fusA, gyrB, recA, rplB, rpoS and gyrA) clusters strain SCC1 together with other P. carotovorum strains (Fig. 2). However, the sequence-based phylogenetic analysis was inconclusive regarding the subspecies status. Overall, the phylogeny of Pectobacterium species and subspecies is currently in turmoil, and assigning strains to subspecies is challenging [15].
P. carotovorum strain SCC1 has been deposited at the International Center for Microbial Resources, French Collection of Plant-Associated Bacteria (accession: CFBP 8537). The minimum information about a genome sequence (MIGS) for strain SCC1 is summarized in Table 1.

Genome sequencing information
Genome project history
P. carotovorum strain SCC1 has been used as a model soft rot pathogen in the field of plant-pathogen interactions ever since its isolation in the 1980s. Sequencing of the genome of strain SCC1 was initiated in 2008 to further facilitate its use as a model pathogen.
The project was carried out jointly by the Institute of Biotechnology, Department of Biosciences and Department of Agricultural Sciences at the University of Helsinki, Finland. The genome was sequenced, assembled and annotated. The final sequence contains two scaffolds representing one chromosome and one plasmid. The chromosome sequence contains one gap with an estimated length of 3788 bp. The genome sequence has been deposited in GenBank under the accession numbers CP021894 (chromosome) and CP021895 (plasmid). Summary information on the project is presented in Table 2.

Growth conditions and genomic DNA preparation
Since its isolation from potato in 1982, P. carotovorum strain SCC1 has been stored in 22% glycerol at −80°C. For preparation of genomic DNA, the strain was first grown overnight on solid LB medium (10 g tryptone, 5 g yeast extract, 10 g NaCl, and 15 g agar per liter of medium) at 28°C. A single colony was then picked and grown overnight in 10 ml of liquid LB medium at 28°C with shaking. Cells were harvested by centrifugation for 20 min at 3200 g at 4°C and resuspended in TE buffer (10 mM Tris-HCl pH 7.5, 1 mM EDTA). The cells were lysed with SDS (5% w/v) and Proteinase K (1 mg/ml) for one hour at 50°C. Genomic DNA was extracted using phenol-chloroform purification followed by ethanol precipitation. The quantity and quality of the DNA were assessed by spectrophotometry and agarose gel electrophoresis.

Fig. 2 Maximum likelihood tree of Pectobacterium carotovorum SCC1 and other closely related Pectobacterium strains. The phylogenetic tree was constructed from seven housekeeping genes (dnaN, fusA, gyrB, recA, rplB, rpoS and gyrA). The concatenated sequences were aligned using the MAFFT multiple sequence alignment program (version 7) with default parameters [42]. The tree was built with the RAxML (Randomized Axelerated Maximum Likelihood) program using maximum likelihood (ML) inference [43]. A total of 88 nucleotide substitution models were tested with jModelTest 2.0, and the best model was selected using the Akaike information criterion (AIC) [44]. Bootstrap values from 1000 replicates are shown at each branch. Dickeya solani IPO2222 was used as the outgroup. Type strains are marked with T after the strain name. GenBank accession numbers are given in parentheses. The scale bar indicates 0.04 substitutions per nucleotide position.

Evidence codes: IDA, Inferred from Direct Assay; TAS, Traceable Author Statement (i.e., a direct report exists in the literature); NAS, Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [55].
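The AIC-based model selection mentioned in the Fig. 2 caption can be illustrated with a minimal Python sketch. It assumes the standard definition AIC = 2k - 2 ln L, where k is the number of free parameters and ln L is the maximized log-likelihood. The model names, log-likelihoods and parameter counts below are invented placeholders, not values from this study, and jModelTest itself is not invoked.

```python
from typing import Dict, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def best_model_by_aic(fits: Dict[str, Tuple[float, int]]) -> str:
    """Return the name of the candidate model with the lowest AIC.

    `fits` maps a model name to (maximized log-likelihood, free parameters).
    """
    if not fits:
        raise ValueError("no candidate models supplied")
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits for three substitution models (placeholder numbers only).
example_fits = {
    "JC": (-10500.0, 1),
    "HKY+G": (-10120.0, 6),
    "GTR+I+G": (-10100.0, 11),
}
scores = {name: aic(*fit) for name, fit in example_fits.items()}
chosen = best_model_by_aic(example_fits)
print(scores, chosen)
```

With these placeholder values the richest model wins because its gain in log-likelihood outweighs the penalty for the extra parameters; with a smaller gain the simpler model would be preferred.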

Genome annotation
Coding sequences were predicted using the Prodigal gene prediction tool [16]. GenePRIMP [17] was run to correct systematic errors made by Prodigal and to reanalyze the remaining intergenic regions for missed CDSs. Functional annotation for the predicted genes was performed using the PANNZER annotation tool [18]. The annotation was manually curated with information from publications and the following databases: COG [19], KEGG [20], CDD [21], UniProt and NCBI nonredundant protein sequences. To identify RNA genes, RNAmmer v1.2 [22] (rRNAs) and tRNAscan-SE [23] (tRNAs) were used. Clusters of Orthologous Groups assignments and Pfam domain predictions were done using the WebMGA server [24]. Transmembrane helices were predicted with TMHMM [25] and Phobius [26]. For signal peptide prediction, SignalP 4.1 [27] was used. CRISPRFinder [28] was used to detect Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs).
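The output of an annotation workflow like the one above is commonly summarized by feature type. The following is a minimal sketch, assuming the annotation has been exported as GFF3 (tab-separated, nine columns, with the feature type in the third column). The function name and the toy records are hypothetical and are not taken from the SCC1 annotation.

```python
from collections import Counter
from typing import Iterable


def count_feature_types(gff3_lines: Iterable[str]) -> Counter:
    """Tally GFF3 records by feature type (third column)."""
    counts: Counter = Counter()
    for line in gff3_lines:
        line = line.rstrip("\n")
        # Skip blank lines, directives (##...) and comments (#...).
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        # A valid GFF3 record has exactly nine tab-separated columns.
        if len(fields) != 9:
            continue
        counts[fields[2]] += 1
    return counts


# Toy input: invented records for illustration only.
toy_gff3 = [
    "##gff-version 3",
    "chr\texample\tCDS\t100\t400\t.\t+\t0\tID=cds1",
    "chr\texample\tCDS\t500\t950\t.\t-\t0\tID=cds2",
    "chr\texample\ttRNA\t1000\t1075\t.\t+\t.\tID=trna1",
    "chr\texample\trRNA\t2000\t3500\t.\t+\t.\tID=rrna1",
]
toy_counts = count_feature_types(toy_gff3)
rna_total = toy_counts["tRNA"] + toy_counts["rRNA"]
print(toy_counts, rna_total)
```

Applied to a real annotation file, the same tally would give the protein-coding and RNA gene counts reported for a genome, provided the file uses these feature type labels.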

Genome properties
The genome of P. carotovorum SCC1 consists of one circular 4,974,798 bp chromosome and one circular 5524 bp plasmid (Table 3, Fig. 3). The total genome size is 4,980,322 bp with an overall G+C content of 51.85%.

Insights from the genome sequence
P. carotovorum strain SCC1 harbors a small cryptic plasmid of 5524 bp, pSCC1. The plasmid contains sequences for RNAI and RNAII, two non-coding RNAs involved in replication initiation and control in enterobacterial RNA priming plasmids such as ColE1 [29]. A similar replication region has previously been described in the small cryptic plasmid pEC3 of P. carotovorum subsp. carotovorum strain IFO3380 [30]. In addition to the two RNA genes, pSCC1 was predicted to contain nine protein-coding genes. Four of these (mobABCD) encode mobilization proteins. The mob locus is required for mobilization of non-self-transmissible plasmids and is found on many enterobacterial plasmids including pEC3 [31]. No function could be assigned to the remaining five genes on pSCC1. One of them, SCC1_4463, is very similar to genes found in many Enterobacteriaceae genomes, especially those of the genera Enterobacter, Escherichia and Salmonella, whereas genes similar to the other four are not widely present in other sequenced genomes.
Pectobacterium infection is characterized by maceration symptoms caused by the secretion of a large arsenal of plant cell wall-degrading enzymes. Accordingly, the genome of P. carotovorum strain SCC1 was found to contain genes for eleven pectate lyases (pelABCILWXZ, hrpW, SCC1_1311, and SCC1_2381), one pectin lyase (pnl), four polygalacturonases (pehAKNX), one oligogalacturonate lyase (ogl), three cellulases (celSV, bcsZ), one rhamnogalacturonate lyase (rhiE), two pectin methylesterases (pemAB), and two pectin acetylesterases (paeXY). In addition, the genome harbors two genes encoding proteases previously characterized as plant cell wall-degrading enzymes (prt1, prtW) as well as a number of putative proteases, some of which may function in plant cell wall degradation. Different Pectobacterium species and strains have been found to harbor very similar collections of plant cell wall-degrading enzymes [32], and the number and types of enzymes in the genome of strain SCC1 fit this picture well.
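The enzyme inventory above uses compact gene shorthand (for example, pehAKNX stands for pehA, pehK, pehN and pehX). A minimal Python sketch of how such shorthand can be expanded and tallied per enzyme family is given below. It assumes the usual convention of a three-letter lowercase stem followed by one capital letter per gene; the function and variable names are hypothetical, and the gene lists are copied from the text.

```python
import re
from typing import Dict, List


def expand_gene_shorthand(token: str) -> List[str]:
    """Expand e.g. 'pehAKNX' into ['pehA', 'pehK', 'pehN', 'pehX'].

    Tokens that do not match stem + capital letters (locus tags such as
    'SCC1_1311', or names such as 'pnl') are returned unchanged.
    """
    match = re.fullmatch(r"([a-z]{3})([A-Z]+)", token)
    if match is None:
        return [token]
    stem, suffixes = match.groups()
    return [stem + letter for letter in suffixes]


# Enzyme families and gene shorthand as listed in the text.
families: Dict[str, List[str]] = {
    "pectate lyase": ["pelABCILWXZ", "hrpW", "SCC1_1311", "SCC1_2381"],
    "pectin lyase": ["pnl"],
    "polygalacturonase": ["pehAKNX"],
    "oligogalacturonate lyase": ["ogl"],
    "cellulase": ["celSV", "bcsZ"],
    "rhamnogalacturonate lyase": ["rhiE"],
    "pectin methylesterase": ["pemAB"],
    "pectin acetylesterase": ["paeXY"],
}

genes_per_family = {
    family: [gene for token in tokens for gene in expand_gene_shorthand(token)]
    for family, tokens in families.items()
}
counts = {family: len(genes) for family, genes in genes_per_family.items()}
total_genes = sum(counts.values())
print(counts, total_genes)
```

The per-family counts obtained this way match the numbers stated in the text (eleven pectate lyases, four polygalacturonases, three cellulases, and so on).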
Protein secretion plays an essential role in soft rot pathogenesis [33]. The most important secretion system in Pectobacterium is the type II secretion system, also known as the Out system (outCDEFGHIJKLMN), which transports proteins from the periplasmic space into the extracellular environment [34]. It is responsible for the secretion of most plant cell wall-degrading enzymes such as pectinases and cellulases as well as some other virulence factors such as the necrosis-inducing protein Nip [33,35]. Furthermore, Pectobacterium genomes typically harbor multiple type I secretion systems, which secrete proteases and adhesins [33]. At least four type I secretion systems are encoded in the genome of P. carotovorum SCC1 (prtDEF, SCC1_1144-1146, SCC1_1589-1591, and SCC1_3286-3288). Strain SCC1 also harbors a type III secretion system cluster (SCC1_2406-2432), which has previously been characterized in this strain and shown to affect the speed of symptom development during infection [6,36]. Overall, the role of the type III secretion system in Pectobacterium is not well understood, and P. wasabiae and P. parmentieri seem to lack it completely [32,37]. The type IV secretion system has been shown to make a minor contribution to the virulence of P. atrosepticum [38]. However, it is sporadically distributed among Pectobacterium strains [33], and no type IV secretion genes were found in the genome of P. carotovorum SCC1. Finally, the type VI secretion system has also been shown to have a small effect on virulence, at least in some Pectobacterium species [32,39]. In P. carotovorum SCC1, one type VI secretion system cluster is present in the genome (SCC1_0988-1002).
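The secretion system clusters above are given as locus tag ranges such as SCC1_1144-1146. A minimal Python sketch for expanding such a range into individual tags follows. It assumes that tags within a range are numbered consecutively in steps of one with a fixed zero-padded width, which the text does not state explicitly, so the resulting tag lists are illustrative rather than a confirmed gene inventory. The function name is hypothetical.

```python
from typing import List


def expand_locus_range(tag_range: str) -> List[str]:
    """Expand 'SCC1_1144-1146' into ['SCC1_1144', 'SCC1_1145', 'SCC1_1146'].

    Assumption: tags are consecutive integers (step 1) with fixed width.
    """
    prefix, _, numbers = tag_range.rpartition("_")
    first, last = numbers.split("-")
    width = len(first)  # preserve zero padding, e.g. '0988'
    start, end = int(first), int(last)
    if end < start:
        raise ValueError(f"descending range: {tag_range}")
    return [f"{prefix}_{n:0{width}d}" for n in range(start, end + 1)]


# Ranges as written in the text.
t1ss_example = expand_locus_range("SCC1_1144-1146")
t6ss_cluster = expand_locus_range("SCC1_0988-1002")
print(t1ss_example)
print(t6ss_cluster[0], t6ss_cluster[-1], len(t6ss_cluster))
```

Such a list can then be used to pull the corresponding records from the GenBank entries for a closer look at each cluster.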
Soft rot pathogens have been suggested to be able to use insect vectors in transmission, and indeed, certain P. carotovorum strains can infect Drosophila flies and persist in their guts [40]. This ability has been linked to the Evf (Erwinia virulence factor) protein [41]. The evf gene is present in the genome of P. carotovorum SCC1, suggesting that the strain may have the ability to interact with insects.

Conclusions
In this study, we present the annotated genome sequence of the pectinolytic plant pathogen Pectobacterium carotovorum SCC1, which consists of a chromosome of 4,974,798 bp and a small cryptic plasmid of 5524 bp. Strain SCC1 was originally isolated from a diseased potato tuber, and it has been used for decades as a model strain to study interactions between soft rot pathogens and their host plants. In accordance with its pathogenic lifestyle, the genome of strain SCC1 was found to harbor a large arsenal of plant cell wall-degrading enzymes similar to that of other sequenced Pectobacterium genomes. In addition, an insect interaction gene, evf, is present in the genome of strain SCC1, suggesting the possibility of insects as vectors or alternative hosts for this strain. The genome sequence will further promote the use of P. carotovorum SCC1 as a model plant pathogen.

Competing interests
The authors declare that they have no competing interests.

Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.