Complete genome sequence of Staphylothermus hellenicus P8T
Standards in Genomic Sciences volume 5, pages 12–20 (2011)
Staphylothermus hellenicus belongs to the order Desulfurococcales within the archaeal phylum Crenarchaeota. Strain P8T is the type strain of the species and was isolated from a shallow hydrothermal vent system at Palaeochori Bay, Milos, Greece. It is a hyperthermophilic, anaerobic heterotroph. Here we describe the features of this organism together with the complete genome sequence and annotation. The 1,580,347 bp genome with its 1,668 protein-coding and 48 RNA genes was sequenced as part of a DOE Joint Genome Institute (JGI) Laboratory Sequencing Program (LSP) project.
Strain P8T (=DSM 12710 = JCM 10830) is the type strain of the species Staphylothermus hellenicus. It was isolated from a shallow hydrothermal vent at Palaeochori Bay near the island of Milos, Greece. There is one other validly named species in the genus, S. marinus, for which a complete genome sequence has been determined and published [2,3]. The S. hellenicus genome is the ninth to be published from the order Desulfurococcales in the phylum Crenarchaeota. The only other genus in the Desulfurococcales for which two species have been sequenced is Desulfurococcus. Figure 1 shows the phylogenetic position of S. hellenicus with respect to the other species in the order Desulfurococcales.
S. hellenicus was isolated from sediment at Palaeochori Bay, Milos, Greece. For isolation, 1 ml of sediment was added to half-strength SME medium with 2% elemental sulfur and incubated at 90°C under H2/CO2. Colonies were isolated on plates with the same medium and with 1% Phytagel and 2–3% sodium alginate added. S. hellenicus is a regular-shaped coccus (Figure 2) which can form large aggregates of up to fifty cells, similar to S. marinus [1,12]. No flagella were observed and cells were nonmotile. The temperature range for growth of S. hellenicus is 70–90°C, with an optimum at 85°C. The salinity range was from 2% to 8% NaCl, and the optimum was 4% NaCl. The pH range for growth was from 4.5 to 7.5, with an optimum of 6.0. S. hellenicus is a strict anaerobe, and can grow under H2/CO2 or N2/CO2. It is a heterotroph which grows well on yeast extract but poorly on peptone. Many carbon sources were tested, but none supported growth, suggesting that a complex nutrient source is required. Elemental sulfur was required for growth. The features of the organism are listed in Table 1.
Genome sequencing information
Genome project history
This organism was selected for sequencing on the basis of its phylogenetic position and is part of a Laboratory Sequencing Project (LSP) to sequence diverse archaea. The genome project is listed in the Genomes On Line Database and the complete genome sequence has been deposited in GenBank. Sequencing, finishing, and annotation were performed by the DOE Joint Genome Institute (JGI). A summary of the project information is shown in Table 2.
Growth conditions and DNA isolation
S. hellenicus P8T cells were grown in a 300 liter fermenter at 85°C in SME medium with 0.1% yeast extract, 0.1% peptone, and 0.7% elemental sulfur under a 200 kPa N2 atmosphere. DNA was isolated with a Qiagen Genomic 500 DNA Kit.
Genome sequencing and assembly
The genome of S. hellenicus was sequenced at the Joint Genome Institute (JGI) using a combination of Illumina and 454 technologies. An Illumina GA II shotgun library with reads totaling 730 Mb, a 454 Titanium draft library with an average read length of 310.5 ± 187.8 bases, and a paired-end 454 library with an average insert size of 28 kb were generated for this genome. Illumina sequencing data were assembled with Velvet, and the consensus sequences were shredded into overlapping 1.5 kb fake reads and assembled together with the 454 data using Newbler. Draft assemblies were based on 208 Mb of 454 draft data.
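The shredding of consensus sequences into overlapping fake reads can be illustrated with a minimal Python sketch. This is not the JGI pipeline: the function name `shred` and the 500 bp overlap are assumptions made for illustration (the text gives only the 1.5 kb read length), and the toy contig stands in for a Velvet consensus sequence.

```python
def shred(consensus: str, read_len: int = 1500, overlap: int = 500) -> list[str]:
    """Cut a consensus sequence into overlapping pseudo-reads.

    read_len follows the 1.5 kb figure in the text; overlap is an
    assumed value chosen only for this illustration.
    """
    if not 0 <= overlap < read_len:
        raise ValueError("overlap must be non-negative and smaller than read_len")
    step = read_len - overlap
    reads = []
    start = 0
    while start < len(consensus):
        reads.append(consensus[start:start + read_len])
        if start + read_len >= len(consensus):
            break  # the last read already reaches the end of the contig
        start += step
    return reads


# Hypothetical 4,000 bp contig standing in for a Velvet consensus sequence.
contig = "ACGT" * 1000
fake_reads = shred(contig)
print([len(r) for r in fake_reads])
```

With these assumed settings the toy contig yields four pseudo-reads, the last one shorter than the rest. Every base is covered, and neighbouring reads share 500 bp, which is what lets an overlap-based assembler join them to the 454 reads.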
The initial Newbler assembly contained 4 contigs in 1 scaffold. We converted the initial 454 assembly into a phrap assembly by making fake reads from the consensus and collecting the read pairs in the 454 paired-end library. The Phred/Phrap/Consed software package was used for sequence assembly and quality assessment [25–27] in the following finishing process. After the shotgun stage, reads were assembled with parallel phrap (High Performance Software, LLC). Possible mis-assemblies were corrected with gapResolution (Cliff Han, unpublished), Dupfinisher, or sequencing cloned bridging PCR fragments with subcloning or transposon bombing (Epicentre Biotechnologies, Madison, WI). Gaps between contigs were closed by editing in Consed, by PCR, and by Bubble PCR primer walks. A total of 23 additional reactions were necessary to close gaps and to raise the quality of the finished sequence.
Genome annotation
Genes were identified using Prodigal, followed by a round of manual curation using GenePRIMP. The predicted CDSs were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant database and the UniProt, TIGRFam, Pfam, PRIAM, KEGG, COG, and InterPro databases. The tRNAscan-SE tool was used to find tRNA genes, whereas ribosomal RNAs were found by using BLASTn against the ribosomal RNA databases. The RNA components of the protein secretion complex and the RNase P were identified by searching the genome for the corresponding Rfam profiles using INFERNAL. Additional gene prediction analysis and manual functional annotation were performed within the Integrated Microbial Genomes (IMG) platform developed by the Joint Genome Institute, Walnut Creek, CA, USA.
Genome properties
The genome includes one chromosome and no plasmids, for a total size of 1,580,347 bp (Table 3 and Figure 3). This genome size is close to the average for Desulfurococcales. The GC content is 36.8%, which is lower than that of most of the Desulfurococcales. A total of 1,716 genes were identified: 48 RNA genes and 1,668 protein-coding genes. There are 69 pseudogenes, comprising 4.1% of the protein-coding genes. About 62% of predicted genes begin with ATG, 30% begin with TTG, and 7% begin with GTG. There is one copy of each ribosomal RNA. Table 4 shows the distribution of genes in COG categories.
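The gene totals and the pseudogene fraction quoted above follow from simple arithmetic, which the short Python sketch below reproduces. The helper `gc_percent` is a hypothetical illustration of how a GC percentage is computed from a sequence; it is applied here to a toy string, not to the S. hellenicus chromosome.

```python
# Counts quoted in the text above.
PROTEIN_CODING = 1668
RNA_GENES = 48
PSEUDOGENES = 69

total_genes = PROTEIN_CODING + RNA_GENES
pseudogene_pct = 100 * PSEUDOGENES / PROTEIN_CODING


def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100 * (seq.count("G") + seq.count("C")) / len(seq)


print(total_genes)               # 1716
print(round(pseudogene_pct, 1))  # 4.1
print(gc_percent("ATGCATAT"))    # 25.0 for this hypothetical toy sequence
```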
Comparison with the S. marinus genome
The genome of S. hellenicus is slightly larger than the genome of S. marinus (1.58 Mbp vs. 1.57 Mbp), and the number of protein-coding genes is also larger (1,668 vs. 1,610). However, the number of pseudogenes is also higher in S. hellenicus (69 vs. 40). Some of the COG categories show different numbers of genes between the two organisms. S. hellenicus has 25 additional genes that do not belong to COGs. S. hellenicus has greater numbers of genes involved in cell wall biogenesis (39 vs. 23), nucleotide transport and metabolism (44 vs. 39), and carbohydrate transport and metabolism (79 vs. 72), while S. marinus has greater numbers of genes in the categories of energy production and conversion (92 vs. 79) and inorganic ion transport and metabolism (85 vs. 67).
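The COG category differences listed above can be tabulated in a few lines of Python. This is only an illustrative sketch: the counts are copied from the text, and the larger number in each pair is assigned to the species the text says has more genes. The variable names are hypothetical.

```python
# (S. hellenicus, S. marinus) gene counts per COG category, from the text.
cog_counts = {
    "cell wall biogenesis": (39, 23),
    "nucleotide transport and metabolism": (44, 39),
    "carbohydrate transport and metabolism": (79, 72),
    "energy production and conversion": (79, 92),
    "inorganic ion transport and metabolism": (67, 85),
}

# Positive values mean more genes in S. hellenicus, negative in S. marinus.
differences = {name: hel - mar for name, (hel, mar) in cog_counts.items()}

for name, diff in sorted(differences.items(), key=lambda kv: kv[1], reverse=True):
    richer = "S. hellenicus" if diff > 0 else "S. marinus"
    print(f"{name}: {diff:+d} ({richer} has more)")
```

Sorting by the signed difference puts cell wall biogenesis (+16) and inorganic ion transport and metabolism (-18) at opposite ends, the two categories with the largest gaps.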
The genes involved in cell wall metabolism that are in S. hellenicus but not in S. marinus are genes involved in nucleotide-sugar metabolism and glycosyltransferases, suggesting that S. hellenicus may have a greater variety of sugars attached to glycolipids and glycoproteins. Most of the additional S. hellenicus genes are located within a region of fifty genes on the chromosome (Shell_0865-Shell_0915) that is not present in S. marinus. The additional genes in S. hellenicus involved in nucleotide metabolism include adenylosuccinate synthase, adenylosuccinate lyase, and GMP synthase. Both S. hellenicus and S. marinus lack de novo purine synthesis, but the presence of these three additional enzymes suggests that S. hellenicus may be able to synthesize AMP and GMP from IMP, while S. marinus is unable to do so. The additional genes in carbohydrate transport and metabolism include nucleotide-sugar modifying enzymes that were also included in cell wall metabolism, but they also include a probable β-1,4-endoglucanase (cellulase) from glycosyl hydrolase family 5.
The genes found in S. marinus but not in S. hellenicus belong to the categories of energy production and conversion, and inorganic ion transport and metabolism. They include proteins related to subunits of multisubunit cation:proton antiporters and proteins related to subunits of NADH dehydrogenase and formate hydrogen lyase. These proteins are similar to subunits of mbh, a multisubunit membrane-bound hydrogenase from Pyrococcus furiosus, and mbx, a multisubunit complex of unknown function that probably has a role in sulfur reduction, also from P. furiosus. S. marinus has three operons related to mbh and mbx, while S. hellenicus has only one, suggesting that the three operons may be redundant in function in S. marinus. Since S. marinus and S. hellenicus lack other enzymes involved in sulfur reduction, it is possible that these mbh/mbx-related operons play a role in sulfur reduction in these organisms.
Arab H, Völker H, Thomm M. Thermococcus aegaeicus sp. nov. and Staphylothermus hellenicus sp. nov., two novel hyperthermophilic archaea isolated from geothermally heated vents off Palaeochori Bay, Milos, Greece. Int J Syst Evol Microbiol 2000; 50:2101–2108. PubMed doi:10.1099/00207713-50-6-2101
Anderson IJ, Dharmarajan L, Rodriguez J, Hooper S, Porat I, Ulrich LE, Elkins JG, Mavromatis K, Sun H, Land M, et al. The complete genome sequence of Staphylothermus marinus reveals differences in sulfur metabolism among heterotrophic Crenarchaeota. BMC Genomics 2009; 10:145. PubMed doi:10.1186/1471-2164-10-145
Anderson IJ, Sun H, Lapidus A, Copeland A, Glavina Del Rio T, Tice H, Dalin E, Lucas S, Barry K, Land M, et al. Complete genome sequence of Staphylothermus marinus Stetter and Fiala 1986 type strain F1. Stand Genomic Sci 2009; 1:183–188. PubMed doi:10.4056/sigs.30527
Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol 2000; 17:540–552. PubMed
Lee C, Grasso C, Sharlow MF. Multiple sequence alignment using partial order graphs. Bioinformatics 2002; 18:452–464. PubMed doi:10.1093/bioinformatics/18.3.452
Stamatakis A, Hoover P, Rougemont J. A rapid bootstrap algorithm for the RAxML web servers. Syst Biol 2008; 57:758–771. PubMed doi:10.1080/10635150802429642
Hess PN, De Moraes Russo CA. An empirical test of the midpoint rooting method. Biol J Linn Soc Lond 2007; 92:669–674. doi:10.1111/j.1095-8312.2007.00864.x
Pattengale ND, Alipour M, Bininda-Emonds ORP, Moret BME, Stamatakis A. How many bootstrap replicates are necessary? J Comput Biol 2010; 17:337–354. PubMed doi:10.1089/cmb.2009.0179
Swofford DL. PAUP*: Phylogenetic analysis using parsimony (*and other methods), Version 4.0 b10. Sinauer Associates, Sunderland, 2002.
Liolios K, Chen IM, Mavromatis K, Tavernarakis N, Hugenholtz P, Markowitz VM, Kyrpides NC. The Genomes On Line Database (GOLD) in 2009: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res 2010; 38:D346–D354. PubMed doi:10.1093/nar/gkp848
Stetter KO, König H, Stackebrandt E. Pyrodictium gen. nov., a new genus of submarine disc-shaped sulfur reducing archaebacteria growing optimally at 105°C. Syst Appl Microbiol 1983; 4:535–551.
Fiala G, Stetter KO, Jannasch HW, Langworthy TA, Madon J. Staphylothermus marinus sp. nov. represents a novel genus of extremely thermophilic submarine heterotrophic archaebacteria growing up to 98°C. Syst Appl Microbiol 1986; 8:106–113.
Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, Tatusova T, Thomson N, Allen MJ, Angiuoli SV, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol 2008; 26:541–547. PubMed doi:10.1038/nbt1360
Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA 1990; 87:4576–4579. PubMed doi:10.1073/pnas.87.12.4576
Garrity GM, Holt JG. Phylum AI. Crenarchaeota phy. nov. In Bergey’s Manual of Systematic Bacteriology, vol. 1. 2nd ed. Edited by: Garrity GM, Boone DR and Castenholz RW. Springer, New York 2001: 169–210.
List Editor. Validation of publication of new names and new combinations previously effectively published outside the IJSEM. Validation List no. 85. Int J Syst Evol Microbiol 2002; 52:685–690. PubMed doi:10.1099/ijs.0.02358-0
Reysenbach AL. Class I. Thermoprotei class. nov. In Bergey’s Manual of Systematic Bacteriology, vol. 1. 2nd ed. Edited by: Garrity GM, Boone DR, and Castenholz RW. Springer, New York; 2001: 169.
Huber H, Stetter KO. Order II. Desulfurococcales ord. nov. In: Garrity GM, Boone DR, Castenholz RW (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 1, Springer, New York, 2001, p. 179–180.
Burggraf S, Huber H, Stetter KO. Reclassification of the crenarchaeal orders and families in accordance with 16S rRNA sequence data. Int J Syst Bacteriol 1997; 47:657–660. PubMed doi:10.1099/00207713-47-3-657
Zillig W, Stetter KO, Prangishvili D, Schäfer W, Wunderl S, Janekovic D, Holz I, Palm P. Desulfurococcaceae, the second family of the extremely thermophilic, anaerobic, sulfur-respiring Thermoproteales. Zentralbl Bakteriol Parasitenkd Infektionskr Hyg Abt 1 Orig 1982; 3:304–317.
List Editor. Validation List no. 10. Validation of the publication of new names and new combinations previously effectively published outside the IJSB. Int J Syst Bacteriol 1983; 33:438–440. doi:10.1099/00207713-33-2-438
List Editor. Validation List no. 22. Validation of the publication of new names and new combinations previously effectively published outside the IJSB. Int J Syst Bacteriol 1986; 36:573–576. doi:10.1099/00207713-36-4-573
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000; 25:25–29. PubMed doi:10.1038/75556
Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res 2008; 18:821–829. PubMed doi:10.1101/gr.074492.107
Ewing B, Green P. Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res 1998; 8:186–194. PubMed
Ewing B, Hillier L, Wendl MC, Green P. Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res 1998; 8:175–185. PubMed
Gordon D, Abajian C, Green P. Consed: a graphical tool for sequence finishing. Genome Res 1998; 8:195–202. PubMed
Han C, Chain P. Finishing repeat regions automatically with Dupfinisher. In Proceedings of the 2006 international conference on bioinformatics and computational biology, ed. Arabnia HR, Valafar H. CSREA Press, 2006:141–146.
Hyatt D, Chen GL, LoCascio PF, Land ML, Larimer FW, Hauser LJ. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 2010; 11:119. PubMed doi:10.1186/1471-2105-11-119
Pati A, Ivanova NN, Mikhailova N, Ovchinnikova G, Hooper SD, Lykidis A, Kyrpides NC. GenePRIMP: a gene prediction improvement pipeline for prokaryotic genomes. Nat Methods 2010; 7:455–457. PubMed doi:10.1038/nmeth.1457
Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 1997; 25:955–964. PubMed doi:10.1093/nar/25.5.955
INFERNAL. Inference of RNA alignments. http://infernal.janelia.org
The Integrated Microbial Genomes (IMG) platform. http://img.jgi.doe.gov
Markowitz VM, Mavromatis K, Ivanova NN, Chen IMA, Chu K, Kyrpides NC. IMG ER: a system for microbial genome annotation expert review and curation. Bioinformatics 2009; 25:2271–2278. PubMed doi:10.1093/bioinformatics/btp393
Silva PJ, van den Ban EC, Wassink H, Haaker H, de Castro B, Robb FT, Hagen WR. Enzymes of hydrogen metabolism in Pyrococcus furiosus. Eur J Biochem 2000; 267:6541–6551. PubMed doi:10.1046/j.1432-1327.2000.01745.x
Schut GJ, Bridger SL, Adams MWW. Insights into the metabolism of elemental sulfur by the hyperthermophilic archaeon Pyrococcus furiosus: characterization of a coenzyme A-dependent NAD(P)H sulfur oxidoreductase. J Bacteriol 2007; 189:4431–4441. PubMed doi:10.1128/JB.00031-07