Non-contiguous finished genome sequence of Phocaeicola abscessus type strain 7401987^T

Roux, Véronique; Robert, Catherine; Raoult, Didier

doi:10.4056/sigs.4428244

Open access
Published: 20 December 2013

Standards in Genomic Sciences volume 9, pages 351–358 (2013)

Abstract

Phocaeicola abscessus strain 7401987^T is the sole member of the genus Phocaeicola. This bacterium is Gram-negative, non-spore-forming, coccoid to rod-shaped and motile by lophotrichous flagella. It was isolated from a human brain abscess sample. In this work, we describe a set of features of this organism, together with the complete genome sequence and annotation. The 2,530,616 bp long genome contains 2,090 protein-coding genes and 54 RNA genes, including 4 rRNA operons.

Introduction

Phocaeicola abscessus strain 7401987^T (CSUR P22^T = DSM 21584^T = CCUG 55929^T) is the type strain of P. abscessus. This bacterium was isolated from a brain abscess sample from a 76-year-old patient who underwent neurosurgical intervention following cancer of the face [1]. It is a Gram-negative, strictly anaerobic, coccoid to rod-shaped bacterium. Currently, the genus Phocaeicola contains only one species [2].

Here we present a summary classification and a set of features for P. abscessus, together with the description of the non-contiguous finished genomic sequencing and annotation.

Classification and features

The 16S rRNA gene sequence of P. abscessus strain 7401987^T was compared with sequences deposited in the GenBank database, confirming the initial taxonomic classification. Figure 1 shows the phylogenetic neighborhood of P. abscessus in a 16S rRNA-based tree. The bacterium was characterized in 2007. It was isolated in the Timone Hospital microbiology laboratory (Table 1).

Table 1. Classification and general features of Phocaeicola abscessus strain 7401987^T

Cells are coccoid (0.3–0.6 µm wide and 0.4–0.9 µm long) to rod-shaped (0.4–1.7 µm wide and 1.2–6.5 µm long) and motile by flagella in a lophotrichous arrangement. Optimal growth of strain 7401987^T occurs at 37 °C, with a growth range of 30 to 37 °C. Surface colonies on chocolate agar after 7 days of incubation at 37 °C under anaerobic conditions were white, circular, regular, smooth, shiny, convex and 1 mm in diameter. The isolate was asaccharolytic. Activities of acid phosphatase, naphthol-AS-BI-phosphohydrolase, N-acetyl-β-glucosaminidase, α-fucosidase, α-galactosidase, β-galactosidase, β-galactosidase 6-phosphate, α-glucosidase, alkaline phosphatase, leucyl glycine arylamidase and alanine arylamidase were detected. The fatty acid profile was characterized by the predominance of anteiso-C_15:0 (28.2%), C_16:0 (18.0%), iso-C_15:0 (12.3%) and iso-C_17:0 3-OH (11.7%). The size and ultrastructure of cells were determined by negative staining transmission electron microscopy (Figure 2).

Genome sequencing and annotation

Genome project history

The organism was selected for sequencing on the basis of its phylogenetic position and 16S rRNA similarity to other members of the order Bacteroidales, and is part of a study of the new species characterized in our laboratory. The genome sequence is deposited under EMBL accession number CAKQ01000000 and consists of 39 contigs (≥ 500 bp) and 9 scaffolds. Table 2 shows the project information and its association with MIGS version 2.0 compliance.

Table 2. Project information

Growth conditions and DNA isolation

P. abscessus strain 7401987^T was grown anaerobically on chocolate agar at 37 °C. Ten Petri dishes were inoculated and the resulting growth was resuspended in 3 ml of TE buffer. Three hundred µl of 10% SDS and 150 µl of proteinase K were then added, and the suspension was incubated overnight at 56 °C. The DNA was then extracted using the phenol/chloroform method. The DNA concentration, measured with the Quant-iT PicoGreen kit (Invitrogen) on a Genios fluorometer (Tecan), was 88 ng/µl.

Genome sequencing and assembly

Shotgun and 3-kb paired-end sequencing strategies were used. The shotgun library was constructed from 500 ng of DNA using the GS Rapid Library Prep kit (Roche). For the paired-end sequencing, 5 µg of DNA was mechanically fragmented on a Hydroshear device (Digilab) with an enrichment size of 3–4 kb. The DNA fragmentation was visualized using a 2100 BioAnalyzer (Agilent) on a DNA labchip 7500, with an optimal size of 3.1 kb. The library was constructed according to the 454 GS FLX Titanium paired-end protocol. Circularization and nebulization were performed and generated a pattern with an optimal size of 579 bp. After PCR amplification through 17 cycles followed by double size selection, the single-stranded paired-end library was quantified using a Genios fluorometer (Tecan) at 8,770 pg/µL. The library concentration equivalence was calculated as 1.39E+10 molecules/µL. The library was stored at −20 °C until further use.
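The library concentration equivalence reported above can be reproduced with a short calculation. The sketch below is not taken from the authors' protocol: it assumes an average mass of 656.6 g/mol per base pair (a constant not stated in the text) and uses the 579 bp optimal fragment size as the mean library length. The function name is illustrative.

```python
AVOGADRO = 6.022e23            # molecules per mole
GRAMS_PER_MOL_PER_BP = 656.6   # assumed average mass of one base pair (not given in the text)


def library_molecules_per_ul(conc_pg_per_ul, mean_length_bp):
    """Convert a mass concentration (pg/uL) into a molecule count per microliter."""
    grams_per_ul = conc_pg_per_ul * 1e-12                    # pg -> g
    grams_per_mol = mean_length_bp * GRAMS_PER_MOL_PER_BP    # molar mass of one fragment
    return grams_per_ul / grams_per_mol * AVOGADRO


# Values reported for the paired-end library: 8,770 pg/uL and a 579 bp optimal size.
molecules = library_molecules_per_ul(8770, 579)
print(f"{molecules:.2E} molecules per microliter")
```

Under these assumptions the result rounds to the 1.39E+10 molecules/µL given in the text; a different per-base mass constant would shift the value proportionally.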

The shotgun and paired-end libraries were clonally amplified with 0.5 cpb and 2 cpb in 3 and 2 SV-emPCR reactions, respectively, with the GS Titanium SV emPCR Kit (Lib-L) v2 (Roche). The yields of the emPCR were 9.63% and 10.3%, respectively, within the 5 to 20% range specified by the Roche procedure. Approximately 790,000 beads for the shotgun application and for the 3-kb paired-end application were loaded on a GS Titanium PicoTiterPlate PTP Kit 70x75 and sequenced with a GS FLX Titanium Sequencing Kit XLR70 (Roche). The run was performed overnight and then analyzed on the cluster through the gsRunBrowser and Newbler assembler (Roche). A total of 311,276 passed filter wells were obtained and generated 35.9 Mb with an average length of 282 bp. The passed filter sequences were assembled using Newbler with 90% identity and a 40 bp overlap. The final assembly identified 9 scaffolds and 39 contigs (>500 bp).
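As a rough guide to sequencing depth, the reported yield can be divided by the genome length. The sketch below is an illustration only: it assumes that 1 Mb equals 10^6 bases, uses the final genome length of 2,530,616 bp as the denominator, and ignores reads that the assembler may have discarded. The function name is hypothetical.

```python
def mean_coverage(total_bases, genome_size_bp):
    """Average number of times each genome position is covered by sequenced bases."""
    return total_bases / genome_size_bp


sequenced_bases = 35.9e6    # 35.9 Mb of passed-filter sequence (1 Mb taken as 1e6 bases)
genome_size_bp = 2_530_616  # genome length reported for strain 7401987^T

print(f"{mean_coverage(sequenced_bases, genome_size_bp):.1f}x")
```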

Genome annotation

Open Reading Frames (ORFs) were predicted using Prodigal [10] with default parameters; predicted ORFs that spanned a sequencing gap region were excluded. The predicted bacterial protein sequences were searched against the GenBank database [11] and the Clusters of Orthologous Groups (COG) databases [12] using BLASTP. The tRNAscan-SE tool [13] was used to find tRNA genes, whereas ribosomal RNAs were found using RNAmmer [14]. Transmembrane domains and signal peptides were predicted using TMHMM [15] and SignalP [16], respectively. ORFans with an alignment length greater than 80 amino acids were identified if their BLASTP E-value was lower than 1e-03. If alignment lengths were smaller than 80 amino acids, we used an E-value of 1e-05. Such parameter thresholds have been used in previous works to define ORFans.
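The length-dependent E-value threshold described above can be written as a small helper. This is a minimal sketch, not the authors' pipeline: the function names are hypothetical, and the handling of an alignment of exactly 80 amino acids, which the text does not specify, is assumed here to fall under the stricter 1e-05 cutoff.

```python
LENGTH_BOUNDARY_AA = 80  # alignment length separating the two E-value cutoffs


def evalue_cutoff(alignment_length_aa):
    """Return the E-value cutoff that applies to an alignment of the given length."""
    if alignment_length_aa > LENGTH_BOUNDARY_AA:
        return 1e-03
    # Shorter alignments use the stricter cutoff; exactly 80 aa is an assumption.
    return 1e-05


def meets_cutoff(evalue, alignment_length_aa):
    """True when the BLASTP E-value is lower than the length-dependent cutoff."""
    return evalue < evalue_cutoff(alignment_length_aa)


print(meets_cutoff(1e-04, 120))  # long alignment, compared against 1e-03
print(meets_cutoff(1e-04, 60))   # short alignment, compared against 1e-05
```

The same E-value of 1e-04 passes for the 120-residue alignment and fails for the 60-residue one, which is the point of using two thresholds.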

To estimate the mean level of nucleotide sequence similarity at the genome level between P. abscessus and Prevotella timonensis, Bacteroides thetaiotaomicron and Paraprevotella clara, we compared the ORFs using only the sequence comparisons provided by the RAST server [17], with a query coverage of ≥70% and a minimum nucleotide length of 100 bp.
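The filtering step behind this estimate can be sketched as follows. The record type, field names and example values are made up for illustration and do not come from the RAST server output; only the two thresholds (query coverage of at least 70% and a length of at least 100 bp) are taken from the text.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class OrfComparison:
    """One pairwise ORF comparison (hypothetical record layout)."""
    identity_pct: float
    query_coverage_pct: float
    length_bp: int


def mean_similarity(comparisons, min_coverage=70.0, min_length=100):
    """Mean identity over comparisons passing both filters, or None if none pass."""
    kept = [
        c.identity_pct
        for c in comparisons
        if c.query_coverage_pct >= min_coverage and c.length_bp >= min_length
    ]
    return mean(kept) if kept else None


# Made-up comparisons: the last two are filtered out.
example = [
    OrfComparison(80.0, 95.0, 900),
    OrfComparison(72.0, 71.0, 300),
    OrfComparison(99.0, 40.0, 1200),  # dropped: query coverage below 70%
    OrfComparison(90.0, 85.0, 60),    # dropped: shorter than 100 bp
]
print(mean_similarity(example))  # 76.0
```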

Genome properties

The genome is 2,530,616 bp long with a 47.31% GC content (Table 3, Figure 3). Of the 2,144 predicted genes, 2,090 were protein-coding genes and 54 were RNAs. A total of 1,464 genes (70.05%) were assigned a putative function, and 112 genes (5.39%) were identified as ORFans. The remaining 436 genes (20.86%) were annotated as either hypothetical proteins or proteins of unknown function. The properties and statistics of the genome are summarized in Table 3, and the distribution of genes into COG functional categories is presented in Table 4. Two CRISPRs were found using the online CRISPRFinder program [18]. The first, on contig 1, includes at least 3 predicted spacer regions, and the second, on contig 18, includes at least 53 predicted spacer regions.
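The gene counts above can be checked with simple arithmetic. The sketch below assumes that the reported shares of function-assigned and hypothetical genes are expressed relative to the 2,090 protein-coding genes; that denominator is inferred from the figures and is not stated in the text.

```python
protein_coding = 2090
rna_genes = 54
total_genes = protein_coding + rna_genes  # total predicted genes


def share_of_protein_coding(count):
    """Percentage of protein-coding genes, rounded to two decimals."""
    return round(100 * count / protein_coding, 2)


print(total_genes)                    # 2144
print(share_of_protein_coding(1464))  # 70.05 (genes with a putative function)
print(share_of_protein_coding(436))   # 20.86 (hypothetical proteins)
```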

Table 3. Nucleotide content and gene count levels of the genome

Table 4. Number of genes associated with the 25 general COG functional categories

Comparison with other genomes

Phocaeicola abscessus is the sole bacterium included in the genus Phocaeicola. We compared the genome of P. abscessus with those of Prevotella timonensis (CBQQ010000001), Paraprevotella clara (AFFY01000000) and Bacteroides thetaiotaomicron (AE015928.1). P. abscessus showed a mean nucleotide sequence similarity of 76.40%, 77.06% and 77.52% at the genome level (range 70–92.25%, 70.04–95.51% and 70.04–93.02%) with Prevotella timonensis, Paraprevotella clara and B. thetaiotaomicron, respectively. Presently, the family to which P. abscessus belongs is undetermined, and a comparison based solely on nucleotide sequence similarity may not be sufficient to answer this question. In the future, further comparison of the genomes will allow us to find traits to classify the genus Phocaeicola in one of these 3 families or to create a new family, the family Phocaeicolaceae.

References

1. Al Masalma M, Raoult D, Roux V. Phocaeicola abscessus gen. nov., sp. nov., an anaerobic bacterium isolated from a human brain abscess sample. Int J Syst Evol Microbiol 2009; 59:2232–2237. http://dx.doi.org/10.1099/ijs.0.007823-0
2. Euzéby JP. List of Bacterial Names with Standing in Nomenclature: a folder available on the Internet. Int J Syst Bacteriol 1997; 47:590–592. http://dx.doi.org/10.1099/00207713-47-2-590
3. Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods. Mol Biol Evol 2011; 28:2731–2739. http://dx.doi.org/10.1093/molbev/msr121
4. Woese CR, Kandler O, Wheelis ML. Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. Proc Natl Acad Sci USA 1990; 87:4576–4579. http://dx.doi.org/10.1073/pnas.87.12.4576
5. Validation List No. 143. Int J Syst Evol Microbiol 2012; 62:1–4.
6. Krieg NR, Ludwig W, Euzéby J, Whitman WB. Phylum XIV. Bacteroidetes phyl. nov. In: Krieg NR, Staley JT, Brown DR, Hedlund BP, Paster BJ, Ward NL, Ludwig W, Whitman WB (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 4, Springer, New York, 2011, p. 25.
7. Krieg NR. Class I. Bacteroidia class. nov. In: Krieg NR, Staley JT, Brown DR, Hedlund BP, Paster BJ, Ward NL, Ludwig W, Whitman WB (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 4, Springer, New York, 2011, p. 25.
8. Krieg NR. Order I. Bacteroidales ord. nov. In: Krieg NR, Staley JT, Brown DR, Hedlund BP, Paster BJ, Ward NL, Ludwig W, Whitman WB (eds), Bergey’s Manual of Systematic Bacteriology, Second Edition, Volume 4, Springer, New York, 2011, p. 25.
9. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 2000; 25:25–29. http://dx.doi.org/10.1038/75556
10. Prodigal. http://prodigal.ornl.gov/
11. GenBank database. http://www.ncbi.nlm.nih.gov/genbank
12. Tatusov RL, Galperin MY, Natale DA, Koonin EV. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res 2000; 28:33–36. http://dx.doi.org/10.1093/nar/28.1.33
13. Lowe TM, Eddy SR. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 1997; 25:955–964.
14. Lagesen K, Hallin P, Rodland EA, Staerfeldt HH, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 2007; 35:3100–3108. http://dx.doi.org/10.1093/nar/gkm160
15. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 2001; 305:567–580. http://dx.doi.org/10.1006/jmbi.2000.4315
16. Bendtsen JD, Nielsen H, von Heijne G, Brunak S. Improved prediction of signal peptides: SignalP 3.0. J Mol Biol 2004; 340:783–795. http://dx.doi.org/10.1016/j.jmb.2004.05.028
17. Aziz RK, Bartels D, Best AA, DeJongh M, Disz T, Edwards RA, Formsma K, Gerdes S, Glass EM, Kubal M, et al. The RAST Server: Rapid Annotations using Subsystems Technology. BMC Genomics 2008; 9:75–89. http://dx.doi.org/10.1186/1471-2164-9-75
18. CRISPRFinder. http://crispr.u-psud.fr/Server/

Acknowledgements

The authors thank Mr. Julien Paganini at Xegen Company (www.xegen.fr) for automating the genomic annotation process and Laetitia Pizzo for her technical assistance.

Author information

Authors and Affiliations

Aix Marseille Université, Faculté de médecine, Aix-Marseille Université, Marseille cedex, France
Véronique Roux, Catherine Robert & Didier Raoult

Corresponding author

Correspondence to Véronique Roux.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Roux, V., Robert, C. & Raoult, D. Non-contiguous finished genome sequence of Phocaeicola abscessus type strain 7401987^T. Stand in Genomic Sci 9, 351–358 (2013). https://doi.org/10.4056/sigs.4428244

Published: 20 December 2013
Issue Date: September 2013
DOI: https://doi.org/10.4056/sigs.4428244
