- Short genome report
- Open Access
Genome sequences of Knoxdaviesia capensis and K. proteae (Fungi: Ascomycota) from Protea trees in South Africa
Standards in Genomic Sciences volume 11, Article number: 22 (2016)
Two closely related ophiostomatoid fungi, Knoxdaviesia capensis and K. proteae, inhabit the fruiting structures of certain Protea species indigenous to southern Africa. Although K. capensis occurs in several Protea hosts, K. proteae is confined to P. repens. In this study, the genomes of K. capensis CBS139037 and K. proteae CBS140089 are determined. The genome of K. capensis consists of 35,537,816 bp assembled into 29 scaffolds and contains 7940 predicted protein-coding genes, of which 6192 (77.98 %) could be functionally classified. K. proteae has a similar genome size of 35,489,142 bp, assembled into 133 scaffolds. A total of 8173 protein-coding genes were predicted for K. proteae and 6093 (74.55 %) of these have functional annotations. The GC content of both genomes is 52.8 %.
Two lineages of the polyphyletic assemblage known as ophiostomatoid fungi are associated with the fruiting structures (infructescences) of serotinous Protea L. plants. Protea species are a key component of the fynbos vegetation in the Core Cape Subregion (CCR) of South Africa and the genus is predominantly encountered in South Africa [4, 5]. The Protea-associated ophiostomatoid fungi are, therefore, believed to be endemic to this region, similar to their hosts. This association of ophiostomatoid fungi with a keystone plant genus in a biodiversity hotspot is intriguing, as many ophiostomatoid fungi are notorious pathogens of trees [7–10], yet the Protea ophiostomatoid species are not associated with disease symptoms.
Ophiostomatoid fungi are characterized by the flask-shaped morphology of their sexual fruiting structures and their association with arthropods [1, 12]. The Protea-associated members of this assemblage are primarily dispersed by mites that come into contact with fungal spores in the Protea infructescences [13, 14]. These mites have limited dispersal ability, but use beetles and possibly larger vertebrates (such as birds) as vehicles for long-distance dispersal [15, 16].
The three Knoxdaviesia M.J. Wingf., P.S. van Wyk & Marasas species associated with Protea have intriguing host ranges. K. capensis M.J. Wingf. & P.S. van Wyk occurs in at least eight different Protea hosts, whereas K. proteae M.J. Wingf., P.S. van Wyk & Marasas and K. wingfieldii (Roets & Dreyer) Z.W. de Beer & M.J. Wingf. are each confined to a single host species, P. repens L. and P. caffra Meisn., respectively [17–20]. An investigation of the population biology of K. proteae revealed that this fungus has a high level of intra-specific genetic diversity and that it is extensively dispersed within the CCR of South Africa [16, 21]. However, other than host range and dispersal mechanisms, little is known about the biology and ecology of Knoxdaviesia in general. Here we present the first draft genome sequences of the two CCR species, K. capensis and K. proteae, as well as their respective annotations.
Classification and features
One lineage of Protea-associated ophiostomatoid fungi resides in the Ophiostomataceae (Ophiostomatales, Ascomycota), while the second resides in the Gondwanamycetaceae (Microascales, Ascomycota) [11, 22]. The latter group includes three closely related Protea-associated species in the genus Knoxdaviesia (Fig. 1). This genus was initially described to accommodate the asexual state of the first species in the genus, K. proteae. Under the dual nomenclature system of fungi, the sexual state of this fungus was described in the same paper as Ceratocystiopsis proteae M.J. Wingf., P.S. van Wyk & Marasas. A new genus, Gondwanamyces G.J. Marais & M.J. Wingf., was later described to accommodate the sexual state of this species and that of another species, Ophiostoma capense M.J. Wingf. & P.S. van Wyk. The asexual states of both continued to be treated as species of Knoxdaviesia. Since the abolishment of the dual nomenclature system of fungi, the oldest genus name takes precedence, irrespective of morph [25, 26]. The name Knoxdaviesia, therefore, has priority and all species previously treated in Gondwanamyces were transferred to Knoxdaviesia.
In a study determining the genome sequence of any fungus, it is advisable to use a living isolate connected to the type specimen. However, the ex-type isolate of K. proteae (CMW738 = CBS486.88) is more than 20 years old and no longer displays the characteristic morphological features of the fungus in culture. No living ex-type isolate exists for K. capensis. We thus collected fresh isolates of both species for this study in order to eliminate possible mutations or degradation that may have occurred through continual artificial propagation in culture media. The new isolates (Figs. 1 & 2) were collected from the same localities and hosts as the holotype specimens: K. capensis (CMW40890 = CBS139037) from the infructescences of P. longifolia Andrews in Hermanus, and K. proteae (CMW40880 = CBS140089) from P. repens infructescences in Stellenbosch, both locations in the Western Cape Province of South Africa. General features of these isolates are outlined in Table 1.
Genome sequencing information
Genome project history
Considering the lack of ecological information on the genus Knoxdaviesia and the close relationship these Microascalean fungi have to important plant pathogens, two Protea-associated Knoxdaviesia species, believed to be native to the CCR in South Africa, were selected for genome sequencing. Both species were sequenced at Fasteris in Switzerland. The genome projects are listed in the Genomes OnLine Database and the whole genome shotgun (WGS) project has been deposited at DDBJ/EMBL/GenBank (Table 2). Table 2 presents the project information and its compliance with the Minimum Information about a Genome Sequence (MIGS) version 2.0 specification. The full MIGS records for K. capensis and K. proteae are available in Additional file 1: Table S1 and Additional file 2: Table S2, respectively.
Growth conditions and genomic DNA preparation
Both K. capensis and K. proteae were cultured on Malt Extract Agar (MEA; Merck, Wadeville, South Africa) overlaid with sterile cellophane sheets (Product no. Z377597, Sigma-Aldrich, Steinheim, Germany). After 10 days of growth at 25 °C, mycelia were scraped from the cellophane and DNA was extracted according to Aylward et al. Approximately 5 μg DNA from each species was used to prepare the three Illumina libraries (Table 2).
RNA was extracted from the K. proteae genome isolate to use as evidence for gene prediction. After growth on MEA at 25 °C for approximately 10 days, total RNA was isolated from the mycelia with the PureLink™ RNA Mini Kit (Ambion, Austin, TX, USA). Quality control was performed on the Agilent 2100 Bioanalyzer (Agilent Technologies, USA) using the RNA 6000 Nano Assay kit (Agilent Technologies, USA). The mRNA component of the total RNA was subsequently extracted with the Dynabeads® mRNA purification kit (Ambion, Austin, TX, USA).
Genome sequencing and assembly
The genomes of K. capensis and K. proteae were sequenced with the Illumina HiSeq 2500 platform at Fasteris, Switzerland, using two paired-end libraries and one Nextera mate-pair library (Table 2). More than 60 million paired-end and 8 million mate-pair reads were obtained for each species. These reads were trimmed in CLC Genomics Workbench 6.5 (CLC bio, Aarhus, Denmark) so that the Phred Q (quality) score of each base was at least Q20. VelvetOptimiser (Gladman & Seeman, unpublished), a Perl script used as part of the Velvet assembler [31, 32], was initially used to optimize the assembly parameters. Assembly of contigs was performed in ABySS 1.5.2 using the optimal parameters suggested by VelvetOptimiser as a starting point. Several assemblies were computed using k-mer values slightly higher and lower than the k-mer value suggested by VelvetOptimiser. The assembly with the lowest number of contigs was used to build scaffolds in SSPACE 3.0, discarding scaffolds smaller than 1000 bp. Automatic gap closure was performed in GapFiller 1.10. The average genome coverage of each library was estimated using the Lander-Waterman equation (total sequenced nucleotides/genome size) (Table 2), which yielded a combined average coverage for the three libraries of 188.5x (K. capensis) and 271.5x (K. proteae).
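The coverage estimate described above is simple arithmetic, and a minimal sketch may make it concrete. Only the genome size below is taken from the text; the read counts and read lengths in the example are hypothetical placeholders, since the per-library values are given in Table 2 and are not reproduced here. The function names are illustrative and do not come from any tool used in the study.

```python
from typing import Iterable, Tuple

# Genome size reported in the text for K. capensis (bp).
K_CAPENSIS_GENOME_BP = 35_537_816


def average_coverage(total_sequenced_nt: int, genome_size_bp: int) -> float:
    """Average coverage as total sequenced nucleotides / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return total_sequenced_nt / genome_size_bp


def combined_coverage(libraries: Iterable[Tuple[int, int]],
                      genome_size_bp: int) -> float:
    """Combined coverage over several libraries.

    Each library is a (read_count, mean_read_length_nt) pair; the
    nucleotides of all libraries are summed before dividing.
    """
    total_nt = sum(n_reads * read_len for n_reads, read_len in libraries)
    return average_coverage(total_nt, genome_size_bp)


if __name__ == "__main__":
    # HYPOTHETICAL library sizes, chosen only to illustrate the calculation.
    example_libraries = [
        (30_000_000, 100),  # paired-end library 1 (assumed)
        (30_000_000, 100),  # paired-end library 2 (assumed)
        (8_000_000, 100),   # mate-pair library (assumed)
    ]
    cov = combined_coverage(example_libraries, K_CAPENSIS_GENOME_BP)
    print(f"combined coverage: {cov:.1f}x")
```

Because trimming removes bases, the nucleotide totals entering such a calculation would normally be counted after quality trimming; whether the reported values were computed before or after trimming is not stated in the text.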
The K. capensis genome consists of 29 scaffolds ranging between 1226 and 5,637,848 bp, whereas the 133 scaffolds of K. proteae are sized between 1022 and 2,610,973 bp. A search for the 1438 fungal universal single-copy ortholog genes with BUSCO 1.1b1 identified 1355 complete and 67 partial genes in K. capensis and 1366 complete and 57 partial genes in K. proteae. The two genomes are therefore estimated to be >98 % complete.
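The >98 % completeness figure can be reproduced from the BUSCO counts above if complete and partial orthologs are counted together; counting complete genes alone gives roughly 94 to 95 %. The sketch below assumes that reading, and the function name is illustrative rather than part of BUSCO.

```python
def busco_completeness(complete: int, partial: int, total: int) -> float:
    """Percentage of searched orthologs found as complete or partial genes."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * (complete + partial) / total


# Counts reported in the text (1438 fungal single-copy orthologs searched).
BUSCO_TOTAL = 1438
counts = {
    "K. capensis": (1355, 67),
    "K. proteae": (1366, 57),
}

for species, (complete, partial) in counts.items():
    pct = busco_completeness(complete, partial, BUSCO_TOTAL)
    print(f"{species}: {pct:.2f} % of orthologs detected")
```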
The extracted mRNA of K. proteae was sequenced using an Ion PI™ Chip on the Ion Proton™ System (Life Technologies, Carlsbad, CA) at the Central Analytical Facility (CAF), Stellenbosch University, South Africa. The >49 million raw RNA-Seq reads were mapped to the K. capensis genome in CLC Genomics Workbench and assembled with Trinity 2.0.6 using the genome-guided option.
Genome annotation was performed with the MAKER 2.31.8 pipeline [38, 39], using custom repeat libraries for each species constructed with RepeatScout 1.0.5 and two de novo gene predictors, SNAP 2006-07-28 and AUGUSTUS 3.0.3. The assembled K. proteae RNA-Seq and predicted protein and/or transcript sequences from 22 sequenced Sordariomycete species (Additional file 3: Table S3), including two Microascalean fungi, were provided as additional evidence. AUGUSTUS was trained with the assembled K. proteae RNA-Seq data and subsequently MAKER was used to annotate the largest scaffold of the K. capensis and the largest scaffold of the K. proteae assembly, independently. After manually curating all the gene predictions on these scaffolds with Apollo 1.11.8, SNAP was trained with the curated gene predictions of each scaffold and the scaffolds were re-annotated. SNAP was re-trained for each species individually and subsequently both genomes were annotated. EuKaryotic Orthologous Group (KOG) classifications were assigned to the predicted proteins through the WebMGA portal that performs reverse-position-specific BLAST searches on the KOG database. Additional functional annotations were predicted with InterProScan 5.13-52.0 [47, 48], SignalP 4.1 and TMHMM 2.0.
K. capensis and K. proteae have similar genome sizes at 35.54 and 35.49 Mbp, respectively. It was possible to assemble the K. capensis genome into 29 scaffolds larger than 1000 bp, whereas the number of scaffolds above this threshold achieved for K. proteae was 133. Both genomes had a GC content of 52.8 %.
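The two summary statistics in this paragraph, the number of scaffolds above the 1000 bp threshold and the GC content, can be derived from an assembly along the following lines. This is a minimal sketch on toy sequences, not the procedure used in the study; the function names are hypothetical, and excluding ambiguous bases (such as the N characters that fill scaffold gaps) from the GC denominator is an assumption.

```python
from typing import Dict, Iterable


def filter_scaffolds(scaffolds: Dict[str, str],
                     min_length: int = 1000) -> Dict[str, str]:
    """Keep scaffolds of at least min_length bp (smaller ones are discarded)."""
    return {name: seq for name, seq in scaffolds.items()
            if len(seq) >= min_length}


def gc_content(sequences: Iterable[str]) -> float:
    """GC percentage over all unambiguous bases (A, C, G, T) in the sequences."""
    gc = 0
    acgt = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        acgt += sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / acgt


if __name__ == "__main__":
    # Toy assembly for illustration only; not real Knoxdaviesia sequence.
    toy_assembly = {
        "scaffold_1": "GC" * 600,         # 1200 bp, kept
        "scaffold_2": "AT" * 500,         # 1000 bp, kept
        "scaffold_3": "ACGT" * 100,       # 400 bp, discarded
    }
    kept = filter_scaffolds(toy_assembly)
    print(len(kept), "scaffolds kept")
    print(f"GC content: {gc_content(kept.values()):.1f} %")
```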
A total of 7940 protein-coding genes were predicted for K. capensis and 8173 for K. proteae. Additionally, 137 and 116 tRNA genes and 30 and 27 rRNA genes were predicted for K. capensis and K. proteae, respectively. More than 74 % of the protein-coding genes of each species could be assigned to a putative function via the KOG and Pfam databases. The contents of the two genomes are summarized in Tables 3 and 4.
At least six Microascalean fungi currently have publicly accessible genomes [51–54]. K. capensis and K. proteae, however, represent the first sequenced genomes from the Microascalean family Gondwanamycetaceae. The genomes of these two species will not only enable further understanding of the unique ecology of Protea-inhabiting fungi, but will also be valuable in taxonomic and evolutionary studies.
CCR: Core Cape Subregion
MEA: Malt Extract Agar
KOG: EuKaryotic Orthologous Groups of proteins
Spatafora JW, Blackwell M. The polyphyletic origins of ophiostomatoid fungi. Mycol Res. 1994;98:1–9.
Wingfield BD, Viljoen CD, Wingfield MJ. Phylogenetic relationships of ophiostomatoid fungi associated with Protea infructescences in South Africa. Mycol Res. 1999;103:1616–20.
Manning J, Goldblatt P. Plants of the Greater Cape Floristic Region. 1: The Core Cape Flora. Strelitzia 29, vol 29. Pretoria: South African National Biodiversity Institute; 2012.
Gibbs RG. Analysis of the size and composition of the southern African flora. Bothalia. 1984;15:613–29.
Rebelo T. Proteas: a field guide to the Proteas of Southern Africa. Vlaeberg: Fernwood Press; 1995.
Mittermeier RA, Myers N, Thomsen JB, Da Fonseca GAB, Olivieri S. Biodiversity hotspots and major tropical wilderness areas: approaches to setting conservation priorities. Conserv Biol. 1998;12:516–20.
Paine TD, Raffa KF, Harrington TC. Interactions among scolytid bark beetles, their associated fungi, and live host conifers. Annu Rev Entomol. 1997;42:179–206.
Brasier CM. Ophiostoma novo-ulmi sp. nov., causative agent of current Dutch elm disease pandemics. Mycopathologia. 1991;115:151–61.
Harrington TC, Wingfield MJ. The Ceratocystis species on conifers. Can J Botany. 1998;76:1446–57.
Roux J, Wingfield MJ. Ceratocystis species: emerging pathogens of non-native plantation Eucalyptus and Acacia species. South Forests. 2009;71:115–20.
Roets F, Wingfield MJ, Crous PW, Dreyer LL. Taxonomy and ecology of ophiostomatoid fungi associated with Protea infructescences. In: Seifert KA, de Beer ZW, Wingfield MJ, editors. Ophiostomatoid fungi: expanding frontiers. Utrecht: CBS Biodiversity Series; 2013. p. 177–87.
Malloch D, Blackwell M. Dispersal biology of the ophiostomatoid fungi. In: Wingfield MJ, Seifert KA, Webber JF, editors. Ceratocystis and Ophiostoma: taxonomy, ecology and pathology. St. Paul: APS Press; 1993. p. 195–206.
Roets F, Wingfield MJ, Crous PW, Dreyer LL. Discovery of Fungus-Mite Mutualism in a Unique Niche. Environ Entomol. 2007;36:1226–37.
Roets F, Wingfield MJ, Wingfield BD, Dreyer LL. Mites are the most common vectors of the fungus Gondwanamyces proteae in Protea infructescences. Fungal Biol. 2011;115:343–50.
Roets F, Crous PW, Wingfield MJ, Dreyer LL. Mite-mediated hyperphoretic dispersal of Ophiostoma spp. from the infructescences of South African Protea spp. Environ Entomol. 2009;38:143–52.
Aylward J, Dreyer LL, Steenkamp ET, Wingfield MJ, Roets F. Long-distance dispersal and recolonization of a fire-destroyed niche by a mite-associated fungus. Fungal Biol. 2015;119:245–56.
Roets F, Theron N, Wingfield MJ, Dreyer LL. Biotic and abiotic constraints that facilitate host exclusivity of Gondwanamyces and Ophiostoma on Protea. Fungal Biol. 2011;116:49–61.
Roets F, Dreyer LL, Crous PW. Seasonal trends in colonisation of Protea infructescences by Gondwanamyces and Ophiostoma spp. S Afr J Bot. 2005;71:307–11.
Wingfield MJ, Van Wyk PS. A new species of Ophiostoma from Protea infructescences in South Africa. Mycol Res. 1993;97:709–16.
Crous PW, Summerell BA, Shivas RG, Burgess TI, Decock CA, Dreyer LL, et al. Fungal Planet description sheets: 107-127. Persoonia. 2012;28:138–82.
Aylward J, Dreyer LL, Steenkamp ET, Wingfield MJ, Roets F. Panmixia defines the genetic diversity of a unique arthropod-dispersed fungus specific to Protea flowers. Ecol Evol. 2014;4:3444–55.
Réblová M, Gams W, Seifert KA. Monilochaetes and allied genera of the Glomerellales, and a reconsideration of families in the Microascales. Stud Mycol. 2011;68:163–91.
Wingfield MJ, Van Wyk PS, Marasas WFO. Ceratocystiopsis proteae sp. nov. with a new anamorph genus. Mycologia. 1988;80:23–30.
Marais GJ, Wingfield MJ, Viljoen CD, Wingfield BD. A new ophiostomatoid genus from Protea infructescences. Mycologia. 1998;90:136–41.
Hawksworth DL. A new dawn for the naming of fungi: impacts of decisions made in Melbourne in July 2011 on the future publication and regulation of fungal names. IMA Fungus. 2011;2:155.
McNeill J, Barrie F, Buck W, Demoulin V, Greuter W, Hawksworth D, et al. International Code of Nomenclature for algae, fungi, and plants (Melbourne Code) adopted by the Eighteenth International Botanical Congress Melbourne, Australia, July 2011. Regnum Vegetabile 154, Koeltz Scientific Books. 2012.
De Beer ZW, Seifert KA, Wingfield MJ. A nomenclature for ophiostomatoid genera and species in the Ophiostomatales and Microascales. In: Seifert KA, de Beer ZW, Wingfield MJ, editors. Ophiostomatoid fungi: expanding frontiers, CBS Biodiversity Series. 2013. p. 245–322.
Pagani I, Liolios K, Jansson J, Chen I-MA, Smirnova T, Nosrat B, et al. The Genomes OnLine Database (GOLD) v. 4: status of genomic and metagenomic projects and their associated metadata. Nucleic Acids Res. 2012;40:D571–9.
Field D, Garrity G, Gray T, Morrison N, Selengut J, Sterk P, et al. The minimum information about a genome sequence (MIGS) specification. Nat Biotechnol. 2008;26:541–7.
Aylward J, Dreyer LL, Steenkamp ET, Wingfield MJ, Roets F. Development of polymorphic microsatellite markers for the genetic characterisation of Knoxdaviesia proteae (Ascomycota: Microascales) using ISSR-PCR and pyrosequencing. Mycol Prog. 2014;13:439–44.
Zerbino DR. Using the Velvet de novo Assembler for Short‐Read Sequencing Technologies. Curr Protoc Bioinformatics. 2010; Unit 11.5:1-13.
Zerbino DR, Birney E. Velvet: algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 2008;18:821–9.
Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol İ. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009;19:1117–23.
Boetzer M, Henkel CV, Jansen HJ, Butler D, Pirovano W. Scaffolding pre-assembled contigs using SSPACE. Bioinformatics. 2011;27:578–9.
Boetzer M, Pirovano W. Toward almost closed genomes with GapFiller. Genome Biol. 2012;13:R56.
Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Supplementary Online Materials: http://busco.ezlab.org/files/BUSCO-SOM.pdf. 2015.
Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat Protoc. 2013;8:1494–512.
Cantarel BL, Korf I, Robb SM, Parra G, Ross E, Moore B, et al. MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 2008;18:188–96.
Holt C, Yandell M. MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects. BMC Bioinformatics. 2011;12:491.
Price AL, Jones NC, Pevzner PA. De novo identification of repeat families in large genomes. Bioinformatics. 2005;21:i351–8.
Korf I. Gene finding in novel genomes. BMC Bioinformatics. 2004;5:59.
Stanke M, Steinkamp R, Waack S, Morgenstern B. AUGUSTUS: a web server for gene finding in eukaryotes. Nucleic Acids Res. 2004;32:W309–12.
Lewis SE, Searle S, Harris N, Gibson M, Iyer V, Richter J, et al. Apollo: a sequence annotation editor. Genome Biol. 2002;3:1–14.
Wu S, Zhu Z, Fu L, Niu B, Li W. WebMGA: a customizable web server for fast metagenomic sequence analysis. BMC Genomics. 2011;12:444.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215:403–10.
Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, et al. The COG database: an updated version includes eukaryotes. BMC Bioinformatics. 2003;4:41.
Goujon M, McWilliam H, Li W, Valentin F, Squizzato S, Paern J, et al. A new bioinformatics analysis tools framework at EMBL–EBI. Nucleic Acids Res. 2010;38:W695–9.
Zdobnov EM, Apweiler R. InterProScan – an integration platform for the signature-recognition methods in InterPro. Bioinformatics. 2001;17:847–8.
Petersen TN, Brunak S, von Heijne G, Nielsen H. SignalP 4.0: discriminating signal peptides from transmembrane regions. Nat Meth. 2011;8:785–6.
Krogh A, Larsson B, Von Heijne G, Sonnhammer EL. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol. 2001;305:567–80.
Van der Nest MA, Bihon W, De Vos L, Naidoo K, Roodt D, Rubagotti E, et al. Draft genome sequences of Diplodia sapinea, Ceratocystis manginecans, and Ceratocystis moniliformis. IMA Fungus. 2014;5:135–40.
Van der Nest MA, Beirn LA, Crouch JA, Demers JE, de Beer ZW, De Vos L, et al. IMA Genome-F 3: Draft genomes of Amanita jacksonii, Ceratocystis albifundus, Fusarium circinatum, Huntiella omanensis, Leptographium procerum, Rutstroemia sydowiana, and Sclerotinia echinophila. IMA Fungus. 2014;5:473–86.
Vandeputte P, Ghamrawi S, Rechenmann M, Iltis A, Giraud S, Fleury M, et al. Draft genome sequence of the pathogenic fungus Scedosporium apiospermum. Genome Announc. 2014;2:e00988-14.
Wilken PM, Steenkamp ET, Wingfield MJ, De Beer ZW, Wingfield BD. IMA Genome-F 1: Ceratocystis fimbriata: draft nuclear genome sequence for the plant pathogen, Ceratocystis fimbriata. IMA Fungus. 2013;4:357.
Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–80.
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013;30:2725–9.
Tamura K, Nei M. Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzees. Mol Biol Evol. 1993;10:512–26.
Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–9.
We are grateful to Quentin Santana and Dr. Lieschen Bahlmann for their guidance in the genome assembly and annotation procedures and to Dr. Wilhelm de Beer for the taxonomic information he contributed to this manuscript. This research was funded by the National Research Foundation (NRF) and the Department of Science and Technology/NRF Centre of Excellence in Tree Health Biotechnology. We also thank the Cape Nature Conservation Board for supplying the necessary collection permits.
The authors declare that they have no competing interests.
MJW, BDW and ETS conceived the study. LLD and FR supervised the study. JA performed the laboratory work. JA assembled and annotated the genomes with the help of BDW and ETS. JA drafted the manuscript with the help of LLD and FR. ETS revised the manuscript. All authors read and approved the final manuscript.