Table 1 Estimating plant microbiome diversity and NP potential on Earth

From: Deep learning approaches for natural product discovery from plant endophytic microbiomes

Here, we use simple, additive, non-combinatoric approaches to estimate endophyte species richness. In these calculations, only endophytic fungi and bacteria are considered, with the simplifying assumption that each endophyte species acts alone (without the plant or other microbes) to synthesize its NPs.

Historical estimates: The most widely cited estimate of endophyte species on Earth, proposed over 25 years ago, predicted 1.3 million endophytic fungi [11]. The calculation considered only culturable fungi from vascular plants and was based on researchers’ experience suggesting that each plant species hosts 2 to 5 unique host-specific endophytes (thus, for 270,000 plant species there would be up to 5 × 270,000 unique endophytes). While that study did not estimate global NPs, the following calculation attempts to do so. The study argued that each phylogenetic cluster of fungi produces a largely similar set of known secondary metabolites, which largely differs from that of other clusters. The 8 well-studied groups of endophytic fungi in [11] comprise 8300 species, an average of 1038 species per group, so 1.3 million species would comprise 1253 groups (1.3 million/1038 species per group). Together these 8 groups reportedly produced 5351 unique known secondary metabolites, an average of 669 secondary metabolites per group, which for 1253 groups gives an estimated 838,100 unique secondary metabolites on Earth. The shortcomings of these estimates include the omission of bacterial endophytes and non-culturable endophytes, and the omission of novel metabolites and of predictions of silent or cryptic biosynthetic clusters.
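The arithmetic behind the historical estimate can be retraced in a few lines. This is only a sketch that uses the figures quoted above from [11]; the variable names are illustrative and are not taken from the original study.

```python
# Figures quoted in the table (from [11]); names are illustrative assumptions.
total_fungal_species = 1_300_000   # historical estimate of endophytic fungi
studied_groups = 8                 # well-studied groups of endophytic fungi
studied_species = 8300             # species across those 8 groups
known_metabolites = 5351           # unique known secondary metabolites

# Averages per phylogenetic group.
species_per_group = studied_species / studied_groups
metabolites_per_group = known_metabolites / studied_groups

# Scale the per-group averages up to the global species estimate.
n_groups = total_fungal_species / species_per_group
global_metabolites = n_groups * metabolites_per_group

print(f"species per group:     {species_per_group:.0f}")
print(f"metabolites per group: {metabolites_per_group:.0f}")
print(f"estimated groups:      {n_groups:.0f}")
print(f"global metabolites:    {global_metabolites:,.0f}")
```

Without intermediate rounding, the result lands within a few units of the 838,100 metabolites quoted above.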

Estimates based on next-generation sequencing: Amplicon sequencing of endophytes from plant tissues using 16S rRNA and ITS or 18S rRNA genes has revealed large numbers of previously uncultured endophytes (i.e. OTU-based surveys, Fig. 1). From these data, a simple “back-of-the-napkin” estimate suggests there may be at least one non-culturable host-specific fungal endophyte for every one that is culturable [23], and perhaps 10 host-specific bacterial endophytes, such that for the estimated ~ 300,000 plant species on Earth, there may be 10 fungi + 10 bacteria (=20) × 300,000 = 120 million endophyte species on Earth. Although this is two orders of magnitude greater than historical estimates, it would constitute only 0.012% of the estimated 1 trillion microbial species on Earth [32], suggesting it is not absurdly high. Based on the estimates of known metabolites discussed above, endophytic fungi might produce 77,450,000 unique secondary metabolites (110,800 groups × 699 per group); estimating about half as many unique secondary metabolites per bacterial species, there would be perhaps 38,725,000 unique bacterial metabolites. However, considering studies that suggest ~ 90% of secondary metabolite biosynthetic capacity is silent or cryptic [33], the estimated endophyte-derived secondary metabolites on Earth might total 1.045 × 10^9, or about a billion potential endophyte secondary metabolites.
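The last steps of this estimate can be retraced as follows. The sketch takes the species and metabolite figures stated above as given inputs. It also assumes, as an interpretation not spelled out in the table, that the expressed metabolites represent the visible 10% of biosynthetic capacity, so that the silent 90% is nine times as large. All names are illustrative.

```python
# Inputs taken as stated in the table; names are illustrative assumptions.
endophyte_species = 120e6            # estimated endophyte species on Earth
global_microbial_species = 1e12      # estimated microbial species on Earth [32]
fungal_metabolites = 77_450_000      # estimated fungal secondary metabolites
silent_fraction = 0.90               # share of capacity that is silent/cryptic [33]

# Share of all microbial species that would be endophytes, in percent.
share_percent = 100 * endophyte_species / global_microbial_species

# About half as many unique metabolites are assumed for bacteria.
bacterial_metabolites = fungal_metabolites / 2
expressed_metabolites = fungal_metabolites + bacterial_metabolites

# Assumption: expressed metabolites are the visible 10% of capacity,
# so the silent 90% is nine times larger.
cryptic_metabolites = expressed_metabolites * silent_fraction / (1 - silent_fraction)

print(f"share of microbial species: {share_percent:.3f}%")
print(f"bacterial metabolites:      {bacterial_metabolites:,.0f}")
print(f"expressed metabolites:      {expressed_metabolites:,.0f}")
print(f"silent/cryptic metabolites: {cryptic_metabolites:.3e}")
```

Under this reading, the silent or cryptic pool alone comes to roughly a billion metabolites, in line with the total quoted above.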

Model-fitting: Estimates of endophyte species richness and NP potential could incorporate OTU data (e.g. see Fig. 1) into models of species discovery or species accumulation curves. These can be based on the number of leaves sampled for endophytes [34] or on published new species or OTUs [35]. Alternatively, endophyte OTU data can be estimated using frequency counts, rank species abundance distributions, or Poisson lognormal (log-log) fitting approaches and scaling laws [32, 36,37,38,39,40]. In the latter case, it has been argued that microbes in microbiomes closely fit a pattern in which S (number of species) scales with N (number of individuals), where commonness (resampling) is constrained by the scaling N^z, with S ∝ N^z and 0.25 ≤ z ≤ 0.5 (for microbes z = 0.38, while for macroscopic organisms z = 0.24), and globally Nmax (number of individuals of the most abundant species) = 0.38 × N^0.93 (r^2 = 0.90) [32]. Empirically, results scale as S = 7.6 × N^0.35 (r^2 = 0.38). For endophytes, using estimated values of 10^4 to 10^8 endophytic bacterial cells per g of plant [41] plus ~ 10–100 fungal individuals per g, an estimate of total Earth plant carbon (C) of 450 Gt [42], and assuming 0.43 g of C per 1 g of plant matter [43], we estimate Earth’s bacterial endophyte individuals, N, at 1.044 × 10^22 to 1.044 × 10^26, which with scaling laws results in an estimate of global endophytic bacterial species, S, between 386 million and 9.7 billion, and S for global endophytic fungal species between 34 and 77 million. These values produce not unreasonable estimates of numbers of microbial species per plant species, within the range of values summarized based on OTUs in Fig. 1 (i.e. for bacteria, 386 to 9700 million species divided by 300,000 plant species = 1290 to 32,300 bacterial endophyte species per plant species; for example, see similar OTU estimates in [44]; and for fungi, 34 to 77 million species divided by 300,000 plant species = 113 to 257 fungal endophyte species per plant species).
Extending the idea of endophyte secondary metabolite uniqueness per species-group as discussed above [11], there may be an estimated 124 million to 3.1 billion bacterial and 22 million to 50 million fungal secondary metabolites that could be expressed in cultures, and considering additional cryptic expression [33], up to 1.3 to 28.3 × 10^9 potential endophyte secondary metabolites to be discovered.
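The model-fitting estimate can be sketched numerically. The code below applies the empirical scaling S = 7.6 × N^0.35 to the cell densities, plant carbon and carbon fraction quoted above, then carries the species counts through the per-group metabolite averages from [11]. It assumes half as many metabolites per bacterial species and a silent pool nine times the expressed one, as in the sequencing-based estimate. Function and variable names are hypothetical, and small differences from the quoted values are expected from rounding of intermediate figures.

```python
# Inputs quoted in the table; all names are illustrative assumptions.
PLANT_CARBON_G = 450e15        # 450 Gt of plant carbon, with 1 Gt = 1e15 g [42]
CARBON_FRACTION = 0.43         # g of carbon per g of plant matter [43]
PLANT_SPECIES = 300_000        # estimated plant species on Earth
SPECIES_PER_GROUP = 8300 / 8   # average species per fungal group [11]
METAB_PER_GROUP = 5351 / 8     # average known metabolites per group [11]
CRYPTIC_MULTIPLIER = 9         # silent 90% relative to the expressed 10% [33]

def richness(n_individuals: float) -> float:
    """Empirical scaling law S = 7.6 * N^0.35 quoted in the text."""
    return 7.6 * n_individuals ** 0.35

plant_mass_g = PLANT_CARBON_G / CARBON_FRACTION

# Individuals per g of plant (low, high) and the metabolite weighting
# relative to fungi (bacteria are assumed to produce half as many).
inputs = {"bacteria": ((1e4, 1e8), 0.5), "fungi": ((10, 100), 1.0)}

species = {}
metabolites = {}
for kind, (densities, weight) in inputs.items():
    species[kind] = tuple(richness(d * plant_mass_g) for d in densities)
    metabolites[kind] = tuple(
        s / SPECIES_PER_GROUP * METAB_PER_GROUP * weight for s in species[kind]
    )
    per_plant = tuple(s / PLANT_SPECIES for s in species[kind])
    print(f"{kind}: species {species[kind][0]:.3g} to {species[kind][1]:.3g}, "
          f"per plant species {per_plant[0]:.0f} to {per_plant[1]:.0f}, "
          f"metabolites {metabolites[kind][0]:.3g} to {metabolites[kind][1]:.3g}")

# Cryptic potential: nine times the expressed bacterial plus fungal total.
cryptic = tuple(
    CRYPTIC_MULTIPLIER * (metabolites["bacteria"][i] + metabolites["fungi"][i])
    for i in range(2)
)
print(f"cryptic potential: {cryptic[0]:.3g} to {cryptic[1]:.3g}")
```

The per-group averages are kept unrounded here, which is one reason the outputs agree with the figures quoted above only to within a few percent.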