Fig. 1
From: Accurate prediction of metagenome-assembled genome completeness by MAGISTA, a random forest model built on alignment-free intra-bin statistics

Graphical summary of the pre-processing steps used to evaluate the usability of a specified combination of fragment length and signature choice for a given set of five genomes. Genomes are split into fragments of a specified length with a specified overlap. For each fragment, a signature is calculated using the target method; each signature is treated as an observation, and PCA is performed to reduce the observations to two dimensions. Finally, QDA is performed between the two closest clusters, where each cluster consists of observations from the same genome, and the accuracy of this classifier is reported.
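
The first two steps in the caption (fragmenting a genome, then computing one signature per fragment) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `split_into_fragments` and `kmer_signature` are hypothetical, the k-mer frequency signature is only one assumed example of a "signature choice", and "overlap" is assumed to mean the number of bases shared by consecutive fragments. The PCA and QDA steps are not shown.

```python
from collections import Counter
from itertools import product


def split_into_fragments(sequence, length, overlap):
    """Return fixed-length fragments; consecutive fragments share `overlap` bases.

    Hypothetical helper: the step between fragment starts is length - overlap.
    """
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= overlap < length:
        raise ValueError("overlap must satisfy 0 <= overlap < length")
    step = length - overlap
    # Assumption: trailing bases that cannot fill a whole fragment are dropped.
    return [
        sequence[start:start + length]
        for start in range(0, len(sequence) - length + 1, step)
    ]


def kmer_signature(fragment, k=2):
    """Relative frequency of every A/C/G/T k-mer, in lexicographic order.

    Assumed stand-in for a signature; k-mers with other characters are ignored.
    """
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(fragment[i:i + k] for i in range(len(fragment) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


# Toy sequence for illustration only; real genomes are far longer.
genome = "ACGTACGTACGTAC"
fragments = split_into_fragments(genome, length=8, overlap=4)
# One signature (observation) per fragment, ready for dimensionality reduction.
signatures = [kmer_signature(f, k=2) for f in fragments]
```

Each entry of `signatures` corresponds to one observation in the caption's sense. Stacking the signatures from all five genomes would give the matrix on which a two-component PCA and a QDA classifier could then be run.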