Table 3 Performance of all models on the test dataset (all) and subsets containing real and simulated reads

From: Accurate prediction of metagenome-assembled genome completeness by MAGISTA, a random forest model built on alignment-free intra-bin statistics

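| Bin statistic | Model | \(R^2_{y\sim x}\) (Real) | \(R^2_{y\sim x}\) (Simulated) | \(R^2_{y\sim x}\) (All) | RMSE (Real) | RMSE (Simulated) | RMSE (All) |
|---|---|---|---|---|---|---|---|
| Completeness | CheckM | 0.744 | 0.612 | 0.685 | 17.28 | 22.54 | 20.05 |
| Completeness | MAGISTA | 0.814 | 0.730 | 0.777 | 14.73 | 18.81 | 16.87 |
| Completeness | MAGISTIC | 0.905 | 0.836 | 0.873 | 10.52 | 14.68 | 12.75 |
| Purity | CheckM | 0.722 | -0.261 | 0.143 | 7.74 | 30.61 | 22.21 |
| Purity | MAGISTA | 0.204 | 0.240 | 0.365 | 13.10 | 23.76 | 19.12 |
| Purity | MAGISTIC | 0.672 | 0.234 | 0.449 | 8.41 | 23.85 | 17.80 |
| F1 | CheckM | 0.778 | 0.536 | 0.666 | 14.85 | 23.46 | 19.58 |
| F1 | MAGISTA | 0.787 | 0.725 | 0.766 | 14.57 | 18.04 | 16.38 |
| F1 | MAGISTIC | 0.884 | 0.775 | 0.834 | 10.75 | 16.32 | 13.79 |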
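As a reading aid for the \(R^2\) and RMSE columns, the following is a minimal sketch of how such metrics are commonly computed from true and predicted values. It assumes \(R^2\) is the coefficient of determination, 1 - SS_res / SS_tot, evaluated directly on the predictions. That definition is not confirmed by this table, although it is consistent with the negative entry for CheckM purity on simulated reads, since a squared correlation cannot be negative. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between true and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition).

    The value is negative when the predictions fit worse than
    always predicting the mean of the true values.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Invented illustration values (percent completeness), not data from the paper.
actual = [90.0, 50.0, 70.0, 30.0]
good = [85.0, 55.0, 75.0, 25.0]   # close predictions
poor = [30.0, 70.0, 50.0, 90.0]   # predictions worse than the mean

print(rmse(actual, good), r_squared(actual, good))
print(rmse(actual, poor), r_squared(actual, poor))
```

Under this definition, a lower RMSE and a higher \(R^2\) both indicate predictions closer to the true values, which is how the rows of the table can be compared.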