Benchmarking recent Deep Learning methods on the extended Tox21 data set
19th International Workshop on (Q)SAR in Environmental and Health Sciences (QSAR2021), Poster Session, June 2021, online
The Tox21 data set has evolved into a standard benchmark for computational QSAR methods in toxicology. One limitation of the Tox21 data set, however, is that it contains only twelve toxicity assays, which strongly restricts its power to discriminate between computational methods. We address this problem by benchmarking on the extended Tox21 data set with 68 publicly available assays, which allows for a better assessment and characterization of the methods. The broader range of assays also allows for multi-task approaches, which have been particularly successful as predictive models. Furthermore, previous publications comparing methods on Tox21 did not include recent developments in machine learning, such as graph neural networks and modern Hopfield networks. We therefore benchmark a set of prominent machine learning methods, including these new types of neural networks. The results show that the best methods are modern Hopfield networks and multi-task graph neural networks, with an average area under the ROC curve of 0.91 ± 0.05 (standard deviation across assays), while traditional methods such as Random Forests fall behind by a substantial margin. The results of the full benchmark suggest that multi-task learning has a stronger effect on predictive performance than the choice of molecular representation, such as graphs, descriptors, or fingerprints.
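The summary statistic quoted above (mean area under the ROC curve with the standard deviation taken across assays) can be illustrated with a minimal sketch. This is not the evaluation code of the study. The toy labels and scores are invented, the function names are hypothetical, and two choices are assumptions: compounds without a measurement for an assay are skipped for that assay, and the sample standard deviation is used.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def per_assay_auc(y_true, y_score):
    """Compute one AUC per assay (column), skipping unmeasured compounds.

    y_true: one row per compound, one entry per assay; None marks a
            missing measurement (assumed handling, not from the source).
    y_score: predicted scores with the same shape as y_true.
    """
    n_assays = len(y_true[0])
    aucs = []
    for j in range(n_assays):
        measured = [
            (row_y[j], row_s[j])
            for row_y, row_s in zip(y_true, y_score)
            if row_y[j] is not None
        ]
        labels = [y for y, _ in measured]
        scores = [s for _, s in measured]
        aucs.append(roc_auc(labels, scores))
    return aucs


# Invented toy data: 5 compounds, 3 assays, sparse labels.
y_true = [
    [1, 0, None],
    [0, 0, 1],
    [1, None, 0],
    [0, 1, 1],
    [None, 1, 0],
]
y_score = [
    [0.9, 0.2, 0.5],
    [0.1, 0.4, 0.8],
    [0.7, 0.3, 0.3],
    [0.8, 0.6, 0.6],
    [0.5, 0.9, 0.7],
]

aucs = per_assay_auc(y_true, y_score)
# Sample standard deviation across assays (an assumption; the source
# does not state which estimator was used).
print(f"mean AUC = {mean(aucs):.2f} ± {stdev(aucs):.2f}")
```

Averaging per-assay AUCs gives every assay equal weight regardless of how many compounds were measured for it, which is one reason the spread across assays is worth reporting alongside the mean.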