Abstract
In this paper, we present a technique for balancing the predictive relevance of models used for supervised modelling of ligand biochemical activities against biological targets. We train uncalibrated models using a conventional supervised machine learning technique, namely Support Vector Machines (SVMs). Unfortunately, SVMs have a serious drawback: they are sensitive to imbalanced datasets, outliers, and high multicollinearity among training samples, which can cause them to favour one class over another. Thus, an additional calibration step may be required to balance the predictive relevance of the models. As a technique for this balancing, we propose Platt’s scaling. The achieved results are demonstrated on single-target models trained on datasets exported from the ExCAPE database. Unlike traditionally used machine learning techniques, we focus on decreasing uncertainty by employing deterministic solvers.
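To make the calibration step concrete, the following is a minimal sketch of Platt's scaling in plain Python. It is not the author's implementation: the function names, the toy decision scores, and the use of a damped Newton iteration with backtracking are assumptions made for illustration only. The sketch fits a sigmoid P(y = 1 | f) = 1 / (1 + exp(A f + B)) to uncalibrated SVM decision values f, using the smoothed targets described by Platt.

```python
import math


def sigmoid_prob(score, a, b):
    """Platt's model: P(y = 1 | score) = 1 / (1 + exp(a * score + b))."""
    z = a * score + b
    if z >= 0:  # numerically stable for either sign of z
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def cross_entropy(scores, targets, a, b):
    """Cross-entropy of the sigmoid model, written as softplus(z) - (1 - t) * z."""
    total = 0.0
    for f, t in zip(scores, targets):
        z = a * f + b
        softplus = z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))
        total += softplus - (1.0 - t) * z
    return total


def fit_platt(scores, labels, iterations=100, ridge=1e-12):
    """Fit (a, b) with a damped Newton method (an assumed, illustrative solver).

    scores: uncalibrated decision values; labels: +1 or -1.
    """
    n_pos = sum(1 for y in labels if y > 0)
    n_neg = len(labels) - n_pos
    # Smoothed targets instead of hard 0/1 labels, to limit overfitting.
    t_pos = (n_pos + 1.0) / (n_pos + 2.0)
    t_neg = 1.0 / (n_neg + 2.0)
    targets = [t_pos if y > 0 else t_neg for y in labels]

    a, b = 0.0, math.log((n_neg + 1.0) / (n_pos + 1.0))
    for _ in range(iterations):
        g_a = g_b = h_ab = 0.0
        h_aa = h_bb = ridge  # tiny ridge keeps the Hessian invertible
        for f, t in zip(scores, targets):
            p = sigmoid_prob(f, a, b)
            w = p * (1.0 - p)
            g_a += f * (t - p)
            g_b += t - p
            h_aa += f * f * w
            h_ab += f * w
            h_bb += w
        if abs(g_a) < 1e-8 and abs(g_b) < 1e-8:
            break
        # Newton direction: solve the 2x2 system H * d = -g in closed form.
        det = h_aa * h_bb - h_ab * h_ab
        d_a = -(h_bb * g_a - h_ab * g_b) / det
        d_b = -(h_aa * g_b - h_ab * g_a) / det
        # Backtracking: halve the step until the loss decreases.
        current = cross_entropy(scores, targets, a, b)
        step, improved = 1.0, False
        while step > 1e-10:
            if cross_entropy(scores, targets, a + step * d_a, b + step * d_b) < current:
                improved = True
                break
            step /= 2.0
        if not improved:
            break
        a += step * d_a
        b += step * d_b
    return a, b


# Hypothetical decision values from an uncalibrated SVM and their true labels.
toy_scores = [-2.5, -1.8, -1.2, -0.6, -0.3, 0.1, 0.4, 0.9, 1.5, 2.2]
toy_labels = [-1, -1, -1, -1, 1, -1, 1, 1, 1, 1]
a_fit, b_fit = fit_platt(toy_scores, toy_labels)
calibrated = [sigmoid_prob(f, a_fit, b_fit) for f in toy_scores]
```

The fitted slope `a_fit` is negative in this parametrisation, so larger decision values map to larger calibrated probabilities. In practice the sigmoid would be fitted on held-out decision values rather than on the training scores, so that the calibration itself is not biased.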
Acknowledgments
The author acknowledges the grant programme “Support for Science and Research in the Moravia-Silesia Region 2017” (RRC/10/2017), financed from the budget of the Moravian-Silesian Region, and the SGS grant No. SP2020/84, VSB - Technical University of Ostrava. This result was also produced with the support of the long-term conceptual development of the research organization of the Institute of Geonics of the Czech Academy of Sciences, RVO: 68145535. The author would also like to thank a reviewer for the constructive feedback.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Pecha, M. (2022). Balancing Predictive Relevance of Ligand Biochemical Activities. In: Atanassov, K.T., et al. Uncertainty and Imprecision in Decision Making and Decision Support: New Advances, Challenges, and Perspectives. IWIFSGN BOS/SOR 2020 2020. Lecture Notes in Networks and Systems, vol 338. Springer, Cham. https://doi.org/10.1007/978-3-030-95929-6_26
DOI: https://doi.org/10.1007/978-3-030-95929-6_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95928-9
Online ISBN: 978-3-030-95929-6
eBook Packages: Intelligent Technologies and Robotics (R0)