Issue 2, 2024, Issue in Progress

Augmenting a training dataset of the generative diffusion model for molecular docking with artificial binding pockets

Abstract

This study introduces PocketCFDM, a generative diffusion model aimed at improving the prediction of small-molecule poses in protein binding pockets. The model uses a novel data augmentation technique: the creation of numerous artificial binding pockets that mimic the statistical patterns of non-bond interactions found in actual protein–ligand complexes. An algorithmic method was developed to assess these interaction patterns and replicate them in artificial binding pockets built around small-molecule conformers. Integrating the artificial binding pockets into the training process significantly enhanced the model's performance. Notably, PocketCFDM surpassed DiffDock in terms of the numbers of non-bond interactions and steric clashes, as well as inference speed. Future developments and optimizations of the model are discussed. The inference code and final model weights of PocketCFDM are publicly available in the GitHub repository: https://github.com/vtarasv/pocket-cfdm.git.

Graphical abstract: Augmenting a training dataset of the generative diffusion model for molecular docking with artificial binding pockets

Article information

Article type: Paper
Submitted: 28 Nov 2023
Accepted: 21 Dec 2023
First published: 03 Jan 2024
This article is Open Access
Creative Commons BY-NC license

RSC Adv., 2024, 14, 1341-1353

T. Voitsitskyi, V. Bdzhola, R. Stratiichuk, I. Koleiev, Z. Ostrovsky, V. Vozniak, I. Khropachov, P. Henitsoi, L. Popryho, R. Zhytar, S. Yesylevskyy, A. Nafiiev and S. Starosyla, RSC Adv., 2024, 14, 1341, DOI: 10.1039/D3RA08147H

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
