
Recently, we released version 2.0 of the de novo design platform REINVENT. This improved and extended iteration supports far more features and scoring function components, allowing bespoke protocols that maximize impact in small-molecule drug discovery projects. A major obstacle for generative models is producing active compounds, and predictive (QSAR) models have been applied to enrich for target activity. However, QSAR models are inherently limited by their applicability domains. To overcome these limitations, we introduce a structure-based scoring component for REINVENT. DockStream is a flexible, stand-alone molecular docking wrapper that provides access to a collection of ligand embedders and docking backends. Using the benchmarking and analysis workflow provided in DockStream, the execution and subsequent analysis of a variety of docking configurations can be automated. Docking algorithms vary greatly in performance depending on the target, and the benchmarking and analysis workflow provides a streamlined way to identify productive docking configurations. Using public data, we show that an informative docking configuration can guide the REINVENT agent towards improved docking scores. With docking activated, REINVENT is able to retain key interactions in the binding site, discard molecules that do not fit the binding cavity, harness unused (sub-)pockets, and improve overall performance in the scaffold-hopping scenario.

Machine learning has emerged as a versatile tool with the potential to accelerate drug discovery. One of the quintessential problems is de novo drug design, which involves finding promising candidate molecules that satisfy a multi-parameter optimization (MPO) objective. The major obstacle is the sheer number of possible molecules, estimated to be on the order of 10^60, which effectively prevents a brute-force search of chemical space. Recently, generative models have been proposed to sample chemical space beyond what is covered by established datasets, making it possible to generate novel compounds. Neural network architectures including recurrent neural networks (RNNs), variational autoencoders (VAEs), generative adversarial networks (GANs), and graph neural networks (GNNs) have demonstrated success in using input data represented as SMILES or molecular graphs to generate promising chemical ideas. Moreover, reinforcement learning (RL) has been applied in conjunction with generative models to drive an iterative design process in which an agent (a model) learns to generate compounds that achieve increasing scores. Over time, RL encourages the agent to make decisions that maximize a reward function, which can be tailored to optimize drug-like properties.
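As a rough illustration of how a docking score could enter a multi-component reward of the kind discussed here, the sketch below maps a raw docking score onto the interval (0, 1) and combines it with a second property score. It is a minimal sketch under stated assumptions, not the REINVENT or DockStream implementation: the function names, the logistic transformation, the weighted geometric mean, and the score bounds of -12 and -6 are all hypothetical choices made for illustration.

```python
import math


def reverse_sigmoid(raw, low, high, steepness=10.0):
    """Map a raw score to (0, 1) so that lower (more negative) values score higher.

    `low` and `high` are assumed bounds of the useful score range; the
    logistic form and default steepness are illustrative choices.
    """
    midpoint = (low + high) / 2.0
    scaled = steepness * (raw - midpoint) / (high - low)
    return 1.0 / (1.0 + math.exp(scaled))


def weighted_geometric_mean(values, weights):
    """Combine component scores in [0, 1]; any zero component zeroes the total."""
    if any(v <= 0.0 for v in values):
        return 0.0
    total_weight = sum(weights)
    log_sum = sum(w * math.log(v) for v, w in zip(values, weights))
    return math.exp(log_sum / total_weight)


def total_reward(docking_score, property_score, docking_weight=2.0, property_weight=1.0):
    """Hypothetical two-component reward: transformed docking score plus one property score."""
    docking_component = reverse_sigmoid(docking_score, low=-12.0, high=-6.0)
    return weighted_geometric_mean(
        [docking_component, property_score],
        [docking_weight, property_weight],
    )
```

Calling `total_reward(-11.0, 0.8)` returns a higher value than `total_reward(-7.0, 0.8)`, so an agent maximizing this reward would be pushed towards more favorable docking scores while still being penalized for a poor property score. A geometric rather than arithmetic mean is used here so that no single strong component can compensate for a component that scores zero.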
