Virtual Screening for R-groups, Including Predicted pIC50 Contributions, within Large Structural Databases, using Topomer CoMFA

Multiple R-groups (monovalent fragments) are implicitly accessible within most of the molecular structures that populate large structural databases. R-group searching would desirably consider pIC₅₀ contribution forecasts as well as ligand similarities or docking scores. However, R-group searching, with or without pIC₅₀ forecasts, is currently not practical. The most prevalent and reliable source of pIC₅₀ predictions, existing 3D-QSAR approaches, is also difficult and somewhat subjective. Yet in 25 of 25 trials on data sets on which a field-based 3D-QSAR treatment had already succeeded, substitution of objective (canonically generated) topomer poses for the original structure-guided manual alignments produced acceptable 3D-QSAR models, on average having almost equivalent statistical quality to the published models, and with negligible effort. Their overall pIC₅₀ prediction error is 0.805, calculated as the average, over these 25 topomer CoMFA models, of the standard deviations of pIC₅₀ predictions derived from the 1109 possible “leave-out-one-R-group” (LOORG) pIC₅₀ contributions. (This novel LOORG protocol provides a more realistic and stringent test of prediction accuracy than the customary “leave-out-one-compound” (LOO) approach.) The associated average predictive r² of 0.495 indicates a pIC₅₀ prediction accuracy roughly halfway between perfect and useless. To assess the ability of topomer CoMFA-based virtual screening to identify “highly active” R-groups, a receiver operating characteristic (ROC) curve approach was adopted. Using, as the binary criterion for a “highly active” R-group, a predicted pIC₅₀ greater than the top 25% of the observed pIC₅₀ range, the ROC area averaged across the 25 topomer CoMFA models is 0.729. Conventionally interpreted, the odds that a “highly active” R-group will indeed confer such a high pIC₅₀ are 0.729/(1 − 0.729), or almost 3 to 1.
To confirm that virtual screening within large collections of realized structures would provide a useful quantity and variety of R-group suggestions, combining shape similarity with the “highly active” pIC₅₀, the 50 searches provided by these 25 models were applied to 2.2 million structurally distinct R-group candidates among 2.0 million structures within a ZINC database, identifying an average of 5705 R-groups per search, with the highest predicted pIC₅₀ combination averaging 1.6 log units greater than the highest reported pIC₅₀s.
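
The summary statistics quoted above (predictive r², ROC area, and the odds derived from it) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the predictive r² is assumed to take the form 1 − PRESS/SS with SS computed about the mean of the supplied observed values, and the “highly active” cutoff is assumed to be the point 25% below the maximum of the observed pIC₅₀ range. The pIC₅₀ values in the usage note are made-up toy numbers, not data from the study.

```python
def predictive_r2(observed, predicted):
    """Assumed form: 1 - PRESS/SS, with SS about the mean of `observed`."""
    obs_mean = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - press / ss


def highly_active_threshold(observed, top_fraction=0.25):
    """Assumed cutoff: lower edge of the top `top_fraction` of the observed range."""
    lo, hi = min(observed), max(observed)
    return hi - top_fraction * (hi - lo)


def roc_area(labels, scores):
    """ROC area as the probability that a random positive outscores a
    random negative, counting ties as one half (Mann-Whitney form)."""
    pos = [s for is_pos, s in zip(labels, scores) if is_pos]
    neg = [s for is_pos, s in zip(labels, scores) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def odds_from_roc_area(area):
    """Odds interpretation used in the abstract: area / (1 - area)."""
    if area >= 1.0:
        return float("inf")
    return area / (1.0 - area)


# Toy values for illustration only (not from the 25 data sets).
observed = [5.0, 6.0, 7.0, 9.0]
predicted = [5.5, 5.8, 7.4, 8.1]

cutoff = highly_active_threshold(observed)
labels = [o > cutoff for o in observed]
toy_r2 = predictive_r2(observed, predicted)
toy_area = roc_area(labels, predicted)
reported_odds = odds_from_roc_area(0.729)
```

With the reported ROC area of 0.729, `odds_from_roc_area` gives about 2.69, which matches the "almost 3 to 1" reading in the abstract.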

