Generating putative JNK3 allosteric binders with Boltz-2

TL;DR: In this blog, we test the Boltz-2 affinity prediction module to guide de novo molecule design of putative JNK3 ligands, not only for orthosteric ATP-competitive binding, but also for possible allosteric sites... data and analysis available on GitHub.

Boltz-2 (Passaro et al. 2025) has made huge improvements to open-source co-folding, adding binding affinity prediction that is reportedly 'approaching the accuracy of FEP-based methods'. Although its real-world accuracy and domain of applicability are yet to be determined by the community (for example, see blogs by Pat Walters, deepmirror, and Serhii Vakal), my collaborators at Johnson and Johnson and I wanted to test how this new functionality behaves as a desirability function to guide generative models. This was exemplified in the paper itself, where the authors used the affinity prediction to guide SynFlowNet (Cretu et al. 2024) to generate putative TYK2 binders.

Optimizing JNK3 affinity with Boltz-2

First, I implemented Boltz-2 as a scoring function within MolScore (Thomas et al. 2024) to enable reproducible integration across generative models. Then we trained ACEGEN (a chemical language model with RL) and SynFlowNet (a synthesis-aware GFlowNet) to optimize this predicted binding affinity for JNK3 over a budget of 10,000 evaluated molecules. We find this budget practically reasonable because the affinity prediction module took ~30 s per molecule, so it is not particularly fast (yet). In total, each run took ~48 hrs on 3 consumer-grade GPUs.
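To make the idea of a "desirability function" concrete, here is a minimal sketch of how a slow affinity predictor could be wrapped as a reward for a generative model. This is not the MolScore implementation: `predict_affinity` is a hypothetical stand-in for the Boltz-2 call, the `low` and `high` bounds are arbitrary assumptions, and I assume the predictor returns a value where lower means stronger binding. Caching matters at ~30 s per molecule, since RL-based generators often resample the same SMILES.

```python
from typing import Callable, Dict, List


def make_affinity_scorer(
    predict_affinity: Callable[[str], float],  # hypothetical stand-in for the Boltz-2 call
    low: float = -3.0,   # assumed value mapped to reward 1.0 (strongest binding)
    high: float = 1.0,   # assumed value mapped to reward 0.0 (weakest binding)
) -> Callable[[List[str]], List[float]]:
    """Wrap a slow per-molecule predictor as a cached, batched reward in [0, 1]."""
    cache: Dict[str, float] = {}

    def score(smiles_batch: List[str]) -> List[float]:
        rewards = []
        for smi in smiles_batch:
            # Only pay the prediction cost once per unique SMILES string.
            if smi not in cache:
                cache[smi] = predict_affinity(smi)
            raw = cache[smi]
            # Linear transform: lower predicted value -> higher reward, clipped to [0, 1].
            reward = (high - raw) / (high - low)
            rewards.append(min(1.0, max(0.0, reward)))
        return rewards

    return score
```

A generative model would then call `score(batch)` once per batch and use the returned values as its reward signal.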

We can see below (b) that ACEGEN improves the predicted binding affinity quite efficiently, especially when compared to SynFlowNet, which doesn't seem to learn at all. This is not especially surprising considering that in the Boltz-2 paper they trained the model over a budget of 400,000 molecules (40x more than here). Note that I used the default hyperparameters for SynFlowNet, and more efficient ones may exist. The major benefit of SynFlowNet is that it is synthesis-constrained. However, measuring synthetic accessibility (d) doesn't show a significant benefit via this proxy. In fact, as measured by SAScore and QED, we are in better chemical space with ACEGEN (usual caveats to these metrics apply). So do we get close to JNK3 ligand space?

The answer to this is particularly interesting: no, not really (e). In the top 500 compounds, two known JNK3 ligand scaffolds are recovered, but the majority of scaffolds have ECFP4 Tanimoto similarity values < 0.5 (quite low for scaffolds). Something similar was also observed in the Boltz-2 paper: "[we] find that the generated compounds do not exhibit significant similarity to public TYK2 binders". Are we probing novel chemical space? Or are we not learning the important chemical features for target-specific binding?
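For readers less familiar with the similarity measure, here is a minimal sketch of the Tanimoto calculation. It assumes fingerprints have already been computed elsewhere (for example, ECFP4 on Murcko scaffolds with a cheminformatics toolkit) and are represented as sets of on-bit indices. Taking the nearest-neighbour similarity to the known ligand scaffolds is my assumption for illustration, and the function names are hypothetical.

```python
from typing import List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_reference_similarity(query: Set[int], references: List[Set[int]]) -> float:
    """Highest Tanimoto similarity between one query and any reference fingerprint."""
    return max((tanimoto(query, ref) for ref in references), default=0.0)


def fraction_below(queries: List[Set[int]], references: List[Set[int]],
                   threshold: float = 0.5) -> float:
    """Fraction of queries whose nearest reference is below the similarity threshold."""
    sims = [nearest_reference_similarity(q, references) for q in queries]
    return sum(s < threshold for s in sims) / len(sims)
```

Here `fraction_below(generated, known, 0.5)` would give the share of generated scaffolds with no known ligand scaffold at or above 0.5 similarity.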

To get a better idea of target specificity compared to known JNK3 ligands, we predicted the probability of molecules binding to ~430 different kinases at a concentration of 10 µM. Plotted onto the kinome map below, we can see that the ACEGEN de novo compounds that successfully optimize the binding affinity are also commonly predicted active across the kinome, far more promiscuously than known JNK3 ligands. Note that a small orange circle means that >20% of molecules are predicted active against that target, with the thresholds then rising through >40% and >60% to >80% for large red circles. To me, it appears that Boltz-2 leads to enrichment of broad, pan-kinase-like chemotypes and does not enrich protein-ligand specific features. This would explain why we do not observe high similarity to known JNK3 ligands, nor to TYK2 ligands in the Boltz-2 paper.
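The marker sizes on the kinome map boil down to a simple calculation: for each kinase, the fraction of molecules predicted active, binned at the 20/40/60/80% thresholds. The sketch below assumes the per-kinase predictions have already been thresholded into active/inactive calls. The data layout and function names are hypothetical and not taken from our analysis code.

```python
from typing import Dict, List


def active_fraction_per_kinase(predictions: Dict[str, List[bool]]) -> Dict[str, float]:
    """Map each kinase to the fraction of molecules predicted active against it."""
    return {
        kinase: sum(calls) / len(calls)
        for kinase, calls in predictions.items()
        if calls  # skip kinases with no predictions
    }


def marker_level(fraction: float) -> int:
    """Number of thresholds exceeded: 0 = no marker, 1 = >20%, ..., 4 = >80%."""
    return sum(fraction > t for t in (0.2, 0.4, 0.6, 0.8))


def marker_levels(predictions: Dict[str, List[bool]]) -> Dict[str, int]:
    """Marker level per kinase, ready to be mapped to circle size and color."""
    fractions = active_fraction_per_kinase(predictions)
    return {kinase: marker_level(f) for kinase, f in fractions.items()}
```

A promiscuous set of molecules shows up as many kinases with high marker levels, whereas a selective set concentrates high levels on a handful of targets.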

Optimizing JNK3 allosteric affinity with Boltz-2

The ATP binding site is highly conserved across kinases, which makes it a central challenge in identifying target-selective ligands. So what happens if we force co-folding elsewhere by co-folding de novo ligands with ATP, which should occupy the orthosteric ATP-binding site? Note there are no publicly available structures of allosteric ligands bound to JNK3, and hence no examples in the Boltz-2 training data. This time we are only using ACEGEN, as SynFlowNet fails to learn within the budget.
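As a rough illustration of what "co-folding with ATP" could look like in practice, the sketch below writes a Boltz-style YAML input with the protein, ATP as a second ligand, and the de novo ligand flagged as the affinity binder. The layout follows my reading of the Boltz input schema and should be checked against the Boltz documentation. The chain IDs, the helper name, and the use of the CCD code `ATP` are assumptions, and the protein sequence must be supplied by the user.

```python
def boltz_input_yaml(protein_seq: str, ligand_smiles: str) -> str:
    """Build YAML text for a protein + ATP + de novo ligand co-folding job.

    Assumed layout: chain A is the protein, chain B is ATP (by CCD code),
    and chain C is the generated ligand whose affinity is predicted.
    """
    lines = [
        "version: 1",
        "sequences:",
        "  - protein:",
        "      id: A",
        f"      sequence: {protein_seq}",
        "  - ligand:",
        "      id: B",
        "      ccd: ATP",  # intended to occupy the orthosteric site
        "  - ligand:",
        "      id: C",
        f"      smiles: '{ligand_smiles}'",  # single quotes keep backslashes literal
        "properties:",
        "  - affinity:",
        "      binder: C",  # affinity is requested for the de novo ligand only
    ]
    return "\n".join(lines) + "\n"
```

Writing one such file per generated molecule keeps ATP fixed as a competitor while only the de novo ligand changes between predictions.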

First, optimizing this predicted allosteric binding affinity (b) is slightly harder and takes slightly longer. Besides that, the trends we see are similar. In this case, we should expect that de novo compounds are not similar to known JNK3 ligands (e), as the majority of known ligands should be ATP-competitive.

Where do these molecules bind? Well, it seems Boltz-2 narrows down on the DEF allosteric site, which is a site present in some MAP kinases including the JNK family. In fact, JNK1 does have one structure with an allosteric ligand bound to the DEF region. This is slightly more impressive considering the relatively small amount of training data available to Boltz-2 for this site. Note the best molecule shown here is likely unsynthesizable, but the majority of other molecules generated do seem reasonable.

Does this fix our predicted kinase promiscuity problem? Well, it appears so, but in reality we don't really know, as few, if any, kinases will have representative allosteric ligands in the training data. It also relies on the assumptions that this allosteric site exists for JNK3 (likely), that it is ligandable (possible), and that Boltz-2 does lead to enrichment of pocket-specific ligands.

Absolute binding free energy

Without experimental validation, the next best validation is running absolute binding free energy (ABFE) simulations, which are usually the most accurate form of binding affinity prediction. Therefore, we can use ABFE as a pseudo-validation for the putative allosteric ligands. As similarly shown in the Boltz-2 paper, we do see correlation between Boltz-2 affinity and ABFE affinity. This correlation was observed on randomly selected compounds at different Boltz-2 affinity ranges (and within a fixed LogP and molecular weight range). Overall, this provides an additional measure of confidence in this approach to finding novel allosteric ligands.
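For anyone wanting to run a similar comparison on their own compounds, here is a minimal dependency-free sketch of a rank correlation between two sets of predictions. Spearman correlation is my choice for illustration, since it only asks whether the two methods rank compounds consistently. It is not necessarily the statistic used in our analysis, and the function names are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = average_rank
        i = j + 1
    return out


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient (undefined if either input is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))
```

Calling `spearman(boltz_values, abfe_values)` on paired per-compound values returns a number between -1 and 1, where values near 1 mean the two methods rank the compounds in the same order.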

Final comments

While there's still a journey ahead to fully understand how this approach responds to specific model biases, and possibly to refine it by integrating physics-based corrections, the horizon is coming into view. Are we edging towards a future of "generating the perfect molecule" within a practical time frame? Time will tell (... along with the rigorous evaluation and experimental validation that come with time). This pipeline may not be perfect, but in some cases it is starting to look useful enough to complement workflows within industry.

Authors

Dr. Morgan Thomas - Postdoctoral researcher working in generative molecular design

Dr. Jose C Gómez-Tamayo - Principal scientist at Johnson & Johnson Innovative Medicine leading co-folding integration for small molecules

References

Passaro S, Corso G, Wohlwend J, Reveiz M, Thaler S, Ram Somnath V, Getz N, Portnoi T, Roy J, Stark H, Kwabi-Addo D. Boltz-2: Towards Accurate and Efficient Binding Affinity Prediction. bioRxiv. 2025:2025-06.
Cretu M, Harris C, Igashov I, Schneuing A, Segler M, Correia B, Roy J, Bengio E, Liò P. SynFlowNet: Design of diverse and novel molecules with synthesis constraints. arXiv preprint arXiv:2405.01155. 2024 May 2.
Thomas M, O’Boyle NM, Bender A, De Graaf C. MolScore: a scoring, evaluation and benchmarking framework for generative models in de novo drug design. Journal of Cheminformatics. 2024 May 30;16(1):64.