Jun 09 2023

RNA-protein complex refinement using AI modeling and docking

This article explains a workflow for refining protein-RNA complexes by combining AI-based structural models with flexible docking and enhanced sampling.

Figure: Association complex of Musashi RBD1 and RBD2 with a target RNA

Modeling protein-RNA complexes remains difficult even when reasonably good structures for the individual components are available. The hardest part is often not generating a starting model, but refining the interface in a way that captures flexible RNA segments and produces a biologically plausible binding geometry.

This study addresses that problem with a two-step workflow. First, AlphaFold2 was used to generate a structural model for the RNA-binding domains of the human Musashi-1 (MSI1) protein. Second, the resulting model was refined in the presence of RNA using flexible docking based on parallel cascade selection molecular dynamics (PaCS-MD). The point of the method is not simply to place RNA near a protein surface, but to sample interface rearrangements that matter for complex formation.
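To make the sampling idea concrete, here is a minimal toy sketch. It is not the protocol used in the study. It assumes only the general PaCS-MD pattern: run several short independent simulations, rank all snapshots by a progress measure, and restart the next cycle from the top-ranked ones. A one-dimensional random walk stands in for molecular dynamics, and the ranking measure is a hypothetical protein-RNA distance. All function names, parameters, and units are illustrative assumptions.

```python
import random


def run_short_md(start, n_steps, rng, step_size=0.5):
    """Toy stand-in for one short MD run (assumption: unbiased 1-D random walk).

    Returns the snapshots as hypothetical protein-RNA distances.
    The starting snapshot is kept so it can be re-selected later.
    """
    x = start
    snapshots = [x]
    for _ in range(n_steps):
        # Distances cannot be negative, so clamp at zero.
        x = max(0.0, x + rng.gauss(0.0, step_size))
        snapshots.append(x)
    return snapshots


def pacs_md(initial_distance, n_replicas=10, n_steps=50, n_cycles=20, seed=0):
    """Cycle of short runs plus selection, in the spirit of PaCS-MD.

    Each cycle: run one short simulation per starting snapshot, pool every
    snapshot, rank by distance, and keep the closest ones as the next starts.
    Returns the best (smallest) distance found after each cycle.
    """
    rng = random.Random(seed)
    starts = [initial_distance] * n_replicas
    best_per_cycle = []
    for _ in range(n_cycles):
        pool = []
        for start in starts:
            pool.extend(run_short_md(start, n_steps, rng))
        pool.sort()                    # smallest distance ranks first
        starts = pool[:n_replicas]     # top-ranked snapshots seed the next cycle
        best_per_cycle.append(starts[0])
    return best_per_cycle


if __name__ == "__main__":
    for cycle, best in enumerate(pacs_md(30.0), start=1):
        print(f"cycle {cycle:2d}: best distance = {best:.2f}")
```

Because each run keeps its starting snapshot in the pool, the best distance cannot get worse from one cycle to the next. In a real setup the short runs would be MD trajectories of the protein and RNA, and the ranking measure would be chosen for the system at hand.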

Musashi-1 is a useful test case because its RNA recognition has been studied experimentally, which means the resulting models can be checked against known interaction patterns. In the refined complexes, the analysis recovered a core set of residues and nucleotides that are consistent with previous work on MSI1-RNA recognition. That matters more than raw structural novelty. A refinement method is only useful if it recovers contacts that make biochemical sense.
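One simple way to express this kind of check is as a contact-recovery score: the fraction of reference interface residues that also appear at the interface of a refined model. The sketch below is an assumption about how such a comparison could be set up, not the analysis performed in the paper. The 4.0 Å cutoff, the residue labels, and both function names are hypothetical.

```python
import math


def interface_residues(protein_atoms, rna_atoms, cutoff=4.0):
    """Return labels of protein residues with any atom within `cutoff` of RNA.

    protein_atoms: list of (residue_label, (x, y, z)) tuples
    rna_atoms:     list of (x, y, z) tuples
    The cutoff (in the same length unit as the coordinates) is an assumption.
    """
    contacts = set()
    for label, position in protein_atoms:
        if any(math.dist(position, rna) <= cutoff for rna in rna_atoms):
            contacts.add(label)
    return contacts


def contact_recovery(predicted, reference):
    """Fraction of reference interface residues recovered by the model."""
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(predicted & reference) / len(reference)


if __name__ == "__main__":
    # Made-up coordinates and placeholder labels, for illustration only.
    protein = [
        ("RES_A", (0.0, 0.0, 0.0)),
        ("RES_B", (10.0, 0.0, 0.0)),
        ("RES_C", (0.0, 3.0, 0.0)),
    ]
    rna = [(0.0, 0.0, 3.5), (0.0, 3.0, 2.0)]
    predicted = interface_residues(protein, rna)
    reference = {"RES_A", "RES_B"}  # hypothetical experimentally known contacts
    print(sorted(predicted), contact_recovery(predicted, reference))
```

A score near 1 means the refined interface reproduces most of the known contacts. The score says nothing about extra contacts the model predicts, so in practice it would be read alongside the full contact list.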

Compared with a more standard template-based workflow built around Phyre2, the PaCS-MD approach produced higher-quality association complexes in this system, judged by how well the key RNA-binding residues matched those in experimental complexes. The main reason is that enhanced sampling gives flexible RNA regions more room to explore realistic conformations during docking, instead of forcing the final model to depend too heavily on manual assembly or rigid starting assumptions.

The method still has clear limits. It depends on having useful initial structural information, and the quality of the final complex remains tied to the quality of both the starting model and the sampling protocol. This is not a general solution to protein-RNA structure prediction from sequence alone. It is a refinement strategy that becomes valuable when there is already enough structural context to make flexible docking meaningful.

That makes the study best understood as a methods contribution. It shows that AI-derived protein models can be combined with enhanced-sampling docking to improve protein-RNA complex refinement, at least for systems like MSI1 where independent evidence exists for the binding interface. For researchers working on RNA-binding proteins, this is a more realistic and useful claim than broad promises about drug discovery.

I place that point in a wider modeling context in "What AI can and cannot do for RNA structure and RNA-protein modeling".

If your lab or company needs an external review of an RNA-protein modeling workflow, a structure-guided design problem, or a docking strategy, I also offer focused advisory support through my services page.

Abstract

An efficient structural refinement technique for protein-RNA complexes is proposed based on a combination of AI-based modeling and flexible docking. Specifically, an enhanced sampling method called parallel cascade selection molecular dynamics (PaCS-MD) was extended to include flexible docking to construct protein-RNA complexes from those obtained by AI-based modeling (AlphaFold2). With the present technique, the conformational sampling of flexible RNA regions is accelerated by PaCS-MD, enabling one to construct plausible models for protein-RNA complexes. For demonstration, PaCS-MD constructed several protein-RNA complexes of the RNA-binding Musashi-1 (MSI1) family of proteins, which were validated by comparing a group of crucial residues for RNA-binding with experimental complexes. Our analyses suggest that PaCS-MD improves the quality of complex modeling compared to the standard protocol based on template-based modeling (Phyre2). Furthermore, PaCS-MD could also be a beneficial technique for constructing complexes of non-native RNA-binding to proteins.

Citation

A Structural Refinement Technique for Protein-RNA Complexes Using a Combination of AI-based Modeling and Flexible Docking: A Study of Musashi-1 Protein
Nitchakan Darai, Kowit Hengphasatporn, Peter Wolschann, Michael T. Wolfinger, Yasuteru Shigeta, Thanyada Rungrotmongkol, Ryuhei Harada
Bull. Chem. Soc. Jpn. (2023)
