Feb 01 2025

What AI can and cannot do for RNA structure and RNA-protein modeling

AI affects RNA structure and RNA-protein modeling in very different ways depending on whether the task concerns ranking, geometry generation, or mechanistic inference, and whether the output remains anchored in physics and experiment.

[Image: RNA-protein complex embedded in a network]

AI now appears in several distinct parts of RNA biology, including sequence annotation, structure scoring, kinetics approximation, and the generation of candidate models for RNA-protein complexes. Treating "AI for RNA" as a single activity obscures the fact that these tasks operate at different levels of inference.

One distinction that matters is between output generation and interpretation. A model may produce a plausible structure, a ranking, or a strong benchmark score without resolving the biological uncertainty that motivated the analysis. "Caveats to deep learning approaches to RNA secondary structure prediction" makes that point clearly. Good performance on familiar datasets does not guarantee generalization to new sequence families, new experimental conditions, or regulatory RNAs that occupy multiple states.
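One way to probe the generalization point above is to hold out whole sequence families rather than individual sequences when evaluating a predictor. The sketch below is only an illustration under stated assumptions: the function name `family_holdout_split`, the `(sequence_id, family)` record format, and the example identifiers are all hypothetical and not taken from any of the cited work.

```python
import random
from collections import defaultdict


def family_holdout_split(records, test_fraction=0.2, seed=0):
    """Split (sequence_id, family) records so that no family is shared
    between the training set and the test set.

    Hypothetical helper for illustration; real benchmarks may define
    families by Rfam membership, clustering, or structural similarity.
    """
    by_family = defaultdict(list)
    for seq_id, family in records:
        by_family[family].append(seq_id)

    # Shuffle family labels (not sequences) with a fixed seed.
    families = sorted(by_family)
    rng = random.Random(seed)
    rng.shuffle(families)

    # Hold out at least one whole family for testing.
    n_test = max(1, round(len(families) * test_fraction))
    test_families = set(families[:n_test])

    train = [s for f in families if f not in test_families for s in by_family[f]]
    test = [s for f in families if f in test_families for s in by_family[f]]
    return train, test, test_families


# Made-up identifiers and family labels, purely for demonstration.
example_records = [
    ("seq1", "tRNA"), ("seq2", "tRNA"),
    ("seq3", "riboswitch"), ("seq4", "riboswitch"),
    ("seq5", "5S_rRNA"), ("seq6", "SRP_RNA"),
]
train_ids, test_ids, held_out = family_holdout_split(example_records, test_fraction=0.25)
print("held-out families:", sorted(held_out))
print("train:", train_ids)
print("test:", test_ids)
```

A random per-sequence split would typically leave members of the same family on both sides, which is one way a benchmark score can look strong without saying much about unseen families.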

Another distinction is between coordinate generation and mechanistic explanation. In RNA-protein systems, AI-derived models can place domains sensibly, suggest contact regions, and reduce the search space. They do not necessarily settle whether the proposed geometry explains specificity, accessibility, competition, or function. "A structural refinement technique for protein-RNA complexes using a combination of AI-based modeling and flexible docking" is relevant here because refinement, docking, and consistency checks applied after the initial AI-derived model can still change the conclusion.

The Musashi line of work makes this especially clear. "Theoretical studies on RNA recognition by Musashi1 RNA-binding protein" asks which motifs bind better and why. "From Structure to Function: Computational Insights into Musashi-RNA Complexes" then steps back and asks how those structural observations connect to a broader functional picture. AI enters that workflow at the model-generation stage, not at the level of final interpretation.

Kinetics offers a similar lesson. "KinPFN: Bayesian Approximation of RNA Folding Kinetics" is interesting precisely because it does not pretend to solve folding physics from scratch. It uses AI as an approximation layer on top of a physically meaningful problem.

The most convincing use cases are therefore those in which AI addresses ranking, initialization, or approximation, while the mechanistic interpretation remains anchored in physics, experiment, or comparative evidence. Confidence scores alone are much less informative than that broader context.

For researchers and teams, the practical issue is usually which part of the inference has actually been automated, and which part still depends on stronger stress testing or on additional evidence. That judgment becomes especially difficult when the output looks polished and the project is under time pressure. In that situation, a careful review of the modeling assumptions and likely failure modes is usually more valuable than enthusiasm. My services page describes the formats I use for that kind of focused review and advisory support.