Predicting RNA structures from sequence and probing data
This review explains how classical thermodynamic RNA folding models can be improved with chemical probing data, and why that combination remains one of the most reliable routes to biologically useful structure prediction.
RNA secondary structure prediction is sometimes presented as if there were a clean historical break between "old biophysics" and "new AI". That is not how the field actually developed. Long before the current machine-learning wave, RNA bioinformatics had already built a sophisticated toolkit around thermodynamic folding, ensemble analysis, comparative evidence, and experimental structure probing. This review comes from that earlier period, and that is exactly why it remains useful. It explains the core logic of the field without mistaking benchmark performance for mechanistic understanding.
The starting point is the classical thermodynamic view of RNA folding. For pseudoknot-free secondary structures, dynamic programming algorithms compute minimum-free-energy structures, partition functions over the full structural ensemble, and base-pairing probabilities in time that grows polynomially with sequence length, all under a well-defined energy model. These methods remain powerful because they do more than output a single structure. They give an explicit physical interpretation of structural alternatives, uncertainty, and energetic tradeoffs. The review therefore emphasizes that RNA structure prediction should not be reduced to "find the one correct fold". For many RNAs, the ensemble itself is the relevant biological object.
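To make the ensemble idea concrete, here is a minimal sketch of the two dynamic programming recursions side by side: one for the minimum free energy and one for the partition function. It is not the energy model used by real folding tools. It assumes a hypothetical toy model in which every allowed base pair contributes a fixed energy (PAIR_ENERGY) and a hairpin must enclose at least MIN_LOOP unpaired bases. The function name fold_toy and all parameter values are my own choices for illustration.

```python
import math

# Hypothetical toy parameters, NOT the nearest-neighbor model of real tools.
PAIR_ENERGY = -1.0        # kcal/mol credited to every base pair (assumption)
MIN_LOOP = 3              # minimum number of unpaired bases inside a hairpin
RT = 0.0019872 * 310.15   # gas constant in kcal/(mol*K) times 37 degrees C in K

ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"),
                 ("C", "G"), ("G", "U"), ("U", "G")}


def fold_toy(seq):
    """Return (mfe, partition_function) for seq under the toy model."""
    n = len(seq)
    if n == 0:
        return 0.0, 1.0
    weight = math.exp(-PAIR_ENERGY / RT)  # Boltzmann factor of a single pair

    # mfe[i][j] and z[i][j] describe the subsequence seq[i..j] (inclusive).
    # Cells with i > j are never written and keep the values for an empty
    # interval: energy 0 and exactly one (empty) structure.
    mfe = [[0.0] * (n + 1) for _ in range(n + 1)]
    z = [[1.0] * (n + 1) for _ in range(n + 1)]

    for i in range(n - 1, -1, -1):
        for j in range(i, n):
            # Case 1: position i is unpaired.
            best = mfe[i + 1][j]
            total = z[i + 1][j]
            # Case 2: position i pairs with some k, splitting the interval
            # into the part enclosed by (i, k) and the part to its right.
            for k in range(i + MIN_LOOP + 1, j + 1):
                if (seq[i], seq[k]) in ALLOWED_PAIRS:
                    inside, outside = (i + 1, k - 1), (k + 1, j)
                    best = min(best, PAIR_ENERGY
                               + mfe[inside[0]][inside[1]]
                               + mfe[outside[0]][outside[1]])
                    total += (weight
                              * z[inside[0]][inside[1]]
                              * z[outside[0]][outside[1]])
            mfe[i][j] = best
            z[i][j] = total
    return mfe[0][n - 1], z[0][n - 1]


if __name__ == "__main__":
    energy, partition = fold_toy("GGGAAACCC")
    # Probability of any single structure that reaches the minimum energy.
    p_mfe = math.exp(-energy / RT) / partition
    print(energy, partition, p_mfe)
```

The two recursions have the same shape. Replacing min and addition with sum and multiplication turns an optimizer into an ensemble calculation, which is why the same framework can report both a single best fold and the uncertainty around it.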
At the same time, purely sequence-based thermodynamic prediction has obvious limits. Energy parameters are imperfect, tertiary interactions are usually treated only indirectly, and the energetically optimal structure is not always the biologically realized one. This becomes especially clear for regulatory RNAs, long transcripts, and systems shaped by kinetics, ligand binding, proteins, or cellular context. The review lays out these limitations clearly, which is one reason it has remained a useful reference.
The main contribution of the article is its overview of how chemical and enzymatic structure probing can be integrated with folding algorithms. Methods such as SHAPE, PARS, and related probing strategies provide nucleotide-resolution information about local flexibility or accessibility. The key computational question is how to convert those experimental readouts into something a folding algorithm can use. The review discusses approaches based on pseudo-energies and soft constraints, where probing reactivities perturb the thermodynamic model rather than replacing it outright. That design choice matters because it keeps the prediction grounded in base-pairing energetics while still letting experimental evidence pull on the result.
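As an illustration of the pseudo-energy idea, the sketch below uses one commonly cited conversion, the log-linear form of Deigan and colleagues, in which a normalized SHAPE reactivity r becomes an energy term m * ln(r + 1) + b. The slope and intercept defaults (2.6 and -0.8 kcal/mol) are the values usually quoted for that method and serve here only as illustrative assumptions; tools ship different defaults, so check the one you use. Treating negative reactivities as missing data is likewise an assumed convention, and the function names are hypothetical.

```python
import math


def shape_pseudo_energy(reactivity, m=2.6, b=-0.8):
    """Convert one normalized SHAPE reactivity into a pseudo-energy (kcal/mol).

    Assumptions: log-linear form m * ln(r + 1) + b; None or negative values
    mark missing data and contribute no pseudo-energy.
    """
    if reactivity is None or reactivity < 0:
        return 0.0
    return m * math.log(reactivity + 1.0) + b


def pseudo_energy_profile(reactivities, m=2.6, b=-0.8):
    """Apply the conversion to a whole reactivity profile."""
    return [shape_pseudo_energy(r, m, b) for r in reactivities]


if __name__ == "__main__":
    # Made-up reactivities for illustration only; -999 marks missing data.
    profile = [0.0, 0.1, 0.9, 2.5, -999]
    for r, dg in zip(profile, pseudo_energy_profile(profile)):
        print(f"reactivity {r:>7}: pseudo-energy {dg:+.2f} kcal/mol")
```

Unreactive positions receive a small bonus (negative term) and highly reactive positions a penalty (positive term). In this style of approach the term is added to the energy of nucleotides that take part in stacked base pairs, so the data bias the thermodynamic model without overriding it.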
This combination of experiment and computation was, and remains, one of the most productive ideas in RNA structure prediction. Probing data can help discriminate among near-optimal folds, recover structures that sequence-only models miss, and improve the interpretation of structural ensembles. The review does not oversell the approach, though. Experimental data are noisy, condition-dependent, and often indirect. A reactivity profile is not itself a structure. It still has to be interpreted through a model, and the quality of the result depends on both the experiment and the computational framework used to incorporate it.
That balanced perspective is what makes the article age well. It is optimistic about combining data with theory, but it does not pretend that more data automatically solve the inference problem. In that sense, the paper anticipated a lesson that still matters now. Prediction improves when models encode the right constraints and when external evidence is incorporated thoughtfully, not just when another layer of complexity is added.
The same issue comes up whenever a computational result is used to justify an experimental move. In "When to trust RNA structure prediction for experimental decisions", I take that one step further and ask what level of structural evidence is actually enough for a design choice, a mutational plan, or a mechanistic claim.
For readers new to the topic, this review is a good entry point because it connects several levels of the field at once, from classical RNA folding algorithms and ensemble thinking to experimental probing and the practical business of combining them. For readers already working in RNA biology, it remains a strong reminder that the most reliable structural insight often comes from combining complementary sources of information rather than choosing between "physics" and "data".
Two natural follow-ups are "SHAPE directed RNA folding with the ViennaRNA Package", which is the more implementation-focused companion piece describing specific SHAPE integration strategies in ViennaRNA, and "Caveats in deep learning for RNA secondary structure prediction", which picks up the story from the later AI period and shows why data-driven models still benefit from the classical structural thinking summarized here.
Citation
Ronny Lorenz, Michael T. Wolfinger, Andrea Tanzer, Ivo L. Hofacker
Predicting RNA Structures from Sequence and Probing Data
Methods 103:86–98 (2016) | doi:10.1016/j.ymeth.2016.04.004
See Also
Ronny Lorenz, Dominik Luntzer, Ivo L. Hofacker, Peter F. Stadler, Michael T. Wolfinger
SHAPE Directed RNA Folding
Bioinformatics 32:145–147 (2016) | doi:10.1093/bioinformatics/btv523