Abstract

RNA-protein interactions drive fundamental biological processes and are targets for molecular engineering, yet a quantitative and comprehensive understanding of the sequence determinants of affinity remains limited. Here we repurpose a high-throughput sequencing instrument to quantitatively measure the binding of a fluorescently labeled protein to, and its dissociation from, >10^7 RNA targets generated on a flow-cell surface by in situ transcription and intermolecular tethering of RNA to DNA. Studying the MS2 coat protein, we decompose the binding energy contributions from primary and secondary RNA structure, finding that differences in affinity are often driven by sequence-specific changes in both association and dissociation rates. By analyzing the biophysical constraints and modeling mutational paths describing the molecular evolution of MS2 from low- to high-affinity hairpins, we quantify widespread molecular epistasis and a long-hypothesized, structure-dependent preference for G:U base pairs over C:A intermediates in evolutionary trajectories. Our results suggest that quantitative analysis of RNA on a massively parallel array (RNA-MaP) provides generalizable insight into the biophysical basis and evolutionary consequences of sequence-function relationships.



  • Figure 1. A massively parallel RNA array for quantitative, high-throughput biochemistry.
  • Figure 2. A quantitative map of MS2 binding across RNA sequence variants.
  • Figure 3. Binding affinity is dependent on primary sequence and secondary RNA structure.
  • Figure 4. Sequence-specific contributions of association and dissociation rates to binding affinity.
  • Figure 5. Evolutionary landscapes are highly constrained by biophysical requirements.