RNA-Peptide Binding Analysis
From 21,000 Measurements to Millions of Predictions: AI That Learns Your RNA Binding Landscape
RNAproFold transforms microarray screening data into a predictive engine. Train once on your experimental data, then query any peptide sequence instantly. Three complementary ML architectures ensure robust predictions across diverse chemical space.
R² = 0.71
Validated on 4,200 held-out measurements
<10ms Per Prediction
Query thousands of candidates simultaneously
6,861 Unique Peptides
Trained on comprehensive microarray data
4 RNA Polymers + Hairpins
PolyA/C/G/U plus structured RNA support
The RNA-Targeting Challenge
Every month of delay in RNA therapeutic development costs pharmaceutical companies $8-15M in lost market opportunity.
Traditional peptide-RNA drug discovery faces critical bottlenecks:
1. Data Trapped in Microarrays
- Thousands of KD measurements are collected, then discarded
- A screen costs ~€20,000 yet covers <1% of chemical space
- No systematic way to extrapolate beyond tested peptides
2. Iterative Design Bottleneck
- 4-8 weeks from design to experimental validation
- Iterative cycles delay time-to-market
- Manual data analysis is error-prone
3. Polymer-Specific Blind Spots
- Can only test 10³-10⁴ peptides per screen
- Misses optimal candidates outside tested library
- Difficult to explore combinatorial space
Accelerate RNA-Targeted Drug Discovery with AI-Powered Peptide Design
Your microarray data contains patterns that generalize to millions of untested peptides. RNAproFold extracts that knowledge and makes it queryable, turning retrospective screening into prospective design.
RNAproFold uses machine learning to predict peptide-RNA binding affinities in seconds, reducing months of experimental screening to minutes of computational analysis.
AI-Driven Peptide Optimization for RNA Therapeutics
Our platform combines experimental microarray data with machine learning to predict binding affinities for any peptide-RNA pair.
- Upload raw fluorescence curves from microarray scanners
- Langmuir isotherm fitting with R² quality control (>0.6 threshold)
- Automatic outlier detection and KD extraction (0.1-1000 µg/mL range)
- Handles block-to-concentration mapping across dilution series
- Per-polymer performance profiling (polyG: R²=0.56, polyA: R²=0.32)
- One-hot encoding captures base-specific interaction patterns
- Charge-based binding hypothesis validated: +5% improvement for K/R-rich peptides
- Cross-validation quantifies overfitting (train R²=0.94 vs. test R²=0.69)
- Baseline Model: 23 biochemical features + Random Forest (300 trees)
- ESM-2 Transfer Learning: 320-dim protein embeddings from 250M sequences
- Hybrid Model: ESM-2 + hand-crafted features (343-dim space)
- Stratified train/test split preserves polymer distribution
- Query interface for single peptides or batch CSV upload
- Per-prediction confidence intervals based on ensemble variance
- Binding strength classification: tight (<10 µg/mL), medium (10-100 µg/mL), weak (>100 µg/mL)
- Export ranked lists with polymer-specific KD estimates
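To make the Langmuir fitting step in the list above concrete, here is a minimal sketch in plain Python. It is not RNAproFold's actual fitting routine, which this page does not describe. The function names, the grid-scan approach (instead of a nonlinear optimizer) and the toy dilution series are all assumptions for illustration. Only the 0.1-1000 µg/mL KD range and the R² > 0.6 quality threshold come from the text.

```python
def langmuir(conc, f_max, kd):
    """Langmuir isotherm: expected signal at concentration `conc`."""
    return f_max * conc / (kd + conc)


def fit_langmuir(concs, signals, kd_min=0.1, kd_max=1000.0, n_grid=400):
    """Fit F = f_max * c / (kd + c) by scanning kd on a log-spaced grid.

    For a fixed kd the best f_max has a closed form (linear least squares
    through the origin), so a one-dimensional scan is enough for a sketch.
    """
    mean_signal = sum(signals) / len(signals)
    ss_tot = sum((s - mean_signal) ** 2 for s in signals)
    best = None
    for i in range(n_grid):
        # Log-spaced kd between kd_min and kd_max (the stated KD range).
        kd = kd_min * (kd_max / kd_min) ** (i / (n_grid - 1))
        x = [c / (kd + c) for c in concs]
        f_max = sum(xi * s for xi, s in zip(x, signals)) / sum(xi * xi for xi in x)
        ss_res = sum((s - f_max * xi) ** 2 for xi, s in zip(x, signals))
        if best is None or ss_res < best[0]:
            best = (ss_res, kd, f_max)
    ss_res, kd, f_max = best
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    # Quality control: keep only curves with R² above the 0.6 threshold.
    return {"kd": kd, "f_max": f_max, "r2": r2, "passes_qc": r2 > 0.6}


# Toy dilution series (µg/mL) with noiseless signals generated at KD = 25.
concs = [1, 3, 10, 30, 100, 300]
signals = [langmuir(c, 1000.0, 25.0) for c in concs]
fit = fit_langmuir(concs, signals)
print(fit)
```

A production pipeline would more likely use a nonlinear least-squares solver and flag fits whose KD lands on the edge of the allowed range. The grid scan is used here only because it needs no third-party packages.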
Rigorous Science Meets Cutting-Edge AI
RNA as a Drug Target
RNA molecules play critical roles in gene regulation, making them attractive therapeutic targets for:
- Genetic disorders (Huntington’s, muscular dystrophy)
- Viral infections (COVID-19, HIV, influenza)
- Cancer (oncogenic mRNA targeting)
- Neurodegenerative diseases
The Peptide Advantage
- Higher specificity than small molecules
- Lower immunogenicity than antibodies
- Tunable pharmacokinetics
- Easier to synthesize and modify
Machine Learning Architecture
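One part of the training setup listed earlier, the stratified train/test split that preserves polymer distribution, can be sketched as below. This is an illustrative stand-in, not the platform's code. The record layout, the function name and the 20% test fraction are assumptions, although 20% would be consistent with 4,200 held-out points from roughly 21,000 measurements.

```python
import random
from collections import defaultdict


def stratified_split(records, key, test_fraction=0.2, seed=0):
    """Split records so every stratum (e.g. polymer) keeps the same test share."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for rec in records:
        by_stratum[rec[key]].append(rec)

    train, test = [], []
    for stratum in sorted(by_stratum):      # sorted for reproducibility
        group = by_stratum[stratum][:]      # copy so the input is not reordered
        rng.shuffle(group)
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Toy data: 40 polyA and 20 polyG measurements (hypothetical peptide IDs).
records = [
    {"peptide": f"PEP{i}", "polymer": "polyA" if i < 40 else "polyG"}
    for i in range(60)
]
train, test = stratified_split(records, key="polymer")
print(len(train), len(test))
```

Splitting within each polymer keeps a small class such as polyG from being under-represented in the held-out set, which matters when per-polymer R² values are reported separately.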
Enterprise-Grade Features for Biotech & Pharma
Confidence Scoring
- R² > 0.75 on validation sets
- RMSE of 0.26 log units (~1.8-fold error in KD, assuming log10 units)
- Per-prediction uncertainty estimates
- Quality flags for out-of-domain sequences
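As a rough illustration of how ensemble variance could become a per-prediction uncertainty estimate, the sketch below summarizes a set of per-tree predictions. It assumes predictions are made in log10(KD) and that a normal-approximation interval is acceptable. Neither is confirmed by this page, and the function names are hypothetical. The second helper converts an RMSE in log10 units into a fold-error.

```python
import statistics


def summarize_ensemble(log10_kd_preds, z=1.96):
    """Turn per-tree log10(KD) predictions into a KD estimate with an interval.

    The interval is a heuristic based on the spread across ensemble members,
    not a calibrated confidence interval.
    """
    mean = statistics.fmean(log10_kd_preds)
    spread = statistics.stdev(log10_kd_preds)
    return {
        "kd": 10 ** mean,
        "kd_low": 10 ** (mean - z * spread),
        "kd_high": 10 ** (mean + z * spread),
        "log10_sd": spread,
    }


def fold_error(rmse_log10):
    """Typical multiplicative error implied by an RMSE in log10 units."""
    return 10 ** rmse_log10


# Hypothetical predictions from five trees for one peptide-polymer pair.
summary = summarize_ensemble([1.0, 1.2, 0.8, 1.1, 0.9])
print(summary)
print(round(fold_error(0.26), 2))
```

A wide interval relative to the point estimate is one possible trigger for an out-of-domain quality flag, although how RNAproFold sets such flags is not specified here.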
High-Throughput Screening
- Screen millions of peptide candidates in silico
- Filter by binding strength (tight/medium/weak)
- Export ranked lists for experimental validation
- Batch processing API for automation
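The filtering and export steps above could look roughly like this once predictions are available. The thresholds are the tight (<10 µg/mL) and weak (>100 µg/mL) cut-offs stated earlier on this page. The peptide sequences, KD values, function names and CSV column names are made up for illustration and do not describe the actual batch API.

```python
import csv
import io

TIGHT_MAX = 10.0    # µg/mL; below this a binder counts as "tight"
WEAK_MIN = 100.0    # µg/mL; above this a binder counts as "weak"


def classify(kd):
    """Map a predicted KD (µg/mL) to a binding-strength class."""
    if kd < TIGHT_MAX:
        return "tight"
    if kd > WEAK_MIN:
        return "weak"
    return "medium"


def rank_candidates(predictions, keep=("tight",)):
    """Keep the requested classes and sort by ascending KD (strongest first)."""
    rows = [(pep, kd, classify(kd)) for pep, kd in predictions.items()]
    return sorted((r for r in rows if r[2] in keep), key=lambda r: r[1])


def to_csv(rows):
    """Serialize ranked rows to CSV text for experimental follow-up."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["peptide", "kd_ug_per_ml", "class"])
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical predicted KD values for one polymer.
predictions = {"KRKRKR": 4.2, "GASGAS": 250.0, "RRWKKL": 8.9, "AKLDEV": 55.0}
ranked = rank_candidates(predictions)
print(to_csv(ranked))
```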
Multi-Target Prediction
- Simultaneous predictions for polyA, polyC, polyG, polyU
- Custom RNA sequence analysis (coming soon)
- Cross-reactivity profiling
- Selectivity scoring
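This page does not define how selectivity is scored. One plausible definition, shown here purely as an assumption, is the log10 ratio between the strongest off-target KD and the on-target KD, computed from the per-polymer predictions. The function name and the example values are hypothetical.

```python
import math


def selectivity(kd_by_polymer, target):
    """log10(best off-target KD / target KD); positive means the target is preferred."""
    off_target = [kd for polymer, kd in kd_by_polymer.items() if polymer != target]
    return math.log10(min(off_target) / kd_by_polymer[target])


# Hypothetical predicted KD values (µg/mL) for one peptide.
kd_by_polymer = {"polyA": 120.0, "polyC": 300.0, "polyG": 6.0, "polyU": 60.0}
score = selectivity(kd_by_polymer, target="polyG")
print(score)
```

Under this definition a score of 1.0 means the peptide binds its target ten times more tightly than the nearest competing polymer, and a score near zero indicates cross-reactivity.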
Active Learning Pipeline
- Integrate your proprietary screening data
- Model retraining with custom datasets
- Transfer learning for novel RNA targets
- Continuous model improvement








