Artificial intelligence (AI) is driving advances in genome editing, from predictive modeling to generative design. Emerging generative AI tools such as RFdiffusion, AlphaFold 3, and ESM3 now facilitate the de novo design of linkers, inhibitors, and enzymes. A recent commentary in Nature Structural & Molecular Biology, "Expansion of artificial intelligence for genome editing: Protein design" (Uhm and Bae, 2025), reviewed work by Mi et al., who used AI to improve the precision of mitochondrial cytosine base editors.
Over the past decade, the influence of AI on biological research has grown rapidly, and genome editing is no exception. Researchers have applied traditional AI techniques, including machine learning and deep learning, to large-scale experimental data for predictive tasks such as estimating guide RNA (gRNA) activity, off-target effects, and editing outcomes. These predictive models have facilitated the selection of appropriate gRNAs for various clustered regularly interspaced short palindromic repeats (CRISPR) nucleases, base editors, and prime editors (Figure 1).
Figure 1. From prediction to design in genome-editing technologies. (Uhm H, Bae S., 2025)
The Rise of Generative AI
The emergence of generative AI is significantly influencing this field. With AlphaFold, RoseTTAFold, AlphaFold 2, and the recently released AlphaFold 3 (which uses generative AI), protein structure prediction has expanded to include nucleic acids and ligand complexes, opening new possibilities. Structure-based generative AI tools, such as RFdiffusion, have demonstrated the ability to go beyond replicating known structures and design novel proteins not found in nature. Similarly, protein language models (PLMs), including ESM3, enable analysis of the vast sequence space of naturally occurring proteins at multiple levels and can generate new functional sequences. Overall, these advances mark AI's transition from a purely "predictive" phase to a "design" phase, fundamentally altering the paradigm of biological research.
AI-Enhanced Mitochondrial Cytosine Base Editor (DdCBE-TOD)
The authors first pointed out the limitations of existing mitochondrial cytosine base editors (such as DdCBE). In these editors, the TALE DNA-binding domain and the deaminase (a DddA variant) are connected via a flexible linker, creating an editing window of approximately 14 to 18 base pairs, which is wide enough to cause frequent bystander edits. Recognizing that this structural flexibility compromises precision, the authors used RFdiffusion to design and incorporate a completely new orienting domain that rigidly fixes the relative positions of TALE and the deaminase.
Figure 2. De novo design of orienting domains enabling precise and efficient base editing. (Mi L, et al., 2025)
The resulting TALE-oriented deaminase (TOD) aligns its catalytic pocket with a single target cytosine. In an E. coli system, TOD was 4.5 times more efficient than previous editors while narrowing the editing window to only 3 base pairs. Cryo-electron microscopy structures revealed that the designed domain forms a rigid linker structure between TALE and the deaminase, stabilizing their conformation. To reduce off-target activity, the authors then developed a split version, DdCBE-TOD, which becomes active only when both TALE-fused halves bind DNA simultaneously. This design effectively reduced unwanted cellular editing while maintaining high targeting efficiency. In patient-derived fibroblasts carrying the pathogenic MT-TK m.A8344G mutation associated with myoclonic epilepsy with ragged-red fibers (MERRF) syndrome, DdCBE-TOD corrected approximately 41% of the mutant mtDNA, fully restoring normal cellular function. The editor also demonstrated therapeutic potential in a mouse disease model.
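The link between editing-window width and bystander edits can be illustrated with a minimal Python sketch. It is not the authors' analysis: the sequence, the target position, and the function names are hypothetical, and the model assumes every cytosine inside a window centred on the target is equally editable, ignoring strand and sequence-context preferences.

```python
def cytosines_in_window(seq: str, target: int, window: int) -> list[int]:
    """Return 0-based positions of every C in a window centred on `target`.

    Simplifying assumption: the window is a contiguous stretch of `window`
    bases around the target, clipped at the sequence ends.
    """
    half = window // 2
    start = max(0, target - half)
    end = min(len(seq), start + window)
    return [i for i in range(start, end) if seq[i] == "C"]


def bystanders(seq: str, target: int, window: int) -> list[int]:
    """Cytosines inside the window other than the intended target."""
    return [i for i in cytosines_in_window(seq, target, window) if i != target]


# Hypothetical 30-nt sequence; position 13 is the intended target cytosine.
SEQ = "ATCGGCTACCGTTCAGCCTAGCTTCGATCA"
TARGET = 13

for width in (18, 14, 3):
    print(width, bystanders(SEQ, TARGET, width))
```

In this toy sequence, the 14 and 18 base-pair windows expose several non-target cytosines, whereas the 3 base-pair window contains only the target.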
Broader Applications of AI in Enzyme Design
The DdCBE-TOD study exemplifies how AI can design protein structures to achieve specific functional goals. More cases of AI being used to construct novel core enzymes or auxiliary proteins have recently been reported in the genome editing field. While DdCBE-TOD highlights the potential of structural generative AI, OpenCRISPR underscores the efficacy of PLMs in sequence-based design. Profluent Bio's OpenCRISPR-1 was developed using large language models to design Cas9 variants not found in nature. This artificial enzyme achieved editing efficiency comparable to Streptococcus pyogenes Cas9 while exhibiting improved specificity (less off-target editing) and compatibility with base editing contexts.
A key advance is that the PLM produced a new functional sequence that natural evolution had not reached. Instead of relying on random mutagenesis libraries, the AI generated and evaluated hundreds of thousands of candidate sequences, directly delivering a functional sequence. This result suggests that AI-based sequence design can be extended to other enzyme families beyond Cas9, including reverse transcriptases, transposases, and integrases.
Prime Editing with Small Binder (PE-SB)
Another example of generative AI in use is the prime editing-small binder (PE-SB) study. Prime editing efficiency is limited by the mismatch repair pathway. To overcome this, researchers used RFdiffusion and AlphaFold 3 to design a completely new small binder, termed MutL homolog 1 (MLH1)-SB, which disrupts the MLH1-PMS2 interaction and thereby inhibits mismatch repair. The final PE7-SB2 system exhibited approximately 19-fold higher efficiency than PEmax (an improved version of PE2) in HeLa cells and about 2.5-fold higher efficiency than conventional PE7, along with an approximately 3.4-fold improvement in a mouse model. Unlike previous approaches that used natural inhibitory proteins (including dominant-negative MLH1 mutants) to suppress mismatch repair, PE-SB relies on a novel structure (MLH1-SB) created by generative AI that achieves high efficiency and specificity.
Challenges and Future Prospects
To date, the integration of CRISPR with AI has primarily focused on predictive models developed by machine learning and deep learning, which remain active research areas. Specifically, gRNA design has improved due to models based on large-scale screening data and off-target measurements, enabling simultaneous assessment of targeting efficacy and off-target risk. Other frameworks have enhanced predictive capabilities by simulating post-cleavage DNA repair outcomes and estimating probability distributions of base editing and prime editing events. Moving beyond simple rule-based heuristics, these models increasingly incorporate sequence context, chromatin accessibility, and other auxiliary features through multimodal learning strategies.
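For contrast with the learned models described above, the following minimal Python sketch shows the kind of simple rule-based heuristic they move beyond. It assumes SpCas9 conventions (a 20-nt protospacer followed by an NGG PAM), scans only the given strand, and ranks guides purely by how many background sites lie within a mismatch threshold. All function names and sequences are hypothetical, and no activity prediction is attempted.

```python
import re


def find_protospacers(seq: str) -> list[tuple[int, str]]:
    """Find 20-nt protospacers followed by an NGG PAM on the given strand."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{20})[ACGT]GG)")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq)]


def mismatches(a: str, b: str) -> int:
    """Hamming distance between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def rank_guides(target_seq: str, background_sites: list[str],
                max_mm: int = 3) -> list[tuple[int, int, str]]:
    """Rank candidate guides by a crude off-target count (fewer is better).

    Returns (risky_site_count, position, guide) tuples sorted ascending.
    """
    ranked = []
    for pos, guide in find_protospacers(target_seq):
        risky = sum(1 for site in background_sites
                    if mismatches(guide, site) <= max_mm)
        ranked.append((risky, pos, guide))
    return sorted(ranked)


# Hypothetical target region containing two candidate guides.
GUIDE_A = "ACGTACGTACGTACGTACGT"
GUIDE_B = "TTTTTCCCCCAAAAATTTTT"
TARGET_REGION = GUIDE_A + "TGG" + GUIDE_B + "AGG"

# Hypothetical background site differing from GUIDE_A at one position.
BACKGROUND = ["ACGTACGTACGTACGTACGA"]

for risky, pos, guide in rank_guides(TARGET_REGION, BACKGROUND):
    print(pos, guide, "near-matching background sites:", risky)
```

A learned model would replace the mismatch count with predicted on-target activity and off-target risk, and could add features such as sequence context or chromatin accessibility.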
However, these prediction-centric approaches still focus on optimizing outcomes within the boundaries of existing knowledge. Generative AI and structure-based design enable researchers to develop novel protein structures and functions. As described above, DdCBE-TOD precisely redesigned the spatial orientation between the catalytic domain and DNA-binding domain to achieve single-nucleotide precision, while the PE-SB binder introduced a completely new protein that selectively disrupts DNA repair pathways. Consequently, AI is evolving from a predictive assistant to an active designer of enzyme structures and a generator of proteins that transcend natural evolution. This marks a paradigm shift that can substantially enhance the precision and functional diversity of genome editing tools.
Challenges remain. Although DdCBE-TOD demonstrates impressive single-nucleotide precision, editing efficiency is still constrained by sequence context, and clinical applications face challenges such as delivery and immune responses. These are solvable problems expected to diminish with ongoing research.
The ability of AI to enhance the toolkit of editing enzymes holds broad promise. CRISPR effectors, including Cas9, Cas12, and Cas13 variants, have been fused with various auxiliary domains (including deaminases, reverse transcriptases, and protein binders) to generate diverse platforms such as base editors, prime editors, epigenome editors, and targeted integration systems. In the future, AI may enhance existing architectures, for example by optimizing spatial arrangements as in the DdCBE-TOD study, and generate novel classes of editors with no counterparts in nature.
References
- Mi L, et al. Computational design of a high-precision mitochondrial DNA cytosine base editor. Nature Structural & Molecular Biology, 2025: 1-12.
- Uhm H, Bae S. Expansion of artificial intelligence for genome editing: Protein design. Nature Structural & Molecular Biology, 2025: 1-3.
