About Splicify

A quiet tool
for a precise craft.

Splicify is a quiet, deterministic design tool for molecular cloning. You describe the plasmid you want to build in plain language — or upload the GenBank files of what you already have — and Splicify classifies the request, resolves named parts against an in-house knowledge base, scores every cloning method that could build the target, and runs the chosen workflow end-to-end. The result is a primer set, a protocol, an annotated plasmid map, and a workflow trace that documents every decision.

The intent classifier and predesign pipeline are fully deterministic: keyword and regex rules at the front; Primer3, SBOL3, and a clean-room six-tier annotation pipeline at the back. Plain-language plasmid descriptions are matched to a corpus of >7,000 LLM-annotated reference plasmids by semantic retrieval, then edited deterministically: insertions under 40 bp ride on primer tails, while insertions of 40 bp or longer become synthesis fragments. An optional LLM orchestrator slot is reserved for cases where the deterministic edit set leaves gaps; today it ships as a no-op, so every reply is reproducible.
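
The length rule for insertions can be illustrated with a minimal sketch. Only the 40 bp threshold comes from the description above; the names `route_insertion`, `InsertionPlan`, and the route labels are hypothetical and are not Splicify's actual API.

```python
from dataclasses import dataclass

# Threshold stated in the text: insertions under 40 bp ride on primer tails.
PRIMER_TAIL_MAX_BP = 40


@dataclass(frozen=True)
class InsertionPlan:
    """Hypothetical record of how one insertion would be delivered."""
    sequence: str
    route: str  # "primer_tail" or "synthesis_fragment" (illustrative labels)


def route_insertion(sequence: str) -> InsertionPlan:
    """Choose a delivery route for an insertion from its length alone."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("insertion sequence is empty")
    # Strictly shorter than 40 bp: carry the insert on a primer tail.
    # 40 bp or longer: order it as a synthesis fragment.
    if len(seq) < PRIMER_TAIL_MAX_BP:
        return InsertionPlan(seq, "primer_tail")
    return InsertionPlan(seq, "synthesis_fragment")
```

Because the rule depends only on sequence length, the same input always takes the same route, which is the property the deterministic edit step relies on.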

The primer-design algorithm uses Primer3 to calculate primer characteristics, then weighs candidate primer extensions to maximise the probability of successful PCR and assembly. It considers annealing Tm, overlap Tm, mispriming, primer-dimer risk, secondary structures, fragment count, and length. The result is a full picture of the factors contributing to experimental success.
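
One way to picture how several factors might be weighed together is a simple additive penalty, sketched below. This is an illustration only and not Splicify's scoring function: every target, limit, weight, and name here is an assumption, and the metric values are supplied by hand rather than computed by Primer3.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PrimerMetrics:
    """Hypothetical per-primer measurements (Tm values in degrees C)."""
    anneal_tm: float
    overlap_tm: float
    mispriming_sites: int
    dimer_tm: float
    hairpin_tm: float
    length: int


# All targets, limits, and weights below are assumed values for illustration.
TARGET_ANNEAL_TM = 60.0
TARGET_OVERLAP_TM = 55.0
STRUCTURE_TM_LIMIT = 45.0  # dimers or hairpins melting above this are penalised
MAX_LENGTH = 60
W_TM, W_MISPRIME, W_STRUCTURE, W_LENGTH, W_FRAGMENT = 1.0, 5.0, 1.0, 1.0, 2.0


def primer_penalty(m: PrimerMetrics) -> float:
    """Lower is better; 0.0 means every factor sits at its assumed target."""
    penalty = W_TM * abs(m.anneal_tm - TARGET_ANNEAL_TM)
    penalty += W_TM * abs(m.overlap_tm - TARGET_OVERLAP_TM)
    penalty += W_MISPRIME * m.mispriming_sites
    penalty += W_STRUCTURE * max(0.0, m.dimer_tm - STRUCTURE_TM_LIMIT)
    penalty += W_STRUCTURE * max(0.0, m.hairpin_tm - STRUCTURE_TM_LIMIT)
    penalty += W_LENGTH * max(0, m.length - MAX_LENGTH)
    return penalty


def design_penalty(primers: Iterable[PrimerMetrics], fragment_count: int) -> float:
    """Sum the per-primer penalties and add a cost for each fragment."""
    return sum(primer_penalty(p) for p in primers) + W_FRAGMENT * fragment_count
```

Keeping each factor as a separate term means a design's total can be broken back down into its contributions, so a report can show which factor is driving the score.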

Open-source acknowledgements

Splicify stands on the shoulders of the scientific software community. We are grateful to the authors and maintainers of every project below for making their work openly available.

Software & libraries

Primer3 — primer design, thermodynamic calculations, hairpin / homodimer scoring.
SBOL3 (Synthetic Biology Open Language v3) — standardised export of modules and SBO-typed interactions; round-trip via the pySBOL3 reference implementation.
BioPython — GenBank / FASTA parsing, sequence record manipulation, feature handling.
Sentence-Transformers (UKPLab) and the all-MiniLM-L6-v2 model — embedding plasmid token streams and natural-language descriptions for semantic retrieval.
HNSWlib — approximate nearest-neighbour index used during corpus build (runtime queries are brute-force cosine over a NumPy array).
SeqViz (Lattice Automation) — interactive DNA sequence visualisation in the linear viewer.
BLAST+, MMseqs2, Infernal — feature search across the six annotation reference tiers.
FastAPI, Next.js, React, PyTorch — the application and ML stack.

Sequence & feature data

SnapGene — 1,767 reference plasmid sequences span nine functional families (basic cloning vectors, CRISPR plasmids, fluorescent-protein vectors, Gateway destination / entry vectors, I.M.A.G.E. Consortium plasmids, insect-cell vectors, luciferase vectors, Lucigen vectors, mammalian expression vectors). Only the DNA sequences from the SnapGene-distributed GenBank files were used — not the SnapGene-authored features, maps, or notes; every annotation rendered in Splicify is generated by our own clean-room annotation pipeline. This corpus is the regression set for the annotation pipeline and part of the retrieval corpus for plain-language plasmid design.
NCBI RefSeq and NCBI engineered plasmids — 41 RefSeq + 5,414 engineered records contribute to the 7,256-plasmid retrieval corpus, with sequence and metadata fetched via Entrez.
VectorBuilder — 34 representative vectors plus 26 shorthand description ↔ token pairs that seeded the description-conditioned generative model.
GenoLIB — 1,062 main-tier nucleotide features and 706 GenoLIB CDS translations underpin the clean-room feature reference (post-pLannotate, 2026-04-19).
FPbase — 721 fluorescent-protein records; identifies and classifies reporter CDSs.
UniProt / SwissProt — 66,221 curated PE-1 and whitelisted protein entries for protein-level feature search.
Rfam — 1,737 curated families covering riboswitches, ribozymes, cis-elements, and structured non-coding RNAs.
Gene Ontology — Sequence Ontology (SO) and Systems Biology Ontology (SBO) — role and interaction URIs that flow through to SBOL3 export.

Contact

Splicify was created by Devon Fitzpatrick, with advice on automation and business development from Rishij Mewada and help from many friends in the molecular-biology community.

General inquiries, product, customer service

Devon Fitzpatrick

devon@splicify.ai

Business and legal inquiries

Rishij Mewada

rishij@splicify.ai