There is no single "neoantigen prediction AI." Turning a patient's tumor into a shortlist of vaccine targets is a pipeline of distinct prediction problems stacked on top of one another, and the tools that matter occupy different rungs of that stack. Some are end-to-end pipelines that run the whole sequence; others are the specialized models the pipelines call underneath. Comparing them like-for-like only works once you know which rung each one is on.
This page is a map of that toolchain — the widely-used open-source options, what each is good for, and where the genuine differentiation lives. It is deliberately not a leaderboard: published accuracy numbers are measured on different datasets with different definitions of "correct," so a single ranking would be misleading. What follows is the more useful thing — where each tool fits, and how to pick.
Every neoantigen pipeline walks the same path. First, variant calling compares tumor to healthy DNA to find tumor-specific mutations. Second, HLA typing determines which HLA alleles the patient carries. Third, candidate mutated peptides are generated and scored for HLA binding and presentation — the step people usually mean by "neoantigen prediction." Fourth, the harder question: immunogenicity — will a T cell actually respond? Fifth, the candidates are ranked and filtered down to the handful worth putting in a vaccine.
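The candidate-generation part of the third step can be sketched in a few lines. This is a minimal illustration, not how any particular pipeline implements it: the function name `mutant_peptides`, the 8 to 11 residue length range (a common choice for class I) and the toy protein sequence are all assumptions, and it handles only a single amino-acid substitution. Indels, frameshifts and fusions need transcript-level handling that real pipelines provide.

```python
from typing import Iterable, Iterator


def mutant_peptides(
    protein: str, pos: int, alt: str, lengths: Iterable[int] = range(8, 12)
) -> Iterator[str]:
    """Yield every k-mer that spans a single substituted residue.

    protein: wild-type protein sequence (one-letter codes)
    pos:     0-based index of the substituted residue
    alt:     the mutant amino acid
    lengths: peptide lengths to enumerate (assumed 8-11 here)
    """
    if not 0 <= pos < len(protein):
        raise ValueError("pos is outside the protein sequence")
    mutant = protein[:pos] + alt + protein[pos + 1:]
    for k in lengths:
        # Only window starts that keep the mutated residue inside the k-mer.
        first = max(0, pos - k + 1)
        last = min(pos, len(mutant) - k)
        for start in range(first, last + 1):
            yield mutant[start:start + k]


# Toy 20-residue sequence, invented for illustration only.
toy_protein = "ACDEFGHIKLMNPQRSTVAY"
peptides = list(mutant_peptides(toy_protein, pos=10, alt="W"))
```

Each resulting peptide would then be passed, together with the patient's HLA alleles, to a binding or presentation predictor.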
HLA-binding prediction (rung three) is the mature, commoditized rung. Immunogenicity (rung four) is the unsolved one. That single fact explains the whole landscape: the end-to-end pipelines mostly agree on the binding step because they call the same underlying predictors, and they differentiate on how they handle variants, presentation, and ranking.
End-to-end pipelines take you from sequencing data (or called variants) to a ranked candidate list. The dominant open-source option is pVACtools, from the Griffith Lab: it handles SNVs, insertions/deletions and gene fusions, covers both MHC class I and class II, plugs into many binding predictors, and ships pVACview for human-in-the-loop prioritization. It is the de facto reference most academic pipelines are measured against.
The others trade breadth for specific strengths: Seq2Neo runs truly end-to-end from raw reads and folds in its own immunogenicity model; pTuneos emphasizes a refined immunogenicity-aware ranking; NeoPredPipe is lightweight and adds tumor-clonality context; ProGeo-neo v2 brings mass-spectrometry proteogenomic validation; OmniNeo widens the net to splicing- and fusion-derived candidates across both MHC classes.
| Pipeline | Variants | MHC class | Immunogenicity model | Type |
|---|---|---|---|---|
| pVACtools / pVACseq | SNV, indel, fusion | I + II | Via plugins + pVACview ranking | Open source |
| Seq2Neo | SNV, indel, fusion | I | Yes (built-in CNN) | Open source |
| pTuneos | SNV | I | Yes (feature-based refinement) | Open source |
| NeoPredPipe | SNV, indel | I | No (binding + clonality) | Open source |
| ProGeo-neo v2 | SNV | I | MS-validated presentation | Open source |
| TSNAD v2 | SNV, indel | I + II | Limited | Open source |
| OmniNeo | SNV, indel, fusion, splice | I + II | Partial | Open source |
| NeoFox | Annotation layer | I + II | Aggregates many features | Open source |
Whatever pipeline you run, it ultimately calls a peptide–HLA predictor to score the binding/presentation step. This is the rung where the field is most settled. NetMHCpan-4.1 is the canonical baseline: a pan-allele neural network that generalizes to rare alleles and outputs both a binding-affinity score and an eluted-ligand (presentation) score — the latter trained on mass-spec data and usually the better signal. NetMHCIIpan covers the class II side.
MHCflurry 2.0 is the leading open-source alternative: it is competitive and considerably faster, models antigen processing, and is trivially scriptable into Python pipelines. MixMHCpred takes a motif-based route. In practice these tools agree closely on strong binders and diverge on edge cases, which is why mature pipelines often run more than one and combine the scores rather than trusting a single model.
- NetMHCpan-4.1 — Pan-allele class I binding + eluted-ligand presentation (the baseline)
- NetMHCIIpan-4.0 — Class II presentation prediction
- MHCflurry 2.0 — Open-source, fast, models antigen processing
- MixMHCpred — Motif-based class I predictor
- pVACtools — The reference end-to-end open-source pipeline
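Combining scores from more than one predictor can be as simple as summarising each peptide's percentile ranks across tools. The sketch below assumes each tool's output has already been parsed into a `{peptide: percentile rank}` mapping (lower is better). The function name `combine_ranks`, the tool labels, the peptides and every number are invented for illustration, and treating ranks from different tools as directly comparable is itself a simplification.

```python
from statistics import median


def combine_ranks(
    ranks_by_tool: dict[str, dict[str, float]]
) -> dict[str, dict[str, float]]:
    """Summarise per-peptide percentile ranks from several predictors.

    ranks_by_tool maps tool name -> {peptide: percentile rank}.
    Only peptides scored by every tool are kept.
    """
    tools = list(ranks_by_tool.values())
    shared = set.intersection(*(set(t) for t in tools)) if tools else set()
    summary = {}
    for peptide in sorted(shared):
        values = [t[peptide] for t in tools]
        summary[peptide] = {
            "best": min(values),                  # most optimistic tool
            "median": median(values),             # consensus view
            "spread": max(values) - min(values),  # large spread = tools disagree
        }
    return summary


# Invented values for two hypothetical tools.
example = {
    "tool_a": {"PEPTIDEAA": 0.2, "PEPTIDEBB": 1.5, "PEPTIDECC": 0.1},
    "tool_b": {"PEPTIDEAA": 0.4, "PEPTIDEBB": 6.0, "PEPTIDECC": 0.3},
}
summary = combine_ranks(example)
```

A large `spread` marks exactly the edge cases where the predictors diverge, which is where a second opinion is most useful.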
A peptide can be perfectly presented on HLA and still provoke no T-cell response — and current tools predict presentation far better than they predict immunogenicity. This is the gap the whole field is racing to close, and it is where the newest AI is concentrated: protein language models adapted to peptide–HLA data, structure-based scoring from the AlphaFold/ESM lineage, and TCR–pMHC models that try to ask the real question — will this patient's T cells recognize it?
For anyone evaluating tools, this is the honest bottom line: the binding rung is solved well enough to be a commodity, so a pipeline's value lives in how it handles variants, presentation, and especially how it ranks for immunogenicity. Be skeptical of any tool that markets a high binding accuracy as if it were the same as picking neoantigens that work — it isn't.
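The distinction between binding and ranking can be made concrete with a small sketch: presentation acts as a gate, and a separate immunogenicity score decides the order. Everything here is hypothetical, including the `Candidate` fields, the `shortlist` function, the percentile-rank cutoff of 2.0 (often used as a weak-binder threshold, but an assumption in this example) and the made-up scores. It does not reproduce any specific tool's ranking logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    peptide: str
    presentation_rank: float  # percentile rank from a presentation predictor, lower is better
    immunogenicity: float     # hypothetical score in [0, 1] from a separate model, higher is better


def shortlist(candidates, max_rank=2.0, top_n=20):
    """Gate on presentation, then order by immunogenicity.

    Ties on immunogenicity are broken by the better presentation rank.
    """
    presented = [c for c in candidates if c.presentation_rank <= max_rank]
    ordered = sorted(presented, key=lambda c: (-c.immunogenicity, c.presentation_rank))
    return ordered[:top_n]


# Invented candidates for illustration only.
candidates = [
    Candidate("PEP_A", presentation_rank=0.1, immunogenicity=0.20),
    Candidate("PEP_B", presentation_rank=1.8, immunogenicity=0.90),
    Candidate("PEP_C", presentation_rank=5.0, immunogenicity=0.95),
    Candidate("PEP_D", presentation_rank=0.4, immunogenicity=0.60),
]
picked = shortlist(candidates)
```

In this toy data the strongest binder, `PEP_A`, ends up last among the survivors, and `PEP_C` is dropped despite its high immunogenicity score because it fails the presentation gate.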
If you want a maintained, well-documented reference with the widest variant and MHC-class coverage, start with pVACtools. If you want a single command from raw reads with immunogenicity built in, look at Seq2Neo. If immunogenicity-aware ranking is your priority, pTuneos and the feature-aggregation layer NeoFox are worth a look. If you need mass-spec presentation validation, consider ProGeo-neo v2. For the binding step beneath any of them, NetMHCpan-4.1 is the safe default and MHCflurry 2.0 the fast, open, scriptable companion; running both is a reasonable hedge.
Commercial platforms (for example the AI-driven pipelines inside Evaxion, BioNTech and Moderna's programs) layer proprietary immunogenicity models and larger private datasets on top of this same conceptual stack, but they are not publicly runnable; the open-source tools above are what an external team can actually deploy and benchmark today.
This page is maintained, but capabilities, versions and benchmarks move quickly, so any specific claim here should be checked against each project's current documentation. Last reviewed 2026-05-30.