If you are building a neoantigen pipeline, picking an MHC class I predictor is one of the first forks in the road — and the public guidance is thin. Most search results still point to a 2017 preprint or a years-old GitHub issue. This is a current, neutral comparison of the two tools that dominate the field: NetMHCpan-4.1 from DTU and MHCflurry 2.0 from OpenVax.
The short version: neither is uniformly better. They are trained on overlapping data, agree closely on strong binders, and diverge at the edges — non-9-mers, rare alleles, and the boundary between binding and presentation. The right choice depends on your licensing constraints, whether you need MHC class II, and how deeply you need to script the tool into a reproducible pipeline. We will also name the caveat that outranks the choice itself: presentation is not immunogenicity.
Both tools are pan-allele neural-network predictors: they take an HLA sequence and a peptide and score the pairing, generalizing across alleles rather than training one model per allele. The substantive differences are in training data, outputs, and packaging.
NetMHCpan-4.1 is trained on a combined set of more than 850,000 peptides spanning both quantitative binding-affinity (BA) measurements and mass-spectrometry eluted-ligand (EL) data, drawn from single-allele and multi-allele sources. It outputs both a BA score and an EL (presentation-likelihood) score, and it covers a broad set of MHC molecules across human and several non-human species. Its sibling NetMHCIIpan handles MHC class II — a capability MHCflurry does not provide.
MHCflurry 2.0 is structured as two stacked models. A pan-allele binding-affinity predictor, trained on affinity measurements together with mass-spec ligand data, scores the peptide–MHC pairing; an allele-independent antigen-processing predictor, trained on mass-spec ligands, models effects such as proteasomal cleavage from the peptide and its flanking sequence. A small logistic-regression "presentation" model combines the two into a composite presentation score. The package is Apache-licensed, pip-installable, fast over large peptide sets, and easy to call from code.
So the philosophies converge more than they differ: both now lean on mass-spec immunopeptidomics to predict presentation rather than affinity alone. NetMHCpan folds EL data directly into one pan-allele network; MHCflurry factors processing out as an explicit, separately inspectable module.
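The stacked design described above can be pictured as a logistic combination of a binding term and a processing term. The following is a minimal sketch of that idea only, not MHCflurry's actual code: the weights, the intercept, the 50,000 nM cap, and both function names are assumptions chosen for illustration.

```python
import math

# Assumption: affinities are capped at 50,000 nM before rescaling.
MAX_IC50_NM = 50000.0


def affinity_to_unit_scale(ic50_nm: float) -> float:
    """Rescale an IC50 in nM to [0, 1], where 1.0 is the tightest binder."""
    clipped = min(max(ic50_nm, 1.0), MAX_IC50_NM)
    return 1.0 - math.log(clipped) / math.log(MAX_IC50_NM)


def presentation_score(
    ic50_nm: float,
    processing_score: float,
    w_affinity: float = 8.0,    # hypothetical weight, not a fitted value
    w_processing: float = 2.0,  # hypothetical weight, not a fitted value
    intercept: float = -6.0,    # hypothetical intercept
) -> float:
    """Combine a binding term and a processing term with a logistic function."""
    z = (
        intercept
        + w_affinity * affinity_to_unit_scale(ic50_nm)
        + w_processing * processing_score
    )
    return 1.0 / (1.0 + math.exp(-z))


# Same processing score, different affinities: the tighter binder scores higher.
strong = presentation_score(ic50_nm=50.0, processing_score=0.9)
weak = presentation_score(ic50_nm=5000.0, processing_score=0.9)
print(strong > weak)
```

Because the two inputs stay separate until the final step, a low composite score can be traced back to weak binding, poor processing, or both.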
| Dimension | NetMHCpan-4.1 | MHCflurry 2.0 |
|---|---|---|
| Approach | Single pan-allele neural net over BA + EL data | Stacked binding + antigen-processing models, combined into a presentation score |
| Training data | >850k peptides: in-vitro binding affinity and MS eluted ligands | Binding: affinity measurements plus MS eluted ligands; processing: MS eluted ligands; logistic combiner |
| MHC class | Class I; class II via companion NetMHCIIpan | Class I only |
| Rare-allele coverage | Broad; predicts any MHC with a known sequence, incl. many non-human species | Pan-allele over human HLA; narrower species/allele scope |
| Speed / scriptability | Standalone binary or web server; usable in pipelines | Pip-installable Python; fast on large sets, easy to script |
| Output | Both BA and EL (presentation) scores, with %rank | Binding affinity, processing, and composite presentation score |
| License | Free for academic/non-commercial; commercial license required; closed source | Apache 2.0 open source; commercial use allowed |
| Best for | Class II needs, rare/non-human alleles, established academic workflows | Open-source/commercial pipelines, reproducibility, programmatic batch scoring |
Reach for NetMHCpan-4.1 when you need MHC class II — its companion NetMHCIIpan is the standard, and MHCflurry simply does not cover class II. It is also the safer default for rare or non-human alleles, since it predicts for any MHC molecule with a known sequence, and for fitting into established academic workflows where reviewers and collaborators already expect NetMHCpan outputs and %rank thresholds.
Reach for MHCflurry 2.0 when licensing and engineering matter. It is Apache-licensed, so a startup can put it in a commercial product without a separate agreement. It installs with pip, runs fast over large peptide libraries, and exposes its binding, processing, and presentation scores as separate, inspectable values — which makes it easy to drop into a reproducible, version-pinned pipeline and to reason about why a peptide scored the way it did.
In practice, many teams run both and look for agreement. The tools concur closely on strong binders, so consensus calls are robust. Disagreement clusters at the edges: peptides longer or shorter than 9 residues, rare alleles with sparse training data, and peptides near the presentation threshold. Those are exactly the cases worth flagging for wet-lab follow-up rather than trusting either score alone.
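The run-both-and-compare approach can be sketched as a small triage step. This is an illustrative sketch, assuming each tool's output has already been parsed into a lower-is-better percentile rank; the `Call` record, its field names, and the 2.0 cutoff are assumptions, and the cutoff should be tuned per project.

```python
from dataclasses import dataclass

# Assumption: a percentile rank at or below 2.0 counts as a predicted binder.
BINDER_RANK_CUTOFF = 2.0


@dataclass(frozen=True)
class Call:
    """One peptide-allele pair scored by both tools (hypothetical record)."""
    peptide: str
    allele: str
    netmhcpan_rank: float   # lower is better
    mhcflurry_rank: float   # lower is better


def triage(call: Call, cutoff: float = BINDER_RANK_CUTOFF) -> str:
    """Label a pair as agreed binder, agreed non-binder, or needing review."""
    a = call.netmhcpan_rank <= cutoff
    b = call.mhcflurry_rank <= cutoff
    if a and b:
        return "consensus_binder"
    if not a and not b:
        return "consensus_nonbinder"
    return "review"  # the tools disagree: flag for follow-up


# Made-up ranks for illustration; these are not real predictions.
calls = [
    Call("AAAAAAAAA", "HLA-A*02:01", 0.1, 0.3),
    Call("CCCCCCCCCC", "HLA-A*02:01", 1.5, 4.0),
    Call("DDDDDDDD", "HLA-B*07:02", 12.0, 9.0),
]
labels = [triage(c) for c in calls]
print(labels)
```

Pairs labeled `review` are the disagreement cases the paragraph above recommends flagging rather than resolving by trusting one tool.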
Two adjacent tools are worth knowing. MixMHCpred (GfellerLab) is a motif-based presentation predictor trained on eluted-ligand data, free for academic use with a separate commercial license; it is a useful third opinion. NetCTL bundles MHC binding with proteasomal cleavage and TAP transport for an older-style integrated CTL-epitope score. Neither displaces NetMHCpan or MHCflurry as the primary class I predictor, but both are reasonable cross-checks.
Here is the caveat that matters more than the NetMHCpan-vs-MHCflurry decision: a high binding or presentation score predicts that a peptide will be displayed on the cell surface. It does not predict that a T cell will recognize it. These are different questions, and conflating them is the most common way neoantigen pipelines overpromise.
The literature is blunt about the gap. Most peptides predicted to be presented never trigger a T-cell response, because immunogenicity also depends on TCR recognition of the peptide–MHC complex, central tolerance, and physicochemical peptide features that presentation models do not capture. A mutation may raise binding affinity, or it may leave affinity untouched and still produce a peptide different enough from self to be seen by T cells, so affinity and immunogenicity are only loosely coupled.
The practical implication: use NetMHCpan or MHCflurry as a presentation filter, then layer dedicated immunogenicity or TCR-recognition models (for example PRIME, the GfellerLab immunogenicity predictor built on MixMHCpred, and TCR–pMHC binding predictors) on top before committing wet-lab resources. The binding predictor narrows the field from thousands of candidates to a tractable shortlist; it does not pick the winners.
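The filter-then-rank layering can be sketched in a few lines. This is a minimal sketch under stated assumptions: the `Candidate` record, its field names, the rank cutoff, and the shortlist size are hypothetical, and the immunogenicity score stands in for whatever separate model a pipeline uses.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A candidate peptide with scores from two separate stages (hypothetical)."""
    peptide: str
    presentation_rank: float  # lower is better; from a presentation predictor
    immunogenicity: float     # higher is better; from a separate model


def shortlist(
    candidates: List[Candidate],
    max_rank: float = 2.0,  # assumed presentation cutoff
    top_n: int = 20,        # assumed shortlist size
) -> List[Candidate]:
    """Stage 1: keep likely-presented peptides. Stage 2: rank by immunogenicity."""
    presented = [c for c in candidates if c.presentation_rank <= max_rank]
    ranked = sorted(presented, key=lambda c: c.immunogenicity, reverse=True)
    return ranked[:top_n]


# Made-up scores for illustration; these are not real predictions.
pool = [
    Candidate("AAAAAAAAA", presentation_rank=0.2, immunogenicity=0.10),
    Candidate("CCCCCCCCC", presentation_rank=1.8, immunogenicity=0.75),
    Candidate("DDDDDDDDD", presentation_rank=6.0, immunogenicity=0.95),
]
picked = shortlist(pool, top_n=2)
print([c.peptide for c in picked])
```

In this toy input the best-presented peptide ranks last in the shortlist, and the peptide with the highest immunogenicity score never enters it, which illustrates why the two stages answer different questions.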
Treat both NetMHCpan-4.1 and MHCflurry 2.0 as state-of-the-art MHC class I predictors and choose on constraints, not on a leaderboard. Need class II, rare alleles, or a familiar academic workflow: NetMHCpan. Need open-source licensing, easy scripting, and inspectable component scores for a reproducible or commercial pipeline: MHCflurry. Running both and trusting the consensus is a defensible default. And whichever you pick, remember it answers "will this be presented?" — not "will this be immunogenic?"
Maintenance note: this comparison was last reviewed on 2026-05-30 against NetMHCpan-4.1 and MHCflurry 2.0. If a version change or a new benchmark result appears to shift the guidance, re-validate on your own alleles and peptides rather than assuming the ranking holds.
- NetMHCpan-4.1 (DTU Health Tech) — Pan-allele MHC class I predictor; BA + EL output. Free academic use.
- NetMHCIIpan-4.1 (DTU Health Tech) — Companion MHC class II predictor.
- MHCflurry (OpenVax, GitHub) — Apache-licensed open-source class I binding + presentation predictor.
- MHCflurry 2.0 paper (Cell Systems) — Antigen-processing model and presentation-score benchmarks.
- NetMHCpan-4.1 / NetMHCIIpan-4.0 paper (NAR) — Training data, EL-vs-BA integration, allele coverage.
- MixMHCpred (GfellerLab, GitHub) — Motif-based presentation predictor; useful third opinion.