Hey All,
We just wrapped a hands-on round with our JudgeLock evaluation. Here's a recap of what I discussed in the video, plus my current top open-source picks for summarization on Hugging Face:
Top 3 (with quick stats)
- OpenHermes-2.5-Mistral-7B — https://huggingface.co/teknium/OpenHermes-2.5-Mistral-7B
  - Score 9.69 | C/A/H/R/B = 9/10/10/9/10 | ~8k ctx | Apache-2.0 | EN
- FLAN-T5-Base — https://huggingface.co/google/flan-t5-base
  - Score 9.15 | C/A/H/R/B = 8/9/9/9/10 | 512 ctx | Apache-2.0 | Multilingual
- SummLlama3.2-3B (DISLab) — https://huggingface.co/DISLab/SummLlama3.2-3B
  - Score 9.10 | C/A/H/R/B = 9/9/7/9/8 | ~8k ctx | Llama 3.2 community | 8 langs
How we ranked them (JudgeLock by BrainDrive)
Each model is scored on five practical signals: Coverage (C), Alignment (A), Hallucination (H), Relevance (R), and Bias/Toxicity (B).
Docs (scoring & math): https://github.com/BrainDriveAI/ModelMatch/tree/main/Summeval/Docs
Workflow we used
Model shortlist (30+ HF candidates) → Article set (Tech/Business/News/Science) → Summaries per model → JudgeLock scoring → Weighted aggregation → Leaderboard.
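The aggregation step above boils the five per-signal scores down to one leaderboard number. As a rough sketch only (the function name and the equal default weights are illustrative assumptions; the actual weighting is in the scoring docs linked above):

```python
def judgelock_score(coverage, alignment, hallucination, relevance, bias,
                    weights=(1, 1, 1, 1, 1)):
    """Aggregate the five JudgeLock signals (each 0-10) into one score.

    Illustrative sketch: equal weights by default; pass custom weights
    to mirror whatever weighting the real pipeline uses.
    """
    signals = (coverage, alignment, hallucination, relevance, bias)
    # Weighted average, normalized so the result stays on the 0-10 scale.
    return sum(w * s for w, s in zip(weights, signals)) / sum(weights)

# FLAN-T5-Base's per-signal scores from the table above:
print(judgelock_score(8, 9, 9, 9, 10))  # equal-weight mean: 9.0
```

With equal weights this gives 9.0 for FLAN-T5-Base rather than the 9.15 on the leaderboard, which is consistent with the real pipeline weighting some signals more heavily than others.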
Try it yourself
Code toolkit: https://github.com/BrainDriveAI/ModelMatch/tree/main/Summeval
No-code evaluator: https://huggingface.co/spaces/BrainDrive/Summary-Evaluator
ModelMatch is our way of helping you pick the right open-source model for real tasks. If you test other models or get different results, ping us; we're happy to compare notes.
Regards,
Navaneeth