Top Open Source Finance Models

Hey All,

We just wrapped up a hands-on round with our FinanceEval framework. Here's what I discussed in the video, along with my current top open-source picks on Hugging Face for models focused on financial advice:



Top 3 (with quick stats)

:1st_place_medal: meta-llama/Meta-Llama-3-70B-Instruct
Score 6.26 | 6-metric profile (Trust, Accuracy, Explainability, Client-First, Risk Safety, Clarity) | 70B params | Meta license | EN
:link: https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct

:2nd_place_medal: meta-llama/Llama-3.3-70B-Instruct
Score 5.87 | 6-metric profile | 70B params | Meta license | EN
:link: https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct

:3rd_place_medal: nvidia/Llama-3.1-Nemotron-70B-Instruct
Score 5.78 | 6-metric profile | 70B params | NVIDIA Open Model License | EN
:link: https://huggingface.co/nvidia/Llama-3.1-Nemotron-70B-Instruct


How we ranked them (FinanceEval by BrainDrive)

FinanceEval is our evaluation workflow for AI-generated financial advice.
We score models on six practical metrics: Trust & Transparency, Competence & Accuracy, Explainability, Client-Centeredness, Risk Safety, and Clarity & Financial Literacy Support.
The per-metric scores roll up into a weighted total, and that total determines the ranking.

:page_facing_up: Docs (scoring & math):
:backhand_index_pointing_right: https://github.com/BrainDriveAI/ModelMatch/tree/main/FinanceEval/Docs
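To make the "weighted total" idea concrete, here is a minimal sketch of how six per-metric scores could be combined into one number. The metric names come from the list above, but the weights (all equal here), the dictionary keys, and the `weighted_total` helper are placeholders I made up for illustration; the actual weights and formula are in the docs linked above.

```python
# Metric keys mirror the six FinanceEval metrics named in the post.
METRICS = [
    "trust_transparency",
    "competence_accuracy",
    "explainability",
    "client_centeredness",
    "risk_safety",
    "clarity_literacy",
]

# ASSUMPTION: equal weights, purely illustrative. These are NOT the
# weights FinanceEval uses; check the linked docs for the real ones.
HYPOTHETICAL_WEIGHTS = {name: 1.0 for name in METRICS}


def weighted_total(scores, weights=HYPOTHETICAL_WEIGHTS):
    """Return the weighted average of per-metric scores."""
    missing = [m for m in weights if m not in scores]
    if missing:
        raise ValueError(f"missing metric scores: {missing}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Normalizing by the weight sum keeps the total on the same
    # scale as the individual metric scores.
    return sum(scores[m] * w for m, w in weights.items()) / total_weight
```

Because the result is divided by the sum of the weights, the total stays on the same scale as the individual metric scores, which makes it easy to compare against a single metric.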


Workflow we used

Model shortlist (20+ HF candidates) → Multi-domain prompts (budgeting, investing, taxation, retirement) → Responses per model → FinanceEval scoring → Weighted aggregation → Ranking.
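For anyone who wants to reproduce the flow on their own shortlist, the steps above can be sketched roughly like this. Everything here is a stand-in: `rank_models`, the `generate` callables, and `score_response` are hypothetical names, and averaging a model's scores across prompts is my assumption for the sketch, not necessarily how the FinanceEval toolkit aggregates.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple


def rank_models(
    models: Dict[str, Callable[[str], str]],
    prompts: List[str],
    score_response: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Run every prompt through every model, score the responses,
    and return (model_name, average_score) pairs, best first."""
    if not prompts:
        raise ValueError("need at least one prompt")
    results = []
    for name, generate in models.items():
        # One response and one score per prompt for this model.
        per_prompt = [score_response(p, generate(p)) for p in prompts]
        # ASSUMPTION: a plain mean across prompts, for illustration only.
        results.append((name, mean(per_prompt)))
    # Highest average score ranks first.
    return sorted(results, key=lambda r: r[1], reverse=True)
```

In practice each `generate` would wrap a call to a hosted or local model, and `score_response` would be whatever evaluator produces the weighted total for a single response.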


Try it yourself

:laptop: Code toolkit: https://github.com/BrainDriveAI/ModelMatch/tree/main/FinanceEval
:desktop_computer: No-code evaluator: https://huggingface.co/spaces/BrainDrive/FinanceEval


About ModelMatch

ModelMatch helps you discover the most suitable open-source model for your domain and task. We started with summarization, then expanded into therapy and email generation, and have now added finance evaluation.

If you test other models or get different results, ping us; happy to compare notes.

Regards,
Navaneeth
