
## How Recommendations Work
The quiz above collects two signals: what you’re building (your use case) and what you can spend (your cost constraint). Those two answers map to a short list of models that score well on the relevant benchmarks. The decision matrix below shows how the recommendation is produced — so you can see the logic behind the answer, not just the answer itself.
### Decision Matrix
| Use Case ↓ / Cost Constraint → | Free / Very Low | Balanced | Performance-First |
|---|---|---|---|
| Coding & Engineering | DeepSeek V3 (open-weight, low cost) | Claude Sonnet 4.6 (strong coding + agent tools) | Claude Opus 4.7 (top coding benchmarks) |
| Writing & Content | Llama 4 Maverick (open-weight, good quality) | GPT-5 mini (fast, low latency) | Claude Opus 4.7 (nuance & long-form) |
| Research & Analysis | Gemini 2.5 Flash (huge context window) | GPT-5 (strong reasoning) | Claude Opus 4.7 w/ extended thinking (deepest analysis) |
| Chat / General Assistant | Gemini 2.5 Flash (generous free tier) | GPT-5 mini (snappy, cheap) | GPT-5 (best-in-class chat) |
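Mechanically, the matrix is nothing more than a two-key lookup: one quiz answer picks the row, the other picks the column. A minimal TypeScript sketch of that idea follows; the type names, the `MATRIX` constant, and the `recommend` function are illustrative assumptions, not the production code.

```typescript
type UseCase = "coding" | "writing" | "research" | "chat";
type CostTier = "free" | "balanced" | "performance";

interface Recommendation {
  model: string;
  why: string;
}

// The decision matrix above, encoded as nested records keyed by the two quiz answers.
const MATRIX: Record<UseCase, Record<CostTier, Recommendation>> = {
  coding: {
    free: { model: "DeepSeek V3", why: "open-weight, low cost" },
    balanced: { model: "Claude Sonnet 4.6", why: "strong coding + agent tools" },
    performance: { model: "Claude Opus 4.7", why: "top coding benchmarks" },
  },
  writing: {
    free: { model: "Llama 4 Maverick", why: "open-weight, good quality" },
    balanced: { model: "GPT-5 mini", why: "fast, low latency" },
    performance: { model: "Claude Opus 4.7", why: "nuance & long-form" },
  },
  research: {
    free: { model: "Gemini 2.5 Flash", why: "huge context window" },
    balanced: { model: "GPT-5", why: "strong reasoning" },
    performance: { model: "Claude Opus 4.7 w/ extended thinking", why: "deepest analysis" },
  },
  chat: {
    free: { model: "Gemini 2.5 Flash", why: "generous free tier" },
    balanced: { model: "GPT-5 mini", why: "snappy, cheap" },
    performance: { model: "GPT-5", why: "best-in-class chat" },
  },
};

// Two answers in, one recommendation out: no scoring magic beyond the table.
function recommend(useCase: UseCase, cost: CostTier): Recommendation {
  return MATRIX[useCase][cost];
}
```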
### Example Walkthroughs
- “I’m a student writing a research paper and I don’t want to pay.” → Research & Analysis + Free / Very Low → Gemini 2.5 Flash, because its long context window handles whole papers without chunking.
- “I ship a coding agent at work and latency matters.” → Coding & Engineering + Balanced → Claude Sonnet 4.6, the current leader on agentic coding benchmarks at a mid-tier price.
- “I run a small business and want a chatbot for customer emails.” → Chat / General Assistant + Balanced → GPT-5 mini, which balances quality with cost for high-volume use. (Each walkthrough is a single matrix lookup; see the sketch below.)
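Using the `recommend` sketch above, the three walkthroughs reduce to three calls (the argument strings are the hypothetical keys from that sketch):

```typescript
// "Student, research paper, no budget" → Research & Analysis + Free / Very Low
recommend("research", "free");
// → { model: "Gemini 2.5 Flash", why: "huge context window" }

// "Coding agent at work, latency matters" → Coding & Engineering + Balanced
recommend("coding", "balanced");
// → { model: "Claude Sonnet 4.6", why: "strong coding + agent tools" }

// "Small-business email chatbot" → Chat / General Assistant + Balanced
recommend("chat", "balanced");
// → { model: "GPT-5 mini", why: "snappy, cheap" }
```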
Want to see the raw benchmark data behind these recommendations? Browse the full Data page →
Prefer a plain-language primer on how the benchmarks are scored? Visit the Learn page →
Note: This redesign concept demonstrates the Model Finder’s intended flow. In a production build, quiz answers would be computed client-side and the matching recommendation card would be surfaced automatically.
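For a sense of how small that client-side step could be, here is one possible wiring, reusing the `recommend` sketch above. The input names (`use-case`, `cost`), the `#quiz-form` selector, and the `data-model` attribute are assumptions for illustration, not the real page markup.

```typescript
// Read the checked value of a radio group, or null if nothing is selected yet.
function checkedValue(name: string): string | null {
  const input = document.querySelector<HTMLInputElement>(
    `input[name="${name}"]:checked`
  );
  return input ? input.value : null;
}

function showRecommendation(): void {
  const useCase = checkedValue("use-case") as UseCase | null;
  const cost = checkedValue("cost") as CostTier | null;
  if (!useCase || !cost) return; // quiz not finished yet

  const { model } = recommend(useCase, cost);

  // Each recommendation card carries a data-model attribute; reveal only the match.
  document.querySelectorAll<HTMLElement>("[data-model]").forEach((card) => {
    card.hidden = card.dataset.model !== model;
  });
}

document.querySelector("#quiz-form")?.addEventListener("change", showRecommendation);
```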