Not all free models are equal. Some excel at coding, others at long research documents, and some are fast enough for quick chat. We tested the top free models on OpenRouter and ranked them by real-world task performance so you don't have to guess.
New to this? If you haven't set up OpenCode with OpenRouter yet, start with our Free Local AI Setup Guide first. It takes about 10 minutes.
Model Comparison Table
All models below are completely free on OpenRouter. No credit card required. Rate limits apply (typically 20 requests/minute, 200 requests/day).
| Model | Best For | Speed | Quality | Context |
|---|---|---|---|---|
qwen/qwen3-coder:free |
Coding | 262K | ||
mistralai/devstral-2512:free |
Coding (multi-file) | 262K | ||
meta-llama/llama-3.3-70b-instruct:free |
General / Writing | 66K | ||
nvidia/nemotron-3-super-120b-a12b:free |
Agents / Tool use | 262K | ||
openai/gpt-oss-120b:free |
Reasoning / Analysis | 131K | ||
google/gemma-3-27b-it:free |
Multimodal / Vision | 131K | ||
qwen/qwen3.6-plus:free |
General / Long context | 1M | ||
nousresearch/hermes-3-llama-3.1-405b:free |
Research / Long-form | 131K | ||
openai/gpt-oss-20b:free |
Fast / Lightweight | 131K | ||
stepfun/step-3.5-flash:free |
General (MoE) | 256K |
Best for Coding
Qwen3 Coder — qwen/qwen3-coder:free
Currently the strongest free coding model on OpenRouter. This is a 480B Mixture-of-Experts model with state-of-the-art code generation. It handles multi-file refactors, test generation, and debugging with ease. The 262K context window means it can hold large codebases in a single conversation.
Best for: Code generation, refactoring, debugging, writing tests
// ~/.config/opencode/config.json
{
"provider": "openrouter",
"model": "qwen/qwen3-coder:free"
}
Devstral 2 — mistralai/devstral-2512:free
Mistral's dedicated coding model with strong agentic features. Excels at multi-file projects where you need the model to understand how files relate to each other. Strong on SWE-Bench and particularly good at following complex instructions for project-wide changes.
Best for: Multi-file edits, project scaffolding, agentic coding workflows
// ~/.config/opencode/config.json
{
"provider": "openrouter",
"model": "mistralai/devstral-2512:free"
}
Best for Writing & Email
Llama 3.3 70B — meta-llama/llama-3.3-70b-instruct:free
Meta's flagship open model matches GPT-4 level performance on writing tasks. It produces natural, well-structured prose and is excellent at drafting emails, blog posts, and documentation. The 66K context window is plenty for most writing tasks.
Best for: Emails, blog posts, documentation, copywriting, summaries
// ~/.config/opencode/config.json
{
"provider": "openrouter",
"model": "meta-llama/llama-3.3-70b-instruct:free"
}
Qwen 3.6 Plus — qwen/qwen3.6-plus:free
With a massive 1M token context window, Qwen 3.6 Plus can process entire books or long document collections. It uses efficient linear attention with sparse mixture-of-experts routing, delivering strong results for summarization, editing, and creative writing tasks.
Best for: Long documents, summarization, creative writing, translation
// ~/.config/opencode/config.json
{
"provider": "openrouter",
"model": "qwen/qwen3.6-plus:free"
}
Best for Research & Analysis
GPT-OSS 120B — openai/gpt-oss-120b:free
OpenAI's open-source 120B model offers near o4-mini parity on reasoning tasks. It excels at structured analysis, tool use, and breaking down complex problems. The 131K context window is ideal for feeding in research papers and getting structured analysis back.
Best for: Data analysis, reasoning, structured output, research synthesis
// ~/.config/opencode/config.json
{
"provider": "openrouter",
"model": "openai/gpt-oss-120b:free"
}
Hermes 3 405B — nousresearch/hermes-3-llama-3.1-405b:free
The largest free model available. Nous Research's Hermes 3 is a fine-tuned Llama 3.1 405B that excels at long-form, nuanced research tasks. It follows complex instructions precisely and produces thorough, well-cited analysis. Slower than smaller models, but the quality is hard to beat for free.
Best for: Deep research, long-form analysis, complex reasoning, detailed reports
// ~/.config/opencode/config.json
{
"provider": "openrouter",
"model": "nousresearch/hermes-3-llama-3.1-405b:free"
}
Best for General Tasks
Nemotron 3 Super — nvidia/nemotron-3-super-120b-a12b:free
NVIDIA's hybrid Mamba-Transformer architecture activates just 12B of its 120B parameters per token, making it extremely fast while maintaining enterprise-grade quality. It supports tool use natively, making it excellent for agentic workflows where you need a reliable all-rounder.
Best for: Agentic tasks, tool calling, general chat, fast responses
// ~/.config/opencode/config.json
{
"provider": "openrouter",
"model": "nvidia/nemotron-3-super-120b-a12b:free"
}
Step 3.5 Flash — stepfun/step-3.5-flash:free
StepFun's sparse MoE model activates only 11B of its 196B parameters per token, giving you fast inference with solid quality. The 256K context window handles long conversations well. A good default choice if you want speed and don't need specialized coding or reasoning.
Best for: Quick questions, brainstorming, everyday tasks, fast iteration
// ~/.config/opencode/config.json
{
"provider": "openrouter",
"model": "stepfun/step-3.5-flash:free"
}
How to Switch Models
Switching models in OpenCode takes one config change. Edit your config file and restart:
# Open your OpenCode config
nano ~/.config/opencode/config.json
Change the model field to any model ID from the table above:
{
"provider": "openrouter",
"model": "qwen/qwen3-coder:free"
}
You can also use the openrouter/free meta-model, which automatically routes to the best available free model based on your request type (text, vision, tool use):
{
"provider": "openrouter",
"model": "openrouter/free"
}
Rate limits: Free models allow roughly 20 requests per minute and 200 requests per day. If you hit limits, either wait a few minutes or switch to a different free model. Limits reset daily.
Quick Recommendations
- Just starting out? Use
nvidia/nemotron-3-super-120b-a12b:free— fast, reliable, handles everything well. - Writing code all day? Use
qwen/qwen3-coder:free— the best free coding model, period. - Working with large files? Use
qwen/qwen3.6-plus:free— 1M context window fits entire codebases. - Need thorough analysis? Use
openai/gpt-oss-120b:free— strong reasoning, near paid-model quality. - Want speed above all? Use
stepfun/step-3.5-flash:freeoropenai/gpt-oss-20b:free— fast MoE models.
For the full setup walkthrough including getting your OpenRouter API key, installing OpenCode, and connecting Obsidian, see the Free Local AI Setup Guide.