mini-coder: small models for agentic SWE research
| Model | pass@1 | pass@100 |
|---|---|---|
| Qwen 3 Coder 30B-A3B | 33.2 | 67.4 |
| mini-coder-4b | 26.8 | 60.2 |
| gpt-oss-120b | 26.0 | - |
| mini-coder-1.7b | 18.6 | 50.4 |
| SWE-agent-LM 7B | 15.2 | - |
| Qwen 3 4B Instruct 2507 | 4.0 | 25.1 |
To lower this entry barrier, we trained mini-coder: two small but performant agentic SWE models. We follow a straightforward training recipe: distillation from a larger, more capable model. We distill from Qwen 3 Coder 30B, which strikes a good balance between performance and inference cost. Using the SWE-smith dataset of GitHub issues, together with the lightweight mini-swe-agent scaffolding, we generated 400k training trajectories (~5.5B tokens). We then fine-tuned Qwen 3 1.7B and Qwen 3 4B Instruct on these trajectories.
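For a sense of scale, the numbers above imply fairly long trajectories on average (simple arithmetic from the figures stated in this paragraph; actual lengths vary per trajectory):

```python
# Back-of-envelope: average tokens per distillation trajectory,
# derived from the stated totals (400k trajectories, ~5.5B tokens).
trajectories = 400_000
total_tokens = 5.5e9
avg_tokens = total_tokens / trajectories  # roughly 13.75k tokens each
print(f"{avg_tokens:,.0f} tokens per trajectory on average")
```

At ~14k tokens per trajectory, each example spans many agent turns, which is why long-context handling matters for the fine-tuned models.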
The mini-coder models deliver SOTA performance on SWE-Bench Verified (Bash-only) at their size. Remarkably, mini-coder-4b matches the performance of the much larger gpt-oss-120b, while mini-coder-1.7b outperforms SWE-agent-LM 7B. Both models also achieve much higher pass@k than their corresponding base models. This indicates that the mini-coder models are strong candidates for RL fine-tuning, since pass@k reflects the fraction of problems from which effective supervision can be derived.
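For readers unfamiliar with the metric, pass@k is usually computed with the standard unbiased estimator from the Codex paper (we assume that convention here; the sketch below is illustrative, not our evaluation code):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: given n sampled attempts per problem,
    of which c are correct, estimate the probability that at least one
    of k attempts drawn without replacement would succeed."""
    if n - c < k:
        # Fewer than k failures exist, so any k-subset contains a success.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 10 correct out of 100 attempts:
print(pass_at_k(100, 10, 1))    # 0.1
print(pass_at_k(100, 10, 100))  # 1.0
```

Averaging this quantity over all benchmark problems gives the pass@1 and pass@100 columns in the table above.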
Unlike existing agentic SWE models, the mini-coder models can be post-trained on a single 80GB GPU, or even smaller hardware. They work seamlessly with mini-swe-agent, a lightweight, scalable, and developer-friendly agentic framework well suited for RL fine-tuning. And because they are dense rather than MoE models, they benefit from a more mature fine-tuning ecosystem. Additionally, researchers can incorporate our dataset of 400k training trajectories into their own post-training recipes. All in all, we hope that the mini-coder models will accelerate progress in agentic SWE research.
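As a rough sanity check on the single-80GB-GPU claim, a back-of-envelope memory estimate for full-parameter fine-tuning of a 4B-parameter dense model with AdamW in mixed precision (all figures below are generic assumptions, not measured numbers for mini-coder-4b):

```python
# Classic mixed-precision AdamW memory breakdown for a 4B dense model.
PARAMS = 4e9
bf16_weights = PARAMS * 2      # model weights in bf16
bf16_grads   = PARAMS * 2      # gradients in bf16
fp32_master  = PARAMS * 4      # fp32 master copy of the weights
adam_states  = PARAMS * 4 * 2  # Adam first and second moments in fp32

total_gb = (bf16_weights + bf16_grads + fp32_master + adam_states) / 1e9
print(f"~{total_gb:.0f} GB for weights/grads/optimizer")  # ~64 GB
```

That leaves roughly 16 GB of an 80GB card for activations (with gradient checkpointing and a modest batch size), whereas the same arithmetic for a 30B-scale model would require multiple GPUs or heavier sharding.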