$ head -n 1
A compact routing doctrine for coding agents: planning versus acting, local versus hosted, fast versus deep, and when to split a task instead of feeding one giant context window.
$ grep -i "routing beats ranking"
Provider leaderboards are a weak routing plan for day-to-day agent work. Planning, patching, reviewing, summarizing, and broad codebase exploration all stress different parts of an agent stack.
A useful routing policy asks what the next loop needs: broad reasoning, fast patch iteration, structured JSON, careful code review, or a small classification. Then it picks the smallest capable model and context budget.
$ grep -i "plan and act"
Plan mode wants room to inspect, compare, and choose a path. Act mode wants determinism, low temperature, and a bounded file surface. Keeping those jobs separate reduces both latency and drift.
The same idea applies to local and hosted models. Local-first keeps common loops private and fast enough; hosted reasoning is reserved for situations where the extra headroom changes the outcome.
$ grep -i "failure mode"
Quality usually drops when the task is too broad and the context keeps growing until the agent becomes expensive, slow, and less disciplined. Good routing includes a stop rule: summarize state, split the work, then continue in a narrower phase.