Add Model Selection Strategy section to global instructions

2026-02-04 22:30:40 +01:00
parent dbfc3c4ddc
commit f0dae88639
1 changed files with 81 additions and 0 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -131,6 +131,87 @@ If user wants an ongoing project:

 ---

+## Model Selection Strategy
+
+**Core Principle:** Avoid context loss between models. Executing with a cheaper model and then recovering from its errors with a more expensive one usually costs more in total tokens than using the appropriate model from the start.
+
+### Decision Tree
+
+**1. Trivial, read-only task (zero risk)?**
+   - Examples: `git status`, checking if a file exists, reading a single file
+   - **Use: HAIKU**
+   - Rationale: Fastest, cheapest, no context needed, no execution risk
+
+**2. Standard task with a clear plan?**
+   - **Default: OPUS (plans AND executes in one shot)**
+   - **Rare Exception: SONNET for execution-only IF:**
+     - Plan is 100% mechanical (no decisions needed, pure step-following)
+     - AND error probability is extremely low (documented, tested system)
+     - AND the cost saving actually matters for the specific task
+   - Rationale: Opus adapts on the fly and handles surprises without re-planning overhead
+
+**3. Complex, risky, or unknown-territory task?**
+   - Examples: Host infrastructure scans, service restarts, SSH operations, debugging, anything with potential for surprises
+   - **Use: OPUS**
+   - Rationale: Lowest error rate, best context understanding, avoids costly error recovery
+
+### Why This Matters: Context Loss is Expensive
+
+**Anti-pattern: Model Switching**
+```
+Opus plans → Sonnet/Haiku executes → Reality differs → Opus must blind-debug
+= Token cost: Plan tokens + Execution tokens + Error recovery tokens
+```
+
+**Better: One-shot Opus**
+```
+Opus plans AND executes, adapts on-the-fly, handles surprises
+= Token cost: Plan tokens + Execution tokens (no re-planning overhead)
+```
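+
+The two flows above can be compared with a rough expected-cost sketch. The symbols are assumptions introduced here for illustration, not measurements: $C_{\text{plan}}$ is the planning cost, $C_{\text{opus}}$ and $C_{\text{cheap}}$ are the execution costs on Opus and on the cheaper model (with on-the-fly adaptation counted inside $C_{\text{opus}}$), $C_{\text{rec}}$ is the cost of an Opus recovery pass, and $p$ is the probability that execution diverges from the plan. All costs are assumed to be in the same price-weighted token units.
+
+```latex
+\begin{align*}
+\mathbb{E}[\text{switching}] &= C_{\text{plan}} + C_{\text{cheap}} + p \, C_{\text{rec}} \\
+\text{one-shot} &= C_{\text{plan}} + C_{\text{opus}} \\
+\text{switching is cheaper only if}\quad p \, C_{\text{rec}} &< C_{\text{opus}} - C_{\text{cheap}}
+\end{align*}
+```
+
+Under these assumptions, switching pays off only when the expected recovery cost stays below the execution savings. With hypothetical values $C_{\text{opus}} - C_{\text{cheap}} = 8$ and $C_{\text{rec}} = 40$ (in thousands of tokens), the break-even point is $p = 0.2$; a higher divergence rate favors one-shot Opus.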
+
+**Real-world execution rarely matches the plan exactly** because of:
+- Unexpected file structures
+- System state that differs from the documentation
+- Commands that succeed but produce different output than expected
+- Permission errors or authentication issues
+- Configuration differences between environments
+
+When Opus executes the plan, it can adapt in real time without:
+- Losing context between model switches
+- Re-explaining the situation to a different model
+- Incurring planning overhead again
+
+### Concrete Examples
+
+| Task | Model | Reasoning |
+|------|-------|-----------|
+| `git status` | HAIKU | Read-only, no execution risk |
+| Read a config file | HAIKU | Read-only, no execution risk |
+| Host infrastructure scan | OPUS | Complex, multiple hosts, recursive discovery, adapts to surprises |
+| Service restart | OPUS | Risk of unexpected state, error handling needed |
+| SSH operations | OPUS | Unknown system state, permission issues possible |
+| Codebase refactoring | OPUS | Multiple files, architectural decisions, error recovery critical |
+| Deployment script (well-tested) | SONNET (rare) | Only if plan is 100% mechanical AND low error risk AND cost matters |
+| Debugging a production issue | OPUS | Unknown territory, needs real-time adaptation |
+| DNS record check | HAIKU | Read-only lookup |
+| Firewall rule modification | OPUS | Complex state, multiple systems affected, documentation updates needed |
+| Running documented commands | SONNET (rare) | Only if commands are proven, output predictable, error probability <1% |
+
+### Anti-Patterns to Avoid
+
+| Anti-Pattern | Why It Fails | Result |
+|--------------|-------------|--------|
+| Use Haiku/Sonnet to "save tokens" | Failures cost more in error recovery than the cheaper model saves | ❌ False economy |
+| Plan with Opus, execute with Haiku, fix errors with Opus | Context loss between models | ❌ Most expensive option |
+| Sonnet for everything "to balance speed/cost" | No clear criteria for when Sonnet is sufficient | ❌ Inconsistent, risky |
+| Switch models mid-task because the task "looks easy" | Real execution rarely matches expectations | ❌ Context loss |
+
+### When in Doubt
+
+**Default to OPUS.** The token cost of unnecessary Opus usage is typically lower than the token cost of error recovery after a cheaper model fails. It is better to overshoot on capability than to undershoot and pay for recovery.
+
+---
+
 ## Document Structure

 ### copilot-instructions.md (Development Guidelines)