Add Model Selection Strategy section to global instructions

2026-02-04 22:30:40 +01:00
parent dbfc3c4ddc
commit f0dae88639
1 changed files with 81 additions and 0 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -131,6 +131,87 @@ If user wants an ongoing project:
 ---
 ## Model Selection Strategy
 **Core Principle:** Avoid context loss between models. Executing with a cheaper model and then recovering from errors with a more expensive one typically costs more in total tokens than using the appropriate model from the start.
 ### Decision Tree
 **1. Trivial, read-only tasks (zero risk)?**
   - Examples: `git status`, checking if a file exists, reading a single file
   - **Use: HAIKU**
   - Rationale: Fastest, cheapest, no context needed, no execution risk
 **2. Standard task with a clear plan?**
   - **Default: OPUS (plans AND executes in one shot)**
   - **Rare Exception: SONNET for execution-only IF:**
     - Plan is 100% mechanical (no decisions needed, pure step-following)
     - AND error probability is extremely low (documented, tested system)
     - AND the cost saving actually matters for the specific task
   - Rationale: Opus adapts on-the-fly, handles surprises without re-planning overhead
 **3. Complex, risky, or unknown-territory tasks?**
   - Examples: Host infrastructure scans, service restarts, SSH operations, debugging, anything with potential for surprises
   - **Use: OPUS**
   - Rationale: Lowest error rate, best context understanding, avoids costly error recovery
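The decision tree above can be sketched as a small function. This is only an illustration under stated assumptions: the `Task` fields and the `select_model` name are hypothetical and not part of any tool or API, and each flag is assumed to be judged by whoever classifies the task.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical task description; every field name is illustrative only."""
    trivial: bool = False                # single command or single-file read
    read_only: bool = False              # no side effects (e.g. `git status`)
    risky_or_unknown: bool = False       # restarts, SSH, debugging, surprises likely
    plan_is_mechanical: bool = False     # pure step-following, no decisions
    low_error_probability: bool = False  # documented, tested system
    cost_matters: bool = False           # the saving matters for this task


def select_model(task: Task) -> str:
    """Return the model tier suggested by the decision tree."""
    # Branch 1: trivial, read-only, zero-risk work goes to the cheapest model.
    if task.trivial and task.read_only and not task.risky_or_unknown:
        return "HAIKU"
    # Branch 3: complex, risky, or unknown-territory work always goes to Opus.
    if task.risky_or_unknown:
        return "OPUS"
    # Branch 2, rare exception: Sonnet only if all three conditions hold.
    if (task.plan_is_mechanical
            and task.low_error_probability
            and task.cost_matters):
        return "SONNET"
    # Branch 2, default: Opus plans and executes in one shot.
    return "OPUS"
```

With this sketch, `select_model(Task())` falls through to `"OPUS"`, which mirrors the rule that Opus is the default whenever none of the narrower conditions clearly apply.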
 ### Why This Matters: Context Loss is Expensive
 **Anti-pattern: Model Switching**
 ```
 Opus plans → Sonnet/Haiku executes → Reality differs → Opus must blind-debug
 = Token cost: Plan tokens + Execution tokens + Error recovery tokens
 ```
 **Better: One-shot Opus**
 ```
 Opus plans AND executes, adapts on-the-fly, handles surprises
 = Token cost: Plan tokens + Execution tokens (no re-planning overhead)
 ```
 **Real-world execution rarely matches plans exactly** because of:
 - Unexpected file structures
 - System state that differs from the documentation
 - Commands that succeed but produce unexpected output
 - Permission errors or authentication issues
 - Configuration differences between environments
 When Opus executes the plan, it can adapt in real time without:
 - Losing context between model switches
 - Re-explaining the situation to a different model
 - Incurring planning overhead again
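The cost comparison in this section can be written as a rough expected-cost calculation. The sketch below assumes all costs are expressed in one common unit (for example, price-weighted tokens) and follows the simplification used above that one-shot execution has no recovery term. The function names and every number in the usage note are hypothetical placeholders, not measurements.

```python
def switching_cost(plan: float, cheap_exec: float,
                   recovery: float, p_fail: float) -> float:
    """Expected cost when Opus plans and a cheaper model executes.

    p_fail is the assumed probability that execution diverges from the
    plan and Opus has to step back in for error recovery.
    """
    return plan + cheap_exec + p_fail * recovery


def one_shot_cost(plan: float, opus_exec: float) -> float:
    """Cost when Opus plans and executes without a hand-off."""
    return plan + opus_exec


def break_even_failure_rate(opus_exec: float, cheap_exec: float,
                            recovery: float) -> float:
    """Failure probability above which model switching costs more.

    Derived from: plan + cheap_exec + p * recovery > plan + opus_exec.
    """
    return (opus_exec - cheap_exec) / recovery
```

As a usage example with placeholder inputs, `switching_cost(2000, 1000, 8000, 0.5)` gives `7000.0`, `one_shot_cost(2000, 3000)` gives `5000`, and `break_even_failure_rate(3000, 1000, 8000)` gives `0.25`. Under those assumed numbers, switching only pays off if fewer than one in four executions needs recovery.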
 ### Concrete Examples
 | Task | Model | Reasoning |
 |------|-------|-----------|
 | `git status` | HAIKU | Read-only, no execution risk |
 | Read a config file | HAIKU | Read-only, no execution risk |
 | Host infrastructure scan | OPUS | Complex, multiple hosts, recursive discovery, adapts to surprises |
 | Service restart | OPUS | Risk of unexpected state, error handling needed |
 | SSH operations | OPUS | Unknown system state, permission issues possible |
 | Codebase refactoring | OPUS | Multiple files, architectural decisions, error recovery critical |
 | Deployment script (well-tested) | SONNET (rare) | Only if plan is 100% mechanical AND low error risk AND cost matters |
 | Debugging a production issue | OPUS | Unknown territory, needs real-time adaptation |
 | DNS record check | HAIKU | Read-only lookup |
 | Firewall rule modification | OPUS | Complex state, multiple systems affected, documentation updates needed |
 | Running documented commands | SONNET (rare) | Only if commands are proven, output predictable, error probability <1% |
 ### Anti-Patterns to Avoid
 | Anti-Pattern | Why It Fails | Cost |
 |--------------|-------------|------|
 | Use Haiku/Sonnet to "save tokens" | Failures cost more in error recovery | ❌ False economy |
 | Plan with Opus, execute with Haiku, fix errors with Opus | Context loss between models | ❌ Most expensive option |
 | Sonnet for everything "to balance speed/cost" | No clear criterion for when Sonnet is sufficient | ❌ Inconsistent, risky |
 | Switch models mid-task based on "looking easy" | Real execution rarely matches expectations | ❌ Context loss |
 ### When in Doubt
 **Default to OPUS.** The token cost of unnecessary Opus usage is typically lower than the token cost of error recovery with a cheaper model. It is better to overshoot on capability than to undershoot and pay for recovery.
 ---
 ## Document Structure
 ### copilot-instructions.md (Development Guidelines)