Add Model Selection Strategy section to global instructions
CLAUDE.md
@@ -131,6 +131,87 @@ If user wants an ongoing project:

---

## Model Selection Strategy

**Core Principle:** Avoid context loss between models. Executing with a cheaper model and then handing error recovery to a more expensive one usually costs more total tokens than using the appropriate model from the start.

### Decision Tree

**1. Trivial, read-only tasks (zero risk)?**
- Examples: `git status`, checking if a file exists, reading a single file
- **Use: HAIKU**
- Rationale: Fastest, cheapest, no context needed, no execution risk

**2. Standard task with a clear plan?**
- **Default: OPUS (plans AND executes in one shot)**
- **Rare Exception: SONNET for execution-only IF:**
  - Plan is 100% mechanical (no decisions needed, pure step-following)
  - AND error probability is extremely low (documented, tested system)
  - AND the cost saving actually matters for the specific task
- Rationale: Opus adapts on-the-fly, handles surprises without re-planning overhead

**3. Complex, risky, or unknown-territory tasks?**
- Examples: Host infrastructure scans, service restarts, SSH operations, debugging, anything with potential for surprises
- **Use: OPUS**
- Rationale: Lowest error rate, best context understanding, avoids costly error recovery

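The decision tree above can be sketched as a small function. This is only an illustration: the `Task` flags and the `select_model` name are hypothetical and not part of any real tooling, and reducing each question to a boolean is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Model(Enum):
    HAIKU = "haiku"
    SONNET = "sonnet"
    OPUS = "opus"


@dataclass(frozen=True)
class Task:
    # Hypothetical flags, introduced only for this illustration.
    trivial: bool = False
    read_only: bool = False
    fully_mechanical: bool = False
    low_error_probability: bool = False
    cost_sensitive: bool = False


def select_model(task: Task) -> Model:
    # Question 1: trivial, read-only tasks carry no execution risk.
    if task.trivial and task.read_only:
        return Model.HAIKU
    # Question 2, rare exception: all three conditions must hold together.
    if task.fully_mechanical and task.low_error_probability and task.cost_sensitive:
        return Model.SONNET
    # Question 2 default, question 3, and any case of doubt.
    return Model.OPUS
```

For example, `select_model(Task(trivial=True, read_only=True))` yields `Model.HAIKU`, while a task with no flags set falls through to `Model.OPUS`.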
### Why This Matters: Context Loss is Expensive

**Anti-pattern: Model Switching**
```
Opus plans → Sonnet/Haiku executes → Reality differs → Opus must blind-debug
= Token cost: Plan tokens + Execution tokens + Error recovery tokens
```

**Better: One-shot Opus**
```
Opus plans AND executes, adapts on-the-fly, handles surprises
= Token cost: Plan tokens + Execution tokens (no re-planning overhead)
```

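The two cost formulas above can be compared with a few lines of arithmetic. The token counts below are made-up placeholders, not measurements, and the sketch ignores per-token price differences between models; `TokenBudget`, `switching_cost`, and `one_shot_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenBudget:
    # Hypothetical token counts for one task (illustrative values only).
    plan: int
    execution: int
    error_recovery: int


def switching_cost(b: TokenBudget) -> int:
    # Plan with one model, execute with another, then recover from the mismatch.
    return b.plan + b.execution + b.error_recovery


def one_shot_cost(b: TokenBudget) -> int:
    # The same model plans and executes, so no recovery pass is counted.
    return b.plan + b.execution


budget = TokenBudget(plan=2_000, execution=5_000, error_recovery=6_000)
print(switching_cost(budget))  # 13000
print(one_shot_cost(budget))   # 7000
```

The gap between the two totals is exactly the error-recovery term, which is why the comparison depends on how often recovery is actually needed.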
**Real-world execution rarely matches plans exactly.** Common causes:
- Unexpected file structures
- System state that differs from the documentation
- Commands that succeed but produce different output than expected
- Permission errors or authentication issues
- Configuration differences between environments

When Opus executes its own plan, it can adapt in real time without:
- Losing context between model switches
- Re-explaining the situation to a different model
- Incurring planning overhead again

### Concrete Examples

| Task | Model | Reasoning |
|------|-------|-----------|
| `git status` | HAIKU | Read-only, no execution risk |
| Read a config file | HAIKU | Read-only, no execution risk |
| Host infrastructure scan | OPUS | Complex, multiple hosts, recursive discovery, adapts to surprises |
| Service restart | OPUS | Risk of unexpected state, error handling needed |
| SSH operations | OPUS | Unknown system state, permission issues possible |
| Codebase refactoring | OPUS | Multiple files, architectural decisions, error recovery critical |
| Deployment script (well-tested) | SONNET (rare) | Only if plan is 100% mechanical AND low error risk AND cost matters |
| Debugging a production issue | OPUS | Unknown territory, needs real-time adaptation |
| DNS record check | HAIKU | Read-only lookup |
| Firewall rule modification | OPUS | Complex state, multiple systems affected, documentation updates needed |
| Running documented commands | SONNET (rare) | Only if commands are proven, output predictable, error probability <1% |

### Anti-Patterns to Avoid

| Anti-Pattern | Why It Fails | Cost |
|--------------|-------------|------|
| Use Haiku/Sonnet to "save tokens" | Failures cost more in error recovery | ❌ False economy |
| Plan with Opus, execute with Haiku, fix errors with Opus | Context loss between models | ❌ Most expensive option |
| Sonnet for everything "to balance speed/cost" | Unclear when it's appropriate | ❌ Inconsistent, risky |
| Switch models mid-task based on "looking easy" | Real execution rarely matches expectations | ❌ Context loss |

### When in Doubt

**Default to OPUS.** The token cost of using Opus unnecessarily is typically lower than the token cost of recovering from errors made by a cheaper model. It is better to overshoot on capability than to undershoot and pay for recovery.

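One rough way to state this default as an inequality, under assumptions that are not in the original: let $C_O$ be the token cost of Opus handling the task alone, $C_c$ the cost of the cheaper model's attempt, $p$ the probability that the attempt fails, and $C_r > 0$ the cost of the Opus recovery pass. Treating the Opus-only path as needing no recovery (a simplification), Opus is the cheaper choice in expectation when:

$$
C_O < C_c + p\,C_r \quad\Longleftrightarrow\quad p > \frac{C_O - C_c}{C_r}
$$

All symbols here are illustrative. The takeaway is only that the more expensive recovery is relative to the upfront saving, the lower the failure probability at which defaulting to Opus pays off.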
---

## Document Structure

### copilot-instructions.md (Development Guidelines)