
Contextune Haiku Agent Architecture

Version: 0.3.0 (Agent-Enhanced)
Date: 2025-10-21
Status: 🚀 Revolutionary Cost Optimization


🎯 Strategic Insight: The Three-Tier Intelligence Model

Current Problem (v0.2.0)

All execution happens in the main Sonnet conversation:

Main Agent (Sonnet 4.5):
├─ Planning (necessary - complex)
├─ Spawning subagents via Task tool (EXPENSIVE!)
│   ├─ Subagent 1 (Sonnet) - Creates issue, worktree, executes
│   ├─ Subagent 2 (Sonnet) - Creates issue, worktree, executes
│   └─ Subagent 3 (Sonnet) - Creates issue, worktree, executes
└─ All subagents use Sonnet context!

Rough cost for 5 parallel tasks (input tokens only):
- Main agent: ~10K tokens (Sonnet) = $0.03
- 5 subagents: ~50K tokens each (Sonnet) = 5 × $0.15 = $0.75
Total: $0.78 per parallel workflow

Revolutionary Solution (v0.3.0)

Hybrid Intelligence: Sonnet orchestrates, Haiku executes:

Main Agent (Sonnet 4.5):
├─ Planning (complex reasoning)
├─ Orchestration (decision-making)
└─ Delegates to Haiku agents ⚡

Haiku Agents (separate contexts):
├─ parallel-task-executor (Haiku) - Autonomous execution
├─ worktree-manager (Haiku) - Git operations
├─ issue-orchestrator (Haiku) - GitHub management
├─ test-runner (Haiku) - Test execution
└─ performance-analyzer (Haiku) - Benchmarking

Rough cost for 5 parallel tasks (input tokens only):
- Main agent: ~10K tokens (Sonnet) = $0.03
- 5 Haiku agents: ~30K tokens each (Haiku) = 5 × $0.024 = $0.12
Total: $0.15 per parallel workflow

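Savings: 81% cost reduction! 🎉 (With output tokens included, the Cost Analysis below arrives at $1.40 vs. $0.27 per workflow, about 80%.)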


💰 Cost Analysis

Per-Token Pricing (Claude API)

| Model | Input ($/MTok) | Output ($/MTok) | Use Case |
|---|---|---|---|
| Sonnet 4.5 | $3.00 | $15.00 | Complex reasoning, planning, orchestration |
| Haiku 4.5 | $0.80 | $4.00 | Execution, testing, reporting |
| Savings | 73% | 73% | Use Haiku for 80% of work! |

Typical Parallel Workflow (5 Tasks)

Current (All Sonnet):

Main Agent Planning:
- Input: 8K tokens × $3.00/MTok = $0.024
- Output: 2K tokens × $15.00/MTok = $0.030
Subtotal: $0.054

5 Subagents Execution (Sonnet):
- Input per agent: 40K tokens × $3.00/MTok = $0.120
- Output per agent: 10K tokens × $15.00/MTok = $0.150
- Per agent: $0.270
- Total 5 agents: $1.350

Total Cost: $1.404 per workflow

Optimized (Sonnet + Haiku):

Main Agent Planning (Sonnet):
- Input: 8K tokens × $3.00/MTok = $0.024
- Output: 2K tokens × $15.00/MTok = $0.030
Subtotal: $0.054

5 Haiku Agents Execution:
- Input per agent: 30K tokens × $0.80/MTok = $0.024
- Output per agent: 5K tokens × $4.00/MTok = $0.020
- Per agent: $0.044
- Total 5 agents: $0.220

Total Cost: $0.274 per workflow

Savings: $1.13 per workflow (80% reduction!)
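
The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch that takes the token counts and per-MTok prices from this section as given; the names `Pricing` and `workflow_cost` are illustrative and not part of Contextune.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Price in USD per million tokens (MTok), as listed in the pricing table."""
    input_per_mtok: float
    output_per_mtok: float

    def cost(self, input_tokens: int, output_tokens: int) -> float:
        total = input_tokens * self.input_per_mtok + output_tokens * self.output_per_mtok
        return total / 1_000_000


SONNET = Pricing(input_per_mtok=3.00, output_per_mtok=15.00)
HAIKU = Pricing(input_per_mtok=0.80, output_per_mtok=4.00)


def workflow_cost(agent_pricing: Pricing, agents: int,
                  agent_input: int, agent_output: int) -> float:
    """Sonnet planning (8K in, 2K out) plus `agents` execution agents."""
    planning = SONNET.cost(8_000, 2_000)
    execution = agents * agent_pricing.cost(agent_input, agent_output)
    return planning + execution


all_sonnet = workflow_cost(SONNET, agents=5, agent_input=40_000, agent_output=10_000)
hybrid = workflow_cost(HAIKU, agents=5, agent_input=30_000, agent_output=5_000)
reduction = 1 - hybrid / all_sonnet

print(f"all-Sonnet ${all_sonnet:.3f}, hybrid ${hybrid:.3f}, reduction {reduction:.1%}")
```

Part of the saving comes from the lower per-token price and part from the smaller token budget assumed for each Haiku agent (30K in and 5K out, versus 40K in and 10K out), so the result is sensitive to both assumptions.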

Projected Annual Savings

Assuming:

  • 100 parallel workflows per month
  • 1,200 workflows per year

Current cost: 1,200 × $1.40 = $1,680/year

Optimized cost: 1,200 × $0.27 = $324/year

Annual savings: $1,356 (81% reduction!)
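
To rerun the projection for a different workload, a small helper is enough. This is a sketch under the same assumptions as above (rounded per-workflow costs of $1.40 and $0.27); `annual_projection` is a hypothetical name.

```python
def annual_projection(cost_before: float, cost_after: float,
                      workflows_per_month: int) -> dict:
    """Project yearly cost and savings from per-workflow costs."""
    workflows_per_year = 12 * workflows_per_month
    before = workflows_per_year * cost_before
    after = workflows_per_year * cost_after
    return {
        "before": before,
        "after": after,
        "savings": before - after,
        "reduction": 1 - after / before,
    }


projection = annual_projection(cost_before=1.40, cost_after=0.27,
                               workflows_per_month=100)
print({key: round(value, 3) for key, value in projection.items()})
```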


🏗️ Architecture Overview

Three-Tier Intelligence Model

┌──────────────────────────────────────────────────┐
│  Tier 1: SKILLS (Sonnet - Main Context)         │
│  ├─ parallel-development-expert                 │
│  ├─ intent-recognition                          │
│  ├─ git-worktree-master                         │
│  └─ performance-optimizer                       │
│                                                  │
│  Purpose: Autonomous guidance & teaching        │
│  Model: Sonnet 4.5 (part of main conversation)  │
│  Cost: Minimal (educational value)              │
└──────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────┐
│  Tier 2: ORCHESTRATION (Sonnet - Main Agent)    │
│  ├─ Planning & decomposition                    │
│  ├─ Complex decision-making                     │
│  ├─ Conflict resolution                         │
│  └─ Agent coordination                          │
│                                                  │
│  Purpose: High-level intelligence               │
│  Model: Sonnet 4.5                              │
│  Cost: ~$0.05 per workflow                      │
└──────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────┐
│  Tier 3: EXECUTION (Haiku - Agents)             │
│  ├─ parallel-task-executor (Haiku)              │
│  ├─ worktree-manager (Haiku)                    │
│  ├─ issue-orchestrator (Haiku)                  │
│  ├─ test-runner (Haiku)                         │
│  └─ performance-analyzer (Haiku)                │
│                                                  │
│  Purpose: Autonomous execution                  │
│  Model: Haiku 4.5 (isolated contexts)           │
│  Cost: ~$0.04 per agent                         │
└──────────────────────────────────────────────────┘

Key Principles

  1. Sonnet for Thinking, Haiku for Doing
     • Complex reasoning → Sonnet
     • Repetitive execution → Haiku
     • Teaching & guidance → Sonnet (Skills)

  2. Context Isolation
     • Each Haiku agent has its own context
     • No pollution of main conversation
     • Clean, focused execution

  3. Cost Optimization
     • 80% of work done by Haiku
     • 20% of work done by Sonnet
     • 81% cost reduction overall

  4. Performance Preservation
     • Haiku 4.5 is fast (~2x faster than Sonnet)
     • Parallel execution still works
     • Context window benefits (200K)

🤖 Haiku Agents Design

1. parallel-task-executor

Purpose: Autonomous execution of independent tasks in parallel

Model: haiku

Capabilities:

  • Creates own GitHub issue
  • Creates own git worktree
  • Executes task independently
  • Runs tests
  • Pushes changes
  • Reports completion

Tool Access:

allowed-tools:
  - Bash      # Git, npm, etc.
  - Read      # Read files
  - Write     # Write code
  - Edit      # Modify code
  - Grep      # Search
  - Glob      # Find files

Why Haiku:

  • Repetitive, well-defined workflow
  • No complex decision-making needed
  • Fast execution required
  • Cost-sensitive (runs 3-10 concurrently)

Cost per agent: ~$0.04 (vs $0.27 for Sonnet)
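
As an illustration of how such a definition might be written to `.claude/agents/`, the sketch below renders a Markdown file with a YAML-style header. The header keys (`name`, `description`, `model`, `allowed-tools`) are an assumption modelled on the listings in this document; check the Claude Code subagent documentation for the keys it actually reads. `AgentSpec` and `render_agent_file` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class AgentSpec:
    name: str
    model: str
    description: str
    allowed_tools: list[str] = field(default_factory=list)


def render_agent_file(spec: AgentSpec) -> str:
    """Render an agent definition; the frontmatter layout is an assumption."""
    lines = [
        "---",
        f"name: {spec.name}",
        f"description: {spec.description}",
        f"model: {spec.model}",
        "allowed-tools:",
    ]
    lines += [f"  - {tool}" for tool in spec.allowed_tools]
    lines += ["---", "", f"# {spec.name}", "", spec.description, ""]
    return "\n".join(lines)


executor = AgentSpec(
    name="parallel-task-executor",
    model="haiku",
    description="Autonomous execution of independent tasks in parallel",
    allowed_tools=["Bash", "Read", "Write", "Edit", "Grep", "Glob"],
)
agent_markdown = render_agent_file(executor)
print(agent_markdown)
```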


2. worktree-manager

Purpose: Specialized git worktree operations

Model: haiku

Capabilities:

  • Creates worktrees
  • Diagnoses worktree issues
  • Cleans up completed worktrees
  • Handles locks and conflicts
  • Prunes stale references

Tool Access:

allowed-tools:
  - Bash      # Git commands only
  - Read      # Diagnostic reading
  - Grep      # Find issues

Why Haiku:

  • Well-defined git operations
  • No complex reasoning needed
  • Fast diagnostic + fix workflow
  • Cost-sensitive (called frequently)

Cost per operation: ~$0.02


3. issue-orchestrator

Purpose: GitHub issue creation and management

Model: haiku

Capabilities:

  • Creates issues with templates
  • Updates issue status
  • Adds labels
  • Links to PRs
  • Closes completed issues

Tool Access:

allowed-tools:
  - Bash      # gh CLI only
  - Read      # Read templates

Why Haiku:

  • Templated issue creation
  • Repetitive operations
  • No decision-making needed
  • Cost-sensitive (many issues)

Cost per issue: ~$0.01


4. test-runner

Purpose: Autonomous test execution and reporting

Model: haiku

Capabilities:

  • Runs test suites
  • Collects test results
  • Creates GitHub issues for failures
  • Benchmarks performance
  • Reports coverage

Tool Access:

allowed-tools:
  - Bash      # Test commands
  - Read      # Read test files
  - Write     # Write reports

Why Haiku:

  • Repetitive test execution
  • Well-defined reporting format
  • Fast execution critical
  • Cost-sensitive (run frequently)

Cost per test run: ~$0.03


5. performance-analyzer

Purpose: Benchmark and analyze workflow performance

Model: haiku

Capabilities:

  • Measures timing
  • Identifies bottlenecks
  • Calculates metrics
  • Generates reports
  • Compares to baselines

Tool Access:

allowed-tools:
  - Bash      # Timing commands
  - Read      # Read results
  - Write     # Write reports

Why Haiku:

  • Data collection and analysis
  • Repetitive benchmarking
  • No complex reasoning needed
  • Cost-sensitive (run frequently)

Cost per analysis: ~$0.02
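
The tool-access lists for the five agents can be collected in one table and checked before an agent is granted a tool. The following is a minimal sketch of such a least-privilege check; the tool names come from the listings above, while `AGENT_TOOLS` and `check_tool_request` are hypothetical names, not part of Claude Code or Contextune.

```python
AGENT_TOOLS: dict[str, frozenset[str]] = {
    "parallel-task-executor": frozenset({"Bash", "Read", "Write", "Edit", "Grep", "Glob"}),
    "worktree-manager": frozenset({"Bash", "Read", "Grep"}),
    "issue-orchestrator": frozenset({"Bash", "Read"}),
    "test-runner": frozenset({"Bash", "Read", "Write"}),
    "performance-analyzer": frozenset({"Bash", "Read", "Write"}),
}


def check_tool_request(agent: str, requested: set[str]) -> set[str]:
    """Return the requested tools that the agent is NOT allowed to use."""
    try:
        allowed = AGENT_TOOLS[agent]
    except KeyError:
        raise ValueError(f"unknown agent: {agent}") from None
    return set(requested) - allowed


# issue-orchestrator may run Bash (gh CLI) but has no Write access.
denied = check_tool_request("issue-orchestrator", {"Bash", "Write"})
print(denied)
```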


📊 Performance Comparison

Execution Speed

| Agent Type | Model | Avg Response Time |
|---|---|---|
| Complex reasoning | Sonnet 4.5 | 3-5s |
| Simple execution | Haiku 4.5 | 1-2s |
| Speedup | Haiku | ~2x faster |

Context Window

Both models have 200K context windows, so moving execution to Haiku imposes no context-size limitation.

Quality

When Haiku is appropriate:

  • ✅ Well-defined tasks
  • ✅ Templated operations
  • ✅ Repetitive workflows
  • ✅ Simple decision trees

When Sonnet is needed:

  • ⚠️ Complex reasoning
  • ⚠️ Ambiguous requirements
  • ⚠️ Creative problem-solving
  • ⚠️ Multi-step planning


🔄 Updated Parallel Execution Workflow

Before (All Sonnet)

User: "Work on auth, dashboard, analytics in parallel"
Main Agent (Sonnet):
├─ Plans tasks
└─ Spawns 3 subagents via Task tool
Subagent 1 (Sonnet): Full autonomous execution
Subagent 2 (Sonnet): Full autonomous execution
Subagent 3 (Sonnet): Full autonomous execution

Cost: ~$1.40
Time: ~3 hours work time + setup

After (Sonnet + Haiku Hybrid)

User: "Work on auth, dashboard, analytics in parallel"
Skill: parallel-development-expert (Sonnet)
├─ Analyzes tasks
├─ Recommends parallelization
└─ Quantifies savings
User: "Yes, do it"
Main Agent (Sonnet):
├─ Creates parallel execution plan
├─ Validates independence
└─ Delegates to Haiku agents
Agent: parallel-task-executor (Haiku) × 3 instances
├─ Instance 1: Auth task
├─ Instance 2: Dashboard task
└─ Instance 3: Analytics task

Each Haiku agent:
1. Creates GitHub issue (via issue-orchestrator Haiku)
2. Creates worktree (via worktree-manager Haiku)
3. Executes task autonomously
4. Runs tests (via test-runner Haiku)
5. Reports completion

Cost: ~$0.27 (81% savings!)
Time: ~3 hours work time + setup (same)
Quality: Same (Haiku is well suited to execution)
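
The fan-out step of this workflow can be sketched with `asyncio`. The code below only simulates the agents: `run_haiku_agent` is a hypothetical stand-in that walks through the five steps listed above without calling any model, git, or GitHub, and `orchestrate` plays the role of the main agent's delegation.

```python
import asyncio

STEPS = ["create issue", "create worktree", "execute task", "run tests", "report completion"]


async def run_haiku_agent(task: str) -> dict:
    """Hypothetical stand-in for one parallel-task-executor instance."""
    completed = []
    for step in STEPS:
        await asyncio.sleep(0)  # placeholder for the real, I/O-bound work
        completed.append(step)
    return {"task": task, "status": "done", "steps": completed}


async def orchestrate(tasks: list[str]) -> list[dict]:
    """Start one agent per task and wait for all of them to finish."""
    return await asyncio.gather(*(run_haiku_agent(task) for task in tasks))


results = asyncio.run(orchestrate(["auth", "dashboard", "analytics"]))
for result in results:
    print(result["task"], result["status"])
```

`asyncio.gather` returns results in the order the tasks were submitted, which keeps the completion report aligned with the original plan even when agents finish at different times.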

🎯 Decision Matrix: When to Use Which Model

| Task Type | Complexity | Model | Why |
|---|---|---|---|
| Planning | High | Sonnet | Requires complex reasoning |
| Guidance | High | Sonnet | Educational, teaching |
| Orchestration | High | Sonnet | Coordination, decisions |
| Execution | Low | Haiku | Repetitive, well-defined |
| Testing | Low | Haiku | Automated, templated |
| Reporting | Low | Haiku | Data collection |
| Git Operations | Low | Haiku | Simple commands |
| Issue Creation | Low | Haiku | Templated |
| Conflict Resolution | High | Sonnet | Requires judgment |
| Architecture Design | High | Sonnet | Creative problem-solving |

Rule of Thumb:

  • If the task can be described in a template → Haiku
  • If the task requires "figuring it out" → Sonnet
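
The decision matrix and rule of thumb translate directly into a lookup with a fallback. This is a sketch only: `choose_model` and the `templated` flag are hypothetical, and falling back to Sonnet for unknown, non-templated work is an assumption made here to err on the side of quality.

```python
MODEL_BY_TASK_TYPE = {
    "planning": "sonnet",
    "guidance": "sonnet",
    "orchestration": "sonnet",
    "execution": "haiku",
    "testing": "haiku",
    "reporting": "haiku",
    "git operations": "haiku",
    "issue creation": "haiku",
    "conflict resolution": "sonnet",
    "architecture design": "sonnet",
}


def choose_model(task_type: str, templated: bool = False) -> str:
    """Pick a model from the decision matrix, else apply the rule of thumb."""
    key = task_type.strip().lower()
    if key in MODEL_BY_TASK_TYPE:
        return MODEL_BY_TASK_TYPE[key]
    # Unlisted task type: templated work goes to Haiku, everything else to Sonnet.
    return "haiku" if templated else "sonnet"


print(choose_model("Git Operations"))
print(choose_model("changelog update", templated=True))
```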


🚀 Implementation Strategy

Phase 1: Create Haiku Agents (Week 1)

  1. parallel-task-executor (highest impact)
     • Replaces current Task tool subagents
     • 81% cost reduction
     • Test with 2-3 tasks first

  2. worktree-manager (high frequency)
     • Handles all git worktree operations
     • Called by parallel-task-executor
     • Reduces main agent pollution

  3. issue-orchestrator (high frequency)
     • Creates/updates all GitHub issues
     • Called by parallel-task-executor
     • Consistent formatting

Phase 2: Integrate with Skills (Week 2)

  1. Update parallel-development-expert skill
     • Recommends Haiku agent usage
     • Explains cost savings to users
     • Delegates to agents after planning

  2. Update performance-optimizer skill
     • Uses performance-analyzer Haiku agent
     • Benchmarks with minimal cost
     • Reports to user via main conversation

  3. Update git-worktree-master skill
     • Delegates to worktree-manager Haiku
     • Focuses on guidance in main conversation
     • Agent handles execution

Phase 3: Advanced Agents (Week 3)

  1. test-runner (testing workflows)
     • Autonomous test execution
     • Issue creation for failures
     • Performance benchmarking

  2. merge-coordinator (Sonnet!)
     • Complex merge conflict resolution
     • Decision-making for integration
     • Uses Sonnet for reasoning

Phase 4: Optimization (Week 4)

  • Monitor cost savings
  • Measure performance
  • Gather user feedback
  • Tune agent prompts
  • Add more specialized agents

📈 Expected Impact

Cost Savings

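| Workflow | Current Cost | New Cost | Savings |
|---|---|---|---|
| 3 parallel tasks | $0.84 | $0.16 | 81% |
| 5 parallel tasks | $1.40 | $0.27 | 81% |
| 10 parallel tasks | $2.80 | $0.54 | 81% |
| Annual (1,200 workflows) | $1,680 | $324 | $1,356 |

The 3-task and 10-task rows are scaled linearly from the 5-task estimate.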

Performance Improvements

  • 2x faster execution (Haiku response time)
  • Cleaner main context (agents isolated)
  • Better debugging (agent logs separate)
  • Same quality (Haiku perfect for execution)

User Experience

  • Transparent cost savings (show users the savings!)
  • Faster responses (Haiku is quick)
  • More parallelization (cost no longer prohibitive)
  • Better explanations (main agent focused on guidance)

🎓 Best Practices

1. Agent Design

DO:

  • ✅ Single responsibility per agent
  • ✅ Well-defined inputs/outputs
  • ✅ Minimal tool access
  • ✅ Clear success criteria
  • ✅ Explicit error handling

DON'T:

  • ❌ Create mega-agents
  • ❌ Give unnecessary tool access
  • ❌ Assume complex reasoning
  • ❌ Skip validation

2. Model Selection

Use Haiku for:

  • Repetitive tasks
  • Well-defined workflows
  • Templated operations
  • Data collection
  • Simple decision trees

Use Sonnet for:

  • Complex planning
  • Creative problem-solving
  • Teaching & guidance
  • Conflict resolution
  • Architecture decisions

3. Cost Optimization

Minimize token usage:

  • Short, focused prompts
  • Clear instructions
  • Avoid unnecessary context
  • Use templates
  • Cache common patterns

Batch operations:

  • Group related tasks
  • Reuse agent instances
  • Minimize agent spawning
  • Consolidate reporting


🔒 Safety & Quality

Agent Validation

Each agent must:

  • Validate inputs
  • Handle errors gracefully
  • Report failures clearly
  • Never make assumptions
  • Ask for clarification when uncertain
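
Input validation with clear failure reports might look like the sketch below. The task fields (`title`, `branch`, `test_command`) are hypothetical, chosen only to illustrate the pattern; the document does not define the actual task schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRequest:
    title: str
    branch: str
    test_command: str


def validate_task(raw: dict) -> TaskRequest:
    """Validate a raw task description; raise ValueError listing every problem."""
    problems = []
    for key in ("title", "branch", "test_command"):
        value = raw.get(key)
        if not isinstance(value, str) or not value.strip():
            problems.append(f"missing or empty field: {key}")
    branch = raw.get("branch")
    if isinstance(branch, str) and " " in branch.strip():
        problems.append("branch name must not contain spaces")
    if problems:
        # Report all problems at once instead of guessing at a fix.
        raise ValueError("; ".join(problems))
    return TaskRequest(raw["title"].strip(), raw["branch"].strip(),
                       raw["test_command"].strip())


task = validate_task({"title": "Add auth", "branch": "feature/auth",
                      "test_command": "npm test"})
print(task)
```

Collecting every problem before raising lets the orchestrator ask the user one clarifying question rather than several in sequence.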

Quality Assurance

Testing:

  • Unit test each agent independently
  • Integration test agent interactions
  • Benchmark performance
  • Monitor cost in production
  • Track error rates

Monitoring:

  • Log all agent executions
  • Track cost per agent
  • Measure success rates
  • Identify failure patterns
  • Optimize based on data
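
A minimal sketch of the per-agent bookkeeping described above, assuming each execution is logged as a record with the agent name, its cost, and whether it succeeded. `AgentRun` and `summarize` are hypothetical names, and the sample costs reuse the per-run estimates from the agent descriptions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRun:
    agent: str
    cost_usd: float
    succeeded: bool


def summarize(runs: list[AgentRun]) -> dict:
    """Aggregate run count, total cost, and success rate per agent."""
    totals = defaultdict(lambda: {"runs": 0, "cost_usd": 0.0, "successes": 0})
    for run in runs:
        entry = totals[run.agent]
        entry["runs"] += 1
        entry["cost_usd"] += run.cost_usd
        entry["successes"] += int(run.succeeded)
    return {
        agent: {
            "runs": entry["runs"],
            "cost_usd": round(entry["cost_usd"], 4),
            "success_rate": entry["successes"] / entry["runs"],
        }
        for agent, entry in totals.items()
    }


runs = [
    AgentRun("test-runner", 0.03, True),
    AgentRun("test-runner", 0.03, False),
    AgentRun("worktree-manager", 0.02, True),
]
summary = summarize(runs)
print(summary)
```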


📚 Migration Guide

From v0.2.0 (Skills Only) to v0.3.0 (Skills + Agents)

No breaking changes! Agents are additive.

What changes:

  1. Parallel execution now uses Haiku agents
  2. Cost drops by 81%
  3. Execution speed increases 2x
  4. Main context stays cleaner

What stays the same:

  1. Skills still provide guidance
  2. Natural language still works
  3. User experience unchanged
  4. All features available

Migration steps:

# 1. Create agents directory
mkdir -p .claude/agents

# 2. Copy Haiku agent files
cp agents/*.md .claude/agents/

# 3. Test with 2-3 tasks (enter this as a prompt in Claude Code, not in the shell):
#    "work on task A and task B in parallel"

# 4. Monitor cost savings:
#    check the Claude Code usage dashboard

# 5. Roll out to all workflows:
#    update the parallel-execute command


🌟 Future Enhancements

Short-term (v0.4.0)

  • dependency-analyzer (Haiku)
     • Analyzes code dependencies
     • Identifies conflicts
     • Reports findings

  • conflict-resolver (Sonnet)
     • Complex merge conflicts
     • Requires judgment
     • Uses Sonnet for reasoning

Medium-term (v0.5.0)

  • Agent pools
     • Pre-warmed Haiku agents
     • Faster spawn time
     • Better resource utilization

  • Adaptive model selection
     • Automatically choose Haiku vs Sonnet
     • Based on task complexity
     • Learn from outcomes

Long-term (v1.0.0)

  • Multi-model optimization
     • Opus for super-complex tasks
     • Sonnet for standard tasks
     • Haiku for execution
     • Automatic selection

  • Cost monitoring dashboard
     • Real-time cost tracking
     • Optimization suggestions
     • Comparative analysis

🎉 Conclusion

The Haiku Agent Architecture represents a quantum leap in cost efficiency:

Key Achievements:

  • ✅ 81% cost reduction
  • ✅ 2x performance improvement
  • ✅ Cleaner context management
  • ✅ Same quality of execution
  • ✅ Zero user impact
  • ✅ Fully backward compatible

Strategic Impact:

  • Makes parallel development accessible to all users
  • Removes cost as a limiting factor
  • Enables more aggressive parallelization
  • Preserves main agent context
  • Sets new standard for Claude Code plugins

The Future:

  • Haiku for execution (80% of work)
  • Sonnet for thinking (20% of work)
  • Skills for teaching (priceless)
  • Cost-optimized from the ground up

Contextune v0.3.0 = Natural UX + Autonomous Guidance + Cost Efficiency


Version: 0.3.0 (Haiku-Enhanced)
Status: 🚀 Revolutionary
Impact: 81% cost reduction, 2x speed improvement
License: MIT

Questions? See agent implementations in .claude/agents/