Dual-Claude Architecture: Separating Subscription and API-Based Inference for Cost Optimization

We recently implemented a bifurcated Claude invocation strategy to solve a critical cost problem: reducing monthly API spend from $1500+ to $150-$300 while maintaining production reliability. The solution leverages shell environment scoping to route interactive/high-stakes work through the Claude subscription tier, while atomized, low-risk subtasks get routed to a cost-optimized Haiku instance running on a Lightsail EC2 box.

The Problem Statement

Three conflicting constraints:

Systems requiring ANTHROPIC_API_KEY for programmatic Claude calls need reliability; cheaper models introduce unacceptable error rates at production scale.
Interactive use (Claude Code, manual prompting) was consuming API quota at $40-50/day, with no reason to bypass the already-paid subscription tier.
The naive solution—"just use cheaper models everywhere"—fails when downstream integrations depend on Claude 3.5 Sonnet's reasoning quality.

The conflict manifested as a Catch-22 in the local shell: setting ANTHROPIC_API_KEY globally made all Claude invocations route through the API (including interactive `claude.ai` sessions that should use the web subscription), while unsetting it broke programmatic integrations.

Architecture: Subscription-First with Cheap Farm-Out

The solution has three layers:

Interactive layer (local, `claude` command): Uses keychain-stored OAuth credentials from Claude Code settings. The Claude Code-credentials entry (account `cb`) is already managed by the Claude Code extension and never touched by this setup.
Farm-out layer (EC2, Lightsail): A Haiku 3.5 instance running as a systemd daemon, reachable via SSH tunneling. This is the "cheap Claude" for decomposed, low-stakes tasks.
Programmatic scope (scripts/integrations): Only loads ANTHROPIC_API_KEY in contexts that explicitly need it—typically by sourcing repos.env in specific automation scripts.

Technical Implementation Details

Environment Scoping in ~/.zshrc

The key insight: don't export ANTHROPIC_API_KEY globally. Instead, keep it in repos.env and let scripts opt-in:

# ~/.zshrc — NO global ANTHROPIC_API_KEY export
# Instead, define a helper for programmatic work:

source_api_env() {
  # Only source when explicitly needed for scripts
  if [[ -f "$HOME/path/to/repos.env" ]]; then
    export ANTHROPIC_API_KEY=$(grep ANTHROPIC_API_KEY "$HOME/path/to/repos.env" | cut -d= -f2)
  fi
}

# Interactive claude invocation — uses keychain OAuth automatically
alias claude='/path/to/claude'

When you invoke claude interactively without ANTHROPIC_API_KEY set, the Claude Code extension falls back to the keychain credential (Claude Code-credentials`), which points to your subscription-tier web token.



EC2 Farm-Out Infrastructure (Lightsail)

The cheap Claude lives on a Lightsail instance:


Instance: `ubuntu@34.239.233.28` (us-west-2 region)
Internal DNS: `ip-172-26-6-34` (10.x private network, not internet-routable)
SSH key: `~/.ssh/LightsailDefaultKey-us-west-2.pem` (passwordless, BatchMode-verified)
Claude binary: `/usr/bin/claude` (Haiku 3.5, API-key-authenticated)
Daemon: `jada-agent.service` (systemd unit, active and verified reachable)


The systemd unit injects the API key and model at invocation time, not in the login shell. This is critical: when a farm-out wrapper SSH-execs a remote `claude` call, it must pass credentials explicitly:

# Farm-out wrapper pattern (pseudocode; actual implementation in repos automation)
ssh -i ~/.ssh/LightsailDefaultKey-us-west-2.pem \
    ubuntu@34.239.233.28 \
    "ANTHROPIC_API_KEY=\$KEY /usr/bin/claude ... "

The Lightsail instance's security group already allows inbound SSH (port 22) from your client IP range. No additional networking changes were required.

Keychain Credential Storage (OAuth, Not API Key)

Claude Code stores the subscription token in the system keychain under the label Claude Code-credentials, account name `cb`. This is separate from the API key and is never exported to the environment by our changes—it's handled entirely by the Claude Code extension's internal auth flow.

Verify it exists (no output of the actual token):

security find-generic-password -s "Claude Code-credentials" -a "cb" 2>&1 | grep "attributes"

If the label exists, you're good. If you see "The specified item could not be found in the Keychain," re-authenticate Claude Code once.

Key Decision Rationale

Why not use `forceLoginMethod` in settings.json? This key is managed/enterprise-only and is silently ignored in personal `~/.claude/settings.json` files. It's not a lever we can pull.

Why source `repos.env` only in scripts, not globally? Explicit is better than implicit. Any script or integration that needs the API key should declare that dependency by sourcing the file. This makes debugging easier ("why is the API key suddenly in scope?") and prevents accidental cross-contamination between interactive and programmatic sessions.

Why Haiku on Lightsail, not Claude Instant or 1.3? Haiku 3.5 is the newest efficient model and offers the best quality-per-token among budget tiers. For atomized, well-specified subtasks, it's reliable enough. For synthesis, reasoning, or user-facing output, you still route through the subscription tier (better model, no token cost).

Why not a local farm-out? Lightsail offers 4GB/month of free outbound traffic and a fixed $3.50/month baseline. Running a local daemon introduces operational complexity (process restarts, resource contention) with no cost savings at your scale. EC2 adds network latency (~100-200ms) but keeps infrastructure simple and immutable.

Verification & Testing

Before shipping this to production automation, verify the chain:

# 1. Confirm subscription path (should NOT error about missing API key)
unset ANTHROPIC_API_KEY
claude "test prompt" 

# 2. Confirm farm-out path (should reach EC2 and return a response)
source ~/path/to/repos.env
ssh -i ~/.ssh/LightsailDefaultKey-us-west-2.pem ubuntu@34.239.233.28 \
  "ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY} /usr/bin/claude 'test'"

# 3. Confirm interactive `claude.ai` login works (should use keychain, not