```html

Building a Carrier-Grade SMS Relay for Multi-Tenant Fleet Dispatch: QuickDumpNow's Twilio Integration

The Problem: Carrier Limitations in Call Forwarding

QuickDumpNow (QDN) operates a fleet dispatch system where incoming customer calls need to cascade through multiple team members (primary dispatcher → backup manager → final fallback) until someone answers. The natural instinct was to use carrier-level call forwarding rules, but AT&T and other carriers impose strict limitations:

  • Forwarding chains are typically limited to 1–2 hops before the carrier breaks the chain
  • Geographic number portability rules make it risky to daisy-chain across team members' personal phones
  • No programmatic control—you can't adjust routing based on job status, availability, or time of day
  • No audit trail or logging for compliance purposes

The solution: build a Twilio-managed relay that treats incoming calls as API events, evaluates routing logic server-side, and bridges to the appropriate team member based on real-time job state.

Architecture Overview

The relay lives in a Lambda function (`qdn-data-crud`) that already handles QDN's job CRUD operations. When a call arrives at the QDN Twilio number, Twilio requests a webhook handler, which responds with TwiML instructions that direct the call to the active dispatcher. If that person doesn't answer within 15 seconds, the call moves on to the next person in the chain.

The routing table is stored in S3 (`quickdumpnow.com` bucket) as part of the maintenance hub's state file. The local copy of that file lives at:

/Users/cb/Documents/repos/sites/quickdumpnow.com/maintenance/maintenance.json

This file contains dispatcher assignments, on-call schedules, and fallback phone numbers. The Lambda function consults this file for each inbound call (through a short in-memory cache, described under State Management) and returns TwiML that tells Twilio how to route the call.

Infrastructure Components

1. Twilio Account Setup

Credentials were securely stored in the team's shared environment file:

/Users/cb/Documents/repos/.secrets/repos.env

The file contains:

  • TWILIO_ACCOUNT_SID: The account identifier for all API calls
  • TWILIO_AUTH_TOKEN: Authentication credential for admin operations (account management, number provisioning)
  • TWILIO_API_KEY and TWILIO_API_SECRET: For runtime SDK calls (token generation, call management)

File permissions were locked down to 600 (owner read/write only). These credentials are loaded by both the Lambda runtime and local development scripts.
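
As a rough illustration of how a local script might load these four values, here is a minimal sketch. It assumes the env file uses plain KEY=VALUE lines (the format is not stated above), and the helper names `parse_env_text` and `load_twilio_credentials` are hypothetical. Values already present in the process environment take precedence, which is how a Lambda runtime would typically receive them.

```python
import os
from pathlib import Path

REQUIRED_KEYS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "TWILIO_API_KEY",
    "TWILIO_API_SECRET",
)


def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments.

    Assumption: the env file uses this simple format.
    """
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("export "):
            line = line[len("export "):].strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_twilio_credentials(env_path=None, environ=None):
    """Return the four Twilio settings, preferring process environment variables."""
    environ = os.environ if environ is None else environ
    merged = {}
    if env_path is not None and Path(env_path).is_file():
        merged.update(parse_env_text(Path(env_path).read_text()))
    # Process environment overrides anything read from the file.
    merged.update({k: environ[k] for k in REQUIRED_KEYS if k in environ})
    missing = [k for k in REQUIRED_KEYS if not merged.get(k)]
    if missing:
        raise KeyError("missing Twilio settings: " + ", ".join(missing))
    return {k: merged[k] for k in REQUIRED_KEYS}
```

Failing fast on a missing key keeps a misconfigured deployment from surfacing later as an opaque authentication error.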

2. Lambda Function Updates

The existing `qdn-data-crud` Lambda was extended with two new endpoints:

POST /api/jobs/{job_id}/call-webhook
POST /api/jobs/{job_id}/sms-webhook

Both endpoints are triggered by Twilio webhooks. The call handler:

  • Extracts the incoming phone number and job context from Twilio's request
  • Looks up the current dispatcher assignment from maintenance.json
  • Generates TwiML that dials the dispatcher's phone number with a 15-second timeout
  • If the dispatcher doesn't answer, TwiML transfers to the next person in the fallback chain
  • Logs all call events back to the job record for audit purposes

Example TwiML logic (generated server-side):

<Response>
  <Dial timeout="15">
    <Number>+18581234567</Number>
  </Dial>
  <Redirect method="POST">/api/jobs/abc123/call-webhook?attempt=2</Redirect>
</Response>
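
The TwiML above could be generated along the following lines. This is a minimal sketch using only the standard library, not the actual handler code: the function name `build_dial_twiml` and the behaviour once the chain is exhausted (a spoken message followed by a hangup) are assumptions, since the article does not say what happens after the final fallback.

```python
import xml.etree.ElementTree as ET


def build_dial_twiml(chain, attempt, job_id, timeout=15):
    """Build TwiML that dials chain[attempt - 1] and then redirects to the next attempt.

    chain   -- ordered list of E.164 phone numbers (primary first)
    attempt -- 1-based position in the chain
    """
    response = ET.Element("Response")
    index = attempt - 1
    if index < 0 or index >= len(chain):
        # Assumption: when nobody is left to try, apologise and hang up.
        say = ET.SubElement(response, "Say")
        say.text = "Sorry, no one is available right now. Please try again later."
        ET.SubElement(response, "Hangup")
    else:
        dial = ET.SubElement(response, "Dial", {"timeout": str(timeout)})
        number = ET.SubElement(dial, "Number")
        number.text = chain[index]
        # Twilio follows this redirect once the <Dial> finishes without the call ending.
        redirect = ET.SubElement(response, "Redirect", {"method": "POST"})
        redirect.text = f"/api/jobs/{job_id}/call-webhook?attempt={attempt + 1}"
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body
```

One caveat with this shape: the redirect also runs if the dispatcher answers and later hangs up while the caller stays on the line, so a production handler would probably want to check the outcome of the dial before ringing the next person.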

3. API Gateway Routes

Four new routes were added to the QDN API Gateway:

POST /api/jobs/{job_id}/call-webhook
POST /api/jobs/{job_id}/sms-webhook
GET  /api/jobs/{job_id}/call-status
GET  /api/jobs/{job_id}/message-status

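Each route returns CORS headers so that browser requests from the dashboard at dashboard.quickdumpnow.com are allowed, and an OPTIONS preflight handler was added for browser compatibility. Only the dashboard's browser calls depend on these headers; Twilio's webhook requests are server-to-server and are not subject to CORS.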
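
A minimal sketch of the CORS handling might look like this. The `https://` scheme on the dashboard origin, the allowed header list, and the helper names are assumptions. The method lookup covers both API Gateway event shapes (`httpMethod` for REST proxy events, `requestContext.http.method` for HTTP API events), since the article does not say which one is in use.

```python
import json

# Assumption: the dashboard is served over HTTPS from this origin.
ALLOWED_ORIGIN = "https://dashboard.quickdumpnow.com"

CORS_HEADERS = {
    "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
    "Access-Control-Allow-Methods": "GET,POST,OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type,Authorization",
}


def json_response(status, payload):
    """Return a Lambda proxy response with CORS headers attached."""
    headers = {**CORS_HEADERS, "Content-Type": "application/json"}
    return {"statusCode": status, "headers": headers, "body": json.dumps(payload)}


def handle_options():
    """Answer a browser preflight request with headers only."""
    return {"statusCode": 204, "headers": dict(CORS_HEADERS), "body": ""}


def route(event):
    """Dispatch on HTTP method; real routing by path is omitted here."""
    method = event.get("httpMethod") or (
        event.get("requestContext", {}).get("http", {}).get("method", "")
    )
    if method == "OPTIONS":
        return handle_options()
    return json_response(200, {"ok": True})
```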

4. State Management

The dispatcher routing state is stored in a seeded JSON file that was pushed to S3:

s3://quickdumpnow.com/maintenance/maintenance.json

This file structure includes:

  • dispatchers: array of team members with phone numbers and on-call hours
  • fallback_chain: ordered list of backup contacts (e.g., Sergio, then his backup manager)
  • job_routing: active job → dispatcher assignment mapping
  • last_updated: timestamp for cache invalidation
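
To make the structure concrete, here is a hypothetical instance of the file expressed as a Python dict, together with a helper that turns it into an ordered list of numbers to ring. Only the four top-level keys come from the list above; the per-entry field names (`name`, `phone`, `on_call`), the sample values, and the function `resolve_call_chain` are illustrative assumptions.

```python
# Hypothetical contents; the phone numbers are fictional 555 numbers.
EXAMPLE_STATE = {
    "dispatchers": [
        {"name": "Primary Dispatcher", "phone": "+15555550101",
         "on_call": {"start": "08:00", "end": "18:00"}},
        {"name": "Sergio", "phone": "+15555550102",
         "on_call": {"start": "00:00", "end": "23:59"}},
        {"name": "Backup Manager", "phone": "+15555550103",
         "on_call": {"start": "00:00", "end": "23:59"}},
    ],
    "fallback_chain": ["Sergio", "Backup Manager"],
    "job_routing": {"abc123": "Primary Dispatcher"},
    "last_updated": "2024-01-01T00:00:00Z",
}


def resolve_call_chain(state, job_id):
    """Return phone numbers to try, assigned dispatcher first, then the fallbacks."""
    phones = {d["name"]: d["phone"] for d in state["dispatchers"]}
    assigned = state["job_routing"].get(job_id)
    names = ([assigned] if assigned else []) + list(state["fallback_chain"])
    ordered = []
    for name in names:
        phone = phones.get(name)
        # Skip unknown names and avoid ringing the same number twice.
        if phone and phone not in ordered:
            ordered.append(phone)
    return ordered
```

On-call hours are carried in the data but not applied here; filtering by time of day would be a natural next step.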

The Lambda function caches this file in memory for up to 60 seconds to reduce S3 API calls during high call volume.
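
The 60-second cache can be sketched as a small time-to-live wrapper around whatever function fetches the file. The class name `TTLCache` and the injectable `loader` and `clock` parameters are assumptions made for illustration; in the Lambda, the loader would presumably wrap an S3 read of `maintenance/maintenance.json`.

```python
import time


class TTLCache:
    """Cache the result of loader() for ttl_seconds, then reload on the next get()."""

    def __init__(self, loader, ttl_seconds=60, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value = None
        self._loaded_at = None  # None means nothing has been loaded yet

    def get(self):
        now = self._clock()
        expired = self._loaded_at is None or (now - self._loaded_at) >= self._ttl
        if expired:
            self._value = self._loader()
            self._loaded_at = now
        return self._value


# Hypothetical wiring inside the Lambda module (not executed here):
#   s3 = boto3.client("s3")
#   state_cache = TTLCache(lambda: json.loads(
#       s3.get_object(Bucket="quickdumpnow.com",
#                     Key="maintenance/maintenance.json")["Body"].read()))
```

Because module-level objects persist only within a warm Lambda execution environment, each concurrent environment keeps its own copy, so a roster change can take up to 60 seconds to reach every one of them.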

Key Technical Decisions

Why Lambda + S3 Instead of a Dedicated Twilio Studio Flow?

Twilio Studio offers a visual workflow builder, but it's limited when you need to:

  • Query external state (job assignments, dispatcher availability) in real time
  • Log call events back to a job record for audit trails
  • Support complex fallback logic tied to job priority or dispatcher load

Using Lambda gives us programmatic control and integration with QDN's existing CRUD layer. The tradeoff is slightly higher latency (a Lambda cold start of roughly 500 ms, versus no cold start in Studio), but this fits comfortably within Twilio's default webhook timeout of 15 seconds.

Why Store State in S3 Instead of DynamoDB?

The dispatcher roster changes infrequently (maybe once a week when someone goes on vacation). S3 is simpler to manage, cheaper, and the Maintenance Hub already reads/writes to this bucket. DynamoDB would add unnecessary operational complexity.

Why TwiML Instead of the REST API?

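TwiML (Twilio Markup Language) is Twilio's declarative format for call control. Each webhook response is self-contained: the handler returns the next set of instructions and keeps no call state between requests, and the fallback step is expressed directly in the markup as a <Redirect>. Driving the same flow through the REST API would require us to track call state in a database and poll for updates, so TwiML's request/response model is cleaner.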
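
As a small illustration of how little state the handler needs, the sketch below pulls everything out of the incoming request itself. Twilio sends webhook parameters such as `CallSid`, `From`, `To`, and `CallStatus` as a form-encoded body; the `attempt` query parameter is the one used in the redirect URL shown earlier. The function names are hypothetical.

```python
from urllib.parse import parse_qs


def parse_twilio_form(body):
    """Decode a form-encoded webhook body into a flat dict (first value per key)."""
    parsed = parse_qs(body or "", keep_blank_values=True)
    return {key: values[0] for key, values in parsed.items()}


def current_attempt(query_params):
    """Read the 1-based attempt counter from the query string, defaulting to 1."""
    try:
        attempt = int((query_params or {}).get("attempt", "1"))
    except (TypeError, ValueError):
        attempt = 1
    return max(attempt, 1)
```

With the caller's number taken from the body and the position in the chain taken from the URL, nothing has to be remembered between one webhook request and the next.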

Deployment Steps

The Lambda code was updated in-place:

/Users/cb/Documents/repos/sites/dashboard.quickdumpnow.com/lambda/lambda_function.py

After testing locally, the function was repackaged and deployed via the AWS CLI:

aws lambda update-function-code \
  --function-name qdn-data-crud \
  --zip-file fileb://lambda_function.zip
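
The repackaging step that precedes this command could be scripted roughly as follows. This sketch assumes the function is a single `lambda_function.py` with no bundled third-party dependencies, which the article does not confirm; `package_lambda` is a hypothetical helper name.

```python
import zipfile
from pathlib import Path


def package_lambda(source_file, zip_path="lambda_function.zip"):
    """Write source_file into a zip archive at the archive root and return its path."""
    source = Path(source_file)
    target = Path(zip_path)
    with zipfile.ZipFile(target, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        # The handler module must sit at the top level of the archive.
        archive.write(source, arcname=source.name)
    return target
```

The resulting file is what the `--zip-file fileb://lambda_function.zip` argument points at.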

The S3 state file was then uploaded and verified via API:

aws s3 cp maintenance