Managing Multi-Campaign Email Blast Infrastructure: Deployment Patterns for High-Volume Sends
During a recent development session, we executed two distinct email campaign sends while troubleshooting infrastructure access controls. This post covers the technical patterns we used to manage campaign state, execute bulk sends safely, and maintain audit trails across distributed systems.
Campaign State Management and Verification
Our campaign infrastructure relies on a dashboard system that aggregates task state, campaign metadata, and send history. Before executing any blast, we follow a strict verification pattern:
- Fetch dashboard state to retrieve all pending tasks with associated notes
- Query specific task details for priority items to confirm approval status and scheduling windows
- Verify server-side campaign status via SSH to ensure alignment between dashboard and actual infrastructure state
- Check for existing blast preparation artifacts (template files, recipient lists, send commands)
This layered verification prevents accidental duplicate sends and ensures we act on current state. In our session, we found that the Paul Simon blast had been approved (task token t-9a8cae17) but never executed, while birthday-sail-may11 was one day overdue (scheduled for May 11; the session took place on May 12).
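The pre-send gate described above can be sketched as a small function. This is a minimal illustration, not our actual tooling: the `CampaignState` fields, the status strings, and the year in the example dates are all assumptions made for the sketch (the post does not state a year, and the approval status of birthday-sail-may11 is assumed here).

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CampaignState:
    """Combined view of one campaign (hypothetical field names)."""
    name: str
    approved: bool                  # from the dashboard task details
    scheduled_for: Optional[date]   # from the dashboard, if a date was set
    sent_on_server: bool            # from the server-side check over SSH


def pre_send_status(state: CampaignState, today: date) -> str:
    """Classify a campaign before any send command is run."""
    if state.sent_on_server:
        return "already-sent"       # never send twice
    if not state.approved:
        return "awaiting-approval"
    if state.scheduled_for is not None and state.scheduled_for < today:
        return "overdue"            # approved, unsent, past its date
    return "ready"


# Example with a placeholder year (2025 is an assumption).
today = date(2025, 5, 12)
paul_simon = CampaignState("Paul Simon", True, None, False)
birthday = CampaignState("birthday-sail-may11", True, date(2025, 5, 11), False)
print(pre_send_status(paul_simon, today))
print(pre_send_status(birthday, today))
```

Checking the server-side flag first means a stale dashboard entry can never trigger a second send.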
Lightsail Server Integration and Remote Verification
Our campaign infrastructure runs on AWS Lightsail instances. We access these via SSH using keys listed in our local SSH configuration. The verification workflow involves:
# List available SSH keys to confirm connection paths
# Then SSH to Lightsail instance to check:
# - Campaign directory structure (/path/to/birthday-sail-may11/)
# - Template files and their modification times
# - Cron job definitions for scheduled sends
# - Blast preparation files (recipient lists, send commands)
# - Campaign status logs and completion markers
By checking both dashboard state and server-side artifacts, we can cross-check two records of campaign execution instead of trusting either one alone. The Lightsail instance serves as the execution environment, while the dashboard provides workflow orchestration and audit logging.
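As a rough sketch of how the server-side checks could be scripted, the helper below only builds the `ssh` argument list; it does not connect to anything. The host alias `campaign-host` is a hypothetical entry in the local SSH configuration, and the directory is the same placeholder path used above, not a real location.

```python
import shlex
from typing import List


def build_check_command(host_alias: str, campaign_dir: str) -> List[str]:
    """Return an ssh argv that lists campaign files and the user's crontab.

    host_alias is assumed to be defined in ~/.ssh/config (hypothetical name).
    """
    remote_command = " && ".join([
        f"ls -l {shlex.quote(campaign_dir)}",  # templates, lists, mtimes
        "crontab -l",                          # scheduled send definitions
    ])
    return ["ssh", host_alias, remote_command]


command = build_check_command("campaign-host", "/path/to/birthday-sail-may11/")
print(command)
# The list could then be passed to subprocess.run(command, capture_output=True).
```

Building the argument list separately from running it keeps the check easy to review before anything touches the server.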
Dry-Run Pattern for High-Volume Sends
Before sending to our recipient base, we execute a dry-run of the blast command to verify:
- Recipient count accuracy (expected vs. actual)
- Send window validation (is current time within approved send window?)
- Template rendering without errors
- Command syntax correctness
For the Paul Simon campaign, our dry-run confirmed 2,601 recipients at 10:52 AM Pacific time, within the approved send window. This validation step catches configuration errors before they reach recipients.
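The dry-run checks can be expressed as a validation function that returns a list of problems, where an empty list means the send may proceed. This is a sketch under assumptions: the `DryRunResult` shape is invented for illustration, and the 9:00 to 17:00 window is a placeholder because the post does not state the actual approved window.

```python
from dataclasses import dataclass, field
from datetime import time
from typing import List


@dataclass
class DryRunResult:
    """Output of a dry run (hypothetical structure)."""
    recipient_count: int
    run_time: time                                  # local wall-clock time
    template_errors: List[str] = field(default_factory=list)


def validate_dry_run(result: DryRunResult, expected_count: int,
                     window_start: time, window_end: time) -> List[str]:
    """Return human-readable problems; an empty list means all checks pass."""
    problems: List[str] = []
    if result.recipient_count != expected_count:
        problems.append(
            f"recipient count {result.recipient_count} != expected {expected_count}"
        )
    if not (window_start <= result.run_time <= window_end):
        problems.append(f"{result.run_time} is outside the send window")
    problems.extend(f"template error: {e}" for e in result.template_errors)
    return problems


# Paul Simon figures from the post; the window bounds are assumed.
result = DryRunResult(recipient_count=2601, run_time=time(10, 52))
print(validate_dry_run(result, 2601, time(9, 0), time(17, 0)))
```

Returning every problem at once, rather than stopping at the first, lets the operator fix a bad configuration in one pass.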
Blast Execution and Failure Tracking
Once dry-run validation passes and we confirm dashboard approval status, we execute the actual send command on the Lightsail instance. Our infrastructure tracks:
- Send completion time: Documented for compliance and scheduling purposes
- Recipient count: Exact number of messages queued (2,601 in the Paul Simon campaign)
- Failure count: Hard bounces, delivery errors, and rejected addresses (0 on this send)
- Execution duration: Total time from command dispatch to completion (9.2 minutes for Paul Simon)
This data flows back to the dashboard via logging calls, creating an immutable record of when campaigns executed and with what outcome. The duration metric helps us identify performance bottlenecks in our sending infrastructure: 9.2 minutes for 2,601 recipients works out to roughly 283 messages per minute, which suggests healthy throughput.
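To compare runs, the duration and recipient count can be reduced to a single throughput figure. The function name below is ours for illustration; the inputs are the Paul Simon numbers reported above.

```python
def messages_per_minute(recipients: int, duration_minutes: float) -> float:
    """Average send rate over the whole run."""
    if duration_minutes <= 0:
        raise ValueError("duration_minutes must be positive")
    return recipients / duration_minutes


rate = messages_per_minute(2601, 9.2)
print(round(rate, 1))        # 282.7 messages per minute
print(round(rate / 60, 2))   # 4.71 messages per second
```

Tracking this rate per campaign makes a slowdown visible even when recipient counts differ between sends.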
Access Control and Maintenance Hub Integration
During this session, we also investigated role-based access codes in our maintenance hub. Our infrastructure implements scope-based access control where different roles (e.g., "first mate," specific team member names) map to access codes stored in a searchable maintenance hub system.
This pattern allows us to:
- Grant granular permissions without hardcoding user credentials
- Audit who accessed which systems by searching maintenance hub records
- Revoke access by retiring access codes without forcing credential rotations
- Implement time-limited access by tying access codes to specific tasks or time windows
We searched the maintenance hub for role definitions and associated access codes to support a team member inquiry about login credentials for a specific system scope.
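The access-code pattern can be illustrated with a small validity check. Everything here is hypothetical: the `AccessCode` fields, the scope names, and the example code value are assumptions for the sketch and do not reflect how the maintenance hub actually stores records.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class AccessCode:
    """One maintenance hub record (hypothetical schema)."""
    code: str
    role: str                               # e.g. "first mate"
    scope: str                              # system the code grants access to
    expires_at: Optional[datetime] = None   # None means no time limit
    retired: bool = False                   # set True to revoke


def is_valid(entry: AccessCode, scope: str, now: datetime) -> bool:
    """A code is usable only if unretired, in scope, and unexpired."""
    if entry.retired:
        return False
    if entry.scope != scope:
        return False
    if entry.expires_at is not None and now >= entry.expires_at:
        return False
    return True


entry = AccessCode("EXAMPLE-CODE", "first mate", "maintenance",
                   expires_at=datetime(2025, 6, 1))
print(is_valid(entry, "maintenance", datetime(2025, 5, 12)))
entry.retired = True   # revocation needs no credential rotation
print(is_valid(entry, "maintenance", datetime(2025, 5, 12)))
```

Because revocation is a flag on the code rather than a change to a user's credentials, retiring one code leaves every other role untouched.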
Dashboard Logging and Audit Trail
All significant actions (campaign sends, access code lookups, task status changes) are logged back to the dashboard. The final step in our workflow is:
# After Paul Simon blast completes successfully:
# Log completion event to dashboard with:
# - Blast identifier (Paul Simon)
# - Recipient count (2,601)
# - Failure count (0)
# - Execution timestamp
# - Execution duration (9.2 min)
This creates a queryable audit trail. Engineers can search the dashboard for all campaigns sent on a specific date, track recipient counts across campaigns, or identify patterns in send failures.
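A completion event and a by-date query might look like the sketch below. The field names and the `blast_completed` event type are assumptions, the timestamp is a placeholder (the post does not record the exact completion time or year), and the code only builds and filters records; it does not call any dashboard API.

```python
from datetime import date, datetime, timezone
from typing import Dict, List


def build_completion_event(blast_id: str, recipients: int, failures: int,
                           executed_at: datetime,
                           duration_minutes: float) -> Dict[str, object]:
    """Assemble the fields logged after a blast (hypothetical schema)."""
    return {
        "event": "blast_completed",
        "blast_id": blast_id,
        "recipient_count": recipients,
        "failure_count": failures,
        "executed_at": executed_at.isoformat(),
        "duration_minutes": duration_minutes,
    }


def events_on(events: List[Dict[str, object]], day: date) -> List[Dict[str, object]]:
    """Return events whose ISO timestamp falls on the given (UTC) date."""
    prefix = day.isoformat()
    return [e for e in events if str(e["executed_at"]).startswith(prefix)]


# Placeholder timestamp; only the counts and duration come from the post.
finished = datetime(2025, 5, 12, 18, 1, tzinfo=timezone.utc)
log = [build_completion_event("Paul Simon", 2601, 0, finished, 9.2)]
print(events_on(log, date(2025, 5, 12)))
```

Storing the timestamp as an ISO 8601 string keeps the record sortable and makes date filtering a simple prefix match.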
Key Decision: Sequential Verification Over Direct Execution
We chose to verify campaign state through multiple independent channels (dashboard, SSH to Lightsail, local file system checks) rather than executing based on a single source. This adds ~5-10 minutes to our pre-send workflow but prevents the class of errors where state diverges between systems.
Specifically, we discovered that Paul Simon had been approved but not sent, a state we could not have confirmed from the dashboard alone. By verifying on the Lightsail instance directly, we confirmed no send had occurred and could safely execute.
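The class of error this guards against, state that differs between systems, can be detected with a simple comparison. This is an illustrative sketch only: the status strings and the dictionary inputs are assumptions, standing in for whatever each system actually reports.

```python
from typing import Dict, List, Tuple


def find_divergence(dashboard: Dict[str, str],
                    server: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return (campaign, dashboard_status, server_status) for each mismatch.

    A campaign known to only one side is reported with "missing" on the other.
    """
    diverged: List[Tuple[str, str, str]] = []
    for name in sorted(set(dashboard) | set(server)):
        dash_status = dashboard.get(name, "missing")
        server_status = server.get(name, "missing")
        if dash_status != server_status:
            diverged.append((name, dash_status, server_status))
    return diverged


# Hypothetical inputs: the two systems disagree about campaign-a only.
dashboard_view = {"campaign-a": "sent", "campaign-b": "not-sent"}
server_view = {"campaign-a": "not-sent", "campaign-b": "not-sent"}
print(find_divergence(dashboard_view, server_view))
```

An empty result means the two views agree, which is the precondition for executing a send.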
What's Next
Future improvements to this workflow include:
- Automated dry-run validation triggered by dashboard approval, with results surfaced before manual send authorization
- CloudWatch metrics integration to track send duration, recipient count, and failure rate across campaigns
- State synchronization between dashboard and Lightsail to surface divergence automatically
- Batch campaign scheduling to reduce manual send operations during business hours
The current manual verification approach is appropriate for high-stakes sends where failure could impact customer experience, but as our campaign volume grows, automation with human checkpoints will improve both speed and reliability.