Building a Real-Time Maintenance Task Notification System with Staging/Production Separation
Overview: The Problem
The maintenance.queenofsandiego.com tool needed a critical capability: when crew members like Travis add new maintenance tasks, Sergio and the ops team should be notified immediately. Previously, there was no mechanism to surface newly added tasks, making it easy for important work to go unnoticed. Additionally, we needed to establish a clear staging/production separation for a tool that had been running with a single environment.
Architecture Decisions: Real-Time Notifications with Task Criticality Routing
Rather than notifying on every single task, we route notifications by task criticality, a common practice among operations teams:
- Critical tasks (safety, active system failures): Immediate email notification to ops team
- Standard tasks (routine maintenance, minor repairs): Batched daily digest at 6 PM
- Low-priority tasks (cosmetic, future planning): Weekly summary
This approach reduces notification fatigue while ensuring nothing dangerous gets missed—a pattern used by platforms like PagerDuty and Opsgenie.
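The three tiers above can be sketched as a small lookup. This is a minimal illustration rather than the tool's actual code: the function name `routeForCriticality`, the channel labels, and the fallback for unrecognized levels are all assumptions.

```javascript
// Hypothetical mapping from criticality level to delivery channel.
// Level names follow the three tiers listed above; channel labels are illustrative.
const ROUTES = {
  critical: { channel: "immediate_email", schedule: "now" },
  standard: { channel: "daily_digest", schedule: "daily at 6 PM" },
  low: { channel: "weekly_summary", schedule: "weekly" },
};

function routeForCriticality(level) {
  const key = String(level || "").trim().toLowerCase();
  // Assumption: an unknown or missing level is treated as critical so that
  // a mislabeled safety issue is never silently batched.
  return Object.prototype.hasOwnProperty.call(ROUTES, key)
    ? ROUTES[key]
    : ROUTES.critical;
}

console.log(routeForCriticality("Standard").channel); // daily_digest
```

Falling back to the critical route is a deliberate bias: a false alarm costs one extra email, while a missed safety task could cost much more.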
Technical Implementation: Five Moving Parts
1. Lambda Function for Task Persistence Layer
Created a new Lambda function to handle task persistence and notification coordination. This decouples the maintenance tool frontend from notification logic, allowing independent scaling and testing.
File: /Users/cb/Documents/repos/sites/queenofsandiego.com/MaintenancePersistence.gs
Purpose: Google Apps Script wrapping Lambda persistence calls
Handler: Routes task creation events to Lambda for storage and notification
The Lambda function receives a POST request with the task object (title, description, assigned_to, criticality_level) and:
- Persists the task to DynamoDB, with a TTL attribute that limits how long records stay available for historical queries
- Evaluates task criticality
- Routes to immediate notification queue (SNS topic) or batching queue (SQS) based on criticality level
- Returns success/failure to the frontend within 2 seconds
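A minimal sketch of the flow in the list above, written so that it runs without the AWS SDK: storage and messaging are passed in as plain functions. The names `handleTask`, `saveTask`, `publishImmediate`, and `enqueueForBatch` are hypothetical stand-ins for the DynamoDB, SNS, and SQS calls, and the required-field check is an assumption about what the real function validates.

```javascript
// Sketch of the persist -> evaluate -> route flow with injected dependencies.
// deps.saveTask, deps.publishImmediate and deps.enqueueForBatch are hypothetical
// stand-ins for the DynamoDB write, the SNS publish and the SQS send.
const REQUIRED_FIELDS = ["title", "description", "assigned_to", "criticality_level"];

async function handleTask(task, deps) {
  const missing = REQUIRED_FIELDS.filter((field) => !task || !task[field]);
  if (missing.length > 0) {
    return { ok: false, error: `missing fields: ${missing.join(", ")}` };
  }
  try {
    await deps.saveTask(task); // persist first so the task is stored before anyone is notified
    if (String(task.criticality_level).toLowerCase() === "critical") {
      await deps.publishImmediate(task);
      return { ok: true, route: "immediate" };
    }
    await deps.enqueueForBatch(task);
    return { ok: true, route: "batched" };
  } catch (err) {
    // Report the failure to the caller instead of throwing.
    return { ok: false, error: String(err && err.message ? err.message : err) };
  }
}

// Usage with in-memory fakes in place of the AWS clients.
const stored = [];
const fakeDeps = {
  saveTask: async (task) => { stored.push(task); },
  publishImmediate: async (task) => console.log("notify now:", task.title),
  enqueueForBatch: async (task) => console.log("queued:", task.title),
};
handleTask(
  { title: "Fuel leak", description: "Smell near tank", assigned_to: "Travis", criticality_level: "critical" },
  fakeDeps
).then((result) => console.log(result));
```

Passing the clients in keeps the routing logic testable on its own; the real function would bind them to the SDK calls and would still need its own timeout handling to meet the 2-second response target.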
2. Google Apps Script Route Handler
Modified /Users/cb/Documents/repos/sites/queenofsandiego.com/BookingAutomation.gs to add maintenance action routing:
// In BookingAutomation.gs doPost handler
if (action === "log_maintenance") {
  const maintenanceService = new MaintenancePersistence();
  const result = maintenanceService.logNewTask(params);
  return ContentService.createTextOutput(JSON.stringify(result))
    .setMimeType(ContentService.MimeType.JSON);
}
This allows the frontend to POST maintenance data through the existing Apps Script (GAS) endpoint, maintaining a single entry point for all tool communications.
3. Staging HTML with Task Entry UI
Updated /Users/cb/Documents/repos/sites/queenofsandiego.com/tools/maintenance/staging-index.html with a task creation form that includes:
- Task title and description fields
- Criticality selector (Critical / Standard / Low Priority)
- Assigned crew member dropdown
- Submit button with loading state and confirmation
- Visual feedback when task is logged
The form posts to the GAS endpoint with action=log_maintenance, awaiting confirmation before clearing the form.
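The submit behavior described above might look roughly like the following. This is a sketch under assumptions: the function names, the camelCase form field names, and the `{ success: true }` response shape are hypothetical. The form-encoded body is one reasonable choice for carrying `action=log_maintenance`, not necessarily what the page uses.

```javascript
// Builds the form-encoded request for the log_maintenance action.
// Parameter names mirror the task object fields (title, description,
// assigned_to, criticality_level); the input field names are hypothetical.
function buildLogMaintenanceRequest(fields) {
  const body = new URLSearchParams({
    action: "log_maintenance",
    title: fields.title.trim(),
    description: fields.description.trim(),
    assigned_to: fields.assignedTo,
    criticality_level: fields.criticality,
  });
  return { method: "POST", body };
}

// Posts the task and clears the form only after the server confirms success.
// `post` is any fetch-compatible function; `form` is any object with reset().
// Assumption: the endpoint answers with JSON containing a boolean `success`.
async function submitTask(fields, form, post, endpoint) {
  const response = await post(endpoint, buildLogMaintenanceRequest(fields));
  const result = await response.json();
  if (result && result.success) {
    form.reset(); // clear only on confirmed success so a failed submit keeps the input
  }
  return result;
}
```

In the browser this would be called as `submitTask(fields, formElement, fetch, GAS_ENDPOINT_URL)`, where `GAS_ENDPOINT_URL` is a placeholder for the deployed web app URL.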
4. Calendar Integration for Ops Visibility
Created /Users/cb/Documents/repos/sites/queenofsandiego.com/MaintenanceCalendar.gs to automatically create calendar events for critical tasks:
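// MaintenanceCalendar.gs pattern
// JADA_MAINTENANCE_CALENDAR_ID is assumed to be defined elsewhere in the project.
function createMaintenanceEvent(taskTitle, criticality) {
  // Only critical tasks get a calendar entry.
  if (criticality !== "critical") {
    return null;
  }
  const calendar = CalendarApp.getCalendarById(JADA_MAINTENANCE_CALENDAR_ID);
  if (!calendar) {
    throw new Error("Maintenance calendar not found or not shared with this account");
  }
  const start = new Date();
  const end = new Date(start.getTime() + 60 * 60 * 1000); // 1 hour duration
  const event = calendar.createEvent(taskTitle, start, end);
  event.setColor(CalendarApp.EventColor.RED);
  return event;
}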
This creates visual calendar entries for critical maintenance, ensuring it appears in both email and the crew's calendar workflow. The calendar "Jada Maintenance" was created under jadasailing@gmail.com with appropriate sharing permissions.
5. Notification Delivery via SNS/SQS
The Lambda function integrates with:
- SNS Topic: For critical tasks, immediate push to Sergio and ops team at registered email addresses
- SQS Queue: For standard and low-priority tasks, held for batch processing (daily digest at 6 PM UTC for standard tasks, weekly summary for low-priority tasks)
- Email Service: Staging sends to jadasailing@gmail.com; production will send to appropriate ops aliases
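To make the batching side concrete, here is a small sketch of how queued tasks could be turned into one digest email body. The function name `buildDigest`, the text layout, and the sample tasks are made up for illustration; only the task field names come from the task object described earlier.

```javascript
// Formats queued tasks of one criticality level into a plain-text digest body.
// Returns null when there is nothing to send for the period.
function buildDigest(tasks, criticality, heading) {
  const selected = tasks.filter(
    (task) => String(task.criticality_level).toLowerCase() === criticality
  );
  if (selected.length === 0) {
    return null;
  }
  const lines = selected.map(
    (task, i) => `${i + 1}. ${task.title} (assigned to ${task.assigned_to})`
  );
  return [`${heading}: ${selected.length} task(s)`, ...lines].join("\n");
}

// Sample queue contents (invented for this example).
const queued = [
  { title: "Replace bilge pump hose", assigned_to: "Travis", criticality_level: "standard" },
  { title: "Touch up deck varnish", assigned_to: "Travis", criticality_level: "low" },
];
console.log(buildDigest(queued, "standard", "Daily maintenance digest"));
console.log(buildDigest(queued, "low", "Weekly maintenance summary"));
```

Returning `null` for an empty period lets the scheduler skip sending an empty email.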
Infrastructure: Staging vs. Production Separation
Challenge: The maintenance tool previously had no staging environment, running directly against production infrastructure.
Solution Architecture:
- Frontend Separation:
  - Staging: maintenance.queenofsandiego.com/staging-index.html
  - Production: maintenance.queenofsandiego.com/index.html
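- GAS Routing: Single BookingAutomation.gs with per-request environment detection. Apps Script web apps do not pass HTTP request headers to doPost, so the environment flag has to travel as a request parameter or in the POST body rather than in a header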
- Lambda Functions: Separate dev/staging and prod Lambda functions pointing to different DynamoDB tables and SNS topics
- CloudFront Invalidation: After deploying staging HTML, cache is invalidated via CloudFront API to ensure immediate visibility
# CloudFront invalidation command structure (distribution ID is a placeholder)
aws cloudfront create-invalidation \
  --distribution-id [DISTRIBUTION_ID] \
  --paths "/staging-index.html" "/maintenance/*"
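One way to keep the staging and production resources apart in code is a per-environment settings table. The sketch below is hypothetical: the `env` request parameter, the function name `resolveEnvironment`, and every bracketed value are placeholders. Only the staging mailbox comes from the text.

```javascript
// Hypothetical per-environment settings. Table, topic and production recipient
// values are placeholders; the staging mailbox is the one named in the text.
const ENVIRONMENTS = {
  staging: {
    table: "[STAGING_TABLE]",
    topic: "[STAGING_TOPIC_ARN]",
    recipients: ["jadasailing@gmail.com"],
  },
  production: {
    table: "[PROD_TABLE]",
    topic: "[PROD_TOPIC_ARN]",
    recipients: ["[OPS_ALIAS]"],
  },
};

// Assumption: the caller passes an `env` request parameter.
// Anything other than an explicit "production" resolves to staging, so a
// missing or mistyped flag can never reach the production ops list.
function resolveEnvironment(params) {
  const name = params && params.env === "production" ? "production" : "staging";
  return { name, ...ENVIRONMENTS[name] };
}

console.log(resolveEnvironment({ env: "production" }).table); // [PROD_TABLE]
console.log(resolveEnvironment({}).name); // staging
```

Defaulting to staging is a design choice made for this sketch: it trades a possibly missed production notification for the guarantee that test traffic never pages the ops team.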
Deployment Pipeline
Step 1: Update staging HTML in tools/maintenance/staging-index.html
Step 2: Deploy staging HTML to S3 maintenance bucket
aws s3 cp ./tools/maintenance/staging-index.html \
  s3://[MAINTENANCE_BUCKET]/staging-index.html
Step 3: Push GAS changes (MaintenancePersistence.gs and BookingAutomation.gs modifications)
clasp push # Syncs GAS files with Apps Script project
Step 4: Invalidate CloudFront cache for the staging path
aws cloudfront create-invalidation --distribution-id [DISTRIBUTION_ID] --paths "/staging*"
Testing Strategy
Staging testing follows this workflow:
- Submit test tasks with various criticality levels through staging-index.html
- Verify emails arrive at jadasailing@gmail.com within 5