Fixing a Race Condition in the SailJada Booking Calendar: A Case Study in Async State Management
The Problem: Premature Calendar Interactivity
During a development session on the sailjada.com booking flow, we discovered a critical race condition in the availability calendar. The issue manifested when users could interact with and select booking slots before the availability data had finished loading from the backend. This allowed customers to reserve time slots that were actually already booked, creating a false sense of availability.
The root cause was straightforward: the jadaOpenBook() function was rendering the booking modal and calendar UI without waiting for the availability fetch operation to complete. The calendar became interactive immediately, but the data it displayed was stale or incomplete.
Technical Root Cause Analysis
By examining the session logs, we can see the investigation process involved:
- Locating all instances of jadaOpenBook across the site using grep -r "jadaOpenBook"
- Finding the function definition in /Users/cb/Documents/repos/sites/sailjada.com/index.html at specific line numbers
- Identifying that multiple pages referenced jada-modal-overlay, external calendar integrations (GetMyBoat, Viator), and iframe-based booking flows
- Confirming that async fetch calls were not being awaited before UI became interactive
The investigation revealed 22 HTML pages across the site that contained the vulnerable jadaOpenBook function call, including the main index page, about page, contact page, and specialized pages like the sd-sailing-calendar subdirectory.
The Fix: Promise-Based Flow Control
The solution involved modifying the jadaOpenBook() function to enforce a state machine pattern where the calendar modal remains non-interactive until the availability data fetch completes. The key changes were:
// BEFORE: Race condition - UI immediately interactive
function jadaOpenBook() {
  showJadaBookingModal();
  fetchAvailability(); // Fire and forget
  initializeCalendar();
}

// AFTER: Proper async flow control
async function jadaOpenBook() {
  showJadaBookingModal(false); // Show modal but disabled
  try {
    const availability = await fetchAvailability();
    updateCalendarWithData(availability);
    enableJadaCalendarInteraction();
  } catch (error) {
    handleAvailabilityError(error);
    disableBookingAndShowMessage();
  }
}
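This pattern ensures that the calendar is populated and enabled only after the availability data has been fetched and processed; the old unconditional initializeCalendar() call is gone. The modal still opens immediately, which gives the user feedback, but the calendar stays disabled (through CSS classes and guarded event listeners) until the data arrives. If the fetch fails, booking stays disabled and the user sees an error message.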
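One way the disabled state could be enforced is sketched below. The createCalendarGate helper and the jada-calendar--loading class name are assumptions for illustration, not names from the site's code. The element only needs classList.toggle and setAttribute, so a real DOM node would work.

```javascript
// Hypothetical helper: keeps the calendar inert until availability has loaded.
function createCalendarGate(calendarEl) {
  let ready = false;

  function setReady(value) {
    ready = Boolean(value);
    // Assumed class name; the stylesheet would map it to pointer-events: none.
    calendarEl.classList.toggle('jada-calendar--loading', !ready);
    // aria-busy tells assistive technology that the region is still updating.
    calendarEl.setAttribute('aria-busy', String(!ready));
  }

  // Wrap a slot handler so that clicks arriving before the data are ignored.
  function guard(handler) {
    return function guardedHandler(event) {
      if (!ready) {
        if (event && typeof event.preventDefault === 'function') {
          event.preventDefault();
        }
        return false; // click swallowed
      }
      handler(event);
      return true; // click delivered
    };
  }

  setReady(false); // start in the loading state
  return { setReady, guard, isReady: () => ready };
}
```

Under these assumptions, enableJadaCalendarInteraction() would reduce to a call to gate.setReady(true), and every slot listener would be registered through gate.guard(...), so a click that slips past the CSS is still dropped.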
Implementation Across the Site
Rather than manually editing each of 22 files, we created a Python script (/tmp/fix_race_condition.py) to:
- Find all HTML files in /Users/cb/Documents/repos/sites/sailjada.com containing the vulnerable pattern
- Replace the synchronous function call with an async/await wrapper
- Add proper error handling and state management
- Add version comments for tracking purposes
The script targeted all files matching patterns like:
- sailjada.com/index.html
- sailjada.com/about/index.html
- sailjada.com/contact/index.html
- sailjada.com/sd-sailing-calendar/index.html
- And 18 additional pages
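The core of such a bulk rewrite can be sketched as a pure string transformation. The actual script was written in Python and is not reproduced here; the version below is a JavaScript approximation for consistency with the site code. The function name patchJadaOpenBook, the version label, and the assumption that the old function body contains no nested blocks are all illustrative.

```javascript
// Replacement body, taken from the AFTER listing in this write-up.
const FIXED_FN = [
  'async function jadaOpenBook() {',
  '  showJadaBookingModal(false); // Show modal but disabled',
  '  try {',
  '    const availability = await fetchAvailability();',
  '    updateCalendarWithData(availability);',
  '    enableJadaCalendarInteraction();',
  '  } catch (error) {',
  '    handleAvailabilityError(error);',
  '    disableBookingAndShowMessage();',
  '  }',
  '}',
].join('\n');

// Hypothetical patcher: returns the rewritten HTML and whether it changed.
function patchJadaOpenBook(html, version) {
  // Idempotence: a page that already has the async version is left alone.
  if (/async\s+function\s+jadaOpenBook\b/.test(html)) {
    return { html, changed: false };
  }
  // Matches from the declaration to the first line holding only "}".
  // Assumption: the old body has no nested blocks that close on their own line.
  const oldFn = /function\s+jadaOpenBook\s*\(\s*\)\s*\{[\s\S]*?\n[ \t]*\}/;
  if (!oldFn.test(html)) {
    return { html, changed: false };
  }
  const stamped = `// jadaOpenBook race-condition fix ${version}\n${FIXED_FN}`;
  // A replacer function avoids special "$" handling in the replacement text.
  return { html: html.replace(oldFn, () => stamped), changed: true };
}
```

A driver would read each HTML file, call patchJadaOpenBook, and write the result back only when changed is true, which makes the script safe to re-run.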
All changes were committed locally and staged for review before production deployment.
Staging and Deployment Strategy
Following the established deployment workflow for the Queen of San Diego infrastructure:
- Staging Deployment: Changes were first synced to s3://queenofsandiego.com/_staging/sailjada/ using AWS S3 sync operations. This allowed CB (the product owner) to review the fix in a live environment before production deployment.
- CloudFront Cache Invalidation: The staging S3 bucket is served through CloudFront, so cache invalidation paths needed to be specified for modified HTML files to ensure users saw the latest code.
- Production Deployment: After approval, the same changes would be synced to the production S3 bucket serving sailjada.com directly.
- Version Tracking: We added version comments in the modified code sections to track deployment date and fix version, enabling quick rollback if needed.
The commands used for deployment followed this pattern:
# Set AWS credentials from secrets file
set -a; source /Users/cb/Documents/repos/.secrets/repos.env; set +a
# Deploy to staging first
aws s3 sync /Users/cb/Documents/repos/sites/sailjada.com \
  s3://queenofsandiego.com/_staging/sailjada/ \
  --exclude ".git*" --exclude "node_modules/*"
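# Invalidate the CloudFront cache for the staging prefix.
# Invalidation paths must match the path viewers request, which for
# staging is /_staging/sailjada/ (see the staging URL used for validation).
aws cloudfront create-invalidation \
  --distribution-id STAGING_DISTRIBUTION_ID \
  --paths "/_staging/sailjada/*"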
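When only a handful of files change, a targeted path list can stand in for the wildcard. The sketch below assumes a hypothetical helper, toInvalidationPaths, that maps repo-relative file names to viewer paths under a prefix. It emits both /dir/index.html and /dir/ for index pages, on the assumption that the two URLs are cached as separate objects.

```javascript
// Hypothetical helper: build CloudFront invalidation paths for changed files.
function toInvalidationPaths(changedFiles, prefix) {
  // Normalise the prefix to a single leading slash and no trailing slash.
  const base = '/' + prefix.replace(/^\/+|\/+$/g, '');
  const paths = new Set();
  for (const file of changedFiles) {
    // Drop a leading "./" or "/" so the file name is prefix-relative.
    const rel = file.replace(/^\.?\/+/, '');
    const full = `${base}/${rel}`;
    paths.add(full);
    // Index pages are usually requested by their directory URL as well.
    if (full.endsWith('/index.html')) {
      paths.add(full.slice(0, -'index.html'.length));
    }
  }
  return [...paths].sort();
}

// Example with the staging prefix from the sync command above.
const examplePaths = toInvalidationPaths(
  ['index.html', './about/index.html'],
  '_staging/sailjada'
);
console.log(examplePaths.join('\n'));
```

The resulting list would be passed to --paths in place of the wildcard. A wildcard remains simpler when most pages change, as they did for this fix.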
Why This Approach?
Async/Await over Callbacks: We chose async/await over callbacks and hand-written promise chains because the resulting code is more readable and maintainable. Error handling is also more direct: a single try/catch block covers both the fetch and the calendar update.
Disabled State During Load: Rather than hiding the modal entirely, we show it with a loading state. This provides better UX—users know something is happening—while preventing premature interaction.
Bulk Scripting for Consistency: Using a Python script to apply the fix consistently across 22 files reduced the risk of manual errors and ensured all pages followed the same pattern.
Staging Before Production: The staging-first approach allows stakeholders to validate the fix in a production-like environment without risk to customer-facing systems.
Key Metrics and Validation
Post-deployment validation involved:
- Testing the booking flow on https://queenofsandiego.com/_staging/sailjada/ to confirm that the modal displayed correctly and the calendar only became interactive after availability loaded
- Verifying that no console errors related to jadaBookingState or calendar initialization were logged
- Confirming that fetch failures degraded gracefully, showing error messages rather than a broken UI
- Checking load time impact: waiting for the availability fetch should delay calendar interactivity by less than 200ms, which is acceptable for the improved correctness
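The load time check can be made repeatable with a small timing wrapper. The sketch below is an assumption about how this might be measured, not the method used during validation; timeAsync is a hypothetical name, and the clock is injectable so the helper can be exercised without real network calls.

```javascript
// Hypothetical helper: run an async function and report how long it took.
// `now` defaults to the high-resolution clock available in browsers and Node.
async function timeAsync(fn, now = () => performance.now()) {
  const start = now();
  const value = await fn();
  return { value, elapsedMs: now() - start };
}
```

In the browser console on staging, timeAsync(fetchAvailability) would resolve to the availability data together with the elapsed milliseconds, which can then be compared against the 200ms budget.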
What's Next
After CB's approval from the staging environment, the fix will be deployed to production S3 buckets and CloudFront will be invalidated for all HTML files. We'll monitor error logs and booking flow metrics for the first 48 hours post-deployment to ensure no regressions.
Additionally, we should add automated tests to the CI/CD pipeline to catch similar race conditions in booking flows or modal interactions in the future. This fix solves the immediate problem, but systematic testing would prevent similar issues.
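A regression test for this class of bug could look roughly like the sketch below. It assumes jadaOpenBook is restructured as a factory (makeJadaOpenBook, a hypothetical name) so that fakes can be injected; on the site these helpers are globals. A deferred promise holds the availability response back so the test can inspect what has happened before the data arrives. The slot payload is a placeholder.

```javascript
// Hypothetical factory form of jadaOpenBook, mirroring the AFTER listing.
function makeJadaOpenBook(deps) {
  return async function jadaOpenBook() {
    deps.showJadaBookingModal(false);
    try {
      const availability = await deps.fetchAvailability();
      deps.updateCalendarWithData(availability);
      deps.enableJadaCalendarInteraction();
    } catch (error) {
      deps.handleAvailabilityError(error);
      deps.disableBookingAndShowMessage();
    }
  };
}

// A promise whose resolution is controlled from outside.
function deferred() {
  let resolve;
  let reject;
  const promise = new Promise((res, rej) => { resolve = res; reject = rej; });
  return { promise, resolve, reject };
}

// Fakes that record the order in which the flow calls them.
function makeDeps(calls, fetchImpl) {
  const record = (name) => () => { calls.push(name); };
  return {
    showJadaBookingModal: record('show'),
    fetchAvailability: () => { calls.push('fetch'); return fetchImpl(); },
    updateCalendarWithData: record('update'),
    enableJadaCalendarInteraction: record('enable'),
    handleAvailabilityError: record('error'),
    disableBookingAndShowMessage: record('disable'),
  };
}

function assertEqual(actual, expected, label) {
  if (JSON.stringify(actual) !== JSON.stringify(expected)) {
    throw new Error(`${label}: got ${JSON.stringify(actual)}`);
  }
}

async function runRaceConditionChecks() {
  // Success path: nothing may be enabled while the fetch is pending.
  const calls = [];
  const pending = deferred();
  const done = makeJadaOpenBook(makeDeps(calls, () => pending.promise))();
  const beforeData = calls.slice();
  assertEqual(beforeData, ['show', 'fetch'], 'before data');

  pending.resolve(['placeholder-slot']); // placeholder payload
  await done;
  const afterData = calls.slice();
  assertEqual(afterData, ['show', 'fetch', 'update', 'enable'], 'after data');

  // Failure path: the calendar must never be enabled.
  const failure = [];
  const failingFetch = () => Promise.reject(new Error('network down'));
  await makeJadaOpenBook(makeDeps(failure, failingFetch))();
  assertEqual(failure, ['show', 'fetch', 'error', 'disable'], 'failure path');

  return { beforeData, afterData, failure };
}
```

Calling runRaceConditionChecks() from any test runner gives a resolved promise when all three orderings hold and a rejection that names the failing check otherwise.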