Building Multi-Site Technical Blog Infrastructure with Auto-Generated Session Transcripts
This session involved architecting and implementing a comprehensive technical documentation system across four distinct web properties: queenofsandiego.com, sailjada.com, dangerouscentaur.com, and burialsatseasandiego.com. The goal was to create an automated mechanism that captures development work at a granular level and generates blog posts in real-time, providing complete transparency into technical changes and infrastructure updates.
System Architecture Overview
The solution consists of three primary components:
- Session Capture Hook: A shell script that runs when Claude Code sessions end, extracting transcript data
- Blog Generator: Python script that parses session transcripts and converts them into detailed technical blog posts
- Infrastructure Layer: S3 buckets, CloudFront distributions, and DNS records for each tech blog domain
Infrastructure Provisioning
Created four independent tech blog deployments using existing wildcard SSL certificates:
- tech.queenofsandiego.com: Deployed on S3 bucket qos-tech-blog with CloudFront distribution, leveraging the existing *.queenofsandiego.com wildcard certificate
- tech.sailjada.com: Deployed on S3 bucket sailjada-tech-blog with CloudFront distribution, using the existing *.sailjada.com wildcard certificate
- tech.dangerouscentaur.com: Deployed on S3 bucket dc-sites (existing wildcard distribution E2Q4UU71SRNTMB) with a Namecheap CNAME pointing to the CloudFront domain
- tech.burialsatseasandiego.com: Deployed on S3 bucket bats-tech-blog with ACM certificate validation through GoDaddy DNS API integration
Each deployment includes an index.html landing page with navigation structure ready to receive generated blog posts. The infrastructure initialization script handles bucket creation, CloudFront distribution setup, and DNS record provisioning.
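The site-to-bucket mapping above can be captured in a small helper. This is an illustrative sketch, not the initialization script itself; only the subdomain and bucket names are taken from the deployments listed above.

```python
# Site -> bucket mapping for the four tech blog deployments.
# Bucket names match the provisioned infrastructure; the helper is a sketch.
SITE_BUCKETS = {
    "tech.queenofsandiego.com": "qos-tech-blog",
    "tech.sailjada.com": "sailjada-tech-blog",
    "tech.dangerouscentaur.com": "dc-sites",
    "tech.burialsatseasandiego.com": "bats-tech-blog",
}


def bucket_for(site: str) -> str:
    """Return the S3 bucket backing a tech blog subdomain."""
    try:
        return SITE_BUCKETS[site]
    except KeyError:
        raise ValueError(f"no tech blog provisioned for {site}") from None
```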
Session Transcript Processing Pipeline
The system captures Claude Code session data through a Stop hook located at /Users/cb/.claude/hooks/tech_blog_stop.sh. This hook:
- Reads the most recent session transcript from Claude's projects directory structure
- Extracts tool use entries that represent actual file modifications and commands executed
- Filters sensitive information (credentials, API keys, personal data)
- Passes cleaned transcript data to the blog generator
The hook is configured in /Users/cb/.claude/settings.json under the hooks section, triggering automatically when development sessions complete.
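The relevant fragment of settings.json looks roughly like this. The shape is reconstructed from memory of the Claude Code hooks schema and may differ between versions, so verify it against the hooks documentation for your installed release:

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "/Users/cb/.claude/hooks/tech_blog_stop.sh"
          }
        ]
      }
    ]
  }
}
```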
Blog Generation Logic
The blog generator (/Users/cb/Documents/repos/tools/tech_blog_generator.py) performs several key operations:
- Transcript Parsing: Reads JSONL-formatted session data and extracts meaningful work units
- Site Detection: Analyzes file paths to determine which tech blog(s) should receive the post based on repository structure
- Content Generation: Converts raw file modifications and commands into structured HTML with semantic sections
- S3 Publishing: Uploads generated post to appropriate S3 bucket and triggers CloudFront invalidation
- Index Updates: Maintains reverse-chronological index of all posts with timestamps and summaries
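The transcript-parsing step can be sketched as below. The JSONL field names used here ("type", "tool", "input") are a deliberate simplification; real Claude Code transcripts may nest tool calls inside message objects, so treat the schema as an assumption.

```python
import json
from typing import Iterator


def extract_work_units(jsonl_text: str) -> Iterator[dict]:
    """Yield tool-use entries from a session transcript.

    Assumes a flat, simplified schema: one JSON object per line, with
    tool invocations marked by a top-level "type" of "tool_use".
    """
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        entry = json.loads(line)
        if entry.get("type") == "tool_use":
            yield {"tool": entry.get("tool"), "input": entry.get("input")}
```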
The generator creates posts following a consistent format: specific file paths, exact AWS resource identifiers (bucket names, distribution IDs), command examples without credentials, and architectural rationale. This granularity ensures technical stakeholders like Sergio can understand exactly what changed and why.
Multi-Site Domain Mapping
The infrastructure initialization script handles DNS differently based on domain provider:
- Route53-managed domains (queenofsandiego.com, sailjada.com): Creates alias records pointing the tech subdomains at their CloudFront distributions
- Namecheap-managed domains (dangerouscentaur.com): Provisions CNAME records pointing tech subdomain to CloudFront endpoint
- GoDaddy-managed domains (burialsatseasandiego.com): Integrates with GoDaddy API to add ACM certificate validation records and sets up CloudFront CNAME
This multi-provider approach required abstraction in the infrastructure code, with conditional logic determining appropriate DNS operations based on domain configuration stored in /Users/cb/.claude/projects/memory/project_tech_blogs.md.
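The conditional dispatch can be sketched like this. The domain-to-provider table comes from the deployment notes above; the handler bodies are placeholder descriptions standing in for the real Route53, Namecheap, and GoDaddy API calls.

```python
# Provider per apex domain, as recorded in the project memory file.
DNS_PROVIDERS = {
    "queenofsandiego.com": "route53",
    "sailjada.com": "route53",
    "dangerouscentaur.com": "namecheap",
    "burialsatseasandiego.com": "godaddy",
}


def dns_action(domain: str, cloudfront_domain: str) -> str:
    """Describe the DNS operation the init script would perform.

    Returns a human-readable summary; the real script would call the
    corresponding provider API instead.
    """
    provider = DNS_PROVIDERS[domain]
    if provider == "route53":
        return f"route53: ALIAS tech.{domain} -> {cloudfront_domain}"
    if provider == "namecheap":
        return f"namecheap: CNAME tech.{domain} -> {cloudfront_domain}"
    # GoDaddy needs ACM validation records before the CNAME can go live.
    return f"godaddy: ACM validation + CNAME tech.{domain} -> {cloudfront_domain}"
```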
Content Filtering and Security
A critical requirement was preventing credential leakage while maintaining technical detail. The blog generator implements several filtering mechanisms:
- Strips environment variables and credential references from command output
- Redacts file paths containing sensitive tokens or keys
- Validates against patterns matching common credential formats before publishing
- Maintains a sanitization log for audit purposes
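The pattern-matching filter can be sketched as below. These two patterns (AWS access key IDs and KEY=VALUE credential assignments) are illustrative examples only; the generator's real filter set is broader than this.

```python
import re

# Illustrative redaction patterns; the production filter covers more formats.
REDACTION_PATTERNS = [
    # AWS access key IDs: "AKIA" followed by 16 uppercase alphanumerics.
    re.compile(r"AKIA[0-9A-Z]{16}"),
    # Environment-variable style assignments such as AWS_SECRET_ACCESS_KEY=...
    re.compile(r"(?i)\b\w*(?:secret|token|password|api_key)\w*\s*=\s*\S+"),
]


def sanitize(text: str) -> str:
    """Replace credential-like substrings before a post is published."""
    for pattern in REDACTION_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```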
Session data review in /Users/cb/.claude/projects/memory/feedback_proof_email_salutation.md and related memory files ensures human oversight can catch any edge cases the automated filters miss.
Integration with Existing Navigation
Updated /Users/cb/Documents/repos/sites/queenofsandiego.com/index.html to include a "Technical Blog" link in the Ship's Papers menu, making tech documentation discoverable alongside other site resources. Similar navigation updates were applied across all four properties to surface the tech blogs prominently.
Operational Workflow
The complete workflow operates as follows:
- Developer works on a session, modifying files across the repository
- Session ends; Stop hook executes automatically
- Hook extracts transcript and invokes blog generator with site context
- Generator creates post with filtered, structured technical details
- Post uploads to appropriate S3 bucket
- CloudFront distribution invalidates to serve fresh content
- Post becomes immediately available at the site's tech subdomain (e.g., tech.queenofsandiego.com)
Key Decisions and Rationale
Why automated generation over manual posts? Manual documentation introduces latency and selection bias. Automated capture ensures every technical change is documented regardless of perceived importance, providing complete audit trails.
Why separate S3 buckets for each domain? Maintains clear separation of concerns, simplifies access control policies, and allows independent scaling if certain tech blogs receive high traffic.
Why preserve exact file paths and resource IDs? When reviewing infrastructure changes, exact identifiers allow rapid verification: grep the post for a CloudFront dist ID to confirm which deployment was modified.
What's Next
Initial testing confirmed the infrastructure is live and accessible. The blog generator is functional and integrated with Claude Code settings. Upcoming work includes:
- Image inventory work for burialsatseasandiego.com (tracked in progress board)
- Analytics audit across all properties with recommendations for booking conversion optimization
- Refinement of generated post formatting based on actual session output patterns
- Monitoring dashboard for tech blog post generation success