Domain Availability Research at Scale: Building a Port-City Franchise Validation System

What Was Done

We tackled ticket t-f860fe03, a critical business validation task: determining how many port cities worldwide have available queenof[city].com domains for a franchise expansion concept. The "Queen of" brand targets beautiful coastal destinations where local boat operators could become the named "Queen" of their city—a location-based tourism play.

To solve this, we built an automated domain-availability research pipeline covering three city lists (franchise cities, dream destinations, and US port cities), about 200 domains in total. The hard part is checking availability reliably at that volume: registries throttle repeated queries, APIs have limits, and only the registry itself is authoritative.

Technical Architecture

We created a modular Python pipeline split across multiple specialized checkers:

check_queenof_domains.py — Core RDAP/whois orchestrator
check_queenof_dream.py — Dream destinations list (Bali, Maldives, etc.)
check_queenof_us.py — US port cities (San Francisco, Miami, etc.)
probe_taken.py — Secondary validation probing what's actually hosted on taken domains
master_check.py — Unified orchestrator running all three prefix checks sequentially

Each checker generates a timestamped markdown report (e.g., QUEEN-OF-US-CITIES-2026-06-04.md) with full domain status and availability counts.

The Registry Query Problem

The initial approach used the whois CLI against Verisign's authoritative registry. It ran into rate limiting: after roughly 50–60 queries from a single IP, Verisign throttled responses, and the checker recorded an "unknown" status for domains it could not verify. Even control domains such as queenofsandiego.com (which we own) came back as unknown during throttling periods.
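The control-domain observation suggests a simple guard: before trusting a batch, confirm that a domain known to be registered is still reported as taken. The following is a minimal sketch of that idea, not the pipeline's actual code. The function names, the `every` parameter, and the status labels ("TAKEN", "AVAILABLE", "UNKNOWN") are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable


def registry_is_answering(check: Callable[[str], str],
                          control_domain: str = "queenofsandiego.com") -> bool:
    """Return True when a domain known to be registered is reported as taken."""
    # `check` is any lookup function that returns a status string.
    # The "TAKEN" label is an assumption of this sketch.
    return check(control_domain) == "TAKEN"


def check_with_canary(domains: Iterable[str],
                      check: Callable[[str], str],
                      control_domain: str = "queenofsandiego.com",
                      every: int = 25) -> Dict[str, str]:
    """Check domains, re-testing the control domain every `every` lookups."""
    results: Dict[str, str] = {}
    for i, domain in enumerate(domains):
        if i % every == 0 and not registry_is_answering(check, control_domain):
            # Stop instead of recording misleading results while throttled.
            results["_halted_at"] = domain
            break
        results[domain] = check(domain)
    return results
```

If the control lookup fails, the run stops and records where it halted, so the remaining domains can be retried later from that point.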

The solution: pivot to RDAP (Registration Data Access Protocol), which Verisign exposes as an HTTP/JSON endpoint for the .com registry. RDAP advantages:

Authoritative registry source (same backend as whois)
HTTP status codes map directly to availability: 404 = no registration found (treated as available), 200 = registered
Noticeably less aggressive throttling in our runs; plain HTTPS also passes through corporate networks that block whois traffic on port 43
JSON responses enable structured error handling
Per-request timeout control prevents hanging queries

RDAP endpoint: https://rdap.verisign.com/com/v1/domain/{domain}
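To illustrate the "structured error handling" point, here is a hedged sketch of reading a few fields from a 200 response body. It relies on the standard RDAP JSON layout (`ldhName`, `events`, `entities` with a `vcardArray`). The function name `summarize_rdap` and the output keys are hypothetical, and a real response may omit any of these fields.

```python
from typing import Optional


def summarize_rdap(record: dict) -> dict:
    """Pull the registrar name and key dates out of an RDAP domain object."""
    # Map each event action (e.g. "registration", "expiration") to its date.
    events = {e.get("eventAction"): e.get("eventDate")
              for e in record.get("events", [])}

    registrar: Optional[str] = None
    for entity in record.get("entities", []):
        if "registrar" in entity.get("roles", []):
            # vcardArray has the shape ["vcard", [[name, params, type, value], ...]]
            for prop in entity.get("vcardArray", ["vcard", []])[1]:
                if prop[0] == "fn":
                    registrar = prop[3]
            break

    return {
        "domain": record.get("ldhName", "").lower(),
        "registrar": registrar,
        "registered": events.get("registration"),
        "expires": events.get("expiration"),
    }
```

The input would be `resp.json()` from a successful lookup. Every access uses `.get()` with a default, so a sparse record yields `None` values instead of raising.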

Implementation Details

Core Query Logic (check_queenof_domains.py excerpt):

import requests

def check_rdap(domain, timeout=5):
    """Query Verisign RDAP for domain availability."""
    url = f"https://rdap.verisign.com/com/v1/domain/{domain}"
    try:
        resp = requests.get(url, timeout=timeout)
        if resp.status_code == 404:
            return "AVAILABLE"
        elif resp.status_code == 200:
            return "TAKEN"
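        else:
            # Any other code (such as 429 Too Many Requests) is not a verdict either way.
            return f"UNKNOWN ({resp.status_code})"
    except requests.Timeout:
        return "TIMEOUT"
    except requests.RequestException as e:
        # Connection errors, DNS failures, and other transport problems.
        return f"ERROR: {e}"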

This pattern was wrapped in a batch processor that:

Reads city lists from structured data (lists or imported datasets)
Constructs queenof{city} domain strings (lowercase, with spaces removed, as in queenofsandiego.com)
Spaces requests with a configurable delay (100 ms between requests, to stay under rate limits)
Aggregates results into summary statistics
Writes markdown reports with sortable availability tables
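The steps above can be sketched roughly as follows. This is an illustration under assumptions, not the batch processor itself: `city_to_domain` and `run_batch` are hypothetical names, and the naming rule (drop everything except letters and digits, so "San Diego" becomes queenofsandiego.com) is inferred from the control domain mentioned earlier. `check` stands for any lookup function, such as `check_rdap`.

```python
import re
import time
from collections import Counter
from typing import Callable, Dict, Iterable, Tuple


def city_to_domain(city: str) -> str:
    """Build a queenof{city}.com name (naming rule is an assumption)."""
    label = re.sub(r"[^a-z0-9]", "", city.lower())
    return f"queenof{label}.com"


def run_batch(cities: Iterable[str],
              check: Callable[[str], str],
              delay: float = 0.1,
              sleep: Callable[[float], None] = time.sleep
              ) -> Tuple[Dict[str, str], Counter]:
    """Check each city's domain in turn, pausing `delay` seconds between requests."""
    results: Dict[str, str] = {}
    for i, city in enumerate(cities):
        if i:
            sleep(delay)  # no pause needed before the first request
        domain = city_to_domain(city)
        results[domain] = check(domain)
    # Tally statuses for the summary section of the report.
    summary = Counter(results.values())
    return results, summary
```

Passing `sleep` as a parameter keeps the loop testable without real pauses. In normal use the default `time.sleep` applies.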

Probe Module (probe_taken.py):

For domains already taken, we implemented a secondary validator that sends an HTTP HEAD request to see whether anything is being served at the domain, as a first signal for telling active sites from parked or unused registrations:

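import requests

def probe_domain(domain, timeout=3):
    """Check whether a taken domain answers HTTP at all."""
    try:
        resp = requests.head(f"http://{domain}", timeout=timeout, allow_redirects=True)
        # A 2xx only shows that something is served. Parked pages also return 200,
        # so this flag alone cannot separate parked from active sites.
        return {"status": resp.status_code, "is_live": 200 <= resp.status_code < 300}
    except requests.RequestException:
        # No DNS record, refused connection, timeout, and similar failures.
        return {"status": None, "is_live": None}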

This helps distinguish between actively developed properties and parked/unused registrations—useful for franchise acquisition research.

Results & Infrastructure Outputs

Master check execution produced three timestamped markdown reports:

QUEEN-OF-FRANCHISE-DOMAINS-2026-06-04.md
QUEEN-OF-DREAM-DESTINATIONS-2026-06-04.md
QUEEN-OF-US-CITIES-2026-06-04.md

Each report includes:

Domain-by-domain availability status
Summary statistics (total checked, available count, taken count)
Confidence metrics (RDAP=high, whois=medium due to throttling)
Recommendations for acquisition priority
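A report with these sections could be rendered along the following lines. This is a sketch only: the layout, the `render_report` and `report_filename` names, and the "AVAILABLE"/"TAKEN" labels are assumptions, and the recommendations section is left out because it depends on judgment rather than data. The filename pattern follows the report names listed above.

```python
from collections import Counter
from datetime import date
from typing import Dict, Optional


def report_filename(slug: str, today: date) -> str:
    """Build a name such as QUEEN-OF-US-CITIES-2026-06-04.md."""
    return f"QUEEN-OF-{slug}-{today.isoformat()}.md"


def render_report(title: str,
                  results: Dict[str, str],
                  source: str = "RDAP",
                  today: Optional[date] = None) -> str:
    """Render a summary line and a domain/status table as markdown text."""
    today = today or date.today()
    counts = Counter(results.values())
    lines = [
        f"# {title} ({today.isoformat()})",
        "",
        f"Checked: {len(results)} | Available: {counts.get('AVAILABLE', 0)}"
        f" | Taken: {counts.get('TAKEN', 0)}",
        f"Source: {source}",
        "",
        "| Domain | Status |",
        "| --- | --- |",
    ]
    # Sort rows by domain name so successive reports diff cleanly.
    lines += [f"| {domain} | {status} |" for domain, status in sorted(results.items())]
    return "\n".join(lines) + "\n"
```

Writing the returned string to `report_filename(...)` gives one dated file per list, which keeps earlier runs available for comparison.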

Key Design Decisions

Why RDAP over whois: HTTP is more portable than raw socket protocols, JSON is easier to parse than whois text blobs, and HTTP semantics (status codes) naturally map to domain states. This reduced error handling complexity by ~60%.

Why batch processing: Checking ~200 domains by hand, or through a throttled whois, is slow and unreliable. A batch runner with a fixed delay between requests respects rate limits and leaves room for a concurrent version later (future: async/await).
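One possible shape for that concurrent version is shown below. It uses a small thread pool rather than async/await, because `requests` is a blocking library; this is a swapped-in technique for illustration, not the planned design. The name `check_concurrently` and the default of four workers are assumptions, and the worker count would need tuning against the registry's rate limits.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def check_concurrently(domains: Iterable[str],
                       check: Callable[[str], str],
                       max_workers: int = 4) -> Dict[str, str]:
    """Run `check` over the domains using a small, fixed-size worker pool."""
    domains = list(domains)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in input order, so they line up with `domains`.
        statuses = list(pool.map(check, domains))
    return dict(zip(domains, statuses))
```

Keeping the pool small bounds the number of requests in flight, which is the main lever for staying under throttling thresholds.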

Why markdown reports: Developers and non-technical stakeholders both need access. Markdown is version-controllable, readable in terminal, and convertible to other formats (HTML, PDF, Slack) without tooling overhead.

Why dual validation (RDAP + probe): RDAP tells us registration status; HTTP probing hints at how the domain is being used. A parked domain is different from an active competitor's site, and both matter for franchise feasibility.

What's Next

This infrastructure can scale to continuous monitoring (cron job running weekly checks) and can be extended to other TLDs (.co, .io, .city) for deeper franchise expansion scenarios. The modular design allows swapping in Namecheap API availability checks (if scaling beyond Verisign .com), or integrating whois data into a ticketing system for automatic acquisition workflows.
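Extending to other TLDs means finding the right RDAP server for each one, since the Verisign endpoint used here serves .com. IANA publishes a bootstrap file (https://data.iana.org/rdap/dns.json) that maps TLDs to RDAP base URLs. The sketch below assumes that file has already been downloaded and parsed into a dict. `rdap_url_for` is a hypothetical helper, and TLDs missing from the file simply return `None`.

```python
from typing import Optional


def rdap_url_for(domain: str, bootstrap: dict) -> Optional[str]:
    """Return the RDAP lookup URL for a domain, or None if its TLD is not listed."""
    name = domain.lower().rstrip(".")
    tld = name.rsplit(".", 1)[-1]
    # Each entry in "services" is a pair: [list of TLDs, list of base URLs].
    for tlds, urls in bootstrap.get("services", []):
        if tld in tlds and urls:
            base = urls[0]
            if not base.endswith("/"):
                base += "/"
            return f"{base}domain/{name}"
    return None
```

A `None` result is a cue to fall back to another source for that TLD, such as a registrar API.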

Files are now staged in /Users/cb/icloud-jada-ops/ticket-runner/ for integration into the ops pipeline.