Building a Local SMS Sync Bridge: Extracting Samsung Messages Without Cloud Dependencies

Over the past development session, I built a local SMS synchronization system that bridges Samsung Android devices with macOS without requiring Twilio credentials or cloud infrastructure for ingestion; the only external service is AWS SES, used for the final email delivery. This post documents the architecture, implementation decisions, and lessons learned.

The Problem Statement

The existing SMS infrastructure relied on Twilio for message ingestion and retrieval. However, some use cases, particularly extracting historical SMS threads from local Android backups, needed a lightweight, credential-free approach that could:

  • Read SMS directly from Android device databases
  • Parse message threads without API dependencies
  • Export digests in a format suitable for email delivery
  • Run on a local machine with minimal setup

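The challenge: the messages live in a SQLite database that the owning app holds open. On macOS, Messages.app keeps its store in chat.db; on Android, the SMS provider's database is typically named mmssms.db and cannot be read without root access. We needed both the tooling and the extraction strategy.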

Architecture: Three-Layer Approach

The solution implements a three-layer architecture:

  • Layer 1: Device Bridge — ADB (Android Debug Bridge) or Messages.app integration on macOS
  • Layer 2: Data Extraction — SQLite query scripts to parse message threads
  • Layer 3: Processing & Delivery — Python daemon that digests threads and sends summaries via SES

This approach keeps everything local until the final step (email delivery), reducing external dependencies and improving privacy.

Implementation Details

File Structure

The implementation consists of two primary files:

  • /Users/cb/Documents/repos/tools/samsung_sms_sync.py — Main extraction and digest logic
  • /Users/cb/Library/LaunchAgents/com.cb.samsung-sms-sync.plist — macOS daemon configuration for scheduled runs

The Python script handles:

  • Querying the local SMS export file (previously generated from the Messages app)
  • Parsing conversation threads by participant phone number
  • Filtering messages by date range
  • Formatting thread summaries with context headers
  • Sending digests via AWS SES
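The "thread summaries with context headers" step can be sketched roughly as follows. This is a minimal illustration, not the script's actual formatter: the function name format_digest, the header wording, and the "<-" / "->" direction markers are all assumptions. It expects the same per-message fields used in the extraction step (timestamp, body, direction).

```python
from datetime import datetime

def format_digest(threads):
    """Render {participant: [message dicts]} as a plain-text digest (hypothetical layout)."""
    sections = []
    for participant, messages in sorted(threads.items()):
        if not messages:
            continue  # nothing to summarize for this participant
        ordered = sorted(messages, key=lambda m: m["timestamp"])
        first, last = ordered[0]["timestamp"], ordered[-1]["timestamp"]
        # Context header: who, how many messages, and the covered date span.
        header = (f"Thread with {participant}: {len(ordered)} message(s), "
                  f"{first:%Y-%m-%d} to {last:%Y-%m-%d}")
        lines = [header, "-" * len(header)]
        for m in ordered:
            arrow = "<-" if m["direction"] == "inbound" else "->"
            lines.append(f"{m['timestamp']:%Y-%m-%d %H:%M} {arrow} {m['body']}")
        sections.append("\n".join(lines))
    return "\n\n".join(sections)

example = {
    "+15550100": [
        {"timestamp": datetime(2024, 1, 2, 9, 30), "body": "hi", "direction": "inbound"},
    ]
}
digest_text = format_digest(example)
```

Sorting by participant and by timestamp keeps the output stable between runs, which makes digests easier to compare.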

Message Query Strategy

Rather than reading chat.db directly (which means closing the Messages app or working around its file locks), the script reads an existing SMS export file that had already been extracted from the device. The export is plain text, so standard string operations and regular expressions are enough to parse it:


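# Thread extraction sketch. The export layout and both patterns below are
# assumptions; adjust them to match the real export file.
import re
from collections import defaultdict
from datetime import datetime

PARTICIPANT_RE = re.compile(r"^Conversation with (\+?\d+)")
MESSAGE_RE = re.compile(r"^\[(\d{4}-\d\d-\d\d \d\d:\d\d:\d\d)\] (Sent|Received): (.*)$")

def extract_threads(sms_export, start, end):
    threads = defaultdict(list)  # no KeyError on a participant's first message
    current_participant = None
    for line in sms_export:
        if m := PARTICIPANT_RE.match(line):
            current_participant = m.group(1)
        elif (m := MESSAGE_RE.match(line)) and current_participant:
            message_date = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
            if start <= message_date <= end:
                threads[current_participant].append({
                    'timestamp': message_date,
                    'body': m.group(3),
                    'direction': 'outbound' if m.group(2) == 'Sent' else 'inbound',
                })
    return threads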

This approach avoids the complexity of SQLite locking and lets us reuse existing export infrastructure.

Daemon Configuration

The LaunchAgent plist was configured to run the sync script at regular intervals:


<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" 
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.cb.samsung-sms-sync</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/python3</string>
        <string>/Users/cb/Documents/repos/tools/samsung_sms_sync.py</string>
    </array>
    <key>StartInterval</key>
    <integer>3600</integer>
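    <!-- LaunchAgents run as the logged-in user, who normally cannot write to /var/log -->
    <key>StandardOutPath</key>
    <string>/Users/cb/Library/Logs/samsung-sms-sync.log</string>
    <key>StandardErrorPath</key>
    <string>/Users/cb/Library/Logs/samsung-sms-sync.err</string>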
</dict>
</plist>

The StartInterval key runs the script every 3600 seconds (1 hour), though this can be tuned based on message volume and digest frequency requirements.
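If the interval needs tuning often, the plist can be generated rather than edited by hand. The sketch below uses Python's standard plistlib module; the helper name build_launch_agent and the log_dir parameter are illustrative assumptions, and the log directory should be one the logged-in user can write to, since LaunchAgents run as that user.

```python
import plistlib

def build_launch_agent(label, script_path, interval_seconds, log_dir):
    """Return XML plist bytes for a launchd job (hypothetical helper)."""
    job = {
        "Label": label,
        "ProgramArguments": ["/usr/bin/python3", script_path],
        "StartInterval": interval_seconds,  # seconds between runs
        "StandardOutPath": f"{log_dir}/samsung-sms-sync.log",
        "StandardErrorPath": f"{log_dir}/samsung-sms-sync.err",
    }
    return plistlib.dumps(job)  # default output format is XML

# Example: run every 30 minutes instead of hourly.
plist_bytes = build_launch_agent(
    "com.cb.samsung-sms-sync",
    "/Users/cb/Documents/repos/tools/samsung_sms_sync.py",
    1800,
    "/Users/cb/Library/Logs",
)
```

Writing plist_bytes to the LaunchAgents path and reloading the job would apply the new interval; the generated XML carries the same keys as the hand-written version.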

Key Infrastructure Decisions

Why Not Direct SQLite Access?

Direct access to chat.db requires one of the following:

  • Closing the Messages app (breaks user experience)
  • Using file locking mechanisms (fragile across OS updates)
  • Extracting a copy at sync time (adds latency)

Using the pre-existing export file sidesteps these issues and works within macOS's existing constraints.
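For reference, the third option (copying the database at sync time) could look roughly like this if direct access ever becomes necessary. It is a sketch only and not part of the current script: the function name snapshot_database and both paths are assumptions. It relies on the standard sqlite3 module's online backup API (Connection.backup) and opens the source read-only via a URI.

```python
import sqlite3

def snapshot_database(source_path, snapshot_path):
    """Copy a SQLite database to snapshot_path using the online backup API."""
    # mode=ro opens the source read-only, so this sketch never writes to it.
    src = sqlite3.connect(f"file:{source_path}?mode=ro", uri=True)
    dst = sqlite3.connect(snapshot_path)
    try:
        src.backup(dst)  # copies all pages from src into dst
    finally:
        dst.close()
        src.close()
    return snapshot_path
```

Queries would then run against the snapshot rather than the live file. The cost is the extra copy on every sync, which is the latency trade-off noted in the list above.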

Why SES for Delivery?

AWS SES was chosen because:

  • Already integrated into the codebase for other services
  • No rate-limiting issues for digest-scale volumes (<10 emails/hour)
  • Credentials already available in the shared environment config
  • Avoids depending on Twilio for outbound communication
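A minimal sketch of the delivery step, assuming boto3 is installed and AWS credentials are already available in the environment. The function names, the example addresses, and the default region are placeholders, not values from the real configuration. The message is built with the standard library and handed to the SES send_raw_email call.

```python
from email.message import EmailMessage

def build_digest_email(sender, recipients, subject, body):
    """Assemble a plain-text digest email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(body)
    return msg

def send_via_ses(msg, recipients, region="us-east-1"):
    """Send a prepared message through SES (region is an assumed default)."""
    import boto3  # third-party; imported lazily so message building works without it
    client = boto3.client("ses", region_name=region)
    return client.send_raw_email(
        Source=msg["From"],
        Destinations=list(recipients),
        RawMessage={"Data": msg.as_bytes()},
    )

recipients = ["ops@example.com"]  # placeholder address
message = build_digest_email("digest@example.com", recipients, "SMS digest", "2 new threads")
# send_via_ses(message, recipients)  # uncomment once credentials are configured
```

Keeping message construction separate from sending means the digest format can be tested locally without touching AWS.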

Date Range Filtering

The script filters messages by a configurable window (e.g., last 5 days) to avoid re-digesting old threads. This is controlled via command-line arguments or environment variables, allowing flexible digest schedules without code changes.
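One way to wire this up is a command-line flag whose default comes from an environment variable. The flag name --days and the variable SMS_DIGEST_DAYS are assumptions made for illustration; the real script may use different names.

```python
import argparse
import os
from datetime import datetime, timedelta

def resolve_window(argv=None, env=None, now=None):
    """Return (start, end) of the digest window.

    Precedence: --days flag, then SMS_DIGEST_DAYS, then a default of 5 days.
    """
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="SMS digest window")
    parser.add_argument("--days", type=int,
                        default=int(env.get("SMS_DIGEST_DAYS", "5")),
                        help="how many days back to include")
    args = parser.parse_args(argv)
    end = now or datetime.now()
    return end - timedelta(days=args.days), end

# Example: a two-day window ending at a fixed reference time.
start, end = resolve_window(["--days", "2"], env={}, now=datetime(2024, 1, 10, 12, 0))
```

Passing argv, env, and now explicitly is only for testability; called with no arguments the function reads the real command line, environment, and clock.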

Workflow Integration

The extraction process integrates with existing infrastructure:

  • SMS export file location is centralized in repos.env
  • Email recipients are configured in a simple list (no hardcoding)
  • Digest format matches existing message summary templates
  • Logs are written to standard locations for monitoring

This makes the system maintainable and debuggable without requiring deep knowledge of the specific Android setup.
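As an illustration of the centralized configuration, a simple KEY=VALUE file such as repos.env can be read with a few lines of Python. The key names SMS_EXPORT_PATH and SMS_DIGEST_RECIPIENTS are hypothetical, as is the comma-separated recipient format; the actual file may be laid out differently.

```python
def load_env_file(lines):
    """Parse KEY=VALUE lines, skipping blanks and '#' comments."""
    config = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        config[key.strip()] = value.strip().strip('"').strip("'")
    return config

sample = [
    "# shared settings",
    'SMS_EXPORT_PATH="/tmp/sms_export.txt"',
    "SMS_DIGEST_RECIPIENTS=a@example.com,b@example.com",
    "",
]
config = load_env_file(sample)
recipient_list = config["SMS_DIGEST_RECIPIENTS"].split(",")
```

In practice the function would be called with an open file handle, for example load_env_file(open(path)), since iterating a file yields its lines.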

What's Next

Future improvements could include:

  • Bidirectional sync: Writing replies back to the SMS export or Android device via ADB
  • Thread deduplication: Tracking which digests have been sent to avoid duplicates across restarts
  • Participant aliasing: Mapping phone numbers to human-readable names in digest output
  • Metrics: Publishing message counts and thread latency to CloudWatch for monitoring
  • ADB integration: If direct Android access becomes necessary, ADB scripts are already stubbed in the tools directory
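To make the thread deduplication idea concrete, here is one possible shape for it: fingerprint each thread's content and persist the fingerprints of digests that have been delivered. Nothing here exists in the current script; the function names and the JSON state file are assumptions.

```python
import hashlib
import json
from pathlib import Path

def thread_fingerprint(participant, messages):
    """Hash a thread's participant and message contents."""
    h = hashlib.sha256(participant.encode("utf-8"))
    for m in messages:
        h.update(f"{m['timestamp']}|{m['direction']}|{m['body']}".encode("utf-8"))
    return h.hexdigest()

def load_sent(state_path):
    """Load previously recorded fingerprints, or an empty set on first run."""
    path = Path(state_path)
    return set(json.loads(path.read_text())) if path.exists() else set()

def select_unsent(threads, sent):
    """Return {fingerprint: (participant, messages)} for threads not yet digested."""
    pending = {}
    for participant, messages in threads.items():
        fp = thread_fingerprint(participant, messages)
        if fp not in sent:
            pending[fp] = (participant, messages)
    return pending

def record_sent(state_path, sent, fingerprints):
    """Persist fingerprints; call this only after delivery has succeeded."""
    Path(state_path).write_text(json.dumps(sorted(sent | set(fingerprints))))
```

Because the fingerprint covers message contents, a thread that gains a new message hashes differently and is digested again, while an unchanged thread is skipped even after a restart.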