Building a Local SMS Digest Pipeline: Samsung Messages Integration Without Twilio
This post documents the architecture and implementation of a local SMS digest system that extracts conversation threads from Samsung Messages, processes them, and delivers summaries via email—all without requiring Twilio credentials or cloud SMS infrastructure.
Problem Statement
The original SMS infrastructure relied on Twilio for inbox management and message retrieval. However, for operational use cases where messages are already stored locally (Android device backups, Mac Messages database, or Samsung SMS exports), fetching Twilio credentials and paying per-message fees add unnecessary overhead. The goal was to build a lightweight, local-first pipeline that could:
- Read SMS data from local sources (Mac Messages chat.db, Android SMS backups, or exported Samsung SMS files)
- Extract specific conversation threads by phone number
- Generate human-readable digests of recent activity
- Send summaries via existing email infrastructure (AWS SES)
- Require zero Twilio integration
Architecture Overview
The solution uses a three-stage pipeline:
- Stage 1: SMS Source Detection & Ingestion — Identifies available SMS databases (Mac Messages chat.db, Android backups, Samsung SMS exports)
- Stage 2: Thread Extraction & Processing — Filters messages by phone number, sorts by timestamp, and structures conversation metadata
- Stage 3: Digest Generation & Delivery — Formats summaries and sends via SES
Implementation Details
SMS Source Detection
The first step was to identify where SMS data actually lives on the system. The commands used included:
find ~ -name "chat.db" -type f 2>/dev/null
find ~ -name "*messages*" -type f 2>/dev/null
ls -lah ~/Documents/repos/tools/ | grep -i sms
This revealed multiple sources:
- Mac Messages: ~/Library/Messages/chat.db, a SQLite database with all iMessage and SMS conversations
- JADA SMS Export: A pre-existing export file containing Samsung SMS data
- Android Backups: Potential ADB bridge or KDE Connect data (explored but not needed for this use case)
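The same detection step can be sketched in Python. Only the chat.db location comes from the commands above; the function name `detect_sms_sources`, the `export_dir` parameter, and the rule that export files contain "sms" in their name are assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def detect_sms_sources(home: Path, export_dir: Optional[Path] = None) -> dict[str, list[Path]]:
    """Return the SMS sources found under `home`, grouped by kind."""
    sources: dict[str, list[Path]] = {"mac_messages": [], "exports": []}

    # Location of the Mac Messages database, as listed above.
    chat_db = home / "Library" / "Messages" / "chat.db"
    if chat_db.is_file():
        sources["mac_messages"].append(chat_db)

    # Assumption: export files can be recognised by "sms" in the file name.
    if export_dir is not None and export_dir.is_dir():
        for path in sorted(export_dir.iterdir()):
            if path.is_file() and "sms" in path.name.lower():
                sources["exports"].append(path)

    return sources
```

A call such as `detect_sms_sources(Path.home(), Path.home() / "Documents/repos/tools")` would mirror the `find` and `ls` commands, and an empty result for both keys means no local source is available.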
Mac Messages Database Schema
The Mac Messages chat.db is a standard SQLite database with the following key tables:
- chat: Conversation metadata (chat identifiers, service type)
- message: Individual messages with timestamps, text, and handle_id
- handle: Phone numbers and email addresses linked to messages
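Querying the database requires understanding chat identifiers. For one-to-one conversations, chat_identifier typically holds the other party's phone number in E.164 form (for example, +16194164690) or an email address, while group chats use an opaque identifier. Example query to list all conversations: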
sqlite3 ~/Library/Messages/chat.db "SELECT c.chat_identifier, COUNT(m.ROWID) as message_count FROM chat c LEFT JOIN chat_message_join cmj ON c.ROWID = cmj.chat_id LEFT JOIN message m ON cmj.message_id = m.ROWID GROUP BY c.chat_identifier ORDER BY MAX(m.date) DESC;"
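For use inside a script, the same query can be run through Python's sqlite3 module. This is a minimal sketch rather than the code in samsung_sms_sync.py. The function name `list_conversations` is hypothetical, and opening the file read-only is a precaution added here so the live Messages store cannot be modified.

```python
import sqlite3
from pathlib import Path

# Same query as the sqlite3 one-liner above, split across lines for readability.
LIST_CHATS_SQL = """
SELECT c.chat_identifier, COUNT(m.ROWID) AS message_count
FROM chat c
LEFT JOIN chat_message_join cmj ON c.ROWID = cmj.chat_id
LEFT JOIN message m ON cmj.message_id = m.ROWID
GROUP BY c.chat_identifier
ORDER BY MAX(m.date) DESC
"""


def list_conversations(db_path: Path) -> list[tuple[str, int]]:
    """Return (chat_identifier, message_count) pairs, most recent first."""
    # mode=ro asks SQLite to open the database read-only.
    uri = f"{db_path.resolve().as_uri()}?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        return conn.execute(LIST_CHATS_SQL).fetchall()
    finally:
        conn.close()
```

On recent macOS versions, the terminal or interpreter running this typically needs Full Disk Access before it can read anything under ~/Library/Messages.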
Thread Extraction by Phone Number
Once the database was accessible, the next step was extracting full conversation threads by phone number. The query needed to:
- Find the chat_identifier matching the target phone number
- Retrieve all messages from that conversation, sorted by timestamp
- Join with handle table to get sender/recipient info
- Filter to recent activity (last 7 days, configurable)
Example for phone number +16194164690:
sqlite3 ~/Library/Messages/chat.db "SELECT m.text, m.date, h.id FROM message m JOIN chat_message_join cmj ON m.ROWID = cmj.message_id JOIN chat c ON cmj.chat_id = c.ROWID JOIN handle h ON m.handle_id = h.ROWID WHERE c.chat_identifier LIKE '%6194164690%' ORDER BY m.date DESC LIMIT 100;"
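The m.date field counts from the Apple Cocoa epoch (2001-01-01 00:00:00 UTC) rather than the Unix epoch. Older macOS releases store seconds, while macOS 10.13 (High Sierra) and later store nanoseconds. Either way, the value has to be scaled to seconds and offset by 978307200 to produce a Unix timestamp or ISO 8601 string for human-readable output.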
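The timestamp conversion and the date filter from the requirements list can be sketched together. The helper names (`cocoa_to_datetime`, `datetime_to_cocoa_ns`, `extract_thread`) are hypothetical, and the sketch makes three assumptions: the database stores nanoseconds, matching on the last ten digits is enough for North American numbers, and a LEFT JOIN on handle is preferable so that messages without a matching handle row are not dropped.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

COCOA_EPOCH_OFFSET = 978307200  # seconds from 1970-01-01 to 2001-01-01 (UTC)


def cocoa_to_datetime(raw: int) -> datetime:
    """Convert a message.date value to an aware UTC datetime."""
    # Heuristic (assumption): values above 1e11 are nanoseconds, not seconds.
    seconds = raw / 1_000_000_000 if raw > 10**11 else raw
    return datetime.fromtimestamp(seconds + COCOA_EPOCH_OFFSET, tz=timezone.utc)


def datetime_to_cocoa_ns(moment: datetime) -> int:
    """Convert an aware datetime to Cocoa nanoseconds (whole-second precision)."""
    return int(moment.timestamp() - COCOA_EPOCH_OFFSET) * 1_000_000_000


THREAD_SQL = """
SELECT m.text, m.date, m.is_from_me, h.id
FROM message m
JOIN chat_message_join cmj ON m.ROWID = cmj.message_id
JOIN chat c ON cmj.chat_id = c.ROWID
LEFT JOIN handle h ON m.handle_id = h.ROWID
WHERE c.chat_identifier LIKE ? AND m.date >= ?
ORDER BY m.date ASC
"""


def extract_thread(conn: sqlite3.Connection, phone: str, days: int = 7, now=None) -> list[dict]:
    """Return the messages exchanged with `phone` during the last `days` days."""
    now = now or datetime.now(timezone.utc)
    # Assumption: the last ten digits identify the conversation.
    digits = "".join(ch for ch in phone if ch.isdigit())[-10:]
    cutoff = datetime_to_cocoa_ns(now - timedelta(days=days))
    rows = conn.execute(THREAD_SQL, (f"%{digits}%", cutoff)).fetchall()
    return [
        {"text": text, "sent_at": cocoa_to_datetime(date), "from_me": bool(from_me), "handle": handle}
        for text, date, from_me, handle in rows
    ]
```

Passing the pattern and cutoff as bound parameters instead of interpolating them into the SQL string keeps the phone number argument from altering the query.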
Python Implementation: samsung_sms_sync.py
The main processing logic was implemented in /Users/cb/Documents/repos/tools/samsung_sms_sync.py. Key components:
- SMS Reader Module — Opens local SMS export or Mac Messages chat.db and extracts message objects
- Thread Extractor — Filters messages by phone number and date range
- Digest Generator — Summarizes conversation content with key themes and action items
- Email Dispatcher — Formats digest as HTML email and sends via boto3 SES client
The script is designed to be invoked from the command line with a phone number argument:
python3 /Users/cb/Documents/repos/tools/samsung_sms_sync.py --phone "+16194164690" --days 7
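A command-line interface matching that invocation could be declared with argparse. The flag names come from the command above; the default of 7 days follows the "last 7 days, configurable" requirement, and the positive-value check is an assumption added for illustration.

```python
import argparse


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the --phone and --days flags used in the invocation above."""
    parser = argparse.ArgumentParser(
        description="Generate an SMS digest for one phone number."
    )
    parser.add_argument("--phone", required=True,
                        help="Phone number of the thread, e.g. +16194164690")
    parser.add_argument("--days", type=int, default=7,
                        help="Look-back window in days (default: 7)")
    args = parser.parse_args(argv)
    # Assumption: a zero or negative window is a usage error.
    if args.days <= 0:
        parser.error("--days must be a positive integer")
    return args
```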
Daemon Setup with LaunchAgent
To automate digest generation on a schedule, a macOS LaunchAgent was created at:
/Users/cb/Library/LaunchAgents/com.cb.samsung-sms-sync.plist
This plist file configures:
- Label: com.cb.samsung-sms-sync, the unique identifier for launchd
- ProgramArguments: Python script path and arguments
- StartInterval: Frequency (e.g., 3600 for hourly runs)
- StandardOutPath / StandardErrorPath: Log file locations for debugging
- EnvironmentVariables: AWS credentials for SES access (supplied through the process environment rather than hardcoded in the script)
To load the daemon:
launchctl load ~/Library/LaunchAgents/com.cb.samsung-sms-sync.plist
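The plist keys listed above can be generated with Python's plistlib instead of being written by hand. This is a sketch under stated assumptions: the interpreter path /usr/bin/python3 and the /tmp log locations are placeholders, not values from the actual plist, and EnvironmentVariables is left out so that no credentials appear in the example.

```python
import plistlib


def build_launch_agent(script_path: str, phone: str, days: int = 7,
                       interval_seconds: int = 3600, log_dir: str = "/tmp") -> dict:
    """Build the LaunchAgent definition as a plain dictionary."""
    label = "com.cb.samsung-sms-sync"
    return {
        "Label": label,
        # Assumption: the system python3 runs the script.
        "ProgramArguments": ["/usr/bin/python3", script_path,
                             "--phone", phone, "--days", str(days)],
        "StartInterval": interval_seconds,  # 3600 seconds = hourly
        # Assumption: logs are written under log_dir.
        "StandardOutPath": f"{log_dir}/{label}.out.log",
        "StandardErrorPath": f"{log_dir}/{label}.err.log",
    }


def render_plist(agent: dict) -> bytes:
    """Serialize the definition to the XML plist format that launchd reads."""
    return plistlib.dumps(agent)
```

Writing the result of `render_plist` to ~/Library/LaunchAgents/com.cb.samsung-sms-sync.plist produces a file that can then be loaded with the launchctl command above.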
Email Delivery via AWS SES
Rather than reinventing email, the digest uses the existing SES infrastructure already configured in the application. The digest generator constructs an HTML email with:
- Summary section highlighting key themes (money, equipment issues, appointments)
- Full conversation thread with timestamps
- Call-to-action items for follow-up
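A rough sketch of how such an HTML body might be assembled follows. The post names the themes but not how they are detected, so the keyword lists, the function names, and the message dictionary shape (`text`, `sent_at`, `sender`) are all assumptions. Action-item extraction is omitted.

```python
from html import escape

# Hypothetical keyword lists, one per theme named above.
THEME_KEYWORDS = {
    "money": ("pay", "invoice", "deposit", "$"),
    "equipment issues": ("broken", "repair", "leak", "engine"),
    "appointments": ("appointment", "schedule", "tomorrow", "meeting"),
}


def detect_themes(messages: list[dict]) -> list[str]:
    """Return the themes whose keywords occur in any message text."""
    texts = [(m["text"] or "").lower() for m in messages]
    return [
        theme
        for theme, keywords in THEME_KEYWORDS.items()
        if any(keyword in text for text in texts for keyword in keywords)
    ]


def build_digest_html(phone: str, messages: list[dict]) -> str:
    """Render a summary section followed by the full thread as an HTML table."""
    themes = detect_themes(messages)
    theme_items = "".join(f"<li>{escape(t)}</li>" for t in themes) or "<li>none detected</li>"
    # escape() prevents message text from being interpreted as markup.
    rows = "".join(
        f"<tr><td>{escape(m['sent_at'])}</td>"
        f"<td>{escape(m['sender'])}</td>"
        f"<td>{escape(m['text'] or '')}</td></tr>"
        for m in messages
    )
    return (
        f"<h1>SMS Digest: {escape(phone)}</h1>"
        f"<h2>Key themes</h2><ul>{theme_items}</ul>"
        f"<h2>Conversation</h2><table>{rows}</table>"
    )
```

Simple substring matching will produce false positives, so a real digest would likely need something more careful than this keyword scan.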
SES send is handled via boto3:
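import boto3

# phone_number and html_body are produced earlier in the script.
ses = boto3.client('ses', region_name='us-west-2')
response = ses.send_email(
    Source='noreply@queenofsandiego.com',
    Destination={'ToAddresses': ['c.b.ladd@gmail.com']},
    Message={
        # The literal "+" assumes phone_number is stored without its leading "+".
        'Subject': {'Data': f'SMS Digest: +{phone_number}', 'Charset': 'UTF-8'},
        'Body': {'Html': {'Data': html_body, 'Charset': 'UTF-8'}},
    },
)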