Project Overview
The platform ingested multi-network signals for 18-20M U.S. businesses and resolved approximately 20% of them to verified social mappings during the initial period, with coverage growing roughly 20% month-over-month. Data pipelines normalized, deduplicated, and scored entities on geography, recency, and interaction intensity. This enabled intent-rich audience building (e.g., pizzerias in downtown San Diego that engaged within the last 90 days) and validated a 5-10 mile radius as the cost/performance sweet spot for local SMB campaigns.
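As a rough illustration of the audience-building step, the sketch below filters businesses by category, distance from a center point, and recency of engagement. It is written in Python for brevity and is not the original Ruby pipeline code. The `Business` fields, the `build_audience` helper, the sample records, and the approximate downtown San Diego coordinates are all assumptions made for this example. A production system would presumably use a geospatial index rather than the linear scan shown here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius
DOWNTOWN_SAN_DIEGO = (32.7157, -117.1611)  # approximate, for illustration only


@dataclass
class Business:
    """Hypothetical record shape; the real schema is not described in the source."""
    name: str
    category: str
    lat: float
    lon: float
    last_engaged: datetime  # most recent observed interaction


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points, in miles."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))


def build_audience(businesses, category, center, radius_miles, max_age_days, now):
    """Keep businesses in the category, inside the radius, engaged recently enough."""
    cutoff = now - timedelta(days=max_age_days)
    return [
        b for b in businesses
        if b.category == category
        and b.last_engaged >= cutoff
        and haversine_miles(center[0], center[1], b.lat, b.lon) <= radius_miles
    ]


now = datetime(2024, 6, 1)  # fixed so the example is deterministic
businesses = [
    Business("Pizzeria A (hypothetical)", "pizzeria", 32.7200, -117.1600, now - timedelta(days=10)),
    Business("Pizzeria B (hypothetical)", "pizzeria", 32.7180, -117.1650, now - timedelta(days=200)),  # stale
    Business("Pizzeria C (hypothetical)", "pizzeria", 34.0522, -118.2437, now - timedelta(days=5)),  # too far
    Business("Cafe D (hypothetical)", "cafe", 32.7150, -117.1600, now - timedelta(days=1)),  # wrong category
]
audience = build_audience(businesses, "pizzeria", DOWNTOWN_SAN_DIEGO, 10.0, 90, now)
```

With this sample data, only the first pizzeria passes all three filters: category, the 10-mile radius, and the 90-day engagement window.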
Key Challenges
- Entity resolution across noisy SMB records and duplicate/incorrect profiles
- Cross-network identity stitching with strict ToS/privacy constraints
- Freshness and recency weighting to avoid stale audience drift
- Rate-limit, anti-spam, and reliability constraints across platforms
- Provenance tracking and compliant data activation
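To make the entity-resolution and provenance challenges concrete, here is a minimal Python sketch that groups noisy records under a normalized key and keeps the list of sources for each merged entity. The record fields (`name`, `zip`, `source`), the suffix list, and the exact-key matching rule are assumptions for illustration. Real SMB resolution would likely also need fuzzy matching on addresses, phone numbers, and similar signals.

```python
import re
from collections import defaultdict

# Assumed list of legal suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"inc", "llc", "co", "corp", "ltd"}


def normalize_name(name):
    """Lowercase, drop apostrophes and punctuation, and remove legal suffixes."""
    lowered = re.sub(r"['\u2019]", "", name.lower())   # "joe's" -> "joes"
    tokens = re.sub(r"[^a-z0-9 ]", " ", lowered).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def dedupe(records):
    """Merge records sharing (normalized name, ZIP); keep sources as provenance."""
    groups = defaultdict(list)
    for rec in records:
        groups[(normalize_name(rec["name"]), rec["zip"])].append(rec)
    return [
        {"name": name, "zip": zip_code, "sources": sorted({r["source"] for r in recs})}
        for (name, zip_code), recs in groups.items()
    ]


records = [  # hypothetical sample records
    {"name": "Joe's Pizza, Inc.", "zip": "92101", "source": "facebook"},
    {"name": "JOES PIZZA", "zip": "92101", "source": "instagram"},
    {"name": "Joe's Pizza", "zip": "10001", "source": "twitter"},
]
merged = dedupe(records)
```

In this sample, the two San Diego records collapse into one entity that carries both sources, while the record with a different ZIP stays separate.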
Technologies & Solutions
- Ruby ETL pipelines and schedulers
- PostgreSQL for entity and interaction graphs
- Redis for queues/caching
- Semantic/keyword extraction for category and intent tagging
- Geospatial indexing and radius queries
- Event scoring and cohort builders
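One way event scoring and cohort building could work is to weight each interaction by type and decay it with age, then keep the entities whose total score clears a threshold. The Python sketch below illustrates that idea only. The interaction weights, the 30-day half-life, and the function names are assumptions, not values from the project.

```python
from datetime import datetime

# Assumed weights and half-life, for illustration only.
INTERACTION_WEIGHTS = {"like": 1.0, "comment": 2.0, "share": 3.0}
HALF_LIFE_DAYS = 30.0


def event_score(kind, occurred_at, now):
    """Weight an interaction by type, halving its value every HALF_LIFE_DAYS."""
    age_days = (now - occurred_at).total_seconds() / 86400
    decay = 0.5 ** (age_days / HALF_LIFE_DAYS)
    return INTERACTION_WEIGHTS.get(kind, 0.0) * decay


def entity_score(events, now):
    """Sum decayed scores over an entity's (kind, timestamp) events."""
    return sum(event_score(kind, ts, now) for kind, ts in events)


def build_cohort(events_by_entity, min_score, now):
    """Return entity ids whose score meets the threshold, in sorted order."""
    return sorted(
        entity_id
        for entity_id, events in events_by_entity.items()
        if entity_score(events, now) >= min_score
    )


now = datetime(2020, 1, 31)
events_by_entity = {  # hypothetical entities and events
    "biz-1": [("share", datetime(2020, 1, 1)), ("like", now)],  # 3 * 0.5 + 1 = 2.5
    "biz-2": [("like", datetime(2019, 12, 2))],                 # 1 * 0.25 = 0.25
}
cohort = build_cohort(events_by_entity, min_score=1.0, now=now)
```

A half-life formulation keeps stale interactions from dominating the score, which is one way to address the audience-drift concern. A hard cutoff window would be a simpler alternative.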
Key Metrics
- 18-20M U.S. businesses indexed
- ~20% of businesses with verified social mappings in the early years
- ~20% monthly growth in coverage during ramp
- 4 social networks ingested (Facebook, Instagram, Twitter/X, Google+)
- 5-10 mile geo radius validated for best cost per result
Results & Impact
Delivered action-verified local audiences that power higher-intent campaigns, and validated the 5-10 mile radius as the geographic sweet spot for SMB conversion.