Building a Transaction Enrichment API on Cloudflare Workers

How I built Triqai's transaction enrichment API on Cloudflare Workers, using Durable Objects for atomic billing, a distributed singleflight pattern, and multi-tier edge caching to achieve 96% latency reduction.

When I started building Triqai, a transaction enrichment API that takes raw bank transaction strings like `PP*DOORDASH*Convivo #1422 Santa Barbara, CA` and returns structured merchant data, I had a core constraint: every API call costs us money in upstream lookups, and our customers pay per request. The system needed to be fast, globally distributed, and extremely precise about billing. Cloudflare Workers turned out to be the right fit, but the architecture that emerged involved patterns I hadn't seen documented elsewhere.

## Why Cloudflare Workers Over Traditional Node.js

The obvious answer is edge latency (Workers run in 300+ locations), but that wasn't actually the primary driver. The real reason was the programming model.

Triqai's enrichment pipeline is fundamentally I/O-bound: we receive a transaction string, normalize it, check caches, hit upstream enrichment sources, score confidence, and return structured data. There's almost no CPU-heavy work. A Node.js server would spend most of its time waiting on network calls, and I'd need to manage container scaling, health checks, and regional deployments myself. Workers gave me automatic global distribution with zero deployment configuration. More importantly, the Durable Objects primitive solved a billing problem that would have required Redis plus distributed locks in a traditional setup.

## Durable Objects as Single-Writer Billing Gates

Here's the billing problem: a customer sends 50 concurrent requests. Each request needs to check "does this customer have enough credits?" before doing expensive upstream work. In a traditional architecture, you'd hit a race condition: 50 requests all read "balance: 100 credits," all proceed, and you've done 50 units of work but can only charge for what the customer actually had left. The standard solution is distributed locking with Redis, but that introduces a network hop to a centralized store, adds a failure mode, and requires careful lock timeout management.
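The race is easy to reproduce in miniature. The sketch below (illustrative only, not the production code) runs 50 concurrent read-then-write credit checks against a balance of 100: every request reads the balance before any of them writes, so the naive version over-allows all 50. Forcing the same checks through a serial queue, which is what a single-writer actor does by design, allows exactly the 10 that the balance covers.

```typescript
let credits = 100;

async function naiveCheckAndDeduct(cost: number): Promise<boolean> {
  const balance = credits;  // read
  await Promise.resolve();  // yield, standing in for async storage I/O
  if (balance < cost) return false;
  credits = balance - cost; // write based on a possibly stale read
  return true;
}

// Serialize calls by chaining them on a promise, so only one
// read-modify-write runs at a time, like a single-writer actor.
let queue: Promise<unknown> = Promise.resolve();
function serializedCheckAndDeduct(cost: number): Promise<boolean> {
  const next = queue.then(() => naiveCheckAndDeduct(cost));
  queue = next.catch(() => false);
  return next;
}

export async function demo(): Promise<{ naiveAllowed: number; serializedAllowed: number }> {
  credits = 100;
  const naiveResults = await Promise.all(
    Array.from({ length: 50 }, () => naiveCheckAndDeduct(10)),
  );
  const naiveAllowed = naiveResults.filter(Boolean).length; // 50: every request saw 100

  credits = 100;
  const serialResults = await Promise.all(
    Array.from({ length: 50 }, () => serializedCheckAndDeduct(10)),
  );
  const serializedAllowed = serialResults.filter(Boolean).length; // exactly 10

  return { naiveAllowed, serializedAllowed };
}
```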
Durable Objects gave me something better: a single-writer actor per customer. Each customer's billing state lives in exactly one Durable Object instance, and Cloudflare guarantees that all requests to that object are serialized. No locks needed; it's single-threaded by design.

```typescript
import { DurableObject } from "cloudflare:workers";

interface BillingResult {
  allowed: boolean;
  remaining: number;
  requestId?: string;
}

export class BillingGate extends DurableObject {
  private credits: number | null = null;

  async checkAndDeduct(requestId: string, cost: number): Promise<BillingResult> {
    if (this.credits === null) {
      // Lazy-load the persisted balance on first use.
      this.credits = (await this.ctx.storage.get<number>("credits")) ?? 0;
    }
    if (this.credits < cost) {
      return { allowed: false, remaining: this.credits };
    }
    this.credits -= cost;
    await this.ctx.storage.put("credits", this.credits);
    return { allowed: true, remaining: this.credits, requestId };
  }
}
```

Because the Durable Object processes requests sequentially, this `checkAndDeduct` call is atomic without any locking mechanism. The 50 concurrent requests queue up and execute one at a time. When the balance hits zero, subsequent requests get rejected immediately.

The tradeoff is that the billing check becomes a serialization point. I optimized for this by keeping the Durable Object logic minimal (just the credit check and deduction) and doing all enrichment work outside the object.

## Distributed Singleflight: Preventing Duplicate Work

This is a pattern I borrowed from Go's `singleflight` package, adapted for a distributed edge environment. The problem: when three requests arrive simultaneously for the same transaction string, I don't want to hit upstream enrichment APIs three times. I want to do the work once and share the result.

On a single server, this is straightforward: you use an in-memory map of in-flight promises. But Workers are distributed across hundreds of locations, and there's no shared memory between isolates.
My approach uses a two-layer singleflight.

**Layer 1: Isolate-level deduplication.** Within a single Worker isolate, I maintain a `Map<string, Promise<EnrichmentResult>>` that collapses concurrent requests for the same normalized transaction key. This catches the common case where a client batch-sends requests that arrive at the same edge location.

```typescript
const inflight = new Map<string, Promise<EnrichmentResult>>();

async function enrichWithSingleflight(key: string): Promise<EnrichmentResult> {
  const existing = inflight.get(key);
  if (existing) return existing;
  // Remove the entry once settled, so a failed attempt can be retried.
  const promise = doEnrichment(key).finally(() => inflight.delete(key));
  inflight.set(key, promise);
  return promise;
}
```

**Layer 2: Cache-based deduplication across locations.** Before starting upstream work, I write a short-lived (5-second TTL) "pending" marker to KV. Other isolates that see this marker wait and poll briefly before starting their own upstream call. This isn't perfect (KV is eventually consistent), but it catches enough duplicates to meaningfully reduce upstream costs. The combination of both layers eliminated roughly 30% of redundant upstream calls in production.

## Multi-Tier Edge Caching Strategy

Caching is where the 96% latency reduction comes from. The enrichment result for a given transaction string is essentially immutable: the merchant behind `PP*DOORDASH*Convivo #1422 Santa Barbara, CA` doesn't change. So I cache aggressively across three tiers.

**Tier 1: In-memory isolate cache.** A simple LRU map within the Worker isolate. Zero-latency reads, but ephemeral, since it's gone when the isolate is recycled. Sized at 5,000 entries with a one-hour TTL. This catches repeated lookups for popular merchants within a single edge location.

**Tier 2: Cloudflare KV.** A global, eventually consistent key-value store. Reads are fast from the edge (~10ms); writes propagate globally within 60 seconds. I store enrichment results here with a 30-day TTL.
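A read-through across the first two tiers might look like the sketch below. The `KVLike` interface is a minimal stand-in for Cloudflare's `KVNamespace` (`get`/`put` with `expirationTtl`); the sizes and TTLs match the numbers above, but the function and parameter names are illustrative.

```typescript
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
}

const MEMORY_MAX = 5_000;
// Map iterates in insertion order, so it doubles as a crude LRU.
const memory = new Map<string, string>();

function remember(key: string, value: string): void {
  if (memory.size >= MEMORY_MAX) {
    memory.delete(memory.keys().next().value!); // evict the oldest entry
  }
  memory.set(key, value);
}

export async function readThrough(
  kv: KVLike,
  key: string,
  loadUpstream: () => Promise<string>,
): Promise<string> {
  // Tier 1: in-isolate memory, zero-latency.
  const hot = memory.get(key);
  if (hot !== undefined) {
    memory.delete(key);
    memory.set(key, hot); // refresh LRU position
    return hot;
  }
  // Tier 2: KV, ~10ms from the edge.
  const cached = await kv.get(key);
  if (cached !== null) {
    remember(key, cached);
    return cached;
  }
  // Full miss: do the upstream work, then populate both tiers.
  const fresh = await loadUpstream();
  await kv.put(key, fresh, { expirationTtl: 60 * 60 * 24 * 30 }); // 30-day TTL
  remember(key, fresh);
  return fresh;
}
```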
KV is the primary cache layer and handles ~80% of all requests without any upstream work.

**Tier 3: Durable Object state.** For billing-critical data and customer-specific enrichment overrides, I use the Durable Object's transactional storage. This is strongly consistent but has higher latency, so I only use it for data where eventual consistency would cause billing errors.

The cache key is a normalized hash of the transaction string after stripping dates, amounts, and common noise patterns. Normalization is critical because raw transaction strings vary wildly between banks, even for the same merchant.

```typescript
function normalizeTxn(raw: string): string {
  return raw
    .toUpperCase()
    .replace(/\d{2}[-/]\d{2}([-/]\d{2,4})?/g, "") // strip dates
    .replace(/\b\d+[.,]\d{2}\b/g, "")             // strip amounts
    .replace(/\b(POS|NR|REF|IBAN)\b/g, "")        // strip noise tokens
    .replace(/\s+/g, " ")
    .trim();
}
```

## Confidence Scoring Architecture

Not all enrichment results are equally reliable. A transaction string like `ALBERT HEIJN 1 AMSTERDAM` is an easy, high-confidence match. But `AH TO GO SCHIPHOL 42` requires fuzzy matching, and the result might be wrong.

I built a confidence scoring system that returns a 0-100 score with every enrichment result, along with confidence reason tags that explain why the score is what it is. The score is determined by cross-referencing multiple sources: databases, live web data, and various other signals.

For example, if the transaction contains "Albert Heijn Amsterdam," there are 50+ Albert Heijn locations in Amsterdam alone. Without more specific context like a street address or store number, the system returns a low confidence score because there isn't enough information to pinpoint the exact location. The reason tags would include something like `multiple_location_matches` so the consumer understands why.
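The shape of reason-tagged scoring can be sketched as below. The weights, thresholds, signal names, and the `independent_sources_agree` tag are invented for the example; only `multiple_location_matches` comes from the text above, and the real model is more involved.

```typescript
interface Signal {
  source: string;  // e.g. "merchant_database", "web_data", "business_listing"
  agrees: boolean; // does this source confirm the candidate merchant?
}

interface Confidence {
  score: number;   // 0-100
  reasons: string[];
}

export function scoreConfidence(signals: Signal[], locationMatches: number): Confidence {
  const reasons: string[] = [];
  const agreeing = signals.filter((s) => s.agrees).length;

  // Agreement between independent sources is weighted heavily:
  // each agreeing source adds a large chunk, capped at 100.
  let score = Math.min(100, agreeing * 30);
  if (agreeing >= 3) reasons.push("independent_sources_agree");

  // Ambiguity caps the score regardless of agreement: "Albert Heijn
  // Amsterdam" with 50+ candidate stores cannot be a confident match.
  if (locationMatches > 1) {
    score = Math.min(score, 40);
    reasons.push("multiple_location_matches");
  }

  return { score, reasons };
}
```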
When the system can cross-reference the merchant against web data, a verified business listing, and geographic signals that all agree, the confidence jumps to 90+. The key insight is that different source types have different reliability profiles, and the system weighs agreement between independent sources heavily.

## Cold vs. Warm Request Paths

The performance difference between the cache-hit and cache-miss paths is dramatic:

**Warm path (cache hit):** ~8ms total. Normalize the string, check the in-memory cache, return. If the in-memory cache misses but KV hits, add ~12ms.

**Cold path (cache miss):** ~180ms total. Normalize, miss all caches, billing gate check via Durable Object (~15ms), upstream enrichment calls (~120ms, parallelized), confidence scoring (~5ms), write to KV and respond.

The 96% latency reduction refers to the warm path versus a naive implementation that hits upstream on every request. In practice, after the system has been running for a few weeks, the KV cache hit rate stabilizes around 85%, meaning the vast majority of requests resolve in under 20ms.

## What I'd Do Differently

Durable Objects' cold start latency (~30ms) was a surprise. For the billing gate, which is on the critical path, this matters. I mitigated it by keeping billing objects warm through periodic pings, but it's an operational annoyance. Next time I'd design around this from the start.

The Cloudflare Workers platform has genuine constraints: no native TCP connections, a 128MB memory limit, CPU time limits. For the most part these haven't been practical blockers for an I/O-bound API like Triqai, but there are small tasks that simply can't run on Workers. I ended up running external Node.js servers for specific jobs like fetching branding assets (logos, brand colors) from merchant websites, which require capabilities that Workers don't support. Accepting that hybrid architecture early would have saved some refactoring time.
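The keep-warm mitigation can be sketched roughly as below. In production this would run from a Worker `scheduled()` (cron) handler; here the namespace interface is a minimal stand-in for Cloudflare's `DurableObjectNamespace`, and the `/ping` endpoint, `keepWarm` name, and customer-listing step are all assumptions for the example.

```typescript
interface StubLike {
  fetch(url: string): Promise<Response>;
}

interface NamespaceLike {
  idFromName(name: string): string;
  get(id: string): StubLike;
}

// Ping each active customer's billing object so its ~30ms cold start
// is paid off the request path rather than on a real billing check.
export async function keepWarm(ns: NamespaceLike, activeCustomers: string[]): Promise<number> {
  let pinged = 0;
  for (const customerId of activeCustomers) {
    // Same name -> same object, so the ping lands on the exact instance
    // that real requests for this customer will hit.
    const stub = ns.get(ns.idFromName(customerId));
    await stub.fetch("https://billing/ping"); // lightweight no-op endpoint
    pinged++;
  }
  return pinged;
}
```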
The combination of Workers for compute, KV for caching, and Durable Objects for coordination gave me a globally distributed system with strong billing guarantees, and I didn't have to think about servers once for the core API.