System Overview
AI SpendOps is an AI API proxy that sits between customers' applications and upstream AI providers. It authenticates requests, routes them transparently, extracts usage telemetry, enriches events with pricing data, and writes everything to an analytics store for cost attribution and spend management.
Components
| Component | Runtime | Purpose |
|---|---|---|
| Proxy Worker (apps/proxy-worker) | Cloudflare Worker | Authenticates API keys (HMAC-SHA-256 against KV), validates dimensions, routes requests to upstream providers, streams responses byte-for-byte, extracts usage from final chunks, and emits events to queues. |
| Usage Consumer (apps/usage-consumer) | Cloudflare Worker (Queue consumer) | Consumes usage events from the queue, enriches them with pricing data (rates from KV), calculates costs, and writes finalised records to ClickHouse. Flags unknown models. |
| Denial Consumer (apps/denial-consumer) | Cloudflare Worker (Queue consumer) | Consumes denial events (rejected/blocked requests) and writes them to ClickHouse for audit and analytics. |
| Cloudflare Queues | Cloudflare | usage-events-{env} and denial-events-{env} — at-least-once delivery between proxy and consumers. |
| Cloudflare KV | Cloudflare | Stores API key policies (HMAC hashes, rate limits, allowed providers, dimensions) and pricing data (base rates, aliases, tenant overrides). |
| ClickHouse | ClickHouse Cloud | Analytics database for usage events, denial events, and aggregated views. Single database shared across environments, differentiated by an env field. |
| Azure SQL | Azure | Relational store for tenants, API keys, pricing models, budgets, invoices, and all management-plane state. |
| .NET Management API | Azure App Service | Control plane REST API. Manages tenants, API keys, pricing overrides, budgets, billing. Serves the portal. |
| CloudflarePricingSync | Azure Function | Reads pricing data from SQL, computes a single blob with hashes, and writes it to Cloudflare KV. Runs on a timer and on-demand. |
| Portal | Next.js (Azure Static Web Apps) | Admin UI for tenant management, API key provisioning, usage dashboards, budget configuration, and billing. |
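The Usage Consumer's enrichment step (resolve the model, pick a rate, compute a cost, flag unknown models) can be sketched as a pure function. This is a minimal illustration, not the actual implementation: the `UsageEvent`, `Rate`, and `PricingBlob` shapes, the per-million-token unit, and the precedence of tenant overrides over base rates are all assumptions made for the example. Provider-specific fields such as cache tokens are left out.

```typescript
// Hypothetical shapes: the real event and pricing schemas are not shown in this document.
interface UsageEvent {
  eventId: string;
  tenantId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

interface Rate {
  inputPerMillion: number; // assumed unit: USD per 1M input tokens
  outputPerMillion: number; // assumed unit: USD per 1M output tokens
}

interface PricingBlob {
  baseRates: Record<string, Rate>;
  aliases: Record<string, string>; // alias -> canonical model name
  tenantOverrides: Record<string, Record<string, Rate>>; // tenantId -> model -> rate
}

type Enriched =
  | { kind: "priced"; event: UsageEvent; model: string; costUsd: number }
  | { kind: "unknown_model"; event: UsageEvent };

function enrich(event: UsageEvent, pricing: PricingBlob): Enriched {
  // Resolve an alias to its canonical model name, if one is defined.
  const model = pricing.aliases[event.model] ?? event.model;

  // Assumed precedence: a tenant override wins over the base rate.
  const rate =
    pricing.tenantOverrides[event.tenantId]?.[model] ?? pricing.baseRates[model];

  // No rate found: report the model as unknown instead of guessing a price.
  if (rate === undefined) {
    return { kind: "unknown_model", event };
  }

  const costUsd =
    (event.inputTokens * rate.inputPerMillion +
      event.outputTokens * rate.outputPerMillion) /
    1_000_000;

  return { kind: "priced", event, model, costUsd };
}
```

Returning a tagged union rather than throwing keeps the unknown-model case explicit, so a caller can still persist the raw event and flag the model separately.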
Environments
| Component | Dev | Prod |
|---|---|---|
| Proxy Worker | aispendops-proxy-dev | aispendops-proxy-prod |
| Usage Consumer | aispendops-usage-consumer-dev | aispendops-usage-consumer-prod |
| Denial Consumer | aispendops-denial-consumer-dev | aispendops-denial-consumer-prod |
| KV Namespaces | Separate (dev) | Separate (prod) |
| Queues | usage-events-dev, denial-events-dev | usage-events-prod, denial-events-prod |
| ClickHouse | Shared DB, env = 'dev' | Shared DB, env = 'prod' |
| Azure SQL | Shared DB | Shared DB |
Dev and Prod use separate Cloudflare resources (workers, KV namespaces, queues) but share the ClickHouse and SQL databases, using environment fields to partition data.
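The naming scheme in the table above can be expressed as a small helper. The resource names come straight from the table; the `resourceNames` and `tagRow` functions themselves are hypothetical and only illustrate the split between per-environment Cloudflare resources and shared stores partitioned by an env field.

```typescript
type Env = "dev" | "prod";

// Per-environment Cloudflare resources: the environment is part of the name.
function resourceNames(env: Env) {
  return {
    proxyWorker: `aispendops-proxy-${env}`,
    usageConsumer: `aispendops-usage-consumer-${env}`,
    denialConsumer: `aispendops-denial-consumer-${env}`,
    usageQueue: `usage-events-${env}`,
    denialQueue: `denial-events-${env}`,
  };
}

// Shared stores: the environment travels on every row instead.
function tagRow<T extends object>(row: T, env: Env): T & { env: Env } {
  return { ...row, env };
}
```

Because the shared databases rely on the env field alone, any query against them would need to filter on it (for example `env = 'prod'`) to avoid mixing Dev and Prod data.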
Architectural Principles
- Accounting correctness: Every token, every cost, every denial is recorded. Usage extraction happens in `waitUntil` so it never blocks the response but is guaranteed to execute. Pricing enrichment uses deterministic formulas with explicit handling for provider-specific quirks (Anthropic cache tokens, OpenRouter `provider_cost`).
- Stateless hot path: The proxy worker holds no state between requests. All config (key policies, pricing) is read from KV at request time. This makes horizontal scaling trivial and eliminates cache-invalidation bugs.
- At-least-once delivery: Events flow through Cloudflare Queues with at-least-once semantics. The usage consumer is idempotent (ClickHouse deduplicates on `event_id`). This ensures no usage data is lost even if a consumer crashes mid-batch.
- Explicit auditability: Denial events are recorded with full context (reason, HTTP status, tenant, key, provider, model, dimensions). Missing pricing models are flagged in KV. Budget alerts are logged. Every decision the system makes is traceable.
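To see why at-least-once delivery is safe when writes are keyed on `event_id`, consider the sketch below. It is an in-memory stand-in, not the real ClickHouse integration: the `UsageRecord` shape and the `DedupingStore` class are hypothetical, and how deduplication is actually configured in ClickHouse is not described in this document.

```typescript
// Hypothetical record shape, used only for this illustration.
interface UsageRecord {
  eventId: string;
  costUsd: number;
}

// In-memory stand-in for a table that keeps at most one row per event_id.
class DedupingStore {
  private rows: Record<string, UsageRecord> = {};

  // Writing the same eventId twice overwrites the row rather than adding one,
  // so a redelivered batch leaves the stored data unchanged.
  insertBatch(batch: UsageRecord[]): void {
    for (const record of batch) {
      this.rows[record.eventId] = record;
    }
  }

  count(): number {
    return Object.keys(this.rows).length;
  }

  totalCostUsd(): number {
    let total = 0;
    for (const id of Object.keys(this.rows)) {
      total += this.rows[id].costUsd;
    }
    return total;
  }
}
```

If a consumer crashes after writing but before acknowledging a batch, the queue redelivers it; calling `insertBatch` again with the same records leaves both `count()` and `totalCostUsd()` unchanged, so usage is neither lost nor double-counted.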