System Overview

AI SpendOps is an AI API proxy that sits between customers' applications and upstream AI providers. It authenticates requests, routes them transparently, extracts usage telemetry, enriches events with pricing data, and writes everything to an analytics store for cost attribution and spend management.

Components

| Component | Runtime | Purpose |
| --- | --- | --- |
| Proxy Worker (apps/proxy-worker) | Cloudflare Worker | Authenticates API keys (HMAC-SHA-256 against KV), validates dimensions, routes requests to upstream providers, streams responses byte-for-byte, extracts usage from final chunks, and emits events to queues. |
| Usage Consumer (apps/usage-consumer) | Cloudflare Worker (Queue consumer) | Consumes usage events from the queue, enriches them with pricing data (rates from KV), calculates costs, and writes finalised records to ClickHouse. Flags unknown models. |
| Denial Consumer (apps/denial-consumer) | Cloudflare Worker (Queue consumer) | Consumes denial events (rejected/blocked requests) and writes them to ClickHouse for audit and analytics. |
| Cloudflare Queues | Cloudflare | usage-events-{env} and denial-events-{env}: at-least-once delivery between proxy and consumers. |
| Cloudflare KV | Cloudflare | Stores API key policies (HMAC hashes, rate limits, allowed providers, dimensions) and pricing data (base rates, aliases, tenant overrides). |
| ClickHouse | ClickHouse Cloud | Analytics database for usage events, denial events, and aggregated views. Single database shared across environments, differentiated by an env field. |
| Azure SQL | Azure | Relational store for tenants, API keys, pricing models, budgets, invoices, and all management-plane state. |
| .NET Management API | Azure App Service | Control plane REST API. Manages tenants, API keys, pricing overrides, budgets, billing. Serves the portal. |
| CloudflarePricingSync | Azure Function | Reads pricing data from SQL, computes a single blob with hashes, and writes it to Cloudflare KV. Runs on a timer and on-demand. |
| Portal | Next.js (Azure Static Web Apps) | Admin UI for tenant management, API key provisioning, usage dashboards, budget configuration, and billing. |
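
The Usage Consumer's enrichment step (resolve a rate, calculate a cost, flag unknown models) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the type names, the field names, the per-million-token unit, and the lookup order (tenant override first, then base rate, after resolving aliases) are hypothetical, since the actual pricing blob and event schema are not described in this overview.

```typescript
// Hypothetical shapes: the real pricing blob and event schema are not shown here.
interface Rate {
  inputPerMillion: number;  // assumed unit: currency per 1M input tokens
  outputPerMillion: number; // assumed unit: currency per 1M output tokens
}

interface PricingBlob {
  baseRates: Record<string, Rate>;                       // canonical model -> rate
  aliases: Record<string, string>;                       // alias -> canonical model
  tenantOverrides: Record<string, Record<string, Rate>>; // tenant -> model -> rate
}

interface UsageEvent {
  eventId: string;
  tenantId: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

interface EnrichedEvent extends UsageEvent {
  cost: number | null;   // null when no rate could be resolved
  unknownModel: boolean; // true when the model has no pricing entry
}

// Resolve an alias to its canonical model, then prefer a tenant override
// over the base rate. This precedence is an assumption.
function resolveRate(pricing: PricingBlob, tenantId: string, model: string): Rate | undefined {
  const canonical = pricing.aliases[model] ?? model;
  return pricing.tenantOverrides[tenantId]?.[canonical] ?? pricing.baseRates[canonical];
}

// Deterministic cost formula: tokens multiplied by the per-million rate.
function enrich(pricing: PricingBlob, event: UsageEvent): EnrichedEvent {
  const rate = resolveRate(pricing, event.tenantId, event.model);
  if (!rate) {
    return { ...event, cost: null, unknownModel: true };
  }
  const cost =
    (event.inputTokens * rate.inputPerMillion +
      event.outputTokens * rate.outputPerMillion) / 1_000_000;
  return { ...event, cost, unknownModel: false };
}
```

Keeping the formula in a pure function like this makes the calculation easy to replay against historical events, which is one way the accounting could be audited. Provider-specific adjustments would need additional fields that this sketch omits.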

Environments

| Resource | Dev | Prod |
| --- | --- | --- |
| Proxy Worker | aispendops-proxy-dev | aispendops-proxy-prod |
| Usage Consumer | aispendops-usage-consumer-dev | aispendops-usage-consumer-prod |
| Denial Consumer | aispendops-denial-consumer-dev | aispendops-denial-consumer-prod |
| KV Namespaces | Separate (dev) | Separate (prod) |
| Queues | usage-events-dev, denial-events-dev | usage-events-prod, denial-events-prod |
| ClickHouse | Shared DB, env = 'dev' | Shared DB, env = 'prod' |
| Azure SQL | Shared DB | Shared DB |

Dev and Prod use separate Cloudflare resources (workers, KV namespaces, queues) but share the ClickHouse and SQL databases, using environment fields to partition data.
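
The naming pattern above is regular enough to derive from the environment name. The following is a minimal sketch of that idea; the function name, the interface, and the `clickhouseEnvFilter` field are hypothetical and only mirror the names listed in this section.

```typescript
// The two environments named in this section.
type Env = "dev" | "prod";

// Hypothetical container for per-environment resource names.
interface EnvResources {
  proxyWorker: string;
  usageConsumer: string;
  denialConsumer: string;
  usageQueue: string;
  denialQueue: string;
  clickhouseEnvFilter: string; // predicate for the shared ClickHouse database
}

// Derive resource names from the environment, following the naming listed above.
function resourcesFor(env: Env): EnvResources {
  return {
    proxyWorker: `aispendops-proxy-${env}`,
    usageConsumer: `aispendops-usage-consumer-${env}`,
    denialConsumer: `aispendops-denial-consumer-${env}`,
    usageQueue: `usage-events-${env}`,
    denialQueue: `denial-events-${env}`,
    // Env is a closed union type, so interpolating it cannot inject arbitrary SQL.
    clickhouseEnvFilter: `env = '${env}'`,
  };
}
```

Because ClickHouse is shared, any query against it would need the `env` predicate; forgetting it would mix dev and prod rows in the result.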

Architectural Principles

  1. Accounting correctness — Every token, every cost, every denial is recorded. Usage extraction happens in waitUntil so it never blocks the response but is guaranteed to execute. Pricing enrichment uses deterministic formulas with explicit handling for provider-specific quirks (Anthropic cache tokens, OpenRouter provider_cost).

  2. Stateless hot path — The proxy worker holds no state between requests. All config (key policies, pricing) is read from KV at request time. This makes horizontal scaling trivial and eliminates cache-invalidation bugs.

  3. At-least-once delivery — Events flow through Cloudflare Queues with at-least-once semantics. The usage consumer is idempotent (ClickHouse deduplicates on event_id). This ensures no usage data is lost even if a consumer crashes mid-batch.

  4. Explicit auditability — Denial events are recorded with full context (reason, HTTP status, tenant, key, provider, model, dimensions). Missing pricing models are flagged in KV. Budget alerts are logged. Every decision the system makes is traceable.
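
To see why at-least-once delivery is safe when writes are keyed on event_id, consider the sketch below. It is not the consumer's implementation: an in-memory Map stands in for the ClickHouse table, and the class and field names (other than event_id) are hypothetical. It shows only the property that matters, namely that a redelivered batch does not double-count cost.

```typescript
// Hypothetical record shape; only event_id is taken from the text above.
interface UsageRecord {
  event_id: string;
  tenant: string;
  cost: number;
}

// In-memory stand-in for a store that deduplicates on event_id.
class DedupStore {
  private rows = new Map<string, UsageRecord>();

  // Insert a batch and return how many rows were new.
  // Records whose event_id is already present are skipped.
  insertBatch(batch: UsageRecord[]): number {
    let inserted = 0;
    for (const record of batch) {
      if (!this.rows.has(record.event_id)) {
        this.rows.set(record.event_id, record);
        inserted++;
      }
    }
    return inserted;
  }

  // Number of distinct events stored.
  get size(): number {
    return this.rows.size;
  }

  // Sum of cost over all stored events.
  totalCost(): number {
    let total = 0;
    for (const record of this.rows.values()) {
      total += record.cost;
    }
    return total;
  }
}
```

If a consumer crashes after writing part of a batch, the queue redelivers the whole batch. With this kind of keyed write, the rows that were already stored are ignored and only the missing ones are added, so totals stay correct.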