
AI SpendOps Documentation

AI SpendOps is a drop-in AI API proxy that gives you accurate, auditable accounting of AI usage across providers — without changing your code beyond a base URL and one header.

What does it do?

  • Proxies AI API calls to 14+ providers (OpenAI, Anthropic, Google, and more)
  • Tracks usage — tokens, cost, latency, and granular breakdowns (cache, reasoning, audio, images)
  • Attributes cost — tag requests with dimensions (team, app, environment) for chargeback and reporting
  • Enforces policies — provider/model allow-lists, mandatory dimensions, budget alerts
  • Zero overhead — usage extraction happens asynchronously after the response is returned

How it works

Replace your AI provider's base URL with the AI SpendOps proxy URL and add an X-ASO-API-Key header. That's it.

# Before (direct to OpenAI)
curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4.1","messages":[{"role":"user","content":"Hello"}]}'

# After (through AI SpendOps)
curl https://proxy.aispendops.com/v1/openai/v1/chat/completions \
  -H "Authorization: Bearer sk-..." \
  -H "X-ASO-API-Key: aso_k_yourkey.secret" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4.1","messages":[{"role":"user","content":"Hello"}]}'

The proxy passes your request through to the provider, captures usage data, and returns the response unmodified.
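The routing pattern above (the proxy base, a provider slug, then the provider's native API path) can be sketched as a small helper. `PROXY_BASE` and `proxied_url` are illustrative names for this sketch, not part of the AI SpendOps API:

```python
# Sketch: deriving a proxied URL from the pattern shown in the curl example.
PROXY_BASE = "https://proxy.aispendops.com/v1"

def proxied_url(provider: str, native_path: str) -> str:
    """Prefix a provider-native API path with the proxy base and a
    provider slug, assuming the /v1/<provider>/<native path> pattern."""
    return f"{PROXY_BASE}/{provider}{native_path}"

print(proxied_url("openai", "/v1/chat/completions"))
# → https://proxy.aispendops.com/v1/openai/v1/chat/completions
```

Because the native path is preserved verbatim after the provider slug, existing request-building code keeps working — only the host and prefix change.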

Key features

  • 14 providers — OpenAI, Anthropic, Google, OpenRouter, xAI, Groq, DeepInfra, Novita, Fireworks, Perplexity, Cerebras, Mistral, DeepSeek, Nebius
  • Streaming support — full SSE (Server-Sent Events) passthrough with automatic usage capture
  • Cost attribution — tag requests with dimensions for team/app/environment chargeback
  • Policy enforcement — provider and model allow-lists, mandatory dimensions
  • Budget alerts — set spend thresholds with notifications
  • Edge-first — runs on Cloudflare's global edge network for minimal latency
  • SDK compatible — works with the OpenAI SDK, Anthropic SDK, LiteLLM, and any HTTP client
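For SDK users, pointing a client at the proxy amounts to overriding the base URL and attaching the extra header. Below is a minimal sketch for the official OpenAI Python SDK; the exact base-URL prefix follows the curl example above, and the key values are placeholders:

```python
# Sketch: configuration for routing the OpenAI Python SDK through the proxy.
# The base_url prefix mirrors the /v1/openai/v1/... path from the curl example;
# aso_k_yourkey.secret is a placeholder for your AI SpendOps key.
ASO_CONFIG = {
    "base_url": "https://proxy.aispendops.com/v1/openai/v1",
    "default_headers": {"X-ASO-API-Key": "aso_k_yourkey.secret"},
}

# With the openai package installed (pip install openai), usage would look like:
#
#   from openai import OpenAI
#   client = OpenAI(api_key="sk-...", **ASO_CONFIG)
#   client.chat.completions.create(
#       model="gpt-4.1",
#       messages=[{"role": "user", "content": "Hello"}],
#   )
```

`base_url` and `default_headers` are standard constructor parameters of the OpenAI Python client, so no other code changes are needed; other SDKs and HTTP clients expose equivalent overrides.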

Next steps