Usage Tracking

AI SpendOps captures detailed usage data from every API call, including tokens, cost, and granular breakdowns.

Principles

  • Provider usage is authoritative — we use the provider's reported token counts
  • Exactly one usage event per request — no duplicates
  • Estimation only when unavailable — if the provider doesn't report usage, tokens are estimated from character count (chars / 4)
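The fallback estimate in the last principle can be sketched as follows. The function name and the choice to round up are assumptions for illustration; the only thing stated above is the chars / 4 ratio.

```typescript
// Rough token estimate, used only when the provider reports no usage.
// Rounding up is an assumption; the documented rule is simply chars / 4.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

console.log(estimateTokens("abcdefgh")); // 8 chars / 4 = 2
console.log(estimateTokens("abcde"));    // 5 chars / 4 = 1.25, rounded up to 2
```

Because this is a character-based heuristic, estimated counts can differ noticeably from a real tokenizer, which is why provider-reported counts take precedence.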

What gets tracked

Core token counts

Every request captures:

| Field | Description |
| --- | --- |
| prompt_tokens | Input tokens sent to the model |
| completion_tokens | Output tokens generated by the model |
| total_tokens | Total tokens (input + output) |

Granular breakdowns

When the provider reports them, additional fields are captured:

| Field | Description | Providers |
| --- | --- | --- |
| cache_read_tokens | Tokens served from prompt cache (discounted) | OpenAI, Anthropic, Google, xAI |
| cache_write_tokens | Tokens written to prompt cache | Anthropic |
| reasoning_tokens | Tokens used for reasoning/thinking | OpenAI (o3/o4-mini), Google |
| prompt_audio_tokens | Audio input tokens | OpenAI |
| prompt_image_tokens | Image input tokens | OpenAI |
| completion_audio_tokens | Audio output tokens | OpenAI |
| web_search_requests | Number of web searches performed | Anthropic |
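As an illustration of how provider-specific usage might map onto these fields, here is a sketch for an OpenAI Chat Completions `usage` object. The `TrackedUsage` type and `normalizeOpenAIUsage` function are hypothetical names, and the mapping is an assumption about how the fields line up, not a description of the actual implementation.

```typescript
// Subset of the `usage` object returned by OpenAI Chat Completions.
interface OpenAIUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  prompt_tokens_details?: { cached_tokens?: number; audio_tokens?: number };
  completion_tokens_details?: { reasoning_tokens?: number; audio_tokens?: number };
}

// Hypothetical normalized record using the field names documented above.
interface TrackedUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  cache_read_tokens?: number;
  reasoning_tokens?: number;
  prompt_audio_tokens?: number;
  completion_audio_tokens?: number;
}

// Assumed mapping; fields the provider did not report stay undefined.
function normalizeOpenAIUsage(u: OpenAIUsage): TrackedUsage {
  return {
    prompt_tokens: u.prompt_tokens,
    completion_tokens: u.completion_tokens,
    total_tokens: u.total_tokens,
    cache_read_tokens: u.prompt_tokens_details?.cached_tokens,
    reasoning_tokens: u.completion_tokens_details?.reasoning_tokens,
    prompt_audio_tokens: u.prompt_tokens_details?.audio_tokens,
    completion_audio_tokens: u.completion_tokens_details?.audio_tokens,
  };
}
```

Leaving unreported fields undefined, rather than defaulting them to zero, keeps "not reported" distinguishable from "reported as zero".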

Cost tracking

| Field | Description |
| --- | --- |
| provider_cost | Actual cost reported by the provider (OpenRouter only) |
| total_cost_usd | Calculated cost based on token counts and pricing data |
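A minimal sketch of how a calculated cost could be derived from token counts and pricing data. The `ModelPricing` shape, the per-million-token unit, and the rates are all illustrative assumptions; the sketch also ignores cache discounts and other granular fields.

```typescript
// Hypothetical pricing entry: USD per one million tokens.
interface ModelPricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Cost = tokens * rate for each direction, scaled from per-million pricing.
function calculateCostUsd(
  promptTokens: number,
  completionTokens: number,
  pricing: ModelPricing,
): number {
  const raw =
    promptTokens * pricing.inputPerMTok +
    completionTokens * pricing.outputPerMTok;
  return raw / 1e6;
}

// Illustrative rates only, not real prices for any model.
const examplePricing: ModelPricing = { inputPerMTok: 2, outputPerMTok: 8 };
console.log(calculateCostUsd(1200, 300, examplePricing)); // 0.0048
```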

Timing metrics

| Field | Description |
| --- | --- |
| started_at_ms | When the request was received |
| first_byte_at_ms | When the first byte arrived from the provider |
| ended_at_ms | When the response completed |
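Latency figures can be derived from these fields. The sketch below assumes all three values are absolute timestamps in milliseconds; the `deriveLatency` helper and its output names are hypothetical.

```typescript
interface Timing {
  started_at_ms: number;
  first_byte_at_ms: number;
  ended_at_ms: number;
}

// Assumes absolute millisecond timestamps, so durations are differences.
function deriveLatency(t: Timing): { ttfbMs: number; totalMs: number } {
  return {
    ttfbMs: t.first_byte_at_ms - t.started_at_ms, // time to first byte
    totalMs: t.ended_at_ms - t.started_at_ms,     // full request duration
  };
}

const latency = deriveLatency({
  started_at_ms: 1000,
  first_byte_at_ms: 1250,
  ended_at_ms: 1900,
});
console.log(latency); // { ttfbMs: 250, totalMs: 900 }
```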

Usage sources

Each event records how usage data was obtained:

| Source | Description |
| --- | --- |
| provider | Usage data from the provider's response |
| estimate | Tokens estimated from character count |
| partial | Some fields from provider, others estimated |
| none | No usage data available |
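One way to think about these four values is as a function of how many fields came from the provider versus estimation. The classifier below is a hypothetical sketch of that logic, not the actual rule used internally.

```typescript
type UsageSource = "provider" | "estimate" | "partial" | "none";

// providerFields: number of usage fields taken from the provider response.
// estimatedFields: number of usage fields filled in by estimation.
function classifyUsageSource(
  providerFields: number,
  estimatedFields: number,
): UsageSource {
  if (providerFields > 0 && estimatedFields > 0) return "partial";
  if (providerFields > 0) return "provider";
  if (estimatedFields > 0) return "estimate";
  return "none";
}

console.log(classifyUsageSource(3, 0)); // "provider"
console.log(classifyUsageSource(1, 2)); // "partial"
```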

Async extraction

Usage data extraction happens asynchronously in ctx.waitUntil after the response is returned to you. This means:

  • Zero overhead on your response latency
  • Usage events are processed in the background
  • Events are enriched with pricing data and written to the analytics database
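The pattern looks roughly like the sketch below. `Ctx` is a minimal stand-in for the Workers execution context (only `waitUntil` is modeled), and `recordUsage`, `respond`, and the `sink` array are hypothetical placeholders for the real extraction, enrichment, and analytics write.

```typescript
// Minimal stand-in for the execution context; only waitUntil is needed here.
interface Ctx {
  waitUntil(promise: Promise<unknown>): void;
}

// Placeholder for usage extraction, pricing enrichment, and the analytics write.
async function recordUsage(body: string, sink: string[]): Promise<void> {
  await Promise.resolve(); // yield so the caller can return the response first
  sink.push(`chars=${body.length}`);
}

// Returns the response immediately; usage work continues in the background.
function respond(body: string, ctx: Ctx, sink: string[]): string {
  ctx.waitUntil(recordUsage(body, sink)); // deliberately not awaited
  return body;
}

// Usage with a mock context that just collects the background promises.
const pending: Promise<unknown>[] = [];
const mockCtx: Ctx = { waitUntil: (p) => { pending.push(p); } };
const events: string[] = [];

const reply = respond("hello world", mockCtx, events);
console.log(reply, events.length); // response is ready before any event is written
```

The key point is that the handler never awaits the usage work, so its duration does not add to the time the caller waits for the response.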