Usage Tracking

AI SpendOps captures detailed usage data from every API call, including tokens, cost, and granular breakdowns.

Principles

Provider usage is authoritative: when the provider reports token counts, those counts are used as-is
Exactly one usage event per request: no duplicates
Estimation only as a fallback: if the provider doesn't report usage, tokens are estimated from character count (chars / 4)
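
The fallback estimate can be sketched in a few lines. This is a minimal illustration of the chars / 4 rule, not the actual implementation; the function name `estimate_tokens` is hypothetical, and rounding up is an assumption, since only the division itself is specified.

```python
import math


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about one token per four characters."""
    # Rounding up is an assumption; the documented rule is only "chars / 4".
    return math.ceil(len(text) / 4)


# A 13-character string divides to 3.25, which rounds up to 4.
print(estimate_tokens("Hello, world!"))
```

An estimate like this is only a coarse approximation, which is why provider-reported counts take precedence whenever they exist.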

Every request captures:

Field	Description
`prompt_tokens`	Input tokens sent to the model
`completion_tokens`	Output tokens generated by the model
`total_tokens`	Total tokens (input + output)

When the provider reports them, additional fields are captured:

Field	Description	Providers
`cache_read_tokens`	Tokens served from prompt cache (discounted)	OpenAI, Anthropic, Google, xAI
`cache_write_tokens`	Tokens written to prompt cache	Anthropic
`reasoning_tokens`	Tokens used for reasoning/thinking	OpenAI (o3/o4-mini), Google
`prompt_audio_tokens`	Audio input tokens	OpenAI
`prompt_image_tokens`	Image input tokens	OpenAI
`completion_audio_tokens`	Audio output tokens	OpenAI
`web_search_requests`	Number of web searches performed	Anthropic
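
One way to picture the two groups of fields is a record with required core counts and optional extras. The sketch below is illustrative only: the class name `TokenUsage` is hypothetical, and representing unreported fields as `None` is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenUsage:
    # Core fields, captured for every request.
    prompt_tokens: int
    completion_tokens: int
    # Optional fields, set only when the provider reports them.
    # Using None for "not reported" is an assumption of this sketch.
    cache_read_tokens: Optional[int] = None
    cache_write_tokens: Optional[int] = None
    reasoning_tokens: Optional[int] = None
    prompt_audio_tokens: Optional[int] = None
    prompt_image_tokens: Optional[int] = None
    completion_audio_tokens: Optional[int] = None
    web_search_requests: Optional[int] = None

    @property
    def total_tokens(self) -> int:
        # Total tokens = input + output, as in the core field table.
        return self.prompt_tokens + self.completion_tokens


usage = TokenUsage(prompt_tokens=120, completion_tokens=30, cache_read_tokens=100)
print(usage.total_tokens, usage.reasoning_tokens)
```

Keeping unreported fields distinct from zero lets a consumer tell "the provider reported no cached tokens" apart from "the provider does not report this field".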

Cost fields:

Field	Description
`provider_cost`	Actual cost reported by the provider (OpenRouter only)
`total_cost_usd`	Calculated cost based on token counts and pricing data
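
As a rough illustration of how a calculated cost could be derived from token counts, the sketch below multiplies each count by a per-million-token price. The pricing table, model name, and function name are all hypothetical, and the real calculation may treat cached, reasoning, or audio tokens differently.

```python
from decimal import Decimal

# Hypothetical prices in USD per one million tokens; not real pricing data.
PRICING = {
    "example-model": {"input": Decimal("2.50"), "output": Decimal("10.00")},
}


def total_cost_usd(model: str, prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Cost from token counts and a price table (simplified sketch)."""
    price = PRICING[model]
    per_million = Decimal(1_000_000)
    return (
        prompt_tokens * price["input"] + completion_tokens * price["output"]
    ) / per_million


# 1,000 input tokens and 500 output tokens at the hypothetical prices above.
print(total_cost_usd("example-model", 1000, 500))
```

`Decimal` is used here so that small per-request amounts add up without binary floating-point drift; that is a design choice of the sketch, not a documented detail.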

Timing fields:

Field	Description
`started_at_ms`	When the request was received
`first_byte_at_ms`	When the first byte arrived from the provider
`ended_at_ms`	When the response completed
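
Latency figures can be derived from these three fields. The sketch below assumes all three are absolute timestamps in milliseconds, as the `_at_ms` suffix suggests; the function name and the output keys are hypothetical.

```python
def latency_breakdown(started_at_ms: int, first_byte_at_ms: int, ended_at_ms: int) -> dict:
    """Derive durations from the three timestamps (all in milliseconds)."""
    return {
        # Time spent waiting for the provider to start responding.
        "ttfb_ms": first_byte_at_ms - started_at_ms,
        # Time between the first byte and the end of the response.
        "generation_ms": ended_at_ms - first_byte_at_ms,
        # End-to-end duration of the request.
        "total_ms": ended_at_ms - started_at_ms,
    }


print(latency_breakdown(1000, 1250, 1900))
```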

Each event records how usage data was obtained:

Source	Description
`provider`	Usage data from the provider's response
`estimate`	Tokens estimated from character count
`partial`	Some fields from provider, others estimated
`none`	No usage data available
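
The four source values can be read as the outcome of a simple decision. The sketch below is one plausible way to classify an event, assuming the decision depends only on which core token fields came from the provider and which were estimated; the function name and the dictionary inputs are hypothetical.

```python
CORE_FIELDS = ("prompt_tokens", "completion_tokens")


def classify_source(reported: dict, estimated: dict) -> str:
    """Return 'provider', 'estimate', 'partial', or 'none' for a usage event."""
    has_reported = any(reported.get(f) is not None for f in CORE_FIELDS)
    has_estimated = any(estimated.get(f) is not None for f in CORE_FIELDS)
    if has_reported and has_estimated:
        return "partial"   # some fields from the provider, others estimated
    if has_reported:
        return "provider"  # everything came from the provider's response
    if has_estimated:
        return "estimate"  # everything came from the character-count estimate
    return "none"          # no usage data available


print(classify_source({"prompt_tokens": 120}, {"completion_tokens": 30}))
```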

Usage data extraction happens asynchronously after the response is returned to you. This means: