Per-Token Billing for LLM Inference in 2 Lines of Code
Meter every OpenAI, Anthropic, and Google API call. Set per-token budgets, track cross-provider costs, and bill your users automatically. Works with any LLM SDK.
How it works
```typescript
import { settlegrid } from '@settlegrid/mcp'
import OpenAI from 'openai'

const sg = settlegrid.init({
  toolSlug: 'my-llm-proxy',
  pricing: { model: 'per-token', inputCostPer1k: 0.3, outputCostPer1k: 1.2 },
})

const openai = new OpenAI()

const billedCompletion = sg.wrap(async (args: { prompt: string }) => {
  const response = await openai.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: args.prompt }],
  })
  return { content: [{ type: 'text', text: response.choices[0].message.content }] }
})
```

Supported providers
SettleGrid works with any provider. Here are the most common ones for LLM inference and AI model workloads.
| Provider | Models | Pricing |
|---|---|---|
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3 | $2–60/M tokens |
| Anthropic | Claude Opus, Sonnet, Haiku | $0.25–75/M tokens |
| Google Gemini | Gemini 2.5 Pro, Flash | $1.25–10/M tokens |
| DeepSeek | DeepSeek V3, R1 | $0.14–2.19/M tokens |
| Groq | Llama, Mixtral on LPU | $0.04–0.88/M tokens |
| Together AI | 100+ open models | $0.10–18/M tokens |
| Fireworks AI | Optimized inference | $0.10–3/M tokens |
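Tracking costs across providers means normalizing their usage metadata first, since each SDK reports token counts under different field names. A minimal sketch (the `normalizeUsage` helper and the `TokenUsage` shape are illustrative, not part of any SDK; the field names match each provider's current response metadata):

```typescript
// A normalized usage shape assumed by this sketch (not part of any SDK).
interface TokenUsage { inputTokens: number; outputTokens: number }

// Field names per provider response metadata:
//   OpenAI:    usage.prompt_tokens / usage.completion_tokens
//   Anthropic: usage.input_tokens / usage.output_tokens
//   Gemini:    usageMetadata.promptTokenCount / usageMetadata.candidatesTokenCount
function normalizeUsage(
  provider: 'openai' | 'anthropic' | 'gemini',
  raw: any,
): TokenUsage {
  switch (provider) {
    case 'openai':
      return { inputTokens: raw.usage.prompt_tokens, outputTokens: raw.usage.completion_tokens }
    case 'anthropic':
      return { inputTokens: raw.usage.input_tokens, outputTokens: raw.usage.output_tokens }
    case 'gemini':
      return {
        inputTokens: raw.usageMetadata.promptTokenCount,
        outputTokens: raw.usageMetadata.candidatesTokenCount,
      }
  }
}
```

Once usage is in one shape, a single pricing config can meter calls from any of the providers above.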
Why per-token billing?
LLM inference costs scale with token usage. Per-token billing lets you pass through exact costs to end users, set per-user budget caps, and automatically track input vs output token spend across multiple providers. SettleGrid meters tokens from the response metadata and settles in real time, so you never eat costs from runaway prompts.
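The arithmetic behind that metering is straightforward. A minimal sketch (the `tokenCost` helper and its types are illustrative, not the SettleGrid SDK), using the rates from the pricing config above:

```typescript
// Token usage as reported in a response's metadata.
interface Usage { prompt_tokens: number; completion_tokens: number }

// Per-1K-token rates, mirroring the `pricing` config shown earlier.
interface Pricing { inputCostPer1k: number; outputCostPer1k: number }

// Computes the billed amount for one call: input and output tokens
// are metered separately, each at its own per-1K rate.
function tokenCost(usage: Usage, pricing: Pricing): number {
  const input = (usage.prompt_tokens / 1000) * pricing.inputCostPer1k
  const output = (usage.completion_tokens / 1000) * pricing.outputCostPer1k
  return input + output
}

// Example: 2,000 input tokens and 500 output tokens at $0.30/$1.20 per 1K
const cost = tokenCost(
  { prompt_tokens: 2000, completion_tokens: 500 },
  { inputCostPer1k: 0.3, outputCostPer1k: 1.2 },
)
// cost = 2 * 0.3 + 0.5 * 1.2 = 1.20
```

Because output tokens typically cost several times more than input tokens, metering them separately is what lets you pass through exact costs rather than averaging.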
$106B Total Addressable Market · 7 Supported Providers · 2 min Setup Time
Frequently asked questions
How does per-token billing work with SettleGrid?
Can I set per-user budget caps?
Does this work with streaming responses?
Can I use different pricing for different models?
What if my LLM provider changes their pricing?
Start billing LLM inference and AI models today
Add per-token billing to your LLM inference or AI model service in under 2 minutes. No upfront costs, no contracts.