# Springtech AI API

An OpenAI-compatible gateway to a curated catalog of LLMs — one API key, one wallet, every model.
## Overview

Base URL: `https://platform.springtech.ai/v1`
The API mirrors OpenAI's chat.completions shape, so any OpenAI-compatible SDK works with just a base-URL and key swap. Behind the scenes, requests are routed to the appropriate upstream provider per the model catalog (browse all models).
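Because the request shape matches OpenAI's, a request can be assembled with nothing beyond the standard library. A minimal sketch — the endpoint path and headers are from this page, the helper name is ours:

```python
import json
import urllib.request

BASE_URL = "https://platform.springtech.ai/v1"

def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-shaped chat.completions request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

`urllib.request.urlopen(req)` would then perform the call; any OpenAI-compatible SDK pointed at the base URL above is doing the same thing under the hood.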
## Authentication

All requests require an API key in the `Authorization` header. Generate keys from Settings → API Keys.

```
Authorization: Bearer sk-…
```

Each key can be restricted to a model allowlist and/or a monthly spend cap. See `POST /v1/keys` in the SDKs.
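A restricted-key creation request might look like the fragment below. The field names are our illustrative assumption, not a documented schema — check the SDK reference for `POST /v1/keys` for the real one:

```json
{
  "name": "staging-bot",
  "allowed_models": ["anthropic/claude-haiku-4-5"],
  "monthly_cap_cents": 2000
}
```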
## Chat completions

`POST /v1/chat/completions`
```bash
curl https://platform.springtech.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4-5",
    "messages": [{"role":"user","content":"Summarize this in one sentence."}]
  }'
```

```python
from springtech_sdk import SpringtechClient

client = SpringtechClient(api_key="sk-…")
reply = client.chat.complete(
    model="anthropic/claude-haiku-4-5",
    messages=[{"role": "user", "content": "Hello"}],
)
print(reply.content)
```

```typescript
import { SpringtechClient } from '@ai-platform/sdk';

const client = new SpringtechClient({ apiKey: process.env.SPRINGTECH_KEY! });
const reply = await client.chat.complete({
  model: 'anthropic/claude-haiku-4-5',
  messages: [{ role: 'user', content: 'Hello' }],
});
```

### Streaming
Set `stream: true` to receive incremental SSE chunks. The final chunk includes `usage` and `cost_cents`.
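The event stream can be consumed with just the standard library. A sketch, assuming each event arrives as a `data: {json}` line with OpenAI-shaped `choices[].delta.content` chunks and a final `data: [DONE]` sentinel (the sentinel follows OpenAI's convention and is our assumption here):

```python
import json

def collect_stream(lines):
    """Accumulate text deltas from OpenAI-style SSE lines; return (text, final_chunk)."""
    text, final = [], None
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        if "usage" in chunk:  # the last content chunk carries usage + cost_cents
            final = chunk
        for choice in chunk.get("choices", []):
            text.append(choice.get("delta", {}).get("content") or "")
    return "".join(text), final
```

Feed it the decoded lines of the HTTP response body.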
```bash
curl -N https://platform.springtech.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4-5",
    "messages": [{"role":"user","content":"Count to 5"}],
    "stream": true
  }'
```

```typescript
for await (const chunk of client.chat.complete({
  model: 'springtech/flash',
  messages: [{ role: 'user', content: 'Hello' }],
  stream: true,
})) {
  process.stdout.write(chunk.delta ?? '');
}
```

### Tool use / function calling
Pass OpenAI-shape `tools` and `tool_choice`; they are relayed verbatim to compatible upstreams. Anthropic's `tool_use` blocks are translated to OpenAI `tool_calls` automatically.
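The translation the gateway performs can be pictured with a small sketch. This maps the two public formats; it is our illustration, not the gateway's internal code:

```python
import json

def anthropic_tool_use_to_openai(block: dict) -> dict:
    """Map one Anthropic tool_use content block to an OpenAI tool_calls entry."""
    assert block["type"] == "tool_use"
    return {
        "id": block["id"],
        "type": "function",
        "function": {
            "name": block["name"],
            # OpenAI carries arguments as a JSON string, Anthropic as an object
            "arguments": json.dumps(block["input"]),
        },
    }
```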
```json
{
  "model": "openai/gpt-4o-mini",
  "messages": [{"role":"user","content":"What's the weather in Kuala Lumpur?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "parameters": { "type":"object","properties":{"city":{"type":"string"}} }
    }
  }]
}
```

## List models
`GET /v1/models` — catalog filtered by your key's allowlist (if set).
```bash
curl https://platform.springtech.ai/v1/models -H "Authorization: Bearer sk-…"
```

The response lists the available canonical model ids.
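Assuming the response follows OpenAI's list shape — `{"object": "list", "data": [{"id": …}, …]}`, an assumption based on the compatibility claim above — the ids can be pulled out like this:

```python
def model_ids(listing: dict) -> list:
    """Extract canonical model ids from an OpenAI-style model listing."""
    return [m["id"] for m in listing.get("data", [])]
```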
## Credits

`GET /v1/credits` — wallet balance and this-month spend.
```json
{
  "balance_cents": 49995,
  "this_month_spend_cents": 5,
  "currency": "myr",
  "monthly_limit_cents": null
}
```

## Errors
| Status | Type | Meaning |
|---|---|---|
| 401 | unauthorized | Missing / invalid API key. |
| 402 | monthly_cap_reached | Per-key monthly cap reached or workspace wallet empty. |
| 403 | model_not_allowed | Model not in this key's allowlist (or your plan). |
| 404 | not_found | Model id unknown. |
| 429 | rate_limited | Per-account rate limit (200 req/min by default). |
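429s are retryable. A minimal exponential-backoff wrapper — generic Python, nothing Springtech-specific:

```python
import time

class RateLimited(Exception):
    """Stand-in for whatever your HTTP client raises on a 429."""

def with_backoff(call, retries=3, base_delay=0.5):
    """Retry `call` on 429-style errors, doubling the delay each attempt."""
    for attempt in range(retries):
        try:
            return call()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * 2 ** attempt)
```

402 and 403 are not retryable — top up the wallet or fix the key's allowlist instead.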
## Pricing

Each model has three rates per 1 million tokens: `cache_hit` (the portion of the input already cached upstream), `cache_miss` (regular input), and `output`. The platform applies a fixed margin on top of upstream USD pricing (see `LLM_MARKUP_PERCENT`) and bills your wallet in RM.

Cache-hit tokens are detected automatically from `prompt_cache_hit_tokens` / `prompt_tokens_details.cached_tokens` in the upstream response.
Self-hosted models (e.g. `springtech/flash`) follow a Flash-tier equivalent so pricing is consistent across providers.
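Putting the three rates together, the bill for one request can be sketched as below; the rates and markup in the example are placeholder numbers, not the real price sheet:

```python
def request_cost_cents(tokens: dict, rates_per_million: dict, markup_percent: float = 0.0) -> float:
    """Cost in cents: each token class at its per-million rate, plus a fixed margin.

    `tokens` and `rates_per_million` share the keys cache_hit, cache_miss, output.
    """
    base = sum(
        tokens[kind] * rates_per_million[kind] / 1_000_000
        for kind in ("cache_hit", "cache_miss", "output")
    )
    return base * (1 + markup_percent / 100)
```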
Last updated: 2026-04-27 · Found a bug? hello@springtech.ai