Springtech AI API

An OpenAI-compatible gateway to a curated catalog of LLMs — one API key, one wallet, every model.

Overview

Base URL: https://platform.springtech.ai/v1

The API mirrors OpenAI's chat.completions shape, so any OpenAI-compatible SDK works with just a base-URL and key swap. Behind the scenes, requests are routed to the appropriate upstream provider per the model catalog (browse all models).

Authentication

All requests require an API key in the Authorization header. Generate keys from Settings → API Keys.

HTTP
Authorization: Bearer sk-…

Each key can be restricted to a model allowlist and/or a monthly spend cap. See POST /v1/keys in the SDKs.
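A key-creation request might look like the sketch below. The field names (name, allowed_models, monthly_cap_cents) are assumptions for illustration, not the documented schema — check the SDK reference for the actual shape.

```python
import json

# Hypothetical POST /v1/keys body: restrict a key to two models and cap
# monthly spend at RM 50 (5000 cents). Field names are illustrative only.
payload = {
    "name": "staging-bot",
    "allowed_models": ["anthropic/claude-haiku-4-5", "springtech/flash"],
    "monthly_cap_cents": 5000,
}

print(json.dumps(payload, indent=2))
```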

Chat completions

POST /v1/chat/completions

cURL
curl https://platform.springtech.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4-5",
    "messages": [{"role":"user","content":"Summarize this in one sentence."}]
  }'
Python
from springtech_sdk import SpringtechClient
client = SpringtechClient(api_key="sk-…")

reply = client.chat.complete(
    model="anthropic/claude-haiku-4-5",
    messages=[{"role":"user","content":"Hello"}],
)
print(reply.content)
TypeScript
import { SpringtechClient } from '@ai-platform/sdk';
const client = new SpringtechClient({ apiKey: process.env.SPRINGTECH_KEY! });

const reply = await client.chat.complete({
  model: 'anthropic/claude-haiku-4-5',
  messages: [{ role: 'user', content: 'Hello' }],
});

Streaming

Set stream: true to receive incremental SSE chunks. The final chunk includes usage and cost_cents.

cURL
curl -N https://platform.springtech.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4-5",
    "messages": [{"role":"user","content":"Count to 5"}],
    "stream": true
  }'
TypeScript
for await (const chunk of client.chat.complete({
  model: 'springtech/flash',
  messages: [{ role: 'user', content: 'Hello' }],
  stream: true,
})) {
  process.stdout.write(chunk.delta ?? '');
}
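Without an SDK, the stream can be consumed by parsing SSE `data:` lines directly. This is a minimal Python sketch assuming OpenAI-style chunks, where text deltas live at `choices[0].delta.content` and the stream terminates with `data: [DONE]`:

```python
import json

def iter_deltas(lines):
    """Yield text deltas from OpenAI-style SSE lines.

    Each event is a line of the form `data: {...json...}`; the stream ends
    with `data: [DONE]`. The final JSON chunk may carry `usage` and
    `cost_cents` instead of a content delta, so it yields nothing.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alives and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        delta = chunk.get("choices", [{}])[0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

sample = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    'data: {"choices":[{"delta":{}}],"usage":{"total_tokens":7},"cost_cents":1}',
    "data: [DONE]",
]
print("".join(iter_deltas(sample)))  # Hello
```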

Tool use / function calling

Pass OpenAI-shape tools and tool_choice; they are relayed verbatim to compatible upstreams. Anthropic's tool_use blocks are translated to OpenAI tool_calls automatically.

JSON
{
  "model": "openai/gpt-4o-mini",
  "messages": [{"role":"user","content":"What's the weather in Kuala Lumpur?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "parameters": { "type":"object","properties":{"city":{"type":"string"}} }
    }
  }]
}
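The tool_use-to-tool_calls translation the gateway performs can be pictured roughly as below. Both shapes follow the public Anthropic and OpenAI formats; the function itself is an illustrative sketch, not the gateway's actual code. Note that OpenAI expects `arguments` as a JSON string, while Anthropic's `input` is an object:

```python
import json

def tool_use_to_tool_calls(anthropic_content):
    """Map Anthropic `tool_use` content blocks to OpenAI `tool_calls`."""
    calls = []
    for block in anthropic_content:
        if block.get("type") != "tool_use":
            continue  # text blocks etc. are carried over elsewhere
        calls.append({
            "id": block["id"],
            "type": "function",
            "function": {
                "name": block["name"],
                # OpenAI wants a JSON *string*, not an object
                "arguments": json.dumps(block["input"]),
            },
        })
    return calls

anthropic_content = [
    {"type": "text", "text": "Let me check."},
    {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
     "input": {"city": "Kuala Lumpur"}},
]
print(tool_use_to_tool_calls(anthropic_content))
```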

List models

GET /v1/models — catalog filtered by your key's allowlist (if set).

cURL
curl https://platform.springtech.ai/v1/models -H "Authorization: Bearer sk-…"

For the full list of canonical model ids, see the model catalog.

Credits

GET /v1/credits — wallet balance and this-month spend.

JSON
{
  "balance_cents": 49995,
  "this_month_spend_cents": 5,
  "currency": "myr",
  "monthly_limit_cents": null
}
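All monetary fields are integer cents in MYR. A small formatting sketch for rendering them:

```python
def cents_to_rm(cents):
    """Format integer MYR cents as a ringgit string."""
    return f"RM {cents / 100:.2f}"

credits = {"balance_cents": 49995, "this_month_spend_cents": 5}
print(cents_to_rm(credits["balance_cents"]))           # RM 499.95
print(cents_to_rm(credits["this_month_spend_cents"]))  # RM 0.05
```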

Errors

Status | Type                | Meaning
401    | unauthorized        | Missing / invalid API key.
402    | monthly_cap_reached | Per-key monthly cap reached or workspace wallet empty.
403    | model_not_allowed   | Model not in this key's allowlist (or your plan).
404    | not_found           | Model id unknown.
429    | rate_limited        | Per-account rate limit (200 req/min by default).
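A common client-side pattern is to retry 429s with exponential backoff while treating auth/billing errors (401/402/403/404) as fatal, since retrying cannot fix them. A sketch — the `send` callable is a stand-in for whatever HTTP client you use:

```python
import time

RETRYABLE = {429}
FATAL = {401, 402, 403, 404}

def with_backoff(send, max_attempts=4, base_delay=0.5):
    """Call `send()` (returning a (status, body) pair) with retries.

    Rate limits are retried with exponential backoff; fatal statuses
    raise immediately. The last attempt's result is returned as-is.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status in FATAL:
            raise RuntimeError(f"request failed: {status} {body}")
        if status in RETRYABLE and attempt < max_attempts - 1:
            time.sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
            continue
        return status, body

responses = iter([(429, "rate_limited"), (200, "ok")])
print(with_backoff(lambda: next(responses), base_delay=0))  # (200, 'ok')
```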

Pricing

Each model has three rates per 1 million tokens: cache_hit (input tokens already cached upstream), cache_miss (regular input), and output. The platform applies a fixed margin on top of upstream USD pricing (see LLM_MARKUP_PERCENT) and bills your wallet in RM.

Cache-hit tokens are detected automatically from prompt_cache_hit_tokens / prompt_tokens_details.cached_tokens in the upstream response.
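Putting the three rates together, the cost of one call works out as in this sketch. The per-million rates here are made-up numbers for illustration, not real pricing — see the model catalog for actual rates:

```python
def call_cost_cents(usage, rates):
    """Cost in cents for one call, given per-1M-token rates in cents.

    `usage` splits input into cached (`cache_hit`) and regular
    (`cache_miss`) tokens; output tokens always bill at the output rate.
    """
    return sum(
        usage[k] * rates[k] / 1_000_000
        for k in ("cache_hit", "cache_miss", "output")
    )

# Hypothetical rates (cents per 1M tokens) and a sample request:
rates = {"cache_hit": 10, "cache_miss": 100, "output": 400}
usage = {"cache_hit": 100_000, "cache_miss": 10_000, "output": 5_000}
print(call_cost_cents(usage, rates))  # 4.0
```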

Self-hosted models (e.g. springtech/flash) follow a Flash-tier equivalent so pricing is consistent across providers.

Last updated: 2026-04-27 · Found a bug? hello@springtech.ai