# Springtech AI API

An OpenAI-compatible gateway to a curated catalog of LLMs — one API key, one wallet, every model.
## Overview

Base URL: `https://platform.springtech.ai/v1`
The API mirrors OpenAI's chat.completions shape, so any OpenAI-compatible SDK works with just a base-URL and key swap. Behind the scenes, requests are routed to the appropriate upstream provider per the model catalog (browse all models).
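Because the request shape matches OpenAI's, a request can be assembled with nothing beyond the standard library. A minimal sketch — the endpoint path and headers are from this page, the helper name is ours:

```python
import json
import urllib.request

BASE_URL = "https://platform.springtech.ai/v1"

def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-shaped chat.completions request."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

`urllib.request.urlopen(req)` would then perform the call; any OpenAI-compatible SDK pointed at the base URL above is doing the same thing under the hood.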
## Authentication

All requests require an API key in the `Authorization` header. Generate keys from Settings → API Keys.

```
Authorization: Bearer sk-…
```

Each key can be restricted to a model allowlist and/or a monthly spend cap. See `POST /v1/keys` in the SDKs.
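A restricted-key creation request might look like the fragment below. The field names are our illustrative assumption, not a documented schema — check the SDK reference for `POST /v1/keys` for the real one:

```json
{
  "name": "staging-bot",
  "allowed_models": ["anthropic/claude-haiku-4-5"],
  "monthly_cap_cents": 2000
}
```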
## Chat completions

`POST /v1/chat/completions`
```bash
curl https://platform.springtech.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4-5",
    "messages": [{"role":"user","content":"Summarize this in one sentence."}]
  }'
```

```python
from springtech_sdk import SpringtechClient

client = SpringtechClient(api_key="sk-…")
reply = client.chat.complete(
    model="anthropic/claude-haiku-4-5",
    messages=[{"role": "user", "content": "Hello"}],
)
print(reply.content)
```

```typescript
import { SpringtechClient } from '@ai-platform/sdk';

const client = new SpringtechClient({ apiKey: process.env.SPRINGTECH_KEY! });
const reply = await client.chat.complete({
  model: 'anthropic/claude-haiku-4-5',
  messages: [{ role: 'user', content: 'Hello' }],
});
```

### Streaming
Set `stream: true` to receive incremental SSE chunks. The final chunk includes `usage` and `cost_cents`.
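The event stream can be consumed with just the standard library. A sketch, assuming each event arrives as a `data: {json}` line with OpenAI-shaped `choices[].delta.content` chunks and a final `data: [DONE]` sentinel (the sentinel follows OpenAI's convention and is our assumption here):

```python
import json

def collect_stream(lines):
    """Accumulate text deltas from OpenAI-style SSE lines; return (text, final_chunk)."""
    text, final = [], None
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        if "usage" in chunk:  # the last content chunk carries usage + cost_cents
            final = chunk
        for choice in chunk.get("choices", []):
            text.append(choice.get("delta", {}).get("content") or "")
    return "".join(text), final
```

Feed it the decoded lines of the HTTP response body.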
```bash
curl -N https://platform.springtech.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-…" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-haiku-4-5",
    "messages": [{"role":"user","content":"Count to 5"}],
    "stream": true
  }'
```

```typescript
for await (const chunk of client.chat.complete({
  model: 'springtech/flash',
  messages: [{ role: 'user', content: 'Hello' }],
  stream: true,
})) {
  process.stdout.write(chunk.delta ?? '');
}
```

### Tool use / function calling
Pass OpenAI-shape `tools` and `tool_choice`; they are relayed verbatim to compatible upstreams. Anthropic's `tool_use` blocks are translated to OpenAI `tool_calls` automatically.
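The translation the gateway performs can be pictured with a small sketch. This maps the two public formats; it is our illustration, not the gateway's internal code:

```python
import json

def anthropic_tool_use_to_openai(block: dict) -> dict:
    """Map one Anthropic tool_use content block to an OpenAI tool_calls entry."""
    assert block["type"] == "tool_use"
    return {
        "id": block["id"],
        "type": "function",
        "function": {
            "name": block["name"],
            # OpenAI carries arguments as a JSON string, Anthropic as an object
            "arguments": json.dumps(block["input"]),
        },
    }
```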
```json
{
  "model": "openai/gpt-4o-mini",
  "messages": [{"role":"user","content":"What's the weather in Kuala Lumpur?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "parameters": { "type":"object","properties":{"city":{"type":"string"}} }
    }
  }]
}
```

## List models
`GET /v1/models` — catalog filtered by your key's allowlist (if set).
```bash
curl https://platform.springtech.ai/v1/models -H "Authorization: Bearer sk-…"
```

The response lists the available canonical model ids.
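Assuming the response follows OpenAI's list shape — `{"object": "list", "data": [{"id": …}, …]}`, an assumption based on the compatibility claim above — the ids can be pulled out like this:

```python
def model_ids(listing: dict) -> list:
    """Extract canonical model ids from an OpenAI-style model listing."""
    return [m["id"] for m in listing.get("data", [])]
```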
## Credits

`GET /v1/credits` — wallet balance and this-month spend.
```json
{
  "balance_cents": 49995,
  "this_month_spend_cents": 5,
  "currency": "myr",
  "monthly_limit_cents": null
}
```

## Errors
| Status | Type | Meaning |
|---|---|---|
| 401 | unauthorized | Missing / invalid API key. |
| 402 | monthly_cap_reached | Per-key monthly cap reached or workspace wallet empty. |
| 403 | model_not_allowed | Model not in this key's allowlist (or your plan). |
| 404 | not_found | Model id unknown. |
| 429 | rate_limited | Per-account rate limit (200 req/min by default). |
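429s are retryable. A minimal exponential-backoff wrapper — generic Python, nothing Springtech-specific:

```python
import time

class RateLimited(Exception):
    """Stand-in for whatever your HTTP client raises on a 429."""

def with_backoff(call, retries=3, base_delay=0.5):
    """Retry `call` on 429-style errors, doubling the delay each attempt."""
    for attempt in range(retries):
        try:
            return call()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries; surface the error
            time.sleep(base_delay * 2 ** attempt)
```

402 and 403 are not retryable — top up the wallet or fix the key's allowlist instead.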
## Pricing

Each model has three rates per 1 million tokens: `cache_hit` (the portion of the input already cached upstream), `cache_miss` (regular input), and `output`. The platform applies a fixed margin on top of upstream USD pricing (see `LLM_MARKUP_PERCENT`) and bills your wallet in RM.

Cache-hit tokens are detected automatically from `prompt_cache_hit_tokens` / `prompt_tokens_details.cached_tokens` in the upstream response.
Self-hosted models (e.g. `springtech/flash`) follow a Flash-tier equivalent so pricing is consistent across providers.
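Putting the three rates together, the bill for one request can be sketched as below; the rates and markup in the example are placeholder numbers, not the real price sheet:

```python
def request_cost_cents(tokens: dict, rates_per_million: dict, markup_percent: float = 0.0) -> float:
    """Cost in cents: each token class at its per-million rate, plus a fixed margin.

    `tokens` and `rates_per_million` share the keys cache_hit, cache_miss, output.
    """
    base = sum(
        tokens[kind] * rates_per_million[kind] / 1_000_000
        for kind in ("cache_hit", "cache_miss", "output")
    )
    return base * (1 + markup_percent / 100)
```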
Last updated: 2026-04-27 · Found a bug? hello@springtech.ai