Kallavy
OpenAI-API compatible

API Documentation

Kallavy is an AI broker: your app talks to a single OpenAI-compatible endpoint, and we route each request to the provider behind the model you choose. Usage is billed in BRL via PIX, with a Brazilian invoice and local support. Already using the OpenAI SDK? Swap base_url and api_key; no other code changes are needed.

Introduction

This reference covers the developer-facing endpoints. Everything is HTTPS with JSON bodies and follows the same contract as OpenAI's Chat Completions API. Account management — keys, balance, usage, governance and invoices — lives in the dashboard, not the API.

Privacy (LGPD): we only store usage metadata (model, tokens, cost). The content of your prompts and responses is never stored.

Base URL

All calls use the dedicated broker domain:

endpoint
https://api.kallavy.com/v1

Authentication

Every request needs the Authorization header with your API key as a Bearer token. Create and manage keys in the dashboard → API Keys.

header
Authorization: Bearer sk-...your-key

Keep your key secret. It grants access to your account balance. Never expose it in front-end code or public repos. If a key is compromised, revoke it in the dashboard and create a new one.
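
One way to keep the key out of source code is to read it from the environment at startup. The sketch below builds the Authorization header described in this section; the variable name KALLAVY_API_KEY and the helper build_headers are assumptions made for this example, not names defined by Kallavy.

```python
import os
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return request headers carrying the key as a Bearer token.

    Falls back to the KALLAVY_API_KEY environment variable, a name
    chosen for this example only.
    """
    key = api_key or os.environ.get("KALLAVY_API_KEY")
    if not key:
        raise RuntimeError("No API key: pass one in or set KALLAVY_API_KEY")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```

The same environment lookup can feed the api_key argument of the OpenAI client instead of a hard-coded string.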

Quickstart

A full chat call in Python, using the OpenAI SDK:

# pip install openai
from openai import OpenAI

client = OpenAI(
    api_key="sk-...your-key",
    base_url="https://api.kallavy.com/v1",
)

resp = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Summarize this contract in 3 lines."}],
)
print(resp.choices[0].message.content)

GET /v1/models

Lists the models available to your account. Use this as the source of truth — the catalog changes as new providers come online and as your account governance evolves. Same shape as OpenAI.

response · 200
{
  "object": "list",
  "data": [
    { "id": "deepseek-chat", "object": "model", "owned_by": "kallavy" },
    { "id": "claude-sonnet", "object": "model", "owned_by": "kallavy" },
    // ...
  ]
}
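
Because the catalog can change, it may help to check that a model is available before sending traffic to it. The following is a minimal sketch that works on an already-decoded response body of the shape documented for this endpoint; model_ids and is_available are illustrative helper names, not part of any SDK.

```python
from typing import Any, Dict, List


def model_ids(payload: Dict[str, Any]) -> List[str]:
    """Extract model IDs from a decoded /v1/models response body."""
    if payload.get("object") != "list":
        raise ValueError("unexpected payload: expected object == 'list'")
    return [m["id"] for m in payload.get("data", []) if m.get("object") == "model"]


def is_available(payload: Dict[str, Any], model: str) -> bool:
    """True if the given model ID appears in the catalog payload."""
    return model in model_ids(payload)


# Sample payload copied from the documented response (elided entries dropped).
sample = {
    "object": "list",
    "data": [
        {"id": "deepseek-chat", "object": "model", "owned_by": "kallavy"},
        {"id": "claude-sonnet", "object": "model", "owned_by": "kallavy"},
    ],
}
print(model_ids(sample))
```

With the OpenAI Python SDK, the same information should be reachable through client.models.list(), whose items expose an id attribute.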

POST /v1/chat/completions

Generates a chat response. Accepts the same fields as OpenAI; Kallavy authenticates, meters usage and forwards to the real provider behind the chosen model.

Body parameters

Field                             Type     Description
model *                           string   Model ID (e.g. deepseek-chat). See /v1/models.
messages *                        array    List of { "role", "content" } messages; role is system, user or assistant.
stream                            boolean  If true, streams tokens via SSE. Default: false.
temperature, max_tokens, top_p…   various  Other standard OpenAI parameters are forwarded to the provider.

* required
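
If you call the endpoint without the SDK, a small builder can catch missing required fields before the request leaves your app. This is a sketch under the assumptions of the table above: it checks only the two required fields and the three roles listed there, and build_chat_body is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Roles named in the parameter table; extend if your use case needs others.
ALLOWED_ROLES = {"system", "user", "assistant"}


def build_chat_body(
    model: str,
    messages: List[Dict[str, str]],
    stream: bool = False,
    **extra: Any,
) -> Dict[str, Any]:
    """Assemble a JSON-serializable body for POST /v1/chat/completions."""
    if not model:
        raise ValueError("model is required")
    if not messages:
        raise ValueError("messages is required and must not be empty")
    for i, msg in enumerate(messages):
        if msg.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"messages[{i}] has an unsupported role")
        if "content" not in msg:
            raise ValueError(f"messages[{i}] is missing content")
    body: Dict[str, Any] = {"model": model, "messages": messages, "stream": stream}
    # Optional parameters (temperature, max_tokens, top_p, ...) pass through as-is.
    body.update(extra)
    return body


body = build_chat_body(
    "deepseek-chat",
    [{"role": "user", "content": "Summarize this contract in 3 lines."}],
    temperature=0.2,
)
```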

Response

response · 200
{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "deepseek-chat",
  "choices": [{
    "index": 0,
    "message": { "role": "assistant", "content": "..." },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 42, "completion_tokens": 88, "total_tokens": 130 }
}
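
When working with the raw JSON rather than SDK objects, the fields most callers need are the assistant text, the finish reason and the token counts. A minimal sketch, assuming a decoded body of the shape shown above; Reply and parse_completion are illustrative names.

```python
from typing import Any, Dict, NamedTuple


class Reply(NamedTuple):
    text: str
    finish_reason: str
    total_tokens: int


def parse_completion(body: Dict[str, Any]) -> Reply:
    """Pull the first choice and the usage totals out of a chat.completion body."""
    choice = body["choices"][0]
    usage = body.get("usage", {})
    return Reply(
        text=choice["message"]["content"],
        finish_reason=choice["finish_reason"],
        total_tokens=usage.get("total_tokens", 0),
    )


# Body copied from the documented response.
example = {
    "id": "chatcmpl-...",
    "object": "chat.completion",
    "model": "deepseek-chat",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "..."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 42, "completion_tokens": 88, "total_tokens": 130},
}
reply = parse_completion(example)
```

In OpenAI's contract, a finish_reason of "length" means the output was cut off by the token limit, so it is worth checking before treating the text as complete.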

Streaming

With stream: true, the response arrives as Server-Sent Events chunks (data: {...}), ending with data: [DONE] — identical to OpenAI. The SDKs handle this for you.

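# Reuses the `client` created in the Quickstart.
stream = client.chat.completions.create(
    model="claude-sonnet",
    messages=[{"role": "user", "content": "Write a haiku."}],
    stream=True,
)
for chunk in stream:
    # Skip chunks that carry no choices; delta.content may be None.
    if chunk.choices:
        print(chunk.choices[0].delta.content or "", end="")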
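
If you consume the stream without an SDK, you need to read the SSE lines yourself. The sketch below parses lines of the form described in this section (data: {...}, terminated by data: [DONE]) and yields the text deltas. It assumes the chunk layout of OpenAI's streaming format (choices[].delta.content); iter_sse_content and the sample lines are illustrative, not captured API output.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from SSE lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # blank separators, comments and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text


# Hand-written sample lines in the assumed chunk shape.
sample_lines = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" world"}}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_sse_content(sample_lines)))
```

The function takes any iterable of decoded lines, so it can be fed from whichever HTTP client you use to read the response body line by line.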

Errors

Errors follow standard HTTP status codes with a descriptive JSON body.

Code  Meaning
401   Missing, invalid or revoked key.
402   Insufficient balance. Top up via PIX in the dashboard.
403   Blocked by account governance (model/provider not allowed, or spend cap reached).
429   Rate limit exceeded. Back off and retry.
5xx   Upstream provider failure. Retry.
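
The table implies two kinds of failure: 429 and 5xx are worth retrying, while 401, 402 and 403 need a fix on your side first. A minimal retry sketch under that reading, using exponential backoff with jitter; the attempt count and delay values are arbitrary example choices, and send stands for whatever function performs your HTTP call and returns (status, body).

```python
import random
import time
from typing import Any, Callable, Tuple


def is_retryable(status: int) -> bool:
    """429 and 5xx are transient; 401/402/403 require user action."""
    return status == 429 or 500 <= status <= 599


def backoff_delay(attempt: int, base: float = 0.5, cap: float = 8.0) -> float:
    """Exponential backoff with full jitter, in seconds."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


def call_with_retry(
    send: Callable[[], Tuple[int, Any]],
    max_attempts: int = 4,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call send() until it succeeds, fails permanently, or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        last_attempt = attempt == max_attempts - 1
        if not is_retryable(status) or last_attempt:
            return status, body
        sleep(backoff_delay(attempt))
    raise AssertionError("unreachable")
```

The OpenAI Python SDK also has a max_retries client option, which may be enough if you use the SDK rather than raw HTTP.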

Limits & billing

  • Usage is metered by input and output tokens, per model, and debited from your BRL balance.
  • Rate limits apply per key (tunable on your plan). On overflow the API returns 429.
  • Your company can set governance: allow/block models and providers and enforce spend caps — all in the dashboard.
  • Top-ups via PIX with NF-e invoicing. Balance and history live in the dashboard.
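
Since usage is metered by input and output tokens, you can estimate the cost of a call from the usage object in the response. The sketch below is illustrative only: it assumes prices are quoted in BRL per million tokens, and the prices in the usage line are placeholders, not Kallavy rates. Check the dashboard for actual per-model pricing.

```python
from decimal import Decimal
from typing import Dict


def estimate_cost_brl(
    usage: Dict[str, int],
    input_price_per_mtok: Decimal,
    output_price_per_mtok: Decimal,
) -> Decimal:
    """Estimate the BRL cost of one call from its token usage.

    Prices are assumed to be BRL per one million tokens.
    """
    million = Decimal(1_000_000)
    input_cost = Decimal(usage["prompt_tokens"]) * input_price_per_mtok
    output_cost = Decimal(usage["completion_tokens"]) * output_price_per_mtok
    return (input_cost + output_cost) / million


# Usage values from the documented response; prices are placeholders.
usage = {"prompt_tokens": 42, "completion_tokens": 88, "total_tokens": 130}
cost = estimate_cost_brl(usage, Decimal("1.00"), Decimal("2.00"))
```

Decimal avoids the rounding drift that floats would introduce when many small per-call costs are summed.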

Support

Integration questions? Reach us at support@kallavy.com.
