Kallavy — Brazil's AI Broker

Every AI model. One call.

Dozens of models from the world's leading AI providers on a single endpoint, 100% OpenAI API-compatible. Kallavy brokers every request, meters usage per client, and consolidates it all into a single invoice in BRL, payable via PIX, with automatic Brazilian e-invoicing (NF-e).

No international credit card
Automatic Brazilian e-invoice
LGPD-compliant
client.py
# Just swap base_url; no refactor needed
from openai import OpenAI

client = OpenAI(
    api_key="sk-br-abc123",
    base_url="https://api.kallavy.com/v1",
)

resp = client.chat.completions.create(
    model="gemini-flash",
    messages=[{"role": "user", "content": "Summarize..."}],
)
POST /v1/chat/completions
200 OK · gemini-flash · 1,214 tokens · 238 ms
Latency: ~10 ms
Uptime: 99.9%

Why Brazilian companies choose Kallavy

7+ AI models
4 global providers
~10 ms average latency in Brazil
100% of infrastructure hosted in Brazil

Advantages

What makes Kallavy different

Built by Brazilians, for Brazilian companies. We remove every friction between you and global AI.

3 barriers removed

1. PIX & automatic NF-e

Top up your account via PIX (Brazil's instant payment system) in seconds. The electronic invoice (NF-e) is issued automatically via our Focus NFe integration. Never again stall an AI initiative for lack of a corporate card.

2. 100% OpenAI-compatible

Already using the openai library? Just swap base_url and keep going. No new SDK, no refactor, no vendor lock-in.

3. Portuguese human support

A Brazilian technical team, available during business hours (Brasília time) via WhatsApp, email, and chat in Portuguese. Prompt troubleshooting, 429 errors, or model selection? We speak your language.
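The 429 errors mentioned above are HTTP rate-limit responses, and a common client-side remedy is to retry with exponential backoff. The following is a minimal sketch of that pattern, not part of any Kallavy SDK: `call_with_backoff`, its parameters, and the `send` callable (assumed to return an HTTP status and a body) are hypothetical names chosen for illustration.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-429 status or attempts run out.

    `send` is a hypothetical callable returning (http_status, body).
    """
    status, body = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Exponential backoff with jitter: base delays of 0.5 s, 1 s, 2 s, ...
        delay = base_delay * (2 ** (attempt - 1))
        sleep(delay + random.uniform(0, delay / 2))
        status, body = send()
    return status, body
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without actually waiting.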
How it works

Three steps. Five minutes.

From signup to your first API call. No red tape.

1. Create your account (2 minutes)

Company tax ID (CNPJ) or personal ID (CPF), email, and password. Your API key is generated on the spot and shown only once, so keep it safe.

2. Top up via PIX (1 minute)

A PIX QR code is generated as soon as your proposal is finalized. Credit lands in your account in seconds and the NF-e is issued right after, with no manual work required.

3. Start using (immediate)

Point your favorite SDK at api.kallavy.com/v1. If you were already using OpenAI, the rest of your code doesn't change. Kallavy brokers every request: it authenticates, meters tokens per client, forwards to the provider, and returns the response, all ready for a single BRL invoice.
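For readers not using an SDK, the request an OpenAI-compatible client sends can be sketched with the Python standard library alone. This assumes the endpoint follows the OpenAI convention of a `Bearer` token in the `Authorization` header and a JSON body with `model` and `messages`; `build_chat_request` is a hypothetical helper, and the key is the placeholder from the sample above. The request is built but not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.kallavy.com/v1"  # endpoint named in the text above


def build_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: Bearer auth, as in the OpenAI API convention.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("sk-br-abc123", "gemini-flash", "Hello")
# To send it with a valid key: urllib.request.urlopen(req)
```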
Infrastructure

The only AI broker with a direct route to your user

We're connected directly to PTT.br in São Paulo — Latin America's largest internet exchange. Native peering with Brazilian ISPs means your traffic takes fewer hops, arrives faster, and with fewer points of failure.

[Diagram: Kallavy (São Paulo) → PTT.br / IX.br SP (direct peering) → Brazilian ISPs (Vivo · Claro · TIM · Oi · Algar) → your user (Brazil-wide)]

Your traffic never crosses the Atlantic. No international transit, no FX on every request, no RTT surprises.

~10 ms average latency to SP and RJ
99.9% availability SLA
10 Gbps dedicated backbone, zero overselling
PTT.br direct peering with top Brazilian ISPs

Real speed

Domestic traffic stays in Brazil. Every request skips the transatlantic RTT — your chatbot feels like the AI is in the next room.

Resilience

Multi-path BGP routing and automatic fallback between AI providers. If an upstream goes down, we route to the next one — and you don't even notice.

Data sovereignty

Servers on Brazilian soil. Logs, metadata, and account data stay in Brazil, LGPD-compliant. Audits and DPA available on request.

Connected to the world's leading AI providers

OpenAI · Anthropic · Google · DeepSeek · Mistral · Meta Llama · Qwen · Moonshot · MiniMax · Cohere · Groq · Perplexity · Nvidia
Available models

The best global models, on a single endpoint

From premium GPT-4o to budget-friendly Gemini Flash — pick what fits your use case. Automatic fallback if a provider goes down.

GPT-4o (OpenAI · premium)
Top-of-the-line multimodal. Vision, text, and reasoning. 128k context.

GPT-4o Mini (OpenAI · economy)
Fast and lightweight for high volume. 128k context.

Claude Sonnet (Anthropic · premium)
Deep reasoning and high-quality writing. 200k context.

Gemini 1.5 Pro (Google · premium)
Massive context for document analysis. 2M context.

Gemini Flash (Google · most popular)
Ultra-fast. Perfect for support chatbots. 1M context.

DeepSeek Chat (DeepSeek · high efficiency)
Outstanding price-performance for general use. 64k context.

DeepSeek R1 (DeepSeek · reasoning)
Step-by-step reasoning, o1-style. 64k context.

Smart routing (Kallavy · coming soon)
You pick quality or cost. We route. Multi-provider.

The lineup is always expanding. Talk to us for pricing, SLAs, and specific use cases.

FAQ

Questions worth asking

If yours isn't here, ping us on WhatsApp.

Kallavy is your Brazilian intermediary between your application and the world's leading AIs. When your app sends a prompt, we authenticate the request, meter input and output tokens per client, forward it to the actual provider (OpenAI, Anthropic, Google, or DeepSeek), and return the response. At month-end, you receive a single BRL invoice with NF-e covering your team's entire usage. Think of it like a phone carrier: you don't talk to each tower, just to one company that handles everything.

No. Kallavy doesn't hold a token inventory. Your application makes requests through our API, we meter everything in real time and forward them to the actual providers (OpenAI, Anthropic, Google, DeepSeek). They bill us in USD for aggregated usage; we bill you in BRL for that cost plus an intermediation fee that varies by model and volume. That fee covers the Brazilian operation: NF-e, Portuguese support, FX risk, domestic infra, SLA, and per-client metering.

Yes. Service e-invoice (NFS-e) is issued automatically on every confirmed PIX top-up via our Focus NFe integration. Available in your dashboard as PDF and XML, and emailed to you.

No. Kallavy's API is 100% OpenAI-compatible. Just point base_url to https://api.kallavy.com/v1 and use your Kallavy API key. Works with the official openai library in Python, Node, Go, and others.

No. We never store prompt or response content — by LGPD design and internal policy. We only keep metadata: model used, token counts, timestamp, and cost. Financial and technical audits run on that metadata alone.
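To make the metadata-only policy concrete, here is a hypothetical sketch of what such a record could look like and how per-client totals could be derived from it. The fields mirror the list above (model, token counts, timestamp, cost); `client_id`, the class and function names, the sample values, and the model identifiers other than `gemini-flash` are assumptions for illustration, not Kallavy's actual schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UsageRecord:
    """Metadata only: no prompt or response text is kept."""
    client_id: str  # hypothetical identifier for per-client metering
    model: str
    input_tokens: int
    output_tokens: int
    timestamp: datetime
    cost_brl: float


def tokens_by_client(records):
    """Sum input + output tokens for each client."""
    totals = defaultdict(int)
    for r in records:
        totals[r.client_id] += r.input_tokens + r.output_tokens
    return dict(totals)


now = datetime.now(timezone.utc)
records = [
    # Made-up sample values, for illustration only.
    UsageRecord("acme", "gemini-flash", 1000, 214, now, 0.01),
    UsageRecord("acme", "gpt-4o", 500, 100, now, 0.05),
    UsageRecord("globex", "deepseek-chat", 200, 50, now, 0.002),
]
print(tokens_by_client(records))  # {'acme': 1814, 'globex': 250}
```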

You're charged for what you actually consume: input and output tokens per model, metered in real time. On top of the provider's USD cost, Kallavy applies the FX conversion to BRL and adds an intermediation fee that varies by model and volume — that fee covers NF-e issuance, Portuguese support, Brazil-hosted infra, FX risk, and per-client metering. We support prepaid credits (PIX) or monthly invoicing for B2B accounts. Talk to us for a custom proposal.
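The pricing logic described above can be worked through as simple arithmetic. The sketch below assumes the intermediation fee is a percentage applied on top of the converted cost, which the text does not specify; every rate in the example (per-token prices, exchange rate, fee) is made up, and `estimate_cost_brl` is a hypothetical helper.

```python
def estimate_cost_brl(
    input_tokens: int,
    output_tokens: int,
    usd_per_1m_input: float,
    usd_per_1m_output: float,
    usd_to_brl: float,
    fee_rate: float,
) -> float:
    """Provider cost in USD, converted to BRL, plus a percentage fee.

    Assumption: the fee is a simple percentage markup (fee_rate=0.20 means 20%).
    """
    usd = (
        input_tokens * usd_per_1m_input + output_tokens * usd_per_1m_output
    ) / 1_000_000
    return usd * usd_to_brl * (1 + fee_rate)


# Made-up rates: USD 0.50 / USD 1.50 per 1M tokens, BRL 5.00 per USD, 20% fee.
cost = estimate_cost_brl(1_000_000, 200_000, 0.50, 1.50, 5.0, 0.20)
print(f"R$ {cost:.2f}")  # USD 0.80 -> BRL 4.00 -> BRL 4.80 with the fee
```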

Automatic fallback. If OpenAI returns a 5xx error, Kallavy routes the request to an equivalent model (e.g., Claude Sonnet) with no action required from you. You configure the fallback chain in your dashboard.
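Conceptually, a fallback chain is an ordered list of models tried in turn until one does not fail with a server error. The sketch below illustrates that idea only; it is not Kallavy's implementation, which runs on their side. `complete_with_fallback`, the `call_model` callable (assumed to return an HTTP status and a body), and the model identifiers are all hypothetical.

```python
from typing import Callable, Sequence, Tuple


def complete_with_fallback(
    chain: Sequence[str],
    call_model: Callable[[str], Tuple[int, str]],
) -> Tuple[str, str]:
    """Try each model in order; move to the next only on a 5xx status."""
    last_status = None
    for model in chain:
        status, body = call_model(model)
        if 500 <= status <= 599:
            last_status = status
            continue
        return model, body
    raise RuntimeError(f"all models in the chain failed (last status {last_status})")


def fake_call(model: str) -> Tuple[int, str]:
    # Stand-in for a real request: pretend the first model is down.
    return (503, "") if model == "gpt-4o" else (200, f"answer from {model}")


print(complete_with_fallback(["gpt-4o", "claude-sonnet"], fake_call))
# ('claude-sonnet', 'answer from claude-sonnet')
```

Only 5xx responses trigger the next model here; client errors such as 4xx are returned as-is, since retrying them against another provider would usually not help.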

New · Desktop app

Bring Kallavy to your PC

The broker in a native app: chat with OpenAI, Claude, Gemini and DeepSeek straight from your desktop, with a global shortcut and quick replies. Same account, same balance.

Download for Windows
Windows 10/11 · 64-bit · .exe installer
macOS: coming soon · Linux: coming soon
API live right now

Ready to call the AI?

Create your account, top up via PIX, and make your first request in under 5 minutes. No international card, no red tape.

Prepaid credits
No lock-in
Automatic NF-e