API429 Documentation
Client-facing API429 documentation: how to list models, send LLM requests, generate images, and check the current token balance.
Playground + Swagger
Interactive playground and Swagger for client APIs only
Use the playground for live requests and the separate Swagger view for schemas, parameters, and the client OpenAPI specification.
Client endpoints only: request schemas, responses, and code samples by language.
Run requests directly from the browser with your own Bearer token.
Both Swagger and the playground use a separate public spec without internal or admin routes.
1. Base URL and authorization
The canonical base URL for client requests is https://balancer.api429.com/v1. Send your Bearer token in the Authorization header on every request.
```shell
curl https://balancer.api429.com/v1/models \
  -H "Authorization: Bearer YOUR_TOKEN"
```

- Use `https://balancer.api429.com/v1` as your main endpoint.
- `GET /v1/models` returns the live model catalog for the current token.
- Do not expose the token in frontend code or public repositories.
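The same header setup in Python, as a minimal sketch using the `requests` library (the `make_session` helper name is ours, not part of any official SDK):

```python
import requests

BASE_URL = "https://balancer.api429.com/v1"

def make_session(token: str) -> requests.Session:
    # Attach the Bearer token once; every request made through this
    # session then carries the Authorization header automatically.
    session = requests.Session()
    session.headers.update({
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    })
    return session
```

Usage: `make_session("YOUR_TOKEN").get(f"{BASE_URL}/models", timeout=30)`.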
2. Model list: /v1/models
Before your first integration call, fetch the model list with the actual token you will use. Do not rely on a static list in documentation as the only source of truth.
```shell
curl https://balancer.api429.com/v1/models \
  -H "Authorization: Bearer YOUR_TOKEN"
```

- Call `GET /v1/models` before production rollout and when syncing SDKs.
- The response is token-specific and reflects the models actually available to that client.
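A Python sketch of the same call. The response shape assumed here (OpenAI-style `{"data": [{"id": ...}, ...]}`) is an assumption; verify it against the live endpoint with your token:

```python
import requests

def extract_model_ids(payload: dict) -> list[str]:
    # Assumes an OpenAI-style catalog shape: {"data": [{"id": ...}, ...]}.
    return [m["id"] for m in payload.get("data", [])]

def list_model_ids(token: str) -> list[str]:
    # Fetch the token-specific model catalog from GET /v1/models.
    resp = requests.get(
        "https://balancer.api429.com/v1/models",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return extract_model_ids(resp.json())
```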
3. Canonical parameter naming
The canonical API429 naming style is snake_case. The image route also supports camelCase compatibility aliases.
Image generation: canonical
`aspect_ratio`, `resolution`, `response_format`, `reference_images`, `negative_prompt`, `google_search`, `safety_settings`, `project_id`, `veon_model_key`
Image generation: compatibility aliases
`aspectRatio`, `imageSize`, `responseFormat`, `referenceImages`, `negativePrompt`, `googleSearch`, `safetySettings`, `projectId`, `veonModelKey`
LLM / chat: canonical only
`messages`, `model`, `temperature`, `stream`, `max_tokens`, `top_p`, `top_k`, `stop`, `tools`, `tool_choice`, `response_format`, `reasoning_effort`, `response_modalities`, `thinking_level`, `safety_settings`
For Gemini-oriented chat scenarios on /v1/chat/completions, these compatibility aliases are also supported:
`responseModalities`, `thinkingLevel`, `safetySettings`
Do not rely on camelCase for chat routes in general. The following fields are not currently normalized and may be silently ignored:

`maxTokens`, `topP`, `toolChoice`, `responseFormat`, `reasoningEffort`
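Since only a handful of camelCase aliases are normalized on chat routes, one defensive option is to convert keys to snake_case client-side before sending. This is a hypothetical client-side helper, not part of the API:

```python
import re

def to_snake_case(key: str) -> str:
    # "maxTokens" -> "max_tokens", "topP" -> "top_p"
    return re.sub(r"(?<!^)(?=[A-Z])", "_", key).lower()

def snakeify(payload: dict) -> dict:
    # Convert top-level keys only; nested objects such as safety-setting
    # entries keep their provider-defined casing.
    return {to_snake_case(k): v for k, v in payload.items()}
```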
4. LLM generation: /v1/chat/completions
The main text endpoint is compatible with the OpenAI-style chat API. Use messages, model, and additional canonical snake_case fields.
```shell
curl -X POST https://balancer.api429.com/v1/chat/completions \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-3.1-pro",
    "messages": [
      {"role": "user", "content": "Briefly explain how to connect to API429."}
    ],
    "responseModalities": ["TEXT"],
    "thinkingLevel": "HIGH",
    "safetySettings": [
      {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"}
    ],
    "temperature": 0.3,
    "max_tokens": 300
  }'
```

```python
import requests

url = "https://balancer.api429.com/v1/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_TOKEN",
    "Content-Type": "application/json",
}
payload = {
    "model": "gemini-3.1-pro",
    "messages": [
        {"role": "user", "content": "Write a short onboarding message for a client."}
    ],
    "responseModalities": ["TEXT"],
    "thinkingLevel": "MINIMAL",
    "safetySettings": [
        {"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"}
    ],
    "temperature": 0.2,
    "max_tokens": 200,
}
response = requests.post(url, headers=headers, json=payload, timeout=180)
print(response.json())
```

- Live-verified with `gemini-3.1-pro` and `gpt-5.1`.
- Enable streaming with `stream: true`.
- `responseModalities`, `thinkingLevel`, and `safetySettings` are normalized on the chat route.
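A sketch of consuming the stream enabled by `stream: true`, assuming OpenAI-style server-sent events (`data: {...}` lines ending with `data: [DONE]`). The chunk shape is an assumption; verify it against the live API:

```python
import json
import requests

def parse_sse_delta(line: str):
    # One OpenAI-style SSE line: 'data: {"choices":[{"delta":{"content":...}}]}'.
    # Returns the text delta, or None for blanks, [DONE], and non-data lines.
    if not line.startswith("data: "):
        return None
    data = line[len("data: "):]
    if data == "[DONE]":
        return None
    chunk = json.loads(data)
    return chunk["choices"][0].get("delta", {}).get("content")

def stream_chat(token: str, payload: dict):
    # Send the request with stream: true and yield text as it arrives.
    resp = requests.post(
        "https://balancer.api429.com/v1/chat/completions",
        headers={"Authorization": f"Bearer {token}"},
        json={**payload, "stream": True},
        stream=True,
        timeout=180,
    )
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        text = parse_sse_delta(line or "")
        if text:
            yield text
```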
Gemini-native protocol
For the native Gemini protocol, keep the native structure: `generationConfig.responseModalities`, `generationConfig.thinkingConfig`, and top-level `safetySettings`.
```json
{
  "contents": [
    {
      "role": "user",
      "parts": [{"text": "Describe this concept briefly."}]
    }
  ],
  "generationConfig": {
    "responseModalities": ["TEXT", "IMAGE"],
    "thinkingConfig": {"thinkingLevel": "HIGH"}
  },
  "safetySettings": [
    {"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"}
  ]
}
```

5. Image generation: /v1/images/generations
Use /v1/images/generations for image generation. The canonical style is snake_case, but the route also supports camelCase aliases.
```shell
curl -X POST https://balancer.api429.com/v1/images/generations \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nano-banana-pro",
    "prompt": "Futuristic megacity at night, cinematic lighting",
    "aspect_ratio": "16:9",
    "resolution": "2K",
    "response_format": "b64_json"
  }'
```

```python
import requests

url = "https://balancer.api429.com/v1/images/generations"
headers = {
    "Authorization": "Bearer YOUR_TOKEN",
    "Content-Type": "application/json",
}
payload = {
    "model": "nano-banana-pro",
    "prompt": "Futuristic megacity at night, cinematic lighting",
    "aspect_ratio": "16:9",
    "resolution": "2K",
    "response_format": "b64_json",
}
response = requests.post(url, headers=headers, json=payload, timeout=300)
result = response.json()
print(result)
```

- `resolution` works. Supported size tiers are `1K`, `2K`, and `4K`.
- `aspect_ratio` and `aspectRatio` both work.
- `response_format: "b64_json"` is convenient for backend integrations.
- Important: `resolution` sets the target size tier, not an exact returned pixel size. Aspect ratio is guaranteed; exact pixels are not.
- Live-verified: snake_case and camelCase variants behave the same for image generation.
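With `response_format: "b64_json"`, the response carries base64-encoded image bytes. A decoding sketch, assuming an OpenAI-style response shape of `{"data": [{"b64_json": ...}]}` (verify against the live response):

```python
import base64
import pathlib

def save_b64_images(result: dict, prefix: str = "image") -> list[str]:
    # Assumes an OpenAI-style image response: {"data": [{"b64_json": ...}]}.
    # Decodes each entry, writes it to disk, and returns the file paths.
    paths = []
    for i, item in enumerate(result.get("data", [])):
        path = f"{prefix}_{i}.png"
        pathlib.Path(path).write_bytes(base64.b64decode(item["b64_json"]))
        paths.append(path)
    return paths
```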
6. Balance check: /api/client/balance
If you need to check the available token balance before a request series, use the dedicated client endpoint.
```shell
curl https://balancer.api429.com/api/client/balance \
  -H "Authorization: Bearer YOUR_TOKEN"
```

- The endpoint returns the current balance state for the Bearer token.
- If the token balance is insufficient, generation endpoints return `402 Insufficient Balance`.
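A Python sketch of the preflight check. The `"balance"` field name below is an assumption; inspect the live response from `/api/client/balance` for the exact shape:

```python
import requests

def balance_value(payload: dict) -> float:
    # The "balance" field name is an assumption about the response shape.
    return float(payload.get("balance", 0))

def check_balance(token: str) -> float:
    # Query the client balance before starting a request series.
    resp = requests.get(
        "https://balancer.api429.com/api/client/balance",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return balance_value(resp.json())
```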
7. Common errors
401 Unauthorized
Token missing or invalid. Check the `Authorization: Bearer ...` header.
402 Insufficient Balance
The client balance does not have enough funds.
422 Validation Error
Invalid JSON structure or incorrect data types. On chat routes, make sure you are using canonical snake_case fields where required.
429 Too Many Requests
Rate limits or burst traffic exceeded. Use retries with exponential backoff.
503 Service Unavailable
Temporary route, model, or service availability issue. Retry the request after a few seconds.
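The retryable statuses above (429 and 503) can be handled with exponential backoff plus jitter. A minimal sketch; the helper names are ours:

```python
import random
import time
import requests

RETRYABLE = {429, 503}

def backoff_delay(attempt: int, cap: float = 30.0) -> float:
    # 1s, 2s, 4s, ... capped at `cap`, plus up to 1s of jitter
    # to avoid synchronized retries across clients.
    return min(2.0 ** attempt, cap) + random.uniform(0, 1)

def post_with_retries(url: str, headers: dict, payload: dict, max_retries: int = 5):
    # Retry only 429/503; any other status is returned to the caller as-is.
    for attempt in range(max_retries + 1):
        resp = requests.post(url, headers=headers, json=payload, timeout=180)
        if resp.status_code not in RETRYABLE or attempt == max_retries:
            return resp
        time.sleep(backoff_delay(attempt))
```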
8. Practical recommendations
- For text and LLM requests, set a client timeout of at least 60 to 180 seconds.
- For image generation, use 120 to 300 seconds.
- Before production rollout, always run `GET /v1/models` and 1 to 2 smoke requests with the real token.
- If image dimensions matter, optimize around aspect ratio and size tier rather than exact returned pixels.
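The pre-rollout checklist above can be sketched as one smoke-test function (the function and payload names are ours, not part of any SDK):

```python
import requests

BASE_URL = "https://balancer.api429.com/v1"

def smoke_payload(model: str) -> dict:
    # Minimal one-shot chat request used purely as a smoke check.
    return {
        "model": model,
        "messages": [{"role": "user", "content": "ping"}],
        "max_tokens": 5,
    }

def smoke_test(token: str, model: str = "gemini-3.1-pro") -> bool:
    # Confirm the model catalog loads, then send one minimal chat
    # request with the real token.
    headers = {"Authorization": f"Bearer {token}"}
    models = requests.get(f"{BASE_URL}/models", headers=headers, timeout=60)
    if models.status_code != 200:
        return False
    chat = requests.post(
        f"{BASE_URL}/chat/completions",
        headers=headers,
        json=smoke_payload(model),
        timeout=180,
    )
    return chat.status_code == 200
```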
9. Support
If you hit a non-standard integration issue, send us the exact request, endpoint, and model id. That is the fastest way to verify routing and parameters without long back-and-forth.
@gangoneog on Telegram