Quick Start
API Documentation
Get started with the CLEX unified API in under 60 seconds. Access models through a single endpoint with provider routing handled by CLEX.
1 Set your base URL and API key
# Base URL for all CLEX API calls
CLEX_BASE_URL="https://api.clex.in/v1"
# Your CLEX API key (set as environment variable)
CLEX_API_KEY="clex_xxxxxxxxxxxxxxxxxxxx"
Note: https://api.clex.in/v1 is the API base URL for your SDK or HTTP client; it is not a browser page. Open the dashboard to create keys and manage access.
2 Make your first API call
curl https://api.clex.in/v1/chat/completions \
-H "Authorization: Bearer $CLEX_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "meta/llama-3.3-70b-instruct",
"messages": [
{"role": "user", "content": "Hello!"}
]
}'
Important: /v1/chat/completions expects a POST request with JSON. Opening it directly in a browser tab will not behave like a web page.
3 Integrate into your app
# Install OpenAI SDK if not already
npm install openai
# Run your script
CLEX_API_KEY="clex_xxx" node app.js
# You are now running on CLEX proxy
Authentication
CLEX uses CLEX API keys for authentication. The CLEX routing layer handles upstream provider credentials and normalization.
How it works
1. Create a CLEX API key from api.clex.in
2. Set it as the CLEX_API_KEY environment variable
3. CLEX validates your key and routes requests to the selected provider
// Send your CLEX API key as a Bearer token
Authorization: Bearer clex_xxxxxxxxxxxxxxxxxxxx
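The header above can be built in code like this (a minimal sketch; the helper name is illustrative, and only the Bearer scheme shown above is assumed):

```python
import os

def clex_headers(api_key=None):
    """Build request headers for the CLEX API.

    Falls back to the CLEX_API_KEY environment variable when no
    key is passed explicitly.
    """
    key = api_key or os.environ["CLEX_API_KEY"]
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }

print(clex_headers("clex_xxxxxxxxxxxxxxxxxxxx")["Authorization"])
# Bearer clex_xxxxxxxxxxxxxxxxxxxx
```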
Chat Completions
The primary endpoint for generating AI responses. Compatible with the OpenAI chat completions format.
/v1/chat/completions
In this docs page, the API Explorer uses the local /api/chat route as a proxy for https://api.clex.in/v1/chat/completions. Use https://api.clex.in/v1/chat/completions from code, cURL, Postman, or the OpenAI SDK. For browser navigation and API key management, go to api.clex.in.
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | CLEX model ID (e.g. meta/llama-3.3-70b-instruct) |
| messages | array | Yes | Array of message objects with role and content |
| temperature | number | No | Sampling temperature, 0–2. Default: 0.7 |
| max_tokens | integer | No | Maximum tokens to generate. Default is model-specific when omitted. |
| stream | boolean | No | Enable streaming. Default: true |
| top_p | number | No | Nucleus sampling threshold, 0–1. Default: 0.9 |
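A request body following the parameter table can be assembled programmatically. This is a sketch; the helper name is illustrative, and it simply applies the documented defaults, omitting max_tokens so the model-specific default applies:

```python
def build_chat_payload(model, messages, temperature=0.7,
                       max_tokens=None, stream=True, top_p=0.9):
    """Assemble a /v1/chat/completions request body.

    Defaults mirror the request-body table; max_tokens is left out
    when not given so the model-specific default applies.
    """
    payload = {
        "model": model,
        "messages": messages,
        "temperature": temperature,
        "stream": stream,
        "top_p": top_p,
    }
    if max_tokens is not None:
        payload["max_tokens"] = max_tokens
    return payload

payload = build_chat_payload(
    "meta/llama-3.3-70b-instruct",
    [{"role": "user", "content": "Hello!"}],
)
```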
Response Format
Responses follow the OpenAI-compatible format. When streaming is enabled (default), responses arrive as Server-Sent Events.
{
"id": "chatcmpl-abc123",
"object": "chat.completion",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help you today?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 12,
"completion_tokens": 9,
"total_tokens": 21
}
}
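Given a parsed response like the one above, the assistant message and token usage can be pulled out directly (a minimal sketch using the sample body):

```python
import json

response_body = """{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "choices": [
    {"index": 0,
     "message": {"role": "assistant",
                 "content": "Hello! How can I help you today?"},
     "finish_reason": "stop"}
  ],
  "usage": {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21}
}"""

data = json.loads(response_body)
# The generated text lives on the first choice's message.
reply = data["choices"][0]["message"]["content"]
# Token accounting is reported alongside the choices.
total = data["usage"]["total_tokens"]

print(reply)  # Hello! How can I help you today?
print(total)  # 21
```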
Streaming
Streaming uses Server-Sent Events (SSE). Each event contains a
JSON object with the incremental token. The stream is terminated
by a data: [DONE] event.
// Each SSE line:
data: {"choices":[{"delta":{"content":"Hello"}}]}
data: {"choices":[{"delta":{"content":" world"}}]}
data: [DONE]
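The SSE lines above can be reassembled into the full reply with a small parser. This is a sketch: real code would read the lines from the HTTP response, and the function name is illustrative:

```python
import json

def collect_stream(lines):
    """Accumulate delta content from SSE 'data:' lines until [DONE]."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # terminator event: the stream is complete
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        parts.append(delta.get("content", ""))
    return "".join(parts)

stream = [
    'data: {"choices":[{"delta":{"content":"Hello"}}]}',
    'data: {"choices":[{"delta":{"content":" world"}}]}',
    'data: [DONE]',
]
print(collect_stream(stream))  # Hello world
```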
Available Models
CLEX provides access to models from multiple providers. Browse the full catalog on the Models page. Popular picks:
| Model ID | Publisher | Context | Max Output | Pricing | Use Case |
|---|---|---|---|---|---|
Code Examples
import requests
import json
import os

url = "https://api.clex.in/v1/chat/completions"

payload = {
    "model": "meta/llama-3.3-70b-instruct",
    "messages": [
        {"role": "user", "content": "What is quantum computing?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
}

# Streaming response (streaming is enabled by default)
headers = {"Authorization": f"Bearer {os.environ['CLEX_API_KEY']}"}
response = requests.post(url, json=payload, headers=headers, stream=True)

for line in response.iter_lines():
    if line:
        line = line.decode('utf-8')
        if line.startswith('data: '):
            data = line[6:]
            if data == '[DONE]':
                break
            chunk = json.loads(data)
            content = chunk['choices'][0]['delta'].get('content', '')
            print(content, end='')
Error Handling
Errors return a standardized JSON object containing an
error field with descriptive
information, helping developers debug integrations quickly.
{
"error": {
"message": "Model 'meta/llama-nonexistent' is not available. Check /v1/models for supported models.",
"type": "upstream_error",
"code": "provider_error",
"status": 404
}
}
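A response body can be checked for this error shape before its choices are used. A sketch, assuming the fields shown above; the retryable-status set reflects the status table in this section:

```python
import json

RETRYABLE = {429, 500, 503}  # statuses worth retrying after a backoff

def check_error(body):
    """Return (message, retryable) if the body is an error, else None."""
    data = json.loads(body)
    err = data.get("error")
    if err is None:
        return None  # normal response: no error envelope present
    return err["message"], err.get("status") in RETRYABLE

body = ('{"error": {"message": "Model not available", '
        '"type": "upstream_error", "code": "provider_error", '
        '"status": 404}}')
print(check_error(body))  # ('Model not available', False)
```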
| HTTP Status | Error Code | Cause & Resolution |
|---|---|---|
| 400 Bad Request | invalid_request_error | Cause: Malformed JSON or invalid parameter. Fix: Validate the messages array structure and parameter types. |
| 401 Unauthorized | authentication_error | Cause: Missing or invalid CLEX API key. Fix: Ensure the CLEX_API_KEY environment variable is set on the server. |
| 404 Not Found | model_not_found | Cause: Requested model ID is incorrect or deprecated. Fix: Verify the exact model string from the catalog. |
| 429 Rate Limit | rate_limit_exceeded | Cause: Exceeded allowed quota or request rate. Fix: Pace your requests with exponential backoff; wait a few seconds before retrying. |
| 500 Server Error | internal_server_error | Cause: Unhandled backend exception or upstream provider failure. Fix: Retry the request after a short delay. If it persists, check Support. |
| 503 Unavailable | service_unavailable | Cause: Model is overloaded or under maintenance. Fix: Wait and retry, or gracefully fall back to a smaller model (e.g. Llama 8B). |
Rate Limits
Rate limits are determined by your CLEX plan and current provider capacity. Typical starter limits are:
- 50 requests / minute
- 10K tokens / minute
- 1,000 requests / day
💡 Best Practices
- Implement exponential backoff on 429 errors
- Cache responses when possible
- Use streaming for long responses to improve UX
- Set an appropriate max_tokens to avoid wasting quota
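The backoff advice above can be sketched as follows. The base delay and cap are illustrative choices, and production code would typically add random jitter to the delays:

```python
import time

def backoff_delays(max_retries=4, base=1.0, cap=30.0):
    """Exponential backoff schedule: base * 2**attempt, capped."""
    return [min(base * (2 ** attempt), cap) for attempt in range(max_retries)]

def call_with_retry(make_request, max_retries=4):
    """Retry a callable that returns an HTTP status, backing off on 429."""
    for delay in backoff_delays(max_retries):
        status = make_request()
        if status != 429:
            return status
        time.sleep(delay)  # wait before the next attempt
    return make_request()  # final attempt after the last delay

print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0]
```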
Tutorials & Guides
Kickstart your production workflow with these step-by-step onboarding guides.
Build a Chatbot UI
Learn to wire up vanilla JS and standard HTML to stream tokens back to the client.
Read Guide
Agentic Workflows
Use DeepSeek or Llama 3.3 for multi-step reasoning and tool-calling flows.
Read Guide
RAG Setup
Build retrieval pipelines using CLEX embedding endpoints alongside conversational LLMs.
Read Guide
API Changelog & Versioning
CLEX endpoints follow semantic versioning. Breaking changes will
always be introduced under a new version prefix (e.g.
/v2/chat/completions). Below are
recent platform updates.
Interactive Documentation
Released the new API Explorer inside our documentation, standard metadata view in our catalog, and an expanded Error Reference.
Extended Model Catalog
Added support for Mistral Large 3, DeepSeek R1 distillation variants, and Meta Llama 4 Scout. All endpoints maintained full backward compatibility.