Quick Start


Get started with the CLEX unified API in under 60 seconds. Access models through a single endpoint with provider routing handled by CLEX.

1 Set your base URL and API key

Environment Setup
# Base URL for all CLEX API calls
CLEX_BASE_URL="https://api.clex.in/v1"

# Your CLEX API key (set as environment variable)
CLEX_API_KEY="clex_xxxxxxxxxxxxxxxxxxxx"

Note: https://api.clex.in/v1 is the API base URL for your SDK or HTTP client. It is not a browser page. Open the dashboard to create keys and manage access.

2 Make your first API call

cURL
curl https://api.clex.in/v1/chat/completions \
  -H "Authorization: Bearer $CLEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta/llama-3.3-70b-instruct",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

Important: /v1/chat/completions expects a POST request with JSON. If you open it directly in a browser tab, it will not behave like a web page.
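The same call can be made from any HTTP client. A minimal Python sketch using only the standard library, mirroring the cURL example above (the network call itself only runs when CLEX_API_KEY is set):

```python
import json
import os
import urllib.request

# Build the same POST request as the cURL example above.
payload = {
    "model": "meta/llama-3.3-70b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
}
req = urllib.request.Request(
    "https://api.clex.in/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {os.environ.get('CLEX_API_KEY', '')}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# Only hit the network when a key is actually configured.
if os.environ.get("CLEX_API_KEY"):
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```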

3 Integrate into your app

Terminal
# Install OpenAI SDK if not already
npm install openai

# Run your script
CLEX_API_KEY="clex_xxx" node app.js

# You are now running on the CLEX proxy

Authentication

CLEX uses CLEX API keys for authentication. The CLEX routing layer handles upstream provider credentials and normalization.

How it works

1. Create a CLEX API key from api.clex.in
2. Set it as the CLEX_API_KEY environment variable
3. CLEX validates your key and routes requests to the selected provider

API authentication
// Send your CLEX API key as a Bearer token
Authorization: Bearer clex_xxxxxxxxxxxxxxxxxxxx

Chat Completions

The primary endpoint for generating AI responses. Compatible with the OpenAI chat completions format.

POST /v1/chat/completions

On this docs page, the API Explorer uses the local /api/chat route as a proxy for https://api.clex.in/v1/chat/completions.

Use https://api.clex.in/v1/chat/completions from code, cURL, Postman, or the OpenAI SDK. For browser navigation and API keys, go to api.clex.in.

Request Body

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | CLEX model ID (e.g. meta/llama-3.3-70b-instruct) |
| messages | array | Yes | Array of message objects with role and content |
| temperature | number | No | Sampling temperature, 0–2. Default: 0.7 |
| max_tokens | integer | No | Maximum tokens to generate. Default is model-specific when omitted. |
| stream | boolean | No | Enable streaming. Default: true |
| top_p | number | No | Nucleus sampling threshold, 0–1. Default: 0.9 |
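Putting the table together: only model and messages are required. A sketch of a payload with the documented defaults spelled out (in practice you would omit any value you don't want to change):

```python
import json

# Required fields plus the documented defaults for the optional ones.
payload = {
    "model": "meta/llama-3.3-70b-instruct",   # required
    "messages": [                             # required
        {"role": "user", "content": "Hello!"}
    ],
    "temperature": 0.7,   # optional, default 0.7
    "stream": True,       # optional, default true
    "top_p": 0.9,         # optional, default 0.9
    # max_tokens omitted: the default is model-specific
}
body = json.dumps(payload)
print(body)
```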

Response Format

Responses follow the OpenAI-compatible format. When streaming is enabled (default), responses arrive as Server-Sent Events.

Non-streaming response
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 9,
    "total_tokens": 21
  }
}
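Reading a non-streaming response amounts to picking the first choice and the usage counters. A short sketch against the sample body above:

```python
# Sample non-streaming response body from the documentation above.
response = {
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help you today?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 12, "completion_tokens": 9, "total_tokens": 21},
}

# The generated text lives on the first choice's message.
reply = response["choices"][0]["message"]["content"]
usage = response["usage"]
print(reply)
print(f"{usage['total_tokens']} tokens billed")
```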

Streaming

Streaming uses Server-Sent Events (SSE). Each event contains a JSON object with the incremental token. The stream is terminated by a data: [DONE] event.

Streaming event format
// Each SSE line:
data: {"choices":[{"delta":{"content":"Hello"}}]}

data: {"choices":[{"delta":{"content":" world"}}]}

data: [DONE]
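A small helper that turns raw SSE lines into text fragments and stops at the [DONE] sentinel, shown here against the sample events above:

```python
import json

def parse_sse(lines):
    """Yield content fragments from 'data: ...' SSE lines until [DONE]."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines
        data = line[len("data: "):]
        if data == "[DONE]":
            return
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

events = [
    'data: {"choices":[{"delta":{"content":"Hello"}}]}',
    'data: {"choices":[{"delta":{"content":" world"}}]}',
    "data: [DONE]",
]
print("".join(parse_sse(events)))  # → Hello world
```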

Available Models

CLEX provides access to a growing catalog of models. Browse the full catalog on the Models page, which lists each model's ID, publisher, context window, max output, pricing, and recommended use case.

Code Examples

Python — requests
import requests
import json
import os

url = "https://api.clex.in/v1/chat/completions"

payload = {
    "model": "meta/llama-3.3-70b-instruct",
    "messages": [
        {"role": "user", "content": "What is quantum computing?"}
    ],
    "temperature": 0.7,
    "max_tokens": 1024
}

# Streaming response
headers = {"Authorization": f"Bearer {os.environ['CLEX_API_KEY']}"}
response = requests.post(url, json=payload, headers=headers, stream=True)

for line in response.iter_lines():
    if line:
        line = line.decode('utf-8')
        if line.startswith('data: '):
            data = line[6:]
            if data == '[DONE]':
                break
            chunk = json.loads(data)
            content = chunk['choices'][0]['delta'].get('content', '')
            print(content, end='')

Error Handling

Errors return a standardized JSON object containing an error field with descriptive information, helping developers debug integrations quickly.

JSON Error Format
{
  "error": {
    "message": "Model 'meta/llama-nonexistent' is not available. Check /v1/models for supported models.",
    "type": "upstream_error",
    "code": "provider_error",
    "status": 404
  }
}
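Because every endpoint returns this one error shape, a single check covers them all. A sketch of a guard against the sample body above (the CLEXError class is illustrative, not part of any SDK):

```python
class CLEXError(Exception):
    """Illustrative exception wrapping the standardized error object."""
    def __init__(self, err):
        super().__init__(err["message"])
        self.type = err["type"]
        self.code = err["code"]
        self.status = err["status"]

def raise_for_error(body):
    """Raise CLEXError if the response body carries an 'error' field."""
    if "error" in body:
        raise CLEXError(body["error"])
    return body

# Sample error body from the documentation above.
body = {
    "error": {
        "message": "Model 'meta/llama-nonexistent' is not available. Check /v1/models for supported models.",
        "type": "upstream_error",
        "code": "provider_error",
        "status": 404,
    }
}
try:
    raise_for_error(body)
except CLEXError as e:
    print(e.status, e.code)  # → 404 provider_error
```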
| HTTP Status | Error Code | Cause & Resolution |
| --- | --- | --- |
| 400 Bad Request | invalid_request_error | Cause: malformed JSON or an invalid parameter. Fix: validate the messages array structure and parameter types. |
| 401 Unauthorized | authentication_error | Cause: missing or invalid CLEX API key. Fix: ensure the CLEX_API_KEY environment variable is set on the server. |
| 404 Not Found | model_not_found | Cause: the requested model ID is incorrect or deprecated. Fix: verify the exact model string against the catalog. |
| 429 Rate Limit | rate_limit_exceeded | Cause: exceeded the allowed quota or request velocity. Fix: pace your requests with exponential backoff; wait a few seconds before retrying. |
| 500 Server Error | internal_server_error | Cause: unhandled backend exception or upstream provider failure. Fix: retry after a short delay; if it persists, contact Support. |
| 503 Unavailable | service_unavailable | Cause: the model is overloaded or under maintenance. Fix: wait and retry, or gracefully fall back to a smaller model (e.g. Llama 8B). |

Rate Limits

Rate limits are determined by your CLEX plan and current provider capacity. Typical starter limits are:

  • 50 requests / minute
  • 10K tokens / minute
  • 1000 requests / day

💡 Best Practices

  • Implement exponential backoff on 429 errors
  • Cache responses when possible
  • Use streaming for long responses to improve UX
  • Set appropriate max_tokens to avoid wasting quota
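The first practice above can be sketched as a small retry wrapper; RateLimitError and call_api are hypothetical stand-ins for whatever your HTTP client raises and calls:

```python
import time

class RateLimitError(Exception):
    """Hypothetical stand-in for an HTTP 429 from the API."""

def with_backoff(fn, max_retries=5, base_delay=1.0):
    """Retry fn on rate limits, doubling the wait each attempt."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Demo: a fake call that is rate-limited twice, then succeeds.
attempts = {"n": 0}
def call_api():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError()
    return "ok"

print(with_backoff(call_api, base_delay=0.01))  # → ok
```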

Tutorials & Guides

Kickstart your production workflow with these step-by-step onboarding guides.

API Changelog & Versioning

CLEX endpoints follow semantic versioning. Breaking changes will always be introduced under a new version prefix (e.g. /v2/chat/completions). Below are recent platform updates.

Interactive Documentation

Released the new API Explorer inside our documentation, a standardized metadata view in our catalog, and an expanded Error Reference.

Extended Model Catalog

Added support for Mistral Large 3, DeepSeek R1 distillation variants, and Meta Llama 4 Scout. All endpoints maintained full backward compatibility.