OpenAI-Compatible API

Base URL: https://api.langmart.ai/v1

LangMart provides a fully OpenAI-compatible API, allowing you to use existing OpenAI SDKs and tools with models from multiple providers.

Endpoints Overview

| Endpoint | Method | Description |
|----------|--------|-------------|
| /v1/chat/completions | POST | Create chat completions |
| /v1/completions | POST | Create text completions (legacy) |
| /v1/embeddings | POST | Create embeddings |
| /v1/models | GET | List available models |
| /v1/models/{model} | GET | Get model details |

Chat Completions

Create a chat completion with conversation history.

Endpoint

POST /v1/chat/completions

Request Headers

Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

Request Body

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | Yes | Model ID (e.g., openai/gpt-4o) |
| messages | array | Yes | Array of message objects |
| temperature | number | No | Sampling temperature (0-2). Default: 1 |
| top_p | number | No | Nucleus sampling (0-1). Default: 1 |
| max_tokens | integer | No | Maximum tokens to generate |
| stream | boolean | No | Enable streaming. Default: false |
| stop | string/array | No | Stop sequences |
| presence_penalty | number | No | Presence penalty (-2 to 2). Default: 0 |
| frequency_penalty | number | No | Frequency penalty (-2 to 2). Default: 0 |
| tools | array | No | Available tools/functions |
| tool_choice | string/object | No | Tool selection: none, auto, required, or a specific tool |
| response_format | object | No | Response format (e.g., {"type": "json_object"} for JSON mode) |
| seed | integer | No | Seed for best-effort reproducible sampling |
| user | string | No | End-user identifier for tracking |

Message Object

{
  "role": "user | assistant | system | tool",
  "content": "Message content",
  "name": "optional_name",
  "tool_calls": [],
  "tool_call_id": "for tool responses"
}

name is optional. tool_calls appears only on assistant messages that invoke tools, and tool_call_id is required on tool-role messages, where it must match the id of the call being answered.

Example Request

curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 100
  }'

Response

{
  "id": "chatcmpl-9abc123def456",
  "object": "chat.completion",
  "created": 1704067200,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}

Streaming Example

curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "stream": true
  }'

Streaming Response (Server-Sent Events):

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Why"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" did"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
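If you are not using an SDK, you can consume the stream directly. A minimal Python sketch using the requests library (illustrative only; production code should also handle HTTP errors and reconnects):

import json
import requests

resp = requests.post(
    "https://api.langmart.ai/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
    },
    json={
        "model": "openai/gpt-4o",
        "messages": [{"role": "user", "content": "Tell me a joke"}],
        "stream": True,
    },
    stream=True,  # keep the connection open and iterate as chunks arrive
)

for line in resp.iter_lines():
    if not line:
        continue  # skip the blank lines that separate SSE events
    payload = line.decode("utf-8").removeprefix("data: ")
    if payload == "[DONE]":
        break  # end-of-stream sentinel
    delta = json.loads(payload)["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)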

Tool/Function Calling

curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "City name"
              }
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'

Tool Call Response:

{
  "id": "chatcmpl-abc",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"Paris\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
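To complete the loop, execute the function locally and send its result back as a tool message whose tool_call_id matches the call. A minimal Python sketch; the get_weather implementation here is a hypothetical stand-in for your own code:

import json
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.langmart.ai/v1")

def get_weather(location):
    # Hypothetical implementation; replace with a real weather lookup.
    return {"location": location, "temperature_c": 18, "conditions": "cloudy"}

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}]

messages = [{"role": "user", "content": "What is the weather in Paris?"}]
first = client.chat.completions.create(
    model="openai/gpt-4o", messages=messages, tools=tools, tool_choice="auto"
)

# Assumes the model chose to call the tool (finish_reason == "tool_calls").
call = first.choices[0].message.tool_calls[0]
args = json.loads(call.function.arguments)

# Append the assistant's tool call, then the tool result keyed by tool_call_id.
messages.append(first.choices[0].message)
messages.append({
    "role": "tool",
    "tool_call_id": call.id,
    "content": json.dumps(get_weather(**args)),
})

final = client.chat.completions.create(model="openai/gpt-4o", messages=messages)
print(final.choices[0].message.content)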

JSON Mode

curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": "List 3 fruits as JSON"}
    ],
    "response_format": {"type": "json_object"}
  }'
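The model then returns a single JSON document in message.content, which can be parsed directly. A minimal Python sketch (as with OpenAI's own API, it is safest to mention JSON explicitly in the prompt when using json_object mode):

import json
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.langmart.ai/v1")

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "List 3 fruits as JSON"}],
    response_format={"type": "json_object"},
)

# In JSON mode the content should be valid JSON, so it parses directly.
fruits = json.loads(response.choices[0].message.content)
print(fruits)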

Text Completions (Legacy)

Create a text completion. Note: this is a legacy endpoint; most providers now expose newer models only through chat completions.

Endpoint

POST /v1/completions

Request Body

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | Yes | Model ID |
| prompt | string/array | Yes | Text prompt(s) |
| max_tokens | integer | No | Maximum tokens to generate |
| temperature | number | No | Sampling temperature |
| top_p | number | No | Nucleus sampling |
| stop | string/array | No | Stop sequences |
| echo | boolean | No | Echo the prompt in the response |

Example

curl https://api.langmart.ai/v1/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-3.5-turbo-instruct",
    "prompt": "Once upon a time",
    "max_tokens": 50
  }'
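The equivalent call through the OpenAI SDK uses the completions resource rather than chat.completions:

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.langmart.ai/v1")

response = client.completions.create(
    model="openai/gpt-3.5-turbo-instruct",
    prompt="Once upon a time",
    max_tokens=50,
)
# Legacy completions return plain text in choices[].text, not a message object.
print(response.choices[0].text)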

Embeddings

Generate vector embeddings for text.

Endpoint

POST /v1/embeddings

Request Body

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | Yes | Embedding model ID |
| input | string/array | Yes | Text(s) to embed |
| encoding_format | string | No | float (default) or base64 |
| dimensions | integer | No | Number of output dimensions (supported by some models, e.g., text-embedding-3) |

Example

curl https://api.langmart.ai/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/text-embedding-3-small",
    "input": "The quick brown fox jumps over the lazy dog"
  }'

Response

{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0048, 0.0089, ...]
    }
  ],
  "model": "openai/text-embedding-3-small",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}

Batch Embeddings

curl https://api.langmart.ai/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/text-embedding-3-small",
    "input": [
      "First text to embed",
      "Second text to embed",
      "Third text to embed"
    ]
  }'
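The response contains one embedding per input, in the same order. A common follow-up is comparing embeddings; a minimal cosine-similarity sketch using only the standard library:

import math
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.langmart.ai/v1")

texts = ["First text to embed", "Second text to embed", "Third text to embed"]
result = client.embeddings.create(model="openai/text-embedding-3-small", input=texts)

# result.data preserves input order; each item's index confirms the pairing.
vectors = [item.embedding for item in result.data]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(vectors[0], vectors[1]))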

List Models

Get a list of available models.

Endpoint

GET /v1/models

Example

curl https://api.langmart.ai/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "object": "list",
  "data": [
    {
      "id": "openai/gpt-4o",
      "object": "model",
      "created": 1704067200,
      "owned_by": "openai",
      "permission": [],
      "root": "gpt-4o",
      "parent": null
    },
    {
      "id": "anthropic/claude-3-5-sonnet-20241022",
      "object": "model",
      "created": 1704067200,
      "owned_by": "anthropic",
      "permission": [],
      "root": "claude-3-5-sonnet",
      "parent": null
    }
  ]
}
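Via the SDK, client.models.list() returns the same data. Because LangMart model IDs are prefixed with the provider name, you can filter by prefix:

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.langmart.ai/v1")

# List every model, then keep only Anthropic-hosted ones by ID prefix.
models = client.models.list()
anthropic_models = [m.id for m in models.data if m.id.startswith("anthropic/")]
print(anthropic_models)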

Get Model

Get details for a specific model.

Endpoint

GET /v1/models/{model_id}

Example

curl https://api.langmart.ai/v1/models/openai/gpt-4o \
  -H "Authorization: Bearer YOUR_API_KEY"

Response

{
  "id": "openai/gpt-4o",
  "object": "model",
  "created": 1704067200,
  "owned_by": "openai",
  "permission": [],
  "root": "gpt-4o",
  "parent": null
}
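The SDK equivalent is client.models.retrieve():

from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.langmart.ai/v1")

model = client.models.retrieve("openai/gpt-4o")
print(model.id, model.owned_by)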

Supported Models by Provider

OpenAI

| Model ID | Description | Context |
|----------|-------------|---------|
| openai/gpt-4o | Most capable GPT-4 | 128K |
| openai/gpt-4o-mini | Fast, cost-effective | 128K |
| openai/gpt-4-turbo | GPT-4 Turbo | 128K |
| openai/gpt-3.5-turbo | Fast, affordable | 16K |
| openai/text-embedding-3-small | Small embeddings | - |
| openai/text-embedding-3-large | Large embeddings | - |

Anthropic

| Model ID | Description | Context |
|----------|-------------|---------|
| anthropic/claude-3-5-sonnet-20241022 | Most intelligent | 200K |
| anthropic/claude-3-opus-20240229 | Most powerful | 200K |
| anthropic/claude-3-haiku-20240307 | Fastest | 200K |

Google

| Model ID | Description | Context |
|----------|-------------|---------|
| google/gemini-1.5-pro | Most capable | 1M |
| google/gemini-1.5-flash | Fast | 1M |
| google/gemini-pro | Balanced | 32K |

Groq (Ultra-Fast)

| Model ID | Description | Context |
|----------|-------------|---------|
| groq/llama-3.3-70b-versatile | Llama 3.3 70B | 128K |
| groq/llama-3.1-70b-versatile | Llama 3.1 70B | 128K |
| groq/mixtral-8x7b-32768 | Mixtral MoE | 32K |

Mistral

| Model ID | Description | Context |
|----------|-------------|---------|
| mistral/mistral-large-latest | Most capable | 128K |
| mistral/mistral-medium-latest | Balanced | 32K |
| mistral/mistral-small-latest | Fast | 32K |

SDK Examples

Python

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LANGMART_API_KEY",
    base_url="https://api.langmart.ai/v1"
)

# Chat completion
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

# Embeddings
embeddings = client.embeddings.create(
    model="openai/text-embedding-3-small",
    input="Hello world"
)
print(embeddings.data[0].embedding[:5])

JavaScript/TypeScript

import OpenAI from 'openai';

const client = new OpenAI({
    apiKey: 'YOUR_LANGMART_API_KEY',
    baseURL: 'https://api.langmart.ai/v1'
});

// Chat completion
const response = await client.chat.completions.create({
    model: 'openai/gpt-4o',
    messages: [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: 'Hello!' }
    ]
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
    model: 'openai/gpt-4o',
    messages: [{ role: 'user', content: 'Tell me a story' }],
    stream: true
});
for await (const chunk of stream) {
    process.stdout.write(chunk.choices[0]?.delta?.content || '');
}

cURL

# Basic request
curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

# With streaming
curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'

Error Handling

Common Errors

| Status | Error Type | Description |
|--------|------------|-------------|
| 400 | invalid_request_error | Malformed request |
| 401 | authentication_error | Invalid API key |
| 402 | billing_error | Insufficient credits |
| 404 | model_not_found | Model doesn't exist |
| 429 | rate_limit_error | Too many requests |
| 500 | server_error | Internal error |
| 503 | gateway_unavailable | No available gateway |

Error Response Format

{
  "error": {
    "type": "authentication_error",
    "code": "invalid_api_key",
    "message": "Invalid API key provided",
    "param": null,
    "details": {}
  }
}
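When calling the API over raw HTTP rather than through an SDK, check the status code and read the error object from the body. A minimal Python sketch with requests:

import requests

resp = requests.post(
    "https://api.langmart.ai/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hi"}]},
)

if resp.status_code != 200:
    err = resp.json()["error"]
    # type distinguishes broad classes (see table above); code is more specific.
    print(f"{resp.status_code} {err['type']}/{err['code']}: {err['message']}")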

Retry Logic

import time
from openai import OpenAI, RateLimitError, APIStatusError

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.langmart.ai/v1"
)

def make_request_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="openai/gpt-4o",
                messages=messages
            )
        except RateLimitError:
            # Exponential backoff: 1s, 2s, 4s, ...
            time.sleep(2 ** attempt)
        except APIStatusError as e:
            if e.status_code >= 500:
                # Server-side errors are often transient; back off and retry.
                time.sleep(2 ** attempt)
            else:
                # Client errors (4xx) won't succeed on retry.
                raise
    raise Exception("Max retries exceeded")
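Alternatively, the official OpenAI SDKs already retry transient failures (connection errors, 429s, and 5xx responses) with exponential backoff; you can simply raise the retry count when constructing the client:

from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.langmart.ai/v1",
    max_retries=5,  # the SDK default is 2
)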

Best Practices

1. Use Streaming for Long Responses

Streaming provides a better user experience for longer outputs:

stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True
)

2. Set Appropriate Max Tokens

Prevent runaway costs by limiting output length:

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[...],
    max_tokens=500
)

3. Use System Messages Effectively

Guide model behavior with clear system prompts:

messages = [
    {"role": "system", "content": "You are a concise assistant. Keep responses under 100 words."},
    {"role": "user", "content": "Explain quantum computing"}
]

4. Handle Errors Gracefully

Always implement proper error handling:

import time

import openai
from openai import OpenAI

client = OpenAI(api_key="YOUR_API_KEY", base_url="https://api.langmart.ai/v1")

try:
    response = client.chat.completions.create(...)
except openai.AuthenticationError:
    print("Check your API key")
except openai.RateLimitError:
    print("Rate limited, waiting...")
    time.sleep(60)
except openai.APIError as e:
    print(f"API error: {e}")

Useful Links

| Feature | Direct Link |
|---------|-------------|
| Browse Models | https://langmart.ai/models |
| API Keys | https://langmart.ai/settings |
| Request Logs | https://langmart.ai/requests |
| Usage & Costs | https://langmart.ai/usage |