OpenAI-Compatible API
Base URL: `https://api.langmart.ai/v1`
LangMart provides a fully OpenAI-compatible API, allowing you to use existing OpenAI SDKs and tools with models from multiple providers.
Endpoints Overview
| Endpoint | Method | Description |
|---|---|---|
| `/v1/chat/completions` | POST | Create chat completions |
| `/v1/completions` | POST | Create text completions (legacy) |
| `/v1/embeddings` | POST | Create embeddings |
| `/v1/models` | GET | List available models |
| `/v1/models/{model}` | GET | Get model details |
Chat Completions
Create a chat completion with conversation history.
Endpoint
```
POST /v1/chat/completions
```
Request Headers
```
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
```
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID (e.g., `openai/gpt-4o`) |
| `messages` | array | Yes | Array of message objects |
| `temperature` | number | No | Sampling temperature (0-2). Default: 1 |
| `top_p` | number | No | Nucleus sampling (0-1). Default: 1 |
| `max_tokens` | integer | No | Maximum tokens to generate |
| `stream` | boolean | No | Enable streaming. Default: false |
| `stop` | string/array | No | Stop sequences |
| `presence_penalty` | number | No | Presence penalty (-2 to 2). Default: 0 |
| `frequency_penalty` | number | No | Frequency penalty (-2 to 2). Default: 0 |
| `tools` | array | No | Available tools/functions |
| `tool_choice` | string/object | No | Tool selection mode |
| `response_format` | object | No | Response format (e.g., JSON mode) |
| `seed` | integer | No | Random seed for reproducibility |
| `user` | string | No | User identifier for tracking |
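As a quick illustration of the less common parameters, here is a sketch of a request that pins `seed` for best-effort reproducibility and halts generation at a stop sequence (the model choice and values are arbitrary):
```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_LANGMART_API_KEY", base_url="https://api.langmart.ai/v1")

# Pin `seed` (with temperature 0) for best-effort reproducibility,
# and cut generation off when the stop sequence appears.
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Count upward from 1, comma-separated."}],
    seed=42,
    temperature=0,
    stop=["4"],
)
print(response.choices[0].message.content)
```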
Message Object
```json
{
  "role": "user | assistant | system | tool",
  "content": "Message content",
  "name": "optional_name",
  "tool_calls": [],
  "tool_call_id": "for tool responses"
}
```
Example Request
```bash
curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "What is the capital of France?"}
    ],
    "temperature": 0.7,
    "max_tokens": 100
  }'
```
Response
```json
{
  "id": "chatcmpl-9abc123def456",
  "object": "chat.completion",
  "created": 1704067200,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}
```
Streaming Example
```bash
curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Tell me a joke"}],
    "stream": true
  }'
```
Streaming Response (Server-Sent Events):
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Why"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" did"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]Tool/Function Calling
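If you are consuming the stream without an SDK, you parse these events yourself: strip the `data: ` prefix, stop at the `[DONE]` sentinel, and decode each remaining line as JSON. A minimal sketch using the `requests` library (it assumes your key is in a `LANGMART_API_KEY` environment variable):
```python
import json
import os

import requests

# Stream a chat completion and print tokens as they arrive.
resp = requests.post(
    "https://api.langmart.ai/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['LANGMART_API_KEY']}"},
    json={
        "model": "openai/gpt-4o",
        "messages": [{"role": "user", "content": "Tell me a joke"}],
        "stream": True,
    },
    stream=True,  # keep the connection open and iterate over SSE lines
)
resp.raise_for_status()

for line in resp.iter_lines():
    if not line:
        continue  # SSE events are separated by blank lines
    payload = line.decode("utf-8").removeprefix("data: ")
    if payload == "[DONE]":
        break  # sentinel marking the end of the stream
    delta = json.loads(payload)["choices"][0]["delta"]
    print(delta.get("content", ""), end="", flush=True)
```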
Tool/Function Calling
```bash
curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "tools": [
      {
        "type": "function",
        "function": {
          "name": "get_weather",
          "description": "Get current weather for a location",
          "parameters": {
            "type": "object",
            "properties": {
              "location": {
                "type": "string",
                "description": "City name"
              }
            },
            "required": ["location"]
          }
        }
      }
    ],
    "tool_choice": "auto"
  }'
```
Tool Call Response:
```json
{
  "id": "chatcmpl-abc",
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"Paris\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}
```
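To complete the loop, execute the requested function yourself and send its result back in a `tool` role message. A sketch of the round trip with the OpenAI Python SDK; the `get_weather` stub here is illustrative, not a real weather lookup:
```python
import json
from openai import OpenAI

client = OpenAI(api_key="YOUR_LANGMART_API_KEY", base_url="https://api.langmart.ai/v1")

def get_weather(location: str) -> str:
    # Stub standing in for a real weather service call.
    return json.dumps({"location": location, "temp_c": 18, "conditions": "cloudy"})

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string", "description": "City name"}},
            "required": ["location"],
        },
    },
}]

messages = [{"role": "user", "content": "What is the weather in Paris?"}]
first = client.chat.completions.create(model="openai/gpt-4o", messages=messages, tools=tools)
msg = first.choices[0].message

if msg.tool_calls:
    messages.append(msg)  # keep the assistant's tool call in the history
    for call in msg.tool_calls:
        args = json.loads(call.function.arguments)
        messages.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": get_weather(**args),
        })
    # Second round trip: the model turns the tool output into a final answer.
    final = client.chat.completions.create(model="openai/gpt-4o", messages=messages, tools=tools)
    print(final.choices[0].message.content)
```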
JSON Mode
```bash
curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {"role": "user", "content": "List 3 fruits as JSON"}
    ],
    "response_format": {"type": "json_object"}
  }'
```
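In JSON mode, the assistant's `content` comes back as a JSON string you can parse directly. Note that upstream providers typically require the word "JSON" to appear somewhere in the prompt when this mode is enabled. A short sketch:
```python
import json
from openai import OpenAI

client = OpenAI(api_key="YOUR_LANGMART_API_KEY", base_url="https://api.langmart.ai/v1")

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "List 3 fruits as JSON"}],
    response_format={"type": "json_object"},
)

# The content is a JSON string; parse it into a Python object.
fruits = json.loads(response.choices[0].message.content)
print(fruits)
```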
Text Completions (Legacy)
Create a text completion. Note: Most providers now prefer chat completions.
Endpoint
```
POST /v1/completions
```
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Model ID |
| `prompt` | string/array | Yes | Text prompt(s) |
| `max_tokens` | integer | No | Maximum tokens |
| `temperature` | number | No | Sampling temperature |
| `top_p` | number | No | Nucleus sampling |
| `stop` | string/array | No | Stop sequences |
| `echo` | boolean | No | Echo prompt in response |
Example
```bash
curl https://api.langmart.ai/v1/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-3.5-turbo-instruct",
    "prompt": "Once upon a time",
    "max_tokens": 50
  }'
```
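The legacy endpoint is also reachable through the SDK's `completions` resource; note that the response exposes `choices[0].text` rather than a message object. A brief sketch:
```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_LANGMART_API_KEY", base_url="https://api.langmart.ai/v1")

# Legacy text completion: a bare prompt instead of a message list.
completion = client.completions.create(
    model="openai/gpt-3.5-turbo-instruct",
    prompt="Once upon a time",
    max_tokens=50,
)
print(completion.choices[0].text)
```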
Embeddings
Generate vector embeddings for text.
Endpoint
```
POST /v1/embeddings
```
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Embedding model ID |
| `input` | string/array | Yes | Text(s) to embed |
| `encoding_format` | string | No | `float` or `base64` |
| `dimensions` | integer | No | Output dimensions (some models) |
Example
```bash
curl https://api.langmart.ai/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/text-embedding-3-small",
    "input": "The quick brown fox jumps over the lazy dog"
  }'
```
Response
```json
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [0.0023, -0.0048, 0.0089, ...]
    }
  ],
  "model": "openai/text-embedding-3-small",
  "usage": {
    "prompt_tokens": 9,
    "total_tokens": 9
  }
}
```
Batch Embeddings
```bash
curl https://api.langmart.ai/v1/embeddings \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/text-embedding-3-small",
    "input": [
      "First text to embed",
      "Second text to embed",
      "Third text to embed"
    ]
  }'
```
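A common next step is comparing embeddings. A small self-contained sketch computing cosine similarity between two embedded texts (pure Python, no extra dependencies):
```python
import math
from openai import OpenAI

client = OpenAI(api_key="YOUR_LANGMART_API_KEY", base_url="https://api.langmart.ai/v1")

resp = client.embeddings.create(
    model="openai/text-embedding-3-small",
    input=["The quick brown fox", "A fast auburn fox"],
)
a, b = (item.embedding for item in resp.data)

def cosine_similarity(u: list[float], v: list[float]) -> float:
    # dot(u, v) / (|u| * |v|)
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

print(f"similarity: {cosine_similarity(a, b):.4f}")
```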
List Models
Get a list of available models.
Endpoint
```
GET /v1/models
```
Example
```bash
curl https://api.langmart.ai/v1/models \
  -H "Authorization: Bearer YOUR_API_KEY"
```
Response
```json
{
  "object": "list",
  "data": [
    {
      "id": "openai/gpt-4o",
      "object": "model",
      "created": 1704067200,
      "owned_by": "openai",
      "permission": [],
      "root": "gpt-4o",
      "parent": null
    },
    {
      "id": "anthropic/claude-3-5-sonnet-20241022",
      "object": "model",
      "created": 1704067200,
      "owned_by": "anthropic",
      "permission": [],
      "root": "claude-3-5-sonnet",
      "parent": null
    }
  ]
}
```
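Through the SDK, the same listing is available as `client.models.list()`. For example, to show only Anthropic models (the prefix filter here is just illustrative):
```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_LANGMART_API_KEY", base_url="https://api.langmart.ai/v1")

# List all models, then filter client-side by provider prefix.
for model in client.models.list():
    if model.id.startswith("anthropic/"):
        print(model.id)
```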
Get Model
Get details for a specific model.
Endpoint
```
GET /v1/models/{model_id}
```
Example
```bash
curl https://api.langmart.ai/v1/models/openai/gpt-4o \
  -H "Authorization: Bearer YOUR_API_KEY"
```
Response
```json
{
  "id": "openai/gpt-4o",
  "object": "model",
  "created": 1704067200,
  "owned_by": "openai",
  "permission": [],
  "root": "gpt-4o",
  "parent": null
}
```
Supported Models by Provider
OpenAI
| Model ID | Description | Context |
|---|---|---|
| `openai/gpt-4o` | Most capable GPT-4 | 128K |
| `openai/gpt-4o-mini` | Fast, cost-effective | 128K |
| `openai/gpt-4-turbo` | GPT-4 Turbo | 128K |
| `openai/gpt-3.5-turbo` | Fast, affordable | 16K |
| `openai/text-embedding-3-small` | Small embeddings | - |
| `openai/text-embedding-3-large` | Large embeddings | - |
Anthropic
| Model ID | Description | Context |
|---|---|---|
| `anthropic/claude-3-5-sonnet-20241022` | Most intelligent | 200K |
| `anthropic/claude-3-opus-20240229` | Most powerful | 200K |
| `anthropic/claude-3-haiku-20240307` | Fastest | 200K |
Google
| Model ID | Description | Context |
|---|---|---|
| `google/gemini-1.5-pro` | Most capable | 1M |
| `google/gemini-1.5-flash` | Fast | 1M |
| `google/gemini-pro` | Balanced | 32K |
Groq (Ultra-Fast)
| Model ID | Description | Context |
|---|---|---|
| `groq/llama-3.3-70b-versatile` | Llama 3.3 70B | 128K |
| `groq/llama-3.1-70b-versatile` | Llama 3.1 70B | 128K |
| `groq/mixtral-8x7b-32768` | Mixtral MoE | 32K |
Mistral
| Model ID | Description | Context |
|---|---|---|
| `mistral/mistral-large-latest` | Most capable | 128K |
| `mistral/mistral-medium-latest` | Balanced | 32K |
| `mistral/mistral-small-latest` | Fast | 32K |
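Because every provider sits behind the same API surface, switching models is a one-string change. For instance, a sketch fanning the same request across three providers:
```python
from openai import OpenAI

client = OpenAI(api_key="YOUR_LANGMART_API_KEY", base_url="https://api.langmart.ai/v1")

# The same call works for any provider; only the model ID changes.
for model in [
    "openai/gpt-4o-mini",
    "anthropic/claude-3-haiku-20240307",
    "groq/llama-3.3-70b-versatile",
]:
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Say hello in five words."}],
        max_tokens=30,
    )
    print(model, "->", reply.choices[0].message.content)
```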
SDK Examples
Python
```python
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_LANGMART_API_KEY",
    base_url="https://api.langmart.ai/v1"
)

# Chat completion
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)
print(response.choices[0].message.content)

# Streaming
stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

# Embeddings
embeddings = client.embeddings.create(
    model="openai/text-embedding-3-small",
    input="Hello world"
)
print(embeddings.data[0].embedding[:5])
```
JavaScript/TypeScript
```typescript
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_LANGMART_API_KEY',
  baseURL: 'https://api.langmart.ai/v1'
});

// Chat completion
const response = await client.chat.completions.create({
  model: 'openai/gpt-4o',
  messages: [
    { role: 'system', content: 'You are a helpful assistant.' },
    { role: 'user', content: 'Hello!' }
  ]
});
console.log(response.choices[0].message.content);

// Streaming
const stream = await client.chat.completions.create({
  model: 'openai/gpt-4o',
  messages: [{ role: 'user', content: 'Tell me a story' }],
  stream: true
});
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content || '');
}
```
cURL
```bash
# Basic request
curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

# With streaming (-N disables output buffering)
curl https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
```
Error Handling
Common Errors
| Status | Error Type | Description |
|---|---|---|
| 400 | `invalid_request_error` | Malformed request |
| 401 | `authentication_error` | Invalid API key |
| 402 | `billing_error` | Insufficient credits |
| 404 | `model_not_found` | Model doesn't exist |
| 429 | `rate_limit_error` | Too many requests |
| 500 | `server_error` | Internal error |
| 503 | `gateway_unavailable` | No available gateway |
Error Response Format
```json
{
  "error": {
    "type": "authentication_error",
    "code": "invalid_api_key",
    "message": "Invalid API key provided",
    "param": null,
    "details": {}
  }
}
```
Retry Logic
```python
import time

from openai import OpenAI, RateLimitError, APIStatusError

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://api.langmart.ai/v1"
)

def make_request_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="openai/gpt-4o",
                messages=messages
            )
        except RateLimitError:
            # Exponential backoff: 1s, 2s, 4s, ...
            time.sleep(2 ** attempt)
        except APIStatusError as e:
            # APIStatusError carries the HTTP status; retry only on 5xx.
            if e.status_code >= 500:
                time.sleep(2 ** attempt)
            else:
                raise
    raise Exception("Max retries exceeded")
```
Best Practices
1. Use Streaming for Long Responses
Streaming provides better user experience for longer outputs:
```python
stream = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Write a poem"}],
    stream=True
)
```
2. Set Appropriate Max Tokens
Prevent runaway costs by limiting output length:
```python
response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[...],
    max_tokens=500
)
```
3. Use System Messages Effectively
Guide model behavior with clear system prompts:
```python
messages = [
    {"role": "system", "content": "You are a concise assistant. Keep responses under 100 words."},
    {"role": "user", "content": "Explain quantum computing"}
]
```
4. Handle Errors Gracefully
Always implement proper error handling:
```python
import time

import openai

try:
    response = client.chat.completions.create(...)
except openai.AuthenticationError:
    print("Check your API key")
except openai.RateLimitError:
    print("Rate limited, waiting...")
    time.sleep(60)
except openai.APIError as e:
    print(f"API error: {e}")
```
Platform Links
| Feature | Direct Link |
|---|---|
| Browse Models | https://langmart.ai/models |
| API Keys | https://langmart.ai/settings |
| Request Logs | https://langmart.ai/requests |
| Usage & Costs | https://langmart.ai/usage |
Related Documentation
- Authentication - API key management
- Models - Full model catalog
- Request Logs - View API request history
- Errors - Error code reference