Inference Sessions

Inference sessions are the core of LangMart Automation, enabling persistent, tool-enabled conversations with AI models on remote gateways.

What is an Inference Session?

An inference session is a container for:

  • A conversation history between you and an AI model
  • Tool execution context and state
  • Configuration settings (model, system prompt, enabled tools)
  • Gateway connection for remote execution

Session Types

Chat Sessions

Direct conversations without tool support:

  • No gateway required
  • Standard chat functionality
  • Lightweight and fast
  • Suitable for simple queries

Remote Chat Sessions

Chat with full tool support:

  • Requires Type 3 gateway
  • Tools available during conversation
  • Secure remote execution
  • Best for interactive development

Automation Sessions

Template-driven headless sessions:

  • Based on pre-configured templates
  • Optimized for specific workflows
  • Can run without user interaction
  • Ideal for batch operations

Creating Sessions

Via Web Interface

  1. Navigate to the Automation dashboard
  2. Select a gateway
  3. Choose a template (optional)
  4. Select a model
  5. Configure tools
  6. Click Start Session

Via API

POST /api/automation/sessions
Content-Type: application/json
Authorization: Bearer <your-api-key>

{
  "gatewayId": "gateway-uuid",
  "sessionType": "remote_chat",
  "modelId": "groq/llama-3.3-70b-versatile",
  "systemPrompt": "You are a helpful coding assistant.",
  "name": "Code Review Session",
  "disabledTools": ["web_search"]
}

Session Parameters

Parameter       Required                      Description
gatewayId       Yes (for remote/automation)   Target gateway UUID
sessionType     No                            chat, remote_chat, or automation
modelId         Yes                           Model identifier to use
connectionId    No                            Specific connection to use
templateId      No                            Template to base session on
systemPrompt    No                            Custom system prompt
name            No                            Display name for session
disabledTools   No                            Array of tool names to disable
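
The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not the official client: the function name `build_create_session_payload` is hypothetical, and only the field names and the gatewayId requirement come from the table.

```python
# Session types listed in the parameter table.
SESSION_TYPES = {"chat", "remote_chat", "automation"}

# Optional fields from the parameter table that are passed through unchanged.
OPTIONAL_FIELDS = {"connectionId", "templateId", "systemPrompt", "name", "disabledTools"}


def build_create_session_payload(model_id, session_type=None, gateway_id=None, **optional):
    """Return the JSON body for POST /api/automation/sessions as a dict."""
    if session_type is not None and session_type not in SESSION_TYPES:
        raise ValueError(f"unknown sessionType: {session_type!r}")
    # gatewayId is required for remote and automation sessions.
    if session_type in {"remote_chat", "automation"} and not gateway_id:
        raise ValueError("gatewayId is required for remote_chat and automation sessions")
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    payload = {"modelId": model_id}
    if session_type is not None:
        payload["sessionType"] = session_type
    if gateway_id is not None:
        payload["gatewayId"] = gateway_id
    # Omit optional fields that were not supplied.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Serialize the returned dict with `json.dumps` and send it as the request body shown under "Via API".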

Session Lifecycle

Status States

Status      Description
active      Session is running and accepting messages
completed   Session has been marked complete
cancelled   Session was cancelled by user
error       Session encountered an error

State Transitions

created -> active -> completed
                  -> cancelled
                  -> error

Reactivation

Completed sessions are reactivated automatically when you send a new message, so no separate reactivation call is needed:

POST /api/automation/sessions/:id/messages
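
The lifecycle above can be modeled as a small transition table, which is useful for client-side checks. This is a sketch under assumptions: it treats cancelled and error as terminal states, which the text does not state explicitly, and the names `TRANSITIONS` and `can_transition` are hypothetical.

```python
# Allowed transitions, taken from the diagram and the reactivation rule.
# Assumption: cancelled and error sessions cannot be resumed.
TRANSITIONS = {
    "created": {"active"},
    "active": {"completed", "cancelled", "error"},
    "completed": {"active"},  # reactivated by sending a new message
    "cancelled": set(),
    "error": set(),
}


def can_transition(current, target):
    """Return True if a session may move from `current` to `target`."""
    return target in TRANSITIONS.get(current, set())
```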

Sending Messages

Basic Message

POST /api/automation/sessions/:id/messages
Content-Type: application/json
Authorization: Bearer <your-api-key>

{
  "message": "Analyze the code in /app/main.py"
}

With Model Override

POST /api/automation/sessions/:id/messages
Content-Type: application/json
Authorization: Bearer <your-api-key>

{
  "message": "Summarize this file",
  "modelId": "anthropic/claude-3-sonnet"
}
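
As an illustration, the message request can be assembled with Python's standard library. This is a sketch only: `BASE_URL` is a placeholder host, `build_message_request` is a hypothetical helper, and the `stream` flag maps to the `stream=true` query parameter described under Streaming Responses. The function builds the request without sending it.

```python
import json
import urllib.request
from urllib.parse import quote

# Placeholder host; replace with the address of your LangMart deployment.
BASE_URL = "https://langmart.example"


def build_message_request(api_key, session_id, message, model_id=None, stream=False):
    """Build (but do not send) a POST to /api/automation/sessions/:id/messages."""
    payload = {"message": message}
    if model_id is not None:
        payload["modelId"] = model_id  # per-message model override
    url = f"{BASE_URL}/api/automation/sessions/{quote(session_id, safe='')}/messages"
    if stream:
        url += "?stream=true"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Pass the returned object to `urllib.request.urlopen` to send it.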

Streaming Responses

Enable real-time response streaming:

POST /api/automation/sessions/:id/messages?stream=true
Content-Type: application/json
Authorization: Bearer <your-api-key>

{
  "message": "Write a long analysis"
}

Streaming response format (Server-Sent Events):

data: {"type":"content","content":"Here is"}
data: {"type":"content","content":" my analysis"}
data: {"type":"done","message":{...},"summary":"..."}
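
A client can reassemble the streamed text by concatenating the content events until the done event arrives. The sketch below assumes each event arrives as a single `data:` line containing one JSON object, as in the sample above; the function names are hypothetical.

```python
import json


def iter_sse_events(lines):
    """Yield the decoded JSON payload of each 'data:' line in an SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        yield json.loads(line[len("data:"):].strip())


def collect_response(lines):
    """Join content chunks; return (text, done_event), or (text, None) if no done event."""
    parts = []
    for event in iter_sse_events(lines):
        if event.get("type") == "content":
            parts.append(event.get("content", ""))
        elif event.get("type") == "done":
            return "".join(parts), event
    return "".join(parts), None
```

`lines` can be any iterable of decoded text lines, for example the lines read from an open HTTP response.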

Managing Sessions

List Sessions

GET /api/automation/sessions?limit=50&offset=0
Authorization: Bearer <your-api-key>

Filter options:

Parameter     Description
status        Filter by status
gatewayId     Filter by gateway
templateId    Filter by template
sessionType   Filter by type
startDate     Sessions after this date
endDate       Sessions before this date
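
The filters combine with limit and offset in the query string. A minimal sketch of building that path follows; `build_list_sessions_path` is a hypothetical helper, and the expected date format for startDate and endDate is not specified here, so check it against your deployment.

```python
from urllib.parse import urlencode

# Filter names from the table above.
FILTER_KEYS = ("status", "gatewayId", "templateId", "sessionType", "startDate", "endDate")


def build_list_sessions_path(limit=50, offset=0, **filters):
    """Return the path and query string for GET /api/automation/sessions."""
    unknown = set(filters) - set(FILTER_KEYS)
    if unknown:
        raise ValueError(f"unknown filters: {sorted(unknown)}")
    params = {"limit": limit, "offset": offset}
    # Leave out filters that were not supplied.
    params.update({k: v for k, v in filters.items() if v is not None})
    return "/api/automation/sessions?" + urlencode(params)
```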

Get Session Details

GET /api/automation/sessions/:id
Authorization: Bearer <your-api-key>

Update Session Name

PATCH /api/automation/sessions/:id
Content-Type: application/json
Authorization: Bearer <your-api-key>

{
  "name": "New Session Name"
}

Complete Session

Mark a session as complete:

POST /api/automation/sessions/:id/complete
Authorization: Bearer <your-api-key>

Cancel Session

Cancel an active session:

DELETE /api/automation/sessions/:id
Authorization: Bearer <your-api-key>

Batch Delete Sessions

Delete multiple sessions at once:

DELETE /api/automation/sessions
Content-Type: application/json
Authorization: Bearer <your-api-key>

{
  "sessionIds": ["session-1", "session-2", "session-3"]
}

Conversation History

Get Conversation

Retrieve the full conversation history:

GET /api/automation/sessions/:id/conversation
Authorization: Bearer <your-api-key>

Response structure:

{
  "success": true,
  "session": {
    "id": "session-uuid",
    "sessionType": "remote_chat",
    "status": "active",
    "startedAt": "2024-01-15T10:30:00Z"
  },
  "conversation": [
    {
      "role": "user",
      "content": "Analyze main.py",
      "timestamp": "2024-01-15T10:30:05Z"
    },
    {
      "role": "assistant",
      "content": "I'll analyze the file...",
      "timestamp": "2024-01-15T10:30:08Z",
      "toolCalls": [...]
    },
    {
      "role": "tool",
      "content": "file contents...",
      "toolName": "read_file",
      "toolCallId": "call_123"
    }
  ],
  "totalMessages": 5
}

Message Structure

Field        Description
role         user, assistant, system, or tool
content      Message content
timestamp    When the message was created
model        Model used (for assistant messages)
toolCalls    Tool invocations (for assistant messages)
toolCallId   ID linking to the tool call (for tool messages)
toolName     Name of the executed tool
usage        Token usage statistics

Tool Execution

How Tools Work

  1. User sends a message
  2. Model decides which tools to use
  3. Gateway executes the tools
  4. Tool results are sent back to model
  5. Model generates final response
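
The five steps above amount to a loop that alternates between the model and the gateway until the model stops requesting tools. The sketch below is a simplified local simulation, not LangMart's implementation: `model` and `execute_tool` are hypothetical callables standing in for the AI model and the gateway, and the message shapes follow the tool call and tool result formats documented in this section.

```python
import json


def run_tool_loop(user_message, model, execute_tool, max_rounds=5):
    """Drive the message -> tool calls -> tool results -> final response cycle."""
    messages = [{"role": "user", "content": user_message}]      # step 1
    for _ in range(max_rounds):
        reply = model(messages)                                  # steps 2 and 5
        messages.append(reply)
        calls = reply.get("toolCalls") or []
        if not calls:
            return messages  # no tool requests, so this is the final response
        for call in calls:
            name = call["function"]["name"]
            # Arguments arrive as a JSON-encoded string.
            args = json.loads(call["function"]["arguments"])
            result = execute_tool(name, args)                    # step 3
            messages.append({                                    # step 4
                "role": "tool",
                "toolCallId": call["id"],
                "toolName": name,
                "content": result,
            })
    raise RuntimeError("tool loop did not finish within max_rounds")
```

The `max_rounds` cap is a defensive choice in this sketch so that a model that keeps requesting tools cannot loop forever.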

Tool Call Format

{
  "role": "assistant",
  "content": "",
  "toolCalls": [
    {
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "read_file",
        "arguments": "{\"path\":\"/app/main.py\"}"
      }
    }
  ]
}

Tool Result Format

{
  "role": "tool",
  "toolCallId": "call_abc123",
  "toolName": "read_file",
  "content": "def main():\n    print('Hello')"
}
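
Because toolCallId links each result to its call, a client can pair them when reading a conversation. The following sketch assumes the two formats shown above; `pair_tool_results` is a hypothetical helper. Note that `arguments` is a JSON-encoded string and must be decoded separately.

```python
import json


def pair_tool_results(conversation):
    """Map each tool call id to (tool name, decoded arguments, result content)."""
    # Index tool results by the id of the call they answer.
    results = {
        m["toolCallId"]: m["content"]
        for m in conversation
        if m.get("role") == "tool"
    }
    paired = {}
    for message in conversation:
        for call in message.get("toolCalls") or []:
            fn = call["function"]
            # The result is None if the conversation has no matching tool message.
            paired[call["id"]] = (fn["name"], json.loads(fn["arguments"]),
                                  results.get(call["id"]))
    return paired
```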

Session Configuration

System Prompts

Customize AI behavior with system prompts:

{
  "systemPrompt": "You are a senior software engineer. Analyze code thoroughly and provide detailed explanations. Always consider security implications."
}

Environment Info

Include OS and environment information in the system prompt:

{
  "includeEnvironmentInfo": true
}

This adds details like:

  • Operating system
  • Current directory
  • Available tools

Disabled Tools

Restrict which tools are available:

{
  "disabledTools": [
    "write_file",
    "delete_file",
    "execute_command"
  ]
}
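
The effect of disabledTools can be pictured as filtering a gateway's tool list. This sketch is illustrative only: the actual filtering happens on the server, and the list of available tools used here is an assumption.

```python
def enabled_tools(available, disabled):
    """Return the tools from `available` that are not listed in `disabled`, in order."""
    blocked = set(disabled)
    return [tool for tool in available if tool not in blocked]


# Hypothetical tool list for one gateway; a read-only session disables the rest.
available = ["read_file", "write_file", "delete_file", "execute_command", "web_search"]
read_only = enabled_tools(available, ["write_file", "delete_file", "execute_command"])
```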

Web Interface Features

Session List

The Sessions tab shows:

  • Session ID and gateway
  • Status indicator
  • Start time
  • Request count
  • Action buttons (Continue, Cancel)

Filtering Sessions

Search sessions by:

  • Session ID
  • Gateway ID
  • Status

Continuing Sessions

Click "Continue" to:

  • Open session in Chat interface
  • Resume the conversation
  • Access full conversation history

Best Practices

Naming Sessions

  • Use descriptive names
  • Include project or task context
  • Add date for recurring tasks
  • Example: "API Refactor - Auth Module - Jan 2024"

Session Organization

  • Complete sessions when done
  • Delete old or failed sessions
  • Use templates for recurring tasks
  • Group related sessions by naming convention

Performance

  • Start with minimal enabled tools
  • Use specific system prompts
  • Choose appropriate models for tasks
  • Monitor token usage

Error Handling

  • Review error messages in session details
  • Check gateway health status
  • Verify model availability
  • Consider retrying with a different model
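
Retrying with a different model can be automated with a fallback list. In this sketch, `send` is a hypothetical callable that posts a message with a given modelId override and raises an exception on failure; the order of `model_ids` sets the preference.

```python
def send_with_fallback(send, message, model_ids):
    """Try each model in order; return (model_id, response) for the first success."""
    errors = []
    for model_id in model_ids:
        try:
            return model_id, send(message, model_id)
        except Exception as exc:  # broad on purpose: any failure triggers the next model
            errors.append((model_id, exc))
    raise RuntimeError(f"all models failed: {errors}")
```

In practice, narrow the caught exception to the errors your HTTP client raises, so that programming mistakes are not silently retried.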

Troubleshooting

Session Won't Start

  1. Verify gateway is healthy
  2. Check API key permissions
  3. Confirm model is available
  4. Review gateway connection

Tool Execution Fails

  1. Check tool is enabled
  2. Verify gateway supports the tool
  3. Review tool arguments
  4. Check gateway logs

Slow Responses

  1. Consider model latency
  2. Check token count (long contexts are slower)
  3. Verify network connectivity
  4. Try a faster model

Session Lost State

  1. Check session status
  2. Review conversation history
  3. Verify gateway didn't restart
  4. Consider creating a new session

Next Steps