Collections: Auto Free

Type: Collection
Context: N/A
Input /1M: Free
Output /1M: Free
Max Output: N/A

Overview

| Property | Value |
|---|---|
| Model ID | collection/auto-free |
| Display Name | Auto Free Collection |
| Type | System Collection |
| Access Method | collection/auto-free, auto-free, or auto(free) |
| Scope | System-level |
| Routing Strategy | Random Selection |

Description

The Auto Free Collection is a dynamic system-level collection that automatically routes inference requests to one of the predefined free models available in the system. This collection provides cost-free access to language models for testing, development, and educational purposes.

Key Characteristics

  • Zero-Cost Access: Uses only the free-tier models listed in the routing.auto_free_models system property
  • Automatic Selection: Randomly selects from available free models on each request
  • Health-Aware: Automatically skips rate-limited or unavailable models
  • System-Managed: Maintained by platform administrators, not user-configurable
  • Alias Support: Can be accessed via multiple identifiers: collection/auto-free, auto-free, or auto(free)

Specifications

| Aspect | Details |
|---|---|
| Collection Type | System-level (managed by platform admins) |
| Access Level | Public (available to all users) |
| Scope | Organization and individual access |
| Routing Strategy | Random selection with tool-use preference |
| Model Selection | Dynamically configured from routing.auto_free_models property |
| Rate Limit Handling | Automatically filters rate-limited models |
| Tool Support | Prefers models with tool support when available |
| Failover | Falls back to any available model if no tool-supporting models exist |

Use Cases

1. Cost-Free Testing and Development

  • Test API integration without consuming credits
  • Verify request/response formats before production deployment
  • Educational prototyping

2. Production Cost Optimization

  • Route low-priority or non-critical requests to free models
  • Implement tiered service levels based on user plan
  • Reduce infrastructure costs for high-volume deployments

3. Educational and Research

  • Student projects with limited budgets
  • Research prototyping before production rollout
  • Demo applications for showcasing capabilities

4. Failover and Redundancy

  • Use as fallback when premium models are unavailable
  • Distribute load across multiple free model providers
  • Graceful degradation during service disruptions

Configuration

System Property

Free models are configured via the routing.auto_free_models system property:

SELECT property_value
FROM system_properties
WHERE property_name = 'routing.auto_free_models';

Expected format: JSON array of model IDs

[
  "groq/llama-3.3-70b-versatile",
  "openai/gpt-3.5-turbo",
  "anthropic/claude-3-haiku"
]
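Before a value like this is written back to the property, it can help to check that it really is a JSON array of model IDs. The following is a minimal sketch of such a check. The function name parse_auto_free_models and the rule that every ID contains a "/" are assumptions drawn from the example above, not part of the platform.

```python
import json


def parse_auto_free_models(property_value: str) -> list:
    """Parse a routing.auto_free_models value into a list of model IDs.

    Hypothetical helper: the platform's own validation is not documented here.
    """
    try:
        parsed = json.loads(property_value)
    except json.JSONDecodeError as exc:
        raise ValueError(f"property value is not valid JSON: {exc}") from exc

    if not isinstance(parsed, list):
        raise ValueError("expected a JSON array of model IDs")

    model_ids = []
    for item in parsed:
        # Assumption: IDs follow the provider/model shape shown in the example.
        if not isinstance(item, str) or "/" not in item:
            raise ValueError(f"invalid model ID: {item!r}")
        if item not in model_ids:  # drop duplicates, keep first-seen order
            model_ids.append(item)
    return model_ids
```

Calling the function with the example array returns the three IDs in order. An object or a bare string raises ValueError.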

Activation as System Collection

Auto-free is automatically marked as a system collection. The flag is set with a statement equivalent to:

UPDATE model_collections
SET is_system_collection = true
WHERE collection_name = 'auto-free' AND is_active = true;

Model Selection Process

Available Model Detection

  1. Query all configured free models from routing.auto_free_models
  2. Filter by:
    • Active status: is_active = true
    • Availability: is_available = true
    • System reserved: is_system_reserved IS NOT TRUE
    • Rate limits: rate_limit_until IS NULL OR rate_limit_until < NOW()
  3. Verify user access:
    • User-owned models
    • Organization models (scope matches user's org)
    • Public models
  4. Shuffle results for randomness
  5. Prefer models with tool support if available
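The filter conditions in step 2 can be sketched in a few lines of Python. This is an illustration only: the ModelRow class and the function names are hypothetical, and the user-access check from step 3 is left out because its data model is not described here. The field names mirror the column names listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class ModelRow:
    """Hypothetical in-memory view of one configured free model."""
    model_id: str
    is_active: bool = True
    is_available: bool = True
    is_system_reserved: Optional[bool] = None    # None models SQL NULL
    rate_limit_until: Optional[datetime] = None  # None models SQL NULL


def is_eligible(row: ModelRow, now: datetime) -> bool:
    """Apply the step 2 filters to a single row."""
    if not row.is_active or not row.is_available:
        return False
    # is_system_reserved IS NOT TRUE: both NULL and false pass.
    if row.is_system_reserved is True:
        return False
    # rate_limit_until IS NULL OR rate_limit_until < NOW()
    return row.rate_limit_until is None or row.rate_limit_until < now


def eligible_models(rows: List[ModelRow], now: datetime) -> List[ModelRow]:
    """Return the rows that pass every filter, in their original order."""
    return [row for row in rows if is_eligible(row, now)]
```

Passing the current time in as a parameter keeps the rate-limit comparison deterministic, which makes the filter easy to test.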

Selection with Tool Requirements

  • If request requires tools: Select from models with supports_tool_use = true
  • Fallback: Use any available model if none support tools
  • Priority order: Tool support > Trust level > Availability
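The tool-aware selection and its fallback can be sketched as below. The Candidate class and pick_model function are hypothetical names. The sketch assumes the tool preference applies only when the request needs tools, and it omits the trust-level tiebreak because the source does not say how trust level is represented.

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate that already passed the availability filters."""
    model_id: str
    supports_tool_use: bool


def pick_model(
    candidates: Sequence[Candidate],
    requires_tools: bool,
    rng: Optional[random.Random] = None,
) -> Candidate:
    """Pick a random candidate, preferring tool-capable ones when tools are needed."""
    if not candidates:
        raise LookupError("no free models currently available")

    rng = rng or random.Random()
    pool = list(candidates)
    rng.shuffle(pool)  # shuffle first so every choice below is random

    if requires_tools:
        tool_capable = [c for c in pool if c.supports_tool_use]
        if tool_capable:
            return tool_capable[0]
        # Fallback: no candidate supports tools, so any available model is used.
    return pool[0]
```

Accepting an optional random.Random instance lets callers seed the selection in tests without touching the global random state.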

Usage

Basic Request

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "collection/auto-free",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
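The same request can be built from Python using only the standard library. This is a sketch under the assumption that the endpoint accepts exactly the payload shown in the curl example. The helper names build_chat_request and send are illustrative.

```python
import json
import urllib.request

API_URL = "https://api.langmart.ai/v1/chat/completions"


def build_chat_request(
    api_key: str, prompt: str, model: str = "collection/auto-free"
) -> urllib.request.Request:
    """Build the POST request from the curl example without sending it."""
    body = json.dumps(
        {"model": model, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as send(build_chat_request("sk-your-api-key", "Hello")) performs the request. Keeping construction separate from sending makes the payload easy to inspect first.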

Using Aliases

All three identifiers route to the same collection:

# Using collection/ prefix
"model": "collection/auto-free"

# Using system collection name (no prefix)
"model": "auto-free"

# Using alternate syntax
"model": "auto(free)"

Available Models

The specific models in this collection depend on the current routing.auto_free_models system property configuration. Common free models typically include:

  • Groq LLaMA variants (llama-3.3-70b, llama-3.1-70b)
  • OpenAI GPT-3.5-turbo
  • Anthropic Claude 3 Haiku
  • Google Gemini Flash
  • Mistral Nemo

Response Format

Responses include collection routing metadata:

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1234567890,
  "model": "groq/llama-3.3-70b-versatile",
  "choices": [...],
  "usage": {...},
  "_collection_routed": true,
  "_collection_name": "auto-free"
}
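Because the model field reports the model that actually served the request, a client may want to record it together with the routing metadata. The sketch below reads only the fields shown in the example response. The function name routing_info is hypothetical.

```python
def routing_info(response: dict) -> dict:
    """Extract which model served a request and whether a collection routed it."""
    return {
        # The concrete model chosen by the router, not the requested alias.
        "served_by": response.get("model"),
        # Treated as False when the key is absent (assumed for direct model calls).
        "collection_routed": bool(response.get("_collection_routed", False)),
        "collection_name": response.get("_collection_name"),
    }
```

Logging served_by per request is one way to see how random selection distributes traffic across the free models.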

Error Handling

No Free Models Configured

{
  "error": {
    "code": "collection_no_models",
    "message": "No free models configured in routing.auto_free_models"
  }
}

All Models Rate-Limited

{
  "error": {
    "code": "collection_no_models",
    "message": "No free models currently available. All configured models may be rate-limited or inaccessible."
  }
}

User Access Denied

{
  "error": {
    "code": "access_denied",
    "message": "Your organization does not have access to any models in this collection"
  }
}
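A client can map these payloads to a small set of actions. The first two errors share the code collection_no_models, so this sketch separates them by message text. That is brittle and is an assumption of the example, not documented behavior. The action labels and the function name classify_error are hypothetical.

```python
def classify_error(payload: dict) -> str:
    """Map an error payload to 'retry_later', 'misconfigured', 'forbidden' or 'unknown'."""
    error = payload.get("error") or {}
    code = error.get("code")
    message = error.get("message", "")

    if code == "collection_no_models":
        # Both "no models" errors use this code, so the message is the only
        # signal available. Rate limits are transient and worth retrying.
        if "rate-limited" in message:
            return "retry_later"
        return "misconfigured"
    if code == "access_denied":
        return "forbidden"
    return "unknown"
```

A caller might back off and retry on retry_later, and surface the other outcomes to an administrator.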

Performance Characteristics

| Metric | Value |
|---|---|
| Selection Latency | <5ms (cached models) |
| Cache TTL | 5 minutes |
| Fallback Strategy | Any available model |
| Request Throughput | Limited by selected model |
| Concurrent Requests | Unlimited (model-dependent) |

Database Schema

Model Collection Entry

SELECT *
FROM model_collections
WHERE collection_name = 'auto-free'
  AND is_active = true;

Collection Members

Currently, auto-free members are determined dynamically from routing.auto_free_models rather than stored in the model_collection_members table. This allows the member list to be changed through configuration alone, without database schema modifications.

System Integration

Model Resolution Flow

Request: model='auto(free)' or 'auto-free' or 'collection/auto-free'
    ↓
TieredModelResolutionService (detects auto-free pattern)
    ↓
AutoFreeRouter (selects random free model)
    ↓
Selected Model (e.g., groq/llama-3.3-70b-versatile)
    ↓
Cloud Gateway (routes to provider)
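The detection step at the top of this flow amounts to recognizing any of the three identifiers. The Python sketch below illustrates the idea. It is not the TieredModelResolutionService code, and the constant and function names are hypothetical. Matching is assumed to be exact and case-sensitive.

```python
from typing import Optional

CANONICAL_ID = "collection/auto-free"

# The three identifiers this document lists for the collection.
AUTO_FREE_ALIASES = frozenset({"collection/auto-free", "auto-free", "auto(free)"})


def resolve_auto_free(model: str) -> Optional[str]:
    """Return the canonical collection ID for an auto-free alias, else None."""
    if model.strip() in AUTO_FREE_ALIASES:
        return CANONICAL_ID
    return None  # not an auto-free request; resolve as a regular model ID
```

Returning None for everything else lets the caller fall through to ordinary model resolution.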

Configuration Management

Auto-free is configured at startup from system properties:

  • Not updated on every request (loaded at initialization)
  • Cache refreshed every 5 minutes
  • Clear cache manually: autoFreeRouter.clearCache()
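The refresh behavior described above matches a simple time-to-live cache. The sketch below shows that pattern in Python with a 300-second TTL. It is not the AutoFreeRouter code, and the TtlCache class and its loader callback are hypothetical.

```python
import time
from typing import Callable, List


class TtlCache:
    """Cache a loaded model list and reload it once the TTL has elapsed."""

    def __init__(
        self,
        loader: Callable[[], List[str]],
        ttl_seconds: float = 300.0,  # 5 minutes, as stated above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: List[str] = []
        self._loaded = False
        self._loaded_at = 0.0

    def get(self) -> List[str]:
        """Return the cached list, reloading it if missing or expired."""
        now = self._clock()
        if not self._loaded or now - self._loaded_at >= self._ttl:
            self._value = self._loader()
            self._loaded_at = now
            self._loaded = True
        return self._value

    def clear(self) -> None:
        """Force the next get() to reload, like a manual cache clear."""
        self._loaded = False
```

A monotonic clock keeps expiry stable if the system wall clock is adjusted. One consequence of this design is that a change to the property can take up to one TTL to become visible unless the cache is cleared.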

Billing and Credits

  • Cost: Free (0 credits)
  • Included: Unlimited requests (provider rate limits apply)
  • Tracking: Logged to request_logs for analytics
  • Quotas: Per-user and per-organization quotas are still enforced

Restrictions and Limitations

  1. Rate Limits: Subject to provider rate limits on free tiers
  2. Model Selection: Random selection (no user preference control)
  3. Tool Support: Varies by selected model
  4. Context Window: Depends on selected model
  5. Features: Limited to free tier capabilities

Related Resources

  • System Property: routing.auto_free_models
  • AutoFreeRouter Service: /gateway-type1/lib/services/auto-free-router.ts
  • TieredResolution: /gateway-type1/lib/services/model-resolution-tiered.ts
  • Model Collections Migration: /datastore/migrations/20251220_add_system_collection_flag.sql

Admin Commands

View Free Models Configuration

docker exec langmart-postgres psql -U langmart_admin -d langmart -c \
  "SELECT property_value FROM system_properties \
   WHERE property_name = 'routing.auto_free_models';"

Update Free Models List

docker exec langmart-postgres psql -U langmart_admin -d langmart -c \
  "UPDATE system_properties \
   SET property_value = '[\"groq/llama-3.3-70b-versatile\", \"openai/gpt-3.5-turbo\"]' \
   WHERE property_name = 'routing.auto_free_models';"

Clear Router Cache

# Via API endpoint (if available)
curl -X POST https://api.langmart.ai/api/admin/cache/clear-auto-free \
  -H "Authorization: Bearer sk-your-api-key"

Monitoring and Logging

  • All auto-free requests logged to request_logs table
  • Routing decisions logged to gateway logs
  • Available models cached and refreshed every 5 minutes
  • Monitor free model availability via admin dashboard

Version History

  • v1.0 (2025-12-20): Initial system collection implementation
    • System collection flag added to enable prefix-less access
    • Auto-free router service created with caching support