Inference Providers


Inference Providers are organizations that offer AI model access to their members. Whether you're a cloud provider with GPU infrastructure, an enterprise IT team, or a research institution, LangMart enables you to provide managed AI inference services to your users.

What is an Inference Provider?

An Inference Provider is an organization that:

  • Provides Model Access: Makes AI models available to organization members
  • Manages Billing: Controls whether the organization or members pay for usage
  • Centralizes API Keys: Members don't need their own provider API keys
  • Tracks Usage: Monitors consumption across all members
  • Sets Spending Limits: Controls costs with per-member and organization-wide limits

Provider vs Consumer Organizations

Aspect              | Consumer Organization               | Inference Provider
--------------------|-------------------------------------|----------------------------------------
Primary Goal        | Use AI models for team projects     | Provide AI services to members
Billing Model       | Usually org-pays for approved tools | Often a mix of org-pays and member-pays
Member Relationship | Employees/team members              | Customers/users/departments
Scale               | Typically 5-50 members              | Can scale to hundreds or thousands
API Key Management  | Few shared connections              | Multiple connections, pools, failover

Benefits for Inference Providers

For Cloud & GPU Providers

  • Monetize Infrastructure: Turn GPU clusters into revenue-generating inference services
  • No Platform Development: Skip building billing, authentication, and API management
  • Usage Analytics: Track consumption and generate customer invoices
  • Multi-tenancy: Serve multiple customers from shared infrastructure

For Enterprise IT Teams

  • Centralized Control: Single point of access for all AI models
  • Compliance: Audit trails, PII detection, and data governance
  • Cost Management: Budgets and limits per department/team
  • Model Governance: Control which models are available to which teams

For Research Institutions

  • Grant Management: Track usage against research budgets
  • Student Access: Provide AI access without individual API keys
  • Usage Reporting: Generate reports for grant compliance
  • Collaboration: Share resources across labs and departments

Key Capabilities

Model Access Control

Control which models members can access:

  • Organization Connections: Models available to all members
  • Model Categories: Group models by capability (coding, chat, embedding)
  • Connection Pools: Load balance across multiple API keys
  • Failover: Automatic routing when connections fail
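
To make pooling and failover concrete, here is a minimal sketch of how a gateway might rotate through a pool of keys and retry on failure. The Connection class and its fields are illustrative stand-ins, not part of the LangMart API:

# Illustrative failover sketch; Connection and its fields are
# hypothetical stand-ins, not part of the LangMart API.
import random

class Connection:
    def __init__(self, name):
        self.name = name
        self.healthy = True

    def send(self, request):
        # A real gateway would forward to the upstream provider here.
        if not self.healthy:
            raise RuntimeError(f"{self.name} is unavailable")
        return f"response via {self.name}"

def route(pool, request):
    # Spread load by trying pooled connections in random order,
    # failing over to the next key when one errors out.
    for conn in random.sample(pool, len(pool)):
        try:
            return conn.send(request)
        except RuntimeError:
            continue
    raise RuntimeError("all connections in the pool failed")

pool = [Connection("openai-key-1"), Connection("openai-key-2")]
print(route(pool, {"model": "openai/gpt-4o"}))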

Billing & Credits

Flexible billing options:

Mode        | Description                              | Use Case
------------|------------------------------------------|----------------------------------
Org-Pays    | Organization covers all usage            | Approved tools, core workflows
Member-Pays | Members pay from their credits           | Experimental models, personal use
Hybrid      | Some models org-paid, others member-paid | Mixed environments
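
In practice, a hybrid setup reduces to a per-model lookup that decides whose balance is charged. The mapping below is purely illustrative; the model set and mode labels are not LangMart configuration keys:

# Hypothetical hybrid-billing lookup; the model set and mode labels
# are illustrative, not LangMart configuration keys.
ORG_PAID_MODELS = {"openai/gpt-4o", "openai/gpt-4o-mini"}

def payer_for(model: str) -> str:
    # Approved models are billed to the organization; everything
    # else falls through to the member's own credits.
    return "org-pays" if model in ORG_PAID_MODELS else "member-pays"

print(payer_for("openai/gpt-4o"))           # org-pays
print(payer_for("lab/experimental-model"))  # member-pays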

Usage Management

Comprehensive usage controls:

  • Per-Member Limits: Daily and monthly spending caps
  • Organization Limits: Total spending caps
  • Real-time Tracking: Monitor usage as it happens
  • Alerts: Notifications when usage approaches a limit
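
As a rough mental model (the field names and dollar amounts below are invented for illustration), a request is admitted only if its estimated cost fits under every applicable cap:

# Illustrative limit check; field names and amounts are invented.
from dataclasses import dataclass

@dataclass
class Limits:
    member_daily: float = 10.00      # USD per member per day
    member_monthly: float = 100.00   # USD per member per month
    org_monthly: float = 5000.00     # USD across the organization

def allow(est_cost, spent_day, spent_month, spent_org, limits):
    # Every applicable cap must still have headroom.
    return (spent_day + est_cost <= limits.member_daily and
            spent_month + est_cost <= limits.member_monthly and
            spent_org + est_cost <= limits.org_monthly)

print(allow(0.02, spent_day=9.99, spent_month=50.0, spent_org=120.0,
            limits=Limits()))  # False: would exceed the daily cap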

Security & Compliance

Enterprise-grade security:

  • KeyVault: Automatic API key redaction in prompts
  • PII Detection: Identify sensitive data in requests
  • Audit Logging: Complete request/response history
  • Role-Based Access: Control who can manage settings
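
For intuition, KeyVault-style redaction amounts to scrubbing secret-shaped substrings from a prompt before it leaves the gateway. The patterns below are a simplified stand-in for whatever detector LangMart actually uses:

# Simplified stand-in for API-key redaction; these two patterns only
# approximate what a production detector like KeyVault would match.
import re

KEY_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9_-]{20,}"),  # OpenAI-style secret keys
    re.compile(r"AKIA[0-9A-Z]{16}"),       # AWS access key IDs
]

def redact(prompt: str) -> str:
    for pattern in KEY_PATTERNS:
        prompt = pattern.sub("[REDACTED]", prompt)
    return prompt

print(redact("call the API with sk-abcdefghijklmnopqrstuv"))
# -> call the API with [REDACTED]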

Getting Started

Follow these guides to set up your organization as an Inference Provider:

  1. Quick Start Guide - Set up in 15 minutes
  2. Organization Setup - Detailed configuration
  3. Member Management - Invite and manage users
  4. Billing Models - Configure payment options
  5. Connection Pools - High availability setup
  6. Usage & Analytics - Monitor your service

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                    Inference Provider Organization               │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐         │
│  │   Member    │    │   Member    │    │   Member    │         │
│  │   (User)    │    │   (User)    │    │   (User)    │         │
│  └──────┬──────┘    └──────┬──────┘    └──────┬──────┘         │
│         │                  │                  │                 │
│         └──────────────────┼──────────────────┘                 │
│                            │                                    │
│                            ▼                                    │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │              Organization Connection Pool                │   │
│  │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐    │   │
│  │  │ OpenAI  │  │Anthropic│  │  Groq   │  │ Custom  │    │   │
│  │  │   Key   │  │   Key   │  │   Key   │  │Endpoint │    │   │
│  │  └─────────┘  └─────────┘  └─────────┘  └─────────┘    │   │
│  └─────────────────────────────────────────────────────────┘   │
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │                   Billing & Analytics                     │  │
│  │  • Usage tracking per member                              │  │
│  │  • Spending limits enforcement                            │  │
│  │  • Cost allocation reports                                │  │
│  └──────────────────────────────────────────────────────────┘  │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘

Platform Integration

Inference Providers integrate with LangMart through:

API Access

Members access models via the OpenAI-compatible API:

curl -X POST https://api.langmart.ai/v1/chat/completions \
  -H "Authorization: Bearer <member-api-key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
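
Since the endpoint is OpenAI-compatible, the official OpenAI Python SDK should also work by overriding its base URL; only the base_url and the key source differ from a stock OpenAI setup (a sketch, not verified against LangMart):

# Same request via the OpenAI Python SDK; only base_url and the
# member API key differ from a stock OpenAI configuration.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.langmart.ai/v1",
    api_key="<member-api-key>",
)

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)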

Web Chat Interface

Members can also use the built-in chat interface at https://langmart.ai/chat, which provides access to organization models.

Admin Dashboard

Providers manage their service through the admin interface:

  • Member management
  • Connection configuration
  • Usage monitoring
  • Billing settings

Pricing for Providers

LangMart charges a small platform fee on managed inference:

Tier                | Fee              | Best For
--------------------|------------------|-----------------------------
Managed             | 3% of usage      | Most providers
Self-Hosted Gateway | 0%               | Privacy-focused deployments
White-Label         | One-time license | Large-scale operations
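
For example, under the Managed tier, $1,000 of member inference usage in a month incurs a $30 platform fee, while the same traffic through a Self-Hosted Gateway incurs none.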

Next Steps

Ready to become an Inference Provider?

  1. Create your organization
  2. Add provider connections
  3. Invite your first members
  4. Configure billing