A

Trinity Mini - API, Providers, Stats

Arcee AI
131K
Context
$0.0400
Input /1M
$0.1500
Output /1M
N/A
Max Output

Trinity Mini - API, Providers, Stats

Overview

Trinity Mini is a 26B-parameter sparse mixture-of-experts language model engineered by Arcee AI for efficient reasoning over extended contexts with robust function calling and multi-step agent capabilities.

Key Specifications:

  • Architecture: Sparse Mixture of Experts (MoE)
  • Total Parameters: 26B
  • Active Parameters: 3B (effective)
  • Expert Configuration: 128 experts with 8 active per token
  • Context Length: 131,072 tokens
  • Input/Output Modalities: Text-only
  • Release Date: December 1, 2025

Pricing

Metric Cost
Input $0.04 per 1M tokens
Output $0.15 per 1M tokens

Performance

Latency & Throughput (via Together):

  • Latency: 0.26s
  • Throughput: 243.8 tokens per second
  • Uptime: 100.0%

Recent Usage Metrics:

  • Peak daily requests: ~58,500 (Dec 10, 2025)
  • Significant reasoning token usage patterns
  • Active tool calling functionality
  • Error rate: <1% on tool calls
  • Consistent performance across deployments

Information about related models is not provided on the Trinity Mini model page.

Providers

Provider Quantization Status
Clarifai BF16 Active

Model Endpoint ID: arcee_ai/AFM/models/trinity-mini

Parameters

Supported Parameters:

  • Max Tokens
  • Temperature (default: 0.15)
  • Top P (default: 0.75)
  • Stop Sequences
  • Frequency Penalty
  • Presence Penalty
  • Top K
  • Repetition Penalty
  • Logit Bias
  • Min P

Advanced Features:

  • Reasoning mode with <think> tags for internal reasoning
  • Tool/Function calling with tool choice selection
  • Structured outputs & response formatting
  • Include reasoning flag
  • Response format specification

Tool Capabilities:

  • Robust function calling support
  • Multi-step agent workflow capability
  • Tool choice parameters available