Usage Analytics

Usage analytics provide aggregated insights into your API consumption, helping you understand patterns, optimize costs, and forecast future usage.

Accessing Usage Analytics

Dashboard

Navigate to Analytics in the sidebar to access the usage analytics dashboard. The dashboard provides:

  • Summary cards with key metrics
  • Time series charts
  • Model breakdown tables
  • Cost projections

API Access

curl -X GET "https://api.langmart.ai/api/account/analytics/usage" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -G \
  -d "start_date=2025-01-01T00:00:00Z" \
  -d "end_date=2025-01-31T23:59:59Z" \
  -d "granularity=day"

Usage by Model

Model Breakdown

View how your usage is distributed across different models:

Metric             Description
requests           Number of requests per model
input_tokens       Total input tokens consumed
output_tokens      Total output tokens generated
total_cost         Cost per model
cost_percentage    Percentage of total spend
avg_latency        Average response latency (ms)
avg_ttft           Average time to first token (ms)

Example Response

{
  "model_breakdown": [
    {
      "model_name": "gpt-4o",
      "model_display_name": "GPT-4o",
      "provider_name": "OpenAI",
      "requests": 8500,
      "input_tokens": 4250000,
      "output_tokens": 850000,
      "total_tokens": 5100000,
      "total_cost": 127.50,
      "cost_percentage": 65.2,
      "avg_latency": 1250,
      "avg_ttft": 380
    },
    {
      "model_name": "claude-3-sonnet",
      "model_display_name": "Claude 3 Sonnet",
      "provider_name": "Anthropic",
      "requests": 4200,
      "total_cost": 48.30,
      "cost_percentage": 24.7
    }
  ]
}

Usage by Provider

Aggregate usage across all models from each provider:

# model_breakdown groups results by model; provider totals can be computed client-side
curl -X GET "https://api.langmart.ai/api/account/analytics/usage" \
  -H "Authorization: Bearer YOUR_API_KEY"

Provider-level insights help you:

  • Compare costs across providers
  • Identify provider dependencies
  • Plan for provider diversification

Time Series Data

Track how your usage changes over time:

curl -X GET "https://api.langmart.ai/api/account/analytics/usage" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -G \
  -d "granularity=hour"  # Options: hour, day, week, month

Time Series Response

{
  "time_series": [
    {
      "period": "2025-01-15",
      "requests": 520,
      "input_tokens": 260000,
      "output_tokens": 52000,
      "total_tokens": 312000,
      "total_cost": 7.80,
      "avg_latency": 1180,
      "errors": 12
    },
    {
      "period": "2025-01-16",
      "requests": 485,
      "total_cost": 6.90,
      "errors": 8
    }
  ]
}
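
One common use of the time series is spotting periods with unusual error counts. A short sketch over the time_series array above; field names are as documented, periods missing a field are treated as zero, and the 2% threshold is an arbitrary example value:

def flag_error_spikes(time_series, threshold=0.02):
    """Print periods whose error rate exceeds the given fraction of requests."""
    for point in time_series:
        requests_made = point.get("requests", 0)
        errors = point.get("errors", 0)
        if requests_made and errors / requests_made > threshold:
            print(f'{point["period"]}: {errors} errors '
                  f'({errors / requests_made:.1%} of {requests_made} requests)')

# With the example data, 2025-01-15 (12 errors in 520 requests) would be flagged.
flag_error_spikes(usage["time_series"])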

Granularity Options

Granularity    Best For                Recommended Range
hour           Real-time monitoring    Last 24-48 hours
day            Weekly patterns         7-30 days
week           Monthly trends          30-90 days
month          Quarterly analysis      90+ days

Cost Insights

Cost Breakdown

Understand where your money is going:

curl -X GET "https://api.langmart.ai/api/account/cost-insights" \
  -H "Authorization: Bearer YOUR_API_KEY"

Cost Insights Response

{
  "summary": {
    "current_spend": 195.60,
    "total_potential_savings": 28.50,
    "potential_savings_percent": 14.6,
    "insights_count": 4
  },
  "insights": [
    {
      "id": "model_switch_gpt4o_to_gpt4omini",
      "type": "model_switch",
      "severity": "high",
      "title": "Switch from GPT-4o to cheaper alternative",
      "description": "You've spent $127.50 on GPT-4o. Consider switching to GPT-4o-mini for similar results at lower cost.",
      "potential_savings": 18.50,
      "potential_savings_percent": 14.5,
      "recommendation": "Replace gpt-4o with gpt-4o-mini in your API calls."
    },
    {
      "id": "token_optimization_ratio",
      "type": "token_optimization",
      "severity": "medium",
      "title": "Optimize prompt to reduce output tokens",
      "description": "Your average output is 5.2x your input tokens. Consider adding explicit length limits.",
      "potential_savings": 6.20,
      "potential_savings_percent": 3.2
    }
  ]
}
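
To act on the highest-impact recommendations first, the insights can be ranked by severity and potential savings. A minimal sketch; the "low" severity level is an assumption (only "high" and "medium" appear in the example), and cost_insights stands for the parsed response from the cost-insights call above:

SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}  # "low" is an assumed level

def top_insights(cost_insights, limit=3):
    """Rank insights by severity, then by potential savings (largest first)."""
    return sorted(
        cost_insights["insights"],
        key=lambda i: (SEVERITY_ORDER.get(i["severity"], 99), -i["potential_savings"]),
    )[:limit]

# cost_insights = parsed JSON from the /api/account/cost-insights request above
for insight in top_insights(cost_insights):
    print(f'[{insight["severity"]}] {insight["title"]}: '
          f'save about ${insight["potential_savings"]:.2f}')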

Insight Types

Type                   Description
model_switch           Suggests cheaper model alternatives
token_optimization     Identifies excessive token usage
rate_limit             Highlights rate limiting issues
provider_comparison    Compares provider pricing
usage_pattern          Identifies usage optimization opportunities

Cost Projections

Projection Data

Estimate future costs based on current usage:

{
  "cost_trends": {
    "daily": [
      {"date": "2025-01-30", "cost": 6.30, "requests": 480},
      {"date": "2025-01-31", "cost": 7.10, "requests": 520}
    ],
    "projection": {
      "next_7_days": 47.60,
      "next_30_days": 204.00,
      "avg_daily_cost": 6.80
    }
  }
}

Understanding Projections

  • next_7_days: Estimated spend for the upcoming week
  • next_30_days: Estimated monthly spend
  • avg_daily_cost: Average daily spend based on recent data

Projections are calculated using simple linear extrapolation from your recent usage. Actual costs may vary based on usage changes.
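As a worked example of that extrapolation, the projected figures follow directly from the average daily cost. A sketch of the arithmetic, assuming a simple average over the recent daily points:

def project_costs(daily_points):
    """Project spend by multiplying the recent average daily cost forward."""
    avg_daily_cost = sum(p["cost"] for p in daily_points) / len(daily_points)
    return {
        "avg_daily_cost": round(avg_daily_cost, 2),
        "next_7_days": round(avg_daily_cost * 7, 2),
        "next_30_days": round(avg_daily_cost * 30, 2),
    }

# With only the two example points, (6.30 + 7.10) / 2 = 6.70/day; the response's
# 6.80/day average presumably covers a longer window of daily data.
print(project_costs([
    {"date": "2025-01-30", "cost": 6.30},
    {"date": "2025-01-31", "cost": 7.10},
]))
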

Peak Usage Analysis

Peak Hours Data

Identify when you use the API most:

{
  "peak_usage": [
    {"day_of_week": 1, "hour": 14, "requests": 85, "cost": 2.10},
    {"day_of_week": 1, "hour": 15, "requests": 92, "cost": 2.30},
    {"day_of_week": 2, "hour": 10, "requests": 78, "cost": 1.95}
  ]
}

Day of Week Reference

Value    Day
0        Sunday
1        Monday
2        Tuesday
3        Wednesday
4        Thursday
5        Friday
6        Saturday
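
Using the day-of-week mapping above, the peak_usage entries can be turned into readable labels and ranked by request volume. A minimal sketch over the example data; how you obtain the peak_usage array depends on which analytics response you requested:

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]

def busiest_periods(peak_usage, limit=3):
    """Return the top periods by request count as readable labels."""
    ranked = sorted(peak_usage, key=lambda p: p["requests"], reverse=True)
    return [
        f'{DAYS[p["day_of_week"]]} {p["hour"]:02d}:00: '
        f'{p["requests"]} requests (${p["cost"]:.2f})'
        for p in ranked[:limit]
    ]

# Example data from the response above; "Monday 15:00" comes out on top.
for label in busiest_periods([
    {"day_of_week": 1, "hour": 14, "requests": 85, "cost": 2.10},
    {"day_of_week": 1, "hour": 15, "requests": 92, "cost": 2.30},
    {"day_of_week": 2, "hour": 10, "requests": 78, "cost": 1.95},
]):
    print(label)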

Latency Distribution

Understanding Latency

View how your request latencies are distributed:

{
  "latency_distribution": [
    {"bucket": "0-100ms", "count": 120, "percentage": 2.5},
    {"bucket": "100-500ms", "count": 1850, "percentage": 38.5},
    {"bucket": "500ms-1s", "count": 1620, "percentage": 33.7},
    {"bucket": "1-3s", "count": 980, "percentage": 20.4},
    {"bucket": "3-10s", "count": 210, "percentage": 4.4},
    {"bucket": "10s+", "count": 25, "percentage": 0.5}
  ]
}
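
The cumulative percentages also give a rough percentile estimate: in the example data, about 95% of requests complete within the 1-3s bucket. A minimal sketch that finds the bucket containing a given percentile:

def percentile_bucket(latency_distribution, percentile=95.0):
    """Return the first bucket whose cumulative percentage reaches the percentile."""
    cumulative = 0.0
    for bucket in latency_distribution:
        cumulative += bucket["percentage"]
        if cumulative >= percentile:
            return bucket["bucket"]
    return latency_distribution[-1]["bucket"] if latency_distribution else None

# With the example distribution above, this prints "1-3s".
print(percentile_bucket(usage["latency_distribution"]))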

Latency Buckets

Bucket       Typical Cause
0-100ms      Cached responses, simple queries
100-500ms    Standard responses
500ms-1s     Average model responses
1-3s         Complex responses, larger outputs
3-10s        Very large outputs, slow models
10s+         Timeouts, overloaded providers

Summary Statistics

Quick Overview

{
  "summary": {
    "total_requests": 15420,
    "total_tokens": 8250000,
    "input_tokens": 6875000,
    "output_tokens": 1375000,
    "total_cost": 195.60,
    "avg_latency": 1180,
    "avg_ttft": 320,
    "unique_models": 8,
    "success_rate": 98.5,
    "failed_requests": 231
  }
}
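
A few derived figures are often more useful than the raw totals, such as cost per request and cost per million tokens. A short sketch over the summary object above; the metric names in the returned dictionary are illustrative, not API fields:

def derived_metrics(summary):
    """Compute per-request and per-token cost figures from the summary block."""
    return {
        "cost_per_request": summary["total_cost"] / summary["total_requests"],
        "cost_per_million_tokens": summary["total_cost"] / summary["total_tokens"] * 1_000_000,
        "error_rate_percent": 100.0 * summary["failed_requests"] / summary["total_requests"],
    }

# With the example summary: about $0.0127 per request, $23.71 per million tokens,
# and a 1.5% error rate (consistent with the 98.5% success_rate).
print(derived_metrics(usage["summary"]))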

Key Metrics Explained

Metric           Description                          Good Range
success_rate     Percentage of successful requests    >95%
avg_latency      Average total latency                <3000ms
avg_ttft         Average time to first token          <500ms
unique_models    Number of distinct models used       Varies

Using Analytics for Optimization

Cost Optimization

  1. Review Model Breakdown: Identify expensive models
  2. Check Cost Insights: Act on savings recommendations
  3. Analyze Token Usage: Look for output token waste
  4. Compare Providers: Consider cheaper alternatives

Performance Optimization

  1. Monitor Latency Distribution: Identify slow requests
  2. Track TTFT: Optimize streaming performance
  3. Analyze Peak Usage: Scale resources appropriately
  4. Review Error Rates: Address reliability issues

Setting Alerts

Based on your analytics, set up proactive alerts:

  • Cost threshold alerts when spending exceeds limits
  • Error rate alerts when failures spike
  • Usage spike alerts for unusual activity
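
For cost threshold alerts specifically, a simple check can be scripted against the usage endpoint shown earlier. The sketch below is one possible wiring, not a built-in feature: the budget value and the notification step are your own choices, and date parameters can be added to the request as in the earlier examples.

import requests

COST_THRESHOLD = 200.00  # hypothetical monthly budget, not an API setting

def check_spend(api_key, threshold=COST_THRESHOLD):
    """Fetch current usage and report whether spend has crossed the threshold."""
    resp = requests.get(
        "https://api.langmart.ai/api/account/analytics/usage",
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=30,
    )
    resp.raise_for_status()
    spend = resp.json()["summary"]["total_cost"]
    if spend >= threshold:
        # Replace with your own notification hook (email, chat webhook, pager, ...)
        print(f"ALERT: spend ${spend:.2f} has exceeded ${threshold:.2f}")
    else:
        print(f"OK: spend ${spend:.2f} is under ${threshold:.2f}")

check_spend("YOUR_API_KEY")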