Usage Analytics

Usage analytics provide aggregated insights into your API consumption, helping you understand patterns, optimize costs, and forecast future usage.

Accessing Usage Analytics

Dashboard

Navigate to Analytics in the sidebar to access the usage analytics dashboard. The dashboard provides:

  • Summary cards with key metrics
  • Time series charts
  • Model breakdown tables
  • Cost projections

API Access

curl -X GET "https://api.langmart.ai/api/account/analytics/usage" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -G \
  -d "start_date=2025-01-01T00:00:00Z" \
  -d "end_date=2025-01-31T23:59:59Z" \
  -d "granularity=day"

Usage by Model

Model Breakdown

View how your usage is distributed across different models:

Metric             Description
requests           Number of requests per model
input_tokens       Total input tokens consumed
output_tokens      Total output tokens generated
total_cost         Cost per model
cost_percentage    Percentage of total spend
avg_latency        Average response latency (ms)
avg_ttft           Average time to first token (ms)

Example Response

{
  "model_breakdown": [
    {
      "model_name": "gpt-4o",
      "model_display_name": "GPT-4o",
      "provider_name": "OpenAI",
      "requests": 8500,
      "input_tokens": 4250000,
      "output_tokens": 850000,
      "total_tokens": 5100000,
      "total_cost": 127.50,
      "cost_percentage": 65.2,
      "avg_latency": 1250,
      "avg_ttft": 380
    },
    {
      "model_name": "claude-3-sonnet",
      "model_display_name": "Claude 3 Sonnet",
      "provider_name": "Anthropic",
      "requests": 4200,
      "total_cost": 48.30,
      "cost_percentage": 24.7
    }
  ]
}

Usage by Provider

Aggregate usage across all models from each provider:

# model_breakdown groups results by model; provider totals can be computed client-side
curl -X GET "https://api.langmart.ai/api/account/analytics/usage" \
  -H "Authorization: Bearer YOUR_API_KEY"

Provider-level insights help you:

  • Compare costs across providers
  • Identify provider dependencies
  • Plan for provider diversification

Time Series Data

Track how your usage changes over time:

curl -X GET "https://api.langmart.ai/api/account/analytics/usage" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -G \
  -d "granularity=hour"  # Options: hour, day, week, month

Time Series Response

{
  "time_series": [
    {
      "period": "2025-01-15",
      "requests": 520,
      "input_tokens": 260000,
      "output_tokens": 52000,
      "total_tokens": 312000,
      "total_cost": 7.80,
      "avg_latency": 1180,
      "errors": 12
    },
    {
      "period": "2025-01-16",
      "requests": 485,
      "total_cost": 6.90,
      "errors": 8
    }
  ]
}
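
One common use of the time series is spotting periods with unusual error counts. A short sketch over the time_series array above; field names are as documented, periods missing a field are treated as zero, and the 2% threshold is an arbitrary example value:

def flag_error_spikes(time_series, threshold=0.02):
    """Print periods whose error rate exceeds the given fraction of requests."""
    for point in time_series:
        requests_made = point.get("requests", 0)
        errors = point.get("errors", 0)
        if requests_made and errors / requests_made > threshold:
            print(f'{point["period"]}: {errors} errors '
                  f'({errors / requests_made:.1%} of {requests_made} requests)')

# With the example data, 2025-01-15 (12 errors in 520 requests) would be flagged.
flag_error_spikes(usage["time_series"])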

Granularity Options

Granularity    Best For                Recommended Range
hour           Real-time monitoring    Last 24-48 hours
day            Weekly patterns         7-30 days
week           Monthly trends          30-90 days
month          Quarterly analysis      90+ days

Cost Insights

Cost Breakdown

Understand where your money is going:

curl -X GET "https://api.langmart.ai/api/account/cost-insights" \
  -H "Authorization: Bearer YOUR_API_KEY"

Cost Insights Response

{
  "summary": {
    "current_spend": 195.60,
    "total_potential_savings": 28.50,
    "potential_savings_percent": 14.6,
    "insights_count": 4
  },
  "insights": [
    {
      "id": "model_switch_gpt4o_to_gpt4omini",
      "type": "model_switch",
      "severity": "high",
      "title": "Switch from GPT-4o to cheaper alternative",
      "description": "You've spent $127.50 on GPT-4o. Consider switching to GPT-4o-mini for similar results at lower cost.",
      "potential_savings": 18.50,
      "potential_savings_percent": 14.5,
      "recommendation": "Replace gpt-4o with gpt-4o-mini in your API calls."
    },
    {
      "id": "token_optimization_ratio",
      "type": "token_optimization",
      "severity": "medium",
      "title": "Optimize prompt to reduce output tokens",
      "description": "Your average output is 5.2x your input tokens. Consider adding explicit length limits.",
      "potential_savings": 6.20,
      "potential_savings_percent": 3.2
    }
  ]
}
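
To act on the highest-impact recommendations first, the insights can be ranked by severity and potential savings. A minimal sketch; the "low" severity level is an assumption (only "high" and "medium" appear in the example), and cost_insights stands for the parsed response from the cost-insights call above:

SEVERITY_ORDER = {"high": 0, "medium": 1, "low": 2}  # "low" is an assumed level

def top_insights(cost_insights, limit=3):
    """Rank insights by severity, then by potential savings (largest first)."""
    return sorted(
        cost_insights["insights"],
        key=lambda i: (SEVERITY_ORDER.get(i["severity"], 99), -i["potential_savings"]),
    )[:limit]

# cost_insights = parsed JSON from the /api/account/cost-insights request above
for insight in top_insights(cost_insights):
    print(f'[{insight["severity"]}] {insight["title"]}: '
          f'save about ${insight["potential_savings"]:.2f}')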

Insight Types

Type                   Description
model_switch           Suggests cheaper model alternatives
token_optimization     Identifies excessive token usage
rate_limit             Highlights rate limiting issues
provider_comparison    Compares provider pricing
usage_pattern          Identifies usage optimization opportunities

Cost Projections

Projection Data

Estimate future costs based on current usage:

{
  "cost_trends": {
    "daily": [
      {"date": "2025-01-30", "cost": 6.30, "requests": 480},
      {"date": "2025-01-31", "cost": 7.10, "requests": 520}
    ],
    "projection": {
      "next_7_days": 47.60,
      "next_30_days": 204.00,
      "avg_daily_cost": 6.80
    }
  }
}

Understanding Projections

  • next_7_days: Estimated spend for the upcoming week
  • next_30_days: Estimated monthly spend
  • avg_daily_cost: Average daily spend based on recent data

Projections are calculated using simple linear extrapolation from your recent usage. Actual costs may vary based on usage changes.
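As a worked example of that extrapolation, the projected figures follow directly from the average daily cost. A sketch of the arithmetic, assuming a simple average over the recent daily points:

def project_costs(daily_points):
    """Project spend by multiplying the recent average daily cost forward."""
    avg_daily_cost = sum(p["cost"] for p in daily_points) / len(daily_points)
    return {
        "avg_daily_cost": round(avg_daily_cost, 2),
        "next_7_days": round(avg_daily_cost * 7, 2),
        "next_30_days": round(avg_daily_cost * 30, 2),
    }

# With only the two example points, (6.30 + 7.10) / 2 = 6.70/day; the response's
# 6.80/day average presumably covers a longer window of daily data.
print(project_costs([
    {"date": "2025-01-30", "cost": 6.30},
    {"date": "2025-01-31", "cost": 7.10},
]))
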

Peak Usage Analysis

Peak Hours Data

Identify when you use the API most:

{
  "peak_usage": [
    {"day_of_week": 1, "hour": 14, "requests": 85, "cost": 2.10},
    {"day_of_week": 1, "hour": 15, "requests": 92, "cost": 2.30},
    {"day_of_week": 2, "hour": 10, "requests": 78, "cost": 1.95}
  ]
}

Day of Week Reference

Value    Day
0        Sunday
1        Monday
2        Tuesday
3        Wednesday
4        Thursday
5        Friday
6        Saturday
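
Using the day-of-week mapping above, the peak_usage entries can be turned into readable labels and ranked by request volume. A minimal sketch over the example data; how you obtain the peak_usage array depends on which analytics response you requested:

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]

def busiest_periods(peak_usage, limit=3):
    """Return the top periods by request count as readable labels."""
    ranked = sorted(peak_usage, key=lambda p: p["requests"], reverse=True)
    return [
        f'{DAYS[p["day_of_week"]]} {p["hour"]:02d}:00: '
        f'{p["requests"]} requests (${p["cost"]:.2f})'
        for p in ranked[:limit]
    ]

# Example data from the response above; "Monday 15:00" comes out on top.
for label in busiest_periods([
    {"day_of_week": 1, "hour": 14, "requests": 85, "cost": 2.10},
    {"day_of_week": 1, "hour": 15, "requests": 92, "cost": 2.30},
    {"day_of_week": 2, "hour": 10, "requests": 78, "cost": 1.95},
]):
    print(label)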

Latency Distribution

Understanding Latency

View how your request latencies are distributed:

{
  "latency_distribution": [
    {"bucket": "0-100ms", "count": 120, "percentage": 2.5},
    {"bucket": "100-500ms", "count": 1850, "percentage": 38.5},
    {"bucket": "500ms-1s", "count": 1620, "percentage": 33.7},
    {"bucket": "1-3s", "count": 980, "percentage": 20.4},
    {"bucket": "3-10s", "count": 210, "percentage": 4.4},
    {"bucket": "10s+", "count": 25, "percentage": 0.5}
  ]
}
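
The cumulative percentages also give a rough percentile estimate: in the example data, about 95% of requests complete within the 1-3s bucket. A minimal sketch that finds the bucket containing a given percentile:

def percentile_bucket(latency_distribution, percentile=95.0):
    """Return the first bucket whose cumulative percentage reaches the percentile."""
    cumulative = 0.0
    for bucket in latency_distribution:
        cumulative += bucket["percentage"]
        if cumulative >= percentile:
            return bucket["bucket"]
    return latency_distribution[-1]["bucket"] if latency_distribution else None

# With the example distribution above, this prints "1-3s".
print(percentile_bucket(usage["latency_distribution"]))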

Latency Buckets

Bucket       Typical Cause
0-100ms      Cached responses, simple queries
100-500ms    Standard responses
500ms-1s     Average model responses
1-3s         Complex responses, larger outputs
3-10s        Very large outputs, slow models
10s+         Timeouts, overloaded providers

Summary Statistics

Quick Overview

{
  "summary": {
    "total_requests": 15420,
    "total_tokens": 8250000,
    "input_tokens": 6875000,
    "output_tokens": 1375000,
    "total_cost": 195.60,
    "avg_latency": 1180,
    "avg_ttft": 320,
    "unique_models": 8,
    "success_rate": 98.5,
    "failed_requests": 231
  }
}
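
A few derived figures are often more useful than the raw totals, such as cost per request and cost per million tokens. A short sketch over the summary object above; the metric names in the returned dictionary are illustrative, not API fields:

def derived_metrics(summary):
    """Compute per-request and per-token cost figures from the summary block."""
    return {
        "cost_per_request": summary["total_cost"] / summary["total_requests"],
        "cost_per_million_tokens": summary["total_cost"] / summary["total_tokens"] * 1_000_000,
        "error_rate_percent": 100.0 * summary["failed_requests"] / summary["total_requests"],
    }

# With the example summary: about $0.0127 per request, $23.71 per million tokens,
# and a 1.5% error rate (consistent with the 98.5% success_rate).
print(derived_metrics(usage["summary"]))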

Key Metrics Explained

Metric           Description                          Good Range
success_rate     Percentage of successful requests    >95%
avg_latency      Average total latency                <3000ms
avg_ttft         Average time to first token          <500ms
unique_models    Number of distinct models used       Varies

Using Analytics for Optimization

Cost Optimization

  1. Review Model Breakdown: Identify expensive models
  2. Check Cost Insights: Act on savings recommendations
  3. Analyze Token Usage: Look for output token waste
  4. Compare Providers: Consider cheaper alternatives

Performance Optimization

  1. Monitor Latency Distribution: Identify slow requests
  2. Track TTFT: Optimize streaming performance
  3. Analyze Peak Usage: Scale resources appropriately
  4. Review Error Rates: Address reliability issues

Setting Alerts

Based on your analytics, set up proactive alerts:

  • Cost threshold alerts when spending exceeds limits
  • Error rate alerts when failures spike
  • Usage spike alerts for unusual activity
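
For cost threshold alerts specifically, a simple check can be scripted against the usage endpoint shown earlier. The sketch below is one possible wiring, not a built-in feature: the budget value and the notification step are your own choices, and date parameters can be added to the request as in the earlier examples.

import requests

COST_THRESHOLD = 200.00  # hypothetical monthly budget, not an API setting

def check_spend(api_key, threshold=COST_THRESHOLD):
    """Fetch current usage and report whether spend has crossed the threshold."""
    resp = requests.get(
        "https://api.langmart.ai/api/account/analytics/usage",
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=30,
    )
    resp.raise_for_status()
    spend = resp.json()["summary"]["total_cost"]
    if spend >= threshold:
        # Replace with your own notification hook (email, chat webhook, pager, ...)
        print(f"ALERT: spend ${spend:.2f} has exceeded ${threshold:.2f}")
    else:
        print(f"OK: spend ${spend:.2f} is under ${threshold:.2f}")

check_spend("YOUR_API_KEY")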