Google: Gemini Document Understanding

Model Overview

Property    Value
Model ID    google/gemini-doc-understanding
Name        Gemini Document Understanding
Status      Stable
Released    2024-09-15

Description

Specialized model for document processing and extraction.

Gemini Document Understanding is a Google model that accepts text, image, and document input and targets extraction tasks such as form processing and table understanding.

Specifications

Spec              Value
Context Window    200,000 tokens
Max Output        4,096 tokens
Modalities        text, image, document
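
The two limits above can be enforced before a request is sent. The following is a minimal sketch, not part of the provider's API: the helper name `clamp_max_tokens` is hypothetical, and it assumes that prompt and output tokens share the 200,000-token context window, which the source does not state.

```python
CONTEXT_WINDOW = 200_000  # tokens, from the Specifications table
MAX_OUTPUT = 4_096        # tokens, from the Specifications table


def clamp_max_tokens(prompt_tokens: int, requested: int) -> int:
    """Return a max_tokens value that respects both documented limits.

    Assumption (not confirmed by the source): prompt tokens and output
    tokens count against the same context window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    room = CONTEXT_WINDOW - prompt_tokens
    if room <= 0:
        raise ValueError("prompt alone fills or exceeds the context window")
    # The smallest of the three bounds wins.
    return min(requested, MAX_OUTPUT, room)
```

For example, a 1,000-token prompt with a requested 8,000 output tokens is clamped to 4,096, while a 199,000-token prompt leaves room for only 1,000.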

Pricing

Type      Price
Input     $0.08/1M tokens
Output    $0.25/1M tokens
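
To make the per-million rates concrete, here is a small cost estimator. It is an illustrative sketch: the function name `estimate_cost` is hypothetical, and it assumes billing is strictly linear in token counts with no minimums or rounding, which the source does not specify.

```python
from decimal import Decimal

# Rates from the Pricing table, in US dollars per 1M tokens.
INPUT_PER_MILLION = Decimal("0.08")
OUTPUT_PER_MILLION = Decimal("0.25")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the dollar cost of one request.

    Assumes purely linear per-token billing (an assumption, not a
    documented billing rule).
    """
    total = (Decimal(input_tokens) * INPUT_PER_MILLION
             + Decimal(output_tokens) * OUTPUT_PER_MILLION)
    return total / MILLION
```

Under these assumptions, a request that fills the whole context window (200,000 input tokens) and produces the maximum 4,096 output tokens costs about $0.017.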

Capabilities

  • Text: Yes
  • Image: Yes
  • Audio: No
  • Video: No
  • Tool Use: No
  • JSON Mode: No
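
Because JSON Mode is not available, a caller who prompts for JSON should expect the reply to arrive as free text, possibly with commentary around it. The sketch below shows one defensive way to recover the first JSON value from such a reply using only the standard library. The helper name `extract_json` is hypothetical, and nothing here reflects documented model behavior.

```python
import json
from typing import Any, Optional


def extract_json(text: str) -> Optional[Any]:
    """Return the first JSON object or array embedded in text, or None."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char in "{[":
            try:
                # raw_decode parses one JSON value starting at index and
                # ignores whatever follows it.
                value, _end = decoder.raw_decode(text, index)
            except json.JSONDecodeError:
                continue  # not valid JSON here; try the next candidate
            return value
    return None
```

Callers should still validate the returned structure, since the model is not constrained to any schema.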

Key Features

  1. Multimodal Support - Text, images, documents (no audio or video)
  2. Large Context - Up to 200,000 tokens
  3. Tool Use - Not supported
  4. JSON Mode - Not available
  5. Streaming - Real-time generation
  6. Cost Effective - $0.08 input and $0.25 output per 1M tokens

Best For

  • Document extraction
  • Form processing
  • Table understanding
  • Content analysis

Data & Usage Policies

Policy              Status
Training Data       Not used for training
Prompt Retention    Does not retain prompts
Data Processing     Google Cloud privacy compliant

Status & Availability

  • Status: Stable
  • Free Tier: No
  • Provider: Google

API Usage Example

curl https://api.langmart.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "google/gemini-doc-understanding",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
  }'
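
The same request can be built from Python with only the standard library. This is a sketch that mirrors the curl call above. The helper names `build_request` and `send` are hypothetical, and the response format is not documented in the source, so the parsed body is returned as-is.

```python
import json
import urllib.request

API_URL = "https://api.langmart.ai/v1/chat/completions"
MODEL_ID = "google/gemini-doc-understanding"


def build_request(api_key: str, prompt: str,
                  max_tokens: int = 4096) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON body unchanged."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

A call would look like `send(build_request("YOUR_API_KEY", "Hello"))`. Keeping construction separate from sending makes the payload easy to inspect without network access.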

Related Models

  • google/gemini-3-pro-preview - Latest flagship
  • google/gemini-2.5-pro - Advanced 2.5
  • google/gemini-2.0-flash - Fast multimodal
  • google/gemma-3-27b-it - Open-source

Source

Generated for LangMart AI Platform on 2025-12-28