Google: Gemini Document Understanding

Model Overview

Property	Value
Model ID	`google/gemini-doc-understanding`
Name	Gemini Document Understanding
Status	Stable
Released	2024-09-15

Description

Gemini Document Understanding is a Google model specialized for document processing and extraction. It accepts text, image, and document inputs, with a context window of 200,000 tokens.

Specifications

Spec	Value
Context Window	200,000 tokens
Max Output	4,096 tokens
Modalities	text, image, document
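
As a rough pre-flight check against these limits, a request can be validated before it is sent. The sketch below assumes that prompt tokens and requested output tokens share the 200,000-token window, which this page does not state. `fits_limits` and its parameters are hypothetical names, and the token counts have to come from your own tokenizer or estimate.

```python
CONTEXT_WINDOW = 200_000  # from the Specifications table
MAX_OUTPUT = 4_096        # from the Specifications table


def fits_limits(prompt_tokens: int, max_tokens: int = MAX_OUTPUT) -> bool:
    """Return True if a request stays within both documented limits.

    Assumption (not stated on this page): prompt and output tokens
    count against the same context window.
    """
    if prompt_tokens < 0 or max_tokens <= 0:
        raise ValueError("prompt_tokens must be >= 0 and max_tokens must be > 0")
    if max_tokens > MAX_OUTPUT:
        # Asking for more output than the model can produce.
        return False
    return prompt_tokens + max_tokens <= CONTEXT_WINDOW
```

Under that assumption, a 150,000-token document fits with the full 4,096-token output budget, while a 196,000-token document does not.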

Pricing

Type	Price
Input	$0.08/1M tokens
Output	$0.25/1M tokens
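
The per-token rates above translate into a request cost with simple arithmetic. The following is a minimal sketch using only the two listed prices; `estimate_cost` is a hypothetical helper, and it ignores any fees or rounding rules the platform may apply.

```python
from decimal import Decimal

INPUT_PRICE_PER_M = Decimal("0.08")   # USD per 1M input tokens (Pricing table)
OUTPUT_PRICE_PER_M = Decimal("0.25")  # USD per 1M output tokens (Pricing table)
TOKENS_PER_M = 1_000_000


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = INPUT_PRICE_PER_M * input_tokens + OUTPUT_PRICE_PER_M * output_tokens
    return total / TOKENS_PER_M
```

For example, `estimate_cost(150_000, 4_096)` comes to 0.013024 USD: 0.012 for the input and 0.001024 for the output.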

Capabilities

Text: Yes
Image: Yes
Audio: No
Video: No
Tool Use: No
JSON Mode: No
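
Because JSON Mode is not available, extraction results that you ask for as JSON may come back wrapped in prose or a code fence. One way to handle this is to scan the reply for the first parseable JSON object. This is a sketch, not a platform feature: `extract_first_json` is a hypothetical helper, and the reply text is assumed to have already been pulled out of the API response.

```python
import json
from typing import Optional


def extract_first_json(text: str) -> Optional[dict]:
    """Return the first JSON object found in free-form text, or None."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at `index`
            # and ignores whatever follows it.
            value, _end = decoder.raw_decode(text, index)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace; try the next one
        if isinstance(value, dict):
            return value
    return None
```

Returning `None` instead of raising lets the caller decide whether to retry the request or fall back to manual review.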

Key Features

Multimodal Support - Text, images, documents (audio and video are not supported)
Large Context - Up to 200,000 tokens
Tool Use - Not supported
JSON Mode - Not available
Streaming - Real-time generation
Cost Effective - $0.08 input and $0.25 output per 1M tokens

Best For

Document extraction
Form processing
Table understanding
Content analysis

Data & Usage Policies

Policy	Status
Training Data	Not used for training
Prompt Retention	Does not retain prompts
Data Processing	Google Cloud privacy compliant

Status & Availability

Status: Stable
Free Tier: No
Provider: Google

API Usage Example

curl https://api.langmart.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "google/gemini-doc-understanding",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
  }'
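
The same request can be built from Python with only the standard library. This sketch mirrors the URL, headers, and body of the curl call above; `build_request` and `send_request` are hypothetical helper names, and the response is returned as raw text because its schema is not documented on this page.

```python
import json
import urllib.request

API_URL = "https://api.langmart.ai/v1/chat/completions"
MODEL_ID = "google/gemini-doc-understanding"


def build_request(api_key: str, prompt: str, max_tokens: int = 4096) -> urllib.request.Request:
    """Build the POST request shown in the curl example (nothing is sent here)."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_request(request: urllib.request.Request) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

A call such as `send_request(build_request("YOUR_API_KEY", "Hello"))` corresponds to the curl command; keeping construction separate from sending makes the payload easy to inspect or log first.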

Related Models

google/gemini-3-pro-preview - Latest flagship
google/gemini-2.5-pro - Advanced 2.5
google/gemini-2.0-flash - Fast multimodal
google/gemma-3-27b-it - Open-source

Source

Generated for LangMart AI Platform on 2025-12-28
