Google: Multimodal Understanding Pro
Model Overview
| Property | Value |
| --- | --- |
| Model ID | google/multimodal-understanding-pro |
| Name | Multimodal Understanding Pro |
| Status | Preview |
| Released | 2025-10-20 |
Description
Multimodal Understanding Pro is a multimodal model from Google, currently in preview. It accepts text, image, audio, video, and document inputs, and it supports tool use and JSON mode.
Specifications
| Spec | Value |
| --- | --- |
| Context Window | 500,000 tokens |
| Max Output | 4,096 tokens |
| Modalities | text, image, audio, video, documents |
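The two token limits above interact when sizing a request. The following is a minimal sketch of that arithmetic, not documented platform behavior. The function name `pick_max_tokens` is hypothetical, and the sketch conservatively assumes that output tokens count against the 500,000-token context window, which the source does not state.

```shell
# Limits taken from the Specifications table.
CONTEXT_WINDOW=500000
MAX_OUTPUT=4096

# pick_max_tokens PROMPT_TOKENS REQUESTED_OUTPUT
# Prints the largest usable max_tokens value, or fails if the prompt cannot fit.
# ASSUMPTION: output tokens share the context window with the prompt.
pick_max_tokens() {
  prompt_tokens=$1
  requested=$2
  remaining=$((CONTEXT_WINDOW - prompt_tokens))
  if [ "$remaining" -le 0 ]; then
    echo "prompt does not fit in the context window" >&2
    return 1
  fi
  # Start from the model's output cap, then shrink to what is left and what was asked for.
  limit=$MAX_OUTPUT
  if [ "$remaining" -lt "$limit" ]; then limit=$remaining; fi
  if [ "$requested" -lt "$limit" ]; then limit=$requested; fi
  echo "$limit"
}

# Example: a 120,000-token prompt asking for 8,000 output tokens is capped at 4096.
pick_max_tokens 120000 8000
```

The prompt token count has to come from a tokenizer or an estimate; the source does not say how tokens are counted for non-text inputs.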
Pricing
| Type | Price |
| --- | --- |
| Input | $2.00 / 1M tokens |
| Output | $6.00 / 1M tokens |
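As a rough illustration of how the per-token prices combine, the sketch below estimates the cost of a single request. The function name `estimate_cost` is hypothetical, and the calculation assumes every token is billed at the listed text rate; the source does not say whether image, audio, or video input is priced differently.

```shell
# Prices from the Pricing table, in USD per 1M tokens.
INPUT_PRICE=2.00
OUTPUT_PRICE=6.00

# estimate_cost INPUT_TOKENS OUTPUT_TOKENS
# Prints the estimated request cost in USD, to four decimal places.
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" \
      -v in_price="$INPUT_PRICE" -v out_price="$OUTPUT_PRICE" \
      'BEGIN { printf "%.4f\n", (in_tok * in_price + out_tok * out_price) / 1000000 }'
}

# Example: a 100,000-token prompt with a full 4,096-token reply.
estimate_cost 100000 4096
```

For that example the input side contributes $0.20 and the output side about $0.02, so large prompts dominate the bill even though output tokens cost three times as much each.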
Capabilities
- Text: Yes
- Image: Yes
- Audio: Yes
- Video: Yes
- Tool Use: Yes
- JSON Mode: Yes
Key Features
- Multimodal Support - Text, images, audio, and video
- Large Context - Up to 500,000 tokens
- Tool Use - Supported
- JSON Mode - Supported
- Streaming - Real-time generation
- Cost Effective - $2.00 input and $6.00 output per 1M tokens
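The source lists JSON mode and streaming as supported but does not show how to request them. The sketch below builds two request bodies under a loud assumption: that the endpoint accepts the OpenAI-style fields `response_format` and `stream`, which its `/v1/chat/completions` path suggests but which is not confirmed here. The function names are hypothetical.

```shell
MODEL="google/multimodal-understanding-pro"

# ASSUMPTION: "response_format" is an OpenAI-style field name; the source
# does not document how JSON mode is enabled on this platform.
json_mode_payload() {
  cat <<EOF
{
  "model": "$MODEL",
  "messages": [{"role": "user", "content": "Return a JSON object with a key named colors holding three color names."}],
  "response_format": {"type": "json_object"},
  "max_tokens": 256
}
EOF
}

# ASSUMPTION: "stream" is an OpenAI-style field name; not confirmed by the source.
stream_payload() {
  cat <<EOF
{
  "model": "$MODEL",
  "messages": [{"role": "user", "content": "Hello"}],
  "stream": true,
  "max_tokens": 256
}
EOF
}

# Print one payload for inspection. To send it, pipe it to curl with "-d @-";
# for streaming, curl's -N flag disables output buffering.
json_mode_payload
```

If either field is rejected, the platform's own API reference is the place to check the actual parameter names.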
Best For
- Document analysis
- Multimedia processing
- Enterprise apps
- Research documents
Data & Usage Policies
| Policy | Status |
| --- | --- |
| Training Data | Not used for training |
| Prompt Retention | Does not retain prompts |
| Data Processing | Google Cloud privacy compliant |
Status & Availability
- Status: PREVIEW
- Free Tier: No
- Provider: Google
API Usage Example
curl https://api.langmart.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "model": "google/multimodal-understanding-pro",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 4096
  }'
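For scripted use, the same request can be wrapped in a small helper. This is a sketch, not an official client: the function names and the `LANGMART_API_KEY` environment variable are hypothetical, and the request body simply mirrors the curl example above.

```shell
API_URL="https://api.langmart.ai/v1/chat/completions"
MODEL="google/multimodal-understanding-pro"

# Escape backslashes and double quotes so the prompt is safe inside a JSON string.
# Note: this does not handle newlines or other control characters.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# build_payload PROMPT [MAX_TOKENS]
# Prints the JSON request body; MAX_TOKENS defaults to the model's 4096 cap.
build_payload() {
  prompt=$(json_escape "$1")
  max_tokens=${2:-4096}
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "max_tokens": %s}\n' \
    "$MODEL" "$prompt" "$max_tokens"
}

# chat PROMPT [MAX_TOKENS]
# Sends the request. LANGMART_API_KEY is a hypothetical variable name.
chat() {
  if [ -z "${LANGMART_API_KEY:-}" ]; then
    echo "LANGMART_API_KEY is not set" >&2
    return 1
  fi
  # --fail makes curl exit non-zero on HTTP errors; -d @- reads the body from stdin.
  build_payload "$@" | curl -sS --fail "$API_URL" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer $LANGMART_API_KEY" \
    -d @-
}
```

With the key exported, `chat "Hello"` issues the same call as the example above. Reading the key from the environment keeps it out of shell history and source files.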
Related Models
- google/gemini-3-pro-preview - Latest flagship
- google/gemini-2.5-pro - Advanced 2.5 model
- google/gemini-2.0-flash - Fast multimodal
- google/gemma-3-27b-it - Open-source alternative
Source
Generated for LangMart AI Platform on 2025-12-28