
Vertex AI Video Generation (Veo)

LiteLLM supports Vertex AI's Veo video generation models using the unified OpenAI video API surface.

| Property | Details |
|----------|---------|
| Description | Google Cloud Vertex AI Veo video generation models |
| Provider Route on LiteLLM | vertex_ai/ |
| Supported Models | veo-2.0-generate-001, veo-3.0-generate-preview, veo-3.0-fast-generate-preview, veo-3.1-generate-preview, veo-3.1-fast-generate-preview |
| Cost Tracking | ✅ Duration-based pricing |
| Logging Support | ✅ Full request/response logging |
| Proxy Server Support | ✅ Full proxy integration with virtual keys |
| Spend Management | ✅ Budget tracking and rate limiting |
| Link to Provider Doc | Vertex AI Veo Documentation ↗ |

Quick Start

Required Environment Setup

import os

os.environ["VERTEXAI_PROJECT"] = "your-gcp-project-id"
os.environ["VERTEXAI_LOCATION"] = "us-central1"

# Option 1: Point to a service account file
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/path/to/service_account.json"

# Option 2: Store the service account JSON directly
with open("/path/to/service_account.json", "r", encoding="utf-8") as f:
    os.environ["VERTEXAI_CREDENTIALS"] = f.read()

Basic Usage

from litellm import video_generation, video_status, video_content
import time

with open("/path/to/service_account.json", "r", encoding="utf-8") as f:
    vertex_credentials = f.read()

response = video_generation(
    model="vertex_ai/veo-3.0-generate-preview",
    prompt="A cat playing with a ball of yarn in a sunny garden",
    vertex_project="your-gcp-project-id",
    vertex_location="us-central1",
    vertex_credentials=vertex_credentials,
    seconds="8",
    size="1280x720",
)

print(f"Video ID: {response.id}")
print(f"Initial Status: {response.status}")

# Poll for completion
while True:
    status = video_status(
        video_id=response.id,
        vertex_project="your-gcp-project-id",
        vertex_location="us-central1",
        vertex_credentials=vertex_credentials,
    )

    print(f"Current Status: {status.status}")

    if status.status == "completed":
        break
    if status.status == "failed":
        raise RuntimeError("Video generation failed")

    time.sleep(10)

# Download the rendered video
video_bytes = video_content(
    video_id=response.id,
    vertex_project="your-gcp-project-id",
    vertex_location="us-central1",
    vertex_credentials=vertex_credentials,
)

with open("generated_video.mp4", "wb") as f:
    f.write(video_bytes)

Supported Models

| Model Name | Description | Max Duration | Status |
|------------|-------------|--------------|--------|
| veo-2.0-generate-001 | Veo 2.0 video generation | 5 seconds | GA |
| veo-3.0-generate-preview | Veo 3.0 high quality | 8 seconds | Preview |
| veo-3.0-fast-generate-preview | Veo 3.0 fast generation | 8 seconds | Preview |
| veo-3.1-generate-preview | Veo 3.1 high quality | 10 seconds | Preview |
| veo-3.1-fast-generate-preview | Veo 3.1 fast generation | 10 seconds | Preview |

Video Generation Parameters

LiteLLM converts OpenAI-style parameters to Veo's API shape automatically:

| OpenAI Parameter | Vertex AI Parameter | Description | Example |
|------------------|---------------------|-------------|---------|
| prompt | instances[].prompt | Text description of the video | "A cat playing" |
| size | parameters.aspectRatio | Converted to 16:9 or 9:16 | "1280x720" → 16:9 |
| seconds | parameters.durationSeconds | Clip length in seconds | "8" → 8 |
| input_reference | instances[].image | Reference image for animation | open("image.jpg", "rb") |
| Provider-specific params | extra_body | Forwarded to the Vertex AI API | {"negativePrompt": "blurry"} |
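
For example, the following call (a minimal sketch; the project and prompt values are placeholders, and credentials are taken from the environment setup above) combines OpenAI-style parameters with a provider-specific negativePrompt forwarded through extra_body:

from litellm import video_generation

# Sketch only: OpenAI-style prompt/size/seconds are translated to Veo's
# instances/parameters shape; extra_body fields are forwarded unchanged.
response = video_generation(
    model="vertex_ai/veo-3.0-generate-preview",
    prompt="A cat playing",                   # -> instances[].prompt
    size="1280x720",                          # -> parameters.aspectRatio = "16:9"
    seconds="8",                              # -> parameters.durationSeconds = 8
    vertex_project="your-gcp-project-id",
    vertex_location="us-central1",
    extra_body={"negativePrompt": "blurry"},  # forwarded to the Vertex AI API
)
print(response.id)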

Size to Aspect Ratio Mapping

  • 1280x720, 1920x1080 → 16:9
  • 720x1280, 1080x1920 → 9:16
  • Unknown sizes default to 16:9
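
The conversion is equivalent to this small sketch (the helper name is illustrative only, not part of LiteLLM's public API):

# Illustrative mapping of OpenAI-style sizes to Veo aspect ratios.
def size_to_aspect_ratio(size: str) -> str:
    mapping = {
        "1280x720": "16:9",
        "1920x1080": "16:9",
        "720x1280": "9:16",
        "1080x1920": "9:16",
    }
    return mapping.get(size, "16:9")  # unknown sizes default to 16:9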

Async Usage

from litellm import avideo_generation, avideo_status, avideo_content
import asyncio

with open("/path/to/service_account.json", "r", encoding="utf-8") as f:
    vertex_credentials = f.read()


async def workflow():
    response = await avideo_generation(
        model="vertex_ai/veo-3.1-generate-preview",
        prompt="Slow motion water droplets splashing into a pool",
        seconds="10",
        vertex_project="your-gcp-project-id",
        vertex_location="us-central1",
        vertex_credentials=vertex_credentials,
    )

    while True:
        status = await avideo_status(
            video_id=response.id,
            vertex_project="your-gcp-project-id",
            vertex_location="us-central1",
            vertex_credentials=vertex_credentials,
        )

        if status.status == "completed":
            break
        if status.status == "failed":
            raise RuntimeError("Video generation failed")

        await asyncio.sleep(10)

    video_bytes = await avideo_content(
        video_id=response.id,
        vertex_project="your-gcp-project-id",
        vertex_location="us-central1",
        vertex_credentials=vertex_credentials,
    )

    with open("veo_water.mp4", "wb") as f:
        f.write(video_bytes)


asyncio.run(workflow())

LiteLLM Proxy Usage

Add Veo models to your config.yaml:

model_list:
  - model_name: veo-3
    litellm_params:
      model: vertex_ai/veo-3.0-generate-preview
      vertex_project: os.environ/VERTEXAI_PROJECT
      vertex_location: os.environ/VERTEXAI_LOCATION
      vertex_credentials: os.environ/VERTEXAI_CREDENTIALS

Start the proxy and make requests:

# Step 1: Generate video
curl --location 'http://0.0.0.0:4000/videos/generations' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer sk-1234' \
  --data '{
    "model": "veo-3",
    "prompt": "Aerial shot over a futuristic city at sunrise",
    "seconds": "8"
  }'

# Step 2: Poll status
curl --location 'http://0.0.0.0:4000/videos/status' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer sk-1234' \
  --data '{
    "video_id": "projects/.../operations/..."
  }'

# Step 3: Download video
curl --location 'http://0.0.0.0:4000/videos/retrieval' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer sk-1234' \
  --data '{
    "video_id": "projects/.../operations/..."
  }' \
  --output veo_city.mp4
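
The same three steps can also be scripted with any HTTP client. Below is a minimal sketch using the requests library; the base URL and API key are the placeholders from above, and the id/status field names mirror the SDK responses shown earlier, so adjust them if your deployment differs:

import time
import requests

BASE_URL = "http://0.0.0.0:4000"  # placeholder proxy address
HEADERS = {"Authorization": "Bearer sk-1234", "Content-Type": "application/json"}

# Step 1: Generate video
generation = requests.post(
    f"{BASE_URL}/videos/generations",
    headers=HEADERS,
    json={"model": "veo-3", "prompt": "Aerial shot over a futuristic city at sunrise", "seconds": "8"},
).json()
video_id = generation["id"]

# Step 2: Poll status until the operation finishes
while True:
    status = requests.post(
        f"{BASE_URL}/videos/status", headers=HEADERS, json={"video_id": video_id}
    ).json()
    if status["status"] in ("completed", "failed"):
        break
    time.sleep(10)

# Step 3: Download the video bytes
video = requests.post(f"{BASE_URL}/videos/retrieval", headers=HEADERS, json={"video_id": video_id})
with open("veo_city.mp4", "wb") as f:
    f.write(video.content)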

Cost Tracking

LiteLLM records the duration returned by Veo so you can apply duration-based pricing.

with open("/path/to/service_account.json", "r", encoding="utf-8") as f:
    vertex_credentials = f.read()

response = video_generation(
    model="vertex_ai/veo-2.0-generate-001",
    prompt="Flowers blooming in fast forward",
    seconds="5",
    vertex_project="your-gcp-project-id",
    vertex_location="us-central1",
    vertex_credentials=vertex_credentials,
)

print(response.usage)  # {"duration_seconds": 5.0}
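
To turn that duration into a spend estimate with your own rate, the arithmetic is simple; the per-second price below is a placeholder, not published Veo pricing:

PRICE_PER_SECOND_USD = 0.50  # placeholder rate, not actual Veo pricing
duration = response.usage["duration_seconds"]  # dict-style usage as shown above
print(f"Estimated cost: ${duration * PRICE_PER_SECOND_USD:.2f}")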

Troubleshooting

  • vertex_project is required: set the VERTEXAI_PROJECT env var or pass vertex_project in the request.
  • Permission denied: ensure the service account has the Vertex AI User role and the correct region is enabled.
  • Video stuck in processing: Veo operations are long-running. Continue polling every 10–15 seconds for up to ~10 minutes; see the sketch after this list.
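
A minimal polling loop with a deadline, reusing response.id and vertex_credentials from the Quick Start (the 10-minute cutoff is a suggestion, not an API limit):

import time

from litellm import video_status

deadline = time.monotonic() + 10 * 60  # give up after ~10 minutes
while True:
    status = video_status(
        video_id=response.id,
        vertex_project="your-gcp-project-id",
        vertex_location="us-central1",
        vertex_credentials=vertex_credentials,
    )
    if status.status in ("completed", "failed"):
        break
    if time.monotonic() > deadline:
        raise TimeoutError("Veo operation still processing after 10 minutes")
    time.sleep(15)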

See Also