How to Access Google Veo via API

Andrew Adams

·9 min read

Google Veo is one of the most capable AI video generation models available today, and accessing it programmatically opens up powerful automation possibilities. Whether you are building a content pipeline, prototyping a video app, or chaining Veo with other models, Wireflow lets you orchestrate multi-model workflows that include video generation nodes alongside image, audio, and text models through a single API.

This guide walks you through every step required to call Google Veo from code, covering both the official Vertex AI route and third-party providers that simplify authentication. For a hands-on look at chaining video models like Veo into automated pipelines via REST, check out the API overview documentation.

What Is Google Veo?

Google Veo is a family of text-to-video and image-to-video models developed by Google DeepMind. The lineup currently includes Veo 2, Veo 3, and Veo 3.1, each improving on output quality, audio synchronization, and generation speed. Veo 3 introduced native audio generation alongside video in a single inference pass, while Veo 3.1 added faster variants (Lite and Fast) for lower-latency use cases, making it a strong contender among AI video generators.

On Google Cloud, all Veo models are served through the Vertex AI platform. There is no separate Veo service to enable; you access Veo the same way you access Imagen or Gemini, through the Vertex AI prediction API. This means you need a Google Cloud project, billing enabled, and proper authentication before you can make your first request. Google also exposes Veo through the Gemini API, which this guide does not cover.

Prerequisites

Before making any API calls, complete these setup steps:

  1. Create a Google Cloud project at console.cloud.google.com. If you already have a project, you can reuse it.
  2. Enable billing by linking a payment method to your project. Veo API calls are not available on the free tier.
  3. Enable the Vertex AI API in the APIs & Services dashboard. Search for "Vertex AI API" and click Enable.
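  4. Install the Google Cloud CLI (gcloud) and authenticate locally. Run gcloud auth login so that gcloud auth print-access-token (used in the curl examples below) can mint access tokens, and run gcloud auth application-default login to create the Application Default Credentials file that client libraries read.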
  5. Select a supported region. Veo is available in us-central1, us-east4, europe-west4, and asia-northeast1. Choose the region closest to your infrastructure for lower latency.

These prerequisites apply regardless of which Veo model version you plan to use. The AI pipeline automation features in most workflow platforms handle this configuration once, then reuse it across all subsequent requests.

[Figure: Google Cloud Vertex AI console setup for Veo API]

Step-by-Step: Making Your First Veo API Request

Step 1: Construct the Endpoint URL

The Veo prediction endpoint follows Vertex AI's standard format:

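POST https://{REGION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{REGION}/publishers/google/models/{MODEL_ID}:predictLongRunning

Replace {PROJECT_ID} with your Google Cloud project ID, {REGION} with one of the supported regions listed above (it appears in both the hostname and the path), and {MODEL_ID} with the versioned ID of the model you want, for example veo-3.0-generate-001 or veo-2.0-generate-001. Model IDs change as previews graduate, so confirm the exact ID for the Veo 3.1 variants in Model Garden. Because generation runs as a long-running operation, the method is predictLongRunning rather than the synchronous predict.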
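The URL assembly above can be sketched as a small helper. This is a minimal sketch, not an official client: the function name veo_endpoint is hypothetical, and the regional hostname, the predictLongRunning method, and the example model ID reflect the Vertex AI conventions as I understand them, so verify them against the current reference before relying on them.

```python
def veo_endpoint(project_id: str, region: str, model_id: str,
                 method: str = "predictLongRunning") -> str:
    """Build a Vertex AI publisher-model URL for a Veo model.

    Assumption: the API is served from a regional host of the form
    {region}-aiplatform.googleapis.com and the region is repeated in the path.
    """
    host = f"{region}-aiplatform.googleapis.com"
    return (
        f"https://{host}/v1/projects/{project_id}/locations/{region}"
        f"/publishers/google/models/{model_id}:{method}"
    )


# Example: the model ID here is a placeholder; check Model Garden for current IDs.
print(veo_endpoint("my-project", "us-central1", "veo-3.0-generate-001"))
```

Keeping the method name as a parameter lets the same helper build the status-polling URL later.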

Step 2: Build the Request Body

Veo expects a JSON payload with instances and parameters fields. Here is a minimal example that generates an 8-second clip from a text prompt:

{
  "instances": [
    {
      "prompt": "A drone shot of a coastal village at golden hour, waves gently lapping against a stone harbor"
    }
  ],
  "parameters": {
    "aspectRatio": "16:9",
    "durationSeconds": 8,
    "resolution": "1080p",
    "sampleCount": 1
  }
}

The sampleCount field controls how many video variants to generate per request (1 to 4). Higher counts multiply both generation time and cost. Most production use cases stick with 1 and handle retries at the application level, similar to how REST-based AI pipelines manage reliability.
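If you build the payload in application code rather than by hand, a helper along these lines keeps the structure in one place. It is a sketch under assumptions: build_veo_request is a hypothetical name, and the JSON keys (notably durationSeconds for the clip length) follow the Vertex AI Veo reference as I understand it.

```python
import json


def build_veo_request(prompt, aspect_ratio="16:9", duration_seconds=8,
                      resolution="1080p", sample_count=1, seed=None):
    """Return the request body as a dict with 'instances' and 'parameters'."""
    parameters = {
        "aspectRatio": aspect_ratio,
        "durationSeconds": duration_seconds,  # assumed key name for clip length
        "resolution": resolution,
        "sampleCount": sample_count,
    }
    if seed is not None:
        # Only send a seed when the caller wants reproducible output.
        parameters["seed"] = seed
    return {"instances": [{"prompt": prompt}], "parameters": parameters}


body = build_veo_request("A drone shot of a coastal village at golden hour")
print(json.dumps(body, indent=2))
```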

Step 3: Send the Request with curl

curl -X POST \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1/publishers/google/models/veo-3.0-generate-001:predictLongRunning" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d '{
    "instances": [{"prompt": "A drone shot of a coastal village at golden hour"}],
    "parameters": {"aspectRatio": "16:9", "durationSeconds": 8, "resolution": "1080p", "sampleCount": 1}
  }'

This returns immediately with a JSON object whose name field holds the full operation name. Video generation is asynchronous, so you need to poll for completion.
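For reference, the same request can be prepared from Python with only the standard library. The sketch below builds the request object without sending it; make_veo_request and operation_name are hypothetical helpers, and reading the operation identifier from a name field is an assumption based on the usual Google long-running operation shape.

```python
import json
import urllib.request


def make_veo_request(url: str, access_token: str, body: dict) -> urllib.request.Request:
    """Prepare (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


def operation_name(response_json: dict) -> str:
    """Extract the operation identifier (assumed to live in the 'name' field)."""
    return response_json["name"]


# To send: urllib.request.urlopen(req) and json.load() the result.
# The token can come from `gcloud auth print-access-token`, as in the curl example.
```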

[Figure: Veo API request and response flow]

Step 4: Poll for Results

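Pass the operation name from the initial response to the model's fetchPredictOperation method to check generation status:

curl -X POST \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/my-project/locations/us-central1/publishers/google/models/veo-3.0-generate-001:fetchPredictOperation" \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  -d '{"operationName": "{OPERATION_NAME}"}'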

Generation typically takes 60 to 180 seconds for an 8-second clip at 1080p. Keep polling until the operation reports done: true, at which point the response lists the generated videos. If you set a storageUri parameter in the request, each video entry carries a gcsUri pointing to the Cloud Storage object; otherwise the video bytes come back base64-encoded in the response. Polling with exponential backoff (start at 2 seconds, multiply by 1.5, cap at 15 seconds) is a sensible pattern for model chaining workflows that depend on the output.
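The backoff schedule described above (start at 2 seconds, multiply by 1.5, cap at 15 seconds) can be sketched as follows. The function names are hypothetical, the status call is injected as fetch_status so the loop stays independent of any HTTP client, and treating a done flag as the completion signal is an assumption based on the standard long-running operation shape.

```python
import time


def backoff_delays(start=2.0, factor=1.5, cap=15.0):
    """Yield an endless sequence of delays: 2, 3, 4.5, ... capped at 15 seconds."""
    delay = start
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


def wait_for_operation(fetch_status, timeout_s=600.0, sleep=time.sleep):
    """Call fetch_status() until it returns a finished operation or time runs out.

    fetch_status: zero-argument callable returning the operation as a dict.
    """
    deadline = time.monotonic() + timeout_s
    for delay in backoff_delays():
        operation = fetch_status()
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError(f"Generation failed: {operation['error']}")
            return operation
        if time.monotonic() + delay > deadline:
            raise TimeoutError("Video generation did not finish in time")
        sleep(delay)
```

Passing sleep as a parameter makes the loop easy to test without real waiting.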

Step 5: Download the Video

Once the operation completes, download the video from the returned Cloud Storage URI:

gsutil cp gs://your-bucket/generated-video.mp4 ./output.mp4

Or use the Cloud Storage JSON API if you prefer HTTP-based downloads without the gsutil CLI.
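For the HTTP route, a gs:// URI has to be translated into a Cloud Storage JSON API media URL. The helper below is a minimal sketch (gcs_media_url is a hypothetical name); it relies on the objects.get endpoint with alt=media and on URL-encoding the object name, including slashes. The download request still needs the same Authorization: Bearer header used elsewhere in this guide.

```python
from urllib.parse import quote


def gcs_media_url(gs_uri: str) -> str:
    """Convert gs://bucket/path/to/object into a JSON API download URL."""
    prefix = "gs://"
    if not gs_uri.startswith(prefix):
        raise ValueError(f"Not a gs:// URI: {gs_uri!r}")
    bucket, _, object_name = gs_uri[len(prefix):].partition("/")
    if not bucket or not object_name:
        raise ValueError(f"URI must include a bucket and an object: {gs_uri!r}")
    # Object names are a single path segment in the JSON API, so '/' must be encoded.
    encoded = quote(object_name, safe="")
    return f"https://storage.googleapis.com/storage/v1/b/{bucket}/o/{encoded}?alt=media"


print(gcs_media_url("gs://your-bucket/generated-video.mp4"))
```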

Understanding Veo API Parameters

Parameter        Values                      Default    Notes
prompt           String (up to 1024 chars)   Required   Descriptive text for the video scene
aspectRatio      "16:9", "9:16", "1:1"       "16:9"     Vertical works well for social content
durationSeconds  4, 6, 8                     8          Clip length in seconds; longer clips cost more
resolution       "720p", "1080p"             "1080p"    720p is faster and cheaper
sampleCount      1-4                         1          Number of variants per request
seed             Integer                     Random     For reproducible outputs
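Checking values client-side before sending avoids paying for a round trip that the API would reject. The sketch below encodes the ranges from the table above and nothing more; validate_veo_params is a hypothetical helper, and the allowed values may differ per model version.

```python
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
ALLOWED_DURATIONS = {4, 6, 8}
ALLOWED_RESOLUTIONS = {"720p", "1080p"}


def validate_veo_params(prompt, aspect_ratio="16:9", duration=8,
                        resolution="1080p", sample_count=1):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    if not prompt or len(prompt) > 1024:
        problems.append("prompt must be 1 to 1024 characters")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"aspect ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if duration not in ALLOWED_DURATIONS:
        problems.append(f"duration must be one of {sorted(ALLOWED_DURATIONS)} seconds")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    if not 1 <= sample_count <= 4:
        problems.append("sample count must be between 1 and 4")
    return problems
```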

Veo 3 and 3.1 also support an image field in the instances array for image-to-video generation, where you supply a base64-encoded reference frame alongside the text prompt. This is useful for converting still images into video with consistent subject framing.
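An image-to-video instance can be assembled like this. It is a sketch under assumptions: image_to_video_instance is a hypothetical helper, and the nested key names bytesBase64Encoded and mimeType follow the Vertex AI convention for inline media as I understand it, so confirm them in the model reference.

```python
import base64


def image_to_video_instance(prompt: str, image_bytes: bytes,
                            mime_type: str = "image/png") -> dict:
    """Build one entry for the 'instances' array with an inline reference frame."""
    return {
        "prompt": prompt,
        "image": {
            # Assumed key names for inline image data.
            "bytesBase64Encoded": base64.b64encode(image_bytes).decode("ascii"),
            "mimeType": mime_type,
        },
    }


# Usage: read a local still and pair it with a motion prompt.
# with open("frame.png", "rb") as f:
#     instance = image_to_video_instance("Slow push-in on the harbor", f.read())
```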

Pricing and Rate Limits

Veo API pricing is based on video duration generated. As of early 2026, approximate costs are:

  • Veo 3 (1080p, 8 seconds): ~$0.35 per generation
  • Veo 3.1 Fast (720p, 4 seconds): ~$0.08 per generation
  • Veo 3.1 Lite: free tier available with daily limits

Rate limits vary by project quota. Default quotas allow approximately 10 concurrent generation requests, with daily limits scaling by billing tier. You can request quota increases through the Google Cloud console under IAM & Admin. Compare these limits with other platforms in this guide to content generation APIs.
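To get a rough feel for what a batch costs and how long it runs under a concurrency quota, the arithmetic can be sketched as below. The helper name estimate_batch is hypothetical, the prices are the approximate figures quoted above, and the model assumes requests run in full waves of max_concurrent, which is a simplification.

```python
import math
from decimal import Decimal


def estimate_batch(n_videos: int, price_per_generation: str,
                   max_concurrent: int = 10, seconds_per_generation=(60, 180)) -> dict:
    """Estimate total cost and wall-clock range for a batch of generations."""
    cost = Decimal(price_per_generation) * n_videos  # Decimal avoids float rounding
    waves = math.ceil(n_videos / max_concurrent)     # batches that run back to back
    fastest, slowest = seconds_per_generation
    return {
        "cost_usd": cost,
        "waves": waves,
        "min_seconds": waves * fastest,
        "max_seconds": waves * slowest,
    }


# 100 clips at ~$0.35 each with 10 concurrent requests.
print(estimate_batch(100, "0.35"))
```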

[Figure: Veo pricing tiers and rate limit comparison]

Third-Party Providers for Veo Access

If you want to skip the Google Cloud setup entirely, several third-party API providers offer Veo access with simpler authentication:

  • Together AI hosts Veo 3.0 with a standard API key (no Google Cloud account needed). Authentication is a single Bearer token, and pricing is per-second of generated video.
  • AI/ML API provides Veo 3.1 endpoints with OpenAI-compatible request formatting, making it easy to swap in alongside existing video generation code.
  • Pollo AI wraps Veo 3 behind their own endpoint with built-in CDN delivery for generated videos.

These providers handle all the Vertex AI authentication complexity on their end. The tradeoff is slightly higher per-generation costs and less control over region selection. For production systems that need to build AI workflows via API, third-party providers reduce initial integration time significantly.

Integrating Veo into Automated Pipelines

Calling Veo once is straightforward. The real value comes from integrating it into larger pipelines where video generation is one step in a multi-model sequence. Common patterns include:

  • Text to image to video: Generate a reference frame with an image model (Flux, Imagen), then pass it to Veo's image-to-video mode for consistent output.
  • Batch generation: Loop over a list of prompts and manage parallel generation requests with proper rate limiting and retry logic.
  • Post-processing chains: Feed Veo output into upscaling, audio overlay, or thumbnail extraction steps automatically.
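The batch pattern from the list above can be sketched with asyncio. This is an illustration rather than a production client: run_batch and generate_with_retry are hypothetical names, generate stands in for whatever coroutine submits a prompt and waits for the finished video, and the default of 10 concurrent requests mirrors the approximate quota mentioned earlier.

```python
import asyncio


async def generate_with_retry(generate, prompt, retries=3, base_delay=1.0):
    """Call generate(prompt), retrying failures with exponential backoff."""
    for attempt in range(retries):
        try:
            return await generate(prompt)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts; surface the last error
            await asyncio.sleep(base_delay * 2 ** attempt)


async def run_batch(generate, prompts, max_concurrent=10, retries=3, base_delay=1.0):
    """Run all prompts with at most max_concurrent in flight; results keep input order."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(prompt):
        async with semaphore:
            return await generate_with_retry(generate, prompt, retries, base_delay)

    return await asyncio.gather(*(worker(p) for p in prompts))
```

The semaphore caps in-flight requests on the client side, so a large prompt list does not exceed the project's concurrency quota.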

Visual pipeline platforms handle these patterns through node-based editors that connect to Veo and dozens of other models. Instead of writing boilerplate polling and retry code, you define the flow visually and execute it via API.

Try it yourself: Build this workflow in Wireflow to see a pre-configured text-to-video pipeline with the exact node setup discussed above.

Conclusion

Accessing Google Veo via API requires a Google Cloud project with billing and Vertex AI enabled, but the actual integration is clean once the prerequisites are in place. For teams that need to chain Veo with other models or automate video generation at scale, visual pipeline tools reduce the integration effort from days to minutes. Check current pricing tiers to see which plan fits your volume.

Frequently Asked Questions

Do I need a Google Cloud account to use the Veo API?

For the Vertex AI route described in this guide, yes: it requires a Google Cloud project with billing enabled. Google also offers Veo through the Gemini API, and third-party providers like Together AI and AI/ML API offer Veo access with their own simpler authentication.

Which Veo model version should I use?

For the highest quality output, use Veo 3. For faster generation at lower cost, Veo 3.1 Fast is a good choice. Veo 3.1 Lite offers a free tier with daily limits, making it suitable for prototyping. Each version uses the same API structure; only the model name in the endpoint URL changes.

How long does Veo take to generate a video?

Generation time ranges from 60 to 180 seconds for an 8-second clip at 1080p, depending on server load and the model variant. Veo 3.1 Fast is significantly quicker, typically completing in 30 to 60 seconds for shorter clips.

Can I generate video from an image with the Veo API?

Yes. Veo 3 and 3.1 support image-to-video generation. Include a base64-encoded image in the image field of the instances array alongside your text prompt. The model uses the image as a reference frame and animates from there.

What video formats does the Veo API return?

Generated videos are MP4 files. When you set a storageUri in the request, they are written to Google Cloud Storage and the completed operation includes a gcsUri for each file, which you can download via gsutil or the Cloud Storage REST API.

Is there a free tier for the Veo API?

Veo 3.1 Lite offers limited free usage with daily generation caps. Veo 3 and standard Veo 3.1 require a paid billing account. Third-party providers may offer free trial credits as well.

Can I use Veo in a production application?

Yes. Google Cloud's Vertex AI is designed for production workloads with SLAs, regional redundancy, and scalable quotas. Request quota increases through the Google Cloud console if the default limits are too low for your use case.

How does Veo compare to other video generation APIs?

Veo produces some of the highest-quality AI-generated video available, with native audio support in Veo 3 and above. Alternatives include Kling Video, Runway Gen-3, and Luma Dream Machine. The main differentiators are output quality, generation speed, and pricing per second of video.