Vertex AI Anthropic Integration
Connect Google Cloud Vertex AI to LLM Gateway to run Claude models on your own GCP project
Run Claude models (Sonnet, Opus, Haiku) on Google Cloud Vertex AI through LLM Gateway. This guide shows how to set up a GCP service account and integrate it with LLM Gateway using automatic OAuth2 token management — no manual token rotation required.
Prerequisites
- A Google Cloud project with billing enabled
- LLM Gateway account or self-hosted instance
Set up Google Cloud
Enable the Vertex AI API
In the Google Cloud Console, enable the Vertex AI API for your project.
Enable Claude Models in Model Garden
Navigate to Vertex AI > Model Garden in the Cloud Console. Search for the Claude models you want to use and click Enable on each one.
Available models:
- claude-sonnet-4-6
- claude-sonnet-4-5
- claude-haiku-4-5
- claude-opus-4-5
- claude-opus-4-6
- claude-opus-4-7
Create a Service Account
Create a service account with the required permissions:
# Create the service account
gcloud iam service-accounts create vertex-ai-caller \
--display-name="Vertex AI Caller" \
--project=YOUR_PROJECT_ID
# Grant the Vertex AI User role
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
--member="serviceAccount:vertex-ai-caller@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"
Download the Service Account Key
gcloud iam service-accounts keys create service-account.json \
--iam-account=vertex-ai-caller@YOUR_PROJECT_ID.iam.gserviceaccount.com
Then convert it to a single-line string:
cat service-account.json | tr -d '\n'
Keep the output handy; you'll paste it into LLM Gateway in the next steps.
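As an alternative to tr, the key can be flattened with a short Python sketch that also confirms the file parses as JSON. The function name to_single_line is illustrative and is not part of LLM Gateway or gcloud.

```python
import json


def to_single_line(raw_key: str) -> str:
    """Re-serialize a service account key as compact, single-line JSON.

    json.loads raises json.JSONDecodeError if the text is truncated or
    otherwise malformed, so a bad key fails here rather than later.
    """
    parsed = json.loads(raw_key)
    # separators=(",", ":") drops the whitespace json.dumps adds by default.
    # Newlines inside private_key stay escaped as \n, so the result is one line.
    return json.dumps(parsed, separators=(",", ":"))
```

Calling to_single_line(open("service-account.json").read()) returns the string to paste as the API Key.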
Add to LLM Gateway
Navigate to Provider Keys
- Log into LLM Gateway Dashboard
- Select your organization and project
- Go to Provider Keys in the sidebar
Add Vertex Anthropic Provider Key
- Click Add for Vertex AI (Anthropic)
- Paste the single-line service account JSON as the API Key
- Leave Region empty to use the recommended global endpoint, or set a specific region (e.g. us-east5) if you need data residency
- Click Add Key
The project ID is extracted automatically from the service account JSON — no separate project field is needed.
Test the Integration
curl -X POST https://api.llmgateway.io/v1/chat/completions \
-H "Authorization: Bearer YOUR_LLMGATEWAY_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "vertex-anthropic/claude-sonnet-4-6",
"messages": [
{
"role": "user",
"content": "Hello from Vertex Anthropic!"
}
]
}'
Replace YOUR_LLMGATEWAY_API_KEY with your LLM Gateway API key.
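The same request can be sketched in Python using only the standard library. The URL, headers, model name, and body mirror the curl command; the helper names build_chat_request and send are illustrative, and the shape of the JSON response is not described in this guide, so send simply returns the decoded body.

```python
import json
import urllib.request

GATEWAY_URL = "https://api.llmgateway.io/v1/chat/completions"


def build_chat_request(
    api_key: str,
    prompt: str,
    model: str = "vertex-anthropic/claude-sonnet-4-6",
) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

To run it, pass your LLM Gateway API key to build_chat_request and hand the result to send. A non-2xx status raises urllib.error.HTTPError, whose body usually carries the error message.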
Self-Host Configuration
If you're self-hosting LLM Gateway, configure the provider via environment variables instead of the dashboard:
LLM_VERTEX_ANTHROPIC_SERVICE_ACCOUNT_JSON={"type":"service_account","project_id":"YOUR_PROJECT_ID","private_key":"-----BEGIN PRIVATE KEY-----\n...\n-----END PRIVATE KEY-----\n","client_email":"vertex-ai-caller@YOUR_PROJECT_ID.iam.gserviceaccount.com","token_uri":"https://oauth2.googleapis.com/token"}
LLM_VERTEX_ANTHROPIC_REGION=global
The project ID is extracted automatically from the service account JSON, so no separate LLM_VERTEX_ANTHROPIC_PROJECT variable is needed.
How Token Refresh Works
LLM Gateway handles the OAuth2 token lifecycle automatically:
- On first request, the service account JSON is parsed and used to sign a JWT
- The JWT is exchanged for an OAuth2 access token via Google's token endpoint
- The token is cached in Redis with a 50-minute TTL (Google tokens expire after 60 minutes)
- An in-memory cache avoids Redis round-trips on subsequent requests
- When the cached token expires, a new one is generated transparently
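The first two steps follow Google's standard service account flow, which can be sketched as below. This is not LLM Gateway's actual code: the function names are illustrative, the cloud-platform scope is an assumption (the scope the gateway requests is not stated here), and the RS256 signature is left out because it needs a cryptography library outside the Python standard library.

```python
import base64
import json
import time
from typing import Optional

# Assumption: the broad cloud-platform scope; the gateway may request another.
SCOPE = "https://www.googleapis.com/auth/cloud-platform"


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def build_signing_input(service_account: dict, now: Optional[int] = None) -> str:
    """Return '<header>.<claims>', the string that is signed with private_key."""
    issued_at = int(time.time()) if now is None else now
    header = {"alg": "RS256", "typ": "JWT"}
    claims = {
        "iss": service_account["client_email"],
        "scope": SCOPE,
        "aud": service_account["token_uri"],
        "iat": issued_at,
        "exp": issued_at + 3600,  # Google caps the assertion lifetime at one hour
    }
    return (
        _b64url(json.dumps(header).encode("utf-8"))
        + "."
        + _b64url(json.dumps(claims).encode("utf-8"))
    )


def build_token_request_form(signed_jwt: str) -> dict:
    """Form fields POSTed to token_uri to exchange the JWT for an access token."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": signed_jwt,
    }
```

The signed JWT is the signing input, a dot, and the base64url-encoded RS256 signature of that input.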
This means:
- No manual gcloud auth print-access-token commands
- No cron jobs to refresh tokens
- Works at any request rate (token generation happens at most once per 50 minutes)
- Multi-instance deployments share the cached token via Redis
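The caching behavior described above can be illustrated with a minimal two-tier cache. This is a sketch under stated assumptions, not the gateway's implementation: a plain dict stands in for Redis, fetch_token stands in for the JWT exchange, and the class name and cache key are hypothetical.

```python
import time
from typing import Callable, Dict, Tuple

TOKEN_TTL_SECONDS = 50 * 60  # refresh 10 minutes before Google's 60-minute expiry

Entry = Tuple[str, float]  # (access token, expiry timestamp)


class TokenCache:
    def __init__(
        self,
        fetch_token: Callable[[], str],
        shared_store: Dict[str, Entry],
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._fetch_token = fetch_token  # stand-in for the JWT exchange
        self._shared = shared_store      # stand-in for Redis
        self._clock = clock
        self._local: Dict[str, Entry] = {}  # per-instance in-memory cache

    def get(self, key: str = "vertex-anthropic") -> str:
        now = self._clock()
        # 1. In-memory hit: no round-trip to the shared store.
        entry = self._local.get(key)
        if entry and entry[1] > now:
            return entry[0]
        # 2. Shared hit: another instance already generated a token.
        entry = self._shared.get(key)
        if entry and entry[1] > now:
            self._local[key] = entry
            return entry[0]
        # 3. Miss or expired: generate a new token and publish it.
        entry = (self._fetch_token(), now + TOKEN_TTL_SECONDS)
        self._shared[key] = entry
        self._local[key] = entry
        return entry[0]
```

Two TokenCache instances that share one store trigger a single fetch per 50-minute window, which is the multi-instance behavior listed above. A production version would also need to handle concurrent refreshes, which this sketch ignores.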
Available Regions
LLM Gateway defaults to the global endpoint, which Anthropic recommends: requests are routed dynamically to whichever region has capacity, and there is no pricing premium.
| Region | Notes |
|---|---|
| global | Default: dynamic routing, no pricing premium |
| us | Multi-region (US only); 10% premium |
| eu | Multi-region (EU only); 10% premium |
| us-east5 | Columbus, Ohio; 10% premium |
| us-central1 | Iowa; 10% premium |
| europe-west1 | Belgium; 10% premium |
| europe-west4 | Netherlands; 10% premium |
| asia-southeast1 | Singapore; 10% premium |
Regional and multi-region endpoints add a 10% pricing premium on Claude Sonnet 4.5 and newer models. They are also required if you need single-region data residency or provisioned throughput. See Anthropic's Vertex docs for details.
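To estimate the cost of a region choice, the premium can be expressed as a multiplier. The sketch below uses only the regions from the table and applies to Claude Sonnet 4.5 and newer models; the function name is illustrative and any base price you feed in is your own figure.

```python
# Regions listed in the table above.
KNOWN_REGIONS = frozenset({
    "global", "us", "eu", "us-east5", "us-central1",
    "europe-west1", "europe-west4", "asia-southeast1",
})
REGIONAL_PREMIUM = 0.10  # 10% on regional and multi-region endpoints


def price_multiplier(region: str) -> float:
    """Return the factor to apply to the global price for the given region."""
    if region not in KNOWN_REGIONS:
        raise ValueError(f"unknown region: {region!r}")
    return 1.0 if region == "global" else 1.0 + REGIONAL_PREMIUM
```

With a hypothetical base price of 3.00 per million tokens, the same traffic on us-east5 would cost 3.00 × 1.10 = 3.30 per million tokens.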
Available Models
Once configured, you can access Claude models on Vertex AI through LLM Gateway:
- Sonnet: vertex-anthropic/claude-sonnet-4-6, vertex-anthropic/claude-sonnet-4-5
- Opus: vertex-anthropic/claude-opus-4-7, vertex-anthropic/claude-opus-4-6, vertex-anthropic/claude-opus-4-5
- Haiku: vertex-anthropic/claude-haiku-4-5
Browse all available models at llmgateway.io/models.
Troubleshooting
401 UNAUTHENTICATED / ACCESS_TOKEN_TYPE_UNSUPPORTED
The gateway is sending an invalid token. Check:
- The service account JSON is valid and complete
- The service account has roles/aiplatform.user on the project
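A quick local check can rule out a malformed or truncated key before you look at IAM. The sketch below assumes the required fields are the ones shown in this guide's self-host example; the exact set LLM Gateway validates is not documented here, and the function name is illustrative.

```python
import json
from typing import List

# Assumption: the fields that appear in this guide's self-host example.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def find_key_problems(raw_key: str) -> List[str]:
    """Return human-readable problems with a service account key (empty if none)."""
    try:
        key = json.loads(raw_key)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(key, dict):
        return ["top-level JSON value is not an object"]
    problems = [
        f"missing or empty field: {name}"
        for name in REQUIRED_FIELDS
        if not key.get(name)
    ]
    if key.get("type") and key["type"] != "service_account":
        problems.append(f"unexpected type: {key['type']}")
    return problems
```

An empty list means the key is structurally sound, so the cause is more likely on the IAM side.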
403 Permission Denied
The service account lacks permissions. Grant the Vertex AI User role:
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
--member="serviceAccount:vertex-ai-caller@YOUR_PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"
Model Not Found
The Claude model may not be enabled in your project's Model Garden, or may not be available in the selected region. Check the Model Garden in Cloud Console.