jpskill.com

vertex-ai-gemini

Google Cloud Vertex AI for enterprise Gemini deployments — production scaling, fine-tuning, and MLOps. Use when deploying Gemini in GCP-native environments, running fine-tuning jobs, needing enterprise IAM controls, VPC isolation, batch prediction at scale, or production ML pipelines on Google Cloud.

⚡ Recommended: one-command install (60 seconds)

Copy the command below and paste it into a terminal (Mac/Linux) or PowerShell (Windows). Download → extract → install, fully automated.

🍎 Mac / 🐧 Linux
mkdir -p ~/.claude/skills && cd ~/.claude/skills && curl -L -o vertex-ai-gemini.zip https://jpskill.com/download/15535.zip && unzip -o vertex-ai-gemini.zip && rm vertex-ai-gemini.zip
🪟 Windows (PowerShell)
$d = "$env:USERPROFILE\.claude\skills"; ni -Force -ItemType Directory $d | Out-Null; iwr https://jpskill.com/download/15535.zip -OutFile "$d\vertex-ai-gemini.zip"; Expand-Archive "$d\vertex-ai-gemini.zip" -DestinationPath $d -Force; ri "$d\vertex-ai-gemini.zip"

When it finishes, restart Claude Code → just ask normally, e.g. "Deploy Gemini on Vertex AI", and the skill activates automatically.

💾 Manual download (if the command feels too technical)
  1. Click the blue button below to download vertex-ai-gemini.zip
  2. Double-click the ZIP file to extract it → a vertex-ai-gemini folder appears
  3. Move that folder to C:\Users\あなたの名前\.claude\skills\ (Windows) or ~/.claude/skills/ (Mac)
  4. Restart Claude Code

⚠️ Download and use at your own risk. This site accepts no responsibility for the content, behavior, or safety of any skill.

🎯 What this Skill can do

The description below explains what this Skill will do for you. It activates automatically whenever you ask Claude for help in this area.

📦 Installation (3 steps)

  1. Click the "Download" button above to get the .skill file
  2. Rename the extension from .skill to .zip and extract it (macOS can extract it automatically)
  3. Put the extracted folder in .claude/skills/ under your home folder
    • macOS / Linux: ~/.claude/skills/
    • Windows: %USERPROFILE%\.claude\skills\

Restart Claude Code and you're done. You don't need to say "use this Skill" — it is invoked automatically for any related request.

Last updated: 2026-05-18
Retrieved: 2026-05-18
Included files: 1
📖 Original SKILL.md that Claude reads (expanded below)

This text is the original (English or Chinese) intended for the AI (Claude) to read. Japanese translations are being added over time.

Vertex AI — Gemini on Google Cloud

Overview

Vertex AI is Google Cloud's enterprise ML platform. It provides access to the same Gemini models as Google AI Studio, but with enterprise-grade features: IAM-based auth (no API keys), VPC Service Controls for data isolation, audit logging, fine-tuning capabilities, batch prediction jobs, and integration with GCP data services like BigQuery and Cloud Storage.

Vertex AI vs Google AI Studio

| Feature | Google AI Studio | Vertex AI |
| --- | --- | --- |
| Auth | API key | Service account / IAM |
| Data residency | Limited | GCP regions |
| VPC isolation | ✗ | ✅ VPC Service Controls |
| Audit logging | ✗ | ✅ Cloud Audit Logs |
| Fine-tuning | ✗ | ✅ |
| Batch prediction | ✗ | ✅ |
| Pricing | Per token | Per token (different rates) |
| Quotas | Shared | Project-level quotas |

Naming note: "Vertex AI" is being rebranded to Agent Platform (full name: Gemini Enterprise Agent Platform). The endpoints, IAM roles, and SDKs are the same product — most documentation still uses the legacy "Vertex AI" name.

SDK Choice — Use the Unified Gen AI SDK

Google now ships a single google-genai SDK that targets both Agent Platform (Vertex) and Google AI Studio with the same code. Use this for all new code. The legacy google-cloud-aiplatform and vertexai modules are deprecated.

| New (use this) | Legacy (deprecated) |
| --- | --- |
| google-genai (Python) | google-cloud-aiplatform, google-generativeai |
| @google/genai (JS/TS) | @google-cloud/vertexai |
| google.golang.org/genai (Go) | cloud.google.com/go/vertexai |
| com.google.genai:google-genai (Java) | |
| Google.GenAI (.NET) | |
# Recommended: unified Gen AI SDK
pip install google-genai
import os
from google import genai

os.environ["GOOGLE_CLOUD_PROJECT"] = "my-project-id"
os.environ["GOOGLE_CLOUD_LOCATION"] = "global"  # routes to nearest region
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "true"

client = genai.Client()  # picks up env vars

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain containerization in simple terms.",
)
print(response.text)

Use location="global" by default — routes to the region with available capacity. Pin to a specific region (us-central1, europe-west4) only when data residency requires it.
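In practice this choice is just environment configuration. A minimal sketch of both setups (the project ID and the pinned region are placeholders; substitute your own):

```shell
# Default: let Google route to a region with available capacity
export GOOGLE_GENAI_USE_VERTEXAI=true
export GOOGLE_CLOUD_PROJECT=my-project-id
export GOOGLE_CLOUD_LOCATION=global

# Data-residency pinning: swap in the specific region your policy requires
# export GOOGLE_CLOUD_LOCATION=europe-west4
```

With these set, `genai.Client()` needs no constructor arguments; the same code runs against either routing mode.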

Setup (Legacy SDK — only for existing code)

pip install google-cloud-aiplatform
# Authenticate
gcloud auth application-default login

# Or use service account
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account.json
# Set project and location
export GOOGLE_CLOUD_PROJECT=my-project-id
export GOOGLE_CLOUD_LOCATION=us-central1

Instructions

The examples below use the legacy google-cloud-aiplatform SDK. For new code, prefer the unified google-genai SDK shown above — same capabilities, cross-platform, current best practice.

Basic Gemini Inference

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project-id", location="us-central1")

model = GenerativeModel("gemini-2.0-flash-001")
response = model.generate_content("Explain containerization in simple terms.")
print(response.text)

Multi-Modal Inference

import vertexai
from vertexai.generative_models import GenerativeModel, Part
import base64

vertexai.init(project="my-project-id", location="us-central1")
model = GenerativeModel("gemini-2.0-flash-001")

# Analyze image from Cloud Storage
gcs_image = Part.from_uri(
    uri="gs://my-bucket/product-photo.jpg",
    mime_type="image/jpeg",
)
response = model.generate_content(["Describe this product:", gcs_image])
print(response.text)

# Analyze local image
with open("chart.png", "rb") as f:
    image_data = f.read()

local_image = Part.from_data(data=image_data, mime_type="image/png")
response = model.generate_content(["What trends does this chart show?", local_image])
print(response.text)

Streaming Responses

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project-id", location="us-central1")
model = GenerativeModel("gemini-2.0-flash-001")

for chunk in model.generate_content("Write a product description for a smartwatch.", stream=True):
    print(chunk.text, end="", flush=True)
print()

Chat Session

import vertexai
from vertexai.generative_models import GenerativeModel, ChatSession

vertexai.init(project="my-project-id", location="us-central1")

model = GenerativeModel(
    model_name="gemini-2.0-flash-001",
    system_instruction="You are a GCP expert. Provide concise, actionable answers.",
)

chat = model.start_chat()
print(chat.send_message("How do I set up Cloud Run?").text)
print(chat.send_message("What about environment variables?").text)

Function Calling

import vertexai
from vertexai.generative_models import (
    FunctionDeclaration,
    GenerativeModel,
    Tool,
)

vertexai.init(project="my-project-id", location="us-central1")

get_bq_query = FunctionDeclaration(
    name="run_bigquery_query",
    description="Run a SQL query on BigQuery and return results",
    parameters={
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "SQL query to execute"},
            "dataset": {"type": "string", "description": "BigQuery dataset name"},
        },
        "required": ["query"],
    },
)

tool = Tool(function_declarations=[get_bq_query])
model = GenerativeModel("gemini-2.0-flash-001", tools=[tool])

response = model.generate_content("How many users signed up last week?")

if response.candidates[0].function_calls:
    fc = response.candidates[0].function_calls[0]
    print(f"Function: {fc.name}, Args: {dict(fc.args)}")
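The model only proposes the call; your code has to execute it and return the result. A minimal local dispatch sketch (the `run_bigquery_query` handler and its return shape are hypothetical stand-ins, not a real BigQuery client):

```python
# Hypothetical handler: in production this would call google.cloud.bigquery.
def run_bigquery_query(query, dataset="analytics"):
    # Stubbed result for illustration only.
    return {"rows": 42, "query": query, "dataset": dataset}

# Map function-call names the model may emit to real Python callables.
HANDLERS = {"run_bigquery_query": run_bigquery_query}

def dispatch(name, args):
    """Invoke the handler matching the model's function call."""
    return HANDLERS[name](**args)

result = dispatch(
    "run_bigquery_query",
    {"query": "SELECT COUNT(*) FROM users WHERE signup_date >= '2026-05-11'"},
)
print(result["rows"])  # 42
```

You would then feed `result` back to the model (e.g. via `Part.from_function_response(name=fc.name, response=result)` in the legacy SDK) so it can compose the final answer.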

Fine-Tuning Gemini

import vertexai
from vertexai.tuning import sft

vertexai.init(project="my-project-id", location="us-central1")

# Prepare training data in JSONL format in GCS:
# {"messages": [{"role": "user", "content": "..."}, {"role": "model", "content": "..."}]}

tuning_job = sft.train(
    source_model="gemini-2.0-flash-001",
    train_dataset="gs://my-bucket/training-data.jsonl",
    validation_dataset="gs://my-bucket/validation-data.jsonl",
    tuned_model_display_name="my-fine-tuned-gemini",
    epochs=3,
    learning_rate_multiplier=1.0,
)

print(f"Tuning job: {tuning_job.resource_name}")
print(f"State: {tuning_job.state}")

# Wait for completion
tuning_job.wait()
print(f"Tuned model: {tuning_job.tuned_model_name}")
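Malformed training records fail the job late and expensively, so it is worth validating the JSONL locally before uploading. A sketch of such a check, assuming the `messages` format shown in the comment above (the helper name and thresholds are illustrative):

```python
import json

def validate_sft_jsonl(lines, min_examples=100):
    """Check each JSONL line has the messages format expected for supervised tuning."""
    count = 0
    for line in lines:
        rec = json.loads(line)
        msgs = rec.get("messages")
        assert isinstance(msgs, list) and msgs, "each record needs a non-empty messages list"
        for m in msgs:
            assert m.get("role") in {"user", "model"}, f"unexpected role: {m.get('role')!r}"
            assert isinstance(m.get("content"), str), "content must be a string"
        count += 1
    return count, count >= min_examples

sample = ['{"messages": [{"role": "user", "content": "Hi"}, {"role": "model", "content": "Hello!"}]}']
count, enough = validate_sft_jsonl(sample)
print(count, enough)  # 1 False
```

Run it over the file before the `gsutil cp` to GCS; `enough` flags datasets below the ~100-example minimum mentioned in the guidelines.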

Batch Prediction

import vertexai
from vertexai.generative_models import GenerativeModel
from vertexai.preview.batch_prediction import BatchPredictionJob

vertexai.init(project="my-project-id", location="us-central1")

# Input JSONL format in GCS:
# {"request": {"contents": [{"role": "user", "parts": [{"text": "Translate: Hello"}]}]}}

job = BatchPredictionJob.submit(
    source_model="gemini-2.0-flash-001",
    input_dataset="gs://my-bucket/batch-inputs.jsonl",
    output_uri_prefix="gs://my-bucket/batch-outputs/",
)

print(f"Batch job: {job.resource_name}")
job.wait()
print(f"Output: {job.output_location}")

IAM Setup for Service Account

# Create a service account for your app
gcloud iam service-accounts create gemini-app-sa \
    --display-name="Gemini App Service Account"

# Grant Vertex AI User role
gcloud projects add-iam-policy-binding my-project-id \
    --member="serviceAccount:gemini-app-sa@my-project-id.iam.gserviceaccount.com" \
    --role="roles/aiplatform.user"

# Download key (for non-GCP environments)
gcloud iam service-accounts keys create key.json \
    --iam-account=gemini-app-sa@my-project-id.iam.gserviceaccount.com

VPC Service Controls (Enterprise Isolation)

# When VPC SC is enabled, all API calls must originate from within the perimeter
# Configure the SDK to use private endpoints:

import vertexai

vertexai.init(
    project="my-project-id",
    location="us-central1",
    api_endpoint="us-central1-aiplatform.googleapis.com",  # Regional endpoint
)

Available Gemini Models on Vertex AI

| Model ID | Notes |
| --- | --- |
| gemini-2.0-flash-001 | Latest Flash, fast + capable |
| gemini-1.5-pro-002 | 2M context, most capable |
| gemini-1.5-flash-002 | 1M context, balanced |
| text-embedding-005 | Latest embeddings (768 dims) |

Use gemini-2.0-flash-001 (version pinned) in production to avoid unexpected model changes.

Examples

Example 1 — Migrate a Python service from google-cloud-aiplatform to google-genai

User has a recommendation service running on Cloud Run that uses the legacy google-cloud-aiplatform SDK. Replace pip install google-cloud-aiplatform with pip install google-genai, swap vertexai.init(...) + GenerativeModel(...) for genai.Client() (with GOOGLE_GENAI_USE_VERTEXAI=true), update model.generate_content(...) to client.models.generate_content(model=..., contents=...). Keep the existing service account and IAM bindings — same auth, new SDK. Pin to gemini-2.5-flash for cost, validate parity in staging before cutover.

Example 2 — Run nightly batch translation of 5M product titles

User has 5M product titles in BigQuery to translate into 4 languages. Streaming inference would be slow and expensive. Format input as JSONL in GCS ({"request": {"contents": [...]}}) per row × language, submit a BatchPredictionJob against gemini-2.5-flash, and let it run unattended. Output JSONL lands in GCS, load it back into BigQuery via bq load. Cost is roughly half of streaming, runtime is hours not days.
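Building the input file is plain Python; a sketch of expanding rows × languages into the JSONL request shape shown in the batch prediction section (function name and prompt wording are illustrative):

```python
import json

def build_batch_requests(titles, languages):
    """One Gemini batch request per (title, language) pair, serialized as JSONL lines."""
    lines = []
    for title in titles:
        for lang in languages:
            req = {"request": {"contents": [{
                "role": "user",
                "parts": [{"text": f"Translate this product title to {lang}: {title}"}],
            }]}}
            lines.append(json.dumps(req, ensure_ascii=False))
    return lines

lines = build_batch_requests(["Wireless Earbuds"], ["ja", "de"])
print(len(lines))  # 2
```

Write the lines to a local file, upload it to GCS with `gsutil cp`, then pass the `gs://` URI as `input_dataset` to `BatchPredictionJob.submit`.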

Guidelines

  • Use google-genai for all new code; google-cloud-aiplatform and google-generativeai are deprecated.
  • Always pin model versions in production for stability — gemini-2.5-flash is fine for non-prod, but production should target a specific build.
  • Use Application Default Credentials (gcloud auth application-default login) during development.
  • In GKE or Cloud Run, use Workload Identity — no service account keys needed.
  • Default location="global" for the Gen AI SDK; pin to a region only for data residency.
  • Fine-tuning requires a training JSONL with messages format and at least 100 examples.
  • Batch prediction is cost-effective for offline bulk inference (no streaming).
  • Enable Cloud Audit Logs on the aiplatform.googleapis.com service for compliance.