If you've been writing code that calls Gemini's Batch API or Deep Research, you have probably written this loop more times than you would like to admit:
```python
while True:
    op = client.operations.get(name=op_name)
    if op.done:
        break
    time.sleep(30)
```
That loop is dead. On May 4, 2026, Google shipped event-driven Webhooks for the Gemini API — a push-based notification system that pings your server the instant a long-running job finishes. No more polling. No more wasted API calls. No more guessing the right time.sleep() interval.
Here's how to wire it up properly, with the gotchas that the official docs gloss over.
What actually changed
Until May 4, every long-running Gemini operation — Batch API jobs, Deep Research runs, video generation — required you to repeatedly call GET /operations/{name} to check status. Operations can take minutes or hours, so polling was either slow (long sleep intervals) or wasteful (short ones).
Webhooks invert the flow. You register an HTTPS endpoint, and the Gemini API sends a signed POST to that endpoint when the job state changes. The notification arrives within seconds of completion.
> "The implementation strictly adheres to the Standard Webhooks specification. Every request is signed using `webhook-signature`, `webhook-id`, and `webhook-timestamp` headers, ensuring idempotency and preventing replay attacks." — Google's Gemini API team
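Per the Standard Webhooks spec, the signed content is the message id, the timestamp, and the raw body joined with dots, run through HMAC-SHA256 with the decoded secret. A minimal sketch of that scheme, for intuition only — it assumes the secret is plain base64 without the `whsec_` prefix, and in practice the `standardwebhooks` library does all of this for you:

```python
import base64
import hashlib
import hmac

def sign(secret_b64: str, msg_id: str, timestamp: str, payload: bytes) -> str:
    """Standard Webhooks v1 signature: HMAC-SHA256 over 'id.timestamp.body'."""
    key = base64.b64decode(secret_b64)
    to_sign = f"{msg_id}.{timestamp}.".encode() + payload
    digest = hmac.new(key, to_sign, hashlib.sha256).digest()
    return "v1," + base64.b64encode(digest).decode()
```

Because the timestamp is part of the signed content, replaying an old request with a fresh timestamp invalidates the signature.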
Google guarantees at-least-once delivery with automatic retries for up to 24 hours. Translation: your handler must be idempotent. Process a webhook-id twice and you should get the same end state.
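The standard way to get that behavior is to key your work off the `webhook-id` header. A minimal sketch — the in-memory dict and the `handle_once` helper are illustrative, not part of any SDK; production code would use Redis or a database with a unique constraint:

```python
processed: dict[str, str] = {}  # webhook-id -> result; use Redis/DB in production

def handle_once(webhook_id: str, event: dict) -> str:
    """Process an event at most once per webhook-id; replays are no-ops."""
    if webhook_id in processed:
        return processed[webhook_id]  # duplicate delivery: same end state
    result = f"handled:{event['type']}"  # stand-in for real work
    processed[webhook_id] = result
    return result
```

Calling it twice with the same id does the work once and returns the same result both times.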
The two modes: static vs. dynamic
This is the part that trips up most first-time integrators. Webhooks come in two flavors with different security models:
| Mode | Scope | Auth | Use case |
|---|---|---|---|
| Static | Project-level (all jobs) | HMAC | Single global integration |
| Dynamic | Per-request (one job) | JWKS signatures | Multi-tenant routing, agent orchestration |
Static is the simple case: register one webhook URL for your project, every job notification goes there. Dynamic is what you want if you are building an agent platform — you bind a webhook to a specific request and pass user_metadata so your handler knows which tenant or user owned the job.
Setting up a static webhook in five minutes
You will need a public HTTPS endpoint. For local development use ngrok or cloudflared. Production endpoints need a valid TLS certificate.
Step 1 — Register the webhook with the Gemini API:
```python
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

webhook = client.webhooks.create(
    url="https://api.yourdomain.com/gemini-callback",
    events=[
        "batch.completed", "batch.failed",
        "deep_research.completed", "video.completed",
    ],
)

print(f"Webhook ID: {webhook.id}")
print(f"Signing secret: {webhook.secret}")  # Save this — you'll need it
```
Save the signing secret somewhere safe. You cannot retrieve it again.
Step 2 — Verify incoming requests in your handler. Do not skip this. Anyone can POST to your URL; signature verification is the only thing standing between you and a malicious actor injecting fake "job completed" events.
```python
from flask import Flask, request, abort
from standardwebhooks import Webhook

app = Flask(__name__)
wh = Webhook(WEBHOOK_SECRET)

@app.post("/gemini-callback")
def callback():
    try:
        payload = wh.verify(request.data, dict(request.headers))
    except Exception:
        abort(401)

    event_type = payload["type"]
    job_name = payload["data"]["operation"]

    if event_type == "batch.completed":
        process_batch_result(job_name)
    elif event_type == "batch.failed":
        log_failure(job_name, payload["data"].get("error"))
    return "", 200
```
Step 3 — Submit a Batch job and walk away:
```python
batch = client.batches.create(
    model="gemini-3.1-pro",
    requests=load_prompts("./prompts.jsonl"),
)
# No polling loop. Just exit.
```
When the batch finishes, Google posts to /gemini-callback. Your handler picks up the result and continues the workflow.
Dynamic webhooks: the agent-orchestration use case
If you are running an agent platform where users submit Deep Research jobs and you need to route results back to the right user, dynamic webhooks are the move. Bind a webhook on a per-request basis and stash a user_metadata blob:
```python
result = client.deep_research.create(
    query="market sizing for vertical-AI in legal tech",
    webhook={
        "url": "https://api.yourdomain.com/research-done",
        "user_metadata": {
            "tenant_id": "acme-corp",
            "user_id": "u_8821",
            "trace_id": "trc_a91f",
        },
    },
)
```
When the research completes, your callback receives the same user_metadata payload and can route accordingly. JWKS signatures keep tenants from spoofing each other.
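The receiving side then just reads the echoed metadata and dispatches. A sketch of that routing step — the payload shape mirrors the request above, and the function name and queue-naming scheme are mine, not Google's:

```python
def route_research_result(payload: dict) -> str:
    """Map a completed Deep Research notification to its tenant's queue."""
    meta = payload["data"]["user_metadata"]
    # In a real platform this would enqueue onto the tenant's result stream
    return f"queue:{meta['tenant_id']}:{meta['user_id']}"
```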
The mistakes I have already seen
Three things will burn you in the first week:
- Returning a non-2xx status crashes your delivery loop. Google retries failed deliveries with exponential backoff for 24 hours. If your handler 500s on every event for 24 hours, you lose those events. Always return 200, then queue the work for an internal worker.
- Treating duplicate events as bugs. "At-least-once" is not "exactly-once." Use the `webhook-id` header as your idempotency key.
- Forgetting to verify timestamps. The Standard Webhooks library validates `webhook-timestamp` to block replay attacks, but only if you pass the headers correctly. Double-check that your framework is forwarding raw header casing.
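The ack-then-queue pattern from the first point can be as simple as an in-process queue drained by a worker thread. A sketch — the function names are illustrative, and `drain_one` is what a worker loop would call repeatedly:

```python
import queue

events: "queue.Queue[dict]" = queue.Queue()

def callback_body(payload: dict):
    """Ack the delivery immediately; defer the real work."""
    events.put(payload)  # fast, never blocks the webhook response
    return "", 200       # success, so Google stops retrying

def drain_one() -> dict:
    """Worker side: pull one event and process it."""
    payload = events.get()
    # ... slow work here: download batch results, update the DB, etc. ...
    events.task_done()
    return payload
```

In production you would swap the in-process queue for something durable (Pub/Sub, SQS, a DB table) so queued events survive a restart.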
The Bottom Line
Polling for Gemini job status was always a kludge. Event-driven Webhooks make Gemini a first-class citizen in any modern async workflow — Lambda triggers, Cloud Run jobs, queue-driven agents. The setup takes ten minutes, removes a meaningful chunk of latency, and cuts API quota waste to near zero. If you're running anything against the Batch API or Deep Research today, migrate this week. The polling loop was tech debt the day Google shipped the alternative.
