Send Google Cloud traffic to Searchable (Pub/Sub push)

What this does

Google Cloud Logging captures every request that hits your HTTPS Load Balancer or Cloud CDN. We forward those logs to Searchable via a Pub/Sub topic with a push subscription, classify the AI bots at our edge, and drop everything else.

No code changes. All configuration is done with gcloud (or the GCP Console if you prefer click-ops).
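To make the data flow concrete, here is a minimal sketch of what one push delivery looks like once it is decoded. It assumes the standard Pub/Sub push envelope (a base64-encoded message.data field) wrapping a Cloud Logging LogEntry as JSON. The EXAMPLE_BOT_MARKERS list and the looksLikeAiBot helper are hypothetical and only illustrate the idea of classification; they are not Searchable's actual classifier.

```typescript
// Standard Pub/Sub push delivery body (only the fields this sketch needs).
interface PushEnvelope {
  message: { data: string; messageId: string; attributes?: Record<string, string> };
  subscription: string;
}

// Subset of a Cloud Logging LogEntry for load balancer request logs.
interface LogEntry {
  resource?: { type?: string };
  httpRequest?: { requestUrl?: string; userAgent?: string; status?: number };
}

// Hypothetical marker list, for illustration only.
const EXAMPLE_BOT_MARKERS = ["GPTBot"];

// Decode the base64 payload of a push delivery back into the LogEntry JSON.
function decodeLogEntry(envelope: PushEnvelope): LogEntry {
  const json = Buffer.from(envelope.message.data, "base64").toString("utf8");
  return JSON.parse(json) as LogEntry;
}

// Naive user-agent check standing in for the real edge classification.
function looksLikeAiBot(entry: LogEntry): boolean {
  const userAgent = entry.httpRequest?.userAgent ?? "";
  return EXAMPLE_BOT_MARKERS.some((marker) => userAgent.includes(marker));
}
```

Entries that fail a check like looksLikeAiBot correspond to the "drop everything else" step described above.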

Prerequisites

A GCP project running an HTTPS Load Balancer or Cloud CDN

roles/pubsub.admin and roles/logging.admin on that project (or equivalent)

gcloud CLI installed and authenticated, or access to Cloud Shell

A Searchable project with your domain confirmed

Setup

Generate an integration token in Searchable

Open your Searchable dashboard
Go to Agent Analytics → Setup
Pick Google Cloud Platform as your crawler source
Click Generate token

Copy the token now — it starts with sa_… and won’t be shown again. You can always generate a new one if you lose it.

Create the Pub/Sub topic

In Cloud Shell (or any terminal with gcloud authenticated to the target project):

gcloud pubsub topics create searchable-ai-traffic

This is a dedicated topic that will hold log entries en route to Searchable. Keeping it separate from any other logging pipeline you have makes troubleshooting and removal trivial.

Create the push subscription

Replace <your-sa_-token> with the token you just generated:

gcloud pubsub subscriptions create searchable-ai-traffic-sub \
  --topic=searchable-ai-traffic \
  --push-endpoint=https://searchable-tracker.searchable.workers.dev/v1/gcp-logs \
  --push-request-attributes=X-Searchable-Token='<your-sa_-token>' \
  --ack-deadline=30 \
  --min-retry-delay=10s \
  --max-retry-delay=600s

The --push-request-attributes flag adds the X-Searchable-Token header (with the raw token as its value — no Bearer prefix) to every push request. We use this dedicated header rather than Authorization because GCP Pub/Sub reserves Authorization for native OIDC JWT auth.
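As an illustration of the header rule above, the following sketch builds the headers a push request to Searchable is expected to carry and rejects the two most likely mistakes: a Bearer prefix, or a value that does not start with sa_. The searchableHeaders helper is a hypothetical name, not part of any Searchable SDK.

```typescript
// Build the request headers for a push to Searchable from a raw integration token.
function searchableHeaders(token: string): Record<string, string> {
  const raw = token.trim();
  // The header carries the raw token; "Bearer sa_..." would not be accepted.
  if (/^bearer\s/i.test(raw)) {
    throw new Error("Pass the raw token without a Bearer prefix");
  }
  // Tokens generated in the Searchable dashboard start with sa_.
  if (!raw.startsWith("sa_")) {
    throw new Error("Expected a token starting with sa_");
  }
  return {
    "content-type": "application/json",
    "X-Searchable-Token": raw,
  };
}
```

A check like this can be handy in a deploy script before the token is pasted into the gcloud command.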

Create the Log Sink

Route HTTPS Load Balancer and Cloud CDN logs into the topic:

gcloud logging sinks create searchable-ai-traffic-sink \
  pubsub.googleapis.com/projects/$(gcloud config get-value project)/topics/searchable-ai-traffic \
  --log-filter='resource.type="http_load_balancer" OR resource.type="cloud_cdn"'

GCP prints the sink’s writer service-account email after this command. Grant that account roles/pubsub.publisher on the searchable-ai-traffic topic (the gcloud command is listed under Troubleshooting); alternatively, in the Console you’ll see a yellow banner asking you to authorize the sink.
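The two arguments of the sink command can be sketched as plain values, which helps when the project ID comes from a script rather than from gcloud config. The sinkDestination and matchesSinkFilter helpers below are hypothetical names; the predicate simply mirrors the filter string and is not how Cloud Logging evaluates filters internally.

```typescript
// The exact filter used by the sink command above.
const SINK_FILTER = 'resource.type="http_load_balancer" OR resource.type="cloud_cdn"';

// Build the Pub/Sub destination string the sink expects.
function sinkDestination(projectId: string, topic: string = "searchable-ai-traffic"): string {
  return `pubsub.googleapis.com/projects/${projectId}/topics/${topic}`;
}

// Mirror of SINK_FILTER as a predicate over a decoded log entry.
function matchesSinkFilter(entry: { resource?: { type?: string } }): boolean {
  const type = entry.resource?.type;
  return type === "http_load_balancer" || type === "cloud_cdn";
}
```

For a project named my-project, sinkDestination("my-project") yields the same destination the $(gcloud config get-value project) substitution produces.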

Verifying the connection

In Searchable:

Go to Agent Analytics → Setup
Look at the Google Cloud Platform card status
Click Check if it still shows “Waiting for first event”

Status	What it means
Waiting for first event	The subscription is configured but no AI bot has hit your site yet. Typical wait is a few hours for sites that are already indexed.
Connected	Events are arriving. The card shows the count from the last 24 hours.

Geo enrichment caveat

GCP HTTPS Load Balancer logs do not include geographic enrichment by default — country, region, and city will arrive empty in Searchable. The integration is otherwise fully functional. If you need geo, the simplest workaround is to also instrument your site with the Searchable Beacon (s.js), which derives geo from the visitor’s request.

Customer-built relay (org policy fallback)

If your GCP organisation policy blocks external Pub/Sub push endpoints (e.g. iam.allowedPolicyMemberDomains or a custom constraints/pubsub.allowExternalPushEndpoints), you can’t point a push subscription directly at our tracker URL. The workaround is a tiny Cloud Run service inside your project that forwards Pub/Sub push messages to Searchable unchanged.

Minimal relay (Node 22, Hono, ~30 lines):

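import { Hono } from "hono";
import { serve } from "@hono/node-server";

// Fail fast at startup if the relay is misconfigured.
const endpoint = process.env.SEARCHABLE_ENDPOINT;
const token = process.env.SEARCHABLE_TOKEN;
if (!endpoint || !token) {
  throw new Error("SEARCHABLE_ENDPOINT and SEARCHABLE_TOKEN must be set");
}

const app = new Hono();

// Health check.
app.get("/", (c) => c.text("ok", 200));

// Forward each Pub/Sub push body to Searchable unchanged, adding the token header.
app.post("/", async (c) => {
  const body = await c.req.text();
  const upstream = await fetch(endpoint, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "X-Searchable-Token": token,
    },
    body,
  });
  // Mirror the upstream status so Pub/Sub acks or retries exactly as it would on a direct push.
  return c.body(null, upstream.status as 200 | 204 | 400 | 401 | 403 | 413 | 500);
});

// Cloud Run supplies the listening port via PORT.
serve({ fetch: app.fetch, port: parseInt(process.env.PORT ?? "8080", 10) });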

Build and deploy it to Cloud Run:

gcloud run deploy searchable-relay \
  --source=. \
  --region=us-central1 \
  --allow-unauthenticated \
  --set-env-vars=SEARCHABLE_ENDPOINT=https://searchable-tracker.searchable.workers.dev/v1/gcp-logs,SEARCHABLE_TOKEN=<your-sa_-token>

Then create the push subscription pointing at the deployed Cloud Run URL instead of our tracker:

gcloud pubsub subscriptions create searchable-ai-traffic-sub \
  --topic=searchable-ai-traffic \
  --push-endpoint=<your Cloud Run URL> \
  --ack-deadline=30

The Log Sink command is identical to the standard setup. The relay is stateless and idempotent — Pub/Sub’s at-least-once delivery semantics flow through unchanged.
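Because the relay returns the upstream status code as-is, Searchable's response decides whether Pub/Sub acknowledges a message or redelivers it. The sketch below encodes the status codes Pub/Sub documents as a successful push acknowledgement (102, 200, 201, 202, 204); pubSubOutcome is a hypothetical helper for reasoning about this, not something the relay needs to run.

```typescript
// Status codes that Pub/Sub push delivery treats as an acknowledgement.
const ACK_STATUSES = new Set<number>([102, 200, 201, 202, 204]);

// What Pub/Sub will do with a message given the HTTP status the relay returned.
function pubSubOutcome(status: number): "ack" | "retry" {
  return ACK_STATUSES.has(status) ? "ack" : "retry";
}
```

For example, a 204 from Searchable ends delivery of that message, while a 401 caused by a revoked token leaves it in the retry loop until it expires.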

Searchable plans to publish a versioned Docker image (searchablehq/gcp-log-relay:v1) for this in a future release. The snippet above is the v1 reference implementation.

Troubleshooting

Searchable shows 401 errors or the card stays 'Not connected'

The X-Searchable-Token header is missing or wrong on the push subscription.

Confirm the subscription was created with --push-request-attributes=X-Searchable-Token='sa_…' (note the single quotes around the value, and that the header value is the raw token with no Bearer prefix)
The /v1/gcp-logs endpoint does not accept the Authorization: Bearer header — that slot is reserved by GCP Pub/Sub for native OIDC JWTs. If you set up before this header change, recreate the subscription with X-Searchable-Token
If you’ve recently revoked the token in Searchable, generate a new one and update the subscription with gcloud pubsub subscriptions modify-push-config

Pub/Sub reports delivery errors or messages pile up unacked

Most likely the X-Searchable-Token header isn’t reaching us or the log filter is sending non-HTTP logs.

Verify the subscription’s push config in the Console (Pub/Sub → Subscriptions → searchable-ai-traffic-sub → Edit) — the auth attribute should be visible there
Verify the sink filter is exactly resource.type="http_load_balancer" OR resource.type="cloud_cdn". Broader filters send non-HTTP logs; we acknowledge those with 204 and discard them, so Pub/Sub does not retry, but they still waste your Pub/Sub quota

The sink isn't routing logs to the topic

GCP requires the sink’s service account to have roles/pubsub.publisher on the topic. Without it the sink silently drops messages.

Run gcloud logging sinks describe searchable-ai-traffic-sink to find the writer service account
Grant it publisher on the topic: gcloud pubsub topics add-iam-policy-binding searchable-ai-traffic --member=serviceAccount:<writer-sa> --role=roles/pubsub.publisher
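One easy mistake in the second step is doubling the serviceAccount: prefix, since the writer identity reported by gcloud logging sinks describe typically already includes it. The sketch below normalizes either form and assembles the binding command as a string; iamMember and bindingCommand are hypothetical helper names, and the example email is made up.

```typescript
// Accept either a bare email or a "serviceAccount:"-prefixed identity.
function iamMember(writerIdentity: string): string {
  const value = writerIdentity.trim();
  return value.startsWith("serviceAccount:") ? value : `serviceAccount:${value}`;
}

// Assemble the topic-level IAM binding command from the steps above.
function bindingCommand(writerIdentity: string, topic: string = "searchable-ai-traffic"): string {
  return [
    "gcloud pubsub topics add-iam-policy-binding",
    topic,
    `--member=${iamMember(writerIdentity)}`,
    "--role=roles/pubsub.publisher",
  ].join(" ");
}
```

Both bindingCommand("writer@example.iam.gserviceaccount.com") and the same call with the prefixed identity produce an identical command.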

Status stays on 'Waiting for first event' for more than 24 hours

A few possible causes:

The Log Sink filter doesn’t match your service — confirm your HTTPS Load Balancer logs actually populate Cloud Logging (View → Logs Explorer → filter on resource.type="http_load_balancer" and confirm you see entries)
Your domain in Searchable doesn’t match the site served by GCP (check Agent Analytics → Setup → Confirm your domain)
No AI bot has visited yet — try visiting your site with a known AI user agent (e.g. Mozilla/5.0 (compatible; GPTBot/1.0)) to trigger a test event

If Pub/Sub’s “Push attempt” metrics show successful 204 responses but Searchable still says no events, that points to a domain mismatch.
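The test visit suggested above can be scripted. This is a minimal sketch that only prepares the request options with the example GPTBot user agent; testVisitOptions is a hypothetical helper, and the URL in the usage comment is a placeholder for a page served through your load balancer.

```typescript
// The example AI user agent from the troubleshooting step above.
const TEST_USER_AGENT = "Mozilla/5.0 (compatible; GPTBot/1.0)";

// Request options for a synthetic bot visit.
function testVisitOptions(): { method: string; headers: Record<string, string> } {
  return {
    method: "GET",
    headers: { "user-agent": TEST_USER_AGENT },
  };
}

// Usage (Node 18+ ships a global fetch):
//   await fetch("https://your-site.example/", testVisitOptions());
```

The request has to pass through the HTTPS Load Balancer or Cloud CDN to be logged; hitting a backend directly will not produce a log entry for the sink.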

Removing the integration

To stop sending traffic to Searchable:

gcloud logging sinks delete searchable-ai-traffic-sink
gcloud pubsub subscriptions delete searchable-ai-traffic-sub
gcloud pubsub topics delete searchable-ai-traffic
Searchable → Agent Analytics → Setup → Tokens → revoke the token

Both sides are independent — revoking the token alone is enough to stop ingestion immediately, even if the GCP-side resources stay configured (push deliveries will start returning 401, which Pub/Sub will retry until the messages expire).

Next steps

See the data

Add Search Console
