
What this does

Google Cloud Logging captures every request that hits your HTTPS Load Balancer or Cloud CDN. We forward those logs to Searchable via a Pub/Sub topic with a push subscription, classify the AI bots at our edge, and drop everything else. No code changes are required: all configuration happens with three gcloud commands (or in the GCP Console if you prefer click-ops).

Prerequisites

A GCP project running an HTTPS Load Balancer or Cloud CDN
roles/pubsub.admin and roles/logging.admin on that project (or equivalent)
gcloud CLI installed and authenticated, or access to Cloud Shell
A Searchable project with your domain confirmed

Setup

Step 1: Generate an integration token in Searchable

  1. Open your Searchable dashboard
  2. Go to Agent Analytics → Setup
  3. Pick Google Cloud Platform as your crawler source
  4. Click Generate token
Copy the token now — it starts with sa_… and won’t be shown again. You can always generate a new one if you lose it.
Step 2: Create the Pub/Sub topic

In Cloud Shell (or any terminal with gcloud authenticated to the target project):
gcloud pubsub topics create searchable-ai-traffic
This is a dedicated topic that will hold log entries en route to Searchable. Keeping it separate from any other logging pipeline you have makes troubleshooting and removal trivial.
Step 3: Create the push subscription

Replace <your-sa_-token> with the token you just generated:
gcloud pubsub subscriptions create searchable-ai-traffic-sub \
  --topic=searchable-ai-traffic \
  --push-endpoint=https://searchable-tracker.searchable.workers.dev/v1/gcp-logs \
  --push-request-attributes=X-Searchable-Token='<your-sa_-token>' \
  --ack-deadline=30 \
  --min-retry-delay=10s \
  --max-retry-delay=600s
The --push-request-attributes flag adds the X-Searchable-Token header (with the raw token as its value — no Bearer prefix) to every push request. We use this dedicated header rather than Authorization because GCP Pub/Sub reserves Authorization for native OIDC JWT auth.
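For reference, each push delivery wraps the log entry in Pub/Sub's standard JSON envelope, with the payload base64-encoded under message.data. A minimal sketch of decoding it on the receiving side (the inner log entry fields here are illustrative, not a real GCP log record):

```python
import base64
import json

# An illustrative Cloud Logging entry as it might appear inside a push.
log_entry = {
    "resource": {"type": "http_load_balancer"},
    "httpRequest": {"userAgent": "Mozilla/5.0 (compatible; GPTBot/1.0)"},
}

# The standard wrapped Pub/Sub push envelope delivered to the endpoint.
envelope = {
    "message": {
        "data": base64.b64encode(json.dumps(log_entry).encode()).decode(),
        "messageId": "1234567890",
        "publishTime": "2024-01-01T00:00:00Z",
    },
    "subscription": "projects/my-project/subscriptions/searchable-ai-traffic-sub",
}

# Receiving side: decode message.data to recover the original log entry.
decoded = json.loads(base64.b64decode(envelope["message"]["data"]))
print(decoded["resource"]["type"])  # http_load_balancer
```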
Step 4: Create the Log Sink

Route HTTPS Load Balancer and Cloud CDN logs into the topic:
gcloud logging sinks create searchable-ai-traffic-sink \
  pubsub.googleapis.com/projects/$(gcloud config get-value project)/topics/searchable-ai-traffic \
  --log-filter='resource.type="http_load_balancer" OR resource.type="cloud_cdn"'
GCP prints a writer service-account email after this command; grant that account roles/pubsub.publisher on the topic so the sink can publish. The exact gcloud command is shown in the output of the previous step; alternatively, the Console shows a yellow banner asking you to authorize the sink.
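The filter's effect can be sanity-checked locally. A small sketch (not GCP's actual filter engine) of which entries the sink forwards:

```python
def matches_sink_filter(entry: dict) -> bool:
    # Mirrors the sink filter:
    #   resource.type="http_load_balancer" OR resource.type="cloud_cdn"
    return entry.get("resource", {}).get("type") in ("http_load_balancer", "cloud_cdn")

print(matches_sink_filter({"resource": {"type": "http_load_balancer"}}))  # True
print(matches_sink_filter({"resource": {"type": "gce_instance"}}))        # False
```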

Verifying the connection

In Searchable:
  1. Go to Agent Analytics → Setup
  2. Look at the Google Cloud Platform card status
  3. Click Check if it still shows “Waiting for first event”
Status and what it means:
  • Waiting for first event: The subscription is configured but no AI bot has hit your site yet. Typical wait is a few hours for sites that are already indexed.
  • Connected: Events are arriving. The card shows the count from the last 24 hours.

Geo enrichment caveat

GCP HTTPS Load Balancer logs do not include geographic enrichment by default — country, region, and city will arrive empty in Searchable. The integration is otherwise fully functional. If you need geo, the simplest workaround is to also instrument your site with the Searchable Beacon (s.js), which derives geo from the visitor’s request.

Customer-built relay (org policy fallback)

If your GCP organisation policy blocks external Pub/Sub push endpoints (e.g. iam.allowedPolicyMemberDomains or a custom constraints/pubsub.allowExternalPushEndpoints), you can't point a push subscription directly at our tracker URL. The workaround is a tiny Cloud Run service inside your project that forwards Pub/Sub push messages to Searchable unchanged.

Minimal relay (Node 22, Hono, ~30 lines):
import { Hono } from "hono";
import { serve } from "@hono/node-server";

const app = new Hono();

// Health check for Cloud Run.
app.get("/", (c) => c.text("ok", 200));

// Forward the raw push body to Searchable, attaching the token header, and
// mirror the upstream status so Pub/Sub acks (2xx) or retries (non-2xx).
app.post("/", async (c) => {
  const body = await c.req.text();
  const upstream = await fetch(process.env.SEARCHABLE_ENDPOINT!, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "X-Searchable-Token": process.env.SEARCHABLE_TOKEN!,
    },
    body,
  });
  return c.body(null, upstream.status as 200 | 204 | 400 | 401 | 403 | 413 | 500);
});

serve({ fetch: app.fetch, port: parseInt(process.env.PORT ?? "8080", 10) });
Build and deploy it to Cloud Run:
gcloud run deploy searchable-relay \
  --source=. \
  --region=us-central1 \
  --allow-unauthenticated \
  --set-env-vars=SEARCHABLE_ENDPOINT=https://searchable-tracker.searchable.workers.dev/v1/gcp-logs,SEARCHABLE_TOKEN=<your-sa_-token>
Then create the push subscription pointing at the deployed Cloud Run URL instead of our tracker:
gcloud pubsub subscriptions create searchable-ai-traffic-sub \
  --topic=searchable-ai-traffic \
  --push-endpoint=<your Cloud Run URL> \
  --ack-deadline=30
The Log Sink command is identical to the standard setup. The relay is stateless and idempotent — Pub/Sub’s at-least-once delivery semantics flow through unchanged.
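Because delivery is at-least-once, the same messageId can arrive more than once; a consumer that needs exactly-once effects can dedupe on it. A toy sketch of that idea (not Searchable's actual pipeline):

```python
seen: set[str] = set()

def handle_push(envelope: dict) -> str:
    # Pub/Sub may redeliver a message (e.g. after a missed ack), so the
    # unique messageId is the natural deduplication key.
    msg_id = envelope["message"]["messageId"]
    if msg_id in seen:
        return "duplicate"
    seen.add(msg_id)
    return "processed"

print(handle_push({"message": {"messageId": "42"}}))  # processed
print(handle_push({"message": {"messageId": "42"}}))  # duplicate
```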
Searchable plans to publish a versioned Docker image (searchablehq/gcp-log-relay:v1) for this in a future release. The snippet above is the v1 reference implementation.

Troubleshooting

Push deliveries return 401
The X-Searchable-Token header is missing or wrong on the push subscription.
  • Confirm the subscription was created with --push-request-attributes=X-Searchable-Token='sa_…' (note the single quotes around the value, and that the header value is the raw token with no Bearer prefix)
  • The /v1/gcp-logs endpoint does not accept the Authorization: Bearer header — that slot is reserved by GCP Pub/Sub for native OIDC JWTs. If you set up before this header change, recreate the subscription with X-Searchable-Token
  • If you’ve recently revoked the token in Searchable, generate a new one and update the subscription with gcloud pubsub subscriptions modify-push-config
Pushes are delivered but events are rejected
Most likely the X-Searchable-Token header isn't reaching us, or the log filter is sending non-HTTP logs.
  • Verify the subscription’s push config in the Console (Pub/Sub → Subscriptions → searchable-ai-traffic-sub → Edit) — the auth attribute should be visible there
  • Verify the sink filter is exactly resource.type="http_load_balancer" OR resource.type="cloud_cdn". Broader filters can send non-HTTP logs that we reject with 204 (no retry) but also waste your Pub/Sub quota
Messages never reach the topic
GCP requires the sink's service account to have roles/pubsub.publisher on the topic. Without it the sink silently drops messages.
  • Run gcloud logging sinks describe searchable-ai-traffic-sink to find the writer service account
  • Grant it publisher on the topic: gcloud pubsub topics add-iam-policy-binding searchable-ai-traffic --member=serviceAccount:<writer-sa> --role=roles/pubsub.publisher
The status stays on “Waiting for first event”
A few possible causes:
  • The Log Sink filter doesn’t match your service — confirm your HTTPS Load Balancer logs actually populate Cloud Logging (View → Logs Explorer → filter on resource.type="http_load_balancer" and confirm you see entries)
  • Your domain in Searchable doesn’t match the site served by GCP (check Agent Analytics → Setup → Confirm your domain)
  • No AI bot has visited yet — try visiting your site with a known AI user agent (e.g. Mozilla/5.0 (compatible; GPTBot/1.0)) to trigger a test event
If Pub/Sub’s “Push attempt” metrics show successful 204 responses but Searchable still says no events, that points to a domain mismatch.
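To script the test visit above, any HTTP client that lets you set the User-Agent header works. A sketch using Python's urllib (example.com is a placeholder for your own site; the request is built but not sent here):

```python
from urllib.request import Request

# Build a request carrying a known AI crawler user agent. Pass the request
# to urllib.request.urlopen to actually send it to your site.
ua = "Mozilla/5.0 (compatible; GPTBot/1.0)"
req = Request("https://example.com/", headers={"User-Agent": ua})
print(req.get_header("User-agent"))  # Mozilla/5.0 (compatible; GPTBot/1.0)
```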

Removing the integration

To stop sending traffic to Searchable:
  1. gcloud logging sinks delete searchable-ai-traffic-sink
  2. gcloud pubsub subscriptions delete searchable-ai-traffic-sub
  3. gcloud pubsub topics delete searchable-ai-traffic
  4. Searchable → Agent Analytics → Setup → Tokens → revoke the token
Both sides are independent — revoking the token alone is enough to stop ingestion immediately, even if the GCP-side resources stay configured (push deliveries will start returning 401, which Pub/Sub will retry until the messages expire).
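Those retries follow the min/max delays set on the subscription (10s to 600s in the setup above). A simplified model of exponential backoff between those bounds (Pub/Sub's actual scheduling is internal and may differ):

```python
def retry_delay(attempt: int, min_s: int = 10, max_s: int = 600) -> int:
    # Doubles from the minimum delay on each attempt, capped at the maximum,
    # matching the --min-retry-delay/--max-retry-delay flags used earlier.
    return min(min_s * (2 ** attempt), max_s)

print([retry_delay(a) for a in range(7)])  # [10, 20, 40, 80, 160, 320, 600]
```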

Next steps

See the data

Open Agent Analytics to see which assistants are crawling your site.

Add Search Console

Correlate AI crawls with search demand.