What this does

Google Cloud Logging captures every request that hits your HTTPS Load Balancer or Cloud CDN. You’ll route those logs to a Pub/Sub topic, push them through a small Cloud Run relay inside your project, and forward them to Searchable. We classify AI bots at our edge and drop everything else.
Everything on the GCP side runs inside your own project. The relay is stateless, idempotent, and only forwards payloads; it doesn’t inspect or store them.

Prerequisites

  • A GCP project running an HTTPS Load Balancer or Cloud CDN
  • roles/pubsub.admin, roles/logging.admin, and roles/run.admin on that project (or equivalent)
  • gcloud CLI installed and authenticated, or access to Cloud Shell
  • A Searchable project with your domain confirmed

Setup

1

Generate an integration token in Searchable

  1. Open your Searchable dashboard
  2. Go to LLM Analytics → Setup
  3. Pick Google Cloud Platform as your crawler source
  4. Click Generate token
Copy the token now — it starts with sa_… and won’t be shown again. You can always generate a new one if you lose it.
2

Create the Pub/Sub topic

In Cloud Shell (or any terminal with gcloud authenticated to the target project):
gcloud pubsub topics create searchable-ai-traffic
This is a dedicated topic that will hold log entries en route to Searchable. Keeping it separate from any other logging pipeline you have makes troubleshooting and removal trivial.
3
4

Deploy the Cloud Run relay

Create an empty directory and add two files. The relay is a single HTTP handler that adds the X-Searchable-Token header and forwards the Pub/Sub push body to Searchable unchanged.
package.json:
{
  "name": "searchable-relay",
  "type": "module",
  "main": "index.js",
  "scripts": { "start": "node index.js" },
  "dependencies": {
    "hono": "^4.0.0",
    "@hono/node-server": "^1.0.0"
  }
}
index.js:
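// Minimal relay: receives Pub/Sub push requests and forwards them to Searchable.
import { Hono } from "hono";
import { serve } from "@hono/node-server";

const app = new Hono();

// Health check.
app.get("/", (c) => c.text("ok", 200));

// Pub/Sub push handler: forward the body unchanged, adding the integration token.
app.post("/", async (c) => {
  const body = await c.req.text();
  const upstream = await fetch(process.env.SEARCHABLE_ENDPOINT, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "X-Searchable-Token": process.env.SEARCHABLE_TOKEN,
    },
    body,
  });
  // Mirror Searchable's status so Pub/Sub acks on success and retries on failure.
  return c.body(null, upstream.status);
});

// Cloud Run injects PORT; default to 8080 for local runs.
serve({ fetch: app.fetch, port: parseInt(process.env.PORT ?? "8080", 10) });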
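If either environment variable is missing, the relay above still starts and only fails once Pub/Sub begins pushing. A minimal sketch of a startup check follows. The helper name checkRelayConfig is hypothetical, the https requirement is an assumption, and the sa_ prefix check relies only on the token format described in step 1.

```javascript
// Hypothetical helper, not part of the relay above: returns a list of
// human-readable configuration problems (empty when everything looks fine).
function checkRelayConfig(env) {
  const problems = [];
  const endpoint = env.SEARCHABLE_ENDPOINT;
  const token = env.SEARCHABLE_TOKEN;

  if (!endpoint) {
    problems.push("SEARCHABLE_ENDPOINT is not set");
  } else {
    try {
      // Assumption: the endpoint should always be an https URL.
      if (new URL(endpoint).protocol !== "https:") {
        problems.push("SEARCHABLE_ENDPOINT must use https");
      }
    } catch {
      problems.push("SEARCHABLE_ENDPOINT is not a valid URL");
    }
  }

  if (!token) {
    problems.push("SEARCHABLE_TOKEN is not set");
  } else if (!token.startsWith("sa_")) {
    // Step 1 says tokens start with "sa_"; anything else is likely a paste error.
    problems.push('SEARCHABLE_TOKEN does not start with "sa_"');
  }

  return problems;
}
```

One way to use it is to call checkRelayConfig(process.env) before serve(...), log the returned problems, and exit with a non-zero code so a bad revision fails at deploy time rather than at the first push.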
Create a dedicated service account for Pub/Sub to invoke the relay as, deploy with --no-allow-unauthenticated, then grant it roles/run.invoker. Replace <TOKEN> with the token from step 1:
gcloud iam service-accounts create searchable-pubsub-invoker \
  --display-name="Searchable Pub/Sub invoker"

gcloud run deploy searchable-relay \
  --source=. \
  --region=us-central1 \
  --no-allow-unauthenticated \
  --set-env-vars=SEARCHABLE_ENDPOINT=https://tracker.searchableanalytics.com/v1/gcp-logs,SEARCHABLE_TOKEN=<TOKEN>

gcloud run services add-iam-policy-binding searchable-relay \
  --region=us-central1 \
  --member=serviceAccount:searchable-pubsub-invoker@$(gcloud config get-value project).iam.gserviceaccount.com \
  --role=roles/run.invoker
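Cloud Run rejects any request that doesn’t carry a valid identity token for a principal holding roles/run.invoker on the service, so the relay can’t be called by anyone who merely finds the URL. In this setup that principal is searchable-pubsub-invoker.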
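When debugging authentication, it can help to see which identity a push request claims to come from. The sketch below decodes the claims of a bearer token without verifying the signature. That is only reasonable here because Cloud Run has already checked the token before the request reaches the container. The function name decodeJwtClaims is hypothetical and the relay does not need it to work.

```javascript
// Hypothetical debugging helper: decode (NOT verify) the claims of a
// "Bearer <JWT>" Authorization header. Returns null if the header is
// missing or not shaped like a JWT.
function decodeJwtClaims(authorizationHeader) {
  const match = /^Bearer\s+(.+)$/i.exec(authorizationHeader ?? "");
  if (!match) return null;

  // A JWT has three dot-separated parts: header.payload.signature
  const parts = match[1].split(".");
  if (parts.length !== 3) return null;

  try {
    // The payload is base64url-encoded JSON.
    return JSON.parse(Buffer.from(parts[1], "base64url").toString("utf8"));
  } catch {
    return null;
  }
}
```

For a Pub/Sub push, the decoded claims would be expected to include an email claim matching the searchable-pubsub-invoker service account. Logging that claim (never the raw token) is usually enough to confirm the subscription is using the intended account.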
For production, store the token in Secret Manager instead of passing it via --set-env-vars, which leaves the cleartext value in gcloud run services describe output and your shell history. Create a secret, grant the Cloud Run runtime service account roles/secretmanager.secretAccessor on it, then remove SEARCHABLE_TOKEN=<TOKEN> from --set-env-vars (keep SEARCHABLE_ENDPOINT there) and add --set-secrets=SEARCHABLE_TOKEN=projects/<your-project>/secrets/searchable-token:latest.
Copy the Service URL printed by gcloud run deploy — you’ll use it in the next step.
5

Create the push subscription

Replace <your Cloud Run URL> with the Service URL from the previous step. The subscription mints an OIDC token from searchable-pubsub-invoker for every push:
gcloud pubsub subscriptions create searchable-ai-traffic-sub \
  --topic=searchable-ai-traffic \
  --push-endpoint=<your Cloud Run URL> \
  --push-auth-service-account=searchable-pubsub-invoker@$(gcloud config get-value project).iam.gserviceaccount.com \
  --ack-deadline=30 \
  --min-retry-delay=10s \
  --max-retry-delay=600s
The Searchable integration token is injected by the relay; the OIDC service-account auth here is what gates access to the relay itself.
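Each push request body is a JSON envelope whose message.data field holds the base64-encoded log entry. The relay forwards this envelope untouched, but decoding one locally can help when checking what is actually being sent. A minimal sketch, with a hypothetical function name and made-up sample values:

```javascript
// Hypothetical debugging helper: unwrap a Pub/Sub push envelope and
// return the log entry it carries. The relay itself never does this.
function decodePushEnvelope(rawBody) {
  const envelope = JSON.parse(rawBody);
  const message = envelope.message;
  if (!message || typeof message.data !== "string") {
    throw new Error("not a Pub/Sub push envelope");
  }
  // message.data is the base64-encoded JSON log entry written by the sink.
  const logEntry = JSON.parse(Buffer.from(message.data, "base64").toString("utf8"));
  return {
    messageId: message.messageId,
    subscription: envelope.subscription,
    logEntry,
  };
}

// Example with a made-up entry (all values are illustrative only).
const sampleEntry = {
  resource: { type: "http_load_balancer" },
  httpRequest: { requestUrl: "https://www.example.com/", userAgent: "GPTBot/1.0", status: 200 },
};
const sampleBody = JSON.stringify({
  message: {
    data: Buffer.from(JSON.stringify(sampleEntry)).toString("base64"),
    messageId: "1234567890",
  },
  subscription: "projects/example-project/subscriptions/searchable-ai-traffic-sub",
});
const decoded = decodePushEnvelope(sampleBody);
```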
6

Create the Log Sink

Route HTTPS Load Balancer and Cloud CDN logs into the topic:
gcloud logging sinks create searchable-ai-traffic-sink \
  pubsub.googleapis.com/projects/$(gcloud config get-value project)/topics/searchable-ai-traffic \
  --log-filter='resource.type="http_load_balancer" OR resource.type="cloud_cdn"'
gcloud prints the sink’s writer service-account email after this command. Grant that account roles/pubsub.publisher on the searchable-ai-traffic topic (the full command is listed under Troubleshooting). Alternatively, in the Console you’ll see a yellow banner asking you to authorize the sink.
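The filter above is a plain OR over two resource types. The sketch below builds that filter string and mirrors it as a local predicate, which can be handy for checking exported log entries before changing the sink. The helper names are hypothetical, and the predicate only approximates this specific filter, not the Logging query language in general.

```javascript
// Resource types routed by the sink in this guide.
const SINK_RESOURCE_TYPES = ["http_load_balancer", "cloud_cdn"];

// Build the --log-filter value used when creating the sink.
function buildSinkFilter(types = SINK_RESOURCE_TYPES) {
  return types.map((t) => `resource.type="${t}"`).join(" OR ");
}

// Local approximation of the same filter for a decoded log entry.
function matchesSinkFilter(logEntry, types = SINK_RESOURCE_TYPES) {
  return types.includes(logEntry?.resource?.type);
}
```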

Verifying the connection

In Searchable:
  1. Go to LLM Analytics → Setup
  2. Look at the Google Cloud Platform card status
  3. Click Check if it still shows “Waiting for first event”
Status | What it means
Waiting for first event | The subscription is configured but no AI bot has hit your site yet. Typical wait is a few hours for sites that are already indexed.
Connected | Events are arriving. The card shows the count from the last 24 hours.

Geo enrichment caveat

GCP HTTPS Load Balancer logs do not include geographic enrichment by default — country, region, and city will arrive empty in Searchable. The integration is otherwise fully functional. If you need geo, the simplest workaround is to also instrument your site with the Searchable Beacon (s.js), which derives geo from the visitor’s request.

Troubleshooting

The relay isn’t sending the X-Searchable-Token header — either it’s misconfigured or the token has been revoked.
  • Confirm the Cloud Run service has SEARCHABLE_TOKEN set. List env var names only (not values) with gcloud run services describe searchable-relay --region=us-central1 --format='value(spec.template.spec.containers[0].env[].name)'. If you used Secret Manager, the entry shows as SEARCHABLE_TOKEN and the value lives in the secret
  • If you’ve recently revoked the token in Searchable, generate a new one and redeploy the relay with the new value
Pub/Sub can’t reach the relay, or the log filter is sending non-HTTP logs.
  • Verify the subscription’s push endpoint in the Console (Pub/Sub → Subscriptions → searchable-ai-traffic-sub → Edit) matches the deployed Cloud Run URL
  • Check Cloud Run logs (gcloud run services logs read searchable-relay --region=us-central1 --limit=50) for non-2xx responses. Pub/Sub treats those as failed deliveries and retries them
  • Verify the sink filter is exactly resource.type="http_load_balancer" OR resource.type="cloud_cdn". Broader filters send non-HTTP logs, which we discard with a 204 (so Pub/Sub doesn’t retry them) but which still consume your Pub/Sub quota
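When reading relay logs, the question is whether Pub/Sub will treat a given response as an acknowledgement or redeliver the message. A minimal sketch of that classification follows. It relies on the status codes Pub/Sub documents as acknowledgements for push subscriptions, and the function name is hypothetical.

```javascript
// Status codes that Pub/Sub push subscriptions treat as an acknowledgement.
// Any other status means the message is redelivered according to the
// subscription's retry policy.
const PUBSUB_ACK_STATUSES = new Set([102, 200, 201, 202, 204]);

// Classify a relay response status as "ack" or "retry".
function pushOutcome(status) {
  return PUBSUB_ACK_STATUSES.has(status) ? "ack" : "retry";
}

// Summarize a batch of statuses pulled from relay logs.
function summarizeOutcomes(statuses) {
  const summary = { ack: 0, retry: 0 };
  for (const status of statuses) summary[pushOutcome(status)] += 1;
  return summary;
}
```

Because the relay mirrors Searchable's status code, a run of 401 or 5xx entries in the relay logs corresponds directly to messages that Pub/Sub will push again.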
GCP requires the sink’s service account to have roles/pubsub.publisher on the topic. Without it the sink silently drops messages.
  • Run gcloud logging sinks describe searchable-ai-traffic-sink to find the writer service account
  • Grant it publisher on the topic: gcloud pubsub topics add-iam-policy-binding searchable-ai-traffic --member=serviceAccount:<writer-sa> --role=roles/pubsub.publisher
Some GCP orgs enforce policies that further restrict Cloud Run. The default setup already deploys with --no-allow-unauthenticated, which satisfies most org policies — these errors usually point at one of the others:
  • constraints/run.allowedIngress forces internal-only: redeploy with --ingress=internal-and-cloud-load-balancing and place a Pub/Sub-VPC connector in front, or run the relay on Cloud Functions instead
  • constraints/iam.disableServiceAccountCreation blocks the gcloud iam service-accounts create command: use an existing service account your org already trusts and reuse its email in both add-iam-policy-binding and --push-auth-service-account
  • constraints/iam.allowedPolicyMemberDomains blocks the invoker binding: have an admin add the project’s service-agent domain to the allow-list, then re-run the binding command
A few possible causes:
  • The Log Sink filter doesn’t match your service — confirm your HTTPS Load Balancer logs actually populate Cloud Logging (View → Logs Explorer → filter on resource.type="http_load_balancer" and confirm you see entries)
  • Your domain in Searchable doesn’t match the site served by GCP (check LLM Analytics → Setup → Confirm your domain)
  • No AI bot has visited yet — try visiting your site with a known AI user agent (e.g. Mozilla/5.0 (compatible; GPTBot/1.0)) to trigger a test event
If Pub/Sub’s “Push attempt” metrics show successful 204 responses but Searchable still says no events, that points to a domain mismatch.
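The test visit mentioned above can be scripted. The sketch below builds a request that carries the example GPTBot user agent and sends it with the built-in fetch in Node 18 or later. The function names are hypothetical, and the request only produces a log entry if the URL is actually served through your HTTPS Load Balancer.

```javascript
// User agent taken from the example above.
const TEST_USER_AGENT = "Mozilla/5.0 (compatible; GPTBot/1.0)";

// Build the URL and fetch options for a test visit (no network access here).
function buildBotProbe(siteUrl, userAgent = TEST_USER_AGENT) {
  const url = new URL(siteUrl); // throws on malformed input
  return {
    url: url.toString(),
    init: { method: "GET", headers: { "user-agent": userAgent } },
  };
}

// Send the probe and return the HTTP status of the response.
async function sendBotProbe(siteUrl) {
  const { url, init } = buildBotProbe(siteUrl);
  const response = await fetch(url, init);
  return response.status;
}
```

After calling sendBotProbe with your site's URL, allow some time for the entry to travel through Cloud Logging, Pub/Sub, and the relay before clicking Check in Searchable.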

Removing the integration

To stop sending traffic to Searchable and tear down the GCP-side resources:
gcloud logging sinks delete searchable-ai-traffic-sink
gcloud pubsub subscriptions delete searchable-ai-traffic-sub
gcloud pubsub topics delete searchable-ai-traffic
gcloud run services delete searchable-relay --region=us-central1
gcloud iam service-accounts delete \
  searchable-pubsub-invoker@$(gcloud config get-value project).iam.gserviceaccount.com
Then go to Searchable → LLM Analytics → Setup → Tokens and revoke the integration token. Both sides are independent — revoking the token alone is enough to stop ingestion immediately, even if the GCP-side resources stay configured (push deliveries will start returning 401, which Pub/Sub will retry until the messages expire).

Next steps

See the data

Open LLM Analytics to see which assistants are crawling your site.

Add Search Console

Correlate AI crawls with search demand.