## What this does
Google Cloud Logging captures every request that hits your HTTPS Load Balancer or Cloud CDN. We forward those logs to Searchable via a Pub/Sub topic with a push subscription, classify the AI bots at our edge, and drop everything else. No code changes: all configuration is done with three gcloud commands (or the GCP Console if you prefer click-ops).

## Prerequisites
- A GCP project running an HTTPS Load Balancer or Cloud CDN
- `roles/pubsub.admin` and `roles/logging.admin` on that project (or equivalent)
- gcloud CLI installed and authenticated, or access to Cloud Shell
- A Searchable project with your domain confirmed
## Setup
### Generate an integration token in Searchable
- Open your Searchable dashboard
- Go to Agent Analytics → Setup
- Pick Google Cloud Platform as your crawler source
- Click Generate token
The token starts with `sa_…` and won't be shown again. You can always generate a new one if you lose it.

### Create the Pub/Sub topic
This is a dedicated topic that will hold log entries en route to Searchable. Keeping it separate from any other logging pipeline you have makes troubleshooting and removal trivial.

In Cloud Shell (or any terminal with `gcloud` authenticated to the target project):
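A sketch of the two commands, using the resource names referenced elsewhere in this guide. The push endpoint host below is a placeholder for the URL shown in your Searchable setup card:

```shell
# Dedicated topic for Searchable-bound log entries
gcloud pubsub topics create searchable-ai-traffic

# Push subscription that forwards entries to Searchable.
# Replace the endpoint host with the URL from your Searchable setup card,
# and sa_… with your integration token (raw token, no Bearer prefix).
gcloud pubsub subscriptions create searchable-ai-traffic-sub \
  --topic=searchable-ai-traffic \
  --push-endpoint='https://<searchable-tracker-host>/v1/gcp-logs' \
  --push-request-attributes=X-Searchable-Token='sa_…'
```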
### Create the Log Sink

Route HTTPS Load Balancer and Cloud CDN logs into the topic:
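A sketch of the sink command, assuming the topic name used elsewhere in this guide; replace PROJECT_ID with your GCP project ID:

```shell
# Route HTTPS Load Balancer and Cloud CDN request logs into the topic
gcloud logging sinks create searchable-ai-traffic-sink \
  pubsub.googleapis.com/projects/PROJECT_ID/topics/searchable-ai-traffic \
  --log-filter='resource.type="http_load_balancer" OR resource.type="cloud_cdn"'
```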
GCP prints a service-account email after this command. Grant it `roles/pubsub.publisher` on your project; the exact gcloud command is shown in the output of the previous step. Alternatively, in the Console you'll see a yellow banner asking you to authorize the sink.

## Verifying the connection
In Searchable:

- Go to Agent Analytics → Setup
- Look at the Google Cloud Platform card status
- Click Check if it still shows “Waiting for first event”
| Status | What it means |
|---|---|
| Waiting for first event | The subscription is configured but no AI bot has hit your site yet. Typical wait is a few hours for sites that are already indexed. |
| Connected | Events are arriving. The card shows the count from the last 24 hours. |
## Geo enrichment caveat

GCP HTTPS Load Balancer logs do not include geographic enrichment by default; country, region, and city will arrive empty in Searchable. The integration is otherwise fully functional. If you need geo, the simplest workaround is to also instrument your site with the Searchable Beacon (s.js), which derives geo from the visitor's request.
## Customer-built relay (org policy fallback) [#customer-built-relay]
If your GCP organisation policy blocks external Pub/Sub push endpoints (e.g. `iam.allowedPolicyMemberDomains` or a custom `constraints/pubsub.allowExternalPushEndpoints`), you can't point a push subscription directly at our tracker URL. The workaround is a tiny Cloud Run service inside your project that forwards Pub/Sub push messages to Searchable unchanged.
Minimal relay (Node 22, Hono — ~30 lines):
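A minimal sketch of such a relay. It assumes the Pub/Sub push body can be forwarded verbatim to the `/v1/gcp-logs` URL from your setup card, with the token attached as the `X-Searchable-Token` header; the environment-variable names and port here are illustrative:

```typescript
import { Hono } from "hono";
import { serve } from "@hono/node-server";

// Illustrative configuration: the /v1/gcp-logs URL from your Searchable
// setup card and the sa_… integration token, injected as env vars.
const TARGET = process.env.SEARCHABLE_ENDPOINT!;
const TOKEN = process.env.SEARCHABLE_TOKEN!;

const app = new Hono();

// Pub/Sub push delivers one POST per message; forward the body unchanged.
app.post("/", async (c) => {
  const upstream = await fetch(TARGET, {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "X-Searchable-Token": TOKEN,
    },
    body: await c.req.text(),
  });
  // Mirror the upstream status: a 2xx acks the message,
  // anything else makes Pub/Sub redeliver it.
  return new Response(null, { status: upstream.status });
});

serve({ fetch: app.fetch, port: Number(process.env.PORT ?? 8080) });
```

Deploy it to Cloud Run in your project and point the push subscription at the service URL instead of the Searchable endpoint.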
Searchable plans to publish a versioned Docker image (`searchablehq/gcp-log-relay:v1`) for this in a future release. The snippet above is the v1 reference implementation.

## Troubleshooting
### Searchable shows 401 errors or the card stays 'Not connected'
The `X-Searchable-Token` header is missing or wrong on the push subscription.

- Confirm the subscription was created with `--push-request-attributes=X-Searchable-Token='sa_…'` (note the single quotes around the value, and that the header value is the raw token with no `Bearer` prefix)
- The `/v1/gcp-logs` endpoint does not accept the `Authorization: Bearer` header; that slot is reserved by GCP Pub/Sub for native OIDC JWTs. If you set up before this header change, recreate the subscription with `X-Searchable-Token`
- If you've recently revoked the token in Searchable, generate a new one and update the subscription with `gcloud pubsub subscriptions modify-push-config`
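For example, rotating the token on the existing subscription might look like this (the endpoint host is a placeholder for the URL in your setup card):

```shell
# Re-point the push config with the new sa_… token as the header value
gcloud pubsub subscriptions modify-push-config searchable-ai-traffic-sub \
  --push-endpoint='https://<searchable-tracker-host>/v1/gcp-logs' \
  --push-request-attributes=X-Searchable-Token='sa_…'
```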
### Pub/Sub reports delivery errors or messages pile up unacked
Most likely the `X-Searchable-Token` header isn't reaching us or the log filter is sending non-HTTP logs.

- Verify the subscription's push config in the Console (Pub/Sub → Subscriptions → searchable-ai-traffic-sub → Edit); the auth attribute should be visible there
- Verify the sink filter is exactly `resource.type="http_load_balancer" OR resource.type="cloud_cdn"`. Broader filters can send non-HTTP logs that we reject with 204 (no retry) but also waste your Pub/Sub quota
### The sink isn't routing logs to the topic
GCP requires the sink's service account to have `roles/pubsub.publisher` on the topic. Without it the sink silently drops messages.

- Run `gcloud logging sinks describe searchable-ai-traffic-sink` to find the writer service account
- Grant it publisher on the topic: `gcloud pubsub topics add-iam-policy-binding searchable-ai-traffic --member=serviceAccount:<writer-sa> --role=roles/pubsub.publisher`
### Status stays on 'Waiting for first event' for more than 24 hours
A few possible causes:
- The Log Sink filter doesn't match your service. Confirm your HTTPS Load Balancer logs actually populate Cloud Logging (View → Logs Explorer → filter on `resource.type="http_load_balancer"` and confirm you see entries)
- Your domain in Searchable doesn't match the site served by GCP (check Agent Analytics → Setup → Confirm your domain)
- No AI bot has visited yet. Try visiting your site with a known AI user agent (e.g. `Mozilla/5.0 (compatible; GPTBot/1.0)`) to trigger a test event
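One way to simulate such a visit from your terminal (replace example.com with your confirmed domain):

```shell
# Request the homepage with a known AI crawler user agent
curl -A 'Mozilla/5.0 (compatible; GPTBot/1.0)' https://example.com/
```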
## Removing the integration
To stop sending traffic to Searchable:

1. Delete the sink: `gcloud logging sinks delete searchable-ai-traffic-sink`
2. Delete the subscription: `gcloud pubsub subscriptions delete searchable-ai-traffic-sub`
3. Delete the topic: `gcloud pubsub topics delete searchable-ai-traffic`
4. Revoke the token in Searchable → Agent Analytics → Setup → Tokens

Revoke the token last: if you revoke it while the subscription still exists, every push fails with a 401, which Pub/Sub will retry until the messages expire.
## Next steps
- **See the data**: Open Agent Analytics to see which assistants are crawling your site.
- **Add Search Console**: Correlate AI crawls with search demand.