
# Send Google Cloud traffic to Searchable

> Forward GCP HTTPS Load Balancer and Cloud CDN logs to Searchable via a Pub/Sub push subscription routed through a small Cloud Run relay.

## What this does

Google Cloud Logging captures every request that hits your HTTPS Load Balancer or Cloud CDN. You'll route those logs to a Pub/Sub topic, push them through a small Cloud Run relay inside your project, and forward them to Searchable. We classify AI bots at our edge and drop everything else.

<Info>
  Everything runs inside your GCP project. The relay is stateless, idempotent, and only forwards payloads; it doesn't parse or store them.
</Info>
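
To make the data flow concrete, here is a minimal sketch of what a Pub/Sub push body looks like when it carries a Cloud Logging entry, and how the request fields could be read out of it. The relay itself does none of this. `decodePushEnvelope` is a hypothetical debugging helper, and the sample entry, message ID, and project name are made up for illustration.

```javascript
// Hypothetical debugging helper: the relay does not do this, it forwards the body as-is.
// Pub/Sub push wraps each message as { message: { data: <base64>, ... }, subscription }.
function decodePushEnvelope(rawBody) {
  const envelope = JSON.parse(rawBody);
  // A log sink publishes the LogEntry as JSON, base64-encoded in message.data.
  const json = Buffer.from(envelope.message.data, "base64").toString("utf8");
  const entry = JSON.parse(json);
  const req = entry.httpRequest ?? {};
  return {
    messageId: envelope.message.messageId,
    resourceType: entry.resource?.type,
    url: req.requestUrl,
    userAgent: req.userAgent,
    status: req.status,
  };
}

// Made-up sample log entry and envelope (assumed values, not real traffic).
const sampleEntry = {
  resource: { type: "http_load_balancer" },
  httpRequest: {
    requestMethod: "GET",
    requestUrl: "https://example.com/pricing",
    status: 200,
    userAgent: "Mozilla/5.0 (compatible; GPTBot/1.0)",
  },
};
const sampleBody = JSON.stringify({
  message: {
    data: Buffer.from(JSON.stringify(sampleEntry)).toString("base64"),
    messageId: "1234567890",
  },
  subscription: "projects/my-project/subscriptions/searchable-ai-traffic-sub",
});

console.log(decodePushEnvelope(sampleBody));
```

The user agent inside `httpRequest` is presumably what the bot classification relies on, which is why the sink filter is limited to load balancer and CDN request logs.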

## Prerequisites

<Check>A GCP project running an HTTPS Load Balancer or Cloud CDN</Check>
<Check>`roles/pubsub.admin`, `roles/logging.admin`, `roles/run.admin`, and permission to create service accounts (for example `roles/iam.serviceAccountAdmin`) on that project, or equivalent</Check>
<Check>`gcloud` CLI installed and authenticated, or access to Cloud Shell</Check>
<Check>A Searchable project with your domain confirmed</Check>

## Setup

<Steps>
  <Step title="Generate an integration token in Searchable">
    1. Open your Searchable dashboard
    2. Go to **LLM Analytics → Setup**
    3. Pick **Google Cloud Platform** as your crawler source
    4. Click **Generate token**

    Copy the token now — it starts with `sa_…` and won't be shown again. You can always generate a new one if you lose it.
  </Step>

  <Step title="Create the Pub/Sub topic">
    In Cloud Shell (or any terminal with `gcloud` authenticated to the target project):

    ```bash theme={null}
    gcloud pubsub topics create searchable-ai-traffic
    ```

    This is a dedicated topic that will hold log entries en route to Searchable. Keeping it separate from any other logging pipeline you have makes troubleshooting and removal trivial.
  </Step>

  <a id="deploy-the-cloud-run-relay" />

  <Step title="Deploy the Cloud Run relay">
    Create an empty directory and add two files. The relay is a single HTTP handler that adds the `X-Searchable-Token` header and forwards the Pub/Sub push body to Searchable unchanged.

    `package.json`:

    ```json theme={null}
    {
      "name": "searchable-relay",
      "type": "module",
      "main": "index.js",
      "scripts": { "start": "node index.js" },
      "dependencies": {
        "hono": "^4.0.0",
        "@hono/node-server": "^1.0.0"
      }
    }
    ```

    `index.js`:

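    ```js theme={null}
    import { Hono } from "hono";
    import { serve } from "@hono/node-server";

    const app = new Hono();

    // Simple health check for manual testing.
    app.get("/", (c) => c.text("ok", 200));

    // Pub/Sub push handler: forward the raw body to Searchable with the token attached.
    app.post("/", async (c) => {
      const body = await c.req.text();
      const upstream = await fetch(process.env.SEARCHABLE_ENDPOINT, {
        method: "POST",
        headers: {
          "content-type": "application/json",
          "X-Searchable-Token": process.env.SEARCHABLE_TOKEN,
        },
        body,
      });
      // Pass Searchable's status through so Pub/Sub acks on success and retries on errors.
      return c.body(null, upstream.status);
    });

    // Cloud Run supplies the listening port through the PORT environment variable.
    serve({ fetch: app.fetch, port: parseInt(process.env.PORT ?? "8080", 10) });
    ```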

    Create a dedicated service account that Pub/Sub will use to invoke the relay, deploy the relay with `--no-allow-unauthenticated`, then grant that service account `roles/run.invoker` on the service. Replace `<TOKEN>` with the token from step 1:

    ```bash theme={null}
    gcloud iam service-accounts create searchable-pubsub-invoker \
      --display-name="Searchable Pub/Sub invoker"

    gcloud run deploy searchable-relay \
      --source=. \
      --region=us-central1 \
      --no-allow-unauthenticated \
      --set-env-vars=SEARCHABLE_ENDPOINT=https://tracker.searchableanalytics.com/v1/gcp-logs,SEARCHABLE_TOKEN=<TOKEN>

    gcloud run services add-iam-policy-binding searchable-relay \
      --region=us-central1 \
      --member=serviceAccount:searchable-pubsub-invoker@$(gcloud config get-value project).iam.gserviceaccount.com \
      --role=roles/run.invoker
    ```

    Cloud Run rejects any request that doesn't carry a valid identity token for a principal holding `roles/run.invoker` on the service, so nobody who merely finds the URL can call the relay.

    <Tip>
      For production, store the token in Secret Manager instead of passing it via `--set-env-vars`, which leaves the cleartext value in `gcloud run services describe` output and your shell history. Create a secret, grant the Cloud Run runtime service account `roles/secretmanager.secretAccessor`, then remove `SEARCHABLE_TOKEN=<TOKEN>` from `--set-env-vars` (keep `SEARCHABLE_ENDPOINT` there) and add `--set-secrets=SEARCHABLE_TOKEN=projects/<your-project>/secrets/searchable-token:latest`.
    </Tip>

    Copy the **Service URL** printed by `gcloud run deploy` — you'll use it in the next step.
  </Step>

  <Step title="Create the push subscription">
    Replace `<your Cloud Run URL>` with the Service URL from the previous step. Pub/Sub attaches an OIDC token issued for `searchable-pubsub-invoker` to every push request:

    ```bash theme={null}
    gcloud pubsub subscriptions create searchable-ai-traffic-sub \
      --topic=searchable-ai-traffic \
      --push-endpoint=<your Cloud Run URL> \
      --push-auth-service-account=searchable-pubsub-invoker@$(gcloud config get-value project).iam.gserviceaccount.com \
      --ack-deadline=30 \
      --min-retry-delay=10s \
      --max-retry-delay=600s
    ```

    The Searchable integration token is injected by the relay; the OIDC service-account auth here is what gates access to the relay itself.
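
    If pushes are rejected and you want to see which identity Pub/Sub is presenting, the claims inside the OIDC token can be inspected. The sketch below is a debugging aid only, not part of the relay: `decodeJwtClaims` is a hypothetical helper, it does not verify the signature (Cloud Run does that before the request reaches your code), and the sample token, project name, and URL are made up.

    ```javascript
    // Debugging sketch only: decodes JWT claims WITHOUT verifying the signature.
    // Cloud Run performs the real verification before the request reaches the relay.
    function decodeJwtClaims(authorizationHeader) {
      const token = authorizationHeader.replace(/^Bearer\s+/i, "");
      const parts = token.split(".");
      if (parts.length !== 3) throw new Error("not a JWT");
      // JWT segments are base64url-encoded JSON; the second segment holds the claims.
      return JSON.parse(Buffer.from(parts[1], "base64url").toString("utf8"));
    }

    // Made-up token for illustration; real tokens arrive in the Authorization header.
    const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString("base64url");
    const sampleHeader = "Bearer " + [
      encode({ alg: "RS256", typ: "JWT" }),
      encode({
        iss: "https://accounts.google.com",
        email: "searchable-pubsub-invoker@my-project.iam.gserviceaccount.com",
        aud: "https://searchable-relay-example.a.run.app",
      }),
      "signature-omitted",
    ].join(".");

    const claims = decodeJwtClaims(sampleHeader);
    console.log(claims.email, claims.aud);
    ```

    In a real push, `email` should be the `searchable-pubsub-invoker` address, and `aud` should match the push endpoint unless a custom audience was configured on the subscription.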
  </Step>

  <Step title="Create the Log Sink">
    Route HTTPS Load Balancer and Cloud CDN logs into the topic:

    ```bash theme={null}
    gcloud logging sinks create searchable-ai-traffic-sink \
      pubsub.googleapis.com/projects/$(gcloud config get-value project)/topics/searchable-ai-traffic \
      --log-filter='resource.type="http_load_balancer" OR resource.type="cloud_cdn"'
    ```

    GCP prints the sink's writer service-account email after this command. Grant it `roles/pubsub.publisher` on the `searchable-ai-traffic` topic. The exact `gcloud` command is shown in the output; alternatively, the Console shows a yellow banner asking you to authorize the sink.
  </Step>
</Steps>

## Verifying the connection

In Searchable:

1. Go to **LLM Analytics → Setup**
2. Look at the Google Cloud Platform card status
3. Click **Check** if it still shows "Waiting for first event"

| Status                      | What it means                                                                                                                       |
| --------------------------- | ----------------------------------------------------------------------------------------------------------------------------------- |
| **Waiting for first event** | The subscription is configured but no AI bot has hit your site yet. Typical wait is a few hours for sites that are already indexed. |
| **Connected**               | Events are arriving. The card shows the count from the last 24 hours.                                                               |

## Geo enrichment caveat

GCP HTTPS Load Balancer logs do **not** include geographic enrichment by default — `country`, `region`, and `city` will arrive empty in Searchable. The integration is otherwise fully functional. If you need geo, the simplest workaround is to also instrument your site with the Searchable Beacon (`s.js`), which derives geo from the visitor's request.

## Troubleshooting

<AccordionGroup>
  <Accordion title="Searchable shows 401 errors or the card stays 'Not connected'">
    The relay isn't sending a valid `X-Searchable-Token` header: either `SEARCHABLE_TOKEN` is missing or wrong on the Cloud Run service, or the token has been revoked.

    * Confirm the Cloud Run service has `SEARCHABLE_TOKEN` set. List env var **names** only (not values) with `gcloud run services describe searchable-relay --region=us-central1 --format='value(spec.template.spec.containers[0].env[].name)'`. If you used Secret Manager, the entry shows as `SEARCHABLE_TOKEN` and the value lives in the secret
    * If you've recently revoked the token in Searchable, generate a new one and redeploy the relay with the new value
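
    A common cause is a placeholder or truncated value ending up in the environment. As a rough sketch, a startup check along these lines could be added to the relay to fail fast. `checkRelayConfig` is a hypothetical helper, not part of the relay shown in the setup steps, and the `sa_` prefix check relies only on what step 1 says about generated tokens. It reports problems without ever printing the token value.

    ```javascript
    // Hypothetical startup check; not part of the relay shown in the setup steps.
    // Returns a list of human-readable problems and never includes the token itself.
    function checkRelayConfig(env) {
      const problems = [];
      if (!env.SEARCHABLE_ENDPOINT) {
        problems.push("SEARCHABLE_ENDPOINT is not set");
      }
      const token = env.SEARCHABLE_TOKEN;
      if (!token) {
        problems.push("SEARCHABLE_TOKEN is not set");
      } else if (token !== token.trim()) {
        problems.push("SEARCHABLE_TOKEN has leading or trailing whitespace");
      } else if (!token.startsWith("sa_")) {
        // Step 1 says generated tokens start with "sa_".
        problems.push("SEARCHABLE_TOKEN does not start with sa_");
      }
      return problems;
    }

    // Example: the <TOKEN> placeholder was never replaced.
    console.log(
      checkRelayConfig({
        SEARCHABLE_ENDPOINT: "https://tracker.searchableanalytics.com/v1/gcp-logs",
        SEARCHABLE_TOKEN: "<TOKEN>",
      })
    );
    // prints: [ 'SEARCHABLE_TOKEN does not start with sa_' ]
    ```

    In the relay, this could be called once with `process.env` before `serve(...)`, exiting if the returned list is non-empty.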
  </Accordion>

  <Accordion title="Pub/Sub reports delivery errors or messages pile up unacked">
    Pub/Sub can't reach the relay, or the log filter is sending non-HTTP logs.

    * Verify the subscription's push endpoint in the Console (Pub/Sub → Subscriptions → searchable-ai-traffic-sub → Edit) matches the deployed Cloud Run URL
    * Check Cloud Run logs (`gcloud run services logs read searchable-relay --region=us-central1 --limit=50`) for non-2xx responses; Pub/Sub retries those
    * Verify the sink filter is exactly `resource.type="http_load_balancer" OR resource.type="cloud_cdn"`. Broader filters send non-HTTP logs, which we drop with a 204 (so Pub/Sub doesn't retry) but which still consume your Pub/Sub quota
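
    Because the relay passes Searchable's status code straight back (`c.body(null, upstream.status)`), that status decides whether Pub/Sub acknowledges or retries a message. The sketch below encodes the status codes that Pub/Sub push delivery treats as an acknowledgement; `pubsubOutcome` is a hypothetical helper for reading Cloud Run logs, not something the relay needs.

    ```javascript
    // Status codes that Pub/Sub push delivery treats as an acknowledgement.
    // Any other response is a negative acknowledgement and the message is retried.
    const PUBSUB_ACK_STATUSES = new Set([102, 200, 201, 202, 204]);

    function pubsubOutcome(status) {
      return PUBSUB_ACK_STATUSES.has(status) ? "ack" : "retry";
    }

    for (const status of [204, 401, 500]) {
      console.log(status, pubsubOutcome(status));
    }
    // prints:
    // 204 ack
    // 401 retry
    // 500 retry
    ```

    A steady stream of `401` or `5xx` lines in the relay logs therefore shows up in Pub/Sub as a growing unacked backlog.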
  </Accordion>

  <Accordion title="The sink isn't routing logs to the topic">
    GCP requires the sink's service account to have `roles/pubsub.publisher` on the topic. Without it the sink silently drops messages.

    * Run `gcloud logging sinks describe searchable-ai-traffic-sink` and read the `writerIdentity` field to find the writer service account
    * Grant it publisher on the topic: `gcloud pubsub topics add-iam-policy-binding searchable-ai-traffic --member=serviceAccount:<writer-sa> --role=roles/pubsub.publisher`. The `writerIdentity` value already includes the `serviceAccount:` prefix, so don't add it twice
  </Accordion>

  <Accordion title="`gcloud run deploy` fails with an org-policy or ingress error">
    Some GCP orgs enforce policies that further restrict Cloud Run. The default setup already deploys with `--no-allow-unauthenticated`, which satisfies most org policies — these errors usually point at one of the others:

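    * `constraints/run.allowedIngress` forces internal-only ingress: redeploy with `--ingress=internal` or `--ingress=internal-and-cloud-load-balancing`. Cloud Run treats Pub/Sub push from a subscription in the same project as internal traffic, so no extra connector should be needed; alternatively, run the relay on Cloud Functions instead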
    * `constraints/iam.disableServiceAccountCreation` blocks the `gcloud iam service-accounts create` command: use an existing service account your org already trusts and reuse its email in both `add-iam-policy-binding` and `--push-auth-service-account`
    * `constraints/iam.allowedPolicyMemberDomains` blocks the invoker binding: have an admin add the project's service-agent domain to the allow-list, then re-run the binding command
  </Accordion>

  <Accordion title="Status stays on 'Waiting for first event' for more than 24 hours">
    A few possible causes:

    * The Log Sink filter doesn't match your service — confirm your HTTPS Load Balancer logs actually populate Cloud Logging (View → Logs Explorer → filter on `resource.type="http_load_balancer"` and confirm you see entries)
    * Your domain in Searchable doesn't match the site served by GCP (check **LLM Analytics → Setup → Confirm your domain**)
    * No AI bot has visited yet — try visiting your site with a known AI user agent (e.g. `Mozilla/5.0 (compatible; GPTBot/1.0)`) to trigger a test event

    If Pub/Sub's "Push attempt" metrics show successful 204 responses but Searchable still says no events, that points to a domain mismatch.
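
    The test visit suggested above can be scripted. This is a minimal sketch assuming Node 18 or later (for the global `fetch`); `probeAsBot` is a hypothetical helper, and the URL in the usage comment is a placeholder for a page served by your load balancer.

    ```javascript
    // Sends one GET with an AI-crawler User-Agent so a matching log entry is produced.
    // fetchImpl is injectable for testing; it defaults to the global fetch (Node 18+).
    async function probeAsBot(url, fetchImpl = fetch) {
      const res = await fetchImpl(url, {
        method: "GET",
        headers: { "User-Agent": "Mozilla/5.0 (compatible; GPTBot/1.0)" },
      });
      return res.status;
    }

    // Usage (replace with a URL served by your load balancer):
    // probeAsBot("https://example.com/").then((status) => console.log("HTTP", status));
    ```

    After running it, the request should appear in Logs Explorer under `resource.type="http_load_balancer"` with that user agent. If it does not, the problem is upstream of the sink.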
  </Accordion>
</AccordionGroup>

## Removing the integration

To stop sending traffic to Searchable and tear down the GCP-side resources:

```bash theme={null}
gcloud logging sinks delete searchable-ai-traffic-sink
gcloud pubsub subscriptions delete searchable-ai-traffic-sub
gcloud pubsub topics delete searchable-ai-traffic
gcloud run services delete searchable-relay --region=us-central1
gcloud iam service-accounts delete \
  searchable-pubsub-invoker@$(gcloud config get-value project).iam.gserviceaccount.com
```

Then go to Searchable → **LLM Analytics → Setup → Tokens** and revoke the integration token.

Both sides are independent — revoking the token alone is enough to stop ingestion immediately, even if the GCP-side resources stay configured (push deliveries will start returning `401`, which Pub/Sub will retry until the messages expire).

## Next steps

<CardGroup cols={2}>
  <Card title="See the data" icon="chart-line" href="/using-searchable/visibility-tracking">
    Open LLM Analytics to see which assistants are crawling your site.
  </Card>

  <Card title="Add Search Console" icon="google" href="/integrations/google-search-console">
    Correlate AI crawls with search demand.
  </Card>
</CardGroup>
