# Send Cloudflare traffic to Searchable (Worker)

> Deploy a small Cloudflare Worker that forwards AI-bot requests to Searchable. Works on every Cloudflare plan, including Free.

## What this does

The Worker sits at Cloudflare's edge in front of your origin. For every inbound request it:

1. Passes the request through to your origin unchanged, so the response never waits on Searchable
2. Checks the user agent against Searchable's AI-bot registry
3. If it matches, sends a fire-and-forget POST to Searchable with the request metadata

The bot registry is fetched from Searchable and refreshed hourly, so new AI crawlers are picked up automatically — you never need to redeploy.

<Info>
  **Works on any Cloudflare plan, including Free.** Most sites stay well within the Workers free tier (100k requests/day). High-traffic sites can upgrade to Workers Paid for a few dollars a month.
</Info>

## Prerequisites

<Check>A Cloudflare account with your domain on it (the domain doesn't have to be on a paid plan)</Check>
<Check>Permission to deploy Workers and add Worker routes to your zone</Check>
<Check>A Searchable project with your domain confirmed</Check>

## Setup

<Steps>
  <Step title="Generate the Worker script in Searchable">
    1. Open your Searchable dashboard
    2. Go to **LLM Analytics → Setup**
    3. Pick **Cloudflare Worker** as your crawler source
    4. Click **Generate token**

    Searchable templates the token directly into the Worker script in the next panel — no find-and-replace needed. Copy the whole script.

    <Tip>
      The Searchable UI is the easiest way to get the script — the token is auto-filled and you avoid typos. The reference script is below if you'd rather build it yourself.
    </Tip>
  </Step>

  <Step title="Create a new Worker in Cloudflare">
    Open [dash.cloudflare.com](https://dash.cloudflare.com/) → **Workers & Pages** → **Create**.

    1. Pick the **Hello World** template
    2. Name it (e.g. `searchable-tracker`)
    3. Click **Deploy**
  </Step>

  <Step title="Paste the Worker script">
    From the new Worker's overview page:

    1. Click **Edit code**
    2. Replace the entire file contents with the script you copied from Searchable
    3. Click **Save and deploy**
  </Step>

  <Step title="Bind the Worker to your domain">
    A Worker by itself isn't on your domain — it has its own `*.workers.dev` URL. To make it observe your real traffic, add a route.

    From the Worker's **Settings** tab:

    1. Open **Domains & Routes** (older UIs label this **Triggers**)
    2. Click **Add → Route**
    3. Enter your route, e.g. `*yourdomain.com/*` (the leading `*` matches all subdomains)
    4. Pick your zone and save
  </Step>

  <Step title="Verify in Searchable">
    Return to **LLM Analytics → Setup** in your Searchable dashboard. The status strip should show **Connected** within a few minutes once an AI bot hits your site.
  </Step>
</Steps>
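
To sanity-check a route pattern before saving it, the matching rule from the "Bind the Worker to your domain" step can be approximated in a few lines. This is a rough sketch rather than Cloudflare's implementation: `routeMatches` is a hypothetical helper, and it assumes that `*` matches any run of characters and that the scheme is ignored.

```javascript
// Hypothetical helper: approximates Cloudflare route matching for illustration.
// Assumes "*" matches any run of characters and that the scheme is ignored.
function routeMatches(pattern, requestUrl) {
  const url = new URL(requestUrl);
  const target = url.hostname + url.pathname + url.search;
  // Escape regex metacharacters except "*", then turn "*" into ".*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$", "i").test(target);
}

console.log(routeMatches("*yourdomain.com/*", "https://yourdomain.com/"));         // true
console.log(routeMatches("*yourdomain.com/*", "https://www.yourdomain.com/blog")); // true
console.log(routeMatches("*yourdomain.com/*", "https://yourdomain.dev/"));         // false
console.log(routeMatches("yourdomain.com/*", "https://www.yourdomain.com/"));      // false
```

The last call shows why the leading `*` matters: without it, `www.` and other subdomains are not covered by the route.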

## Reference: Worker script

If you'd rather not copy the script from the Searchable UI, here's the reference. Replace `INTEGRATION_TOKEN` with the token you generated.

```js worker.js theme={null}
const SEARCHABLE_ENDPOINT = "https://tracker.searchableanalytics.com/v1/cloudflare-logs";
const SEARCHABLE_BOTS_URL = "https://tracker.searchableanalytics.com/v1/bots.json";

// Quick-start: paste your token here.
// Production: leave blank and set a Worker secret named INTEGRATION_TOKEN
// (Settings → Variables and Secrets). The secret wins if both are set.
const INTEGRATION_TOKEN = "sa_PASTE_YOUR_TOKEN_HERE";

let cachedPatterns = null;
let cachedAt = 0;
let loadPromise = null;
const PATTERNS_TTL_MS = 60 * 60 * 1000;

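// Starts a registry refresh unless one is already in flight. Returns the new
// promise so the caller can pass it to ctx.waitUntil(), or null if none started.
function ensurePatternsLoaded() {
  if (loadPromise) return null;
  loadPromise = (async () => {
    try {
      const resp = await fetch(SEARCHABLE_BOTS_URL, { cf: { cacheTtl: 3600 } });
      if (!resp.ok) return;
      const artifact = await resp.json();
      const patterns = (artifact.bots || [])
        .filter((b) => b.user_agent_pattern)
        .map((b) => {
          try { return new RegExp(b.user_agent_pattern, "i"); }
          catch { return null; }
        })
        .filter(Boolean);
      if (patterns.length > 0) {
        cachedPatterns = patterns;
        cachedAt = Date.now();
      }
    } catch {
      // Keep the previous patterns (or keep failing open) if the refresh fails.
    } finally {
      // Always clear the in-flight marker, including on a non-2xx response,
      // so a later request can retry.
      loadPromise = null;
    }
  })();
  return loadPromise;
}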

const STATIC_ASSET_RE = /\.(css|js|mjs|map|woff2?|ttf|otf|eot|svg|png|jpe?g|gif|webp|avif|ico|mp3|mp4|webm|pdf)(\?|$)/i;

function shouldForward(ua) {
  if (!cachedPatterns) return true; // registry not loaded yet: fail open
  if (!ua) return true;             // no user agent: let Searchable classify it
  return cachedPatterns.some((re) => re.test(ua));
}

async function forwardToSearchable(request, ua, token) {
  try {
    const url = new URL(request.url);
    const cf = request.cf ?? {};
    const entry = {
      RayID: request.headers.get("cf-ray") ?? crypto.randomUUID(),
      ClientRequestMethod: request.method,
      ClientRequestURI: url.pathname,
      ClientRequestPath: url.pathname,
      ClientRequestHost: url.hostname,
      ClientRequestScheme: url.protocol.replace(":", ""),
      ClientRequestUserAgent: ua,
      ClientRequestReferer: request.headers.get("referer") ?? "",
      ClientIP: request.headers.get("cf-connecting-ip") ?? "0.0.0.0",
      ClientCountry: cf.country ?? "",
      EdgeResponseStatus: 0,
      EdgeStartTimestamp: Date.now() * 1_000_000, // Unix time in nanoseconds (ms × 1e6)
      OriginResponseTime: 0,
      EdgeColoCode: cf.colo ?? "",
      CacheStatus: "",
      EdgeResponseBytes: 0,
    };
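    const resp = await fetch(SEARCHABLE_ENDPOINT, {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${token}`,
        "Content-Type": "application/x-ndjson",
        "X-Searchable-Source": "cloudflare-worker",
      },
      body: JSON.stringify(entry) + "\n",
    });
    // Surface rejected events (e.g. 401 for a bad token) in the Worker's logs.
    if (!resp.ok) console.warn(`Searchable responded with HTTP ${resp.status}`);
  } catch (err) {
    // Never let a tracking failure affect the visitor's request.
    console.warn("Forward to Searchable failed:", err);
  }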
}

export default {
  async fetch(request, env, ctx) {
    ctx.passThroughOnException();

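    // Static assets go straight to the origin and are never forwarded.
    if (STATIC_ASSET_RE.test(request.url)) return fetch(request);

    if (!cachedPatterns || Date.now() - cachedAt > PATTERNS_TTL_MS) {
      // Keep the refresh alive after the response is sent, if this request started it.
      const refresh = ensurePatternsLoaded();
      if (refresh) ctx.waitUntil(refresh);
    }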

    const ua = request.headers.get("user-agent") ?? "";
    const token = env.INTEGRATION_TOKEN || INTEGRATION_TOKEN;
    if (token && shouldForward(ua)) {
      ctx.waitUntil(forwardToSearchable(request, ua, token));
    }
    return fetch(request);
  },
};
```

A few things worth knowing about this script:

* **It never blocks the response.** The forward to Searchable runs inside `ctx.waitUntil(...)`, so even if Searchable is unreachable, your users see no slowdown.
* **It fails open.** While the bot registry is loading (cold start), the Worker forwards every non-static request and lets Searchable's server-side classifier do the filtering. Once the registry loads, it filters at the edge to cut traffic. Requests with no user agent are always forwarded.
* **Static assets are skipped.** Requests for CSS, JS, images, fonts, and video go straight to the origin with no registry check and no POST to Searchable, because those almost never come from AI crawlers. The Worker is still invoked for them, so they count toward your Workers request quota.
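
The fail-open behaviour is easier to see in isolation. The sketch below restates the filtering logic as pure functions so it can run outside a Worker. The registry entries are made-up samples; only the `bots[].user_agent_pattern` shape is taken from the script above, and `compilePatterns` is a name introduced here for illustration.

```javascript
// Sketch of the fail-open filter. The registry entries are made-up samples;
// only the `bots[].user_agent_pattern` shape comes from the reference script.
function compilePatterns(artifact) {
  return (artifact.bots || [])
    .filter((b) => b.user_agent_pattern)
    .map((b) => {
      try { return new RegExp(b.user_agent_pattern, "i"); }
      catch { return null; } // skip patterns that are not valid regexes
    })
    .filter(Boolean);
}

function shouldForward(patterns, ua) {
  if (!patterns) return true; // registry not loaded yet: forward everything
  if (!ua) return true;       // no user agent: let Searchable classify it
  return patterns.some((re) => re.test(ua));
}

const sampleRegistry = {
  bots: [
    { user_agent_pattern: "GPTBot" },
    { user_agent_pattern: "ClaudeBot" },
    { user_agent_pattern: "(" }, // invalid regex, dropped during compilation
  ],
};
const patterns = compilePatterns(sampleRegistry);

console.log(patterns.length);                                    // 2
console.log(shouldForward(null, "Mozilla/5.0"));                 // true (cold start)
console.log(shouldForward(patterns, "GPTBot/1.0"));              // true
console.log(shouldForward(patterns, "Mozilla/5.0 (Macintosh)")); // false
console.log(shouldForward(patterns, ""));                        // true
```

A malformed pattern in the registry is dropped rather than breaking the filter, and a missing registry or missing user agent errs on the side of forwarding.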

## Storing the token as a secret (recommended)

For tighter security, store the token as a Worker secret instead of leaving it inline.

1. In your Worker, open **Settings → Variables and Secrets**
2. Add a secret named `INTEGRATION_TOKEN` with your `sa_…` token as the value
3. In the script, blank the inline `INTEGRATION_TOKEN` constant: `const INTEGRATION_TOKEN = "";`
4. Redeploy

The Worker reads the secret if present and falls back to the constant otherwise. Same script, no other edits.

This means you can rotate the token in Cloudflare's UI without ever touching the Worker code.
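
The precedence between the secret and the inline constant comes down to a single `||`. A minimal sketch, using a hypothetical `resolveToken` helper and placeholder values rather than real tokens:

```javascript
// Mirrors the token lookup in the script: the secret wins, the constant is the fallback.
function resolveToken(env, inlineToken) {
  return env.INTEGRATION_TOKEN || inlineToken;
}

// Placeholder values for illustration, not real tokens.
const fromSecret = resolveToken({ INTEGRATION_TOKEN: "sa_from_secret" }, "sa_inline");
const fromInline = resolveToken({}, "sa_inline");
const missing = resolveToken({}, "");

console.log(fromSecret);     // sa_from_secret
console.log(fromInline);     // sa_inline
console.log(missing === ""); // true
```

In the last case the token is an empty string, so the script's `if (token && ...)` guard skips forwarding: the site keeps working, but no events reach Searchable.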

## Cost

Every request that matches the route invokes the Worker once, whether or not it comes from an AI bot, so size your plan on total route traffic rather than on bot traffic alone. The Workers Free plan includes 100,000 requests per day, or about 3 million per month, which covers most sites. If you exceed it, the [Workers Paid plan](https://www.cloudflare.com/plans/developer-platform-pricing/) is \$5/month for 10 million requests.

The Worker itself does no CPU-intensive work (a regex check and a `fetch` call), so CPU time is negligible.
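
As a rough way to check where a site stands, the free-tier arithmetic can be written out. This is a back-of-the-envelope sketch: `freeTierHeadroom` is a hypothetical helper, the 30-day month is an approximation, and the only figure taken from this page is the 100,000 requests/day allowance.

```javascript
const FREE_REQUESTS_PER_DAY = 100_000; // Workers Free allowance quoted on this page

// Hypothetical helper: estimates free-tier headroom from average daily Worker invocations.
function freeTierHeadroom(workerRequestsPerDay) {
  return {
    monthlyAllowance: FREE_REQUESTS_PER_DAY * 30, // approximate 30-day month
    fitsFreeTier: workerRequestsPerDay <= FREE_REQUESTS_PER_DAY,
    dailyHeadroom: FREE_REQUESTS_PER_DAY - workerRequestsPerDay,
  };
}

console.log(freeTierHeadroom(40_000));
// { monthlyAllowance: 3000000, fitsFreeTier: true, dailyHeadroom: 60000 }
console.log(freeTierHeadroom(250_000));
// { monthlyAllowance: 3000000, fitsFreeTier: false, dailyHeadroom: -150000 }
```

Because the allowance is counted per day, a single traffic spike can exceed it even when the monthly average looks comfortable.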

## Troubleshooting

<AccordionGroup>
  <Accordion title="Status stays on 'Waiting for first event'">
    Most likely the route isn't bound correctly.

    * In Cloudflare, open the Worker → **Settings → Domains & Routes**
    * Confirm the route pattern matches your live domain (e.g. `*yourdomain.com/*`, not `*yourdomain.dev/*`)
    * Confirm the route's **Zone** is the zone that's actually serving traffic for the domain

    If the route is right, hit your site with a curl using a known AI user agent:

    ```bash theme={null}
    curl -H "User-Agent: GPTBot/1.0 (+https://openai.com/gptbot)" https://yourdomain.com/
    ```

    Then click **Check** in Searchable. If the request doesn't appear in the dashboard, open the Worker's logs (Cloudflare → Worker → **Logs**) to confirm the Worker was invoked for that request and to look for errors from the forward to Searchable.
  </Accordion>

  <Accordion title="Worker logs show 401 from Searchable">
    The token is missing or wrong.

    * Confirm `INTEGRATION_TOKEN` (inline) or the secret named `INTEGRATION_TOKEN` is set to a `sa_…` value
    * If you've recently revoked the token in Searchable, generate a new one and update the Worker
    * The token must have **no quotes** around it in the secret value
  </Accordion>

  <Accordion title="My site is fronted by Cloudflare but I don't manage the zone">
    You need permission to add Worker routes on the zone. If you don't have it:

    * Ask the zone admin to add the route, or
    * Use the **[custom REST API](/setup/custom)** path from your application server
  </Accordion>

  <Accordion title="I see double-counted events">
    If you also have **[Cloudflare Logpush](/setup/cloudflare)** sending traffic to Searchable, you'll get the same request from both sources. Searchable de-duplicates server-side using the Cloudflare Ray ID, so the dashboard shows each request once. If you'd rather not run both, pick one and disable the other.
  </Accordion>
</AccordionGroup>
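
Several of the 401 causes listed above can be caught before deploying. The check below is a hypothetical pre-flight helper, not part of the Searchable script. It assumes only what this page states about tokens (the `sa_` prefix, no surrounding quotes, and the `sa_PASTE_YOUR_TOKEN_HERE` placeholder) and makes no assumption about the rest of the token format.

```javascript
// Hypothetical pre-flight check for the token value, based on the 401 checklist.
// Only the "sa_" prefix and the placeholder string are taken from this page.
function checkToken(value) {
  const problems = [];
  if (!value) {
    problems.push("token is empty");
    return problems;
  }
  const trimmed = value.trim();
  if (value !== trimmed) problems.push("leading or trailing whitespace");
  if (/^["'].*["']$/.test(trimmed)) problems.push("wrapped in quotes");
  if (!trimmed.replace(/^["']/, "").startsWith("sa_")) problems.push('does not start with "sa_"');
  if (value.includes("PASTE_YOUR_TOKEN_HERE")) problems.push("still the placeholder");
  return problems;
}

console.log(checkToken("sa_abc123"));                // []
console.log(checkToken('"sa_abc123"'));              // [ 'wrapped in quotes' ]
console.log(checkToken("sa_PASTE_YOUR_TOKEN_HERE")); // [ 'still the placeholder' ]
console.log(checkToken(""));                         // [ 'token is empty' ]
```

An empty result only means the value looks well-formed. A revoked or mistyped token still returns 401 and has to be regenerated in Searchable.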

## Removing the integration

1. Cloudflare → Worker → **Settings → Domains & Routes** → remove the route (the Worker stops observing traffic)
2. Optionally delete the Worker itself
3. Searchable → **LLM Analytics → Setup → Tokens** → revoke the token

## Next steps

<CardGroup cols={2}>
  <Card title="Cloudflare Logpush" icon="cloudflare" href="/setup/cloudflare">
    On the Enterprise plan? See the native Logpush path.
  </Card>

  <Card title="See the data" icon="chart-line" href="/using-searchable/visibility-tracking">
    Open LLM Analytics to see which assistants are crawling your site.
  </Card>
</CardGroup>
