What this does

The Worker sits at Cloudflare’s edge in front of your origin. For every inbound request it:
  1. Lets the request flow through to your origin unchanged (no added latency for users)
  2. Checks the user agent against Searchable’s AI-bot registry
  3. If it matches, sends a fire-and-forget POST to Searchable with the request metadata
The bot registry is fetched from Searchable and refreshed hourly, so new AI crawlers are picked up automatically — you never need to redeploy.
Works on any Cloudflare plan, including Free. Most sites stay well within the Workers free tier (100k requests/day). High-traffic sites can upgrade to Workers Paid for a few dollars a month.
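As a rough illustration of step 2, the sketch below compiles a registry payload into case-insensitive regular expressions and tests a user agent against them. The `{ bots: [{ user_agent_pattern }] }` shape follows the reference script further down; the sample entries and the helper names `compilePatterns` and `matchesRegistry` are hypothetical.

```javascript
// Sample registry payload. The entries are illustrative, not Searchable's real list.
const sampleRegistry = {
  bots: [
    { name: "GPTBot", user_agent_pattern: "GPTBot" },
    { name: "ClaudeBot", user_agent_pattern: "ClaudeBot" },
    { name: "Broken", user_agent_pattern: "(" }, // invalid regex, skipped below
  ],
};

// Turn each pattern string into a case-insensitive RegExp, dropping invalid ones.
function compilePatterns(registry) {
  return (registry.bots || [])
    .filter((b) => b.user_agent_pattern)
    .map((b) => {
      try { return new RegExp(b.user_agent_pattern, "i"); }
      catch { return null; }
    })
    .filter(Boolean);
}

// True when any compiled pattern matches the user agent string.
function matchesRegistry(patterns, ua) {
  return patterns.some((re) => re.test(ua));
}

const patterns = compilePatterns(sampleRegistry);
console.log(matchesRegistry(patterns, "Mozilla/5.0 (compatible; GPTBot/1.0)"));
console.log(matchesRegistry(patterns, "Mozilla/5.0 (Windows NT 10.0) Chrome/120"));
```

Invalid patterns are dropped rather than thrown, so one bad registry entry would not stop the remaining patterns from matching.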

Prerequisites

A Cloudflare account with your domain on it (the domain doesn’t have to be on a paid plan)
Permission to deploy Workers and add Worker routes to your zone
A Searchable project with your domain confirmed

Setup

1. Generate the Worker script in Searchable

  1. Open your Searchable dashboard
  2. Go to Agent Analytics → Setup
  3. Pick Cloudflare Worker as your crawler source
  4. Click Generate token
Searchable templates the token directly into the Worker script in the next panel — no find-and-replace needed. Copy the whole script.
The Searchable UI is the easiest way to get the script — the token is auto-filled and you avoid typos. The reference script is below if you’d rather build it yourself.
2. Create a new Worker in Cloudflare

Open dash.cloudflare.com → Workers & Pages → Create.
  1. Pick the Hello World template
  2. Name it (e.g. searchable-tracker)
  3. Click Deploy
3. Paste the Worker script

From the new Worker’s overview page:
  1. Click Edit code
  2. Replace the entire file contents with the script you copied from Searchable
  3. Click Save and deploy
4. Bind the Worker to your domain

A Worker by itself isn’t on your domain: it has its own *.workers.dev URL. To make it observe your real traffic, add a route.
From the Worker’s Settings tab:
  1. Open Domains & Routes (older UIs label this Triggers)
  2. Click Add → Route
  3. Enter your route, e.g. *yourdomain.com/* (the leading * matches all subdomains)
  4. Pick your zone and save
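To get a feel for what a pattern like *yourdomain.com/* covers, here is a simplified matcher. It assumes that * matches zero or more of any character and that matching runs against hostname plus path; `routeToRegExp` and `routeMatches` are hypothetical helpers for illustration, not Cloudflare's implementation.

```javascript
// Convert a route pattern into a RegExp: escape regex metacharacters,
// then treat each * as "zero or more of any character".
function routeToRegExp(pattern) {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$");
}

// Compare the pattern against hostname + path (scheme and query are ignored here).
function routeMatches(pattern, url) {
  const { hostname, pathname } = new URL(url);
  return routeToRegExp(pattern).test(hostname + pathname);
}

const route = "*yourdomain.com/*";
for (const url of [
  "https://yourdomain.com/",
  "https://www.yourdomain.com/blog/post",
  "https://yourdomain.dev/",
]) {
  console.log(url, routeMatches(route, url));
}
```

Under these assumptions the apex domain and its subdomains match, while the .dev hostname does not. A leading * with no dot would also match any other hostname that ends in yourdomain.com, which is harmless only because a route applies to hostnames in your own zone.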
5. Verify in Searchable

Return to Agent Analytics → Setup in your Searchable dashboard. The status strip should show Connected within a few minutes once an AI bot hits your site.

Reference: Worker script

If you’d rather not copy the script from the Searchable UI, here’s the reference. Replace INTEGRATION_TOKEN with the token you generated.
worker.js
const SEARCHABLE_ENDPOINT = "https://searchable-tracker.searchable.workers.dev/v1/cloudflare-logs";
const SEARCHABLE_BOTS_URL = "https://searchable-tracker.searchable.workers.dev/v1/bots.json";

// Quick-start: paste your token here.
// Production: leave blank and set a Worker secret named INTEGRATION_TOKEN
// (Settings → Variables and Secrets). The secret wins if both are set.
const INTEGRATION_TOKEN = "sa_PASTE_YOUR_TOKEN_HERE";

let cachedPatterns = null;
let cachedAt = 0;
let loadPromise = null;
const PATTERNS_TTL_MS = 60 * 60 * 1000;

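// Refresh the bot registry in the background. Only one fetch runs at a time.
function ensurePatternsLoaded() {
  if (loadPromise) return;
  loadPromise = (async () => {
    try {
      const resp = await fetch(SEARCHABLE_BOTS_URL, { cf: { cacheTtl: 3600 } });
      if (!resp.ok) return;
      const artifact = await resp.json();
      // Compile each pattern; skip entries that are missing or not valid regex.
      const patterns = (artifact.bots || [])
        .filter((b) => b.user_agent_pattern)
        .map((b) => {
          try { return new RegExp(b.user_agent_pattern, "i"); }
          catch { return null; }
        })
        .filter(Boolean);
      if (patterns.length > 0) {
        cachedPatterns = patterns;
        cachedAt = Date.now();
      }
    } catch {
      // Keep the previous patterns (if any) and retry on a later request.
    } finally {
      // Always clear the in-flight marker, including on the early return above,
      // so a failed fetch cannot block every later refresh in this isolate.
      loadPromise = null;
    }
  })();
}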

const STATIC_ASSET_RE = /\.(css|js|mjs|map|woff2?|ttf|otf|eot|svg|png|jpe?g|gif|webp|avif|ico|mp3|mp4|webm|pdf)(\?|$)/i;

function shouldForward(ua) {
  if (!cachedPatterns) return true;
  if (!ua) return true;
  return cachedPatterns.some((re) => re.test(ua));
}

async function forwardToSearchable(request, ua, token) {
  try {
    const url = new URL(request.url);
    const cf = request.cf ?? {};
    const entry = {
      RayID: request.headers.get("cf-ray") ?? crypto.randomUUID(),
      ClientRequestMethod: request.method,
      ClientRequestURI: url.pathname,
      ClientRequestPath: url.pathname,
      ClientRequestHost: url.hostname,
      ClientRequestScheme: url.protocol.replace(":", ""),
      ClientRequestUserAgent: ua,
      ClientRequestReferer: request.headers.get("referer") ?? "",
      ClientIP: request.headers.get("cf-connecting-ip") ?? "0.0.0.0",
      ClientCountry: cf.country ?? "",
      EdgeResponseStatus: 0,
      EdgeStartTimestamp: Date.now() * 1_000_000,
      OriginResponseTime: 0,
      EdgeColoCode: cf.colo ?? "",
      CacheStatus: "",
      EdgeResponseBytes: 0,
    };
    await fetch(SEARCHABLE_ENDPOINT, {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${token}`,
        "Content-Type": "application/x-ndjson",
        "X-Searchable-Source": "cloudflare-worker",
      },
      body: JSON.stringify(entry) + "\n",
    });
  } catch {}
}

export default {
  async fetch(request, env, ctx) {
    ctx.passThroughOnException();

    if (STATIC_ASSET_RE.test(request.url)) return fetch(request);

    if (!cachedPatterns || Date.now() - cachedAt > PATTERNS_TTL_MS) {
      ensurePatternsLoaded();
    }

    const ua = request.headers.get("user-agent") ?? "";
    const token = env.INTEGRATION_TOKEN || INTEGRATION_TOKEN;
    if (token && shouldForward(ua)) {
      ctx.waitUntil(forwardToSearchable(request, ua, token));
    }
    return fetch(request);
  },
};
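The request body the script sends is NDJSON: one JSON object per line, each terminated by a newline. The sketch below shows that wire format in isolation. `toNdjson`, `parseNdjson`, and the sample values are hypothetical, and the entry uses only a subset of the fields from the script.

```javascript
// Serialize entries as NDJSON: one JSON object per line, each ending in "\n".
function toNdjson(entries) {
  return entries.map((e) => JSON.stringify(e) + "\n").join("");
}

// Parse an NDJSON body back into objects, ignoring blank lines.
function parseNdjson(body) {
  return body
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Illustrative entry; the values are made up.
const sampleEntry = {
  RayID: "example-ray-id",
  ClientRequestMethod: "GET",
  ClientRequestPath: "/blog/post",
  ClientRequestHost: "yourdomain.com",
  ClientRequestUserAgent: "GPTBot/1.0",
};

const body = toNdjson([sampleEntry]);
console.log(JSON.stringify(body));
console.log(parseNdjson(body)[0].ClientRequestPath);
```

The script sends a single entry per POST, so its body is one line; the same format would allow several entries to be batched into one request.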
A few things worth knowing about this script:
  • It never blocks the response. The POST to Searchable runs inside ctx.waitUntil(...), so the Worker returns your origin’s response without waiting for it. Even if Searchable is unreachable, your users see no slowdown.
  • It fails open. While the bot registry is loading (cold start) the Worker forwards every request and lets Searchable’s server-side classifier do the filtering. Once the registry loads, it filters at the edge to cut traffic.
  • Static assets are not forwarded. Requests for CSS, JS, images, fonts, audio, video, and PDFs skip the bot check and go straight to your origin, since those almost never come from AI crawlers. The Worker still runs for these requests, so they count toward your Workers request quota.
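To see which URLs the static-asset check catches, you can run the script's regular expression against a few sample URLs. The regex is copied from the script above; `isStaticAsset` and the sample URLs are hypothetical.

```javascript
// Same pattern as in the Worker script: a known extension followed by "?" or end of URL.
const STATIC_ASSET_RE = /\.(css|js|mjs|map|woff2?|ttf|otf|eot|svg|png|jpe?g|gif|webp|avif|ico|mp3|mp4|webm|pdf)(\?|$)/i;

function isStaticAsset(url) {
  return STATIC_ASSET_RE.test(url);
}

const samples = [
  "https://yourdomain.com/assets/app.js?v=3", // extension followed by a query string
  "https://yourdomain.com/report.pdf",        // extension at the end of the URL
  "https://yourdomain.com/blog/post",         // no extension
  "https://yourdomain.com/data.json",         // ".js" is followed by "on", so no match
];

for (const url of samples) {
  console.log(url, isStaticAsset(url));
}
```

Because the extension must be followed by a query string or the end of the URL, paths such as /data.json are still treated as pages and go through the bot check.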
For tighter security, store the token as a Worker secret instead of leaving it inline.
  1. In your Worker, open Settings → Variables and Secrets
  2. Add a secret named INTEGRATION_TOKEN with your sa_… token as the value
  3. In the script, blank the inline INTEGRATION_TOKEN constant: const INTEGRATION_TOKEN = "";
  4. Redeploy
The Worker reads the secret if present and falls back to the constant otherwise. Same script, no other edits. This means you can rotate the token in Cloudflare’s UI without ever touching the Worker code.
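The fallback comes down to a single expression in the script, `env.INTEGRATION_TOKEN || INTEGRATION_TOKEN`. A minimal sketch of that precedence, with `resolveToken` as a hypothetical wrapper and made-up token values:

```javascript
// Prefer the Worker secret; fall back to the inline constant when it is unset or empty.
function resolveToken(env, inlineToken) {
  return env.INTEGRATION_TOKEN || inlineToken;
}

console.log(resolveToken({ INTEGRATION_TOKEN: "sa_from_secret" }, "sa_inline"));
console.log(resolveToken({}, "sa_inline"));
console.log(JSON.stringify(resolveToken({}, "")));
```

When both the secret and the constant are empty, the result is an empty string. The script checks `token &&` before forwarding, so in that case it quietly skips the POST to Searchable instead of sending an unauthenticated request.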

Cost

Every request that matches your route invokes the Worker, and each request identified as an AI bot adds one outbound POST to Searchable. The Workers Free plan includes 100,000 requests per day, or about 3 million requests per month, which covers most sites. If you exceed it, the Workers Paid plan is $5/month and includes 10 million requests. The Worker itself does no CPU-intensive work (a few regex checks and a fetch call), so CPU time is negligible.
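The arithmetic behind these figures can be sketched as follows. The limits are the ones quoted above; the 30-day month, the `planFor` helper, and the traffic numbers are assumptions for illustration, so check Cloudflare's current pricing before relying on them.

```javascript
const FREE_REQUESTS_PER_DAY = 100_000;        // Workers Free daily allowance quoted above
const PAID_INCLUDED_PER_MONTH = 10_000_000;   // requests included in the $5/month plan
const DAYS_PER_MONTH = 30;                    // assumption: a 30-day month

// Monthly ceiling of the free tier: 100,000 * 30 = 3,000,000 requests.
const freeMonthlyCeiling = FREE_REQUESTS_PER_DAY * DAYS_PER_MONTH;

// Rough plan estimate from average daily Worker invocations on the route.
function planFor(requestsPerDay) {
  if (requestsPerDay <= FREE_REQUESTS_PER_DAY) return "free";
  if (requestsPerDay * DAYS_PER_MONTH <= PAID_INCLUDED_PER_MONTH) return "paid";
  return "paid-with-overage";
}

console.log(freeMonthlyCeiling);
console.log(planFor(40_000));
console.log(planFor(250_000));
console.log(planFor(500_000));
```

The free allowance is a daily limit, so a single traffic spike above 100,000 requests can exceed it even when the monthly average looks comfortable.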

Troubleshooting

Most likely the route isn’t bound correctly.
  • In Cloudflare, open the Worker → Settings → Domains & Routes
  • Confirm the route pattern matches your live domain (e.g. *yourdomain.com/*, not *yourdomain.dev/*)
  • Confirm the route’s Zone is the zone that’s actually serving traffic for the domain
If the route is right, hit your site with a curl using a known AI user agent:
curl -H "User-Agent: GPTBot/1.0 (+https://openai.com/gptbot)" https://yourdomain.com/
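Then click Check in Searchable. If the request doesn’t appear in the dashboard, open the Worker’s logs (Cloudflare → Worker → Logs) to confirm the Worker is being invoked for it. The reference script swallows errors from the POST silently, so to see whether Searchable returned a non-2xx status you may need to temporarily add a console.log of the response status inside forwardToSearchable.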
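One way to make the Searchable response visible in the Worker's logs is a temporary debug variant of the POST. This is a sketch and not part of the reference script: `postWithLogging` and its injectable `fetchImpl` parameter are hypothetical, and the headers mirror the ones in the script.

```javascript
// Debug variant of the POST: logs the response status so it appears in the Worker's logs.
// `fetchImpl` is injectable so the function can be exercised without a network call.
async function postWithLogging(endpoint, token, entry, fetchImpl = fetch) {
  try {
    const resp = await fetchImpl(endpoint, {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${token}`,
        "Content-Type": "application/x-ndjson",
        "X-Searchable-Source": "cloudflare-worker",
      },
      body: JSON.stringify(entry) + "\n",
    });
    if (!resp.ok) console.log(`Searchable returned ${resp.status}`);
    return resp.status;
  } catch (err) {
    console.log(`Searchable POST failed: ${err.message}`);
    return 0; // 0 signals a network-level failure in this sketch
  }
}

// Local demonstration with a stub that simulates a rejected token.
const stubUnauthorized = async () => ({ ok: false, status: 401 });
postWithLogging("https://example.invalid/logs", "sa_test", { RayID: "demo" }, stubUnauthorized)
  .then((status) => console.log("status:", status));
```

A 401 or 403 in the logs would point at the token, while a network-level failure would point at the endpoint URL. Remove the extra logging once the integration shows as Connected.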
The token is missing or wrong.
  • Confirm INTEGRATION_TOKEN (inline) or the secret named INTEGRATION_TOKEN is set to a sa_… value
  • If you’ve recently revoked the token in Searchable, generate a new one and update the Worker
  • The token must have no quotes around it in the secret value
You need permission to add Worker routes on the zone. If you don’t have it:
  • Ask the zone admin to add the route, or
  • Use the custom REST API path from your application server
If you also have Cloudflare Logpush sending traffic to Searchable, you’ll get the same request from both sources. Searchable de-duplicates server-side using the Cloudflare Ray ID, so the dashboard shows each request once. If you’d rather not run both, pick one and disable the other.
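Conceptually, de-duplicating on the Ray ID means keeping the first entry seen for each ID. Searchable's server-side logic isn't documented here, so the sketch below only illustrates the idea; `dedupeByRayId`, the `source` field, and the sample entries are hypothetical.

```javascript
// Keep the first entry seen for each RayID; later duplicates are dropped.
function dedupeByRayId(entries) {
  const seen = new Map();
  for (const entry of entries) {
    if (!seen.has(entry.RayID)) seen.set(entry.RayID, entry);
  }
  return [...seen.values()];
}

// The same request reported by two sources, plus one distinct request.
const received = [
  { RayID: "ray-1", source: "cloudflare-worker", ClientRequestPath: "/" },
  { RayID: "ray-1", source: "logpush", ClientRequestPath: "/" },
  { RayID: "ray-2", source: "logpush", ClientRequestPath: "/pricing" },
];

console.log(dedupeByRayId(received).map((e) => `${e.RayID} via ${e.source}`));
```

The reference script falls back to a random UUID when the cf-ray header is missing, so a request without that header could not be matched against its Logpush copy under this scheme.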

Removing the integration

  1. Cloudflare → Worker → Settings → Domains & Routes → remove the route (the Worker stops observing traffic)
  2. Optionally delete the Worker itself
  3. Searchable → Agent Analytics → Setup → Tokens → revoke the token

Next steps

Cloudflare Logpush

On the Enterprise plan? See the native Logpush path.

See the data

Open Agent Analytics to see which assistants are crawling your site.