What this is
@searchablehq/middleware is a small Node package that captures inbound requests in your application and fires a non-blocking POST to Searchable’s ingest endpoint. The Searchable backend then runs the same AI-bot classifier the rest of our connectors use, so only GPTBot, ClaudeBot, PerplexityBot, and other AI agents end up in your dashboard.
Zero added latency. The SDK fires events fire-and-forget after your response is built — your users never wait on Searchable. If the network is down or Searchable is unreachable, the request still completes.
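The fire-and-forget behaviour described above can be sketched roughly as follows. This illustrates the general pattern only, not the SDK's internals: `fireAndForget`, the `Poster` type, the injected transport, and the placeholder URL are all assumptions made for the example.

```typescript
// Hypothetical transport type: anything that POSTs a JSON body and returns a promise.
type Poster = (url: string, body: string) => Promise<unknown>;

// Start the POST without awaiting it, and swallow any failure so that a
// network error or an unreachable endpoint cannot affect the response.
function fireAndForget(post: Poster, url: string, event: object): void {
  try {
    void post(url, JSON.stringify(event)).catch(() => {
      // Intentionally ignored: analytics must never break the request.
    });
  } catch {
    // A synchronous throw from the transport is ignored as well.
  }
}

// Usage sketch: the handler returns its response even when the transport fails.
function handleRequest(post: Poster): string {
  const response = "200 OK"; // the response is built first
  // Placeholder URL, not the real ingest endpoint.
  fireAndForget(post, "https://ingest.invalid/events", { path: "/pricing" });
  return response;
}
```

The key design choice is that the promise is never awaited on the request path, so a slow or failing POST costs the caller nothing beyond starting it.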
The package provides a withSearchable wrapper for Next.js middleware. The same core primitives (buildEventPayload, sendEvent) can be reused to wire any other Node framework; examples for Express and Fastify are below.
Prerequisites
A Node 18+ application (Next.js 13+, Express, Fastify, etc.)
A Searchable project with your domain confirmed
The two credentials from the common prerequisites: a project site token (st_…) and a workspace API key (sk_live_…)
Install
next is an optional peer dependency — only loaded when you import from @searchablehq/middleware/nextjs.
Next.js setup
Add the credentials to your environment
In .env.local (or your hosting platform’s env-var settings), set SEARCHABLE_SITE_TOKEN and SEARCHABLE_API_KEY. Both values come from LLM Analytics → Setup → Custom in your Searchable dashboard.
Create or update middleware.ts
In your Next.js project root, call withSearchable in middleware.ts and export a config.matcher alongside it.
The matcher keeps the middleware off Next’s static asset routes. Those almost never come from AI crawlers, and there’s no point burning a middleware invocation on each.
Already have a middleware.ts?
Pass your existing middleware as the second argument to withSearchable. Searchable runs first, then yields to your logic.
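The ordering described above (capture first, then yield to your own middleware) can be pictured with a small self-contained sketch. `withCapture`, `Req`, `Res`, `Middleware`, and `onCapture` are hypothetical stand-ins rather than the package's exports; the real withSearchable signature is only known here to accept the existing middleware as its second argument.

```typescript
// Hypothetical minimal shapes standing in for the framework's request/response objects.
interface Req { path: string; userAgent: string }
interface Res { status: number }
type Middleware = (req: Req) => Res;

interface CaptureConfig {
  // Stand-in for event capture; called once per request before the inner middleware.
  onCapture: (req: Req) => void;
}

// Wrap an optional inner middleware: capture runs first, then the inner logic
// decides the response.
function withCapture(config: CaptureConfig, inner?: Middleware): Middleware {
  return (req) => {
    try {
      config.onCapture(req);
    } catch {
      // A capture failure must not block the request.
    }
    // With no inner middleware, fall through with a default response.
    return inner ? inner(req) : { status: 200 };
  };
}

// Usage sketch: an existing auth middleware keeps working unchanged.
const order: string[] = [];
const authMiddleware: Middleware = (req) => {
  order.push("inner");
  return req.path.startsWith("/admin") ? { status: 401 } : { status: 200 };
};
const middleware = withCapture({ onCapture: () => order.push("capture") }, authMiddleware);
```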
Config reference
Skipping paths
By default, the SDK auto-skips /_next/* and common static-asset extensions (.js, .css, .png, .jpg, .svg, .ico, .woff, .woff2, .ttf, .map).
Add custom skips for health checks, internal APIs, or anything else you don’t want to count.
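As a rough illustration of the skip logic, the sketch below combines the default rules listed above with user-supplied ones. The helper name `shouldSkip`, the prefix-or-RegExp rule format, and the example paths are assumptions; the SDK's actual option name and matching rules may differ.

```typescript
// Default skips as listed in the text: the /_next/ prefix and common static-asset extensions.
const STATIC_EXTENSIONS = [
  ".js", ".css", ".png", ".jpg", ".svg", ".ico", ".woff", ".woff2", ".ttf", ".map",
];

// Hypothetical helper: custom skips are given as path prefixes or regular expressions.
function shouldSkip(pathname: string, customSkips: Array<string | RegExp> = []): boolean {
  if (pathname.startsWith("/_next/")) return true;
  const lower = pathname.toLowerCase();
  if (STATIC_EXTENSIONS.some((ext) => lower.endsWith(ext))) return true;
  return customSkips.some((rule) =>
    typeof rule === "string" ? pathname.startsWith(rule) : rule.test(pathname),
  );
}

// Example custom skips: a health check and an internal API namespace (both hypothetical).
const skips: Array<string | RegExp> = ["/healthz", /^\/api\/internal\//];
```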
Custom properties
Anything you return from custom(request) is attached to the event under parameters and is queryable from the dashboard.
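A minimal sketch of what a custom callback might look like, and how its return value could land under parameters. The `IncomingRequest` shape, the `attachParameters` helper, and the `x-tenant-id` header are hypothetical, chosen only to make the example self-contained.

```typescript
// Hypothetical simplified request shape for the example.
interface IncomingRequest { path: string; headers: Record<string, string | undefined> }
type CustomFn = (request: IncomingRequest) => Record<string, string | number | boolean>;

// Example callback: tag each event with a site section and a tenant identifier.
const custom: CustomFn = (request) => ({
  section: request.path.split("/")[1] || "home",
  tenant: request.headers["x-tenant-id"] ?? "unknown",
});

// Sketch of the merge step: the callback's result is stored under `parameters`.
function attachParameters(event: { path: string }, request: IncomingRequest, fn: CustomFn) {
  return { ...event, parameters: fn(request) };
}
```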
Debug mode
In development, set debug: true to log every captured event to your terminal.
Express, Fastify, other Node frameworks
The Next.js helper is a thin convenience wrapper around two exported primitives: buildEventPayload and sendEvent. In Express or Fastify, call them from your server.ts once the response has been built.
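The general shape of such wiring can be sketched without the package itself. In the example below, `buildEventSketch` and the injected `send` function are local stand-ins for buildEventPayload and sendEvent, whose real signatures are not shown here and may differ. The request and response types are simplified shapes modelled on Node's HTTP objects, whose response emits a `finish` event once it has been sent.

```typescript
// Simplified, hypothetical request/response shapes for the example.
interface ReqLike {
  method: string;
  url: string;
  headers: Record<string, string | string[] | undefined>;
}
interface ResLike {
  statusCode: number;
  once(event: "finish", listener: () => void): unknown;
}
interface EventSketch {
  method: string;
  path: string;
  status: number;
  userAgent: string;
  durationMs: number;
}

// Stand-in for buildEventPayload: gather the fields known once the response is done.
function buildEventSketch(req: ReqLike, status: number, durationMs: number): EventSketch {
  const ua = req.headers["user-agent"];
  return {
    method: req.method,
    path: req.url.split("?")[0] ?? "/",
    status,
    userAgent: typeof ua === "string" ? ua : "",
    durationMs,
  };
}

// Express-style middleware: hand control on immediately, report after the response finishes.
function captureMiddleware(send: (event: EventSketch) => void) {
  return (req: ReqLike, res: ResLike, next: () => void): void => {
    const started = Date.now();
    res.once("finish", () => {
      try {
        send(buildEventSketch(req, res.statusCode, Date.now() - started));
      } catch {
        // Reporting failures are ignored so they never surface to the app.
      }
    });
    next();
  };
}
```

In Fastify, the comparable place to report is an onResponse hook, which also runs after the response has been sent.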
What gets captured
For every non-static request the SDK records:
- HTTP method, path, URL, status code, response time
- User agent (used by Searchable to classify the AI bot)
- Anonymised IP (last octet zeroed by default; toggle via anonymizeIp: false)
- Referrer + parsed referrer domain
- UTM parameters extracted from the URL
- Geo location (country, region, city) when your edge runtime exposes it (e.g. Next.js Edge Runtime)
- Filtered request headers: only an allowlist of safe ones (accept-language, host, sec-ch-ua*, etc.)
- Non-UTM query parameters
- Anything you return from custom(request)
Sensitive headers (authorization, cookie, set-cookie, x-api-key, proxy-authorization, x-forwarded-for, x-real-ip) are stripped at the edge before the worker forwards events to ingest.
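Three of the transformations listed above (IP anonymisation, UTM extraction, header allowlisting) can be sketched as follows. The function names are hypothetical, the allowlist contains only the three entries named in the text, and treating a trailing `*` as a prefix wildcard is an assumption. The IP helper handles IPv4 only; how the SDK treats IPv6 is not covered here.

```typescript
// Zero the last octet of an IPv4 address. Non-IPv4 input is returned unchanged in this sketch.
function anonymizeIpv4(ip: string): string {
  const parts = ip.split(".");
  if (parts.length !== 4) return ip;
  return [...parts.slice(0, 3), "0"].join(".");
}

// Split query parameters into UTM and non-UTM groups.
function splitQuery(url: string): { utm: Record<string, string>; other: Record<string, string> } {
  const utm: Record<string, string> = {};
  const other: Record<string, string> = {};
  new URL(url).searchParams.forEach((value, key) => {
    const target = key.toLowerCase().startsWith("utm_") ? utm : other;
    target[key] = value;
  });
  return { utm, other };
}

// Keep only allowlisted headers; a rule ending in * matches by prefix (assumed semantics).
const ALLOWED_HEADERS = ["accept-language", "host", "sec-ch-ua*"];

function filterHeaders(headers: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(headers)) {
    const lower = name.toLowerCase();
    const allowed = ALLOWED_HEADERS.some((rule) =>
      rule.endsWith("*") ? lower.startsWith(rule.slice(0, -1)) : lower === rule,
    );
    if (allowed) out[lower] = value;
  }
  return out;
}
```

An allowlist fails closed: a header nobody anticipated is dropped rather than forwarded, which is why it is the safer default compared with a blocklist.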
Verifying the connection
In Searchable:
- Go to LLM Analytics → Setup
- The Custom card’s status indicator should flip to Connected once the first event arrives
- Hit your site with curl using a known AI user agent to force one
| Status | What it means |
|---|---|
| Waiting for first event | The middleware is deployed but no AI bot has been seen yet. Curl an AI UA to force one. |
| Connected | Events are arriving. The count from the last 24 hours is shown alongside. |
Set debug: true locally to confirm events are firing without leaving your dev environment.
Troubleshooting
Status stays on 'Waiting for first event'
Most often the request never reaches Searchable because the middleware isn’t running on the routes AI bots hit.
- Check that your config.matcher actually covers the live URLs; Next’s default skips _next/* but you might be excluding more than intended
- Confirm SEARCHABLE_SITE_TOKEN and SEARCHABLE_API_KEY are set in the deployed environment (not just locally)
- Set debug: true, redeploy, and check your logs; the SDK logs every event it sends
- Curl your live domain with User-Agent: GPTBot/1.0 and re-check the status
I see logs but no events in the dashboard
Searchable filters non-bot user agents server-side. If your test request used a normal browser UA, it’s discarded silently. Use a known AI UA such as GPTBot/1.0. Other supported test UAs: ClaudeBot/1.0, PerplexityBot/1.0, Google-Extended/1.0.
The middleware throws 401 / 403 in logs
The API key is missing or wrong:
- Confirm SEARCHABLE_API_KEY starts with sk_live_ and has no leading/trailing whitespace
- If you’ve recently revoked the key in Searchable, generate a new one and update your env
- The key must have the Log Events permission. Re-create it from the Custom connector dialog if unsure — that’s the default permission for keys generated there.
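The first check in the list above is easy to automate at startup. The helper below is a hypothetical pre-flight check, not part of the SDK. It validates only the prefix and whitespace, and cannot tell whether the key has been revoked or lacks the Log Events permission.

```typescript
// Hypothetical pre-flight check: returns a list of problems, empty when the key looks well-formed.
function checkApiKey(raw: string | undefined): string[] {
  if (!raw) return ["SEARCHABLE_API_KEY is not set"];
  const problems: string[] = [];
  if (raw !== raw.trim()) problems.push("SEARCHABLE_API_KEY has leading or trailing whitespace");
  if (!raw.trim().startsWith("sk_live_")) problems.push("SEARCHABLE_API_KEY does not start with sk_live_");
  return problems;
}
```

Calling it with the value of the SEARCHABLE_API_KEY environment variable when the server boots, and logging any returned problems, surfaces a copy-paste mistake before the first 401 appears.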
I want to use a different site token per environment
Set different values for SEARCHABLE_SITE_TOKEN in your staging and production environments. Site tokens are per-project, so staging traffic going to a staging project keeps your production data clean.
The same workspace SEARCHABLE_API_KEY works across all projects in a workspace as long as it isn’t project-scoped. To scope a key to a single project, generate it from inside that project’s Custom connector dialog.
My CDN strips the Authorization header
Removing the integration
- Delete the withSearchable(...) call (or pass through the inner middleware only)
- Remove SEARCHABLE_SITE_TOKEN and SEARCHABLE_API_KEY from your env
- In Searchable → Settings → API Keys → revoke the API key
Next steps
REST API reference
Want the raw HTTP shape, or need to instrument a non-Node stack?
See the data
Open LLM Analytics to see which assistants are crawling your site.