What this does

Akamai DataStream 2 ships near-real-time request logs from your Akamai properties to a destination of your choosing. We point that stream at Searchable’s tracker endpoint, classify the AI bots at the edge, and drop everything else. No code changes — all configuration happens inside Akamai Control Center.

Prerequisites

An Akamai account with DataStream 2 entitled on the contract that owns your delivery property
The Edit role on that property, or an Akamai admin who can author and activate a stream
A Searchable project with your domain confirmed
DataStream 2 is a per-contract product. If the DataStream entry doesn’t appear in Control Center, your Akamai contract doesn’t include it yet — your Akamai account team can add it. The legacy DataStream 1 product is not supported by this integration; you need DataStream 2 (the JSON-format successor).

Setup

1. Generate an integration token in Searchable

  1. Open your Searchable dashboard
  2. Go to LLM Analytics → Setup
  3. Pick Akamai as your crawler source
  4. Click Generate token
Copy the token now: it starts with sa_… and won’t be shown again. You can always generate a new one if you lose it.
The endpoint URL is fixed:
https://searchable-tracker.searchable.workers.dev/v1/akamai-logs
Sanity check before pointing DataStream at the endpoint: curl https://searchable-tracker.searchable.workers.dev/v1/akamai-logs should return 200 ok. That’s the same health check Control Center’s Validate and save step runs against the URL when you finish the wizard.
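The same sanity check can be scripted. This is a minimal sketch, assuming the endpoint answers a plain GET with status 200 and a body of ok as described above; the function names are illustrative and not part of any Searchable tooling.

```python
import urllib.error
import urllib.request

ENDPOINT = "https://searchable-tracker.searchable.workers.dev/v1/akamai-logs"


def looks_healthy(status: int, body: str) -> bool:
    """True when a response matches the documented health check (200, body 'ok')."""
    return status == 200 and body.strip() == "ok"


def check_endpoint(url: str = ENDPOINT, timeout: float = 10.0) -> bool:
    """Send a plain GET, like the curl command, and evaluate the reply."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return looks_healthy(resp.status, resp.read().decode("utf-8", "replace"))
    except urllib.error.HTTPError:
        # urllib raises for 4xx/5xx responses; treat them as unhealthy.
        return False

# Usage (performs a real network request):
#     check_endpoint()
```

A connection error (for example, blocked egress) is left to propagate from check_endpoint so it stays distinguishable from an unexpected HTTP response.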

2. Open the DataStream 2 wizard

In Akamai Control Center, navigate to:
CDN → DataStream → Create new stream
Give the stream a name like Searchable LLM Analytics and assign it to the group that contains the property fronting your domain. Pick the property that serves the traffic you want Searchable to see. DataStream is scoped per-property, not per-account.
If your domain is fronted by more than one property (for example, www and api on separate properties), create one stream per property and point them all at the same endpoint with the same token. Searchable tags events by host, so they still arrive split by domain in the dashboard.

3. Pick the data sets (log format fields)

On the Data sets step, switch the log format to Structured JSON, then select the fields below. Field names are case-sensitive: reqID uses a capital ID, and the sub-country admin region is state (not region).
Required (the endpoint drops records missing any of these):
  • reqID
  • reqMethod
  • reqHost
  • reqPath
  • UA
  • statusCode
  • cliIP
  • reqEndTimeMSec
Recommended (used for enrichment + debugging; missing fields don’t break the integration):
  • queryStr
  • referer (HTTP-standard single-r spelling)
  • rspContentLen
  • turnAroundTimeMSec
  • cacheStatus
  • country
  • state
  • city
  • edgeIP
  • tlsVersion
The exact list, comma-separated for quick copy/paste, is also shown in the Searchable setup card.
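Because the names are case-sensitive, it can help to check a stream's selection against the two lists before saving. The sketch below assumes you have typed or pasted the selected field names from Control Center into a Python list; audit_selection is a hypothetical helper, not an Akamai or Searchable API.

```python
REQUIRED = ["reqID", "reqMethod", "reqHost", "reqPath", "UA",
            "statusCode", "cliIP", "reqEndTimeMSec"]
RECOMMENDED = ["queryStr", "referer", "rspContentLen", "turnAroundTimeMSec",
               "cacheStatus", "country", "state", "city", "edgeIP", "tlsVersion"]


def audit_selection(selected):
    """Compare selected field names against the lists above.

    Matching is deliberately case-sensitive: 'reqId' does not satisfy 'reqID'.
    """
    chosen = set(selected)
    return {
        "missing_required": [f for f in REQUIRED if f not in chosen],
        "missing_recommended": [f for f in RECOMMENDED if f not in chosen],
        "unrecognized": sorted(chosen - set(REQUIRED) - set(RECOMMENDED)),
    }


# Example selection with two typical slips: 'reqId' and 'region'.
report = audit_selection(["reqId", "reqMethod", "reqHost", "reqPath", "UA",
                          "statusCode", "cliIP", "reqEndTimeMSec", "region"])
print(report["missing_required"])  # ['reqID']
print(report["unrecognized"])      # ['region', 'reqId']
```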
reqID is load-bearing. DataStream 2 has at-least-once delivery semantics, so the same record can arrive in multiple batches; Searchable derives a stable event ID from reqID to deduplicate them. If you skip reqID, dedup falls back to a heuristic that’s noticeably less precise — your dashboard will show inflated counts after stream retries.
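To illustrate why reqID matters under at-least-once delivery, here is a minimal sketch of deduplication keyed on a hash of reqID. The use of SHA-256 and the function names are assumptions for illustration; the text above only states that a stable event ID is derived from reqID.

```python
import hashlib


def event_id(req_id: str) -> str:
    """Derive a stable ID from reqID. SHA-256 is an assumption, not the documented scheme."""
    return hashlib.sha256(req_id.encode("utf-8")).hexdigest()


def dedupe(batches):
    """Collapse records that share a reqID across any number of batches.

    The first record seen for each ID is kept; later copies are ignored.
    """
    seen = {}
    for batch in batches:
        for record in batch:
            seen.setdefault(event_id(record["reqID"]), record)
    return list(seen.values())


# A retried batch re-delivers the record with reqID 'b2'.
first = [{"reqID": "a1", "reqPath": "/"}, {"reqID": "b2", "reqPath": "/docs"}]
retry = [{"reqID": "b2", "reqPath": "/docs"}, {"reqID": "c3", "reqPath": "/pricing"}]
print(len(dedupe([first, retry])))  # 3
```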

4. Configure the HTTPS delivery

On the Delivery step, choose HTTPS and fill in:
Field | Value
Endpoint URL | https://searchable-tracker.searchable.workers.dev/v1/akamai-logs
Authentication | None
Custom HTTP header: name | Authorization
Custom HTTP header: value | Bearer <your-sa_-token>
Compress files | Off (see note below)
Set Authentication to None and put the token in a custom header. Akamai’s built-in Authentication choices (Basic, mTLS) are for upstreams that expect those formats. Searchable verifies a signed Bearer token, so the right slot is a plain custom header.
The header value must be Bearer followed by the full sa_… token, with one space and no quotes. The header name must be exactly Authorization.
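The header rules above are easy to get subtly wrong, so a quick local check before pasting can help. This sketch assumes only what the text states (name exactly Authorization, value Bearer plus one space plus an sa_ token, no quotes); the token alphabet is not documented here, so the pattern is deliberately loose, and sa_example is a placeholder rather than a real token.

```python
import re

# Assumption: a token is 'sa_' followed by characters that are not whitespace or quotes.
HEADER_VALUE = re.compile(r"Bearer sa_[^\s'\"]+")


def header_problems(name: str, value: str):
    """Return a list of human-readable problems; an empty list means the pair looks right."""
    problems = []
    if name != "Authorization":
        problems.append("header name must be exactly 'Authorization'")
    if not HEADER_VALUE.fullmatch(value):
        problems.append("value must be 'Bearer', one space, then the full sa_ token, no quotes")
    return problems


print(header_problems("Authorization", "Bearer sa_example"))           # []
print(len(header_problems("X-Authorization", '"Bearer sa_example"')))  # 2
```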
DataStream 2’s HTTPS delivery sends batched, line-delimited JSON. Searchable’s endpoint accepts both uncompressed and gzip-compressed bodies. If you turn on Compress files, leave the rest of the settings unchanged — Akamai sets Content-Encoding: gzip automatically and our endpoint decompresses on the way in.
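For readers unfamiliar with the body format, the sketch below shows what a line-delimited JSON batch looks like, with and without gzip. It is an illustration of the format only: encode_batch and decode_batch are hypothetical names and do not represent Akamai's or Searchable's actual code.

```python
import gzip
import json


def encode_batch(records, compress=False):
    """Serialize records as line-delimited JSON, optionally gzip-compressed."""
    body = "\n".join(json.dumps(r, separators=(",", ":")) for r in records).encode("utf-8")
    headers = {}
    if compress:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    return headers, body


def decode_batch(headers, body):
    """Decompress if flagged, then parse one JSON object per non-empty line."""
    if headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(body)
    return [json.loads(line) for line in body.decode("utf-8").splitlines() if line.strip()]


records = [{"reqID": "a1", "UA": "GPTBot/1.0"}, {"reqID": "b2", "UA": "curl/8.0"}]
for compress in (False, True):
    headers, body = encode_batch(records, compress=compress)
    assert decode_batch(headers, body) == records
```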

5. Set the upload frequency

On the Frequency step, choose 30 seconds (the default) or 60 seconds. Either is fine: both keep dashboard latency to about a minute or less.
Don’t choose the largest “files per group” / longest interval option. The reason is latency, not size: larger Akamai batches are still well under our 5 MB payload cap on any realistic traffic level, but tighter intervals give you faster feedback in the dashboard while debugging.
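A rough back-of-the-envelope estimate relates the upload interval to the payload cap. Everything numeric here is an assumption: the average record size of 1,000 bytes is a placeholder you should replace with a measurement from your own logs, "5 MB" is read as 5 MiB, and any batch splitting or compression Akamai applies is not modeled.

```python
PAYLOAD_CAP_BYTES = 5 * 1024 * 1024  # assumption: "5 MB" interpreted as 5 MiB


def estimated_batch_bytes(requests_per_second, interval_seconds, avg_record_bytes=1000):
    """Uncompressed batch size if every request in the interval yields one record."""
    return requests_per_second * interval_seconds * avg_record_bytes


def fits_under_cap(requests_per_second, interval_seconds, avg_record_bytes=1000):
    """True when the estimate is at or below the assumed cap."""
    size = estimated_batch_bytes(requests_per_second, interval_seconds, avg_record_bytes)
    return size <= PAYLOAD_CAP_BYTES


# 20 requests/second at the two recommended intervals.
print(estimated_batch_bytes(20, 30))  # 600000
print(estimated_batch_bytes(20, 60))  # 1200000
print(fits_under_cap(20, 60))         # True
```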

6. Validate and activate the stream

On the Review step, click Validate and save. Akamai sends a GET to the endpoint URL and expects a 200 response; Searchable’s tracker responds with a literal ok. If validation fails, see Troubleshooting below.
Once validation succeeds, click Activate. Akamai activation typically completes within a couple of minutes on Staging and 5–10 minutes on Production. Once active, expect events in Searchable as soon as an AI bot hits your site.

Verifying the connection

In Searchable:
  1. Go to LLM Analytics → Setup
  2. Look at the Akamai card status
  3. Click Check if it still shows “Waiting for first event”
Status | What it means
Waiting for first event | The stream is configured but no AI bot has hit your site yet. Typical wait is a few hours for sites that are already indexed.
Connected | Events are arriving. The card shows the count from the last 24 hours.
You can also confirm in Akamai: CDN → DataStream → your stream → Monitor. The stream’s metrics show outgoing batches and the HTTP response codes from our endpoint. Healthy delivery looks like a steady stream of 204 responses.

What Searchable receives

For each request that matches an AI-bot user agent, Searchable receives:
  • HTTP method, path, and host (query strings stripped before storage)
  • User agent
  • Referer
  • Country, state, and city (from country / state / city)
  • Response status and bytes out (rspContentLen)
  • Edge turnaround time (turnAroundTimeMSec)
  • Akamai edge metadata (edgeIP, tlsVersion, cacheStatus) — preserved as custom_properties for debugging
  • The DataStream 2 reqID — used as the dedup key on our side
Bodies, headers other than User-Agent / Referer, cookies, and full IPs are never sent or stored. The DataStream 2 reqID is hashed into a deterministic event ID so retried batches collapse into a single event server-side.
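The reduction described above can be pictured as a mapping from a raw DataStream 2 record to a smaller stored event. This is a sketch under stated assumptions: the output key names (method, bytes_out, and so on) are hypothetical, and only the input field names and the custom_properties grouping come from this page.

```python
EDGE_METADATA = ("edgeIP", "tlsVersion", "cacheStatus")


def to_stored_event(record):
    """Keep only the fields listed above; queryStr and cliIP are deliberately not copied."""
    return {
        "method": record.get("reqMethod"),
        "host": record.get("reqHost"),
        # Defensive: drop anything after '?' in case the path carries a query string.
        "path": (record.get("reqPath") or "").split("?", 1)[0],
        "user_agent": record.get("UA"),
        "referer": record.get("referer"),
        "country": record.get("country"),
        "state": record.get("state"),
        "city": record.get("city"),
        "status": record.get("statusCode"),
        "bytes_out": record.get("rspContentLen"),
        "turnaround_ms": record.get("turnAroundTimeMSec"),
        "req_id": record.get("reqID"),
        "custom_properties": {k: record[k] for k in EDGE_METADATA if k in record},
    }


raw = {"reqID": "a1", "reqMethod": "GET", "reqHost": "www.example.com",
       "reqPath": "/docs?page=2", "queryStr": "page=2", "UA": "GPTBot/1.0",
       "statusCode": "200", "cliIP": "203.0.113.7", "tlsVersion": "TLSv1.3"}
event = to_stored_event(raw)
print(event["path"])               # /docs
print(event["custom_properties"])  # {'tlsVersion': 'TLSv1.3'}
```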

Troubleshooting

Akamai’s validator does a GET against your endpoint URL and expects 200. The two common reasons it fails:
  • Typo in the URL — the correct value is exactly https://searchable-tracker.searchable.workers.dev/v1/akamai-logs. No trailing slash, no path suffix.
  • A custom Akamai delivery network policy is blocking egress to *.workers.dev. In strict environments this surfaces as a connection error rather than a 4xx. If you suspect this, ask your Akamai admin to allow-list the endpoint hostname for the delivery property.
The custom header isn’t validated at this step — Akamai’s Validate and save only verifies that the URL is reachable. Auth issues surface in the Monitor tab after the stream activates.
The Authorization header is missing or wrong.
  • Make sure you added a custom HTTP header named exactly Authorization (not X-Authorization, not Bearer)
  • The value must be Bearer followed by the full sa_… token, with one space and no quotes
  • Confirm the Authentication dropdown is set to None — picking Basic there overrides the custom header
  • If you’ve recently revoked the token in Searchable, generate a new one and update the stream’s custom header (no need to recreate the stream)
In Akamai’s Monitor tab, repeated 401 responses are the visible symptom — fix the header and the next batch will succeed.
Map the response code to the cause:
  • 401 Unauthorized — see the Authorization-header section above.
  • 413 Payload Too Large — the batch exceeded Searchable’s 5 MB payload cap. This is rare on default frequency settings (30s / 60s); if you see it, lower the upload interval rather than raising file sizes.
While 4xx errors are occurring, Akamai retries each batch a few times and then drops it. Fix the root cause and the stream catches up automatically with the next batch.
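If you export or copy response codes from the Monitor tab, a small lookup makes the mapping above mechanical. The sketch covers only the codes this guide mentions for delivery (204, 401, 413); the input list is assumed to be something you collected by hand, since no export format is documented here.

```python
from collections import Counter

CAUSES = {
    204: "healthy delivery",
    401: "Authorization header missing or wrong, or token revoked",
    413: "batch exceeded the 5 MB payload cap; lower the upload interval",
}


def explain(status: int) -> str:
    """Map a delivery response code to the cause described in this guide."""
    return CAUSES.get(status, "not covered by this guide")


def summarize(codes):
    """Count responses per explanation, most frequent first."""
    return [(explain(code), n) for code, n in Counter(codes).most_common()]


for cause, count in summarize([204, 204, 401, 204, 401, 413]):
    print(f"{count:>3}  {cause}")
```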
The endpoint silently drops records that are missing required fields rather than rejecting the whole batch, so a misconfigured Data sets step shows up as healthy delivery on Akamai’s side and empty cards on Searchable’s side.
Open the stream’s Data sets step and confirm every required field is selected:
  • reqID, reqMethod, reqHost, reqPath, UA, statusCode, cliIP, reqEndTimeMSec
Field names are case-sensitive: reqID uses a capital ID. If any one of these fields is not selected, every record in every batch lacks it, so every record is dropped on the way in.
If all required fields are present, the next most common cause is a domain mismatch. See “Status stays on ‘Waiting for first event’” below.
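The "healthy delivery, empty cards" symptom follows directly from per-record filtering. Here is a minimal sketch of that behavior, assuming a field counts as missing when its key is absent or null; the real endpoint's exact rule is not documented on this page.

```python
REQUIRED = ("reqID", "reqMethod", "reqHost", "reqPath", "UA",
            "statusCode", "cliIP", "reqEndTimeMSec")


def keep_complete(records):
    """Return only records that carry every required field (absent or None counts as missing)."""
    return [r for r in records if all(r.get(f) is not None for f in REQUIRED)]


complete = {"reqID": "a1", "reqMethod": "GET", "reqHost": "www.example.com",
            "reqPath": "/", "UA": "GPTBot/1.0", "statusCode": "200",
            "cliIP": "203.0.113.7", "reqEndTimeMSec": "1700000000000"}

# A stream that never selected UA produces records that all lack it.
without_ua = [{k: v for k, v in complete.items() if k != "UA"} for _ in range(3)]

print(len(keep_complete([complete])))  # 1
print(len(keep_complete(without_ua)))  # 0, although the batch itself was delivered
```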
reqID is populated with a non-unique value, or reqID was omitted from the field selection altogether.
  • Open the stream’s Data sets step and confirm reqID (capital ID) is selected — Akamai’s UI lists it alphabetically near the bottom of the request-section fields
  • reqID is Akamai’s per-request unique identifier and is auto-populated; no property-level rules or PMUSER variables are required
  • If reqID is selected and you still see this symptom, contact support and include a few sample records — most often this turns out to be a property-level override blanking the value
Without reqID, Searchable falls back to a (timestamp, path, user-agent) heuristic that’s much coarser. The fix is non-destructive — add reqID and republish.
DataStream 2’s documented epoch field is reqEndTimeMSec, in milliseconds since the Unix epoch. Some legacy stream configurations use reqTimeSec (seconds) by mistake.
Searchable handles this defensively: if the value is too small to be a plausible millisecond timestamp (below roughly the year 2001 expressed in milliseconds), we treat it as seconds and scale it up to milliseconds before storage. So you shouldn’t see this in practice, but if you’ve copied a stream config from a much older DataStream 1 export, double-check that the field is reqEndTimeMSec and not reqTimeSec.
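The seconds-versus-milliseconds guard can be sketched in a few lines. The cutoff of 10**12 is an assumption chosen to match "roughly the year 2001 in milliseconds" (10**12 ms is 2001-09-09T01:46:40Z); the exact threshold Searchable uses is not stated here.

```python
SECONDS_CUTOFF = 10**12  # assumed cutoff; as milliseconds this is 2001-09-09T01:46:40Z


def to_epoch_ms(value) -> int:
    """Normalize an epoch timestamp to milliseconds.

    Values below the cutoff are assumed to be seconds and are multiplied by 1000.
    Accepts ints, floats, or numeric strings.
    """
    number = int(float(value))
    return number * 1000 if number < SECONDS_CUTOFF else number


print(to_epoch_ms(1700000000))       # 1700000000000 (seconds scaled up)
print(to_epoch_ms("1700000000000"))  # 1700000000000 (already milliseconds)
```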
Status stays on “Waiting for first event”
A few possible causes, in order of likelihood:
  • The stream isn’t activated — open the stream in Control Center and confirm the status badge reads Active, not Inactive or Pending activation. Activations on Production take 5–10 minutes
  • Your domain in Searchable doesn’t match the host the Akamai property serves (check LLM Analytics → Setup → Confirm your domain)
  • The Akamai property fronts a different host than the one you’re testing with — DataStream is scoped per-property, so traffic going through a different property won’t appear
  • No AI bot has visited yet — try visiting your site with a known AI user agent (e.g. Mozilla/5.0 (compatible; GPTBot/1.0)) to trigger a test event
If Akamai’s Monitor tab shows successful 204 responses but Searchable still says no events, the issue is almost always a domain mismatch on the Searchable side.
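The test visit suggested in the list above can be scripted. This sketch assumes that classification keys on the User-Agent header, as the example user agent implies; whether Searchable applies further checks is not stated here. The URL https://www.example.com/ is a placeholder for a page served through your Akamai property.

```python
import urllib.request

AI_USER_AGENT = "Mozilla/5.0 (compatible; GPTBot/1.0)"


def build_test_request(url: str, user_agent: str = AI_USER_AGENT) -> urllib.request.Request:
    """Build a GET request that presents a known AI-bot user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent}, method="GET")


def send_test_request(url: str, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status (urllib raises HTTPError on 4xx/5xx)."""
    with urllib.request.urlopen(build_test_request(url), timeout=timeout) as resp:
        return resp.status

# Usage (performs a real network request against your own site):
#     send_test_request("https://www.example.com/")
```

After sending a request, allow at least one upload interval before clicking Check on the Akamai card.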
Akamai’s standard staging-network workflow works fine: activate the stream on Staging first, then point a few test requests at your staging hostname (Akamai exposes the staging network at <your-host>.edgesuite-staging.net or via a Pragma: akamai-x-cache-on header trick from your origin). Successful staging delivery is a strong signal the production activation will work as well.
You’ll see events arrive in the same Searchable project; there’s no separate staging endpoint to worry about.

Removing the integration

To stop sending traffic to Searchable:
  1. Akamai → CDN → DataStream → your stream → Deactivate (and optionally delete the stream once it’s deactivated)
  2. Searchable → LLM Analytics → Setup → Tokens → revoke the token
Both sides are independent — revoking the token alone is enough to stop ingestion immediately, even if the stream stays configured in Akamai (its deliveries will start returning 401, which Akamai will retry and then drop).

Next steps

See the data

Open LLM Analytics to see which assistants are crawling your site.

Add Search Console

Correlate AI crawls with search demand.