Publisher Reporting

Pull bot and non-human traffic data for campaigns registered with the use case publisher_bot_detection or publisher_agent_targeting. Reporting is powered by AgentGraph, Classify's intelligence layer for identifying and categorizing non-human visitors.

Reporting is asynchronous. Submit a report request, receive an ID, and poll for results.


The publisher report object

{
  "id": 501,
  "status": "complete",
  "campaign_id": 610,
  "start_date": "2026-01-13",
  "end_date": "2026-01-19",
  "created_date": "2026-01-20T08:00:00Z",
  "processed_date": "2026-01-20T08:02:30Z",
  "summary": {
    "total_loads": 4382129,
    "human_loads": 3155133,
    "non_human_loads": 1226996,
    "non_human_pct": 0.28,
    "llm_agent_pct": 0.06,
    "search_bot_pct": 0.14,
    "scraper_pct": 0.08,
    "high_risk_bot_count": 17
  },
  "agents": [...],
  "hot_spots": [...],
  "data": [...]
}

| Field | Type | Description |
| --- | --- | --- |
| id | integer | Unique report ID |
| status | string | pending, processing, complete, or failed |
| campaign_id | integer | The pixel campaign (property) this report is for |
| start_date | string (ISO 8601) | Report start date (inclusive) |
| end_date | string (ISO 8601) | Report end date (inclusive) |
| created_date | string (ISO 8601) | When the report was requested |
| processed_date | string (ISO 8601) or null | When results were ready. null until complete. |
| summary | object or null | Aggregate traffic breakdown. Present when status is complete. |
| agents | array[object] or null | Per-agent traffic data. Present when status is complete. |
| hot_spots | array[object] or null | URLs with highest bot activity. Present when status is complete. |
| data | array[object] or null | Time-series breakdown. Present when status is complete and dimensions includes date. |

Summary object

| Field | Type | Description |
| --- | --- | --- |
| total_loads | integer | Total pixel loads in the period |
| human_loads | integer | Loads identified as human traffic |
| non_human_loads | integer | Loads identified as non-human |
| non_human_pct | number | Non-human traffic as a fraction of total loads (e.g. 0.28 = 28%) |
| llm_agent_pct | number | LLM agent traffic as a fraction of total loads |
| search_bot_pct | number | Search bot traffic as a fraction of total loads |
| scraper_pct | number | Scraper traffic as a fraction of total loads |
| high_risk_bot_count | integer | Number of distinct high-risk bots detected |
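
Because the category fields are fractions of total_loads, approximate load counts can be recovered by multiplication. The sketch below is illustrative only: the helper name estimate_category_loads is hypothetical, and since the fractions in the example appear to be rounded to two decimals, the results should be read as estimates rather than exact counts.

```python
def estimate_category_loads(summary: dict) -> dict:
    """Estimate per-category load counts from the summary fractions.

    Assumes every *_pct field is a fraction of total_loads, as the
    table above describes. Results are approximate because the
    fractions are rounded.
    """
    total = summary["total_loads"]
    return {
        name: round(total * summary[f"{name}_pct"])
        for name in ("non_human", "llm_agent", "search_bot", "scraper")
    }


# Values copied from the example report object.
summary = {
    "total_loads": 4382129,
    "human_loads": 3155133,
    "non_human_loads": 1226996,
    "non_human_pct": 0.28,
    "llm_agent_pct": 0.06,
    "search_bot_pct": 0.14,
    "scraper_pct": 0.08,
    "high_risk_bot_count": 17,
}
estimates = estimate_category_loads(summary)
```

In the example data, human_loads and non_human_loads add up to total_loads, which makes a cheap sanity check before trusting the derived numbers.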

Create a publisher report

POST https://api.clsfy.me/v1/pixel/reports/publisher

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| campaign_id | integer | Required | The pixel campaign (property) ID |
| start_date | string (ISO 8601) | Required | Start of the reporting period |
| end_date | string (ISO 8601) | Required | End of the reporting period (inclusive) |
| dimensions | array[string] | Optional | Breakdown dimensions. Values: date, agent, url, agent_type. Omit for summary + agents + hot spots only. |

Request

curl -X POST "https://api.clsfy.me/v1/pixel/reports/publisher" \
  -H "X-API-Key: <your_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "campaign_id": 610,
    "start_date": "2026-01-13",
    "end_date": "2026-01-19",
    "dimensions": ["date", "agent_type"]
  }'
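
The same request can be prepared from Python. The following is a minimal sketch using only the standard library; build_create_request is a hypothetical helper, and its client-side checks (known dimension names, start date not after end date) are a convenience added here, not documented API behaviour.

```python
import json
import urllib.request
from datetime import date

API_URL = "https://api.clsfy.me/v1/pixel/reports/publisher"
VALID_DIMENSIONS = {"date", "agent", "url", "agent_type"}


def build_create_request(api_key, campaign_id, start_date, end_date, dimensions=None):
    """Build (but do not send) the POST request for a new publisher report."""
    # fromisoformat raises ValueError on malformed dates.
    if date.fromisoformat(start_date) > date.fromisoformat(end_date):
        raise ValueError("start_date must not be after end_date")

    body = {
        "campaign_id": campaign_id,
        "start_date": start_date,
        "end_date": end_date,
    }
    if dimensions:
        unknown = set(dimensions) - VALID_DIMENSIONS
        if unknown:
            raise ValueError(f"unknown dimensions: {sorted(unknown)}")
        body["dimensions"] = list(dimensions)

    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_create_request(
    "<your_api_key>", 610, "2026-01-13", "2026-01-19", ["date", "agent_type"]
)
```

Passing req to urllib.request.urlopen would submit it. The report ID to poll is expected in the JSON response, although the exact shape of the creation response is not shown in this section.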

Get a publisher report

GET https://api.clsfy.me/v1/pixel/reports/publisher/{id}

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| id | integer | The report ID |

Request

curl "https://api.clsfy.me/v1/pixel/reports/publisher/501" \
  -H "X-API-Key: <your_api_key>"

Completed response

{
  "id": 501,
  "status": "complete",
  "campaign_id": 610,
  "start_date": "2026-01-13",
  "end_date": "2026-01-19",
  "created_date": "2026-01-20T08:00:00Z",
  "processed_date": "2026-01-20T08:02:30Z",
  "summary": {
    "total_loads": 4382129,
    "human_loads": 3155133,
    "non_human_loads": 1226996,
    "non_human_pct": 0.28,
    "llm_agent_pct": 0.06,
    "search_bot_pct": 0.14,
    "scraper_pct": 0.08,
    "high_risk_bot_count": 17
  },
  "agents": [
    {
      "agent": "Googlebot",
      "type": "search",
      "loads": 402112,
      "pct_of_non_human": 0.33
    },
    {
      "agent": "GPTBot",
      "type": "llm_training",
      "loads": 88223,
      "pct_of_non_human": 0.07
    },
    {
      "agent": "ClaudeBot",
      "type": "llm_retrieval",
      "loads": 52998,
      "pct_of_non_human": 0.04
    },
    {
      "agent": "PerplexityBot",
      "type": "llm_retrieval",
      "loads": 41221,
      "pct_of_non_human": 0.03
    },
    {
      "agent": "Bytespider",
      "type": "scraper",
      "loads": 35990,
      "pct_of_non_human": 0.03
    }
  ],
  "hot_spots": [
    {
      "url": "/best-credit-cards",
      "bot_pct": 0.64,
      "loads": 112093,
      "dominant_agent": "GPTBot"
    },
    {
      "url": "/ai-tools-guide",
      "bot_pct": 0.72,
      "loads": 94381,
      "dominant_agent": "ClaudeBot"
    },
    {
      "url": "/mortgage-rates",
      "bot_pct": 0.51,
      "loads": 83002,
      "dominant_agent": "Googlebot"
    }
  ],
  "data": [
    {
      "date": "2026-01-13",
      "agent_type": "human",
      "loads": 448219
    },
    {
      "date": "2026-01-13",
      "agent_type": "search",
      "loads": 62540
    },
    {
      "date": "2026-01-13",
      "agent_type": "llm_training",
      "loads": 12890
    },
    {
      "date": "2026-01-13",
      "agent_type": "llm_retrieval",
      "loads": 13420
    },
    {
      "date": "2026-01-13",
      "agent_type": "scraper",
      "loads": 9102
    }
  ]
}

Agent types

AgentGraph categorizes every non-human visitor into one of four types:

| Type value | Description | Examples |
| --- | --- | --- |
| search | Search engine crawlers indexing content | Googlebot, Bingbot, YandexBot |
| llm_training | AI model training crawlers | GPTBot, CCBot, Google-Extended |
| llm_retrieval | AI agents fetching content for real-time answers | ClaudeBot, PerplexityBot, ChatGPT-User |
| scraper | General-purpose web scrapers and crawlers | Bytespider, AhrefsBot, custom scrapers |

Agents array

Each entry in the agents array identifies a specific non-human visitor:

| Field | Type | Description |
| --- | --- | --- |
| agent | string | Agent name as identified by AgentGraph |
| type | string | Agent type: search, llm_training, llm_retrieval, or scraper |
| loads | integer | Number of pixel loads from this agent |
| pct_of_non_human | number | This agent's share of total non-human traffic (e.g. 0.33 = 33%) |
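
Since each entry carries both a type and a load count, the agents array can be rolled up by agent type on the client. A small sketch, assuming the entry shape shown in the table above; loads_by_type is a hypothetical helper name.

```python
from collections import defaultdict


def loads_by_type(agents: list) -> dict:
    """Sum pixel loads per agent type across the agents array."""
    totals = defaultdict(int)
    for entry in agents:
        totals[entry["type"]] += entry["loads"]
    return dict(totals)


# Entries copied from the completed response example.
agents = [
    {"agent": "Googlebot", "type": "search", "loads": 402112, "pct_of_non_human": 0.33},
    {"agent": "GPTBot", "type": "llm_training", "loads": 88223, "pct_of_non_human": 0.07},
    {"agent": "ClaudeBot", "type": "llm_retrieval", "loads": 52998, "pct_of_non_human": 0.04},
    {"agent": "PerplexityBot", "type": "llm_retrieval", "loads": 41221, "pct_of_non_human": 0.03},
    {"agent": "Bytespider", "type": "scraper", "loads": 35990, "pct_of_non_human": 0.03},
]
totals = loads_by_type(agents)
```

Note that the pct_of_non_human values in the example sum to 0.50, so the listed agents may not account for all non-human traffic and these totals may undercount each type.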

Hot spots array

The hot_spots array shows the URLs with the highest non-human traffic concentration:

| Field | Type | Description |
| --- | --- | --- |
| url | string | URL path on your property |
| bot_pct | number | Fraction of traffic to this URL that is non-human (e.g. 0.64 = 64%) |
| loads | integer | Total pixel loads for this URL |
| dominant_agent | string | The agent responsible for the most non-human traffic to this URL |
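
A high bot_pct on a low-traffic URL may matter less than a moderate one on a busy page, so it can help to combine the two fields. The sketch below estimates non-human loads per URL as loads × bot_pct and ranks the results; rank_hot_spots and the 0.6 threshold are illustrative choices, not part of the API.

```python
def rank_hot_spots(hot_spots: list, min_bot_pct: float = 0.6) -> list:
    """Return (url, estimated_bot_loads) pairs for URLs at or above the
    threshold, sorted by estimated non-human loads, highest first."""
    ranked = [
        (spot["url"], round(spot["loads"] * spot["bot_pct"]))
        for spot in hot_spots
        if spot["bot_pct"] >= min_bot_pct
    ]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Entries copied from the completed response example.
hot_spots = [
    {"url": "/best-credit-cards", "bot_pct": 0.64, "loads": 112093, "dominant_agent": "GPTBot"},
    {"url": "/ai-tools-guide", "bot_pct": 0.72, "loads": 94381, "dominant_agent": "ClaudeBot"},
    {"url": "/mortgage-rates", "bot_pct": 0.51, "loads": 83002, "dominant_agent": "Googlebot"},
]
ranked = rank_hot_spots(hot_spots)
```

With the example data, /best-credit-cards ranks above /ai-tools-guide despite its lower bot_pct, because it receives more total loads.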

Polling for results

import time

import requests


def wait_for_publisher_report(report_id: int, api_key: str, poll_interval: int = 30) -> dict:
    """Poll until a publisher report is complete; raise if it fails."""
    url = f"https://api.clsfy.me/v1/pixel/reports/publisher/{report_id}"
    headers = {"X-API-Key": api_key}

    while True:
        response = requests.get(url, headers=headers, timeout=30)
        # Surface HTTP errors (401, 404, 429, ...) here; an error body has no "status" key.
        response.raise_for_status()
        report = response.json()

        if report["status"] == "complete":
            s = report["summary"]
            print(f"Done: {s['total_loads']} loads, {s['non_human_pct']:.0%} non-human")
            return report
        if report["status"] == "failed":
            raise RuntimeError(f"Report {report_id} failed.")

        # Still pending or processing: wait before asking again.
        print(f"Status: {report['status']}, retrying in {poll_interval}s")
        time.sleep(poll_interval)

Dimensions reference

| Dimension | Value | Granularity |
| --- | --- | --- |
| Date | date | Daily |
| Agent | agent | Individual agent name |
| Agent type | agent_type | human, search, llm_training, llm_retrieval, scraper |
| URL | url | Individual page URL path |
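
When a report is requested with dimensions date and agent_type, the data array holds one row per date and agent type, as in the completed response example. A sketch of pivoting those rows into a per-day lookup follows; the helper names are hypothetical, and the row shape for other dimension combinations is not shown in this section, so this handles only the date plus agent_type case.

```python
from collections import defaultdict


def pivot_by_date(rows: list) -> dict:
    """Turn flat rows into {date: {agent_type: loads}}."""
    table = defaultdict(dict)
    for row in rows:
        table[row["date"]][row["agent_type"]] = row["loads"]
    return dict(table)


def non_human_share(day: dict) -> float:
    """Fraction of a day's loads whose agent_type is not 'human'."""
    total = sum(day.values())
    return (total - day.get("human", 0)) / total if total else 0.0


# Rows copied from the completed response example.
rows = [
    {"date": "2026-01-13", "agent_type": "human", "loads": 448219},
    {"date": "2026-01-13", "agent_type": "search", "loads": 62540},
    {"date": "2026-01-13", "agent_type": "llm_training", "loads": 12890},
    {"date": "2026-01-13", "agent_type": "llm_retrieval", "loads": 13420},
    {"date": "2026-01-13", "agent_type": "scraper", "loads": 9102},
]
by_date = pivot_by_date(rows)
share = non_human_share(by_date["2026-01-13"])
```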

Error responses

When a request fails, the API returns a JSON object with an error code and a human-readable message:

{
  "error": "not_found",
  "message": "Publisher report with ID 999 not found"
}

HTTP status codes

| Status | Meaning |
| --- | --- |
| 200 OK | Success |
| 201 Created | Resource created (POST endpoints) |
| 400 Bad Request | Invalid or missing parameters |
| 401 Unauthorized | Missing or invalid API key |
| 404 Not Found | Resource not found |
| 422 Unprocessable Content | Validation error (e.g. invalid field values) |
| 429 Too Many Requests | Rate limit exceeded |
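
Client code can turn the error body and status code into a single exception. The sketch below is one way to do that, not an official client: ApiError and check_response are hypothetical names, and treating 429 as the only retryable status is an assumption based on the table above.

```python
import json

RETRYABLE_STATUSES = {429}  # assumption: only rate limiting is worth retrying


class ApiError(Exception):
    """Carries the HTTP status plus the API's error code and message."""

    def __init__(self, status: int, error: str, message: str):
        super().__init__(f"{status} {error}: {message}")
        self.status = status
        self.error = error
        self.message = message

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def check_response(status: int, body: str):
    """Return the parsed JSON body on 2xx; raise ApiError otherwise."""
    try:
        payload = json.loads(body) if body else {}
    except ValueError:
        payload = {}  # non-JSON body, e.g. from a proxy
    if 200 <= status < 300:
        return payload
    if not isinstance(payload, dict):
        payload = {}
    raise ApiError(status, payload.get("error", "unknown"), payload.get("message", ""))


# The error body from the example above, paired with a 404 status.
try:
    check_response(404, '{"error": "not_found", "message": "Publisher report with ID 999 not found"}')
except ApiError as exc:
    caught = exc
```

A caller could wait and resubmit when retryable is true, and fail immediately on 400, 401, 404, or 422, where repeating the same request is unlikely to succeed.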