Publisher Reporting
Pull bot and non-human traffic data for campaigns registered with the publisher_bot_detection or publisher_agent_targeting use case. Powered by AgentGraph, Classify's intelligence layer for identifying and categorizing non-human visitors.
Reporting is asynchronous. Submit a report request, receive an ID, and poll for results.
The publisher report object
{
  "id": 501,
  "status": "complete",
  "campaign_id": 610,
  "start_date": "2026-01-13",
  "end_date": "2026-01-19",
  "created_date": "2026-01-20T08:00:00Z",
  "processed_date": "2026-01-20T08:02:30Z",
  "summary": {
    "total_loads": 4382129,
    "human_loads": 3155133,
    "non_human_loads": 1226996,
    "non_human_pct": 0.28,
    "llm_agent_pct": 0.06,
    "search_bot_pct": 0.14,
    "scraper_pct": 0.08,
    "high_risk_bot_count": 17
  },
  "agents": [...],
  "hot_spots": [...],
  "data": [...]
}
| Field | Type | Description |
|---|---|---|
| id | integer | Unique report ID |
| status | string | pending → processing → complete or failed |
| campaign_id | integer | The pixel campaign (property) this report is for |
| start_date | string (ISO 8601) | Report start date (inclusive) |
| end_date | string (ISO 8601) | Report end date (inclusive) |
| created_date | string (ISO 8601) | When the report was requested |
| processed_date | string (ISO 8601) or null | When results were ready. null until complete. |
| summary | object or null | Aggregate traffic breakdown. Present when status is complete. |
| agents | array[object] or null | Per-agent traffic data. Present when status is complete. |
| hot_spots | array[object] or null | URLs with the highest bot activity. Present when status is complete. |
| data | array[object] or null | Time-series breakdown. Present when status is complete and dimensions includes date. |
Summary object
| Field | Type | Description |
|---|---|---|
| total_loads | integer | Total pixel loads in the period |
| human_loads | integer | Loads identified as human traffic |
| non_human_loads | integer | Loads identified as non-human |
| non_human_pct | number | Non-human traffic as a fraction of total loads (e.g. 0.28 = 28%) |
| llm_agent_pct | number | LLM agent traffic as a fraction of total loads |
| search_bot_pct | number | Search bot traffic as a fraction of total loads |
| scraper_pct | number | Scraper traffic as a fraction of total loads |
| high_risk_bot_count | integer | Number of distinct high-risk bots detected |
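The count fields are internally consistent: human_loads plus non_human_loads equals total_loads, and the _pct fields are fractions of total_loads. A quick sanity check over the sample values above, assuming the API rounds _pct fields to two decimals (the sample is consistent with this, but the rounding rule is not documented):

```python
summary = {
    "total_loads": 4382129,
    "human_loads": 3155133,
    "non_human_loads": 1226996,
    "non_human_pct": 0.28,
}

# Human and non-human loads partition the total.
assert summary["human_loads"] + summary["non_human_loads"] == summary["total_loads"]

# non_human_pct is non_human_loads / total_loads (rounded to two decimals).
ratio = summary["non_human_loads"] / summary["total_loads"]
assert round(ratio, 2) == summary["non_human_pct"]
```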
Create a publisher report
POST https://api.clsfy.me/v1/pixel/reports/publisher
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| campaign_id | integer | Required | The pixel campaign (property) ID |
| start_date | string (ISO 8601) | Required | Start of the reporting period |
| end_date | string (ISO 8601) | Required | End of the reporting period (inclusive) |
| dimensions | array[string] | Optional | Breakdown dimensions. Values: date, agent, url, agent_type. Omit to receive only the summary, agents, and hot spots. |
Request
- curl
- Python
curl -X POST "https://api.clsfy.me/v1/pixel/reports/publisher" \
  -H "X-API-Key: <your_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "campaign_id": 610,
    "start_date": "2026-01-13",
    "end_date": "2026-01-19",
    "dimensions": ["date", "agent_type"]
  }'
import requests

response = requests.post(
    "https://api.clsfy.me/v1/pixel/reports/publisher",
    headers={
        "X-API-Key": "<your_api_key>",
        "Content-Type": "application/json",
    },
    json={
        "campaign_id": 610,
        "start_date": "2026-01-13",
        "end_date": "2026-01-19",
        "dimensions": ["date", "agent_type"],
    },
)
report = response.json()
print(report["id"])  # e.g. 501
Get a publisher report
GET https://api.clsfy.me/v1/pixel/reports/publisher/{id}
| Parameter | Type | Description |
|---|---|---|
| id | integer | The report ID |
- curl
- Python
curl "https://api.clsfy.me/v1/pixel/reports/publisher/501" \
  -H "X-API-Key: <your_api_key>"
import requests

response = requests.get(
    "https://api.clsfy.me/v1/pixel/reports/publisher/501",
    headers={"X-API-Key": "<your_api_key>"},
)
report = response.json()
if report["status"] == "complete":
    s = report["summary"]
    print(f"Total loads: {s['total_loads']}")
    print(f"Non-human: {s['non_human_pct']:.0%}")
    print(f"LLM agents: {s['llm_agent_pct']:.0%}")
Completed response
{
  "id": 501,
  "status": "complete",
  "campaign_id": 610,
  "start_date": "2026-01-13",
  "end_date": "2026-01-19",
  "created_date": "2026-01-20T08:00:00Z",
  "processed_date": "2026-01-20T08:02:30Z",
  "summary": {
    "total_loads": 4382129,
    "human_loads": 3155133,
    "non_human_loads": 1226996,
    "non_human_pct": 0.28,
    "llm_agent_pct": 0.06,
    "search_bot_pct": 0.14,
    "scraper_pct": 0.08,
    "high_risk_bot_count": 17
  },
  "agents": [
    {
      "agent": "Googlebot",
      "type": "search",
      "loads": 402112,
      "pct_of_non_human": 0.33
    },
    {
      "agent": "GPTBot",
      "type": "llm_training",
      "loads": 88223,
      "pct_of_non_human": 0.07
    },
    {
      "agent": "ClaudeBot",
      "type": "llm_retrieval",
      "loads": 52998,
      "pct_of_non_human": 0.04
    },
    {
      "agent": "PerplexityBot",
      "type": "llm_retrieval",
      "loads": 41221,
      "pct_of_non_human": 0.03
    },
    {
      "agent": "Bytespider",
      "type": "scraper",
      "loads": 35990,
      "pct_of_non_human": 0.03
    }
  ],
  "hot_spots": [
    {
      "url": "/best-credit-cards",
      "bot_pct": 0.64,
      "loads": 112093,
      "dominant_agent": "GPTBot"
    },
    {
      "url": "/ai-tools-guide",
      "bot_pct": 0.72,
      "loads": 94381,
      "dominant_agent": "ClaudeBot"
    },
    {
      "url": "/mortgage-rates",
      "bot_pct": 0.51,
      "loads": 83002,
      "dominant_agent": "Googlebot"
    }
  ],
  "data": [
    {
      "date": "2026-01-13",
      "agent_type": "human",
      "loads": 448219
    },
    {
      "date": "2026-01-13",
      "agent_type": "search",
      "loads": 62540
    },
    {
      "date": "2026-01-13",
      "agent_type": "llm_training",
      "loads": 12890
    },
    {
      "date": "2026-01-13",
      "agent_type": "llm_retrieval",
      "loads": 13420
    },
    {
      "date": "2026-01-13",
      "agent_type": "scraper",
      "loads": 9102
    }
  ]
}
Agent types
AgentGraph categorizes every non-human visitor into one of four types:
| Type value | Description | Examples |
|---|---|---|
| search | Search engine crawlers indexing content | Googlebot, Bingbot, YandexBot |
| llm_training | AI model training crawlers | GPTBot, CCBot, Google-Extended |
| llm_retrieval | AI agents fetching content for real-time answers | ClaudeBot, PerplexityBot, ChatGPT-User |
| scraper | General-purpose web scrapers and crawlers | Bytespider, AhrefsBot, custom scrapers |
Agents array
Each entry in the agents array identifies a specific non-human visitor:
| Field | Type | Description |
|---|---|---|
| agent | string | Agent name as identified by AgentGraph |
| type | string | Agent type: search, llm_training, llm_retrieval, or scraper |
| loads | integer | Number of pixel loads from this agent |
| pct_of_non_human | number | This agent's share of total non-human traffic (e.g. 0.33 = 33%) |
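The per-agent rows can be rolled up by type to reconcile against the summary percentages. A sketch using the agents from the sample response above (note that the array lists only the top agents, so the per-type sums will not cover all non-human traffic):

```python
from collections import defaultdict

# Top agents from the sample completed response.
agents = [
    {"agent": "Googlebot", "type": "search", "loads": 402112},
    {"agent": "GPTBot", "type": "llm_training", "loads": 88223},
    {"agent": "ClaudeBot", "type": "llm_retrieval", "loads": 52998},
    {"agent": "PerplexityBot", "type": "llm_retrieval", "loads": 41221},
    {"agent": "Bytespider", "type": "scraper", "loads": 35990},
]

# Sum pixel loads per agent type.
loads_by_type: dict[str, int] = defaultdict(int)
for entry in agents:
    loads_by_type[entry["type"]] += entry["loads"]

print(dict(loads_by_type))
# llm_retrieval combines ClaudeBot and PerplexityBot: 52998 + 41221 = 94219
```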
Hot spots array
The hot_spots array shows the URLs with the highest non-human traffic concentration:
| Field | Type | Description |
|---|---|---|
| url | string | URL path on your property |
| bot_pct | number | Fraction of traffic to this URL that is non-human (e.g. 0.64 = 64%) |
| loads | integer | Total pixel loads for this URL |
| dominant_agent | string | The agent responsible for the most non-human traffic to this URL |
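A common use of hot spots is flagging pages where bots make up the majority of traffic, for example to decide which paths to rate-limit or gate. A minimal sketch over the sample hot_spots above (the 0.6 threshold is an arbitrary choice for illustration, not an API value):

```python
# Hot spots from the sample completed response.
hot_spots = [
    {"url": "/best-credit-cards", "bot_pct": 0.64, "loads": 112093, "dominant_agent": "GPTBot"},
    {"url": "/ai-tools-guide", "bot_pct": 0.72, "loads": 94381, "dominant_agent": "ClaudeBot"},
    {"url": "/mortgage-rates", "bot_pct": 0.51, "loads": 83002, "dominant_agent": "Googlebot"},
]

# URLs where non-human traffic exceeds an (arbitrary) 60% threshold.
mostly_bots = [h["url"] for h in hot_spots if h["bot_pct"] > 0.6]
print(mostly_bots)  # ['/best-credit-cards', '/ai-tools-guide']
```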
Polling for results
- Python
import requests
import time

def wait_for_publisher_report(report_id: int, api_key: str, poll_interval: int = 30):
    """Poll until a publisher report is ready."""
    url = f"https://api.clsfy.me/v1/pixel/reports/publisher/{report_id}"
    headers = {"X-API-Key": api_key}
    while True:
        report = requests.get(url, headers=headers).json()
        if report["status"] == "complete":
            s = report["summary"]
            print(f"Done — {s['total_loads']} loads, {s['non_human_pct']:.0%} non-human")
            return report
        elif report["status"] == "failed":
            raise RuntimeError(f"Report {report_id} failed.")
        print(f"Status: {report['status']} — retrying in {poll_interval}s")
        time.sleep(poll_interval)
Dimensions reference
| Dimension | Value | Granularity |
|---|---|---|
| Date | date | Daily |
| Agent | agent | Individual agent name |
| Agent type | agent_type | human, search, llm_training, llm_retrieval, scraper |
| URL | url | Individual page URL path |
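With dimensions ["date", "agent_type"], each entry in data covers one (date, agent_type) pair. Pivoting the rows into one record per date makes daily bot share easy to compute; a sketch using the sample data rows above:

```python
# Sample data rows for a single day, broken down by agent_type.
rows = [
    {"date": "2026-01-13", "agent_type": "human", "loads": 448219},
    {"date": "2026-01-13", "agent_type": "search", "loads": 62540},
    {"date": "2026-01-13", "agent_type": "llm_training", "loads": 12890},
    {"date": "2026-01-13", "agent_type": "llm_retrieval", "loads": 13420},
    {"date": "2026-01-13", "agent_type": "scraper", "loads": 9102},
]

# Pivot to {date: {agent_type: loads}}.
by_date: dict[str, dict[str, int]] = {}
for row in rows:
    by_date.setdefault(row["date"], {})[row["agent_type"]] = row["loads"]

day = by_date["2026-01-13"]
bot_loads = sum(v for k, v in day.items() if k != "human")
bot_share = bot_loads / (bot_loads + day["human"])
print(f"{bot_loads} bot loads, {bot_share:.0%} of the day")
```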
Error responses
When a request fails, the API returns a JSON object with an error code and a human-readable message:
{
  "error": "not_found",
  "message": "Publisher report with ID 999 not found"
}
HTTP status codes
| Status | Meaning |
|---|---|
| 200 OK | Success |
| 201 Created | Resource created (POST endpoints) |
| 400 Bad Request | Invalid or missing parameters |
| 401 Unauthorized | Missing or invalid API key |
| 404 Not Found | Resource not found |
| 422 Unprocessable Content | Validation error (e.g. invalid field values) |
| 429 Too Many Requests | Rate limit exceeded |
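A sketch of client-side handling based on the table above: treat 429 as retryable and the other 4xx codes as problems with the request itself. The retry policy here is an assumption for illustration, not documented API guidance:

```python
# Status codes worth retrying (assumption: only rate limiting).
RETRYABLE = {429}

def classify_response(status_code: int, body: dict) -> str:
    """Map an HTTP status and error body to a client action."""
    if status_code in (200, 201):
        return "ok"
    if status_code in RETRYABLE:
        return "retry"  # back off, then resend the same request
    # 400/401/404/422: fix the request; surface the API's message
    return f"fail: {body.get('error')}: {body.get('message')}"

print(classify_response(404, {"error": "not_found", "message": "Publisher report with ID 999 not found"}))
```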