Smalk AI Analytics

Description

See the AI agents and AI Search visitors your site actually receives — and never lose an event again.

The web is shifting from search results to answer engines. GPTBot, ClaudeBot, PerplexityBot, Gemini, Google AI Overviews and dozens of other AI crawlers visit your pages every day, and a growing share of your human traffic now arrives from an AI assistant rather than a classic SERP. Most analytics tools were built for the old web and miss all of it.

Smalk AI Analytics adds an AI-aware tracking layer to your WordPress site: every public-page visit (AI agent OR human-from-AI-search) is captured, classified, and pushed to your Smalk dashboard where you can see which engines cite you, which pages they prefer, and how that AI exposure converts into real human traffic.

Unlike crawler-blocking plugins, Smalk does not block AI bots. It helps you measure, optimize, and turn AI exposure into growth.

KEY FEATURES

Tracking

  • Real-time capture of AI agent visits (GPTBot, ClaudeBot, PerplexityBot, GeminiBot, GoogleOther, OAI-SearchBot, Mistral, AppleBot-Extended, and 30+ others) plus human visitors arriving from AI search engines.
  • Two tracking layers: a lightweight server-side hook captures bot traffic (which never executes JavaScript) AND a tracker.js snippet captures human session context. Both feed the same Smalk dashboard.
  • Rich proxy header capture (Cloudflare, geo, X-Forwarded-For, X-Real-IP, X-Smalk-CMS) so bot classification stays accurate even behind CDNs.

Reliability (new in 1.1.0)

  • Local MySQL queue buffers visits when the Smalk API is briefly unreachable — no more lost events during transient outages.
  • Background cron flushes the queue every minute via a batched endpoint (up to 500 visits per request).
  • Built-in circuit breaker pauses outbound calls after repeated failures and resumes automatically.
  • Per-row exponential-backoff retry + daily cleanup of stale entries.
  • “Direct” fallback mode for sites that prefer the old fire-and-forget behavior (no queue table).
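
The per-row retry schedule mentioned above can be pictured with a small sketch. This is an illustrative Python version, not the plugin's PHP code: the function name, the 60-second base delay, and the one-hour cap are assumptions made for the example, since the description above does not state the actual values.

```python
def backoff_delay(attempts: int, base_seconds: int = 60, cap_seconds: int = 3600) -> int:
    """Seconds to wait before retrying a queued row (hypothetical values).

    The delay doubles with every failed attempt and is capped, so a row
    is never postponed indefinitely.
    """
    if attempts < 1:
        return 0  # never tried yet: eligible immediately
    return min(base_seconds * 2 ** (attempts - 1), cap_seconds)


if __name__ == "__main__":
    for n in range(0, 8):
        print(n, backoff_delay(n))
```

With these assumed values a row waits 60 s after its first failure, 120 s after the second, and so on until the cap is reached; rows that keep failing are eventually removed by the daily cleanup.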

Privacy & compliance

  • Per-site redact list for query parameters (passwords, tokens, credit-card and SSN-style keys are redacted by default before any value leaves your server).
  • Deny-paths textarea with wildcards on top of a strict built-in exclusion list (/wp-admin, /wp-login, /wp-json, /xmlrpc.php, REST API calls, favicon, etc.).
  • Per-request-type skip toggles (admin / cron / AJAX / REST) — admin / cron / AJAX skipped by default.
  • No cookies are set on visitors. The plugin only sends standard HTTP request metadata.
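
As a rough illustration of the query-parameter redaction described above, the following Python sketch replaces sensitive values with [REDACTED]. It is not the plugin's implementation: the key list is taken from the defaults named in the changelog, while the function name and the case-insensitive substring matching rule are assumptions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Default keys as listed in the changelog; the matching rule is an assumption.
REDACT_KEYS = ("password", "token", "auth", "card", "ssn")


def redact_url(url: str, redact_keys=REDACT_KEYS) -> str:
    """Replace the values of sensitive query parameters with [REDACTED]."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    cleaned = [
        (key, "[REDACTED]" if any(s in key.lower() for s in redact_keys) else value)
        for key, value in pairs
    ]
    # safe="[]" keeps the placeholder readable instead of percent-encoding it.
    return urlunsplit(parts._replace(query=urlencode(cleaned, safe="[]")))


if __name__ == "__main__":
    print(redact_url("https://example.com/p?token=abc&page=2"))
```

Substring matching errs on the side of caution: under this assumed rule a harmless parameter such as "author" would also be redacted because it contains "auth".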

Admin experience

  • Modern card-based settings page with a 4-tile Status & Diagnostics panel: queue size, last successful send, circuit-breaker state, last error — refreshed live on every action.
  • One-click Send test event (probes the Smalk API with an X-Smalk-Probe: 1 header so the test never appears in your real analytics), Force-flush queue now, Resume sending (reset circuit breaker), and Clear page caches (purges WP Rocket, W3 Total Cache, LiteSpeed Cache, Autoptimize, WP Super Cache, Hummingbird, FlyingPress, SG Optimizer, WP Fastest Cache, Cache Enabler, WP Engine, plus the WordPress object cache and PHP OPcache).
  • Auto-save when you paste a new API key — no scrolling to “Save changes”.
  • Debug report generator: collects plugin version, queue + sender stats, cron schedules, detected cache plugins, live connectivity probes, and active plugins into a single text blob you can email to support — never dumps raw error_log (no risk of leaking unrelated secrets).

Performance

  • <link rel="preconnect"> + <link rel="dns-prefetch"> for the tracker origin before the async script tag — saves ~50–150 ms TLS+DNS handshake on cold connections (mostly mobile / first-time visitors).
  • Tracker JS auto-excluded from 10+ caching/minify plugins (WP Rocket, Autoptimize, W3TC, LiteSpeed, SG Optimizer, Hummingbird, FlyingPress, Asset CleanUp, …) so caching never strips it.
  • Zero impact on TTFB: every flavor of outbound call is either non-blocking (direct mode) or deferred to cron (queued mode, the default).

WHY CHOOSE SMALK AI

  • See AI traffic accurately. Most analytics drop AI agents entirely; Smalk catches them server-side and classifies them in real time.
  • Never lose an event. The queued mode keeps tracking your site even when our API is briefly unreachable.
  • Privacy-first defaults. Sensitive query parameters are redacted, admin / cron / AJAX requests are skipped, no cookies on visitors.
  • No bot blocking. Smalk helps you grow your AI footprint, not shrink it.
  • Plays nicely with every host and cache. Tested on WP 5.0 / PHP 7.0 through WP 6.9 / PHP 8.3. Auto-detects and works around the major caching / minify plugins.

EXTERNAL SERVICES

This plugin connects to Smalk AI’s API services for analytics and tracking functionality. The following data transmissions occur:

  1. Analytics Tracking:

    • The plugin sends website visit data to Smalk AI when AI agents visit your site
    • Data sent includes: visit information, page URLs, and user agent data
  2. Project Configuration:

    • The plugin connects to the Smalk API to retrieve project settings when the API key is saved or updated
    • Project data is cached locally and reused — no API calls are made on page loads
    • Only authenticated API requests are made using your project key
    • No personal user data is transmitted during these requests

For more information about our data handling practices, please visit:
– Terms of Service: https://smalkapp.notion.site/Terms-of-service-Smalk-9f047b4200b84b70a4fb38142cfb5799
– Privacy Policy: https://smalkapp.notion.site/Privacy-Policy-Smalk-08a503612b3e481596b0a434e96dd7c1?pvs=74

Installation

  1. Install and activate the Smalk AI plugin from the WordPress Plugin Directory.
  2. Go to the plugin’s settings page.
  3. Connect your Smalk AI account, then copy your API key and paste it into the settings page.
  4. Configure your AI agent tracking preferences and start optimizing your content!

FAQ

  • Do I need a Smalk account to use this plugin?
    Yes. Sign up at app.smalk.ai (30 seconds) to get an API key. The free tier covers most small to mid-size sites.

  • Is Smalk free?
    Smalk uses a freemium model. Basic AI traffic analytics are free.

  • What AI agents does the plugin track?
    All major ones: GPTBot, ChatGPT-User, OAI-SearchBot, ClaudeBot, Claude-Web, PerplexityBot, Gemini, GoogleOther, Google AI Overviews, MistralAI-User, AppleBot-Extended, Meta-ExternalAgent, and 30+ others. The list is maintained on a daily basis, so you get new agents the day they show up, with no plugin update needed.

  • Will it slow my site down?
    No. By default (queued mode, since 1.1.0) the plugin only writes one row to a local table per visit, then a background cron flushes them in batches every minute. TTFB impact is below a couple of milliseconds. The tracker.js for client-side capture loads async with a preconnect hint.

  • What happens if the Smalk API goes down?
    Visits keep being buffered locally for up to 7 days (configurable). A built-in circuit breaker pauses outbound calls after repeated failures and resumes automatically once the API recovers. Nothing is lost during transient outages.

  • Does the plugin set cookies?
    No cookies are set by Smalk on your visitors.

  • Does it block AI bots?
    No. Smalk is the opposite of a bot-blocker. AI-driven discovery is the future of web traffic — we help you measure it and optimize for it.

  • Where is my API key stored?
    In the wp_options table by default (same place every WP plugin stores credentials). For enterprise sites that want secrets out of the database, define SMALK_AI_API_KEY in wp-config.php — the constant overrides the DB value.

  • What data is sent to Smalk?
    Standard HTTP request metadata: path, method, user-agent, referer, proxy headers (Cloudflare, X-Forwarded-For, geo headers). Sensitive query parameter values (password, token, card, ssn, …) are redacted to [REDACTED] before transmission. No page content, no cookies, no personal identifiers beyond what’s in the request headers themselves.

  • Why should I care about AI agents?
    AI search is replacing classic Google for a growing share of queries. If GPTBot, ClaudeBot, or PerplexityBot can’t find or read your content, your future audience won’t see you. Smalk shows you exactly which engines visit, which pages they prefer, and which of those engines actually convert to human traffic.

Reviews

March 18, 2025
Just added on different blogs and I cannot imagine the way I’m present on AI live search products like ChatGPT and Perplexity
March 11, 2025
I was one of their early beta users and the team was super helpful understanding AI traffic and how AI models were scraping my content. Work like a charm, as described!
March 11, 2025
Since a month or two, I was witnessing traffic from ChatGPT that wasn’t in my Google Analytics. I found this plugin and the team was super helpful!! Since then, I discovered that I had a LOT of visits from AI Search and LLMs with referrals. I can now track this new marketing channel and optimize it!!

Contributors & Developers

“Smalk AI Analytics” is open source software.

Changelog

1.1.0

Reliability

  • Server-side tracking now buffers visits in a local MySQL queue and flushes them in batches via the new POST /api/v1/tracking/visit/batch/ endpoint. Survives temporary outages — no more lost events when api.smalk.ai is unreachable.
  • Circuit breaker pauses outbound calls after 3 consecutive failures (default 30-min cooldown) to avoid hammering a degraded API.
  • Exponential-backoff retry per row plus daily cleanup of stale / poison entries (7-day retention default).
  • Queue table is auto-created on plugin update (not just first activation) via ensure_schema() on plugins_loaded + one-shot self-heal in Smalk_Queue::insert if the table goes missing.
  • Smalk_Queue::insert failures (DB error, queue full, JSON encode failure) now log via error_log AND surface in the Status panel’s “Last error” tile. No more silent data loss.
  • New “Force-flush queue now” button is throttled to once per 30 s so spam-clicking during an outage doesn’t bypass the backoff.
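
The circuit-breaker behaviour listed above (pause after 3 consecutive failures, 30-minute cooldown, manual reset) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the plugin's PHP code: the class and method names are hypothetical, and the choice to fully close the breaker once the cooldown has elapsed is one plausible reading of "resumes automatically".

```python
import time


class CircuitBreaker:
    """Pause sending after repeated failures, resume after a cooldown."""

    def __init__(self, threshold: int = 3, cooldown_seconds: int = 30 * 60, clock=time.time):
        self.threshold = threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self.failures = 0
        self.opened_at = None

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown_seconds:
            self.reset()  # cooldown elapsed: resume automatically
            return True
        return False

    def record_success(self) -> None:
        self.reset()

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.threshold and self.opened_at is None:
            self.opened_at = self.clock()

    def reset(self) -> None:
        """Close the breaker, as a manual "Resume sending" action would."""
        self.failures = 0
        self.opened_at = None
```

A flush routine would call allow_request() before each batch and report the outcome with record_success() or record_failure(); while the breaker is open, visits simply stay in the local queue.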

Settings & UI

  • Setting: pick between Queued (recommended default) and Direct (legacy fire-and-forget) tracking modes.
  • Default batch_size raised from 100 to 200 (bounded 50–500). ≈12 000 visits/hour at the 60 s cron tick. Backend hard-caps at 500 visits / request.
  • Redesigned admin page: card-based layout, design tokens (violet + neutral grays), 4-tile Status & Diagnostics grid. Removed the legacy table-of-steps layout and the full-width pink CTA pill. Renamed “Tracking” to “AI Analytics” in the Connection card to match the app dashboard sidebar. Shared style guide: docs/Plugins/wordpress-admin-style-guide.md.
  • Status panel actions: “Send test event”, “Force-flush queue”, “Resume sending (reset circuit breaker)”, “Clear page caches” (purges WP Rocket, W3TC, LiteSpeed, Autoptimize, WP Super Cache, Hummingbird, FlyingPress, SG Optimizer, WP Fastest Cache, Cache Enabler, WP Engine, WP object cache + PHP OPcache).
  • Auto-save on API key paste/blur — no more “I scrolled past Save changes and forgot”.
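
The throughput figure quoted for the default batch size follows from simple arithmetic, sketched here in Python. The function names are hypothetical, and the calculation assumes exactly one batch is sent per cron tick, which is what the ≈12 000 visits/hour figure implies.

```python
def clamp_batch_size(requested: int, low: int = 50, high: int = 500) -> int:
    """Keep the configured batch size inside the documented 50-500 bounds."""
    return max(low, min(requested, high))


def visits_per_hour(batch_size: int, tick_seconds: int = 60) -> int:
    """Upper bound on visits flushed per hour with one batch per cron tick."""
    return batch_size * (3600 // tick_seconds)


if __name__ == "__main__":
    print(visits_per_hour(clamp_batch_size(200)))  # default batch size
    print(visits_per_hour(clamp_batch_size(500)))  # maximum batch size
```

Under the same assumption, raising the batch size to the 500 maximum gives a ceiling of 30 000 visits per hour.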

Privacy

  • Per-site redact list for query parameter names (default covers password, token, auth, card, ssn, etc.). Matched values are replaced with [REDACTED] before the URL is stored or sent.
  • Deny paths textarea with wildcard patterns (one per line). Layered on top of the built-in exclusion list.
  • Per-request-type skip toggles (admin, cron, AJAX, REST). Admin / cron / AJAX skipped by default.

API key storage

  • API key stored in wp_options (matches the standard practice of every comparable WP plugin: Sentry, NewRelic, Profound, MailChimp, etc.).
  • Sites that want the key out of the database can set it via define('SMALK_AI_API_KEY', '…') in wp-config.php. The override takes precedence and is the recommended path for enterprise / audited environments.

Debug & support

  • Debug report rewritten as an AJAX endpoint returning a structured, Smalk-specific text report: plugin version, tracking mode, API base URL + override source, key source (plain / constant / legacy ciphertext), workspace info, queue + sender stats, cron schedules, privacy rules, live connectivity probe to /projects/ and /tracking/visit/batch/ (with latency), detected cache plugins, active plugins, recent queue entries. No raw error_log dump (could leak unrelated secrets / explode on shared hosting); the report only tails Smalk-tagged lines from wp-content/debug.log when WP_DEBUG_LOG is on. Modal restyled with the design system.
  • “Send test event” + debug connectivity probe now send X-Smalk-Probe: 1. The backend ACKs with 202 but does not dispatch a Celery task, so support diagnostics never pollute production analytics with synthetic ScrapEvent rows.

Path exclusions (built-in, not configurable)

  • Always skipped: /wp-admin/*, /wp-login*, /wp-cron*, /wp-json/*, /wp-includes/*, /wp-content/*, /xmlrpc.php, /favicon.ico, /apple-touch-icon*.png, plus any ?rest_route=… REST API call and any request where REST_REQUEST is defined.
  • Still tracked (useful AI-analytics signals): /robots.txt, /sitemap*.xml, /ads.txt, /security.txt, /.well-known/*.
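
To make the exclusion rules above concrete, here is a hedged Python sketch of how a request path could be checked against the built-in list plus user-supplied deny paths. The patterns are copied from the list above; the function name, the use of shell-style wildcard matching, and the way deny paths are appended are assumptions. The REST_REQUEST check is WordPress-specific and is left out.

```python
from fnmatch import fnmatchcase
from urllib.parse import parse_qs, urlsplit

# Patterns copied from the list above; the matching rules are an assumption.
BUILT_IN_EXCLUSIONS = (
    "/wp-admin/*", "/wp-login*", "/wp-cron*", "/wp-json/*", "/wp-includes/*",
    "/wp-content/*", "/xmlrpc.php", "/favicon.ico", "/apple-touch-icon*.png",
)


def is_excluded(request_uri: str, deny_paths=()) -> bool:
    """Return True when a request should not be tracked."""
    parts = urlsplit(request_uri)
    if "rest_route" in parse_qs(parts.query, keep_blank_values=True):
        return True  # ?rest_route=... REST API call
    patterns = BUILT_IN_EXCLUSIONS + tuple(deny_paths)
    return any(fnmatchcase(parts.path, pattern) for pattern in patterns)


if __name__ == "__main__":
    for uri in ("/wp-login.php", "/robots.txt", "/?rest_route=/wp/v2/posts"):
        print(uri, is_excluded(uri))
```

Paths such as /robots.txt and /sitemap*.xml match none of the patterns, so they remain tracked, while a site-specific entry like /private/* passed in deny_paths would be skipped.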

Backend companion changes (relevant for self-hosted Smalk deployments)

  • New endpoint POST /api/v1/tracking/visit/batch/ (FastAPI + Django mirror, same path-replay routing as /visit/). Returns {received, queued, probe}.
  • Hard cap of 500 visits per request: larger payloads now return 413 Payload Too Large (was 422). Pydantic / DRF validation errors stay 400 / 422.
  • X-Smalk-Probe: 1 short-circuit returns 202 without dispatching tasks.
  • Full OpenAPI annotations on both sides.
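
For self-hosters, the status-code rules above can be summarised in a small decision function. This Python sketch is purely illustrative and is not the backend's code: the function name, the 413 response body, the meaning given to the received and queued fields, and the use of 202 for non-probe requests are all assumptions. Only the 500-visit cap, the 413 status, the probe header, and the three response keys come from the changelog.

```python
MAX_VISITS_PER_REQUEST = 500


def handle_batch(visits: list, headers: dict) -> tuple:
    """Return (status_code, body) for a batch request; nothing is dispatched here."""
    if len(visits) > MAX_VISITS_PER_REQUEST:
        return 413, {"detail": "Payload Too Large"}  # hypothetical body
    probe = headers.get("X-Smalk-Probe") == "1"
    # Assumption: probe requests are acknowledged but queue no work.
    queued = 0 if probe else len(visits)
    return 202, {"received": len(visits), "queued": queued, "probe": probe}


if __name__ == "__main__":
    print(handle_batch([{}] * 2, {"X-Smalk-Probe": "1"}))
    print(handle_batch([{}] * 501, {}))
```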

Cleanup

  • Proper uninstall.php — drops the queue table, deletes plugin options + transients, clears scheduled cron jobs. Opt-in “Preserve data” checkbox for users who plan to re-install.

Performance (rolled in from the unreleased 1.1.0-dev)

  • Inject <link rel="preconnect"> + <link rel="dns-prefetch"> for the Smalk tracker origin before the async script tag — saves ~50-150 ms TLS+DNS handshake on cold connections (mostly mobile / first-time visitors).
  • Validate parse_url returns both scheme and host before building the preconnect href (no malformed https:// link emitted if the API base URL is misconfigured).
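
The validation step described above can be sketched like this. It is a Python analogue of the idea rather than the plugin's PHP: the function names are hypothetical, urlsplit stands in for PHP's parse_url, and the example URL is only illustrative.

```python
from html import escape
from typing import Optional
from urllib.parse import urlsplit


def preconnect_origin(api_base_url: str) -> Optional[str]:
    """Return "scheme://host[:port]", or None if the URL is unusable."""
    parts = urlsplit(api_base_url)
    if not parts.scheme or not parts.netloc:
        return None  # misconfigured URL: emit no hint at all
    return f"{parts.scheme}://{parts.netloc}"


def resource_hints(api_base_url: str) -> str:
    """Build the two <link> hints, or an empty string when validation fails."""
    origin = preconnect_origin(api_base_url)
    if origin is None:
        return ""
    href = escape(origin, quote=True)
    return (
        f'<link rel="preconnect" href="{href}">\n'
        f'<link rel="dns-prefetch" href="{href}">'
    )


if __name__ == "__main__":
    print(resource_hints("https://api.smalk.ai/api/v1/"))
```

Returning nothing for a bare host name or an empty setting is the point of the check: a missing hint costs a few milliseconds, whereas a malformed href is invalid markup on every page.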

1.0.15
– Fix: removed page-level cache-busting that disabled ALL caching site-wide (DONOTCACHEPAGE, DONOTCACHEOBJECT, DONOTCACHEDB were set on every frontend request — devastating for sites using Cloudflare, Varnish, WP Rocket, etc.)
– Perf: skip server-side tracking on admin/system/REST paths early to avoid unnecessary DB queries on every admin request
– Perf: debug logs now gated behind WP_DEBUG — no more log noise in production
– Improved: cleaner tracker.js injection, removed dead backup script method
– Tested up to WordPress 6.9.4

1.0.14
– Fix: replaced define() inside function with local variables to prevent fatal error on re-registration
– Tested up to WordPress 6.9

1.0.13
– Added X-Smalk-CMS and X-Smalk-Plugin-Version headers on tracking requests for CMS detection

1.0.12
– Fix: use cached project ID instead of calling the API on every page load
– Fix: add trailing slashes to all API endpoint URLs to avoid 301 redirects
– Fix: use centralized Smalk_API class for all endpoint URLs instead of hardcoded strings
– Remove: robots.txt management section (feature removed in 1.0.9)

1.0.11
– Bug fix

1.0.10
– Remove PHPSESSID cookie.

1.0.9
– Remove unnecessary code for robots.txt

1.0.8
– Fix bug with Cache Plugins and robots.txt

1.0.7
– Fix bug with Cache Plugins

1.0.6
– Fix bug with Cache Plugins

1.0.5
– Add Debugging Mode

1.0.4
– Bug fix

1.0.3
– Bug fix

1.0.2
– Bug fix

1.0.1
– Initial release