The intelligence layer for your AI API stack

Track API usage across Synthetic, Z.ai, and Anthropic in one place. onWatch polls every provider's quotas, stores historical data locally, and surfaces patterns, projections, and cross-provider context—so you always know where your budget stands and which provider has headroom.

One-command install
$ curl -fsSL https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.sh | bash

v1.8.0 · Go · GPL-3.0 · Zero Telemetry · ~25 MB

Dashboard preview (localhost:9211). Provider tabs: Synthetic, Z.ai, Anthropic, All. Quota cards: Subscription, 154 / 1,350, countdown 3h 47m; Search (Hourly), 238 / 250, countdown 38m; Tool Calls, 14,191 / 16,200, countdown 16h 22m. Chart: Usage over time, last 6 hours (Subscription, Search, Tool Calls).

Three providers, one dashboard

Each provider has different quotas, reset cycles, and gotchas. onWatch handles the complexity.

Anthropic
  • five_hour — utilization + reset
  • seven_day — utilization + reset
  • Per-model — Sonnet, Opus breakdown
Auto-detects token from Claude Code (Keychain, keyring, or credentials file).
Synthetic
  • subscription — ~5h reset cycle
  • search.hourly — 250 req/hr
  • toolCallDiscounts — ~24h cycle
Polls /v2/quotas every 60s. Does not count against quota.
Z.ai
  • TOKENS_LIMIT — daily token budget
  • TIME_LIMIT — daily + per-model
  • Tool Calls — daily limit
Normalizes Z.ai's confusing field names (usage = limit, currentValue = used).
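The Z.ai field mapping can be illustrated with a short Python sketch. Only the two field names (`usage` and `currentValue`) come from the note above; the `Quota` type, the `normalize_zai` helper, the `type` key, and the sample numbers are hypothetical, and this is not onWatch's own code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quota:
    """Provider-neutral view of a single quota (hypothetical type)."""
    name: str
    limit: float
    used: float

    @property
    def percent_used(self) -> float:
        # Guard against a zero limit so callers never divide by zero.
        return 0.0 if self.limit <= 0 else 100.0 * self.used / self.limit


def normalize_zai(raw: dict) -> Quota:
    """Map Z.ai's field names onto neutral ones.

    `usage` holds the limit and `currentValue` holds the amount used.
    The `type` key is an assumed name for the quota label.
    """
    return Quota(
        name=raw.get("type", "unknown"),
        limit=float(raw["usage"]),
        used=float(raw["currentValue"]),
    )


# Made-up payload for illustration only.
sample = {"type": "TOKENS_LIMIT", "usage": 1000, "currentValue": 340}
quota = normalize_zai(sample)
print(quota.name, quota.limit, quota.used, quota.percent_used)
```

Once every provider is mapped into one shape like this, the rest of the pipeline can compare quotas without knowing which provider they came from.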
Cross-Provider View
See all providers side-by-side. Compare headroom at a glance and route work to whichever provider has capacity.
Only in onWatch

You can't optimize what you can't measure

Provider dashboards show a snapshot. onWatch shows the story.

Discover your usage patterns

After a week of data, you stop guessing and start knowing when you burn quota fastest. onWatch reveals consumption rhythms that provider dashboards never show.

Anthropic five_hour utilization hits 80% by 2 PM on weekdays but barely moves on weekends.
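A rhythm like this could be surfaced by bucketing stored samples by hour and by weekday versus weekend. The sketch below is one possible approach, not onWatch's implementation; the `hourly_profile` helper and the sample readings are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def hourly_profile(samples):
    """Average percent_used per (is_weekend, hour) bucket.

    `samples` is an iterable of (datetime, percent_used) pairs.
    """
    buckets = defaultdict(list)
    for ts, pct in samples:
        # weekday() returns 5 for Saturday and 6 for Sunday.
        buckets[(ts.weekday() >= 5, ts.hour)].append(pct)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Made-up readings taken at 2 PM: a Monday, a Tuesday, and a Saturday.
samples = [
    (datetime(2025, 1, 6, 14, 0), 80.0),
    (datetime(2025, 1, 7, 14, 0), 84.0),
    (datetime(2025, 1, 11, 14, 0), 6.0),
]
profile = hourly_profile(samples)
print("weekday 2 PM:", profile[(False, 14)])
print("weekend 2 PM:", profile[(True, 14)])
```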

Know before you're throttled

Extrapolates your consumption rate to the next reset boundary. Know whether you'll run out before relief arrives — and switch providers before it happens.

Anthropic seven_day: 31% used, resets Feb 14. Z.ai tokens: 95% used, resets tomorrow. Route to Anthropic.
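The simplest form of this extrapolation is linear: take the rate between two samples in the current cycle and extend it to the reset time. The sketch below assumes that model; `project_at_reset` and the sample values are hypothetical, and onWatch's actual projection may differ.

```python
from datetime import datetime


def project_at_reset(first, last, reset_at):
    """Linearly extrapolate utilization to the reset boundary.

    `first` and `last` are (timestamp, percent_used) pairs from the same
    cycle. Returns the projected percent_used at `reset_at`.
    """
    (t0, u0), (t1, u1) = first, last
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0:
        return u1  # not enough data to estimate a rate
    rate = (u1 - u0) / elapsed  # percentage points per second
    remaining = max((reset_at - t1).total_seconds(), 0.0)
    return u1 + rate * remaining


# Made-up cycle: 20% at 09:00, 60% at 11:00, reset at 14:00.
first = (datetime(2025, 1, 6, 9, 0), 20.0)
last = (datetime(2025, 1, 6, 11, 0), 60.0)
projected = project_at_reset(first, last, datetime(2025, 1, 6, 14, 0))
will_exhaust = projected >= 100.0
print(f"projected {projected:.0f}% at reset, exhausts early: {will_exhaust}")
```

A projection at or above 100% means the quota would run out before the reset at the current pace, which is the signal to switch providers.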

Route intelligently across providers

See every provider's headroom at a glance. When one nears its limit, you know exactly which alternative still has capacity.

Anthropic: 50% (5h). Synthetic: 95% (search). Z.ai: 34% (tokens). Route to Z.ai.
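The routing decision in this example reduces to picking the provider with the lowest utilization. A minimal sketch, assuming one utilization figure per provider; the `pick_provider` helper and its `ceiling` parameter are hypothetical.

```python
def pick_provider(utilization: dict[str, float], ceiling: float = 90.0):
    """Return the provider with the most headroom.

    Providers at or above `ceiling` percent are skipped. Returns None
    when every provider is too close to its limit.
    """
    candidates = {name: pct for name, pct in utilization.items() if pct < ceiling}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


current = {"Anthropic": 50.0, "Synthetic": 95.0, "Z.ai": 34.0}
print(pick_provider(current))  # Z.ai
```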

Never waste quota near a reset

Tracks every reset boundary independently — Anthropic's 5-hour and 7-day windows, Synthetic's ~5h subscription, Z.ai's daily token budget. Know when each resets and how much went unused.

Anthropic five_hour resets in 12 min. 50% utilization. Synthetic subscription resets in 3h.

What onWatch adds to every provider

Provider dashboards show a number. onWatch shows the context, history, and projections behind it.

Historical usage trends

Time-series charts with 1h, 6h, 24h, 7d, and 30d ranges for every provider. Consumption patterns emerge that a single snapshot can never reveal.

Reset cycle detection

Detects when quotas reset across all providers — Anthropic's 5-hour and 7-day windows, Synthetic's subscription cycles, Z.ai's daily token budget. Logs peak usage and total delta per cycle.
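One common way to detect a reset from polled data is to look for a sharp drop in utilization between consecutive samples. The sketch below assumes that heuristic and treats "total delta" as the last reading minus the first; `split_cycles`, `cycle_stats`, and `drop_threshold` are hypothetical names, not onWatch's detection logic.

```python
def split_cycles(samples, drop_threshold=5.0):
    """Split (timestamp, percent_used) samples into reset cycles.

    A reset is assumed whenever utilization falls by more than
    `drop_threshold` points between consecutive samples.
    """
    cycles, current = [], []
    for sample in samples:
        if current and current[-1][1] - sample[1] > drop_threshold:
            cycles.append(current)
            current = []
        current.append(sample)
    if current:
        cycles.append(current)
    return cycles


def cycle_stats(cycle):
    """Peak utilization and total change over one cycle."""
    values = [pct for _, pct in cycle]
    return {"peak": max(values), "delta": values[-1] - values[0]}


# Made-up samples: timestamps in seconds, utilization in percent.
samples = [(0, 10.0), (60, 25.0), (120, 40.0), (180, 2.0), (240, 12.0)]
stats = [cycle_stats(c) for c in split_cycles(samples)]
print(stats)
```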

Live countdowns & projections

Real-time countdown to each quota reset. Extrapolates your current rate to the next boundary so you know whether you'll make it — or should switch providers.
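A countdown in a compact form such as "3h 47m" or "38m" can be derived from the reset timestamp. This is an illustrative sketch; `format_countdown` and the timestamps are hypothetical.

```python
from datetime import datetime


def format_countdown(reset_at: datetime, now: datetime) -> str:
    """Render the time left until `reset_at` as 'Xh Ym' or 'Ym'."""
    # Clamp at zero so a reset that has already passed shows "0m".
    seconds = max(int((reset_at - now).total_seconds()), 0)
    hours, minutes = divmod(seconds // 60, 60)
    return f"{hours}h {minutes}m" if hours else f"{minutes}m"


now = datetime(2025, 1, 6, 12, 0)
print(format_countdown(datetime(2025, 1, 6, 15, 47), now))
print(format_countdown(datetime(2025, 1, 6, 12, 38), now))
```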

Session tracking & insights

Every agent run logs peak consumption per provider. Compare sessions side-by-side, see cycle utilization trends, and get cross-provider routing recommendations.

What your provider doesn't show you

API providers show current usage. onWatch shows everything else.

Capability Synthetic Z.ai Anthropic onWatch
Current quota usage
Reset time visibility
Historical usage trends
Reset cycle detection & history
Per-cycle consumption stats
Usage rate & projections
Per-session tracking
Cross-provider unified view
Live countdown timers
Open source & self-hosted

How the intelligence works

A background agent polls your API quotas, stores snapshots in SQLite, detects patterns, and serves an intelligence dashboard. That's it.
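The poll-and-store step can be sketched with Python's built-in `sqlite3` module. The table layout, the `open_store`, `record`, and `poll_once` helpers, and the fake fetcher are all assumptions made for illustration; onWatch's real schema and polling code are not shown in this page.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) a snapshot store. The schema is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS snapshots (
               captured_at REAL NOT NULL,
               provider    TEXT NOT NULL,
               quota       TEXT NOT NULL,
               used        REAL NOT NULL,
               quota_limit REAL NOT NULL
           )"""
    )
    return conn


def record(conn, provider, quota, used, limit, now=None):
    """Insert one snapshot row, stamped with the current time by default."""
    conn.execute(
        "INSERT INTO snapshots VALUES (?, ?, ?, ?, ?)",
        (time.time() if now is None else now, provider, quota, used, limit),
    )
    conn.commit()


def poll_once(conn, fetchers, now=None):
    """Run every fetcher once and store what it returns.

    `fetchers` maps a provider name to a callable that returns a list of
    (quota, used, limit) tuples.
    """
    for provider, fetch in fetchers.items():
        for quota, used, limit in fetch():
            record(conn, provider, quota, used, limit, now)


# A stand-in fetcher; a real one would call the provider's quota API.
conn = open_store()
fake = {"Synthetic": lambda: [("search.hourly", 238, 250)]}
poll_once(conn, fake, now=1000.0)
rows = conn.execute(
    "SELECT provider, quota, used, quota_limit FROM snapshots"
).fetchall()
print(rows)
```

Calling `poll_once` on a 60-second timer would give the collection interval described here; the analysis steps then read from the same table.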

1. Your Providers: Synthetic + Z.ai + Anthropic
2. onWatch Agent: Poll → Detect → Store → Analyze
3. Intelligence Dashboard: patterns, projections, decisions

~25 MB RAM idle · 0 dependencies · SQLite local intelligence store · 60s data collection interval

Who is onWatch for?

Anyone who pays for AI coding API access and wants to know where the budget goes.

Solo Developers

Running Cline, Claude Code, or Kilo Code on a single API key? onWatch tracks your burn rate across Anthropic, Synthetic, and Z.ai so you never get throttled mid-task.

Teams Sharing API Keys

Multiple people on the same Anthropic Max plan or shared Z.ai key? A single onWatch instance gives everyone a shared dashboard with historical trends and session tracking.

Multi-Provider Users

Subscribed to more than one provider? The cross-provider view shows every quota side-by-side so you can route work to whichever provider has headroom.

DevOps & Platform Engineers

Deploy as a ~25 MB sidecar with zero SaaS dependencies. Use the SQLite database as a Grafana data source or pipe the REST API into your existing monitoring stack.

Get up and running in minutes

Choose the method that works best for you. Supports macOS (ARM64, AMD64) and Linux (AMD64, ARM64).

Manual Download

Download the binary for your platform from GitHub Releases, configure your .env, and run directly.

bash
# Download (Linux AMD64 example)
curl -L -o onwatch \
  https://github.com/onllm-dev/onwatch/releases/latest/downloa./onwatch-linux-amd64
chmod +x onwatch

# Create config
cat > .env << 'EOF'
SYNTHETIC_API_KEY=syn_your_key_here
ZAI_API_KEY=your_zai_key_here
ONWATCH_ADMIN_USER=admin
ONWATCH_ADMIN_PASS=changeme
ONWATCH_PORT=9211
EOF

# Run
./onwatch           # start (background)
./onwatch --debug   # foreground mode
./onwatch stop      # stop
./onwatch status    # check status

Want to build from source? See DEVELOPMENT.md

Frequently asked questions

How do I get started?

Install with one command: curl -fsSL https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.sh | bash. Add your API keys to ~/.onwatch/.env — set any combination of SYNTHETIC_API_KEY, ZAI_API_KEY, or ANTHROPIC_TOKEN. onWatch polls each configured provider every 60 seconds, stores snapshots in SQLite, and serves a dashboard at localhost:9211 with live countdowns, charts, and cross-provider views.
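To make "any combination" concrete, the sketch below reads a `.env`-style file and lists which providers have a key set. Only the three variable names come from the answer above; `parse_env`, `enabled_providers`, and the sample file are hypothetical, and the sketch ignores the Claude Code auto-detection path.

```python
PROVIDER_KEYS = {
    "SYNTHETIC_API_KEY": "Synthetic",
    "ZAI_API_KEY": "Z.ai",
    "ANTHROPIC_TOKEN": "Anthropic",
}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def enabled_providers(env: dict[str, str]) -> list[str]:
    """Providers whose key is present and non-empty."""
    return [name for key, name in PROVIDER_KEYS.items() if env.get(key)]


# Made-up file contents: one key set, one left empty.
sample = """
# onWatch config
SYNTHETIC_API_KEY=syn_your_key_here
ZAI_API_KEY=
ONWATCH_PORT=9211
"""
print(enabled_providers(parse_env(sample)))
```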

Does onWatch work with Cline, Roo Code, Kilo Code, or Claude Code?

Yes. onWatch monitors the API provider (Synthetic, Z.ai, or Anthropic), not the coding tool. Any tool that uses these API keys—including Cline, Roo Code, Kilo Code, Claude Code, Cursor, Windsurf, and others—will have its usage tracked automatically.

How does Anthropic API tracking work?

Anthropic's Pro/Max plan exposes utilization percentages and reset times for five_hour and seven_day windows, plus per-model breakdowns (seven_day_sonnet, seven_day_opus). onWatch polls this data, stores historical snapshots, and adds what Anthropic doesn't show: usage trends over time, reset cycle detection, rate projections, and cross-provider context alongside Synthetic and Z.ai. Set ANTHROPIC_TOKEN in your .env or let onWatch auto-detect from Claude Code credentials.

What is the All view and why does it matter?

The All view is onWatch's cross-provider unified dashboard. It shows Synthetic, Z.ai, and Anthropic quotas side-by-side so you can compare headroom across providers at a glance. For example, if your Synthetic search quota is at 95% but Z.ai tokens are at 34%, you know to route work to Z.ai. No other tool provides this cross-provider intelligence; provider dashboards show only their own data.

Does onWatch send any data to external servers?

No. onWatch has zero telemetry. All usage data is stored locally in a SQLite file on your machine. The only outbound network calls are to the Synthetic, Z.ai, and Anthropic quota APIs you configure. No analytics, no tracking, no cloud. The source code is fully auditable on GitHub (GPL-3.0).

How much memory does onWatch use?

onWatch idles at ~25-30 MB RAM and peaks at ~50 MB during dashboard rendering. Breakdown: Go runtime (5 MB), SQLite in-process (2 MB), HTTP server (1 MB), polling buffer (1 MB). This is lighter than a single browser tab and designed to run as a background daemon indefinitely.

What platforms does onWatch support?

Pre-built binaries are available for macOS (ARM64 and AMD64), Linux (AMD64 and ARM64), and Windows (AMD64). onWatch is written in pure Go with no CGO dependencies, so it cross-compiles cleanly to any Go-supported platform.

Start making intelligent API decisions

Every hour you run without onWatch is an hour of usage data you'll never get back. Install in under a minute and start collecting intelligence immediately. Free, open source, zero telemetry.