The intelligence layer for your AI API stack

Track API usage across Synthetic, Z.ai, Anthropic, Codex, GitHub Copilot, and Antigravity in one place. onWatch polls every provider's quotas, stores historical data locally, and surfaces patterns, projections, and cross-provider context, so you always know where your budget stands and which provider has headroom. Get notified by email or push notification when quotas approach their limits.

One-command install
Mac & Linux
$ curl -fsSL https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.sh | bash
Homebrew
$ brew install onllm-dev/tap/onwatch
Windows
PS> irm https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.ps1 | iex

v2.11.5 · Go · GPL-3.0 · Zero Telemetry · <50 MB RAM · ~13 MB Binary

[Dashboard preview at localhost:9211. Provider tabs: Anthropic, Synthetic, Z.ai, Antigravity, All. The Anthropic tab shows three quota cards (Five-Hour, Seven-Day, Sonnet (7-Day)), each with utilization, time until reset, and status, above a "Usage over time - Last 6 hours" chart.]

Six providers, one dashboard

Each provider has different quotas, reset cycles, and gotchas. onWatch handles the complexity.

Codex
  • 5-Hour Limit - rolling limit utilization + reset
  • Review Requests - code review quota
  • Multi-Account (Beta) - track multiple accounts
Reads the OAuth token from ~/.codex/auth.json. Multi-account: save profiles with onwatch codex profile save <name>, refresh with onwatch codex profile refresh <name>.
Anthropic
  • five_hour - utilization + reset
  • seven_day - utilization + reset
  • Per-model - Sonnet, Opus breakdown
Auto-detects token from Claude Code (Keychain, keyring, or credentials file).
Synthetic
  • subscription - ~5h reset cycle
  • search.hourly - 250 req/hr
  • toolCallDiscounts - ~24h cycle
Polls /v2/quotas every 60s. Does not count against quota.
Z.ai
  • TOKENS_LIMIT - daily token budget
  • TIME_LIMIT - daily + per-model
  • Tool Calls - daily limit
Normalizes Z.ai's confusing field names (usage = limit, currentValue = used).
GitHub Copilot
  • premium_interactions - monthly limit (300-1500)
  • chat - unlimited
  • completions - unlimited
Uses a GitHub PAT with the copilot scope. Detects the monthly reset cycle.
Antigravity
  • Claude models - Sonnet, Opus quotas
  • Gemini models - Pro, Flash quotas
  • GPT OSS models - Open-source GPT variants
Zero configuration - automatically connects to your running Antigravity instance.
Cross-Provider View
See all providers side-by-side. Compare headroom at a glance and route work to whichever provider has capacity.
Only in onWatch
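
As an illustration of the Z.ai normalization mentioned in the provider list (usage = limit, currentValue = used), the remapping could be sketched as follows. Only those two field names come from the text above; the payload shape, the `normalize_zai` function, and the output keys are assumptions made for this example, and the sketch is in Python rather than onWatch's Go.

```python
def normalize_zai(raw):
    """Map Z.ai's field names onto clearer ones.

    Per the note above, `usage` holds the limit and `currentValue`
    holds the amount used. The output keys are hypothetical.
    """
    limit = float(raw["usage"])
    used = float(raw["currentValue"])
    percent = (used / limit * 100.0) if limit > 0 else 0.0
    return {
        "limit": limit,
        "used": used,
        "remaining": max(limit - used, 0.0),
        "percent_used": round(percent, 1),
    }


# Hypothetical payload for a token quota
sample = {"type": "TOKENS_LIMIT", "usage": 1000000, "currentValue": 340000}
print(normalize_zai(sample))
```

Normalizing once at ingestion means every later step (charts, projections, alerts) can treat all providers the same way.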

You can't optimize what you can't measure

Provider dashboards show a snapshot. onWatch shows the story.

Discover your usage patterns

After a week of data, you stop guessing and start knowing when you burn quota fastest. onWatch reveals consumption rhythms that provider dashboards never show.

Anthropic five_hour utilization hits 80% by 2 PM on weekdays but barely moves on weekends.

Know before you're throttled

Extrapolates your consumption rate to the next reset boundary. Know whether you'll run out before relief arrives - and switch providers before it happens.

Anthropic seven_day: 31% used, resets Feb 14. Z.ai tokens: 95% used, resets tomorrow. Route to Anthropic.
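
A projection like this can be approximated with simple linear extrapolation. The sketch below is a minimal Python illustration that assumes a constant burn rate between the first and last sample of the current cycle; the function name `project_at_reset` and its inputs are hypothetical, and onWatch's actual projection logic may differ.

```python
from datetime import datetime, timedelta


def project_at_reset(samples, reset_at):
    """Linearly extrapolate utilization (percent) to the reset time.

    samples: list of (timestamp, percent_used), oldest first, one cycle.
    Returns (projected_percent, exhausted_at), where exhausted_at is None
    if the quota is not projected to reach 100% before the reset.
    """
    (t0, u0), (t1, u1) = samples[0], samples[-1]
    elapsed = (t1 - t0).total_seconds()
    if elapsed <= 0 or u1 <= u0:
        return u1, None  # no measurable burn rate yet
    rate = (u1 - u0) / elapsed  # percent per second
    projected = u1 + rate * (reset_at - t1).total_seconds()
    if projected < 100.0:
        return projected, None
    exhausted_at = t1 + timedelta(seconds=(100.0 - u1) / rate)
    return projected, exhausted_at


start = datetime(2025, 1, 1, 9, 0)
samples = [(start, 20.0), (start + timedelta(hours=1), 40.0)]
projected, exhausted_at = project_at_reset(samples, start + timedelta(hours=5))
print(round(projected), exhausted_at)
```

If `exhausted_at` is not None, the quota is on track to run out before the reset, which is the signal to switch providers.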

Route intelligently across providers

See every provider's headroom at a glance. When one nears its limit, you know exactly which alternative still has capacity.

Anthropic: 50% (5h). Synthetic: 95% (search). Z.ai: 34% (tokens). Route to Z.ai.
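
The routing decision in this example amounts to picking the provider with the lowest utilization. A minimal Python sketch, assuming utilization is already expressed as a percentage per provider; `pick_provider` and the 90% ceiling are hypothetical choices for illustration, not onWatch settings.

```python
def pick_provider(utilization, ceiling=90.0):
    """Return the provider with the most headroom.

    utilization: dict mapping provider name to percent used.
    Providers at or above `ceiling` are excluded; returns None if
    no provider is below it.
    """
    candidates = {name: pct for name, pct in utilization.items() if pct < ceiling}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# The figures from the example above
print(pick_provider({"Anthropic": 50.0, "Synthetic": 95.0, "Z.ai": 34.0}))  # prints Z.ai
```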

Never waste quota near a reset

Tracks every reset boundary independently - Anthropic's 5-hour and 7-day windows, Synthetic's ~5h subscription, Z.ai's daily token budget. Know when each resets and how much went unused.

Anthropic five_hour resets in 12 min. 50% utilization. Synthetic subscription resets in 3h.

What onWatch adds to every provider

Provider dashboards show a number. onWatch shows the context, history, and projections behind it.

Multi-Account Codex (Beta)

Track multiple Codex accounts simultaneously. Save profiles with onwatch codex profile save, switch between accounts in the dashboard, and see per-account usage charts, sessions, and cycle history. Perfect for teams or personal/work account separation.

Usage trends & cycle detection

Time-series charts with 1h-30d ranges for every provider. Automatic reset cycle detection for Anthropic's 5-hour and 7-day windows, Synthetic's subscription cycles, and Z.ai's daily budget. Logs peak usage and delta per cycle.
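
One simple way to detect reset cycles from stored snapshots is to look for a sharp drop in utilization between consecutive readings. The Python sketch below uses that heuristic and records the peak and delta per cycle; the `detect_cycles` function and the 5-point drop threshold are assumptions for illustration, not onWatch's actual detection rule.

```python
def detect_cycles(readings, drop_threshold=5.0):
    """Split a utilization series into cycles at each sharp drop.

    readings: percent-used values in time order.
    A fall of more than `drop_threshold` points between consecutive
    readings is treated as a quota reset (a heuristic).
    Returns one dict per cycle with its peak and delta (peak - start).
    """
    cycles = []
    if not readings:
        return cycles
    start = peak = prev = readings[0]
    for value in readings[1:]:
        if prev - value > drop_threshold:
            # Utilization fell sharply: close the cycle and start a new one.
            cycles.append({"peak": peak, "delta": peak - start})
            start = peak = value
        else:
            peak = max(peak, value)
        prev = value
    cycles.append({"peak": peak, "delta": peak - start})  # cycle still in progress
    return cycles


print(detect_cycles([10, 30, 80, 2, 15, 40]))
```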

Live countdowns & projections

Real-time countdown to each quota reset. Extrapolates your current rate to the next boundary so you know whether you'll make it - or should switch providers.

Session tracking & insights

Every agent run logs peak consumption per provider. Compare sessions side-by-side, see cycle utilization trends, and get cross-provider routing recommendations.

Email & push alerts (Beta)

Get notified when quotas cross thresholds. SMTP email with AES-256 encrypted passwords, or browser push notifications via the PWA. Configure per-quota thresholds and delivery channels.

PWA installable (Beta)

Install onWatch from your browser for a native app experience. Works on desktop and mobile. Powered by Web Push (VAPID) with zero external dependencies. Note: Push notifications require HTTPS.

Anthropic OAuth auto-refresh

Automatically refreshes Anthropic tokens before expiry. Detects Claude Code credential changes, handles auth failures gracefully, and resumes polling when new tokens appear. Never manually update credentials.

Self-updating binary

Update from CLI or dashboard with one click. Automatic systemd restart on Linux, graceful binary replacement on macOS. Always stay current without manual downloads or service interruption.

Enterprise-grade security

AES-256-GCM encrypted SMTP passwords, auto-generated VAPID keys (ECDSA P-256), Web Push RFC 8291 encryption, constant-time auth, parameterized SQL. Zero telemetry - all data stays local.
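
"Constant-time auth" means comparing secrets in a way that does not reveal, through timing, how many leading characters matched. A minimal Python sketch of the idea using the standard library's `hmac.compare_digest`; storing a SHA-256 hex digest of the dashboard password is an assumption made for this example and says nothing about how onWatch itself stores credentials.

```python
import hashlib
import hmac


def verify_password(supplied, stored_sha256_hex):
    """Check a supplied password against a stored SHA-256 hex digest.

    hmac.compare_digest takes time independent of where the two
    strings first differ, unlike a plain == comparison.
    """
    digest = hashlib.sha256(supplied.encode("utf-8")).hexdigest()
    return hmac.compare_digest(digest, stored_sha256_hex)


stored = hashlib.sha256(b"s3cret").hexdigest()  # hypothetical stored value
print(verify_password("s3cret", stored), verify_password("guess", stored))
```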

REST API & Docker

15+ JSON endpoints for integration with your monitoring stack. Docker support with distroless base (~12 MB), non-root user, and Docker Compose ready. Perfect for DevOps workflows.

What your provider doesn't show you

API providers show current usage. onWatch shows everything else.

Capabilities compared across Synthetic, Z.ai, Anthropic, Codex, Copilot, and onWatch:
  • Current quota usage
  • Reset time visibility
  • Historical usage trends
  • Reset cycle detection & history
  • Per-cycle consumption stats
  • Usage rate & projections
  • Per-session tracking
  • Cross-provider unified view
  • Live countdown timers
  • Email & push alerts
  • PWA installable
  • OAuth token auto-refresh
  • Self-updating binary
  • REST API & Docker support
  • Open source & self-hosted

How the intelligence works

A background agent polls your API quotas, stores snapshots in SQLite, detects patterns, and serves an intelligence dashboard. That's it.

Your Providers (Synthetic + Z.ai + Anthropic + Codex + Copilot)
→ onWatch Agent (Poll → Detect → Store → Analyze)
→ Intelligence Dashboard (Patterns, projections, decisions)

  • RAM Usage: <50 MB
  • Dependencies: 0
  • Local Intelligence Store: SQLite
  • Data Collection Interval: 60s
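
The store step of this pipeline can be pictured as appending one row per quota per poll to a local SQLite table. The Python sketch below uses the standard `sqlite3` module with an in-memory database; the `snapshots` table, its columns, and the helper functions are hypothetical and do not reflect onWatch's real schema.

```python
import sqlite3
import time


def open_store(path=":memory:"):
    """Open (or create) a snapshot store with a hypothetical schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        " provider TEXT NOT NULL, quota TEXT NOT NULL,"
        " percent_used REAL NOT NULL, captured_at REAL NOT NULL)"
    )
    return conn


def store_snapshot(conn, provider, quota, percent_used, captured_at=None):
    """Insert one reading using a parameterized statement."""
    when = time.time() if captured_at is None else captured_at
    conn.execute(
        "INSERT INTO snapshots VALUES (?, ?, ?, ?)",
        (provider, quota, percent_used, when),
    )
    conn.commit()


def peak_usage(conn, provider, quota):
    """Return the highest recorded utilization, or None if no rows exist."""
    row = conn.execute(
        "SELECT MAX(percent_used) FROM snapshots WHERE provider = ? AND quota = ?",
        (provider, quota),
    ).fetchone()
    return row[0]


conn = open_store()
store_snapshot(conn, "anthropic", "five_hour", 12.5)
store_snapshot(conn, "anthropic", "five_hour", 48.0)
print(peak_usage(conn, "anthropic", "five_hour"))
```

Keeping every reading rather than only the latest value is what makes trends, cycle history, and projections possible later.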

Who is onWatch for?

Anyone who pays for AI coding API access and wants to know where the budget goes.

Solo Developers

Running Cline, Claude Code, or Kilo Code on a single API key? onWatch tracks your burn rate across Anthropic, Synthetic, Z.ai, Codex, and GitHub Copilot so you never get throttled mid-task.

Teams Sharing API Keys

Multiple people on the same Anthropic Max plan or shared Z.ai key? A single onWatch instance gives everyone a shared dashboard with historical trends and session tracking.

Multi-Provider Users

Subscribed to more than one provider? The cross-provider view shows every quota side-by-side so you can route work to whichever provider has headroom.

DevOps & Platform Engineers

Deploy as a ~13 MB binary (<50 MB RAM) with zero SaaS dependencies. Use the SQLite database as a Grafana data source or pipe the REST API into your existing monitoring stack.

Get up and running in minutes

Choose the method that works best for you. Supports macOS, Linux, and Windows.

Windows & Docker

PowerShell installer for Windows with interactive setup and auto-detection of Claude Code/Codex credentials. Docker for containerized deployments with distroless image (~12 MB), non-root user, and persistent data volume.

Windows (PowerShell)
# Interactive installer - auto-detects credentials
irm https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.ps1 | iex
# Manage: onwatch, onwatch stop, onwatch --debug
Docker
# Clone & configure
git clone https://github.com/onllm-dev/onwatch.git
cd onwatch && cp .env.docker.example .env
vim .env                  # add your API keys

# Run with Docker Compose
docker-compose up -d
docker-compose logs -f    # view logs
Manual Download

Download the binary for your platform from GitHub Releases. Best for Linux with systemd service management.

download & install
# Download (Linux AMD64)
curl -L -o onwatch \
  https://github.com/onllm-dev/onwatch/releases/latest/download/onwatch-linux-amd64
chmod +x onwatch && sudo mv onwatch /usr/local/bin/
configure & manage (Linux)
onwatch setup               # configure providers

# Manage via systemd
systemctl start onwatch     # start
systemctl stop onwatch      # stop
systemctl status onwatch    # check status
journalctl -u onwatch -f    # live logs
Build from Source

Clone the repo and build with app.sh. Requires Go 1.25+. Full control over build flags and development workflow. Includes 486 tests with race detection.

clone & build
git clone https://github.com/onllm-dev/onwatch.git
cd onwatch
cp .env.example .env
vim .env          # add your API keys
./app.sh --build  # build binary
develop & test
./app.sh --test    # run tests with -race
./app.sh --smoke   # quick pre-commit check
./onwatch --debug  # run in foreground
./onwatch          # start daemon

See DEVELOPMENT.md for advanced build options and cross-compilation.

Frequently asked questions

How do I get started?

The fastest way is Homebrew: brew install onllm-dev/tap/onwatch, then onwatch setup to configure your providers interactively. Alternatively, install with one command: curl -fsSL https://raw.githubusercontent.com/onllm-dev/onwatch/main/install.sh | bash. The setup wizard auto-detects Claude Code and Codex credentials, prompts for API keys, and configures dashboard credentials. onWatch polls each configured provider every 60 seconds, stores snapshots in SQLite, and serves a dashboard at localhost:9211 with live countdowns, charts, and cross-provider views.

Does onWatch work with Cline, Roo Code, Kilo Code, or Claude Code?

Yes. onWatch monitors the API provider (Synthetic, Z.ai, Anthropic, Codex, GitHub Copilot, or Antigravity), not the coding tool. Any tool that uses these API keys (including Cline, Roo Code, Kilo Code, Claude Code, Cursor, GitHub Copilot, Antigravity, and others) will have its usage tracked automatically.

How does Anthropic API tracking work?

Anthropic's Pro/Max plan exposes utilization percentages and reset times for five_hour and seven_day windows, plus per-model breakdowns (seven_day_sonnet, seven_day_opus). onWatch polls this data, stores historical snapshots, and adds what Anthropic doesn't show: usage trends over time, reset cycle detection, rate projections, and cross-provider context alongside Synthetic, Z.ai, Codex, and GitHub Copilot. Set ANTHROPIC_TOKEN in your .env or let onWatch auto-detect from Claude Code credentials.

How does Antigravity tracking work?

Antigravity provides access to multiple AI models (Claude, Gemini, GPT). onWatch auto-detects the Antigravity language server running on your machine by scanning for the process and extracting connection details. Set ANTIGRAVITY_ENABLED=true in your .env file. Models are grouped into logical quota pools (Claude+GPT, Gemini Pro, Gemini Flash) for cleaner tracking. For Docker deployments, configure ANTIGRAVITY_BASE_URL and ANTIGRAVITY_CSRF_TOKEN manually.
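
The grouping into quota pools can be sketched as a mapping from model name to pool. The three pool names below come from the answer above; matching on substrings of the model name, the example model identifiers, and the "Other" fallback are assumptions for illustration only.

```python
def quota_pool(model_name):
    """Assign a model to one of the quota pools named above."""
    name = model_name.lower()
    if "gemini" in name:
        if "flash" in name:
            return "Gemini Flash"
        if "pro" in name:
            return "Gemini Pro"
    if "claude" in name or "gpt" in name:
        return "Claude+GPT"
    return "Other"  # hypothetical fallback for unrecognized models


def group_by_pool(model_names):
    """Collect model names under their pool."""
    pools = {}
    for name in model_names:
        pools.setdefault(quota_pool(name), []).append(name)
    return pools


# Hypothetical model identifiers
print(group_by_pool(["claude-sonnet", "gpt-oss", "gemini-pro", "gemini-flash"]))
```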

What is the All Providers view and why does it matter?

The All Providers view is onWatch's cross-provider unified dashboard. It shows Synthetic, Z.ai, Anthropic, Codex, GitHub Copilot, and Antigravity quotas side-by-side so you can compare headroom across providers at a glance. For example, if your Synthetic search quota is at 95% but Z.ai tokens are at 34%, you know to route work to Z.ai. No other tool provides this cross-provider intelligence; provider dashboards only show their own data.

Does onWatch send any data to external servers?

No. onWatch has zero telemetry. All usage data is stored locally in a SQLite file on your machine. The only outbound network calls are to the Synthetic, Z.ai, Anthropic, Codex, GitHub Copilot, and Antigravity quota APIs you configure. No analytics, no tracking, no cloud. The source code is fully auditable on GitHub (GPL-3.0).

How much memory does onWatch use?

onWatch stays under 50 MB RAM (typically ~34 MB idle, ~43 MB under heavy load), measured with all six agents (Synthetic, Z.ai, Anthropic, Codex, GitHub Copilot, Antigravity) polling in parallel. Itemized components: Go runtime (5 MB), SQLite in-process (2 MB), HTTP server (1 MB), polling buffer (1 MB). This is lighter than a single browser tab, and onWatch is designed to run as a background daemon indefinitely.

How do email and push notifications work?

Configure SMTP settings in the Settings page to receive email alerts when quotas cross warning or critical thresholds, or when they reset. SMTP passwords are encrypted at rest with AES-256-GCM. For push notifications (Beta), enable them in Settings → Notifications → Delivery Channels and allow browser notifications. onWatch uses the Web Push protocol (VAPID) with no external service dependencies - VAPID keys are auto-generated and all encryption is handled locally. Note: Push notifications require HTTPS to work. You can choose email, push, or both.
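
The warning and critical thresholds described here can be thought of as a small state check on each poll. A minimal Python sketch, assuming 80% and 95% as default thresholds and alerting only when the status worsens; these values and the function names are hypothetical, since the real thresholds are configured per quota in Settings.

```python
SEVERITY = {"healthy": 0, "warning": 1, "critical": 2}


def classify(percent_used, warning=80.0, critical=95.0):
    """Map a utilization percentage to a status label."""
    if percent_used >= critical:
        return "critical"
    if percent_used >= warning:
        return "warning"
    return "healthy"


def should_notify(previous_status, current_status):
    """Alert only when the status gets worse, not on every poll."""
    return SEVERITY[current_status] > SEVERITY[previous_status]


previous = classify(72.0)
current = classify(83.5)
print(current, should_notify(previous, current))  # prints: warning True
```

Comparing against the previous status avoids sending a new alert every 60 seconds while a quota stays above a threshold.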

What platforms does onWatch support?

Pre-built binaries are available for macOS (ARM64 and AMD64), Linux (AMD64 and ARM64), and Windows (AMD64). onWatch is written in pure Go with no CGO dependencies, so it cross-compiles cleanly to any Go-supported platform.

Start making intelligent API decisions

Every hour you run without onWatch is an hour of usage data you'll never get back. Install in under a minute, start collecting intelligence immediately. Free, open source, zero telemetry.
