Commit Graph

34 Commits

Author SHA1 Message Date
Alexander b07d999d86 docs: add metrics reference 2026-04-14 17:54:32 +02:00
Alexander 27b647e9b4 refactor(ratelimit): remove per-window token tracking from proxy
Window token counts are now computed in Grafana using the @ modifier
with dashboard variables derived from proxy_usage_resets_at. This
eliminates in-memory state, file persistence, and restart sensitivity.

Removes: TokensIn/Out, RecordTokens, setResetTime, persist.go,
window_tokens observable gauges. -171 lines.
2026-04-14 14:25:31 +02:00
Alexander 273213cbed feat(ratelimit): persist window token counters across restarts
Save window state (resets_at + token counts) to ~/.claude/ on shutdown
and every poll cycle. On startup, restore counters if the window hasn't
rolled over. Fixes token counters resetting to zero on deploy.
2026-04-14 14:07:28 +02:00
Alexander b864092dad fix(stream): extract input tokens from message_start event
message_delta only contains output_tokens. Input tokens are in the
message_start event under message.usage.input_tokens. This was causing
input token counts to be near-zero for all streaming requests.
2026-04-14 13:55:06 +02:00
Alexander 0ab1896eef Revert "refactor(ratelimit): remove in-memory per-window token tracking"
This reverts commit eda66ff7d4.
2026-04-14 13:50:34 +02:00
Alexander eda66ff7d4 refactor(ratelimit): remove in-memory per-window token tracking
Token counts per rate limit window are now derived in Grafana via
increase(counter[5h/168h]) on the existing cumulative OTel counters.
Removes TokensIn/Out from Window, RecordTokens, setResetTime, and
the window_tokens observable gauges.
2026-04-14 13:49:05 +02:00
Alexander 744abc1d24 fix(ratelimit): clear window token counters on reset from response headers
UpdateFromHeaders was silently updating ResetsAt without clearing token
counters. When a window rolled over, the poll method would see ResetsAt
already updated and skip the reset. Extract setResetTime helper used by
both code paths.
2026-04-14 13:37:06 +02:00
Alexander e8af26d626 docs: rewrite README to cover all proxy features 2026-04-14 13:17:54 +02:00
Alexander fac9578975 feat(ratelimit): track per-window token usage and utilization
Poll /api/oauth/usage every 5 min and extract utilization from
/v1/messages response headers for real-time updates. Track proxy
tokens in/out per rate limit window (5h/7d), resetting on window
change. Expose as OTel observable gauges for Grafana dashboards.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:51:31 +02:00
Alexander 76aeeb6be1 fix(auth): add oauth-2025-04-20 beta header + debug logging
Ensure anthropic-beta includes oauth-2025-04-20 when using OAuth tokens,
fixing 401 "OAuth authentication is currently not supported" errors.
Add debug-level logging for upstream requests/responses, sniffed headers,
and token refresh operations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:08:08 +02:00
Alexander 9cc052c162 Add telemetry 2026-04-14 10:31:56 +02:00
Alexander 20049881ad Remove duplicate logging 2026-04-11 15:21:18 +02:00
Alexander 3435f5f4c5 Update example 2026-04-10 18:27:29 +02:00
Alexander 807e8ba133 fix(nix): update vendorHash and vendor dir for new deps 2026-04-10 18:25:19 +02:00
Alexander da59d8f83b refactor(auth): migrate to zerolog structured logging 2026-04-10 18:19:13 +02:00
Alexander 4e22c463cf refactor(proxy): migrate to zerolog structured logging 2026-04-10 18:19:13 +02:00
Alexander 76bf651742 refactor(server): migrate to zerolog, add request logging middleware 2026-04-10 18:19:13 +02:00
Alexander 3d1eb7bd4b refactor(main): migrate to zerolog structured logging 2026-04-10 18:19:13 +02:00
Alexander bfcbe0b37d feat(config): add logging configuration fields 2026-04-10 18:15:49 +02:00
Alexander a7b583839d feat(logging): add zerolog + lumberjack structured logging package 2026-04-10 18:15:49 +02:00
Alexander c5f6962104 Package proxy with nix 2026-04-10 14:44:07 +02:00
Alexander 5ec0004e4c Update example rules 2026-04-10 14:36:59 +02:00
Alexander bf68a0fbeb Update flake deps 2026-04-10 14:33:11 +02:00
Alexander e3c4854be0 fix(auth): bind callback server to localhost for IPv4/IPv6 compat, fix nil deref 2026-04-10 14:30:23 +02:00
Alexander 8b7d9bfff9 docs: update README and config for self-managed authentication 2026-04-10 14:17:46 +02:00
Alexander 65e843f57a feat: wire OAuth login into startup, auto-detect credentials 2026-04-10 14:17:46 +02:00
Alexander 9858530ff6 fix(auth): handle credential file creation in persistCredential 2026-04-10 14:14:42 +02:00
Alexander 21176949a6 feat(auth): add OAuth PKCE login flow with browser + manual fallback 2026-04-10 14:14:42 +02:00
Alexander 945a865bbe refactor(config): remove claude_credentials, add default credential path 2026-04-10 14:14:38 +02:00
Alexander 17cde479c3 Remove dead code, secure debug endpoints, fix encapsulation 2026-04-10 13:07:26 +02:00
Alexander 4abd4e68dc Fixes, readme
Drop cli-proxy-api token handling, use only native Claude credentials.
Simplify refresh to single endpoint (platform.claude.com) with scope.
Add debug/refresh and debug/shutdown endpoints. Graceful shutdown.
2026-04-10 12:56:42 +02:00
Alexander f22765d8f0 Fixes, readme 2026-04-09 23:06:17 +02:00
Alexander 909c8b1894 Add request sanitizer, background token refresh, and OpenCode support
Sanitizer renames tool names and replaces system prompt patterns
that Anthropic fingerprints to detect non-Claude-Code clients.
Lowercase tool names (bash, read, glob, etc.) combined together
trigger rejection — renaming to PascalCase bypasses this.
Configurable via YAML sanitize rules for tools, system, and body.

Background OAuth token refresh every 30s with 5-minute pre-expiry
lead. Uses Chrome TLS fingerprint for refresh endpoint too.

Adds /messages route (without /v1 prefix) for OpenCode compat.
2026-04-09 22:52:43 +02:00
Alexander c4c1d4daa4 Anthropic API proxy with OAuth credential rotation and Claude Code fingerprinting
Sniffs a real Claude Code request on startup to capture exact HTTP headers,
then replays them for all proxied requests. Injects the billing header with
per-request SHA256 fingerprint into the system prompt. Uses utls with Chrome
TLS fingerprint to pass Cloudflare's bot detection on api.anthropic.com.

Supports both streaming (SSE) and non-streaming modes, round-robin credential
selection with automatic failover, and loading OAuth tokens from both
cli-proxy-api auth files and native ~/.claude/.credentials.json.
2026-04-09 21:05:32 +02:00