* feat(gateway): add auth rate-limiting & brute-force protection
Add a per-IP sliding-window rate limiter to Gateway authentication
endpoints (HTTP, WebSocket upgrade, and WS message-level auth).
When gateway.auth.rateLimit is configured, failed auth attempts are
tracked per client IP. Once the threshold is exceeded within the
sliding window, further attempts are blocked with HTTP 429 + Retry-After
until the lockout period expires. Loopback addresses are exempt by
default so local CLI sessions are never locked out.
The limiter is only created when explicitly configured (undefined
otherwise), keeping the feature fully opt-in and backward-compatible.
* fix(gateway): isolate auth rate-limit scopes and normalize 429 responses
---------
Co-authored-by: buerbaumer <buerbaumer@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Add a new `/v1/responses` endpoint implementing the OpenResponses API
standard for agentic workflows. This provides:
- Item-based input (messages, function_call_output, reasoning)
- Semantic streaming events (response.created, response.output_text.delta,
response.completed, etc.)
- Full SSE event support with both event: and data: lines
- Configuration via gateway.http.endpoints.responses.enabled
The endpoint is disabled by default and can be enabled independently
from the existing Chat Completions endpoint.
Phase 1 implementation supports:
- String or ItemParam[] input
- system/developer/user/assistant message roles
- function_call_output items
- instructions parameter
- Agent routing via headers or model parameter
- Session key management
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>