Agentgateway supports two rate limiting modes:
  • Local rate limiting (localRateLimit): token-bucket rate limiting applied in-process. State is not shared across instances.
  • Remote rate limiting (remoteRateLimit): rate limiting delegated to an external gRPC server compatible with the Envoy Rate Limit Service API. State is shared across all gateway instances.
Both policies are configured under binds[].listeners[].routes[].policies.

Local rate limiting

Local rate limiting uses a token-bucket algorithm. Each route maintains its own bucket in memory.
localRateLimit (object[]): A list of token-bucket rate limit configurations applied to the route. State is kept local to the process.

Local rate limiting example

binds:
- port: 3000
  listeners:
  - routes:
    - policies:
        localRateLimit:
        - maxTokens: 10
          tokensPerFill: 1
          fillInterval: 60s
      backends:
      - mcp:
          targets:
          - name: everything
            stdio:
              cmd: npx
              args: ["@modelcontextprotocol/server-everything"]
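In this example, maxTokens sets the bucket capacity, so the route allows a burst of up to 10 requests. The bucket then refills by tokensPerFill (1 token) every fillInterval (60 seconds), giving a steady-state rate of 1 request per minute.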
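The token-bucket behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not Agentgateway's actual implementation: the TokenBucket class, its allow method, and the assumption that the bucket starts full are all hypothetical and chosen only to mirror the maxTokens, tokensPerFill, and fillInterval settings from the example.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Illustrative token bucket; not Agentgateway's actual implementation."""
    max_tokens: int        # mirrors maxTokens
    tokens_per_fill: int   # mirrors tokensPerFill
    fill_interval: float   # mirrors fillInterval, in seconds
    tokens: int = 0
    last_fill: float = 0.0

    def __post_init__(self) -> None:
        # Assumption: the bucket starts full, so the initial burst is allowed.
        self.tokens = self.max_tokens

    def allow(self, now: float) -> bool:
        # Add tokens for every whole interval elapsed since the last refill,
        # never exceeding the bucket capacity.
        intervals = int((now - self.last_fill) // self.fill_interval)
        if intervals > 0:
            refill = intervals * self.tokens_per_fill
            self.tokens = min(self.max_tokens, self.tokens + refill)
            self.last_fill += intervals * self.fill_interval
        # Spend one token per request; reject when the bucket is empty.
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False


bucket = TokenBucket(max_tokens=10, tokens_per_fill=1, fill_interval=60.0)
burst = [bucket.allow(now=0.0) for _ in range(12)]
print(burst.count(True))        # 10 requests pass, 2 are rejected
print(bucket.allow(now=59.0))   # False: no refill yet
print(bucket.allow(now=60.0))   # True: one token was added at t=60
print(bucket.allow(now=61.0))   # False: that token is already spent
```

Passing the clock in as an argument keeps the sketch deterministic; a real limiter would read a monotonic clock instead.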

Remote rate limiting

Remote rate limiting delegates rate limit decisions to an external gRPC service compatible with the Envoy Rate Limit Service API. The gateway sends descriptor entries to the service, which returns allow/deny decisions.
remoteRateLimit (object): Rate limits incoming requests using an external rate limit service.

Remote rate limiting example

binds:
- port: 3000
  listeners:
  - routes:
    - policies:
        remoteRateLimit:
          domain: "agentgateway"
          host: "127.0.0.1:8081"
          failureMode: failOpen
          descriptors:
          - entries:
            - key: "user"
              value: '"test-user"'
            - key: "tool"
              value: '"echo"'
            type: "requests"
      backends:
      - mcp:
          targets:
          - name: everything
            stdio:
              cmd: npx
              args: ["@modelcontextprotocol/server-everything"]
The remote rate limit service must implement the Envoy Rate Limit Service gRPC API. The host value must include the port (e.g. 127.0.0.1:8081).
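To make the request flow concrete, the following Python sketch models the exchange in memory. It is an illustration under stated assumptions, not the Envoy Rate Limit Service API or Agentgateway's code: FakeRateLimitService, gateway_decision, the per-descriptor limit, and the way counters are keyed are all hypothetical. It also assumes that failOpen means requests are allowed when the service cannot be reached, and it treats any other failure mode value as deny.

```python
from typing import Dict, List, Tuple

Entries = List[Tuple[str, str]]  # ordered (key, value) descriptor entries


class FakeRateLimitService:
    """In-memory stand-in for the external service (hypothetical)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit                # allowed requests per descriptor
        self.counts: Dict[str, int] = {}  # shared counters, keyed by descriptor
        self.reachable = True

    def allow(self, domain: str, entries: Entries) -> bool:
        if not self.reachable:
            raise ConnectionError("rate limit service unreachable")
        # Key the counter on the domain plus the ordered descriptor entries.
        key = domain + "|" + "|".join(f"{k}={v}" for k, v in entries)
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key] <= self.limit


def gateway_decision(service: FakeRateLimitService, domain: str,
                     entries: Entries, failure_mode: str = "failOpen") -> bool:
    """Ask the service; if it cannot be reached, fall back on failure_mode."""
    try:
        return service.allow(domain, entries)
    except ConnectionError:
        # Assumption: failOpen lets traffic through when the service is down;
        # this sketch treats any other value as deny.
        return failure_mode == "failOpen"


service = FakeRateLimitService(limit=2)
entries = [("user", "test-user"), ("tool", "echo")]
results = [gateway_decision(service, "agentgateway", entries) for _ in range(3)]
print(results)  # [True, True, False]

service.reachable = False
print(gateway_decision(service, "agentgateway", entries))  # True under failOpen
```

Because the counters live in the service rather than in the gateway, every gateway instance that sends the same descriptor draws from the same budget.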
Local rate limit state is not shared across multiple Agentgateway instances. Use remoteRateLimit for distributed deployments where consistent rate limiting across instances is required.
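The effect of unshared state can be seen with a small simulation. This is a simplified sketch: Counter and admitted are hypothetical helpers, the limiter has no refill, and requests are assumed to be spread round-robin across three instances.

```python
from typing import List


class Counter:
    """Simplified limiter: allows up to `limit` requests, with no refill."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def allow(self) -> bool:
        if self.used < self.limit:
            self.used += 1
            return True
        return False


def admitted(limiters: List[Counter], requests: int) -> int:
    # Round-robin load balancing: request i goes to instance i % len(limiters).
    return sum(limiters[i % len(limiters)].allow() for i in range(requests))


local = [Counter(10) for _ in range(3)]  # each instance keeps its own state
shared = Counter(10)                     # one counter behind all instances

print(admitted(local, 60))         # 30: every instance admits its own 10
print(admitted([shared] * 3, 60))  # 10: the limit holds across instances
```

With per-instance state, the effective limit scales with the number of instances, which is why a shared counter is needed when the limit must hold globally.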