Nginx Rate Limiting — limit_req, limit_conn, and fail2ban

Tested on: Ubuntu 24.04 LTS, Nginx 1.26.x (nginx.org stable repository). All directives are core Nginx — no third-party modules required.

Why this matters

A web tier with no rate limiting fails in three predictable ways:

Authentication brute force. A WordPress, application, or admin-panel login form with no rate cap is one credential-stuffing tool away from compromise.
Signup / forgot-password abuse. Endpoints that send email, provision accounts, or issue tokens are expensive and attractive to spammers.
Single-tenant noise becomes shared-tenant outage. One misbehaving client (or one bot) hammering an endpoint can starve FPM workers, application-server threads, and database connections.

Rate limiting is not DDoS protection — that lives at the CDN / WAF layer if you need it. Nginx rate limiting is for the predictable, day-to-day class of behaviour: too many requests from too few sources to too few endpoints. Get this right and your origin survives even when something at the edge fails.

1. Define rate-limit zones at `http` level

limit_req_zone defines a shared-memory zone, a key, and a target rate. Put these in the http {} block — typically in /etc/nginx/conf.d/00-ratelimit.conf so they are visible regardless of which server blocks load:

# Per-IP zone, used as the global default. On 64-bit platforms each
# state takes 128 bytes, so 10 MB holds roughly 80,000 IPs.
limit_req_zone $binary_remote_addr zone=perip:10m rate=20r/s;

# Authentication endpoints: much tighter. 5 requests/minute is enough
# for a real human and orders of magnitude too low for a credential
# stuffer.
limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;

# WordPress login specifically, used in the WordPress server block.
limit_req_zone $binary_remote_addr zone=wplogin:10m rate=5r/m;

# Signup / forgot-password: slightly more permissive than auth.
limit_req_zone $binary_remote_addr zone=signup:10m rate=10r/m;

# Concurrent connections per IP.
limit_conn_zone $binary_remote_addr zone=conn_perip:10m;

# When a limit fires, return 429 not the default 503.
limit_req_status  429;
limit_conn_status 429;

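A note on the units: rate=20r/s means twenty requests per second sustained; rate=5r/m means five per minute. Nginx implements these limits with the leaky-bucket method and tracks them at millisecond granularity, so 20r/s really means one request every 50 ms and 5r/m means one request every 12 seconds. Without burst, a second request arriving inside that interval is rejected; the burst parameter (see section 2) controls how much short-term excess is tolerated.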

2. Apply zones in server / location blocks

server {
    # ... TLS, server_name, etc.: see /guides/nginx-tls-2026/

    # Per-IP cap for every location that does not set its own limit_req.
    # burst=40 nodelay admits up to 40 excess requests immediately;
    # requests beyond burst get 429.
    limit_req zone=perip burst=40 nodelay;

    # Concurrent connection cap.
    limit_conn conn_perip 20;

    # Authentication endpoints.
    location = /api/v1/login {
        limit_req  zone=auth   burst=2 nodelay;
        limit_conn conn_perip  3;
        proxy_pass http://upstream;
    }

    location = /api/v1/signup {
        limit_req  zone=signup burst=5 nodelay;
        proxy_pass http://upstream;
    }

    location = /api/v1/forgot-password {
        limit_req zone=signup burst=2 nodelay;
        proxy_pass http://upstream;
    }

    # Default app traffic.
    location / {
        proxy_pass http://upstream;
    }
}
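
One detail of the block above is easy to miss: limit_req directives are inherited from the enclosing level only when the current level defines none of its own. The login, signup, and forgot-password locations each set a limit_req, so the server-level perip limit does not apply inside them. If login traffic should also count against the client's global budget, both limits can be listed together. A minimal sketch, reusing the zone names from section 1:

```nginx
# Fragment for the server {} block above; zone names are the ones from
# section 1.
location = /api/v1/login {
    # Several limit_req directives may share one context; a request has
    # to pass all of them.
    limit_req  zone=perip burst=40 nodelay;
    limit_req  zone=auth  burst=2  nodelay;
    limit_conn conn_perip 3;
    proxy_pass http://upstream;
}
```

The same inheritance rule applies to limit_conn, which is why the signup and forgot-password locations, having no limit_conn of their own, still get the server-level cap of 20.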

Two patterns worth understanding:

burst without nodelay queues exceeding requests and serves them at the configured rate. Use this for non-interactive workloads where queuing is acceptable.
burst=N nodelay allows N requests above the rate, then 429s. Use this for interactive endpoints — users would rather see a clear rate-limit error than experience a multi-second mystery delay.
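
The two patterns can be put side by side as follows. This is a sketch only: the /api/v1/export and /api/v1/search paths are hypothetical endpoints chosen for illustration, and the perip zone is the one from section 1.

```nginx
# Fragment for a server {} block.

# Hypothetical batch endpoint: excess requests wait in the queue and are
# released at the zone rate, so clients see latency instead of errors.
location /api/v1/export {
    limit_req zone=perip burst=40;
    proxy_pass http://upstream;
}

# Hypothetical interactive endpoint: up to 40 excess requests pass
# immediately, and anything beyond that is rejected at once.
location /api/v1/search {
    limit_req zone=perip burst=40 nodelay;
    proxy_pass http://upstream;
}
```

Both locations share one zone, so a client's requests to either path draw on the same per-IP budget; only the handling of the excess differs.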

3. Picking the right key

$binary_remote_addr is the conventional key and the right one for most cases. Variants that come up:

| Key | When to use |
| --- | --- |
| `$binary_remote_addr` | The usual choice. Per-client IP. |
| `$http_authorization` | Per-API-key rate limiting (use a `map` to extract the bearer token). Effective against authenticated abuse where IP is shared. |
| `$cookie_session_id` | Per-logged-in-session, but the user can clear cookies, so weaker than `$http_authorization`. |
| `$server_name` | Per-vhost cap; rarely the right answer alone. |
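
For the `$http_authorization` row, one possible shape of the `map` is sketched below. The variable name `$api_key`, the zone name `perkey`, and the rate are assumptions made for illustration, not values from this guide.

```nginx
# Fragment for the http {} block.

# Capture the token from "Authorization: Bearer <token>". Requests
# without a bearer token fall back to the client address, so they are
# still limited.
map $http_authorization $api_key {
    default                        $binary_remote_addr;
    "~*^Bearer\s+(?<token>\S+)$"   $token;
}

# Hypothetical zone keyed on the extracted token.
limit_req_zone $api_key zone=perkey:10m rate=10r/s;
```

Long tokens such as JWTs make each zone entry larger than a binary IP address does, so the same 10 MB holds fewer keys.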

Behind a CDN, $binary_remote_addr becomes the CDN IP. You must either trust X-Forwarded-For from your CDN (via set_real_ip_from and real_ip_header) or use a different key. The Nginx TLS guide sets X-Real-IP and X-Forwarded-For correctly when Nginx is the proxy; when a CDN is in front of Nginx, configure real_ip_header X-Forwarded-For and allow-list the CDN’s source ranges via set_real_ip_from.
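
A minimal sketch of that real-IP configuration follows. The address ranges are documentation placeholders, not real CDN ranges, and the directives come from ngx_http_realip_module, which the nginx.org packages include but a custom build may not.

```nginx
# Fragment for the http {} block. Replace the placeholder ranges with
# the list your CDN publishes.
set_real_ip_from 203.0.113.0/24;
set_real_ip_from 198.51.100.0/24;

# Take the client address from the header the CDN sets.
real_ip_header X-Forwarded-For;

# Use the last address in the header that is not in a trusted range,
# so entries a client prepends to the header are ignored.
real_ip_recursive on;
```

With this in place, `$binary_remote_addr` and `$remote_addr` reflect the restored client address, so the zones from section 1 work unchanged.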

4. Returning 429 properly

By default Nginx answers a rate-limited request with its generic built-in HTML error page: no Retry-After header and nothing a client can parse. Improve this:

http {
    # ... zone definitions

    # Custom 429 response, inherited by every server block.
    error_page 429 /errors/429.html;

    server {
        # ... server config

        location = /errors/429.html {
            internal;
            default_type application/json;
            add_header Retry-After 60 always;
            add_header Cache-Control "no-store" always;
            return 429 '{"error":"rate_limited","retry_after_seconds":60}';
        }
    }
}

A Retry-After header tells well-behaved clients (and legitimate mobile apps, and some browsers) when to retry. Without it, clients guess — often badly.

5. Log what you limit

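limit_req logs rejections to the Nginx error log at the level set by limit_req_log_level (default error); delayed requests are logged one level lower. At the default level, routine limiting drowns out genuine errors, so drop it to warn. Make sure the error_log directive itself is set to warn or a more verbose level, otherwise these messages are discarded and the fail2ban jail in section 6 has nothing to read: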

limit_req_log_level warn;
limit_conn_log_level warn;

Then add an access-log marker so rate-limited requests are easy to find:

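log_format ratelimited '$remote_addr - [$time_iso8601] '
                       '"$request" $status $request_time '
                       'limit_req=$limit_req_status';

# A log_format does nothing until an access_log directive uses it.
access_log /var/log/nginx/access.log ratelimited;

# Or include $sent_http_retry_after etc. as needed.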

Routine rate-limit events should be alertable as a trend (sudden spike), not per-event. A constant low-level background of 429s on /wp-login.php is the system working.
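
To graph that trend without scanning the full access log, limited requests can be written to a file of their own. A sketch, assuming the ratelimited log_format defined above; the `$log_limited` variable name and the log path are illustrative choices.

```nginx
# Fragment for the http {} block.

# 1 for requests that limit_req delayed or rejected, 0 otherwise
# (including requests where no limit_req applied and the variable is
# empty).
map $limit_req_status $log_limited {
    default   0;
    REJECTED  1;
    DELAYED   1;
}

# Only requests with $log_limited = 1 are written to this file.
access_log /var/log/nginx/ratelimited.log ratelimited if=$log_limited;
```

access_log follows the usual inheritance rule: a server or location block that declares its own access_log does not inherit this one, so the directive may need repeating there.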

6. fail2ban for persistent offenders

limit_req judges each request on its own: an offending source can keep connecting, and Nginx keeps evaluating and rejecting its requests. For sources that hit the limit repeatedly (credential stuffers, vulnerability scanners), a temporary IP ban at the firewall is more efficient than running every request through Nginx’s rate logic.

/etc/fail2ban/jail.d/nginx-limit-req.local:

[nginx-limit-req]
enabled  = true
port     = http,https
filter   = nginx-limit-req
logpath  = /var/log/nginx/error.log
maxretry = 10
findtime = 600
bantime  = 3600

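The matching filter, /etc/fail2ban/filter.d/nginx-limit-req.conf. fail2ban packages commonly ship a filter with this name, so check for an existing file before replacing it. The expression is anchored to the start of the message so that text an attacker places in the request line or Referer header cannot supply a fake client address:

[Definition]
failregex = ^\s*\[[a-z]+\] \d+#\d+: \*\d+ limiting requests, excess: [\d\.]+ by zone "[^"]+", client: <HOST>,
ignoreregex =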

10 rate-limit events in 10 minutes earns an hour-long ban. Tune to your traffic shape; the right numbers are workload-specific.

7. Apply and verify

sudo nginx -t && sudo systemctl reload nginx

# Verify the zones exist:
sudo nginx -T 2>&1 | grep -E '^[[:space:]]*limit_(req|conn)_zone'

# Quick smoke test of the auth zone. With rate=5r/m burst=2 nodelay,
# roughly the first three requests get the upstream's answer and the
# rest return 429:
for i in $(seq 1 10); do
  curl -sI -o /dev/null -w "%{http_code}\n" \
    https://example.com/api/v1/login
done

Gotchas

Setting limits “to be safe” before measuring

A perip zone at rate=20r/s with burst=40 looks defensive, until your first marketing-email batch hits the homepage and produces a wave of 429s for legitimate users (a single page load can request dozens of assets, and many users share one NAT address). Measure your normal traffic peaks before picking numbers. If you do not have the data, start permissive and tighten.
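
One way to gather that data is dry-run mode, available through limit_req_dry_run since Nginx 1.17.1. The sketch below assumes the perip zone from section 1; requests are counted and would-be rejections are logged, but nothing is actually rejected.

```nginx
# Fragment for a server {} or location {} block.
limit_req zone=perip burst=40 nodelay;

# Account for requests in the zone and log what would have been
# limited, without delaying or rejecting anything.
limit_req_dry_run on;
```

In this mode $limit_req_status reports REJECTED_DRY_RUN or DELAYED_DRY_RUN, so an access-log marker shows how often the candidate limit would have fired. Remove the directive (or set it to off) once the numbers look right.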

Rate-limiting health checks

If your load balancer probes /health once per second from a single internal IP, your tightest per-IP limit will rate-limit your health checks first. Exempt internal monitoring IPs via geo + map:

geo $skip_ratelimit {
    default     0;
    10.0.0.0/8  1;        # internal monitoring
}

map $skip_ratelimit $req_limit_key {
    0  $binary_remote_addr;
    1  "";                # empty key: limit_req ignores empty keys
}

# Replaces the perip definition from section 1; a zone name can be
# defined only once.
limit_req_zone $req_limit_key zone=perip:10m rate=20r/s;
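
The snippet above exempts monitoring traffic from limit_req only. limit_conn also skips requests whose key is empty, so the same `$req_limit_key` variable can be reused for the connection zone. A sketch, assuming it replaces the conn_perip definition from section 1:

```nginx
# Fragment for the http {} block; replaces the earlier conn_perip
# definition rather than adding a second one.
limit_conn_zone $req_limit_key zone=conn_perip:10m;
```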

CDN IPs and rate limiting

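Without real_ip_header, every request appears to come from your CDN’s small pool of edge IPs, so limit_conn and limit_req trip almost immediately under normal traffic. Always configure trusted proxy headers when a CDN is in front of Nginx, and verify the result: send a request with a forged X-Forwarded-For via curl -H from an address outside the trusted ranges and confirm the access log still shows the real source address.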

Burst queuing tying up worker connections

burst without nodelay makes Nginx hold those requests until they can be served at the configured rate. Under sustained overload, every queued request occupies a worker connection. For interactive endpoints, prefer burst=N nodelay; queue only for batch / non-interactive paths.

What this guide deliberately does not cover

DDoS protection at the edge: Nginx rate limits do not protect against volumetric attacks. That is a CDN / scrubbing problem.
WAF rules / signature matching: separate concern, separate guide.
Application-layer rate limits (e.g. token-bucket inside a Rails or Flask app): complementary, not a substitute. Application-aware rate limits can be richer (per-user, per-plan), but Nginx limits protect the origin even when the app is overwhelmed.
