Tested on: Ubuntu 24.04 LTS, Nginx 1.26.x (nginx.org stable repository). All directives are core Nginx — no third-party modules required.

Why this matters

A web tier with no rate limiting fails in three predictable ways:

  1. Authentication brute force. A WordPress, application, or admin-panel login form with no rate cap is one credential-stuffing tool away from compromise.
  2. Signup / forgot-password abuse. Endpoints that send email, provision accounts, or issue tokens are expensive and attractive to spammers.
  3. Single-tenant noise becomes shared-tenant outage. One misbehaving client (or one bot) hammering an endpoint can starve FPM workers, application-server threads, and database connections.

Rate limiting is not DDoS protection — that lives at the CDN / WAF layer if you need it. Nginx rate limiting is for the predictable, day-to-day class of behaviour: too many requests from too few sources to too few endpoints. Get this right and your origin survives even when something at the edge fails.

1. Define rate-limit zones at http level

limit_req_zone defines a shared-memory zone, a key, and a target rate. Put these in the http {} block — typically in /etc/nginx/conf.d/00-ratelimit.conf so they are visible regardless of which server blocks load:

# Per-IP zone, used as the global default. On 64-bit platforms each
# state takes 128 bytes, so 10 MB stores roughly 80,000 IPs.
limit_req_zone $binary_remote_addr zone=perip:10m rate=20r/s;

# Authentication endpoints: much tighter. 5 requests/minute is enough
# for a real human and orders of magnitude too low for a credential
# stuffer.
limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;

# WordPress login specifically, used in the WordPress server block.
limit_req_zone $binary_remote_addr zone=wplogin:10m rate=5r/m;

# Signup / forgot-password: slightly more permissive than auth.
limit_req_zone $binary_remote_addr zone=signup:10m rate=10r/m;

# Concurrent connections per IP.
limit_conn_zone $binary_remote_addr zone=conn_perip:10m;

# When a limit fires, return 429 not the default 503.
limit_req_status  429;
limit_conn_status 429;

A note on the units: rate=20r/s means twenty requests per second sustained; rate=5r/m means five per minute. Nginx enforces these with a leaky-bucket algorithm at millisecond granularity, so 20r/s is really one request every 50 ms and 5r/m is one request every 12 seconds. The burst parameter of limit_req (see section 2) controls how much short-term overage is permitted.

2. Apply zones in server / location blocks

server {
    # ... TLS, server_name, etc.; see /guides/nginx-tls-2026/

    # Global per-IP cap. burst=40 nodelay admits up to 40 excess
    # requests immediately; requests beyond the burst get 429.
    # A location that sets its own limit_req (or limit_conn) replaces
    # the inherited one rather than adding to it.
    limit_req zone=perip burst=40 nodelay;

    # Concurrent connection cap.
    limit_conn conn_perip 20;

    # Authentication endpoints.
    location = /api/v1/login {
        limit_req  zone=auth   burst=2 nodelay;
        limit_conn conn_perip  3;
        proxy_pass http://upstream;
    }

    location = /api/v1/signup {
        limit_req  zone=signup burst=5 nodelay;
        proxy_pass http://upstream;
    }

    location = /api/v1/forgot-password {
        limit_req zone=signup burst=2 nodelay;
        proxy_pass http://upstream;
    }

    # Default app traffic.
    location / {
        proxy_pass http://upstream;
    }
}

Two patterns worth understanding:

  • burst without nodelay queues exceeding requests and serves them at the configured rate. Use this for non-interactive workloads where queuing is acceptable.
  • burst=N nodelay allows N requests above the rate, then 429s. Use this for interactive endpoints — users would rather see a clear rate-limit error than experience a multi-second mystery delay.
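
As a rough sketch of how these choices look side by side, the fragment below applies the perip zone from section 1 (20r/s) to three made-up paths. The location names are illustrative assumptions, not part of the configuration above. The third block uses the delay= parameter (available since Nginx 1.15.7), which sits between the two patterns.

```nginx
# Fragment for a server {} block. Assumes the perip zone (rate=20r/s)
# and limit_req_status 429 from section 1. All three paths are hypothetical.

# Queued: up to 40 excess requests are held and released at the zone
# rate (one every 50 ms); anything beyond that is rejected.
location /reports/export {
    limit_req zone=perip burst=40;
    proxy_pass http://upstream;
}

# Immediate: up to 40 excess requests pass at once, then rejection.
location /api/v1/search {
    limit_req zone=perip burst=40 nodelay;
    proxy_pass http://upstream;
}

# Two-stage: the first 10 excess requests pass at once, the next 30
# are queued, and anything beyond 40 is rejected.
location /api/v1/feed {
    limit_req zone=perip burst=40 delay=10;
    proxy_pass http://upstream;
}
```

The worst-case wait in the queued variant is roughly burst divided by rate: 40 / 20 = 2 seconds here, which is a quick way to check whether a given burst value is tolerable for an endpoint.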

3. Picking the right key

$binary_remote_addr is the default and right for most cases. Variants that come up:

Key                    When to use
$binary_remote_addr    Default. Per-client IP.
$http_authorization    Per-API-key rate limiting (use a map to extract the bearer token). Effective against authenticated abuse where IP is shared.
$cookie_session_id     Per-logged-in-session, but the user can clear cookies, so weaker than $http_authorization.
$server_name           Per-vhost cap; rarely the right answer alone.
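
For the $http_authorization row, one possible shape of the map is sketched below. The variable name $api_key, the zone name perkey, its rate, and the /api/ location are assumptions chosen for illustration. The token is used as an opaque key only; Nginx does not validate it.

```nginx
# http {} context. $api_key and the perkey zone are illustrative names.
map $http_authorization $api_key {
    # No bearer token: empty key, so this zone does not count the request.
    default                          "";
    # Case-insensitive match; the named capture becomes the key.
    "~*^Bearer\s+(?<token>\S+)$"     $token;
}

limit_req_zone $api_key zone=perkey:10m rate=10r/s;

server {
    location /api/ {
        # Several limit_req directives may be combined in one context;
        # a request must satisfy all of them.
        limit_req zone=perip  burst=40 nodelay;
        limit_req zone=perkey burst=20 nodelay;
        proxy_pass http://upstream;
    }
}
```

Keeping the per-IP limit alongside the per-key limit matters: a client that sends a different made-up token on every request gets a fresh bucket each time in the perkey zone, so the per-IP zone remains the backstop.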

Behind a CDN, $binary_remote_addr becomes the CDN IP. You must either trust X-Forwarded-For from your CDN (via set_real_ip_from and real_ip_header) or use a different key. The Nginx TLS guide sets X-Real-IP and X-Forwarded-For correctly when Nginx is the proxy; when a CDN is in front of Nginx, configure real_ip_header X-Forwarded-For and allow-list the CDN’s source ranges via set_real_ip_from.
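
A minimal sketch of the CDN case follows. The two address ranges are documentation placeholders, not real CDN ranges; substitute the list your CDN publishes.

```nginx
# http {} or server {} context. Uses ngx_http_realip_module, which the
# nginx.org packages include.

# PLACEHOLDER ranges (RFC 5737 documentation blocks). Replace with the
# source ranges published by your CDN.
set_real_ip_from 192.0.2.0/24;
set_real_ip_from 198.51.100.0/24;

# Take the client address from X-Forwarded-For, but only when the
# connection comes from one of the trusted ranges above.
real_ip_header    X-Forwarded-For;

# Walk the header right to left, skipping trusted addresses, and use
# the last non-trusted one.
real_ip_recursive on;
```

With this in place, $remote_addr and $binary_remote_addr carry the client address, so the zones from section 1 need no changes.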

4. Returning 429 properly

By default, Nginx returns the rate-limit status code with at most a generic HTML error page: no Retry-After header, no machine-readable body, and no friendly message. Improve this:

http {
    # ... zone definitions

    # Custom 429 page.
    error_page 429 /errors/429.html;

    server {
        # ... server config

        location = /errors/429.html {
            internal;
            default_type application/json;
            add_header Retry-After 60 always;
            add_header Cache-Control "no-store" always;
            return 429 '{"error":"rate_limited","retry_after_seconds":60}';
        }
    }
}

A Retry-After header tells well-behaved clients (HTTP libraries with retry support, legitimate mobile apps) when to retry. Without it, clients guess, and they often guess badly.

5. Log what you limit

limit_req logs to the Nginx error log at the level configured by limit_req_log_level (default error). For high-volume rate-limit events this is noisy; tune down to warn:

limit_req_log_level warn;
limit_conn_log_level warn;

Then add an access-log marker so rate-limited requests are easy to find:

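log_format ratelimited '$remote_addr - [$time_iso8601] '
                       '"$request" $status $request_time '
                       'limit_req=$limit_req_status';

# Attach the format to an access log; a log_format alone changes nothing.
access_log /var/log/nginx/access.log ratelimited;

# Or include $sent_http_retry_after etc. as needed.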

Routine rate-limit events should be alertable as a trend (sudden spike), not per-event. A constant low-level background of 429s on /wp-login.php is the system working.
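
To make that trend easy to graph, one option is a dedicated log that receives only rejected requests. The sketch below assumes the ratelimited log_format defined above; the variable name $is_rate_limited and the log path are illustrative.

```nginx
# http {} context. Variable name and log path are assumptions.
map $status $is_rate_limited {
    429      1;
    default  0;
}

# Written only when the condition is neither "0" nor empty. Place this at
# the same level as the main access_log so both stay active.
# Note: a 429 returned by the upstream application matches too.
access_log /var/log/nginx/ratelimited.log ratelimited if=$is_rate_limited;
```

Counting lines per minute in that file gives the spike signal without parsing the full access log.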

6. fail2ban for persistent offenders

limit_req rejects requests one at a time: an offender who keeps hitting the limit still costs Nginx a connection, a TLS handshake, and a log line on every attempt. For sources that do this persistently (credential stuffers, vulnerability scanners), a temporary IP ban at the firewall is more efficient than running every request through Nginx’s rate logic.

/etc/fail2ban/jail.d/nginx-limit-req.local:

[nginx-limit-req]
enabled  = true
port     = http,https
filter   = nginx-limit-req
logpath  = /var/log/nginx/error.log
maxretry = 10
findtime = 600
bantime  = 3600

The matching filter /etc/fail2ban/filter.d/nginx-limit-req.conf (fail2ban packages commonly ship a filter with this name already, so check before overwriting it):

[Definition]
failregex = limiting requests, excess: [\d.]+ by zone "[^"]+", client: <HOST>,
ignoreregex =

10 rate-limit events in 10 minutes earns an hour-long ban. Tune to your traffic shape; the right numbers are workload-specific.

7. Apply and verify

sudo nginx -t && sudo systemctl reload nginx

# Verify the zones exist:
sudo nginx -T 2>&1 | grep -E '^[[:space:]]*limit_(req|conn)_zone'

# Quick smoke test of the auth zone. With rate=5r/m and burst=2 nodelay,
# roughly the first three requests pass and the rest return 429:
for i in $(seq 1 10); do
  curl -sI -o /dev/null -w "%{http_code}\n" \
    https://example.com/api/v1/login
done

Gotchas

Setting limits “to be safe” before measuring

A per-IP zone at rate=20r/s applied with limit_req zone=perip burst=40 looks defensive, until your first marketing-email batch hits the homepage and legitimate users who share an address (office NAT, carrier NAT) get a wave of 429s. Measure your normal traffic peaks before picking numbers. If you do not have the data, start permissive and tighten.
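
One way to gather that data without rejecting anyone is dry-run mode, available since Nginx 1.17.1. This is a sketch, assuming the perip zone from section 1:

```nginx
# server {} or location {} context.
limit_req zone=perip burst=40 nodelay;

# Account for excess requests and log them, but do not reject or delay
# them. $limit_req_status reports REJECTED_DRY_RUN or DELAYED_DRY_RUN
# for requests that would have been limited.
limit_req_dry_run on;

# The connection limit has an equivalent switch (since 1.17.6).
limit_conn conn_perip 20;
limit_conn_dry_run on;
```

Run this through a normal traffic peak, count how many requests show REJECTED_DRY_RUN in the access log, then remove the dry-run lines once the numbers look right.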

Rate-limiting health checks

If your load balancer probes /health once per second from a single internal IP, your tightest per-IP limit will rate-limit your health checks first. Exempt internal monitoring IPs via geo + map:

geo $skip_ratelimit {
    default     0;
    10.0.0.0/8  1;        # internal monitoring
}

map $skip_ratelimit $req_limit_key {
    0  $binary_remote_addr;
    1  "";                # empty key: limit_req does not account these requests
}

# Replaces the perip definition from section 1; a zone name can be defined only once.
limit_req_zone $req_limit_key zone=perip:10m rate=20r/s;

CDN IPs and rate limiting

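Without real_ip_header, every request looks like it came from your CDN’s small pool of edge IPs, so limit_conn and limit_req trip almost immediately under real traffic. Always configure trusted proxy headers when a CDN is in front of Nginx, and verify the result: a request sent with curl -H 'X-Forwarded-For: 203.0.113.9' from an address outside the trusted ranges should still be logged under its real source IP.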

Burst queuing tying up worker connections

burst without nodelay makes Nginx hold those requests until they can be served at the configured rate. Under sustained overload, every queued request occupies a worker connection. For interactive endpoints, prefer burst=N nodelay; queue only for batch / non-interactive paths.

What this guide deliberately does not cover

  • DDoS protection at the edge: Nginx rate limits do not protect against volumetric attacks. That is a CDN / scrubbing problem.
  • WAF rules / signature matching — separate concern, separate guide.
  • Application-layer rate limits (e.g. token-bucket inside a Rails or Flask app) — complementary, not a substitute. Application-aware rate limits can be richer (per-user, per-plan), but Nginx limits protect the origin even when the app is overwhelmed.