Tested on: Ubuntu 24.04 LTS, Nginx 1.26.x (nginx.org stable repository). All directives are core Nginx — no third-party modules required.
Why this matters
A web tier with no rate limiting fails in three predictable ways:
- Authentication brute force. A WordPress, application, or admin-panel login form with no rate cap is one credential-stuffing tool away from compromise.
- Signup / forgot-password abuse. Endpoints that send email, provision accounts, or issue tokens are expensive and attractive to spammers.
- Single-tenant noise becomes shared-tenant outage. One misbehaving client (or one bot) hammering an endpoint can starve FPM workers, application-server threads, and database connections.
Rate limiting is not DDoS protection — that lives at the CDN / WAF layer if you need it. Nginx rate limiting is for the predictable, day-to-day class of behaviour: too many requests from too few sources to too few endpoints. Get this right and your origin survives even when something at the edge fails.
1. Define rate-limit zones at http level
limit_req_zone defines a shared-memory zone, a key, and a target rate.
Put these in the http {} block — typically in
/etc/nginx/conf.d/00-ratelimit.conf so they are visible regardless of
which server blocks load:
```nginx
# Per-IP zone, used as the global default. On 64-bit hosts each state
# takes 128 bytes, so 10 MB holds roughly 80,000 IPs.
limit_req_zone $binary_remote_addr zone=perip:10m rate=20r/s;

# Authentication endpoints: much tighter. 5 requests/minute is enough
# for a real human and orders of magnitude too low for a credential
# stuffer.
limit_req_zone $binary_remote_addr zone=auth:10m rate=5r/m;

# WordPress login specifically, used in the WordPress server block.
limit_req_zone $binary_remote_addr zone=wplogin:10m rate=5r/m;

# Signup / forgot-password: slightly more permissive than auth.
limit_req_zone $binary_remote_addr zone=signup:10m rate=10r/m;

# Concurrent connections per IP.
limit_conn_zone $binary_remote_addr zone=conn_perip:10m;

# When a limit fires, return 429 not the default 503.
limit_req_status 429;
limit_conn_status 429;
```
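A note on the units: rate=20r/s means twenty requests per second
sustained; rate=5r/m means five per minute. Nginx enforces these with a
leaky-bucket algorithm at millisecond granularity, so 20r/s really
means one request every 50 ms and 5r/m one every 12 seconds. Without a
burst allowance, a second request arriving inside that interval is
rejected; the burst parameter (see section 2) controls how much
short-term overage is permitted.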
2. Apply zones in server / location blocks
```nginx
server {
    # ... TLS, server_name, etc.; see /guides/nginx-tls-2026/

    # Global per-IP cap. burst=40 nodelay admits up to 40 requests
    # above the rate without queuing; anything beyond that gets 429.
    limit_req zone=perip burst=40 nodelay;

    # Concurrent connection cap.
    limit_conn conn_perip 20;

    # Authentication endpoints. A limit_req or limit_conn set inside a
    # location replaces the inherited server-level one for that location.
    location = /api/v1/login {
        limit_req zone=auth burst=2 nodelay;
        limit_conn conn_perip 3;
        proxy_pass http://upstream;
    }

    location = /api/v1/signup {
        limit_req zone=signup burst=5 nodelay;
        proxy_pass http://upstream;
    }

    location = /api/v1/forgot-password {
        limit_req zone=signup burst=2 nodelay;
        proxy_pass http://upstream;
    }

    # Default app traffic.
    location / {
        proxy_pass http://upstream;
    }
}
```
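The wplogin zone from section 1 is not used in the block above. A minimal sketch of how it might be applied in a WordPress server block follows. The PHP-FPM socket path is a placeholder, not something this guide specifies, and the FastCGI lines assume a stock fastcgi_params file; adapt both to your own PHP setup.

```nginx
# Sketch only: the socket path below is hypothetical.
# An exact-match location takes precedence over a regex "\.php$"
# location, so the PHP handling has to be repeated here.
location = /wp-login.php {
    limit_req zone=wplogin burst=2 nodelay;

    include fastcgi_params;
    fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
    fastcgi_pass unix:/run/php/php-fpm.sock;   # placeholder socket path
}
```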
Two patterns worth understanding:
- burst without nodelay queues excess requests and serves them at the configured rate. Use this for non-interactive workloads where queuing is acceptable.
- burst=N nodelay allows N requests above the rate, then 429s. Use this for interactive endpoints: users would rather see a clear rate-limit error than experience a multi-second mystery delay.
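To make the difference concrete, here is a sketch of both patterns side by side against the perip zone (20r/s). The two paths are hypothetical examples, not endpoints defined elsewhere in this guide.

```nginx
# Hypothetical batch endpoint: excess requests are queued and released
# at 20r/s, so the 40th queued request waits about 2 seconds (40 / 20).
location /api/v1/export {
    limit_req zone=perip burst=40;
    proxy_pass http://upstream;
}

# Hypothetical interactive endpoint: up to 40 excess requests pass
# immediately; anything beyond that is rejected with 429.
location /api/v1/search {
    limit_req zone=perip burst=40 nodelay;
    proxy_pass http://upstream;
}
```

Both locations reference the same zone, so they draw on the same per-IP counter; define a separate zone if batch and interactive traffic should be budgeted independently.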
3. Picking the right key
$binary_remote_addr is the default and right for most cases. Variants
that come up:
| Key | When to use |
|---|---|
| $binary_remote_addr | Default. Per-client IP. |
| $http_authorization | Per-API-key rate limiting (use a map to extract the bearer token). Effective against authenticated abuse where IP is shared. |
| $cookie_session_id | Per-logged-in-session, but the user can clear cookies, so weaker than $http_authorization. |
| $server_name | Per-vhost cap; rarely the right answer alone. |
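The map mentioned for $http_authorization could look roughly like this. The variable name $api_client_key, the zone name perkey, the rate, and the /api/ path are all illustrative assumptions, not part of the configuration above.

```nginx
# Sketch: derive a rate-limit key from the bearer token.
# $api_client_key and zone "perkey" are hypothetical names.
map $http_authorization $api_client_key {
    default "";                                        # no token: not counted in this zone
    ~*^Bearer\s+(?<bearer_token>\S+)$ $bearer_token;   # key on the token itself
}

limit_req_zone $api_client_key zone=perkey:10m rate=10r/s;

server {
    location /api/ {
        # Several limit_req directives may be combined at one level:
        # the per-IP cap still covers requests that carry no token.
        limit_req zone=perip  burst=40 nodelay;
        limit_req zone=perkey burst=20 nodelay;
        proxy_pass http://upstream;
    }
}
```

Requests with an empty key are not counted against a zone, which is why the per-IP limit is kept alongside the per-key one.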
Behind a CDN, $binary_remote_addr becomes the CDN IP. You must
either trust X-Forwarded-For from your CDN (via set_real_ip_from
and real_ip_header) or use a different key. The
Nginx TLS guide sets X-Real-IP and
X-Forwarded-For correctly when Nginx is the proxy; when a CDN is in
front of Nginx, configure real_ip_header X-Forwarded-For and
allow-list the CDN’s source ranges via set_real_ip_from.
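A minimal sketch of that real-IP configuration, placed at http level, follows. The address ranges are documentation placeholders and must be replaced with the ranges your CDN publishes. The directives come from ngx_http_realip_module, which ships with Nginx but is a build-time option; check that it appears in the output of nginx -V.

```nginx
# Placeholder ranges (RFC 5737 / RFC 3849 documentation blocks).
# Replace with your CDN's published source ranges.
set_real_ip_from 203.0.113.0/24;
set_real_ip_from 2001:db8::/32;

# Take the client address from the header the CDN sets.
real_ip_header X-Forwarded-For;

# Walk the header from right to left and use the last address that is
# not itself a trusted proxy.
real_ip_recursive on;
```

Only addresses listed in set_real_ip_from are trusted to supply the header, so a client connecting directly cannot pick its own rate-limit key by forging X-Forwarded-For.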
4. Returning 429 properly
By default, Nginx answers a rate-limited request with its bare built-in
error page: no Retry-After header and no machine-readable body. Improve this:
```nginx
http {
    # ... zone definitions

    # Custom 429 page.
    error_page 429 /errors/429.html;
}

server {
    # ... server config

    location = /errors/429.html {
        internal;
        default_type application/json;
        add_header Retry-After 60 always;
        add_header Cache-Control "no-store" always;
        return 429 '{"error":"rate_limited","retry_after_seconds":60}';
    }
}
```
A Retry-After header tells well-behaved clients (and legitimate
mobile apps, and some browsers) when to retry. Without it, clients
guess — often badly.
5. Log what you limit
limit_req logs to the Nginx error log at the level configured by
limit_req_log_level (default error). For high-volume rate-limit
events this is noisy; tune down to warn:
```nginx
limit_req_log_level warn;
limit_conn_log_level warn;
```
Then add an access-log marker so rate-limited requests are easy to find:
```nginx
log_format ratelimited '$remote_addr - [$time_iso8601] '
                       '"$request" $status $request_time '
                       'limit_req=$limit_req_status';

# Or include $sent_http_retry_after etc. as needed.
```
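A log_format does nothing until an access_log directive references it. One possible way to wire it up is a separate log that records only 429 responses, using the if= parameter of access_log. The variable name $log_ratelimited and the log path are assumptions for illustration.

```nginx
# Sketch: write 429 responses to their own log file.
# $log_ratelimited and the file path are hypothetical.
map $status $log_ratelimited {
    429     1;
    default 0;      # "0" or empty: the request is not logged here
}

server {
    access_log /var/log/nginx/access.log;                        # normal log
    access_log /var/log/nginx/ratelimited.log ratelimited if=$log_ratelimited;
}
```

Keying on $status also catches limit_conn rejections and any 429 returned by the upstream; map on $limit_req_status instead if only limit_req rejections are of interest.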
Routine rate-limit events should be alertable as a trend (sudden
spike), not per-event. A constant low-level background of 429s on
/wp-login.php is the system working.
6. fail2ban for persistent offenders
limit_req rejects excess requests one at a time; each rejected request
still costs Nginx a connection and a log line. For sources that keep
hitting the limit (credential stuffers, vulnerability scanners), a
temporary IP ban at the firewall is more efficient than running every
request through Nginx’s rate logic.
/etc/fail2ban/jail.d/nginx-limit-req.local:
```ini
[nginx-limit-req]
enabled = true
port = http,https
filter = nginx-limit-req
logpath = /var/log/nginx/error.log
maxretry = 10
findtime = 600
bantime = 3600
```
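The matching filter /etc/fail2ban/filter.d/nginx-limit-req.conf (fail2ban packages often ship a filter with this name already, so check before overwriting it):
```ini
[Definition]
failregex = limiting requests, excess:.* by zone .*, client: <HOST>
ignoreregex =
```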
Ten rate-limit events in 10 minutes earn a one-hour ban. Tune to your traffic shape; the right numbers are workload-specific.
7. Apply and verify
```shell
sudo nginx -t && sudo systemctl reload nginx

# Verify the zones exist:
sudo nginx -T 2>&1 | grep -E '^[[:space:]]*limit_(req|conn)_zone'

# Quick smoke test of the auth zone; should start returning 429
# after the first few requests:
for i in $(seq 1 10); do
    curl -sI -o /dev/null -w "%{http_code}\n" \
        https://example.com/api/v1/login
done
```
Gotchas
Setting limits “to be safe” before measuring
A perip zone at rate=20r/s with burst=40 looks defensive, until
your first marketing-email batch hits the homepage and produces a wave
of 429s for legitimate users. Measure your normal traffic peaks
before picking numbers. If you do not have the data, start permissive
and tighten.
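One way to gather that data without rejecting anyone is dry-run mode, available in the Nginx versions this guide targets. The sketch below reuses the perip and conn_perip zones from section 1; the limits shown are candidates to evaluate, not recommendations.

```nginx
server {
    # Candidate limits under evaluation.
    limit_req  zone=perip burst=40 nodelay;
    limit_conn conn_perip 20;

    # Dry run: excess requests are still counted and logged, but
    # nothing is actually rejected or delayed.
    limit_req_dry_run  on;
    limit_conn_dry_run on;
}
```

Requests that would have been rejected appear with $limit_req_status set to REJECTED_DRY_RUN, so the access-log marker from section 5 shows how many legitimate users a given limit would have hit. Remove the dry-run lines once the numbers look right.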
Rate-limiting health checks
If your load balancer probes /health once per second from a single
internal IP, your tightest per-IP limit will rate-limit your health
checks first. Exempt internal monitoring IPs via geo + map:
```nginx
geo $skip_ratelimit {
    default     0;
    10.0.0.0/8  1;   # internal monitoring
}

map $skip_ratelimit $req_limit_key {
    0 $binary_remote_addr;
    1 "";            # empty key: limit_req ignores empty keys
}

# Replaces the perip definition from section 1; a zone name can only
# be defined once.
limit_req_zone $req_limit_key zone=perip:10m rate=20r/s;
```
CDN IPs and rate limiting
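Without real_ip_header, every request looks like it came from your
CDN’s small pool of edge IPs. That trips limit_conn and limit_req
almost immediately under any real traffic. Always configure trusted
proxy headers when a CDN is in front of Nginx, then verify: requests
through the CDN should log the real client address, and a forged header
sent directly from an untrusted address
(curl -H "X-Forwarded-For: 203.0.113.9") should not change what is logged.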
Burst queuing tying up worker connections
burst without nodelay makes Nginx hold those requests until they
can be served at the configured rate. Under sustained overload, every
queued request occupies a worker connection. For interactive endpoints,
prefer burst=N nodelay; queue only for batch / non-interactive paths.
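Where neither pure queuing nor pure nodelay fits, the delay parameter of limit_req offers a middle ground. The sketch below assumes the perip zone from section 1; the path and the 20/40 split are illustrative values only.

```nginx
# Hypothetical endpoint. With burst=40 delay=20:
#   - the first 20 excess requests pass without delay,
#   - the next 20 are queued and released at the zone's rate,
#   - anything beyond 40 is rejected with 429.
location /api/v1/search {
    limit_req zone=perip burst=40 delay=20;
    proxy_pass http://upstream;
}
```

This keeps the number of held connections bounded by burst minus delay, while still absorbing short spikes without visible latency.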
What this guide deliberately does not cover
- DDoS protection at the edge: Nginx rate limits do not protect against volumetric attacks. That is a CDN / scrubbing problem.
- WAF rules / signature matching: separate concern, separate guide.
- Application-layer rate limits (e.g. token-bucket inside a Rails or Flask app): complementary, not a substitute. Application-aware rate limits can be richer (per-user, per-plan), but Nginx limits protect the origin even when the app is overwhelmed.