web servers · 9 min read · January 15, 2026

Node.js Rate Limiting: What Happens When You Skip It

Your Express API is one curl loop away from a $4,000 cloud bill. Here's how rate limiting actually works in Node.js — and why the defaults will bite you.


A friend of mine shipped a Node.js API to production last year without rate limiting. Within 48 hours, someone discovered his open registration endpoint and hammered it with a script — 200,000 requests in under an hour. The app didn't go down (Node is surprisingly resilient), but the email verification service behind it racked up a bill that made his stomach drop. That's the thing about rate limiting: you don't think about it until the invoice arrives.

Node.js rate limiting is one of those "basic but most people still get it wrong" topics. Not because the concept is hard, but because the defaults are almost never right for your specific use case, and the gap between development and production configuration is wider than people expect.

Why Your API Needs Rate Limiting Yesterday

The obvious answer is "to prevent abuse," but that undersells it. Without rate limiting, your API is exposed on several fronts:

  • Brute force attacks on login endpoints — an attacker can try thousands of password combinations per minute
  • Credential stuffing using lists from previous breaches (these lists are freely available, by the way)
  • Resource exhaustion from runaway clients — sometimes it's not even malicious, just a buggy mobile app retrying in a tight loop
  • Cost amplification in cloud environments — every request costs something, and downstream services (email, SMS, payment processors) often charge per call

I've seen the "runaway client" scenario more than actual attacks. A junior developer writes a retry loop without exponential backoff, deploys to 10,000 devices, and suddenly your API is fielding 50x its normal traffic. Rate limiting catches this before your monitoring even fires.
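
The fix on the client side is exponential backoff with jitter. Here's a minimal sketch (the helper names and numbers are our own, not from any particular library): each retry waits roughly `base * 2^attempt`, capped, with randomized jitter so a fleet of devices doesn't retry in lockstep.

```javascript
// Hypothetical backoff helper: delay grows exponentially per attempt,
// capped at capMs, with "full jitter" (random delay in [0, exp)).
function backoffDelay(attempt, baseMs = 500, capMs = 30000) {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(Math.random() * exp);
}

// Sketch of a retry loop that uses it (assumes Node 18+ global fetch).
async function fetchWithRetry(url, maxAttempts = 5) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      const res = await fetch(url);
      if (res.ok) return res;
      // Retry only 429s and server errors; give up on other client errors.
      if (res.status !== 429 && res.status < 500) return res;
    } catch (err) {
      // Network error: fall through and retry.
    }
    await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt)));
  }
  throw new Error(`gave up after ${maxAttempts} attempts`);
}
```

Ten thousand devices running this will spread their retries over seconds instead of hammering you in the same instant.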

Basic Rate Limiting with express-rate-limit

The standard approach in Express land is the `express-rate-limit` package. It works well enough to get started, but you need to understand what it's actually doing.

```javascript
import rateLimit from 'express-rate-limit';

// Global rate limiter
const globalLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100,                  // 100 requests per window
  standardHeaders: true,     // Return rate limit info in headers
  legacyHeaders: false,
  message: {
    error: 'Too many requests, please try again later.',
    retryAfter: '15 minutes'
  }
});

app.use('/api/', globalLimiter);
```

This is fine for a starting point, but here's what nobody tells you: that `max: 100` number is almost certainly wrong for your app. If you have a single-page app that makes 8 API calls on page load, a user refreshing a few times during a 15-minute window will hit that limit fast. You need to actually measure your normal traffic patterns before picking a number. I usually start with `max: 300` for general API traffic and tighten from there based on real data.
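
A quick back-of-the-envelope check makes the point. These numbers are purely illustrative assumptions, but they show how fast ordinary usage burns a 100-request budget:

```javascript
// How much of a 100-requests-per-15-min budget does a normal user spend?
const callsPerPageLoad = 8;  // assumption: typical SPA boot sequence
const refreshes = 5;         // assumption: user refreshes a few times
const pollingCalls = 1 * 15; // assumption: one background poll per minute
const total = callsPerPageLoad * (1 + refreshes) + pollingCalls;
console.log(total); // 8 * 6 + 15 = 63: well over half the budget, no abuse involved
```

One more tab open, or a slightly chattier frontend, and a completely legitimate user gets a 429.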

Also worth noting: `standardHeaders: true` sends back `RateLimit-Remaining` and `RateLimit-Reset` headers. Good API clients will respect these. Bad ones won't. But at least you gave them the information.

Strict Limiter for Auth Endpoints

General API traffic and authentication endpoints need completely different limits. This is where you want to be aggressive.

```javascript
// Much stricter for login — prevent brute force
const authLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5, // Only 5 attempts per 15 minutes
  skipSuccessfulRequests: true, // Don't count successful logins
  message: { error: 'Too many login attempts. Account temporarily locked.' }
});

app.post('/api/auth/login', authLimiter, loginHandler);
app.post('/api/auth/register', authLimiter, registerHandler);
```

The `skipSuccessfulRequests: true` flag is important — you don't want to punish users who type their password correctly. Only failed attempts count against the limit.

Gotcha: Don't forget password reset endpoints. I've seen apps with strict login limits but completely unprotected `/forgot-password` routes. An attacker can use that to enumerate valid email addresses or spam your users with reset emails. Apply the same strict limits there.

One more thing: rate limiting by IP alone is fragile. Users behind corporate NATs or VPNs share the same IP. If 500 employees at a company all hit your API from the same office IP, they'll collectively burn through the limit. For authenticated routes, consider keying on the user ID instead (or a combination of both). That's where the `keyGenerator` option comes in, which brings us to the production setup.

Redis-Backed Rate Limiting for Production

Here's the part that catches people during scaling. The default `express-rate-limit` store is in-memory. That means:

  1. Counters reset every time your server restarts (deploy = reset all limits)
  2. If you run multiple server instances behind a load balancer, each instance has its own counters — an attacker can multiply their effective limit by your instance count

For anything beyond a single-instance hobby project, you need Redis.

```javascript
import { rateLimit } from 'express-rate-limit';
import { RedisStore } from 'rate-limit-redis';
import { createClient } from 'redis';

const redisClient = createClient({ url: process.env.REDIS_URL });
await redisClient.connect();

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  store: new RedisStore({
    sendCommand: (...args) => redisClient.sendCommand(args),
  }),
  keyGenerator: (req) => {
    // Rate limit by IP + user ID if authenticated
    return req.user?.id ? `user:${req.user.id}` : req.ip;
  }
});
```

That `keyGenerator` is doing the heavy lifting. For unauthenticated requests, it falls back to IP-based limiting. For authenticated users, it tracks by user ID, so corporate NAT users don't unfairly share a pool.

Gotcha: If your Redis connection drops, the default behavior of `rate-limit-redis` is to allow all requests through (fail-open). That's probably what you want — a Redis hiccup shouldn't lock out every user. But you should be aware it's happening. Log it. Alert on it. Don't discover your rate limiter was offline for three days because nobody checked.
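
The minimum viable version of "be aware it's happening" is a counter on store errors. Here's a sketch; the helper and counter names are our own, and in production you would attach the handler to the node-redis client's standard `'error'` event (the call below just simulates an outage so the snippet is self-contained):

```javascript
// Hypothetical watcher: make a fail-open limiter visible in your telemetry.
let storeErrorCount = 0;

function onStoreError(err) {
  storeErrorCount++;
  console.error('rate-limit store error (requests may pass unlimited):', err.message);
  // Wire this into alerting: page someone if the count keeps climbing.
}

// Production wiring (assumes the `redisClient` from the block above):
// redisClient.on('error', onStoreError);

// Simulated outage so the sketch runs on its own:
onStoreError(new Error('connection refused'));
```

A dashboard line that's flat at zero is boring right up until it isn't.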

The "It Depends" Section

Not every endpoint needs the same treatment. Here's how I typically tier rate limits:

  • Public health checks (`/health`, `/ready`): No rate limiting. Your load balancer hits these constantly.
  • Read-heavy public endpoints (product listings, search): Generous limits (500-1000/15min). You want these to be fast and accessible.
  • Write endpoints (create, update, delete): Moderate limits (50-100/15min). Writes are expensive.
  • Auth endpoints (login, register, password reset): Very strict (5-10/15min). The attack surface is highest here.
  • File upload endpoints: Extremely strict (5-10/hour) — uploads consume bandwidth and storage, and are a favorite for abuse.
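
One way to keep these tiers honest is to centralize them instead of scattering magic numbers across route files. This is a hypothetical helper using the article's suggested numbers (picked from the middle of each range above), not library defaults:

```javascript
// Hypothetical tier table: one place to review and adjust all limits.
const TIERS = {
  read:   { windowMs: 15 * 60 * 1000, max: 750 }, // generous public reads
  write:  { windowMs: 15 * 60 * 1000, max: 75 },  // moderate for mutations
  auth:   { windowMs: 15 * 60 * 1000, max: 5 },   // very strict for auth
  upload: { windowMs: 60 * 60 * 1000, max: 5 },   // per hour for uploads
};

function limiterConfig(tier) {
  const base = TIERS[tier];
  if (!base) throw new Error(`unknown rate-limit tier: ${tier}`);
  return { ...base, standardHeaders: true, legacyHeaders: false };
}

// Usage with express-rate-limit (not executed here):
// app.post('/api/orders', rateLimit(limiterConfig('write')), createOrder);
```

When the quarterly review comes around, you edit one table instead of grepping for `windowMs`.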

If you're building an API consumed by other services (B2B), rate limiting gets more nuanced. You might need per-API-key limits, tiered plans, and proper 429 responses with `Retry-After` headers so clients can back off gracefully.

What Rate Limiting Won't Save You From

Rate limiting is necessary but not sufficient. It won't protect you from a distributed attack across thousands of IPs. It won't help if the attacker has valid credentials. And it won't substitute for proper input validation — a single carefully crafted request can do more damage than a million normal ones (see our SQL injection post for that side of the equation).

Think of rate limiting as one layer of defense. Pair it with request validation, authentication, monitoring, and if your threat model warrants it, a proper WAF or DDoS mitigation service like Cloudflare.

The Uncomfortable Truth

If I had to pick one thing most Node.js APIs get wrong about rate limiting, it's not the implementation — it's the configuration. Developers copy-paste the defaults from the README, deploy to production, and never revisit the numbers. Your rate limits should be based on your actual traffic patterns, reviewed quarterly, and adjusted as your user base grows. The 15-minute window with 100 requests that worked for your beta launch will be completely wrong at 10x scale.

Start with the code above, deploy it today, but put a reminder on your calendar to revisit those numbers next month. Your future self (and your cloud bill) will thank you.
