Your API works fine. Handles staging load. Then you launch, you go viral, and 10,000 users hit it simultaneously. Response times spike. Errors cascade. Database screams.
This happens because scaling is counterintuitive.
Don't guess. Measure.
Is your API CPU-bound? I/O-bound? Memory-bound? The solution differs dramatically.
Most web APIs are I/O-bound. They spend time waiting—database, external APIs, file system. This is good news. I/O-bound problems have known solutions.
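Before reaching for a full profiler, a couple of timers answer the question. A minimal sketch (the `db.query` and `build_report` calls in the usage comment are placeholders for your own handler's work):

import time
from contextlib import contextmanager

@contextmanager
def timed(label, timings):
    # Accumulate how long the wrapped block took under `label`.
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = timings.get(label, 0.0) + time.perf_counter() - start

# Usage inside a request handler:
#     timings = {}
#     with timed("io", timings):
#         rows = db.query("SELECT ...")      # waiting on the database
#     with timed("cpu", timings):
#         report = build_report(rows)        # actual computation
#     # If "io" dominates, you're I/O-bound; if "cpu" dominates, you're CPU-bound.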
The fastest database query is the one you never make.
Cache aggressively. Start with short TTLs. Extend as you understand freshness requirements.
Vertical scaling (bigger server) hits limits. Horizontal scaling (more servers) is where real growth happens.
Requirements: stateless servers (no sessions or uploads on local disk), a load balancer in front, and shared storage for anything that must persist between requests.
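What "stateless" looks like in practice: session data moves out of process memory into a shared store, so the next request can land on any instance. A minimal sketch with the redis-py client (the host name and TTL are illustrative):

import json
import uuid

import redis

r = redis.Redis(host="redis.internal", port=6379)  # hypothetical shared Redis host
SESSION_TTL = 3600  # seconds

def create_session(user_id):
    # Store the session where every instance behind the load balancer can see it.
    session_id = str(uuid.uuid4())
    r.setex(f"session:{session_id}", SESSION_TTL, json.dumps({"user_id": user_id}))
    return session_id

def load_session(session_id):
    # Any instance can serve the follow-up request.
    data = r.get(f"session:{session_id}")
    return json.loads(data) if data else None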
Your API scales. Can your database?
Connection pooling — opening connections is expensive. Keep a pool ready. PgBouncer for Postgres. This alone can double throughput (a pooling and read-routing sketch follows this list).
Read replicas — primary handles writes, replicas handle reads. Most apps are 90%+ reads.
Missing indexes — often the whole bottleneck.
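A minimal sketch of pooling plus read/write routing using psycopg2's built-in pool (the DSNs are placeholders; PgBouncer gives you the pooling part with no code changes):

from psycopg2.pool import ThreadedConnectionPool

# Placeholder DSNs; point these at your primary and a read replica.
primary = ThreadedConnectionPool(2, 20, dsn="postgresql://app@primary.internal/app")
replica = ThreadedConnectionPool(2, 20, dsn="postgresql://app@replica.internal/app")

def run_query(sql, params=None, readonly=True):
    # Reads go to the replica pool, writes to the primary pool.
    pool = replica if readonly else primary
    conn = pool.getconn()  # reuse an open connection instead of dialing a new one
    try:
        with conn.cursor() as cur:
            cur.execute(sql, params)
            if readonly:
                return cur.fetchall()
            conn.commit()
    finally:
        pool.putconn(conn)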
Not everything needs to happen during the request.
Return immediately. Process in background. Users don't wait.
One misbehaving client shouldn't take down your API.
100 requests per minute per API key. Return 429 when the limit is hit. Redis makes this easy.
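A minimal fixed-window limiter with redis-py (the key format and 60-second window are illustrative; the handler snippet in the comment is a placeholder):

import time

import redis

r = redis.Redis()

LIMIT = 100   # requests allowed per window
WINDOW = 60   # window length in seconds

def allow_request(api_key):
    # One counter per key per window; INCR is atomic, so concurrent requests are counted correctly.
    bucket = f"ratelimit:{api_key}:{int(time.time() // WINDOW)}"
    count = r.incr(bucket)
    if count == 1:
        r.expire(bucket, WINDOW)  # let the counter clean itself up
    return count <= LIMIT

# In the request handler:
#     if not allow_request(api_key):
#         return error_response(status=429)  # Too Many Requests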
Don't jump to step 7. Most never need it.
Scaling isn't one technique. It's a ladder. Climb it step by step.
— blanho
The caching, indexing, and background-work steps above, in code.

Caching (assumes `db` and `redis` clients are configured elsewhere):

import json

# Before: 200ms every time
def get_user(user_id):
    return db.query(f"SELECT * FROM users WHERE id = {user_id}")  # illustrative query; parameterize in real code

# After: 200ms first time, 1ms subsequent
def get_user(user_id):
    cached = redis.get(f"user:{user_id}")
    if cached:
        return json.loads(cached)
    user = db.query(f"SELECT * FROM users WHERE id = {user_id}")
    redis.setex(f"user:{user_id}", 300, json.dumps(user))  # 300-second TTL
    return user

-- This query taking 2 seconds?
SELECT * FROM orders WHERE user_id = 123;
-- Add this:
CREATE INDEX idx_orders_user_id ON orders(user_id);
-- Now 5ms

Background work (assumes a `queue` client with workers picking jobs up off the request path):

# Before: request takes 2 seconds
def signup(user):
    create_user(user)
    send_welcome_email(user)  # 500ms
    create_audit_log(user)    # 200ms
    notify_slack(user)        # 300ms
    return {"status": "ok"}

# After: request takes 100ms
def signup(user):
    create_user(user)
    queue.push("welcome_email", user.id)
    queue.push("audit_log", user.id)
    queue.push("slack_notify", user.id)
    return {"status": "ok"}