Netflix's ad system started simple. Client calls server, server calls Microsoft's API, response comes back. Synchronous. Clean. Works. One endpoint. One dependency. Ship it.
Then came scale.
At some point, every ad impression needs to notify five different systems. Billing needs to record the impression (50ms). Analytics needs to update the dashboard (100ms). Frequency capping needs to update the count (30ms). Third-party vendors need their tracking pixels fired (200ms—and that's optimistic for external calls). Fraud detection needs to check patterns (80ms).
With synchronous calls, you call each one in sequence. Call billing, wait. Call analytics, wait. Call frequency capping, wait. Call vendors, wait. Call fraud detection, wait. Finally respond.
Add up the wait times: 460 milliseconds. Your user is waiting almost half a second because of an analytics dashboard they'll never see. And if any one of those services is slow or down? Everything is blocked.
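A minimal sketch of the blocking version, with `time.sleep` standing in for each downstream call (the service names and latencies are illustrative, not Netflix's actual code):

```python
import time

# Stand-ins for the five downstream calls; the sleeps mimic their latency.
def call(service, ms):
    time.sleep(ms / 1000)          # network round trip to `service`

def on_ad_play(ad_id, user_id):
    call("billing", 50)            # record the impression
    call("analytics", 100)         # update the dashboard
    call("frequency-capping", 30)  # bump the count
    call("vendors", 200)           # fire third-party pixels
    call("fraud-detection", 80)    # pattern check
    return {"status": "ok"}        # ~460ms later, if nothing timed out
```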
Synchronous calls become a web of dependencies where each new integration makes the whole system more fragile.
Put a queue in the middle.
When an ad plays, the producer publishes a single event to Kafka and immediately returns—5 milliseconds tops. The user gets their response. Done.
Meanwhile, five independent consumers pick up that event at their own pace. Billing processes it. Analytics processes it. Fraud detection does its thing. If billing is slow, analytics doesn't care. If the third-party vendor's endpoint is timing out, your user already got their response seconds ago.
Each consumer processes events independently. Failures are isolated—billing being down doesn't break analytics. You can replay events if something got missed. New consumers can subscribe without touching the producer code at all.
Here's what that looks like in practice:
```python
# Producer (simple, fast)
def on_ad_play(ad_id, user_id):
    event = {
        "type": "AD_IMPRESSION",
        "ad_id": ad_id,
        "user_id": user_id,
        "timestamp": now(),
        "device": get_device_info()
    }
    kafka.produce("ad-events", event)
    return {"status": "ok"}  # Return immediately, 5ms

# Consumer (Billing team owns this)
def billing_consumer():
    for event in kafka.consume("ad-events"):
        if event["type"] == "AD_IMPRESSION":
            record_impression(event["ad_id"])
            # Can take 50ms, 500ms, doesn't matter
            # User already got their response
```
Same event, five consumers, zero coupling. User gets response in 5ms instead of 460ms.
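The "zero coupling" part is just Kafka consumer groups: each team subscribes to the same topic under its own group ID and keeps its own offsets. A minimal sketch of the analytics side, assuming the kafka-python client and a local broker; the group name and `update_dashboard` helper are made up:

```python
import json
from kafka import KafkaConsumer   # kafka-python client, assumed here

# Analytics team's consumer. Its own group.id means its own offsets,
# so it sees every event regardless of how billing is doing.
analytics = KafkaConsumer(
    "ad-events",
    bootstrap_servers="localhost:9092",     # assumed broker address
    group_id="analytics",                   # illustrative group name
    value_deserializer=lambda v: json.loads(v),
)

for event in analytics:
    if event.value["type"] == "AD_IMPRESSION":
        update_dashboard(event.value)       # placeholder for the analytics work
```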
Don't build separate pipelines for each use case. I've seen teams create a billing pipeline that extracts timestamp, user_id, and ad_id—then create an analytics pipeline that extracts the exact same fields. Then a vendor pipeline doing it again.
Instead, publish one standardized event. Document the schema. Version it. Anyone who needs that data can subscribe and filter what they need. Common operations like encryption or enrichment happen once, not five times.
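One way to pin that down is a single documented, versioned event shape that every producer emits and every consumer filters from. A minimal sketch; the field names and version scheme are assumptions, not Netflix's actual schema:

```python
from dataclasses import dataclass, asdict
import time
import uuid

AD_EVENT_SCHEMA_VERSION = 2   # bump when fields change; consumers check this

@dataclass
class AdImpressionEvent:
    ad_id: str
    user_id: str
    device: str
    timestamp: float
    type: str = "AD_IMPRESSION"
    schema_version: int = AD_EVENT_SCHEMA_VERSION
    event_id: str = ""        # filled in below, useful for dedup and audit

    def to_message(self) -> dict:
        body = asdict(self)
        body["event_id"] = body["event_id"] or str(uuid.uuid4())
        return body

# Every producer publishes this one shape; each consumer keeps only what it needs.
event = AdImpressionEvent(ad_id="ad_123", user_id="u_456",
                          device="tv", timestamp=time.time()).to_message()
billing_view = {k: event[k] for k in ("ad_id", "user_id", "timestamp")}
```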
The benefits compound: new consumers join without changing producers, clean separation of concerns, and you get an audit trail for free.
There's a wrong time and a right time for events.
Too early: You have 2-3 services, maybe 100 requests per second. One team owns everything. Simple request/response works fine. No audit requirements. If something fails, acceptable downtime is… acceptable. Don't reach for Kafka. It's overkill.
Right time: You're at 5+ services. A thousand or more requests per second. Multiple teams need the same data. You need async processing. Audit trails matter. Failures in one system can't be allowed to cascade into others.
Event-driven isn't free. You get eventual consistency instead of immediate consistency. Kafka carries real operational burden; it's not infrastructure you can set up and forget. Debugging gets harder: you're tracing through queues instead of call stacks. Ordering is only guaranteed within a partition, and that can bite you. And there's a learning curve.
But you lose a lot of pain too. Tight coupling between services goes away. Cascading failures become isolated failures. Blocking calls disappear. Single points of failure are no longer single points. One slow consumer doesn't block the rest.
Netflix started with Microsoft's API. Direct synchronous call. They only built their own event-driven ad platform when they felt the pain and understood the problem.
That's the right path. Sync calls first. Background jobs when you need async (Redis queue, keep it simple). Event streaming when you need replay, audit trails, and serious scale.
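That middle step can be as small as a Redis list used as a work queue. A minimal sketch with redis-py; the queue name and the `record_impression` handler are made up:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)    # assumed local Redis

# Web handler: enqueue the job and return immediately.
def on_ad_play(ad_id, user_id):
    r.lpush("ad-jobs", json.dumps({"ad_id": ad_id, "user_id": user_id}))
    return {"status": "ok"}

# Background worker: process jobs at its own pace.
def worker():
    while True:
        _, raw = r.brpop("ad-jobs")             # blocks until a job arrives
        job = json.loads(raw)
        record_impression(job["ad_id"])         # placeholder for the real work
```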
The best architecture is the one that fits your current scale. Not the one you might need in three years.
Sync first. Event-driven when you feel the pain.
— blanho