Some system design terms will click immediately, while others won’t until you’ve seen concrete definitions and visuals.
For me, “fan-in” and “fan-out” didn’t make sense until I found some visuals and real-world examples.
If those terms seem nebulous to you, read on.
Fan-out splits.
Fan-in consolidates.
Fan-out: one thing triggers many downstream operations (broadcast / scatter).
Fan-in: many things consolidate into fewer results (gather / reduce).
Fan-out splits a large job into smaller sub-tasks that run in parallel.
Fan-in consolidates outputs from those tasks into a single result.
Inputs are fanned out.
Outputs are fanned in.
Fan-out improves performance by running tasks in parallel.
Fan-in can improve performance by preventing unnecessary work. (But it can also hurt performance by becoming a bottleneck or a backed-up queue.)
Fan-out happens first (scatter-gather).
Fan-in usually follows a fan-out. (But it can also happen on its own, as in log aggregation.)
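If code reads more easily than prose, here is a minimal scatter-gather sketch of the definitions above. It is only an illustration: `count_words` and `total_words` are made-up names, and a thread pool stands in for whatever workers a real system would use.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk: str) -> int:
    """One sub-task: count the words in a single chunk of text."""
    return len(chunk.split())


def total_words(chunks: list[str]) -> int:
    # Fan-out: hand each chunk to a worker so the sub-tasks run concurrently.
    with ThreadPoolExecutor(max_workers=4) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    # Fan-in: consolidate the partial outputs into a single result.
    return sum(partial_counts)


if __name__ == "__main__":
    print(total_words(["fan out splits", "fan in consolidates", "scatter gather"]))
```

The inputs (chunks) are fanned out to workers, and the outputs (partial counts) are fanned in by `sum`.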
Fan-out examples
A message in the #announcements Discord channel is fanned out to members.
A user uploads a video, and YouTube triggers tasks for audio extraction, thumbnail generation, and content moderation.
A light switch fans out the “on” operation to three light bulbs.
When you click “Place Order,” Amazon fans out tasks for processing the payment, updating warehouse inventory, and sending a confirmation email.
When you tap “Schedule Pickup,” Uber fans out notifications to dozens of drivers.
When you type “best pizza,” Google fans out your query to many shards.
Fan-in examples
An API fans in duplicate queries so that only one is sent to the DB (request coalescing).
Google Flights collects individual search results from multiple airlines and presents a single list to the user.
After each shard returns its result, Google Search fans in the results and serves the top 10.
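Request coalescing is the least intuitive of these examples, so here is a rough sketch of one way it could work. `Coalescer`, `FakeDB`, and `demo` are hypothetical names invented for this illustration, and the "database" is just a short sleep. Real implementations also have to think about caching, cancellation, and errors.

```python
import asyncio


class Coalescer:
    """Shares one in-flight lookup among all concurrent callers with the same key."""

    def __init__(self, fetch):
        self._fetch = fetch      # assumed async function: key -> value
        self._in_flight = {}     # key -> Task for the lookup currently running

    async def get(self, key):
        task = self._in_flight.get(key)
        if task is None:
            # First caller for this key: start the real lookup.
            task = asyncio.create_task(self._fetch(key))
            self._in_flight[key] = task
            # Forget the task once it finishes so later calls fetch fresh data.
            task.add_done_callback(lambda _: self._in_flight.pop(key, None))
        # Every caller, first or duplicate, awaits the same task.
        return await task


class FakeDB:
    """Stand-in for a database that counts how many queries reach it."""

    def __init__(self):
        self.calls = 0

    async def query(self, key):
        self.calls += 1
        await asyncio.sleep(0.01)  # pretend the query takes a while
        return f"row for {key}"


async def demo():
    db = FakeDB()
    api = Coalescer(db.query)
    # Five identical requests arrive at the same time.
    results = await asyncio.gather(*(api.get("user:42") for _ in range(5)))
    return results, db.calls


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Five callers get an answer, but only one query is sent to the fake DB: many requests fanned in to one.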
When to fan
Fan-out when: work is independent, parallelizable, and latency matters.
Beware of thundering herd, cost spikes, downstream overload, and partial failures.
Fan-in when: you need a single answer, dedup, or batching.
Beware of hotspots, head-of-line blocking, the slowest task setting the pace (“tail latency”), and complex retries.
(Check out our Discord case study for a real-world example of these trade-offs.)
A good principle is to keep very high fan-outs and fan-ins to a minimum.
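A few of the warnings above (downstream overload, partial failures, tail latency) can be softened in code. The sketch below is one possible approach, not a recipe: `bounded_fan_out` and `flaky_shard` are hypothetical names, and the limit and timeout values are arbitrary.

```python
import asyncio


async def bounded_fan_out(keys, call, limit=3, timeout=0.5):
    """Fan out call(key) for every key, then fan in whatever succeeded."""
    keys = list(keys)
    gate = asyncio.Semaphore(limit)  # caps in-flight calls to protect downstream

    async def guarded(key):
        async with gate:
            # A per-call timeout keeps one slow task from setting the total latency.
            return await asyncio.wait_for(call(key), timeout)

    # return_exceptions=True turns failures into values instead of aborting the batch.
    outcomes = await asyncio.gather(*(guarded(k) for k in keys), return_exceptions=True)
    ok = {k: v for k, v in zip(keys, outcomes) if not isinstance(v, Exception)}
    failed = [k for k, v in zip(keys, outcomes) if isinstance(v, Exception)]
    return ok, failed


async def flaky_shard(key):
    # Made-up downstream: shard "c" is too slow and shard "d" errors out.
    if key == "c":
        await asyncio.sleep(1.0)
    if key == "d":
        raise RuntimeError("shard unavailable")
    await asyncio.sleep(0.01)
    return f"results from {key}"


async def demo():
    return await bounded_fan_out(["a", "b", "c", "d"], flaky_shard, limit=2, timeout=0.1)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The semaphore limits how wide the fan-out gets at any moment, and the fan-in step returns partial results plus a list of keys that failed, so the caller can decide whether to retry.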
The examples above focus on data flow, but this also applies to data modeling.
Would you rather maintain the fanned-out (purple) blog?
I hope these terms feel a little less weird now.
Forward this to someone on your team who is also combating fan jargon.




