Design a social media news feed system like Facebook or Twitter

Question

Accepted Answer

A social media news feed, the personalized stream of posts shown to each user, is one of the most studied system design problems because it forces hard trade-offs between write amplification, read latency, consistency, and ranking complexity. At Facebook or Twitter scale (billions of users, hundreds of millions of daily active users, millions of posts per hour), the naive approach of querying every followee's posts at read time collapses. The core architectural decision is the fan-out strategy: when and where to assemble the feed.

Requirements and Scale

Functional requirements: post creation, follow/unfollow, personalized feed retrieval (paginated, ranked), likes and comments, notifications.

Non-functional requirements: feed load latency under 200 ms at p99, high availability (99.99%), eventual consistency between "post created" and "visible to followers" (seconds, not milliseconds), and graceful handling of celebrity accounts with tens of millions of followers.

A read-to-write ratio of 100:1 or higher is typical: the feed is read constantly, while posts are created comparatively rarely.

Fan-Out Strategies

Fan-out on write (push model): when a user posts, a background worker immediately copies the post ID into the precomputed feed of every follower. A read is then a single Redis sorted-set lookup, which is extremely fast. The cost is write amplification: a user with 10 million followers triggers 10 million write operations per post, straining the fan-out infrastructure. It also wastes storage on inactive users whose feeds are never read.

Fan-out on read (pull model): the feed is assembled at read time by pulling recent posts from each followee and merging them. There is no fan-out work at write time; the post is written once to the author's store. The cost is expensive reads: if a user follows 500 accounts, building the feed requires 500 lookups plus a k-way merge, which does not scale to hundreds of thousands of concurrent feed loads per second.
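The two strategies can be contrasted in a minimal in-memory sketch. Everything here is an assumption made for illustration: the dictionaries stand in for the social graph, the post store, and the per-user feed cache, and the function names are hypothetical rather than part of any real service.

```python
import heapq
from collections import defaultdict
from itertools import islice

# Hypothetical in-memory stand-ins for the real stores.
followers = defaultdict(set)          # author_id -> ids of users who follow them
following = defaultdict(set)          # user_id -> ids of authors they follow
posts_by_author = defaultdict(list)   # author_id -> [(ts, post_id)], appended in time order
precomputed_feed = defaultdict(list)  # user_id -> [(ts, post_id)] written at post time


def follow(user_id, author_id):
    following[user_id].add(author_id)
    followers[author_id].add(user_id)


def publish_push(author_id, post_id, ts):
    """Fan-out on write: store the post, then write one feed entry per follower."""
    posts_by_author[author_id].append((ts, post_id))
    for follower_id in followers[author_id]:
        precomputed_feed[follower_id].append((ts, post_id))
    return len(followers[author_id])  # write amplification for this post


def read_push(user_id, limit=10):
    """Read is a single lookup of the precomputed feed, newest first."""
    newest_first = sorted(precomputed_feed[user_id], reverse=True)
    return [post_id for _, post_id in newest_first[:limit]]


def publish_pull(author_id, post_id, ts):
    """Fan-out on read: the post is written once, to the author's store only."""
    posts_by_author[author_id].append((ts, post_id))


def read_pull(user_id, limit=10):
    """One lookup per followee, then a k-way merge of the newest-first lists."""
    per_followee = [reversed(posts_by_author[a]) for a in following[user_id]]
    merged = heapq.merge(*per_followee, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]
```

The return value of `publish_push` makes the write cost visible: it equals the author's follower count, while `read_pull` does work proportional to the number of accounts the reader follows.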
Hybrid model (the production standard): use fan-out on write for normal users (…) and fan-out on read for celebrity accounts, whose posts are pulled live and merged into the feed at read time.

Skip inactive users: fan-out on write can skip users who have not been active in the last 30 days (determined by a last-active timestamp lookup) to avoid wasting write operations on stale feeds. When an inactive user returns, rebuild their feed lazily on the first read.

Architecture and Data Flow

The write path: the Post Service validates and stores the post in a sharded NoSQL store (Cassandra or DynamoDB) keyed by (user_id, post_id), where post IDs are Snowflake-generated and therefore timestamp-ordered. The service then publishes a post-created event to Kafka. Fan-out workers consume from Kafka, look up the author's followers in the social graph service, and write the post ID into each follower's feed, which is stored as a Redis sorted set keyed by user_id with a Unix timestamp as the score.

The read path: the Feed Service reads the top N post IDs from Redis, merges in any celebrity posts (pulled live), hydrates the IDs by fetching full post objects from the post store in parallel, applies the ranking model, and returns the final response.

Ranking

Chronological feeds are simple but engagement-poor. Modern feed systems apply a ranking model that scores each candidate post using signals such as recency, predicted engagement (likes, comments, shares, derived from an ML model), relationship strength (how often you interact with the poster), content type (video vs image vs text), and diversity (avoid showing five posts from the same author in a row). Ranking adds 20 to 50 ms of latency but substantially increases user engagement.

Key trade-off, fan-out on write vs fan-out on read: fan-out on write optimizes for read latency at the cost of write amplification and storage. Fan-out on read optimizes for write simplicity at the cost of slow, expensive reads.
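The hybrid write path and the read path described above can be sketched in memory as follows. This is an illustration under stated assumptions, not a production design: `HybridFeed` and its fields are hypothetical, plain dictionaries stand in for Redis, the post store, and the social graph, `celebrity_threshold` is a made-up tuning parameter (the text gives no cutoff value), and ranking is reduced to recency only.

```python
from concurrent.futures import ThreadPoolExecutor

INACTIVE_AFTER_SECONDS = 30 * 24 * 3600  # the 30-day inactivity cutoff from the text


class HybridFeed:
    def __init__(self, celebrity_threshold):
        # Hypothetical knob: follower count at which an author stops being pushed.
        self.celebrity_threshold = celebrity_threshold
        self.followers = {}    # author_id -> set of follower ids
        self.following = {}    # user_id -> set of followed author ids
        self.last_active = {}  # user_id -> Unix timestamp of last activity
        self.posts = {}        # post_id -> post record (stand-in for the post store)
        self.by_author = {}    # author_id -> list of that author's post ids
        self.feeds = {}        # user_id -> {post_id: ts}, mimicking a sorted set

    def follow(self, user_id, author_id):
        self.followers.setdefault(author_id, set()).add(user_id)
        self.following.setdefault(user_id, set()).add(author_id)

    def is_celebrity(self, author_id):
        return len(self.followers.get(author_id, ())) >= self.celebrity_threshold

    def publish(self, author_id, post_id, ts):
        """Store the post once, then push only to active followers of normal users."""
        self.posts[post_id] = {"id": post_id, "author": author_id, "ts": ts}
        self.by_author.setdefault(author_id, []).append(post_id)
        if self.is_celebrity(author_id):
            return 0  # no fan-out: followers pull these posts at read time
        writes = 0
        for follower_id in self.followers.get(author_id, ()):
            if ts - self.last_active.get(follower_id, 0) > INACTIVE_AFTER_SECONDS:
                continue  # stale feed: skip now, rebuild lazily on return
            self.feeds.setdefault(follower_id, {})[post_id] = ts
            writes += 1
        return writes

    def read_feed(self, user_id, limit=10):
        """Merge pushed ids with live celebrity posts, hydrate in parallel, rank."""
        candidate_ids = set(self.feeds.get(user_id, {}))
        for author_id in self.following.get(user_id, ()):
            if self.is_celebrity(author_id):
                candidate_ids.update(self.by_author.get(author_id, []))
        # Hydration: fetch full post objects concurrently (dict lookups here).
        with ThreadPoolExecutor(max_workers=8) as pool:
            hydrated = list(pool.map(self.posts.__getitem__, candidate_ids))
        # Placeholder ranking by recency; a real ranking model would score here.
        hydrated.sort(key=lambda post: post["ts"], reverse=True)
        return [post["id"] for post in hydrated[:limit]]
```

In a deployment along the lines of the text, `publish` would run inside the Kafka-driven fan-out workers and `feeds` would be Redis sorted sets. The sketch only shows where each decision (push, pull, or skip) is made.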
No production system at scale uses either extreme. The hybrid model (push for normal users, pull for celebrities, skip inactive users) is the industry standard precisely because it makes each trade-off only where it hurts least.

Storage Summary

Posts: sharded NoSQL store (Cassandra/DynamoDB), keyed by (user_id, post_id).
Social graph (follow edges): graph-friendly store or sharded MySQL adjacency list, sharded by user_id.
Feed timelines: Redis sorted sets per user, holding the most recent ~500 post IDs with timestamp scores.
Media (photos, video): object storage (S3) behind a CDN; post records hold S3 keys, never raw bytes.
Counters (likes, comments): Redis counters with periodic flush to durable storage, to avoid hot-row contention on popular posts.

The interviewer expects you to know the hybrid fan-out model, so say it early. The nuances that impress: (1) explaining why you skip fan-out for inactive users, (2) the Redis sorted-set data structure choice and why sorted sets (not lists or hash maps) fit this use case, (3) how you hydrate post IDs into full objects in parallel, and (4) what signals a ranking model uses. Avoid spending the wh