Software Architecture · March 24, 2026

How to Design a System to Support One Million Users: A Step-by-Step, X-Scale System Design

Just writing code isn't enough. So, how do you design a system from scratch that can handle the load of a million users, like X? We explore every step in this giant guide, from load balancers to database selection, timeline optimization to caching, along with all the 'whys'.

Introduction: Where Code Ends and Architecture Begins

Once you reach a certain level in software development, the architecture of the system in which your code runs becomes more important than the quality of the code itself. You can develop a feature, but what happens when that feature receives thousands of requests per second? Why does a system that works for ten users go up in flames when scaled to a million? The answer to these questions lies in the discipline of 'System Design'.

This guide will make the abstract concept of system design concrete through an example we all know: an 'X-like service'. We will start from scratch and design a scalable, durable, and performant system step-by-step, explaining the 'why' behind every decision. This isn't just an answer to an interview question; it's a roadmap to becoming a better engineer.

Part 1: Laying the Foundation - Understanding Requirements and Constraints

We can't start a journey blindly. First, we need to clarify what we are building and what conditions it must withstand.

Functional Requirements (What should the system do?)

  • Users should be able to post short messages ('Tweets') containing text and (optionally) an image.
  • Users should be able to follow each other.
  • Users should be able to see a homepage feed ('Timeline') consisting of Tweets from the people they follow, sorted chronologically.
  • Users should be able to view their own profiles and Tweets.

Non-Functional Requirements (How should the system behave?)

  • High Availability: The system must always be online. We aim for 99.99% availability, which allows for roughly 52 minutes of downtime per year.
  • High Durability: A posted Tweet should never be lost.
  • Low Latency: The homepage timeline should load very quickly, preferably under 200ms.
  • Scalability: The system must be able to handle increased load as the number of users grows.

Part 2: Speaking in Numbers - A Rough Capacity Estimation

The most important thing that will guide our architectural decisions is the scale of the load we will face. Let's do some math.

  • Total Users: 1,000,000
  • Daily Active Users (DAU): Let's say 20% of total -> 200,000 DAU.
  • Tweet Write Rate: Each active user posts an average of 0.5 Tweets per day -> 100,000 Tweets/day.
  • Writes Per Second (Write TPS): 100,000 / 86,400 seconds ≈ 1.2 TPS on average. However, traffic is never evenly distributed. Let's assume that at peak times this rate could be 10x -> ~12 TPS (write).
  • Read/Write Ratio: Social media platforms are overwhelmingly read-heavy. A user reads hundreds of Tweets for every one they post. Let's assume a ratio of 1 write to 100 reads (1:100).
  • Reads Per Second (Read QPS): ~12 TPS (write) * 100 ≈ ~1200 QPS (read). As you can see, our main problem will be managing the read load.
  • Storage (Text): A Tweet: id (8 bytes) + user_id (8 bytes) + content (280 chars * 2 bytes/char = 560 bytes) + meta (100 bytes) ≈ 700 bytes/Tweet.
    Daily: 100,000 Tweets * 700 bytes ≈ 70 MB/day.
    For 5 Years: 70 MB * 365 * 5 ≈ 128 GB. This is a manageable size.
  • Storage (Media): If 10% of Tweets contain an image and each image is 1 MB on average:
    Daily: 10,000 Tweets * 1 MB = 10 GB/day.
    For 5 Years: 10 GB * 365 * 5 ≈ 18.25 TB. Now that's a large number. This tells us we need to store text and media data in different places.
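These estimates are easy to sanity-check in a few lines of Python. The 20% DAU share, 0.5 tweets per user per day, 10x peak factor, 1:100 write:read ratio, and ~700 bytes per tweet are the assumptions from the bullets above, not measured values:

```python
# Back-of-envelope capacity estimation for ~1M users.
TOTAL_USERS = 1_000_000
DAU = int(TOTAL_USERS * 0.20)               # 200,000 daily active users
tweets_per_day = int(DAU * 0.5)             # 100,000 tweets/day

write_tps_avg = tweets_per_day / 86_400     # ~1.2 writes/sec on average
write_tps_peak = round(write_tps_avg * 10)  # ~12 writes/sec at a 10x peak
read_qps_peak = write_tps_peak * 100        # 1:100 ratio -> ~1,200 reads/sec

bytes_per_tweet = 700                       # id + user_id + 280-char content + meta
text_gb_5y = tweets_per_day * bytes_per_tweet * 365 * 5 / 1e9   # ~128 GB of text
media_tb_5y = tweets_per_day * 0.10 * 365 * 5 / 1e6             # 10% with a 1 MB image -> ~18.25 TB
```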

Conclusion: Our system will be read-heavy and will require a dedicated solution for media storage.

Part 3: High-Level Architecture - The First Sketch

Now we can draw our services and the relationships between them.


[User] <--> [Load Balancer] <--> [API Gateway]
                                       |
      +-----------------+-----------------+-----------------+
      |                 |                 |                 |
[User Service]   [Tweet Service] [Timeline Service]  [Media Service]
      |                 |                 |                 |
  [User DB]        [Tweet DB]     [Timeline Cache] [Blob Storage (S3)]
        
  • Load Balancer: Distributes incoming requests across multiple API servers, preventing any single server from being overloaded.
  • API Gateway: A single entry point for all requests. It handles common tasks like authentication, rate limiting, and routes the request to the appropriate service.
  • Services: Each service is a small, independent unit responsible for a specific business domain (Microservices Architecture).

Part 4: Database Selection and Schema Design

At this scale, database decisions determine the fate of the system. SQL or NoSQL?

Our data (users, tweets, follows) is highly relational. A user has Tweets, users follow each other. To manage these relationships, Relational Databases (SQL), like PostgreSQL, are an excellent choice to start with. They provide strong consistency guarantees and flexible querying capabilities. NoSQL (e.g., Cassandra) could be considered for the timeline at a much larger scale (billions of Tweets), but for 1 million users, SQL is both simpler and more effective.

Database Schema (PostgreSQL)


CREATE TABLE Users (
    id BIGSERIAL PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE Tweets (
    id BIGSERIAL PRIMARY KEY,
    user_id BIGINT NOT NULL REFERENCES Users(id),
    content VARCHAR(280) NOT NULL,
    media_url VARCHAR(255),
    created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE Follows (
    follower_id BIGINT NOT NULL REFERENCES Users(id),
    followee_id BIGINT NOT NULL REFERENCES Users(id),
    created_at TIMESTAMPTZ DEFAULT NOW(),
    PRIMARY KEY (follower_id, followee_id) -- A user can follow another user only once.
);
        

Part 5: Deep Dive - Designing the Timeline (The Hardest Part)

The most critical and challenging part of our system is generating a user's timeline. The vast majority of our read load comes from here.

Method 1: The Naive Approach (Compute on Read - Pull Model)

When a user requests their timeline, we compute it from the database every single time.

  1. Get the list of people the user follows: SELECT followee_id FROM Follows WHERE follower_id = :current_user_id.
  2. Fetch the latest Tweets from these people: SELECT * FROM Tweets WHERE user_id IN (followed_users_list) ORDER BY created_at DESC LIMIT 50.

The Problem: This method becomes incredibly slow as the number of people a user follows increases. For a user following thousands of people, the IN query can cripple the database. This is not a scalable solution.
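For illustration, here is a minimal runnable sketch of the pull model, with an in-memory SQLite database standing in for PostgreSQL (table and column names follow the schema above; the sample data is made up):

```python
import sqlite3

# In-memory SQLite stand-in for PostgreSQL, with a simplified schema.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE Tweets (id INTEGER PRIMARY KEY, user_id INTEGER,
                         content TEXT, created_at INTEGER);
    CREATE TABLE Follows (follower_id INTEGER, followee_id INTEGER,
                          PRIMARY KEY (follower_id, followee_id));
""")

def pull_timeline(user_id, limit=50):
    """Compute the timeline on every read: fetch followees, then their tweets."""
    followees = [row[0] for row in db.execute(
        "SELECT followee_id FROM Follows WHERE follower_id = ?", (user_id,))]
    if not followees:
        return []
    placeholders = ",".join("?" * len(followees))
    return db.execute(
        f"SELECT id, user_id, content FROM Tweets "
        f"WHERE user_id IN ({placeholders}) "
        f"ORDER BY created_at DESC LIMIT ?",
        (*followees, limit)).fetchall()

# User 1 follows users 2 and 3; user 3 tweeted most recently.
db.executemany("INSERT INTO Follows VALUES (?, ?)", [(1, 2), (1, 3)])
db.executemany("INSERT INTO Tweets VALUES (?, ?, ?, ?)",
               [(10, 2, "hello", 100), (11, 3, "world", 200), (12, 4, "noise", 300)])
```

The `IN (...)` list is exactly where this approach falls over: for a user following thousands of accounts, that query grows huge and the database pays the cost on every single page load.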

Method 2: Pre-computation (Fan-out on Write - Push Model)

This is the method used by professional systems. When a user posts a Tweet, we "push" that Tweet into the timelines of all their followers.

Where do we store these timelines? Somewhere built for fast reads: a cache. An in-memory store like Redis is perfect for this job. We'll maintain one Redis list per user's timeline, keyed as timeline:user_id.

  1. Write Flow:
    • User A posts a Tweet.
    • The Tweet Service writes the Tweet to the Tweets table in PostgreSQL.
    • It then finds all of User A's followers from the Follows table.
    • For each follower (B, C, D...), it adds the new Tweet's ID to their timeline list (timeline:B, timeline:C, ...) using the LPUSH command.
  2. Read Flow:
    • When User B requests their timeline, the Timeline Service simply reads the first 50 Tweet IDs from Redis with the command LRANGE timeline:B 0 49. This is a lightning-fast operation.
    • The necessary Tweet details (content, author name, etc.) are then fetched in a batch from the database or another cache using these IDs (Hydration).
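The two flows above can be sketched in a few lines. Since this is just an illustration, a tiny in-process class stands in for Redis (it mimics only the LPUSH and LRANGE commands used here); with a real client such as redis-py the calls would have the same shape:

```python
from collections import defaultdict

class FakeRedis:
    """In-process stand-in for the two Redis commands this flow needs."""
    def __init__(self):
        self._lists = defaultdict(list)
    def lpush(self, key, value):         # prepend, like Redis LPUSH
        self._lists[key].insert(0, value)
    def lrange(self, key, start, stop):  # inclusive stop (non-negative only here)
        return self._lists[key][start:stop + 1]

cache = FakeRedis()
follows = {42: [1, 2, 3]}  # author 42 is followed by users 1, 2 and 3

def fan_out_on_write(author_id, tweet_id):
    """Write flow: push the new tweet's ID onto every follower's timeline list."""
    for follower_id in follows.get(author_id, []):
        cache.lpush(f"timeline:{follower_id}", tweet_id)

def read_timeline(user_id, count=50):
    """Read flow: LRANGE timeline:<user> 0 count-1; hydration is omitted here."""
    return cache.lrange(f"timeline:{user_id}", 0, count - 1)

fan_out_on_write(42, tweet_id=101)
fan_out_on_write(42, tweet_id=102)  # newer tweet lands at the head of each list
```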

The New Problem: The "Celebrity Effect". What happens when a celebrity with 500,000 followers posts a Tweet? Pushing that Tweet to 500,000 lists one by one at write time could take minutes and slow down the system. This is called a 'fan-out on write' storm.

Method 3: The Hybrid Approach (The Best of Both Worlds)

We don't have to use the same strategy for everyone.

  • For Regular Users (e.g., < 1000 followers): Continue using Method 2 (Push Model). When a user Tweets, push it to their followers' Redis lists.
  • For Celebrities (e.g., > 1000 followers): Don't push! When a celebrity Tweets, just write the Tweet to the database.
  • Timeline Merging: When a user requests their timeline:
    1. Fetch their pre-computed timeline from Redis (Tweets from regular users).
    2. Get the list of celebrities the user follows from a separate source (e.g., a Redis Set).
    3. Fetch the latest Tweets from these celebrities from the database or a cache.
    4. Merge these two lists (the user's own feed and the celebrities' Tweets) in the application layer, sort them by date, and present them to the user.

This hybrid model preserves the read speed for normal users and prevents the write performance from being crippled by celebrities. This is a perfect example of making trade-offs in system design.
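As a sketch, the merge step of the hybrid read path might look like this. The feeds are modeled as already-fetched, newest-first lists of (created_at, tweet_id) pairs; that pair format is an assumption of this example, not a prescribed layout:

```python
import heapq
from itertools import islice

def merge_timeline(precomputed_feed, celebrity_feeds, limit=50):
    """Merge several newest-first feeds into one newest-first timeline.

    heapq.merge does a k-way merge without concatenating and re-sorting,
    which matters when each input list is already sorted.
    """
    merged = heapq.merge(precomputed_feed, *celebrity_feeds,
                         key=lambda pair: pair[0], reverse=True)
    return [tweet_id for _, tweet_id in islice(merged, limit)]

# Pre-computed feed from Redis plus one followed celebrity's recent tweets.
home = merge_timeline([(300, "t3"), (100, "t1")],
                      [[(250, "c2"), (50, "c1")]])
```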

Part 6: Fortifying the System - Scaling and Overcoming Bottlenecks

Our basic flows are complete. Now let's make the system more robust.

Message Queues for Asynchronous Operations

When a user posts a Tweet, the fan-out process to their followers can take seconds. Making the user wait for this to complete is a bad experience. The solution: Message Queues (e.g., RabbitMQ, Kafka).

The new flow: User posts Tweet -> Tweet service writes it to the database and drops a message like {"tweet_id": 123, "user_id": 456} into a queue -> Returns an immediate "Success" response to the user. A separate 'worker' service, running in the background, reads messages from this queue and performs the slow fan-out job. This improves user experience and makes the system more resilient.
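A minimal sketch of this flow, with Python's stdlib queue.Queue standing in for the broker (in production this would be RabbitMQ or Kafka, and the worker would run as a separate process) and plain dicts standing in for Redis and the Follows table:

```python
import queue

fanout_queue = queue.Queue()  # stand-in for the message broker
timelines = {}                # follower_id -> list of tweet IDs (stand-in for Redis)
follows = {456: [7, 8]}       # user 456 is followed by users 7 and 8

def post_tweet(user_id, tweet_id):
    """Request path: persist the tweet (omitted), enqueue the slow work, return."""
    fanout_queue.put({"tweet_id": tweet_id, "user_id": user_id})
    return "Success"  # the user does not wait for the fan-out

def fanout_worker():
    """Background worker: drain the queue and push IDs to follower timelines."""
    while not fanout_queue.empty():
        msg = fanout_queue.get()
        for follower_id in follows.get(msg["user_id"], []):
            timelines.setdefault(follower_id, []).insert(0, msg["tweet_id"])
```

The key property is visible in the split: post_tweet returns immediately, and the followers' timelines are only updated once fanout_worker gets around to the message.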

Cache Everything

In a read-heavy system, cache is king. We should cache not only the timelines but also:

  • User Profiles: In Redis with a key like user:user_id.
  • Tweet Objects: In Redis with a key like tweet:tweet_id.
  • Follower/Following Lists: Using Redis Sets.

The rule: Always check the cache before hitting the database.

CDN for Media

Serving user-uploaded images from our own servers would consume our bandwidth. Instead, we store the images in Blob Storage (e.g., AWS S3) and serve them through a CDN (e.g., Cloudflare, AWS CloudFront). A CDN dramatically reduces load times by serving images from servers geographically closest to the users.

Database Scaling

  • Read Replicas: Since our read load is very high, we create read-only copies (replicas) of our main database. All write operations go to the master database, and all read operations are directed to the replicas. This greatly alleviates the read bottleneck.
  • Sharding: When our data grows too large to fit on a single machine (unlikely for 1 million users, but good to know), we partition the database into shards. For example, we could split users by their IDs across different database servers. This is the ultimate form of horizontal scaling.
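The simplest form of the user-ID sharding described above is a modulo over the shard count. Worth noting: a scheme this naive makes adding shards painful (nearly every key moves), which is why real deployments often reach for consistent hashing instead — but as a sketch:

```python
# Route each user's rows to one of N database servers by user ID.
SHARDS = ["db-0", "db-1", "db-2", "db-3"]

def shard_for_user(user_id):
    """All of a user's tweets live on one shard, so their profile reads stay local."""
    return SHARDS[user_id % len(SHARDS)]
```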

Conclusion: A Never-Ending Journey

Congratulations! We started from scratch and, based on simple requirements, designed a complex yet robust system capable of handling a million users, using concepts like load balancers, microservices, SQL and NoSQL (Redis) databases, message queues, CDNs, and database replication. As you can see, system design is less about finding one "correct" answer and more about understanding the trade-offs between different solutions and making the decisions that best fit the project's requirements. It's a journey of continuous learning and evolution.