Designing a Chat App Infrastructure - Part 1

A multi-part series on designing a chat app infrastructure that is scalable and fast

Background

As a self-taught engineer, I've been designing systems since I started programming. Working mostly alone, I eagerly learned every aspect of software development and built everything to support my applications. This led me to experiment with various technologies, from frontend development to infrastructure as code using Kubernetes. Recently, I've been motivated to act on more of my ideas and design them as I go. One of my recent ideas was a real-time chat app. While I had a basic understanding of how they work, I was intrigued by challenges such as scalability, efficient data queries, and response time. This series will outline my design process and outcomes.

What we're designing

At the most basic level, we are designing a real-time chat app. For Part 1, the functional requirements are:

  • Send and receive messages from others (two-person chats for now)

  • Authenticate the user

  • Create and update profile

Non-Functional Requirements:

  • Assume that we don't have a lot of users and don't need to scale for now

The Design

Profile Service

First, a user needs to create their profile and log into the app. We can use a REST API for this. The main steps are:

  1. The user creates a profile on the client and submits it to the API.

  2. The API checks that the profile details are valid (unique username, strong password, etc.) and creates the profile in the database.

  3. The API generates a JWT and a refresh token, and stores a refresh token version in the database alongside the user.

    • Note: We store a version so we can easily add a Log Out of All Devices feature later. A JWT is stateless, so once it is issued we can't revoke it without adding a server-side check. Instead, we give JWTs a short lifespan and use a refresh token to reauthenticate the user. If the user wants to log out of all devices, we increment the refresh token version in our database, which invalidates every previously issued refresh token.

    • Benjamin Awad has a good video on this. Here's a link - How to Roll Your Own Auth - Benjamin Awad

  4. The API returns those tokens to the client so the client can provide the user experience.

Here is a sequence diagram to outline that:

Profile Service Sequence Diagram
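To make the token version idea concrete, here is a minimal sketch using only the Python standard library. Everything in it is an assumption for illustration: the names `issue_tokens`, `refresh_access_token`, and `log_out_everywhere`, the 15-minute access token lifetime, the hard-coded secret, and the in-memory `token_versions` dict standing in for the database. The hand-rolled HS256 signing only shows the shape of a JWT; a real service would use a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Dict, Optional, Tuple

SECRET = b"demo-secret"        # assumption: a real app loads this from config
ACCESS_TTL_SECONDS = 15 * 60   # assumption: short-lived access token

# Stand-in for the database: user_id -> current refresh token version.
token_versions: Dict[str, int] = {}


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _signature(header: str, body: str) -> str:
    mac = hmac.new(SECRET, f"{header}.{body}".encode("ascii"), hashlib.sha256)
    return _b64(mac.digest())


def sign(payload: dict) -> str:
    """Build an HS256-signed token in the header.payload.signature shape."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    return f"{header}.{body}.{_signature(header, body)}"


def verify(token: str) -> Optional[dict]:
    """Return the payload if the signature matches and it has not expired."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    header, body, sig = parts
    try:
        expected = _signature(header, body)
    except UnicodeEncodeError:
        return None
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body + "=" * (-len(body) % 4)))
    exp = payload.get("exp")
    if exp is not None and exp < time.time():
        return None
    return payload


def issue_tokens(user_id: str) -> Tuple[str, str]:
    """Issue a short-lived access token and a versioned refresh token."""
    version = token_versions.setdefault(user_id, 0)
    access = sign({"sub": user_id, "exp": time.time() + ACCESS_TTL_SECONDS})
    # A real refresh token would also carry a (longer) expiry.
    refresh = sign({"sub": user_id, "ver": version})
    return access, refresh


def refresh_access_token(refresh_token: str) -> Optional[str]:
    """Mint a new access token only if the refresh token's version is current."""
    payload = verify(refresh_token)
    if payload is None or "ver" not in payload:
        return None
    if payload["ver"] != token_versions.get(payload.get("sub")):
        return None
    return sign({"sub": payload["sub"], "exp": time.time() + ACCESS_TTL_SECONDS})


def log_out_everywhere(user_id: str) -> None:
    """Bump the stored version so every older refresh token stops working."""
    token_versions[user_id] = token_versions.get(user_id, 0) + 1
```

After `log_out_everywhere("alice")`, any refresh token issued earlier for alice is rejected, while access tokens that are already out simply expire on their own within the short lifetime.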

Establishing a WebSocket Connection

To deliver messages in real-time, we will be utilizing WebSockets. Here is a small overview of what WebSockets are:

WebSockets are a technology that allows for real-time, two-way communication between a client (like a web browser) and a server. In the traditional HTTP request/response model, the client has to initiate every exchange, so the server cannot push data on its own. WebSockets keep a single connection open, enabling continuous data exchange in both directions. This makes WebSockets ideal for applications that need instant updates, such as chat apps, live sports scores, or online games. The process starts with an HTTP handshake to establish the connection, after which the communication switches to the WebSocket protocol, allowing both the client and server to send messages to each other at any time.
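As a small illustration of that handshake: the client sends a random `Sec-WebSocket-Key` header, and the server proves it understands the protocol by replying with a `Sec-WebSocket-Accept` value derived from it, as specified in RFC 6455. The function name `accept_key` below is my own; in practice a WebSocket library does this for you.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value for a client key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```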

To start, the user sends a connection request to the WebSocket server, including the JWT and Refresh Token. The WebSocket server will validate the tokens, and if they are valid, it will store the client's connection in an object we will call the ConnectionMap. The server will then send a message to the client to confirm that the connection has been established. If the validation is unsuccessful, the WebSocket server will send an error message. Here is a sequence diagram for that:

WebSocket Connection Sequence Diagram
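The connection flow above could be sketched roughly like this. The `FakeConnection` class, the `handle_connect` function, the message shapes, and the `toy_validate` helper are all hypothetical stand-ins so the example runs without a WebSocket library; the only piece taken from the design is the ConnectionMap, modeled here as a plain dict keyed by user ID.

```python
from typing import Callable, Dict, List, Optional


class FakeConnection:
    """Stand-in for a real WebSocket connection; it just records what was sent."""

    def __init__(self) -> None:
        self.sent: List[dict] = []

    def send(self, message: dict) -> None:
        self.sent.append(message)


# ConnectionMap: user_id -> that user's open connection.
connection_map: Dict[str, FakeConnection] = {}


def handle_connect(
    conn: FakeConnection,
    jwt: str,
    refresh_token: str,
    validate: Callable[[str, str], Optional[str]],
) -> bool:
    """Validate the tokens, then register the connection or report an error."""
    user_id = validate(jwt, refresh_token)
    if user_id is None:
        conn.send({"type": "error", "reason": "invalid_token"})
        return False
    connection_map[user_id] = conn
    conn.send({"type": "connected", "user_id": user_id})
    return True


def toy_validate(jwt: str, refresh_token: str) -> Optional[str]:
    """Hypothetical validator: returns a user ID for one hard-coded token pair."""
    if jwt == "good-jwt" and refresh_token == "good-refresh":
        return "alice"
    return None


alice_conn = FakeConnection()
handle_connect(alice_conn, "good-jwt", "good-refresh", toy_validate)
```

Passing the validator in as a parameter keeps the connection logic separate from however the tokens end up being checked.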

Handling Messages

Lastly, to enable chat functionality, we need to handle and distribute messages. This breaks down into two steps: first, handling a message from the sender, and second, delivering that message to the recipient.

Part 1: Handling a Sent Message

Let's break this down. When a new message is sent, we need to store it in the database and then acknowledge to the sender that it was successfully sent. This part is pretty basic, so here are the steps:

  1. Message is received on WS Server

  2. WS Server attempts to store message in database

    • If successful, send success message to client

    • If unsuccessful, send error message to client

Handling Sent Message Sequence Diagram
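Here is a minimal sketch of those steps. The `InMemoryMessageStore` class stands in for the database, and `handle_sent_message` plus the `ack` and `error` message shapes are names I made up for illustration, not a fixed protocol.

```python
import time
from typing import Dict, List


class InMemoryMessageStore:
    """Stand-in for the database; pass fail=True to simulate a failed write."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.rows: List[Dict] = []

    def save(self, message: Dict) -> Dict:
        if self.fail:
            raise RuntimeError("database unavailable")
        row = {**message, "id": len(self.rows) + 1, "stored_at": time.time()}
        self.rows.append(row)
        return row


def handle_sent_message(
    store: InMemoryMessageStore, sender_id: str, recipient_id: str, text: str
) -> Dict:
    """Store the message and build the reply that goes back to the sender."""
    try:
        row = store.save({"from": sender_id, "to": recipient_id, "text": text})
    except Exception as exc:
        # Storage failed: tell the sender instead of silently dropping it.
        return {"type": "error", "reason": str(exc)}
    return {"type": "ack", "message_id": row["id"]}
```

The sender only receives an `ack` after the write succeeds, so an acknowledged message is always one that made it into storage.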

Part 2: Delivering Message to Recipient in Real-Time

To finish this messaging flow, we need to deliver the message to the recipient in real-time, if possible. For this, we turn back to when we established the WS connection. If you remember, we stored each client's connection in an object we called the ConnectionMap. Once a message has been stored successfully, we can look up the recipient's connection in the ConnectionMap and send the message to them. Here is the flow for that:

Delivering Message Sequence Diagram
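A rough sketch of that lookup, again with hypothetical names (`FakeConnection`, `deliver`, and the message shape are assumptions). Returning `False` when the recipient has no open connection is also my assumption about what "if possible" means here; the message is already stored by that point, so nothing is lost.

```python
from typing import Dict, List


class FakeConnection:
    """Stand-in for a real WebSocket connection; it just records what was sent."""

    def __init__(self) -> None:
        self.sent: List[dict] = []

    def send(self, message: dict) -> None:
        self.sent.append(message)


def deliver(connection_map: Dict[str, FakeConnection], message: dict) -> bool:
    """Push a stored message to the recipient if they are currently connected."""
    conn = connection_map.get(message["to"])
    if conn is None:
        return False  # recipient is offline; the message stays in the database
    conn.send({"type": "message", "payload": message})
    return True


bob_conn = FakeConnection()
demo_map = {"bob": bob_conn}
deliver(demo_map, {"id": 1, "from": "alice", "to": "bob", "text": "hi"})
```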

Message Flow Summary

Here is a flow chart for the messaging flow end to end, when both the sender and the recipient have a connection established.

Message Flow Summary

Architecture Diagram

Here is our architecture diagram after this first design session:

Architecture Diagram

Closing Comments

Stay tuned for Part 2, where I will handle some edge cases with the WebSocket connection, such as dropped connections. We will also design our Chat REST API and begin to use a microservices architecture, which we skipped in this part.

Primeagen, if you're reading this, please critique my first article.

If you want to check out more of my work, here are some links below:

Twitter / X

GitHub

LinkedIn