Designing a Chat App Infrastructure - Part 1
A multi-part series on designing a scalable, fast chat app infrastructure
Background
As a self-taught engineer, I've been designing systems since I started programming. Working mostly alone, I eagerly learned every aspect of software development and built everything to support my applications. This led me to experiment with various technologies, from frontend development to infrastructure as code using Kubernetes. Recently, I've been motivated to act on more of my ideas and design them as I go. One of my recent ideas was a real-time chat app. While I had a basic understanding of how they work, I was intrigued by challenges such as scalability, efficient data queries, and response time. This series will outline my design process and outcomes.
What we're designing
At the most basic level, we are designing a real-time chat app. For Part 1, the functional requirements are:
Send and receive messages from others (two-person chats for now)
Authenticate the user
Create and update profile
Non-Functional Requirements:
- Assume that we don't have a lot of users and don't need to scale for now
The Design
Profile Service
First, a user needs to create their profile and log into the app. We can use a REST API for this. The main steps are:
The user creates a profile on the client and submits it to the API.
The API checks that the profile details are valid (unique username, strong password, etc.) and creates the profile in the database.
The API generates a JWT and a refresh token, and stores a token version in the database, linked to the user.
Note: We store a version so we can easily add a "Log Out of All Devices" feature. Since a stateless JWT can't be revoked once it has been issued, we give JWTs a short lifespan and use a refresh token to reauthenticate the user. If the user wants to log out of all devices, we increment the token version in our database, which invalidates all previously issued refresh tokens. Benjamin Awad has a good video on this. Here's a link - How to Roll Your Own Auth - Benjamin Awad
The API returns those tokens to the client so the client can provide the user experience.
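As a minimal sketch of the token-version idea, the snippet below signs HS256 tokens with only the Python standard library. The secret, the lifetimes, the claim names (sub, ver, exp), and the in-memory token_versions dict standing in for the database are all assumptions for illustration, not details from the design above.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional, Tuple

SECRET = b"demo-secret"          # hypothetical signing key
ACCESS_TTL = 15 * 60             # assumed 15-minute JWT lifespan
REFRESH_TTL = 7 * 24 * 60 * 60   # assumed 7-day refresh-token lifespan

# Stand-in for the database row that links a token version to each user.
token_versions = {"alice": 0}


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def _b64decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(payload: dict) -> str:
    """Build a compact header.payload.signature token signed with HMAC-SHA256."""
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64(json.dumps(payload).encode())
    sig = hmac.new(SECRET, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{_b64(sig)}"


def verify(token: str) -> Optional[dict]:
    """Return the payload if the signature is valid and the token is unexpired."""
    try:
        header, body, sig = token.split(".")
    except ValueError:
        return None
    expected = hmac.new(SECRET, f"{header}.{body}".encode(), hashlib.sha256).digest()
    if not hmac.compare_digest(_b64(expected), sig):
        return None
    payload = json.loads(_b64decode(body))
    if payload.get("exp", 0) < time.time():
        return None
    return payload


def issue_tokens(user: str) -> Tuple[str, str]:
    """Issue a short-lived JWT and a refresh token carrying the current version."""
    now = int(time.time())
    access = sign({"sub": user, "exp": now + ACCESS_TTL})
    refresh = sign({"sub": user, "ver": token_versions[user], "exp": now + REFRESH_TTL})
    return access, refresh


def refresh_access(refresh_token: str) -> Optional[str]:
    """Exchange a refresh token for a new JWT, unless its version is stale."""
    payload = verify(refresh_token)
    if payload is None or "ver" not in payload:
        return None
    if payload["ver"] != token_versions.get(payload["sub"]):
        return None
    return issue_tokens(payload["sub"])[0]


def log_out_everywhere(user: str) -> None:
    """Bump the stored version so every outstanding refresh token is rejected."""
    token_versions[user] += 1
```

Note that after log_out_everywhere runs, an already-issued JWT still verifies until it expires, which is why the short lifespan matters.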
Here is a sequence diagram to outline that:
Establishing a WebSocket Connection
To deliver messages in real-time, we will be utilizing WebSockets. Here is a small overview of what WebSockets are:
WebSockets are a technology that allows for real-time, two-way communication between a client (like a web browser) and a server. With traditional HTTP, the client has to initiate every exchange and the server can only respond, so the server has no way to push data on its own. WebSockets keep a single connection open, enabling continuous data exchange in both directions. This makes WebSockets ideal for applications that need instant updates, such as chat apps, live sports scores, or online games. The process starts with an HTTP handshake to establish the connection, after which the communication switches to the WebSocket protocol, allowing both the client and server to send messages to each other at any time.
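To make the handshake a little more concrete: during the HTTP upgrade, the client sends a random Sec-WebSocket-Key header and the server proves it understood the request by replying with a Sec-WebSocket-Accept value derived from it. The sketch below shows that derivation as defined in RFC 6455; the function name accept_key is my own, and in practice a WebSocket library does this for you.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_key(client_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((client_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3.
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```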
To start, the user sends a connection request to the WebSocket server, including the JWT and Refresh Token. The WebSocket server will validate the tokens, and if they are valid, it will store the client's connection in an object we will call the ConnectionMap. The server will then send a message to the client to confirm that the connection has been established. If the validation is unsuccessful, the WebSocket server will send an error message. Here is a sequence diagram for that:
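The connection flow described above could look roughly like the sketch below. It is not tied to any WebSocket library: FakeConnection stands in for a real socket, demo_validate stands in for real JWT validation, and the message shapes ("connected", "error") are hypothetical. Only the ConnectionMap name comes from the design.

```python
import json
from typing import Callable, Dict, Optional


class FakeConnection:
    """Stand-in for a WebSocket connection; records what would be sent."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, text: str) -> None:
        self.sent.append(text)


class ConnectionMap:
    """Maps a user ID to that user's open connection."""

    def __init__(self) -> None:
        self._by_user: Dict[str, FakeConnection] = {}

    def add(self, user_id: str, conn: FakeConnection) -> None:
        self._by_user[user_id] = conn

    def get(self, user_id: str) -> Optional[FakeConnection]:
        return self._by_user.get(user_id)

    def remove(self, user_id: str) -> None:
        self._by_user.pop(user_id, None)


def demo_validate(tokens: dict) -> Optional[str]:
    """Hypothetical validator: returns a user ID, or None if the tokens are bad."""
    return "alice" if tokens.get("jwt") == "valid-jwt" else None


def handle_connect(
    conn: FakeConnection,
    tokens: dict,
    validate: Callable[[dict], Optional[str]],
    connections: ConnectionMap,
) -> bool:
    """Validate the tokens, then register the connection or report an error."""
    user_id = validate(tokens)
    if user_id is None:
        conn.send(json.dumps({"type": "error", "reason": "invalid_token"}))
        return False
    connections.add(user_id, conn)
    conn.send(json.dumps({"type": "connected", "user": user_id}))
    return True
```

Keying the map by user ID is a design choice that keeps the later recipient lookup to a single dictionary access, at the cost of allowing only one connection per user in this sketch.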
Handling Messages
Lastly, to enable chat functionality, we need to handle and distribute messages. There are two parts to this: handling a message from a sender, and delivering the message to the recipient.
Part 1: Handling a Sent Message
Let's break this down - when a new message is sent, we need to store the message in the database, and then acknowledge to the sender that the message was successfully sent. This part is pretty basic so here are the steps:
Message is received on WS Server
WS Server attempts to store message in database
If successful, send success message to client
If unsuccessful, send error message to client
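The four steps above can be sketched as follows. InMemoryStore is a hypothetical stand-in for the database (with a fail flag to simulate an outage), FakeConnection stands in for the sender's socket, and the "ack" and "error" message shapes are assumptions of mine.

```python
import json
from typing import Optional


class FakeConnection:
    """Stand-in for the sender's WebSocket; records what would be sent."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, text: str) -> None:
        self.sent.append(text)


class InMemoryStore:
    """Hypothetical message store; set fail=True to simulate a database error."""

    def __init__(self, fail: bool = False) -> None:
        self.fail = fail
        self.rows = []

    def save(self, message: dict) -> int:
        if self.fail:
            raise OSError("database unavailable")
        self.rows.append(message)
        return len(self.rows)  # use the row count as a simple message ID


def handle_message(raw: str, sender: FakeConnection, store: InMemoryStore) -> Optional[int]:
    """Store an incoming message, then acknowledge or report failure to the sender."""
    message = json.loads(raw)
    try:
        message_id = store.save(message)
    except OSError:
        sender.send(json.dumps({"type": "error", "reason": "not_stored"}))
        return None
    sender.send(json.dumps({"type": "ack", "id": message_id}))
    return message_id
```

Acknowledging only after the write succeeds means the sender never sees "sent" for a message the server could lose.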
Part 2: Delivering Message to Recipient in Real-Time
To finish this messaging flow, we need to deliver the message to the recipient in real-time, if possible. For this, we turn back to when we established the WS connection. If you remember, we stored the client in an object we called the ConnectionMap. Since we have the client stored, once a message has been stored successfully, we can look up the recipient and send the message to them. Here is the flow for that:
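A minimal sketch of that lookup, assuming the ConnectionMap behaves like a dictionary keyed by user ID and that each message carries a "to" field naming the recipient (both are my assumptions). FakeConnection again stands in for a real socket.

```python
import json
from typing import Dict


class FakeConnection:
    """Stand-in for a recipient's WebSocket; records what would be sent."""

    def __init__(self) -> None:
        self.sent = []

    def send(self, text: str) -> None:
        self.sent.append(text)


def deliver(message: dict, connections: Dict[str, FakeConnection]) -> bool:
    """Push a stored message to the recipient if they are currently connected."""
    recipient = connections.get(message["to"])
    if recipient is None:
        # Recipient is offline. The message is already in the database,
        # so nothing is lost; it just is not delivered in real time.
        return False
    recipient.send(json.dumps({"type": "message", "data": message}))
    return True
```

Returning False for an offline recipient is why the "if possible" qualifier matters: real-time delivery is best effort, while the database remains the source of truth.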
Message Flow Summary
Here is a flow chart for the messaging flow end to end, when both the sender and the recipient have a connection established.
Architecture Diagram
Here is our architecture diagram after this first design session:
Closing Comments
Stay tuned for Part 2, where I will handle some edge cases with the WebSocket connection, such as dropped connections. We will also design our Chat REST API and begin to use a microservices architecture, which we skipped in this part.
Primeagen, if you're reading this, please critique my first article.
If you want to check out more of my work, here are some links below: