🍪 Why You're Still Logged In: Demystifying Session-Based Authentication

Imagine a developer, Samantha, building a blog. Initially, she keeps the architecture straightforward, serving only HTML, CSS, and JavaScript files. The blog posts themselves are basic Markdown files stored directly on the server’s disk.

After a year or two, her blog gains significant readership, leading her to consider monetization. She opts for a membership model, offering readers two distinct tiers:

Economy Tier: Readers can create an account on her website and gain access to some of the posts written by Samantha.
Premium Tier: Readers can pay a small monthly or yearly subscription fee to get access to all of Samantha’s content.

To implement this, Samantha must now modify her blog to support personal reader accounts. Furthermore, the blog’s backend needs the capability to identify a reader’s tier and ensure they can only view the posts their tier gives them access to.

Let’s explore a traditional, but still widely used, approach to handling this: session-based authentication.

Session-Based Authentication

Samantha wants her readers to be able to:

Register on her website
Log in to their accounts
Stay logged in, meaning they don’t have to re-enter their credentials if they revisit her blog just 5 minutes later
Access only the posts their tier permits

In other words, she needs an authentication system to identify who each reader is, and an authorization system to control which posts each reader can access.

Authentication and authorization are distinct, yet complementary, security processes. Authentication is the act of verifying a user’s identity, often by checking their username and password. Once a user is authenticated, authorization determines what resources or actions that verified user is permitted to access within the system. In essence, authentication answers the question “Who are you?”, while authorization answers “What are you allowed to do?”.

To get started, Samantha creates a registration page and a login page. She also sets up a backend service, let’s call it blog_svc, to handle reader registration and login, along with a readers_db database that stores information about each reader: reader_id, reader_email, reader_name, hashed_password, and reader_tier. At this stage, this database is simply a CSV file residing on her server’s disk.
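As a rough sketch of what one readers_db record and the password check might look like, the snippet below uses only Python's standard library. The Reader class, the "salt:hash" encoding, the choice of PBKDF2, and the iteration count are all assumptions made for illustration; the article does not say how Samantha hashes passwords.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass
from typing import Optional

PBKDF2_ITERATIONS = 200_000  # assumed value; tune for your hardware


@dataclass
class Reader:
    reader_id: str
    reader_email: str
    reader_name: str
    hashed_password: str  # "salt_hex:hash_hex", an illustrative encoding
    reader_tier: str      # "economy" or "premium"


def hash_password(password: str, salt: Optional[bytes] = None) -> str:
    """Derive a salted hash so the plaintext password is never stored."""
    salt = salt if salt is not None else os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return f"{salt.hex()}:{digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    salt_hex, _ = stored.split(":")
    candidate = hash_password(password, bytes.fromhex(salt_hex))
    return hmac.compare_digest(candidate, stored)
```

At registration, blog_svc would store the result of hash_password in the hashed_password column; at login, verify_password decides whether the submitted credentials match.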

With this foundational setup, readers can successfully create accounts and log in. However, when a reader navigates to a different page on the blog, they are prompted to log in again. This occurs because the browser initiates an entirely new request to blog_svc, and the service lacks a mechanism to identify the reader without their login credentials. This means that unless the reader’s login credentials accompany every request to blog_svc, the backend cannot recognize the reader and, therefore, cannot determine if the reader is authorized to view the requested page.

No one wants this experience. Samantha certainly doesn’t want readers to re-enter their credentials every time they navigate to a new page, or if they close a tab and return to her blog later. She needs to maintain a state that helps the backend authenticate the reader across requests. This is precisely what sessions enable her to achieve.

Samantha modifies her backend to send a cookie with the response to the browser when a reader logs in. This cookie holds a session_id, generated by the backend to be unique across all sessions and all readers. Browsers automatically send the cookies they’ve received from a site back to that site with subsequent requests, subject to each cookie’s attributes. Therefore, when a reader moves to a different page on her blog, the browser transmits the cookie it received during the login process back to the backend.

Let’s examine some of the fields Samantha configures for this cookie:

name: This specifies the cookie’s name. She sets it to session_id.
value: This holds the cookie’s actual content. She assigns it a unique string generated by blog_svc.
max-age: This parameter defines the cookie’s expiration duration in seconds. The browser will delete the cookie after this period. Samantha sets it to 172800 seconds (48 hours).
httpOnly: This parameter controls whether the cookie can be accessed by the website’s JavaScript. Setting httpOnly to True prevents the webpage’s JavaScript from reading or modifying the cookie, thereby protecting it from Cross-Site Scripting (XSS) attacks.

📖 Definition: XSS is a web security vulnerability where malicious code, typically JavaScript, is injected into a trusted website through user input (e.g., comments or search queries). An attacker might embed hidden code (e.g., within a <script> tag that isn’t visually rendered but is executed by the browser). When other users visit the affected page, their browsers unknowingly run this malicious code. This can allow the attacker to steal sensitive information (like session cookies) or manipulate parts of the site as if they were the legitimate user.

Note that HttpOnly alone is not complete protection against XSS. It prevents JavaScript from reading the cookie’s value, but injected script can still make the browser send requests that automatically include the cookie. Therefore, for robust XSS protection, it’s essential to validate and sanitize all user input on the server (client-side checks improve usability but can be bypassed). Additionally, any data displayed on the page must be properly escaped for the context in which it is rendered: HTML, JavaScript, or CSS.

secure: This parameter instructs the browser to send the cookie only over HTTPS connections. Samantha sets this to True so that the session_id is never transmitted in plaintext.
samesite: This attribute controls the cookie’s behavior regarding cross-site requests, with options being Strict, Lax, or None:
- Strict: The cookie is sent only when the request originates directly from the same site as the cookie, meaning the reader is actively navigating within the website.
- Lax: The cookie is also sent when the reader arrives at the blog from another website through a top-level navigation, such as clicking a link in an email. However, it is only sent with “safe” HTTP methods, like GET or HEAD, which are not intended to modify server state (unlike methods such as POST).
- None: The cookie is sent with all requests, whether same-site or cross-site. Modern browsers generally accept SameSite=None only when the cookie is also marked Secure. This setting is unsafe for sensitive cookies as it greatly increases exposure to Cross-Site Request Forgery (CSRF) attacks: an attacker could trick a user’s browser into sending requests containing the cookie to the backend without the user’s knowledge.

📖 Definition: Cross-Site Request Forgery (CSRF) is an attack that tricks a user into unknowingly sending unauthorized requests to a website where they are currently logged in. The attacker crafts a malicious web page, email, or link. When the victim, who is already authenticated with the target website, interacts with this malicious content, their browser sends a request to the trusted website and automatically attaches the session cookie, so the website treats the request as coming from the victim.

Samantha sets SameSite to Lax as she wants the readers to be able to click links (e.g., in an email about newly published posts) and be redirected to their authenticated account on her blog with the post already opened. To protect her readers and her blog from CSRF attacks, she ensures that her backend follows HTTP guidelines strictly and doesn’t allow non-modifying HTTP requests like GET or HEAD to create, update or delete anything.
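The cookie settings described above can be sketched with Python's standard http.cookies module. The helper name build_session_cookie and the Path=/ attribute are assumptions for illustration; the attribute values mirror the ones Samantha chose.

```python
import secrets
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return a Set-Cookie header line carrying the session_id."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    morsel = cookie["session_id"]
    morsel["max-age"] = 172800    # 48 hours, in seconds
    morsel["httponly"] = True     # not readable from page JavaScript
    morsel["secure"] = True       # only sent over HTTPS
    morsel["samesite"] = "Lax"    # sent on top-level cross-site GET navigations
    morsel["path"] = "/"          # assumed: cookie applies to the whole site
    return cookie.output(header="Set-Cookie:")


# An unguessable identifier from a cryptographically secure source.
session_id = secrets.token_urlsafe(32)
print(build_session_cookie(session_id))
```

Using the secrets module matters here: a session_id that an attacker can guess is as good as a stolen password, so it should never come from a predictable generator.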

Sidenote

While browser implementations can vary slightly, the generally accepted and recommended maximum size for a single HTTP cookie is 4096 bytes (or 4KB). The 4KB limit applies to the entire string sent in the Set-Cookie header, which includes the semicolons and equal signs. For example: Set-Cookie: session_id=abc123xyz456; Max-Age=3600; Path=/; Secure; HttpOnly; SameSite=Lax.

Browsers usually impose limits on the number of cookies per domain (often around 50) and the total number of cookies a browser can store across all domains. Exceeding these limits can lead to cookies being silently truncated, rejected, or old cookies being discarded, which can break website functionality. Also, larger cookies increase the data sent with every HTTP request, potentially impacting performance.

Back to Samantha’s Blog

Samantha maintains an in-memory hash map, named active_sessions_map, to store all active sessions and their corresponding readers. This map establishes a direct mapping of session_id to reader_id. Consistent with the max-age parameter set for cookies, she configures a Time-To-Live (TTL) for all entries within this hash map, using the same duration. This mechanism prevents the hash map from growing indefinitely over time.

📝 Note: It’s good practice to align the cookie’s max-age with the server-side session’s TTL. However, it’s crucial to understand that the server-side TTL is the definitive control. Even if a cookie is present in the browser, if its corresponding session has expired or been explicitly invalidated on the server, the session is no longer valid. The client-side cookie’s expiry merely tells the browser when to stop sending it; the server decides if the session is still valid.
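A minimal sketch of such a map with a TTL might look like the following. The class and method names are hypothetical, and the injectable clock exists only to make expiry easy to test; a production service would more likely rely on a cache library or store with built-in expiry.

```python
import time
from typing import Callable, Dict, Optional, Tuple

SESSION_TTL_SECONDS = 172800  # same duration as the cookie's max-age


class ActiveSessionsMap:
    """Maps session_id -> reader_id and forgets entries after a TTL."""

    def __init__(
        self,
        ttl: float = SESSION_TTL_SECONDS,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._clock = clock
        self._entries: Dict[str, Tuple[str, float]] = {}

    def create(self, session_id: str, reader_id: str) -> None:
        self._entries[session_id] = (reader_id, self._clock() + self._ttl)

    def lookup(self, session_id: str) -> Optional[str]:
        entry = self._entries.get(session_id)
        if entry is None:
            return None
        reader_id, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[session_id]  # expired: server-side TTL wins
            return None
        return reader_id

    def revoke(self, session_id: str) -> None:
        """Explicit invalidation, for example on logout."""
        self._entries.pop(session_id, None)

    def purge_expired(self) -> int:
        """Remove all expired entries; meant to be called periodically."""
        now = self._clock()
        expired = [sid for sid, (_, exp) in self._entries.items() if now >= exp]
        for sid in expired:
            del self._entries[sid]
        return len(expired)
```

Because lookup checks the expiry itself, a cookie that the browser still sends after the server-side TTL has passed is simply treated as an unknown session.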

Now, when a reader navigates to a different page on her blog, returns to the website after a period, or is redirected to it, the browser sends the session cookie (provided it has not expired) with the request to the backend. The blog_svc service extracts the session_id from the cookie. It then uses this session_id to look up the associated reader_id in the active_sessions_map. Subsequently, blog_svc fetches metadata about the identified reader from the readers_db database, thereby determining whether the reader belongs to the economy tier or the premium tier.
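The request flow above could be sketched roughly as follows. The two dictionaries are stand-ins for active_sessions_map and readers_db with made-up contents, and the function names and the tier rule (premium sees everything, economy sees only economy posts) are assumptions based on the tiers described earlier.

```python
from http.cookies import SimpleCookie
from typing import Dict, Optional

# Stand-ins for the real stores; the contents are invented for illustration.
active_sessions_map: Dict[str, str] = {"abc123": "reader-1"}
readers_db: Dict[str, Dict[str, str]] = {"reader-1": {"reader_tier": "economy"}}


def session_id_from_header(cookie_header: str) -> Optional[str]:
    """Extract session_id from the raw Cookie request header."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    return morsel.value if morsel is not None else None


def can_view(cookie_header: str, post_tier: str) -> bool:
    # Authentication: who is this reader?
    session_id = session_id_from_header(cookie_header)
    reader_id = active_sessions_map.get(session_id) if session_id else None
    if reader_id is None:
        return False  # unknown or missing session: treat as logged out

    # Authorization: what is this reader allowed to see?
    reader_tier = readers_db[reader_id]["reader_tier"]
    return reader_tier == "premium" or post_tier == "economy"
```

The two comments mark where authentication ends and authorization begins: the session lookup only establishes identity, and the tier check is a separate decision made afterwards.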

Scaling Up

This system functions effectively for a significant period until the traffic on Samantha’s blog experiences a further surge. Her single-server backend becomes overwhelmed, struggling to handle the increased request volume. Also, the server’s disk approaches full capacity due to the accumulated posts, and the readers_db file has grown considerably in size. Consequently, Samantha implements the following modifications:

She establishes two distinct databases, posts_db for storing her posts, and readers_db for maintaining reader metadata.
She deploys multiple replicas of blog_svc and positions them behind a Layer 7 load balancer. This load balancer contributes to blog_svc’s resilience by implementing rate limiting for incoming requests, offering a degree of protection against certain types of Distributed Denial of Service (DDoS) attacks. It also inspects incoming requests, forwarding only correctly formed ones to the backend replicas.

📖 Definition: DDoS (Distributed Denial of Service) attacks involve overwhelming a server or network with a flood of traffic originating from numerous compromised sources, with the aim of making the targeted service unavailable to legitimate users.

Now, Samantha encounters a new challenge: the active_sessions_map differs across all blog_svc replicas because it is an in-memory data structure. This implies that if a load balancer directs a login request to Replica A, which generates a session_id and sends it back to the browser in a cookie, a subsequent request from the same reader might be routed to Replica B. In this scenario, Replica B possesses no entry for a reader_id associated with that session_id, nor can it ascertain the session_id’s validity.
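The problem is easy to reproduce in a toy model. In this sketch, BlogSvcReplica is a hypothetical class in which each instance owns a private in-memory map, just like separate processes would.

```python
from typing import Dict, Optional


class BlogSvcReplica:
    """One blog_svc replica with its own private, in-memory session map."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.active_sessions_map: Dict[str, str] = {}

    def login(self, session_id: str, reader_id: str) -> None:
        self.active_sessions_map[session_id] = reader_id

    def identify(self, session_id: str) -> Optional[str]:
        return self.active_sessions_map.get(session_id)


replica_a = BlogSvcReplica("A")
replica_b = BlogSvcReplica("B")

replica_a.login("abc123", "reader-1")   # the login request lands on Replica A
print(replica_a.identify("abc123"))     # reader-1
print(replica_b.identify("abc123"))     # None: Replica B never saw this session
```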

Samantha now faces a common challenge in scaled architectures: managing session state across multiple backend replicas. Here are a few options to consider:

1. Sticky Sessions

One approach is sticky sessions, where the load balancer is configured to route all requests from a specific reader to the same blog_svc replica. This ensures that the replica’s in-memory active_sessions_map can consistently identify the reader.

However, sticky sessions make each replica a single point of failure for the sessions it holds. If a particular replica goes down, all readers whose sessions were managed by it get logged out, because the session data in that replica’s memory is lost permanently. Additionally, the load balancer would need a strategy for reassigning those readers to the remaining active replicas. Samantha would need to configure this strategy carefully to minimize disruption for readers who are still logged in on healthy replicas.
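One simple way to picture sticky routing is hashing the session_id to pick a replica. This is only an illustrative sketch: the function name and replica names are hypothetical, and real load balancers offer their own affinity mechanisms (for example, a routing cookie or hashing on client address).

```python
import hashlib
from typing import List


def pick_replica(session_id: str, replicas: List[str]) -> str:
    """Deterministically map a session_id to one replica in the list."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replicas)
    return replicas[index]


replicas = ["blog_svc-a", "blog_svc-b", "blog_svc-c"]
first = pick_replica("abc123", replicas)
second = pick_replica("abc123", replicas)
print(first == second)  # True: the same session always lands on the same replica
```

Note that with plain modulo hashing, changing the number of replicas changes the mapping for many sessions, not only those on the failed replica. That is one reason the rebalancing strategy needs care.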

2. Load Balancer Handles Authentication

Another option involves offloading authentication and authorization responsibilities to the load balancer itself. Once the load balancer verifies a reader’s identity and tier, it forwards the request to a blog_svc replica, including this authentication and authorization information.

The primary drawback here is that the load balancer then becomes a critical single point of failure for all session management. Even with standby replicas for the load balancer, a failover event would likely log out all active readers, as the session state managed by the primary load balancer instance would be lost.
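As a rough sketch of this option, the load balancer could resolve the session itself and pass the result to blog_svc in request headers. The header names X-Reader-Id and X-Reader-Tier and the function below are hypothetical, and header matching is simplified to exact case.

```python
from typing import Dict, Optional

# Hypothetical header names used to pass identity to blog_svc.
IDENTITY_HEADERS = ("X-Reader-Id", "X-Reader-Tier")


def authenticate_at_load_balancer(
    headers: Dict[str, str],
    session_id: Optional[str],
    sessions: Dict[str, str],
    tiers: Dict[str, str],
) -> Optional[Dict[str, str]]:
    """Return the headers to forward, or None if the request is rejected."""
    # Drop identity headers supplied by the client so they cannot be spoofed.
    forwarded = {k: v for k, v in headers.items() if k not in IDENTITY_HEADERS}

    reader_id = sessions.get(session_id) if session_id else None
    if reader_id is None:
        return None  # unauthenticated: the request never reaches blog_svc

    forwarded["X-Reader-Id"] = reader_id
    forwarded["X-Reader-Tier"] = tiers[reader_id]
    return forwarded
```

With this design, blog_svc trusts the identity headers, so the replicas must only be reachable through the load balancer.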

3. Centralized Session Store

A robust solution is to implement a centralized key-value store, which we’ll call active_sessions_store. This store would house the session_id to reader_id mapping for all active sessions, accessible by all blog_svc replicas. This design makes the blog_svc replicas stateless concerning sessions, meaning readers’ active sessions remain unaffected even if the load balancer or any blog_svc replica fails.

The main consideration for this approach is the potential for heavy load on the active_sessions_store. Since all blog_svc replicas will query this same centralized store for every request, its performance and scalability become crucial.
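The shape of this design can be sketched as follows. ActiveSessionsStore here is an in-process stand-in for what would really be a networked key-value store, and both class names are hypothetical; the point is only that every replica reads and writes the same store.

```python
import secrets
from typing import Dict, Optional


class ActiveSessionsStore:
    """In-process stand-in for a networked store shared by all replicas."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, session_id: str, reader_id: str) -> None:
        self._data[session_id] = reader_id

    def get(self, session_id: str) -> Optional[str]:
        return self._data.get(session_id)

    def delete(self, session_id: str) -> None:
        self._data.pop(session_id, None)


class StatelessReplica:
    """A blog_svc replica that keeps no session state of its own."""

    def __init__(self, store: ActiveSessionsStore) -> None:
        self._store = store

    def login(self, reader_id: str) -> str:
        session_id = secrets.token_urlsafe(32)
        self._store.put(session_id, reader_id)
        return session_id

    def identify(self, session_id: str) -> Optional[str]:
        return self._store.get(session_id)  # one store round trip per request


store = ActiveSessionsStore()
replica_a, replica_b = StatelessReplica(store), StatelessReplica(store)
sid = replica_a.login("reader-1")
print(replica_b.identify(sid))  # reader-1: any replica can resolve the session
```

The comment on identify marks the cost discussed above: each request pays for a lookup in the shared store.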

Samantha opts for the third solution and implements a centralized key-value store. With all these changes, her blog is more robust and resilient, and also provides enhanced security against certain application-layer attacks. While her databases (DBs) remain single points of failure (SPOF), she has the option to introduce read replicas for them if future needs dictate.

Despite these improvements, further enhancements are possible. With a centralized session store, instantly revoking sessions becomes more challenging due to the inherent network call latency between blog_svc and active_sessions_store, as well as potential replica lag if the active_sessions_store itself uses replication.

Replica lag is the delay between when data is written to the primary database and when it appears on the read replicas. When revoking a session, the write (invalidation) happens on the primary active_sessions_store. The “lag” means read replicas might not immediately reflect that invalidation. So, a request hitting a blog_svc instance that reads from a lagged replica could still consider the session valid for a short period.
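A toy model makes the window visible. In this sketch, replication is a manual step so the lag can be observed; the class and its methods are hypothetical and do not reflect how any particular database replicates.

```python
from typing import Dict, List, Optional, Tuple


class ReplicatedSessionStore:
    """A primary plus one read replica that only catches up on replicate()."""

    def __init__(self) -> None:
        self.primary: Dict[str, str] = {}
        self.read_replica: Dict[str, str] = {}
        self._pending: List[Tuple[str, Optional[str]]] = []

    def write(self, session_id: str, reader_id: str) -> None:
        self.primary[session_id] = reader_id
        self._pending.append((session_id, reader_id))

    def revoke(self, session_id: str) -> None:
        self.primary.pop(session_id, None)
        self._pending.append((session_id, None))  # None marks a deletion

    def replicate(self) -> None:
        """Apply queued changes to the read replica, in order."""
        for session_id, reader_id in self._pending:
            if reader_id is None:
                self.read_replica.pop(session_id, None)
            else:
                self.read_replica[session_id] = reader_id
        self._pending.clear()


store = ReplicatedSessionStore()
store.write("abc123", "reader-1")
store.replicate()

store.revoke("abc123")
print(store.primary.get("abc123"))       # None: revoked on the primary
print(store.read_replica.get("abc123"))  # reader-1: the replica is lagging

store.replicate()
print(store.read_replica.get("abc123"))  # None: the replica has caught up
```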

For critical operations like session revocation, strong consistency is essential. Replica lag is a challenge to consistency and can be mitigated by:

Centralized session validation: Always read from the primary or from a cache with strong consistency guarantees.
Quorum reads: Consider a read successful only if a majority of replicas respond with the same or most recent value.
Session invalidation via push: Use pub-sub or event-driven mechanisms to notify all backend services of the revocation promptly.
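To illustrate the quorum-read idea, here is a simplified sketch in which every replica stores a version number next to each value and the reader takes the newest value seen among a majority. The data layout, the function name, and the use of None for an unreachable replica are assumptions for illustration.

```python
from typing import Dict, List, Optional, Tuple

# (version, reader_id); reader_id is None once the session has been revoked.
Versioned = Tuple[int, Optional[str]]
ReplicaData = Dict[str, Versioned]


def quorum_read(
    session_id: str, replicas: List[Optional[ReplicaData]]
) -> Optional[str]:
    """Read from all reachable replicas and return the most recent value.

    A replica given as None is treated as unreachable. The read fails
    unless a strict majority of replicas respond.
    """
    responses = [r.get(session_id, (0, None)) for r in replicas if r is not None]
    if len(responses) <= len(replicas) // 2:
        raise RuntimeError("quorum not reached")
    _, reader_id = max(responses, key=lambda item: item[0])
    return reader_id


fresh: ReplicaData = {"abc123": (2, None)}        # has seen the revocation
stale: ReplicaData = {"abc123": (1, "reader-1")}  # still lagging behind

print(quorum_read("abc123", [stale, fresh, fresh]))  # None: session is revoked
```

This only guarantees that the latest write is seen if writes are also acknowledged by enough replicas that every read quorum overlaps every write quorum.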

This raises two key questions:

Is there a method which enables blog_svc to validate a session_id without needing to query the active_sessions_store for every request?
If a session cookie is compromised, an attacker currently has a significant window (48 hours, based on the cookie’s max-age setting) to exploit the compromised cookie and perform unauthorized actions on Samantha’s blog. How can this exposure window be reduced?

In the next post, we will explore how JSON Web Tokens (JWTs) address these two limitations.

We explored storing session IDs within HTTP cookies because of their simplicity and browser-native handling. However, cookies are not the only client-side storage option for session identifiers. Alternatives include Local Storage and Session Storage in the browser, but these introduce their own security and usability challenges. They offer more space and are directly accessible from JavaScript, which is exactly the problem for session IDs: any script running on the page, including one injected through an XSS attack, can read them. Cookies with the proper flags (HttpOnly, Secure, SameSite) are therefore generally preferred: HttpOnly keeps the session ID out of reach of scripts, and SameSite limits exposure to CSRF.

localStorage and sessionStorage are part of the Web Storage API and are used by websites running in the browser (like Chrome) to store data as key-value pairs for their own use:

Session storage is a web storage mechanism that allows websites to store key-value pairs specific to a single page session; the data survives page reloads but is cleared when the page session ends, such as when the browser tab is closed. It is ideal for storing temporary data that does not need to persist across tabs or browser restarts.
Local storage is a persistent web storage solution that enables websites to store key-value pairs that remain available even after the browser is closed and reopened. It is useful for saving user preferences (like theme and language preferences), or cached data (like incomplete forms) that should persist across sessions.

This was all for session-based authentication. May your cookies be secure and your sessions ever fresh!

Thanks for reading!

Tags: system-design