In the previous post on computer networks, we explored network and transport layer protocols. But for most developers, it’s the application layer (Layers 5-7 of the OSI model) that has the most practical relevance. At the application layer, we have protocols (language contracts) defining how applications talk to each other, even when they are running on machines oceans apart.
The Hypertext Transfer Protocol (`HTTP`) is a stateless, application-level protocol that has become the quiet workhorse powering everything we do on the internet. It is the language that your browser uses to talk with websites. Let’s understand this invisible thread tying the internet together.
HTTP/0.9 (1991)
Tim Berners-Lee, an English computer scientist, developed the first version of `HTTP`, now called `HTTP/0.9`. It ran over TCP/IP and was extremely minimal by design. What made it so minimal?
- Requests and responses had no headers (metadata about the message)
- Requests were a single line, and only the `GET` method was allowed
- Only raw HTML files could be fetched from a server (no images, no buttons, no CSS)
- No status codes were returned. If something went wrong, the server might return an error message as raw HTML, but there was no formal way to tell the client what happened
In `HTTP/0.9`, a request looked as simple as `GET /index.html`. Despite being very basic in design, it laid down the foundations for everything that followed.
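To make this concrete, here’s what a complete `HTTP/0.9` exchange might have looked like (the file name and HTML are made up for illustration). The client sends a single line, the server replies with nothing but raw HTML, and then the connection closes:

```
GET /index.html

<html>A very simple page</html>
```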
HTTP/1.0 (1996)
`HTTP/1.0` built on the primitive design of `HTTP/0.9` and brought the following upgrades:

- `POST` and `HEAD` methods were added: `POST` allowed the client to send form data and create new resources, while `HEAD` (similar to `GET`) could be used to retrieve just the response headers without the actual resource
- Headers (metadata) could now be added to each request and response. This made the web much more flexible. For example:
  - Thanks to the `Content-Type` header, servers could now send documents other than plain HTML, such as plain text with `text/plain`, images with `image/png` or `image/jpeg`, binary data with `application/octet-stream`, and more
  - The `Expires` header (which tells the client when the returned resource will go stale) and the `Last-Modified` header (which tells the client when a resource was last modified) allowed clients to do basic caching. For example, a client could now send a `HEAD` request to get the headers of a resource and check `Last-Modified` to know if the resource had changed since the last fetch
  - Servers could now provide the number of bytes in the response body with the `Content-Length` header. This was important for reading responses correctly, especially for binary data like images
- Status codes were sent at the beginning of the response, making it easier for clients to adapt their behavior based on whether requests succeeded or failed
- `HTTP/1.0` added version info to the request line, e.g., `GET /index.html HTTP/1.0`, to maintain backward compatibility and pave the way for future upgrades
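Here’s a sketch of what a full `HTTP/1.0` exchange might have looked like with these additions (the user agent and body are made up for illustration). The request is on top; the server’s response, now with a status line and headers, follows:

```
GET /index.html HTTP/1.0
User-Agent: MyBrowser/1.0

HTTP/1.0 200 OK
Content-Type: text/html
Content-Length: 15

<html>Hi</html>
```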
Though it advanced `HTTP/0.9` quite a lot, `HTTP/1.0` still had its limitations:
- No Persistent Connections: Every request required a new TCP connection to the server. Thus, if an HTML file referenced 5 images, the browser would open 5 separate TCP connections to fetch them all. This wasn’t efficient, especially as websites grew heavier and more complex
- No Pipelining or Multiplexing: Only 1 request could be handled per TCP connection at a time
- No Built-in Compression: All payloads were sent raw, which slowed things down when transferring large responses
HTTP/1.1 (1997)
With the web exploding in popularity, `HTTP/1.1` was introduced to handle the growing demand. It became the backbone of the internet for nearly two decades and is still widely used today.

So, what made `HTTP/1.1` such a big deal?
Persistent Connections
`HTTP/1.1` made persistent connections the default (this can be overridden with the `Connection: close` header in the request). Unlike `HTTP/1.0`, the browser no longer had to open a brand new TCP connection for every request - a single connection could be reused for multiple requests. Thus, all the resources referenced on a webpage could be downloaded over a single connection, resulting in fewer round trips, less overhead from repeatedly setting up and tearing down connections, and faster page loads. Persistent connections made the web feel noticeably snappier.
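As a quick illustration, Python’s standard `http.client` reuses one TCP connection for sequential requests (a minimal sketch; the hostname and paths are placeholders):

```python
import http.client

# One TCP connection, multiple requests - HTTP/1.1 keeps it open by default.
conn = http.client.HTTPConnection("example.com")

conn.request("GET", "/")
resp = conn.getresponse()
resp.read()                    # drain the body before reusing the socket

conn.request("GET", "/about")  # sent over the same TCP connection
resp = conn.getresponse()
resp.read()

conn.close()                   # tear the connection down when done
```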
Chunked Transfer Encoding
Before `HTTP/1.1`, the server had to know the exact length of the response before sending it (so it could set the value of `Content-Length` in the response header). While this worked fine for static pages, it wasn’t efficient for dynamic content or streaming data where the final size isn’t known upfront. For example: streaming logs, live data from a DB, or any scenario where the response is built on the fly.
`HTTP/1.1` introduced chunked transfer encoding, which lets the server start sending data immediately (as chunks) without knowing the final size ahead of time.
How it works:
- The server sets the `Transfer-Encoding: chunked` header in the response
- The response body is sent as a series of chunks. Each chunk is sent like:

```
{chunk size in hex}\r\n
{chunk data}\r\n
```

- When all chunks have been sent, the server sends a chunk size of 0 followed by an empty line to indicate the end of the response:

```
0\r\n
\r\n
```
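Putting it all together, here’s a sketch of a chunked response carrying two chunks, “Mozilla” (7 bytes) and “Developer” (9 bytes), followed by the terminating zero-length chunk (the `\r\n` sequences are shown literally, as above):

```
HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

7\r\n
Mozilla\r\n
9\r\n
Developer\r\n
0\r\n
\r\n
```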
Modern applications use chunked encoding behind the scenes for things like server-sent events (SSE), streaming APIs, and real-time dashboards.
Better Caching Mechanisms
`HTTP/1.1` allowed both servers and clients to have more control over how content is stored and reused, with headers like:
- `Cache-Control`:
  - In requests, this is used by browsers to control how aggressively they want to bypass cached data. For example: when you hard refresh a page, the browser may send `Cache-Control: no-cache` to force a fresh version
  - In responses, the server sets the `Cache-Control` header to specify the caching behavior for the sent resource. For example: `Cache-Control: max-age=1800` tells the browser that it can cache and use this resource for 30 minutes without checking back with the server
- `ETag`: An Entity Tag is a unique identifier assigned by a server to a resource. The client can cache the resource along with this ETag (returned in the response). Later, it can ask the server if the resource has changed using the `If-None-Match: {ETag-value}` header. If the server’s current ETag for the resource matches the client’s, it responds with a `304 Not Modified` status code, indicating that the client’s version is still valid
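Here’s a sketch of that revalidation flow on the wire (the path and ETag value are made up). The client revalidates its cached copy, and the server confirms it’s still fresh:

```
GET /logo.png HTTP/1.1
Host: example.com
If-None-Match: "v2-abc123"

HTTP/1.1 304 Not Modified
ETag: "v2-abc123"
```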
Better Error Handling
More status codes were introduced in `HTTP/1.1`, allowing servers to give more meaningful feedback to clients. This made error handling and client-server communication more precise.
For example: `206 Partial Content` allows serving range requests (important for resumable downloads), `101 Switching Protocols` signals the client that the server is switching protocols as requested (used in WebSockets), `410 Gone` tells the client that the requested resource is permanently gone, and more.
Host Header
This small feature helped the internet scale massively. Before `HTTP/1.1`, a typical request looked like `GET /index.html HTTP/1.0`. Imagine you want to run multiple websites on a single machine: how would the server distinguish which website the request is for? This meant that one IP could only serve one website.
With the `Host` header, `HTTP/1.1` allowed the client to specify different domains in the request, like this:

```
GET /index.html HTTP/1.1
Host: awesomeHTTP.com
```
This simple feature allowed many websites to live on the same server/IP (virtual hosting), preventing IP exhaustion, and also made it possible for shared hosting providers (like GoDaddy) to offer cheap hosting plans.
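A minimal sketch of how a server might dispatch on the `Host` header (the site names and directories are hypothetical):

```python
# Toy virtual hosting: one server, many sites, routed by the Host header.
SITES = {
    "awesomehttp.com": "/var/www/awesomehttp",
    "example.org": "/var/www/example",
}

def document_root(host_header: str) -> str:
    host = host_header.split(":")[0].strip().lower()  # drop any ":port" suffix
    return SITES.get(host, "/var/www/default")

print(document_root("awesomeHTTP.com"))  # -> /var/www/awesomehttp
```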
Additional Methods
`HTTP/1.1` expanded the set of `HTTP` methods beyond the basic trio from `HTTP/1.0` (`GET`, `POST`, `HEAD`) to offer more control over resources:
| Method | Purpose |
|---|---|
| `OPTIONS` | Asks the server which HTTP methods are supported for a specific resource |
| `PUT` | Replace or create a resource |
| `DELETE` | Remove a resource |
| `TRACE` | Echo back the sent request; useful for diagnostics |
| `CONNECT` | Convert the connection into a tunnel (e.g., for TLS) |
Limitations
`HTTP/1.1` served us well for over two decades, but as websites got heavier and more interactive, its cracks started to show:

- Even though `HTTP/1.1` introduced persistent connections, it still handles one request at a time per TCP connection. Thus, the client has to wait for the previous request to be served before sending the next one. This means that if a request is slow, it blocks all subsequent requests (head-of-line blocking at the application layer)
- `HTTP/1.1` sends headers as plain text without any built-in compression. On top of that, the same headers often get exchanged over and over on the same connection, which is wasteful
- `HTTP/1.1` uses TCP connections, which are relatively expensive to open and manage, wasting bandwidth unnecessarily
Many of these limitations directly inspired the development of `HTTP/2` and `HTTP/3`.
HTTP/2 (2015)
By the mid-2000s, it was apparent that `HTTP/1.1` couldn’t sustain the growing weight and complexity of the internet. In response, Google introduced `SPDY` in 2009, an experimental protocol designed to make web browsing faster by tackling some shortcomings of `HTTP/1.1`. It served as a proof of concept and helped shape what eventually became `HTTP/2`.
`HTTP/2` didn’t change the semantics of `HTTP`, i.e., methods like `GET` and `POST` and status codes like `200 OK` still worked exactly the same. Its contributions were mostly under the hood, focusing on making the protocol faster and more efficient on the wire.
Binary Framing Layer & Multiplexing
Instead of sending data in plain text, `HTTP/2` opted for a binary format, which is easier for machines to parse and process - it’s their native language, after all. At the core of this is the binary framing layer: a set of rules for how to chop up the data, label it, send it, and rebuild it on the other end. Let’s look at how it works.
In `HTTP/2`, all communication is done through frames. Each `HTTP` message (whether a request or a response) is split into multiple frames. Every frame consists of:

- A 9-byte frame header
- A variable-length frame payload
All the frames of a message are associated with the same Stream ID. The Stream ID is best understood with an example:
Suppose a client sends 2 requests to a server. Each request gets chopped up into frames:
Request 1 (Stream ID = 1): frame #1, frame #2, frame #3 (all have Stream ID = 1)
Request 2 (Stream ID = 2): frame #4, frame #5 (all have Stream ID = 2)
Now, instead of sending all of Request 1 and then all of Request 2, the client can interleave frames on a single TCP connection like this:
Frame #1 → Frame #4 → Frame #2 → Frame #5 → Frame #3
Thus, the frames of both requests get sent together, and the second request can even be delivered before the first. Unlike `HTTP/1.1`, the second request didn’t have to wait for the first to be responded to.
This ability to send multiple streams concurrently over the same connection, without blocking, is called multiplexing. Multiplexing helps `HTTP/2` overcome the head-of-line blocking issue at the application layer.
Note: Multiplexing in `HTTP/2` doesn’t do anything about head-of-line blocking at the transport layer. The server might still be waiting for a lost or corrupted TCP segment before it can assemble the frames and pass them up to the application.
Thus, the Stream ID enables the client to interleave frames on a single TCP connection and also allows the server to assemble full messages from the different frames it receives.
Each `HTTP/2` frame has a 9-byte header, which includes:
- Length: Size of the frame’s payload
- Type: What kind of frame it is (`HEADERS`, `DATA`, etc.)
- Flags: Control bits like `END_STREAM` or `END_HEADERS`
- Stream ID: The stream this frame belongs to
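Since this layout is fixed, parsing a frame header is straightforward. Here’s a minimal sketch in Python (field layout per RFC 7540; the sample bytes are made up for illustration):

```python
import struct

def parse_frame_header(data: bytes):
    """Parse a 9-byte HTTP/2 frame header: 24-bit length, 8-bit type,
    8-bit flags, and 1 reserved bit + 31-bit stream ID."""
    length_bytes, frame_type, flags, stream_id = struct.unpack("!3sBBI", data[:9])
    length = int.from_bytes(length_bytes, "big")  # 24-bit payload length
    stream_id &= 0x7FFFFFFF                       # mask off the reserved bit
    return length, frame_type, flags, stream_id

# A hypothetical HEADERS frame (type 0x1) on stream 1, with a 12-byte payload
# and the END_STREAM (0x1) + END_HEADERS (0x4) flags set:
sample = bytes([0x00, 0x00, 0x0C, 0x01, 0x05, 0x00, 0x00, 0x00, 0x01])
print(parse_frame_header(sample))  # -> (12, 1, 5, 1)
```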
Commonly used frame types:
- The `HEADERS` frame (different from a frame’s header) carries the headers of the request or the response
- The `DATA` frame carries the body of the request or the response
A typical message starts with one or more `HEADERS` frames followed by one or more `DATA` frames. The sender sets:

- `END_HEADERS` in the last `HEADERS` frame
- `END_STREAM` in the last frame (whether `HEADERS` or `DATA`) to indicate that it’s the last frame from the sender
Now, what’s a stream? It’s a virtual bidirectional channel within a TCP connection that carries a single request-response pair. Here’s how it works in practice:
- Imagine you’re loading a webpage. Your browser sends a request for `index.html`. All frames of this request get assigned the same Stream ID. The last frame includes the `END_STREAM` flag to indicate the request is complete. Now the stream is half-closed
- The server replies using the same Stream ID, with its own `HEADERS` and `DATA` frames. It also sets `END_STREAM` in the last frame of its response. Now the stream is fully closed
Thus, the request and response shared the same stream. A stream goes through several states:
- Idle: Not used yet
- Open: A request is being sent or received
- Half-closed: Client is done sending, but the server isn’t
- Closed: Both sides are done; no more frames can be sent on this stream
Stream Prioritization
`HTTP/2` enables multiple streams to be active together. Imagine a web page with:

- HTML (index.html)
- CSS (style.css)
- JavaScript (app.js)
- 2 images (img1.jpg, img2.jpg)

Even though all those requests can be active at the same time with `HTTP/2` multiplexing, we want:

- HTML first (so the browser can parse the layout)
- CSS and JS next (to render and interact)
- Images last (they’re not critical for page structure)
Stream prioritization is a mechanism that lets the client tell the server which streams are more important, so the server can allocate its resources to serve the high-priority streams first.
For example: the client can tell the server, “Please send this HTML stream first, it’s more important than the rest”. The client can express this prioritization in two ways:

- By attaching a priority section to the stream’s initial `HEADERS` frame
- Or by sending a separate `PRIORITY` frame in the stream
This information (attached or sent separately) contains the following fields:

- Stream dependency ID: The stream this one depends on
- Weight (value between 1 and 256): Relative importance among sibling streams; higher weights get more bandwidth
- Exclusive flag: Tells the server to prioritize only this stream among all its siblings
Imagine a client assigns the following stream priorities:

- Stream 3 = HTML (no dependency)
- Stream 5 = CSS (depends on 3, weight 200, exclusive=1)
- Stream 7 = Image 1 (depends on 3, weight 40)
- Stream 9 = Image 2 (depends on 3, weight 10)

This forms the following dependency tree:

```
Stream 3
└── Stream 5 (exclusive)
    ├── Stream 7
    └── Stream 9
```
Here’s how the server interprets this:

- The server will process stream 3 first, as everything else depends on it
- Then stream 5 gets processed, as exclusive=1 makes it take all the bandwidth
- Then streams 7 and 9 are processed in parallel, but stream 7 gets 80% of the server’s bandwidth while stream 9 gets 20% (the arithmetic is sketched below)
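The split is just the weights normalized among sibling streams; a quick sketch:

```python
# Bandwidth share among sibling streams is proportional to their weights.
siblings = {"stream 7": 40, "stream 9": 10}
total = sum(siblings.values())
for stream, weight in siblings.items():
    print(f"{stream}: {weight / total:.0%}")
# stream 7: 80%
# stream 9: 20%
```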
Thus, stream prioritization allows clients to control how content is delivered.
Header Compression
In `HTTP/1.1`, headers are sent as plain text every time, even if they’re identical across messages. This creates unnecessary overhead, especially for common headers like `Accept`, `User-Agent`, etc. `HTTP/2` addresses this with HPACK, a header compression format designed to reduce bandwidth usage, allowing faster message transmission.
Here’s how HPACK works. It uses 2 shared tables, maintained independently by both the client and the server:

- Static Table: Holds commonly used header values, e.g., `:method: GET`, `:method: POST`, `:scheme: https`, etc.
- Dynamic Table: Starts empty and gets populated with the new headers encountered during the communication
To reuse a header, the sender doesn’t need to send its full key-value pair. Instead, it can just reference the index of the header (from the shared header tables) in its message, drastically reducing the size of the header block. For example, the 1st request:

```
:method: GET
:path: /
user-agent: MyBrowser/1.0
```

2nd request:

```
:method: GET
:path: /search
user-agent: MyBrowser/1.0
```
For the second request, the client just has to send `:path: /search` along with index references to `:method: GET` and `user-agent: MyBrowser/1.0`. The server looks up the referenced indexes in its copy of the header tables and reconstructs the full headers, allowing for smaller and faster transmissions.
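To build intuition, here’s a toy sketch of HPACK-style indexing in Python. This is not the real wire format (real HPACK uses 1-based indexes, Huffman coding, and size-bounded eviction from the dynamic table), but it shows how repeated headers collapse into tiny index references:

```python
# A toy, HPACK-like encoder: both sides keep identical tables and
# exchange table indexes instead of full key-value pairs.
STATIC_TABLE = [(":method", "GET"), (":method", "POST"), (":path", "/")]

class ToyHpackEncoder:
    def __init__(self):
        self.dynamic_table = []

    def encode(self, headers):
        out = []
        for header in headers:
            table = STATIC_TABLE + self.dynamic_table
            if header in table:
                out.append(("index", table.index(header)))  # reference by index
            else:
                self.dynamic_table.append(header)  # remember it for next time
                out.append(("literal", header))
        return out

enc = ToyHpackEncoder()
print(enc.encode([(":method", "GET"), ("user-agent", "MyBrowser/1.0")]))
# -> [('index', 0), ('literal', ('user-agent', 'MyBrowser/1.0'))]
print(enc.encode([(":method", "GET"), ("user-agent", "MyBrowser/1.0")]))
# -> [('index', 0), ('index', 3)]
```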
Server Push
`HTTP/2` allows the server to proactively send resources to the client before the client explicitly asks for them. Without Server Push:
- Client asks for `index.html`
- Client parses `index.html` and encounters `<link href="style.css">`
- Client makes another request for `style.css`
With Server Push, the server predicts that `style.css` will be needed:
- The client sends a request `GET /index.html` on stream 1
- The server responds with:
  - The usual `HEADERS` and `DATA` frames on stream 1 for `index.html`
  - A `PUSH_PROMISE` frame (also on stream 1) saying: “Hey, I’m going to push `style.css` on stream 2”
- The server then initiates stream 2 to send the `HEADERS` and `DATA` frames for `style.css`
Note:
- The `PUSH_PROMISE` frame is sent on a client-initiated stream, while a new server-initiated stream carries the pushed resource
- Clients can reject pushed resources, either per stream or globally via a `SETTINGS` frame
- The server can’t force the client to use the pushed content; the client may discard it, especially if it already has the asset cached (in which case it can simply cancel the pushed stream)
While Server Push sounds like a great way to reduce latency on paper, it is rarely used today due to its complexity and poor results in real-world usage. The complexity comes from the fact that it’s hard to predict what the client truly needs, and bandwidth is wasted if the client already has the resource cached. Due to its underwhelming performance, major browsers have since dropped support for Server Push entirely.
Limitations
Even though it brought significant improvements over `HTTP/1.1`, `HTTP/2` has its limitations:

- Still relies on TCP, which means it still suffers from head-of-line blocking at the transport layer
- In high-latency or lossy networks, the benefits of multiplexing and header compression can be negated by TCP’s retransmission behavior
- Because it uses binary framing, `HTTP/2` traffic isn’t human-readable. Unlike `HTTP/1.1`, you can’t simply inspect it with basic tools like `telnet`. Debugging now requires more specialized tools like Wireshark or browser developer tools
- `HTTP/2` uses HPACK, which maintains synchronized header tables on both client and server. While it reduces bandwidth usage, this shared state must be carefully managed. Desynchronization (due to lost packets or bugs) can cause errors that are hard to detect and recover from
HTTP/3 (2022)
`HTTP/3` solved some of the biggest bottlenecks in `HTTP/2` by moving away from TCP entirely and building on top of `QUIC` (see the last post for how `QUIC` works). Just like `HTTP/2`, `HTTP/3` didn’t change any `HTTP` semantics. By using `QUIC` as the transport layer protocol:
- `HTTP/3` no longer suffers from head-of-line blocking at the transport layer
- `QUIC` allows faster connection setup between the client and the server
- `QUIC` doesn’t use (IP address, port) to define a connection. Instead, each connection is identified with a Connection ID (CID). This means that if your device changes its network (switching from Wi-Fi to mobile data), which changes its IP address, the old connection between your device and any server it is connected to remains usable. The server won’t care that the IP address has changed, as it now relies on the CID to identify your device. Thus, `QUIC` connections can survive a network change
- `QUIC` runs over `UDP` and is implemented in user space rather than the OS kernel. This facilitates rapid development and deployment of updates and new features. For example, it gives browser vendors more freedom to iterate and deploy improvements faster
- `QUIC` makes `HTTP/3` more resilient to high-latency and lossy networks
Usage Trends (2024)
| Version | Global traffic share (Source) |
|---|---|
| HTTP/1.1 | 29.9% |
| HTTP/2 | 49.6% |
| HTTP/3 | 20.5% |
That’s all for the HTTP story - from humble text-based roots to multiplexed, compressed, and QUIC. And all of it, just to load cat pictures faster. Thanks for reading!