Let’s say you want to build a new awesome web app where people could play
<insert-your-favorite-card-game> with each other. Of course, you would want them to immediately see when a card was played. It turns out: this is not that easy.
The web has been built with client to server requests in mind. But the other way around - server to client ("server push") - was not a priority. In this post, I’m trying to outline a short history on what was done to implement server push, until finally WebSockets came along and solved our problems.
The basis: HTTP
At first, in 1991, there was the Hypertext Transfer Protocol (HTTP). This protocol is the foundation of the web. The basic idea: A client sends a request to a URL on a server and receives a response. The response can be empty, but can also be some content like HTML text. The client then can do something with the response. In the case of the web browser, the response is then rendered as a website.
Every request by the web browser resulted in a new page load - and there were no options to trigger a new request without a causing a new page load.
One simple & often-used way to create a sense of server push was polling. The client would repeatedly send requests to the server (e.g. every five seconds) and ask if a new event had arrived. Downside: This created a lot of network traffic, especially if you wanted to get the new event near-instantly after it occurred - as then, you had to set your polling interval very low.
Techniques like long-polling made polling better and less strainful on network. With long-polling, the client also makes sequential requests against the server.
The difference to polling: The requests are not sent in a fixed interval. Instead, each request would be kept open until the server has a message for the client. When the server has a message, it answers the request with the message. This closes the connection. The client then immediately sends out a new request to listen for new messages.
What also very likely might happen: As both browsers and web servers have a request timeout duration set (default configurations range from 30s to 5min), the request might time out. In this case, the client also has to send out a fresh request.
While long-polling allows for instant event pushes to the client, it still has some major disadvantages. The main being that each message (server to client as well as client to server) requires a full HTTP request. First and foremost, each request requires a new TCP connection being set up, which involves quite a latency. Second, each HTTP request is accompanied by data like request & response headers, and cookies. With cookies, this can become around several kilobytes. Overall, long polling is not a good solution for scenarios that involve sending many subsequent events where a low latency is important. This StackOverflow answer gives a good & more detailed overview on why WebSockets are better.
Besides simple polling and long polling, there were also other clever hacks in place, like loading a hidden iFrame that would never finish loading. And using this connection to maintain a connection and transfer information. Most of these techniques were put together under the “Comet” umbrella term. But I won’t go too much into detail here, as I’m not that familiar with these techniques.
Instead…the modern solution is here…
WebSocket is a protocol that enables a website and a web server to have a persistent communication connection. The communication is bi-directional, which means that now finally, both parts - server and client - can initiate messages once the connection is established. The initial connection still has to be initiated by the client via a HTTP request. The connection then gets upgraded to the WebSocket protocol. With WebSocket, message overhead is kept very low. All in all, this makes WebSocket suitable for sending many events that are near-instantaneous. The name stems from the “socket” concept, which means that two processes communiciate via a shared object (either on the same machine, or over the network). This shared object is provided by the operating system on ask.
Nowadays, WebSocket support is built-in to most browsers. Browsers started adding support around late 2009 (Google Chrome 4 being the first). Nowadays, you can count on every relevant browser to have support - even IE10 already has it. See caniuse.com for detailed browser support!
If you are starting a new project where server side pushes are a thing, or you generally want to have a persistent connection between client and server (e.g. maybe also because you want to send many small events to the server), then you really should take a closer look into WebSockets.
Chances are high that they are exactly what you need in making your development life easier!
I published a follow-up post: “Presenting Socket.IO: Building a chat in 70 lines”. Socket.IO is a library that uses WebSockets under the hood, but fixes some annoyances compared to using WebSockets directly.
- Customizing my shell: From bash to zsh to fish
- JSON5: JSON with comments (and more!)
- jq: Analyzing JSON data on the command line
- Get Total Directory Size via Command Line
- Changing DNS server for Huawei LTE router
- Notes on Job & Career Development
- Adding full-text search to a static site (= no backend needed)
- Generating a random string on Linux & macOS
- Caddy web server: Why use it? How to use it?
- Tailwind CSS: A Primer