A short historical overview: Building JavaScript apps that receive server events in real time

May 25, 2020 23:30 · 1150 words · 6 minute read

Let’s say you want to build a new awesome web app where people could play <insert-your-favorite-card-game> with each other. Of course, you would want them to immediately see when a card was played. It turns out: this is not that easy.

The web was built with client-to-server requests in mind. The other way around - server to client ("server push") - was not a priority. In this post, I outline a short history of what was done to implement server push, until WebSockets finally came along and solved our problems.

The basis: HTTP

At first, in 1991, there was the Hypertext Transfer Protocol (HTTP). This protocol is the foundation of the web. The basic idea: A client sends a request to a URL on a server and receives a response. The response can be empty, but can also be some content like HTML text. The client then can do something with the response. In the case of the web browser, the response is then rendered as a website.

Every request by the web browser resulted in a new page load - and there was no way to trigger a new request without causing a new page load.

JavaScript XMLHttpRequest Object (Ajax)

Then came JavaScript and, in the early 2000s, its XMLHttpRequest feature. With JavaScript, there was finally a built-in, non-proprietary programming language that ran on the client side (= browser side). And the XMLHttpRequest feature made it possible to trigger new HTTP requests and handle their responses without extra page loads.
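To make this a bit more concrete, here is a minimal sketch of such a background request. The function name getText, the URL /api/cards and the XhrClass parameter are my own illustrative assumptions; the parameter only exists so the sketch can also be tried outside a browser.

```javascript
// Request `url` in the background and hand the result to a callback.
// `XhrClass` is an assumption for testability; in a browser the
// built-in XMLHttpRequest is used by default.
function getText(url, onDone, XhrClass = XMLHttpRequest) {
  const xhr = new XhrClass();
  xhr.open('GET', url); // asynchronous by default
  xhr.onload = () => onDone(null, xhr.status, xhr.responseText);
  xhr.onerror = () => onDone(new Error('Network error'));
  xhr.send();
  return xhr;
}

// Hypothetical usage in a browser (no page load involved):
// getText('/api/cards', (err, status, body) => {
//   if (err) return console.error(err);
//   console.log(status, body);
// });
```

Note that the request is still started by the client. The server only gets to answer.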

Ground-breaking web apps like Gmail would have looked very different without this feature. This method - sending asynchronous requests to the server - was then called Ajax (Asynchronous JavaScript and XML). Ajax is still focused on sending requests from client to server, but it paved the way for some clever hacks to finally get server push rolling.

Simple Polling

One simple & often-used way to create a sense of server push was polling. The client would repeatedly send requests to the server (e.g. every five seconds) and ask if a new event had arrived. Downside: This created a lot of network traffic, especially if you wanted to get the new event near-instantly after it occurred - because then you had to set your polling interval very low.
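As a rough sketch, simple polling could look like the following. The names startPolling and fetchEvents and the shape of the events (objects with an id) are assumptions for illustration; fetchEvents stands for whatever function actually sends the request to your server.

```javascript
// Ask the server for new events every `intervalMs` milliseconds.
// `fetchEvents(lastId)` is a hypothetical async function that resolves
// to an array of events ({ id, ... }) newer than `lastId`.
function startPolling(fetchEvents, onEvent, intervalMs = 5000) {
  let lastId = 0;
  let busy = false;

  const timer = setInterval(async () => {
    if (busy) return; // previous request still running, skip this tick
    busy = true;
    try {
      const events = await fetchEvents(lastId);
      for (const event of events) {
        lastId = event.id;
        onEvent(event);
      }
    } catch (err) {
      console.error('Polling failed, trying again on the next tick', err);
    } finally {
      busy = false;
    }
  }, intervalMs);

  // Call the returned function to stop polling.
  return () => clearInterval(timer);
}
```

The trade-off is visible in intervalMs: the lower you set it, the faster events show up, and the more requests come back with nothing new.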

Long-Polling

Techniques like long-polling made polling better and less of a strain on the network. With long-polling, the client also makes sequential requests to the server.

The difference from simple polling: The requests are not sent at a fixed interval. Instead, each request is kept open until the server has a message for the client. When the server has a message, it answers the request with the message. This completes the request. The client then immediately sends out a new request to listen for new messages.

Something else that is quite likely to happen: As both browsers and web servers have a request timeout configured (defaults range from 30s to 5min), the request might time out. In this case, the client also has to send out a fresh request.
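The client side of this loop can be sketched roughly like this. The names longPoll, openRequest and control are my own assumptions, and so is the /events URL in the usage comment; openRequest stands for a function that sends one request and resolves once the server answers, or rejects on a timeout or network error.

```javascript
// Keep one request open at a time and re-issue it as soon as it ends.
// `openRequest()` is a hypothetical async function: it resolves with the
// server's message, or rejects if the request timed out or failed.
// Set `control.stopped = true` to end the loop.
async function longPoll(openRequest, onMessage, control = { stopped: false }, retryDelayMs = 1000) {
  while (!control.stopped) {
    let message;
    try {
      message = await openRequest(); // stays pending until the server answers
    } catch (err) {
      // Timeout or network error: wait briefly, then send a fresh request.
      await new Promise((resolve) => setTimeout(resolve, retryDelayMs));
      continue;
    }
    onMessage(message); // then loop around and immediately listen again
  }
}

// Hypothetical usage in a browser:
// longPoll(() => fetch('/events').then((res) => res.json()), console.log);
```

The short pause after a failure is a design choice on my side, so that a server that is down is not hammered with requests in a tight loop.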

While long-polling allows for instant event pushes to the client, it still has some major disadvantages. The main one is that each message (server to client as well as client to server) requires a full HTTP request. First and foremost, a request may require a new TCP connection to be set up (unless an existing connection is kept alive and reused), which adds noticeable latency. Second, each HTTP request is accompanied by data like request & response headers and cookies. With cookies, this can add up to several kilobytes. Overall, long-polling is not a good solution for scenarios that involve sending many consecutive events where low latency is important. This Stack Overflow answer gives a good & more detailed overview of why WebSockets are better.
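To get a feeling for the numbers, here is a small back-of-the-envelope calculation. The 1000 messages, the 2 KB of headers and cookies per request and the 50-byte payload are made-up assumptions; the WebSocket frame header sizes follow the framing rules of RFC 6455.

```javascript
// Bytes of HTTP metadata spent on `messageCount` long-polling messages,
// assuming every message needs its own request/response pair.
function httpOverheadBytes(messageCount, headerBytesPerRequest) {
  return messageCount * headerBytesPerRequest;
}

// Size of a WebSocket frame header (RFC 6455): 2 bytes, plus an extended
// length field for larger payloads, plus a 4-byte masking key on frames
// sent from client to server.
function webSocketFrameHeaderBytes(payloadLength, fromClient) {
  let bytes = 2;
  if (payloadLength > 65535) bytes += 8;
  else if (payloadLength > 125) bytes += 2;
  if (fromClient) bytes += 4;
  return bytes;
}

const messageCount = 1000;  // assumed number of server events
const httpTotal = httpOverheadBytes(messageCount, 2048); // assumed 2 KB per request
const wsTotal = messageCount * webSocketFrameHeaderBytes(50, false);

console.log(`HTTP: ${httpTotal} bytes, WebSocket: ${wsTotal} bytes`);
// prints "HTTP: 2048000 bytes, WebSocket: 2000 bytes"
```

This only counts per-message metadata and ignores connection setup, so treat it as an illustration of the order of magnitude, not as a benchmark.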

Other techniques

Besides simple polling and long-polling, there were also other clever hacks, like loading a hidden iframe that would never finish loading and using that open connection to transfer information. Most of these techniques were grouped under the “Comet” umbrella term. But I won’t go into too much detail here, as I’m not that familiar with these techniques.

Instead…the modern solution is here…

Enter WebSockets!

WebSocket is a protocol that enables a website and a web server to have a persistent communication connection. The communication is bi-directional, which means that now, finally, both parties - server and client - can initiate messages once the connection is established. The initial connection still has to be initiated by the client via an HTTP request. The connection then gets upgraded to the WebSocket protocol. With WebSocket, message overhead is kept very low. All in all, this makes WebSocket suitable for sending many events with near-instant delivery. The name stems from the “socket” concept, which means that two processes communicate via a shared object (either on the same machine, or over the network). This shared object is provided by the operating system on request.
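On the client, a connection for the card game example could be sketched like this. The function name connect, the wss://example.test/game URL and the message format ({ type, card }) are purely illustrative assumptions; the WebSocketClass parameter only exists so the sketch can be tried without a browser.

```javascript
// Open a persistent connection and react to messages pushed by the server.
// `WebSocketClass` is an assumption for testability; in a browser the
// built-in WebSocket is used by default.
function connect(url, onCardPlayed, WebSocketClass = WebSocket) {
  const socket = new WebSocketClass(url);

  // Once the connection is open, the client can send at any time ...
  socket.addEventListener('open', () => {
    socket.send(JSON.stringify({ type: 'join' })); // hypothetical message
  });

  // ... and so can the server, without being asked first.
  socket.addEventListener('message', (event) => {
    const message = JSON.parse(event.data);
    if (message.type === 'card-played') onCardPlayed(message.card);
  });

  return socket;
}

// Hypothetical usage in a browser:
// connect('wss://example.test/game', (card) => console.log('Played:', card));
```

Compared to the polling approaches, there is no loop and no repeated request here: the handler simply runs whenever the server sends something.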

Nowadays, WebSocket support is built into most browsers. Browsers started adding support around late 2009 (Google Chrome 4 being the first). Today, you can count on every relevant browser to have support - even IE10 already has it. See caniuse.com for detailed browser support!

To use WebSockets, you need to set up both your server and client. If you stay in the JavaScript ecosystem: On the server side, WebSockets are not part of Node.js’ standard library, so the easiest way is to use a library like ws. On the client side, things are easy - as said, browsers have WebSocket support built in, so you can directly start using the WebSocket API within your client-side JavaScript.

If you are starting a new project where server side pushes are a thing, or you generally want to have a persistent connection between client and server (e.g. maybe also because you want to send many small events to the server), then you really should take a closer look into WebSockets.

Chances are high that they are exactly what you need to make your development life easier!

Update 2020-05-30

I published a follow-up post: “Presenting Socket.IO: Building a chat in 70 lines”. Socket.IO is a library that uses WebSockets under the hood, but fixes some annoyances compared to using WebSockets directly.

Update 2020-06-02

Cagatay and Johnny (many thanks, you two!) pointed out to me that another popular method to implement server push was to use Adobe Flash. Flash applications could be embedded into websites. Flash was an external program and offered more system permissions to its own programming language ActionScript than the browsers would to JavaScript. A server could quite easily message the corresponding Flash application. Then, the Flash application would transport the message into the JavaScript context - apparently one common way was to modify the site’s URL (<normal-url>#<data-string-from-server>). The JavaScript application could read the URL and extract the data string from it. Sounds complicated, but apparently that method was quite popular! 🤯
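Just to illustrate the last step, reading the data string back out of the URL: the following is a sketch in today’s JavaScript, not a reconstruction of what those apps actually did. The function name readDataFromUrl, the example URL and the assumption that the data string was percent-encoded are all mine.

```javascript
// Extract the data string from a URL of the form <normal-url>#<data-string>.
// Returns null if the URL has no (or an empty) fragment.
function readDataFromUrl(url) {
  const hash = new URL(url).hash; // e.g. "#card%3DQH", or "" if absent
  return hash ? decodeURIComponent(hash.slice(1)) : null;
}

console.log(readDataFromUrl('https://example.test/game#card%3DQH'));
// prints "card=QH"

// Hypothetical usage in a modern browser: react whenever the fragment changes.
// window.addEventListener('hashchange', () => {
//   console.log(readDataFromUrl(window.location.href));
// });
```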