Web services' best friend – the load balancer

A load balancer is a tool that allows distributing HTTP requests (or other kinds of network requests) among several backend resources.

The main operation of a load balancer is to take traffic directed at a single address and distribute it among several identical backend servers, which spread the load and achieve better total throughput. Typically, the traffic will be distributed through round-robin, that is, sequentially across all of them: first one worker, then the next, and so on.
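As an illustration, a minimal HAProxy configuration for round-robin distribution might look like the following sketch (the names, addresses, and ports are placeholders):

```
frontend web
    bind *:80
    default_backend workers

backend workers
    balance roundrobin
    # "check" makes HAProxy health-check each worker and stop
    # sending traffic to any that fails
    server worker1 192.168.0.11:8000 check
    server worker2 192.168.0.12:8000 check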

That's the normal operation. But a load balancer can also be used to replace services. Because it ensures that each request goes cleanly to one worker or another, the services in the pool of workers can be different, so we can use it to cleanly transition between one version of a web service and another.

For our purposes, a group of old web services behind a load balancer can be joined by one or more backward-compatible replacement services, without interrupting the operation. The new service replacing the old one is added in small numbers (maybe one or two workers) to split the traffic in a reasonable proportion and verify that everything works as expected. After that verification, complete the replacement by stopping new requests to the old services, draining them, and leaving only the new servers.

If done in one quick movement, as when deploying a new version of a service, this is called a rolling update: the workers are replaced one by one.

But for migrating from the old monolith to the new microservices, a slower pace is wiser. A service can live for days with a 5%/95% split, so any unexpected error will appear only one time in twenty, before moving to 33%/67%, then 50%/50%, and finally 100% migrated.
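In HAProxy, such a split can be expressed with server weights. A sketch, assuming one old worker and one new worker (names and addresses are placeholders):

```
backend workers
    balance roundrobin
    # 95% of requests still go to the old worker
    server old1 192.168.0.11:8000 check weight 95
    # 5% of requests go to the new replacement service
    server new1 192.168.0.21:8000 check weight 5
```

Moving to 33%/67%, then 50%/50%, is then just a matter of adjusting the weights and reloading the configuration.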

A highly loaded system with good observability will be able to detect problems very quickly and may only need to wait minutes before proceeding. Most legacy systems will likely not fall into this category, though.

Any web server capable of acting in reverse proxy mode, such as NGINX, can work as a load balancer, but for this task, probably the most complete option is HAProxy (http://www.haproxy.org/).

HAProxy specializes in acting as a load balancer in situations of high availability and high demand. It's very configurable, and accepts traffic from HTTP down to lower-level TCP connections if necessary. It also has a fantastic status page that helps monitor the traffic going through it and take fast action, such as disabling a failing worker.
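Enabling that status page takes only a few lines of configuration; a sketch, where the port and path are arbitrary choices:

```
listen stats
    bind *:8404
    stats enable
    stats uri /stats
    stats refresh 10s
    # allow disabling or draining workers directly from the page
    stats admin if TRUE
```

In a real deployment, the `stats admin` condition should be restricted to trusted addresses rather than `TRUE`.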

Cloud providers such as AWS or Google also offer integrated load balancer products. They are very interesting for working at the edge of our network, as their stability makes them great there, but they won't be as configurable and easy to integrate into your operating system as HAProxy. For example, the offering by Amazon Web Services is called Elastic Load Balancing (ELB)—https://aws.amazon.com/elasticloadbalancing/.

To migrate from a traditional server with an external IP referenced by DNS, and put a load balancer in front of it, follow this procedure:

  1. Create a new DNS record to access the current system. This will allow you to refer to the old system independently once the transition is done.
  2. Deploy your load balancer, configured to serve the traffic to your old system on the old DNS. This way, whether a request reaches the load balancer or the old system directly, it will ultimately be delivered to the same place. Create a DNS record just for the load balancer as well, so you can refer specifically to it.
  3. Test that sending a request to the load balancer directed to the host of the old DNS works as expected. You can send a request using the following curl command:
$ curl --header "Host: old-dns.com" http://loadbalancer/path/
  4. Change the DNS record to point to the load balancer IP. Changing DNS records takes time, as caches will be involved. During that time, no matter where a request is received, it will be processed in the same way. Leave this state for a day or two, to be totally sure that every possible cache is outdated and uses the new IP value.
  5. The old IP is no longer in use. The server can (and should) be removed from the externally facing network, leaving only the load balancer able to connect to it. Any request that needs to go to the old server directly can use its specific new DNS record.

Note that a load balancer like HAProxy can work with URL paths, meaning it can direct different paths to different microservices, something extremely useful in the migration from a monolith.
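For example, a sketch of path-based routing in HAProxy, assuming a /users endpoint has already been extracted into its own microservice (the backend names are hypothetical):

```
frontend web
    bind *:80
    # requests whose path starts with /users go to the new microservice
    acl is_users path_beg /users
    use_backend users_service if is_users
    # everything else still goes to the monolith
    default_backend monolith
```

Extracting the next endpoint from the monolith is then just another `acl`/`use_backend` pair.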

Because a load balancer is a single point of failure, you'll need to load balance your load balancer. The easiest way of doing that is to create several identical copies of HAProxy, as you'd do with any other web service, and add a cloud provider load balancer on top.

Because HAProxy is so versatile and fast, when properly configured, you can use it as a central point to redirect your requests—in true microservices fashion!