Creating a reverse proxy service in charge of routing requests depending on their base URLs

We can implement a reverse proxy in a couple of ways. One would be to create a new image based on HAProxy (https://hub.docker.com/_/haproxy/) and include the configuration files inside it. That approach works well if the number of different services is relatively static. Otherwise, we'd need to build a new image with a new configuration every time there is a new service (not a new release).
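
As a rough sketch of that first approach (the haproxy.cfg file and the my-proxy tag are hypothetical, not something we'll actually use), we would bake the configuration into a custom image and rebuild it on every change:

# Hypothetical sketch: bake a static configuration into a custom HAProxy image.
cat > Dockerfile <<'EOF'
FROM haproxy:1.6
# The official image expects its configuration at this path.
COPY haproxy.cfg /usr/local/etc/haproxy/haproxy.cfg
EOF
docker build -t my-proxy .
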
The second approach would be to expose a volume. That way, when needed, we could modify the configuration file instead of building a whole new image. However, that has downsides as well. When deploying to a cluster, we should avoid using volumes whenever they're not necessary. As you'll see soon, a proxy is one of the services that do not require a volume. As a side note, --volume has been replaced with the docker service argument --mount.
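
For illustration only, a volume-based proxy service might be created along the following lines; the host path /etc/haproxy and the haproxy:1.6 tag are assumptions, not something we'll run in this chapter:

# Hypothetical sketch of the volume-based approach (not used in this chapter).
docker service create --name proxy \
    -p 80:80 \
    -p 443:443 \
    --mount type=bind,source=/etc/haproxy,target=/usr/local/etc/haproxy \
    haproxy:1.6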

The third option is to use one of the proxies designed to work with Docker Swarm. In this case, we'll use the container vfarcic/docker-flow-proxy (https://hub.docker.com/r/vfarcic/docker-flow-proxy/). It is based on HAProxy with additional features that allow us to reconfigure it by sending HTTP requests.

Let's give it a spin.

The command that creates the proxy service is as follows:

docker service create --name proxy \
-p 80:80 \
-p 443:443 \
-p 8080:8080 \
--network proxy \
-e MODE=swarm \
vfarcic/docker-flow-proxy

We opened ports 80 and 443 that will serve Internet traffic (HTTP and HTTPS). The third port is 8080. We'll use it to send configuration requests to the proxy. Further on, we specified that it should belong to the proxy network. That way, since go-demo is also attached to the same network, the proxy can access it through the proxy-SDN.

Through the proxy we just ran, we can observe one of the cool features of the network routing mesh. It does not matter which server the proxy is running on. We can send a request to any of the nodes, and Docker networking will make sure that it is redirected to one of the proxies. We'll see that in action very soon.

The last argument is the environment variable MODE that tells the proxy that containers will be deployed to a Swarm cluster. Please consult the project README (https://github.com/vfarcic/docker-flow-proxy) for other combinations.

Figure 3-4: Docker Swarm cluster with the proxy service

Please note that the proxy, even though it is running inside one of the nodes, is placed outside to illustrate the logical separation better.
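
If you'd like to review how the service was defined, including the published ports, you can ask Docker for a summary of the service specification:

docker service inspect --pretty proxy
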

Before we move on, let's confirm that the proxy is running.

docker service ps proxy

We can proceed if the CURRENT STATE is Running. Otherwise, please wait until the service is up and running.
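
If you prefer to script the wait instead of re-running the command manually, a small loop like the following sketch will do (it reuses the same column parsing we'll apply later in this chapter):

# Poll until the task's CURRENT STATE (the sixth column) reports Running.
until docker service ps proxy | tail -n +2 | awk '{print $6}' | grep -q Running; do
    echo "Waiting for the proxy service..."
    sleep 2
done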

Now that the proxy is deployed, we should let it know about the existence of the go-demo service:

curl "$(docker-machine ip node-1):8080/v1/docker-flow-\
proxy/reconfigure?serviceName=go-demo&servicePath=/demo&port=8080"

The request was sent to reconfigure the proxy specifying the service name (go-demo), the base URL path of the API (/demo), and the internal port of the service (8080). From now on, all the requests to the proxy with a path that starts with /demo will be redirected to the go-demo service. This request is one of the additional features Docker Flow Proxy provides on top of HAProxy.
Please note that we sent the request to node-1. The proxy could be running inside any of the nodes and, yet, the request was successful. That is where Docker's Routing Mesh plays a critical role. We'll explore it in more detail later. For now, the important thing to note is that we can send a request to any of the nodes, and it will be redirected to the service that listens to the same port (in this case 8080).
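
If you'd like to see that for yourself, you can repeat the same reconfigure request against a different node (node-2 in this sketch, assuming your cluster has a node with that name); the routing mesh delivers it to the proxy just the same, and the response should be identical:

curl "$(docker-machine ip node-2):8080/v1/docker-flow-\
proxy/reconfigure?serviceName=go-demo&servicePath=/demo&port=8080"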

The output of the request is as follows (formatted for readability):

{
  "Mode": "swarm",
  "Status": "OK",
  "Message": "",
  "ServiceName": "go-demo",
  "AclName": "",
  "ConsulTemplateFePath": "",
  "ConsulTemplateBePath": "",
  "Distribute": false,
  "HttpsOnly": false,
  "HttpsPort": 0,
  "OutboundHostname": "",
  "PathType": "",
  "ReqMode": "http",
  "ReqRepReplace": "",
  "ReqRepSearch": "",
  "ReqPathReplace": "",
  "ReqPathSearch": "",
  "ServiceCert": "",
  "ServiceDomain": null,
  "SkipCheck": false,
  "TemplateBePath": "",
  "TemplateFePath": "",
  "TimeoutServer": "",
  "TimeoutTunnel": "",
  "Users": null,
  "ServiceColor": "",
  "ServicePort": "",
  "AclCondition": "",
  "FullServiceName": "",
  "Host": "",
  "LookupRetry": 0,
  "LookupRetryInterval": 0,
  "ServiceDest": [
    {
      "Port": "8080",
      "ServicePath": [
        "/demo"
      ],
      "SrcPort": 0,
      "SrcPortAcl": "",
      "SrcPortAclName": ""
    }
  ]
}

I won't go into details, but note that the Status is OK, indicating that the proxy was reconfigured correctly.

We can test that the proxy indeed works as expected by sending an HTTP request:

curl -i "$(docker-machine ip node-1)/demo/hello"

The output of the curl command is as follows.

HTTP/1.1 200 OK
Date: Thu, 01 Sep 2016 14:23:33 GMT
Content-Length: 14
Content-Type: text/plain; charset=utf-8

hello, world!

The proxy works! It responded with the HTTP status 200 and returned the API response hello, world!. As before, the request was not necessarily sent to the node that hosts the service but to the routing mesh, which forwarded it to the proxy.

As an example, let's send the same request but this time, to node-3:

curl -i "$(docker-machine ip node-3)/demo/hello"

The result is still the same.
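
In fact, we could query every node in the cluster and expect the same response from each. Here is a quick sketch, assuming the three nodes are named node-1 through node-3:

# Send the same request through each node; the routing mesh forwards all of them to the proxy.
for node in node-1 node-2 node-3; do
    curl -i "$(docker-machine ip $node)/demo/hello"
done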

Let's explore the configuration generated by the proxy. It will give us more insights into the Docker Swarm Networking inner workings. As another benefit, if you choose to roll your own proxy solution, it might be useful to understand how to configure the proxy and leverage new Docker networking features.

We'll start by examining the configuration Docker Flow Proxy (https://github.com/vfarcic/docker-flow-proxy) created for us. We can do that by entering the running container to take a sneak peek at the file /cfg/haproxy.cfg. The problem is that finding a container run by Docker Swarm is a bit tricky. If we deployed it with Docker Compose, the container name would be predictable. It would use the format <PROJECT>_<SERVICE>_<INDEX>.

The docker service command runs containers with hashed names. The docker-flow-proxy created on my laptop has the name proxy.1.e07jvhdb9e6s76mr9ol41u4sn. Therefore, to get inside a running container deployed with Docker Swarm, we need to use a filter with, for example, an image name.

First, we need to find out on which node the proxy is running. Execute the following command:

NODE=$(docker service ps proxy | tail -n +2 | awk '{print $4}')

We listed the proxy service processes (docker service ps proxy), removed the header (tail -n +2), and output the node that resides in the fourth column (awk '{print $4}'). The result is stored in the environment variable NODE.
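
As a side note, newer Docker releases can do the same lookup without parsing columns. The following is an equivalent sketch using the --format flag (not available in older versions):

# Equivalent lookup using a Go template instead of tail/awk (newer Docker only).
NODE=$(docker service ps proxy --format '{{.Node}}' | head -n 1)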

Now we can point our local Docker Engine to the node where the proxy resides:

eval $(docker-machine env $NODE)

Finally, the only thing left is to find the ID of the proxy container. We can do that with the following command:

ID=$(docker ps -q \
--filter label=com.docker.swarm.service.name=proxy)

Now that we have the container ID stored inside the variable, we can execute the command that will retrieve the HAProxy configuration:

docker exec -it \
$ID cat /cfg/haproxy.cfg

The important part of the configuration is as follows:

frontend services
    bind *:80
    bind *:443
    mode http

    acl url_go-demo8080 path_beg /demo
    use_backend go-demo-be8080 if url_go-demo8080

backend go-demo-be8080
    mode http
    server go-demo go-demo:8080

The first part, frontend, should be familiar to those who have used HAProxy. It accepts requests on ports 80 (HTTP) and 443 (HTTPS). If the path starts with /demo, the request is forwarded to the backend go-demo-be8080. Inside it, requests are sent to the address go-demo on the port 8080. The address is the same as the name of the service we deployed. Since go-demo belongs to the same network as the proxy, Docker will make sure that the request is redirected to the destination container. Neat, isn't it? There is no longer any need to specify IPs and external ports.
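
If you want to convince yourself that the go-demo name indeed resolves from within the proxy's network, you can try something along these lines (assuming the proxy image ships with the ping utility, which it might not):

# Hypothetical check: resolve and reach the go-demo service by name from inside the proxy container.
docker exec -it $ID ping -c 1 go-demo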

The next question is how to do load balancing. How should we specify that the proxy should, for example, perform round-robin across all instances? Should we use a proxy for such a task?