HAProxy’s load-balancing algorithms

HAProxy supports many load-balancing algorithms which may be used in many different type of cases.
That said, cache servers, which deliver most of the time the static content from your web applications, may require some specific load-balancing algorithms.

HAProxy stands in front of your cache server for some good reasons:

  • SSL offloading (read PHK’s feeling about SSL, Varnish and HAProxy)
  • HTTP content switching capabilities
  • advanced load-balancing algorithms

The main purpose of this article is to show how HAProxy can be used to aggregate Varnish servers memory storage in some kind of “JBOD” mode (like the “Just a Bunch Of Disks“).
Main purpose of the examples delivered here are to optimize the resources on the cache, mainly its memory, in order to improve the HIT rate. This will also improve your application response time and make your site top ranked on google 🙂

Content Switching in HAProxy

This has been covered many times on this blog.
As a quick introduction for readers who are not familiar with HAProxy, let’s explain how it works.

Clients will get connected to HAProxy through a Frontend. Then HAProxy routes traffic to a backend (server farm) where the load-balancing algorithm is used to choose a server.
A frontend can points to multiple backends and the choice of a backend is made through acls and use_backend rules..
Acls can be formed using fetches. A fetch is a directive which instructs HAProxy where to get content from.

Enough theory, let’s make a practical example: splitting static and dynamic traffic using the following rules:

  • Static content is hosted on domain names starting by ‘static.’ and ‘images.’
  • Static content files extensions are ‘.jpg’ ‘.png’ ‘.gif’ ‘.css’ ‘.js’
  • Static content can match any of the rule above
  • anything which is not static is considered as dynamic

The configuration sniplet below should be integrated into the HAProxy frontend. It matches the rules above to do traffic splitting. The varnish servers will stands in the bk_static farm.
[sourcecode language=”text”]
frontend ft_public
<frontend settings>
acl static_domain req.hdr_beg(Host) -i static. images.
acl static_content path_end -i .jpg .png .gif .css .js
use_backend bk_static if static_domain or static_content
default_backend bk_dynamic

backend bk_static
<parameters related to static content delivery>
[/sourcecode]

The configuration above creates 2 named acls ‘static_domain‘ and ‘static_content‘ which are used by the used_backend rule to route the traffic to varnish servers.

HAProxy and hash based load-balancing algotithm


Later in this article, we’ll heavily used the hash based load-balancing algorithms from HAProxy.
So a few information here (non exhaustive, it would deserve a long blog article) which will be useful for people wanting to understand what happens deep inside HAProxy.

The following parameters are taken into account when computing a hash algorithm:

  • number of servers in the farm
  • weight of each server in the farm
  • status of the servers (UP or DOWN)

If any of the parameter above changes, the whole hash computation also changes, hence request may hit an other server. This may lead to a negative impact on the response time of the application (during a short period of time).
Fortunately, HAProxy allows ‘consistent’ hashing, which means that only the traffic related to the change will be impacted.
That’s why you’ll see a lot of hash-type consistent directives in the configuration samples below.

Load-Balancing varnish cache server

Now, let’s focus on the magic we can add in the bk_static server farm.

Hashing the URL

HAProxy can hash the URL to pick up a server. With this load-balancing algorithm, we guarantee that a single URL will always hit the same Varnish server.

hashing the URL path only


In the example below, HAProxy hashes the URL path, which is from the first slash ‘/’ character up to the question mark ‘?’:
[sourcecode language=”text”]
backend bk_static
balance uri
hash-type consistent
[/sourcecode]

hashing the whole url, including the query string


In some cases, the query string may contain some variables in the query string, which means we must include the query string in the hash:
[sourcecode language=”text”]
backend bk_static
balance uri whole
hash-type consistent
[/sourcecode]

Query string parameter hash


That said, in some cases (API, etc…), hashing the whole URL is not enough. We may want to hash only on a particular query string parameter.
This applies well in cases where the client can forge itself the URL and all the parameters may be randomly ordered.
The configuration below tells HAProxy to apply the hash to the query string parameter named ‘id’ (IE: /image.php?width=512&id=12&height=256)
[sourcecode language=”text”]
backend bk_static
balance url_param id
hash-type consistent
[/sourcecode]

hash on a HTTP header


HAProxy can apply the hash to a specific HTTP header field.
The example below applies it on the Host header. This can be used for people hosting many domain names with a few pages, like users dedicated pages.
[sourcecode language=”text”]
backend bk_static
balance hdr(Host)
hash-type consistent
[/sourcecode]

Compose your own hash: concatenation of Host header and URL


Nowadays, HAProxy becomes more and more flexible and we can use this flexibility in its configuration.
Imagine, in your varnish configuration, you have a storage hash key based on the concatenation of the host header and the URI, then you may want to apply the same load-balancing algorithm into HAProxy, to optimize your caches.

The configuration below creates a new HTTP header field named X-LB which contains the host header (converted to lowercase) concatenated to the request uri (converted in lowercase too).
[sourcecode language=”text”]
backend bk_static
http-request set-header X-LB %[req.hdr(Host),lower]%[req.uri,lower]
balance hdr(X-LB)
hash-type consistent
[/sourcecode]

Conclusion


HAProxy and Varnish works very well together. Each soft can benefit from performance and flexibility of the other one.

Links