HAProxy’s load-balancing algorithms
HAProxy supports many load-balancing algorithms, which cover a wide range of use cases.
That said, cache servers, which most of the time deliver the static content of your web applications, may require some specific load-balancing algorithms.
HAProxy stands in front of your cache server for some good reasons:
- SSL offloading (read PHK’s feeling about SSL, Varnish and HAProxy)
- HTTP content switching capabilities
- advanced load-balancing algorithms
The main purpose of this article is to show how HAProxy can be used to aggregate the memory storage of several Varnish servers in a kind of « JBOD » mode (as in « Just a Bunch Of Disks »).
The examples delivered here aim to optimize the resources of the caches, mainly their memory, in order to improve the HIT rate. This should also improve your application response time and may even help your site's ranking on Google 🙂
Content Switching in HAProxy
This has been covered many times on this blog.
As a quick introduction for readers who are not familiar with HAProxy, let’s explain how it works.
Clients connect to HAProxy through a frontend. HAProxy then routes the traffic to a backend (server farm), where the load-balancing algorithm is used to choose a server.
A frontend can point to multiple backends, and the choice of a backend is made through ACLs and use_backend rules.
ACLs are built using fetches. A fetch is a directive which tells HAProxy where to get content from.
Enough theory, let’s work through a practical example: splitting static and dynamic traffic using the following rules:
- Static content is hosted on domain names starting with ‘static.’ and ‘images.’
- Static content file extensions are ‘.jpg’ ‘.png’ ‘.gif’ ‘.css’ ‘.js’
- Static content can match either of the rules above
- Anything which is not static is considered dynamic
The configuration snippet below should be integrated into the HAProxy frontend. It implements the rules above to split the traffic. The Varnish servers will stand in the bk_static farm.
frontend ft_public
  <frontend settings>
  acl static_domain  req.hdr_beg(Host) -i static. images.
  acl static_content path_end -i .jpg .png .gif .css .js
  use_backend bk_static if static_domain or static_content
  default_backend bk_dynamic

backend bk_static
  <parameters related to static content delivery>
The configuration above creates two named ACLs, ‘static_domain’ and ‘static_content’, which are used by the use_backend rule to route the traffic to the Varnish servers.
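To make the traffic split easier to picture end to end, here is a minimal sketch with the placeholders filled in. Everything that does not appear in the rules above is an assumption made for the example: the listening port, the server names, the addresses (taken from a documentation range) and the ports. The host match is written with the classic hdr_beg ACL keyword, which is only one way to express it. A defaults section with timeouts is left out for brevity.

```haproxy
# Sketch only: ports, server names and addresses are placeholders.
frontend ft_public
  bind :80
  mode http
  # Host header starts with "static." or "images." (case-insensitive)
  acl static_domain  hdr_beg(Host) -i static. images.
  # URL path ends with one of the static file extensions
  acl static_content path_end -i .jpg .png .gif .css .js
  # Either rule is enough to send the request to the Varnish farm
  use_backend bk_static if static_domain or static_content
  default_backend bk_dynamic

backend bk_static
  mode http
  server varnish1 192.0.2.11:6081 check
  server varnish2 192.0.2.12:6081 check

backend bk_dynamic
  mode http
  server app1 192.0.2.21:8080 check
```

No balance directive is set in bk_static at this stage; the load-balancing algorithm for this farm is the topic of the rest of the article.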
HAProxy and hash-based load-balancing algorithms
Later in this article, we’ll make heavy use of the hash-based load-balancing algorithms from HAProxy.
So here is some information (not exhaustive, the topic would deserve a long blog article of its own) which will be useful for people wanting to understand what happens deep inside HAProxy.
The following parameters are taken into account when computing a hash:
- number of servers in the farm
- weight of each server in the farm
- status of the servers (UP or DOWN)
If any of the parameters above changes, the whole hash computation also changes, hence a request may hit another server. For a cache farm, that server may not hold the object yet, which can hurt the response time of the application for a short period of time.
Fortunately, HAProxy allows ‘consistent’ hashing, which means that only the traffic related to the change will be impacted.
That’s why you’ll see a lot of hash-type consistent directives in the configuration samples below.
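As a rough sketch of where this setting lives, the backend below shows both hash types side by side. The server names, addresses and weights are placeholders invented for the example. To our understanding of the HAProxy documentation, map-based is the default hash type, so it only needs to be written out if you want to be explicit.

```haproxy
# Sketch only: server names, addresses and weights are placeholders.
backend bk_static
  balance uri

  # Default behaviour (map-based): a change in the number, weight or
  # status of the servers can move most URLs to a different server.
  # hash-type map-based

  # Consistent hashing: when a server goes UP or DOWN, only the URLs
  # attached to that server move to another one.
  hash-type consistent

  server varnish1 192.0.2.11:6081 weight 100 check
  server varnish2 192.0.2.12:6081 weight 100 check
  server varnish3 192.0.2.13:6081 weight 100 check
```

The trade-off, as far as we know, is that consistent hashing tends to spread the load a little less evenly than map-based hashing, so it is worth watching the per-server request counts after enabling it.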
Load-balancing Varnish cache servers
Now, let’s focus on the magic we can add in the bk_static server farm.
Hashing the URL
HAProxy can hash the URL to pick a server. With this load-balancing algorithm, we guarantee that a single URL will always hit the same Varnish server.
hashing the URL path only
In the example below, HAProxy hashes the URL path, which is from the first slash ‘/’ character up to the question mark ‘?’:
backend bk_static
  balance uri
  hash-type consistent
hashing the whole URL, including the query string
In some cases, the query string carries variables which change the content being served, which means we must include it in the hash:
backend bk_static
  balance uri whole
  hash-type consistent
Query string parameter hash
That said, in some cases (API, etc…), hashing the whole URL is not enough. We may want to hash only on a particular query string parameter.
This applies well in cases where the clients build the URL themselves and the parameters may appear in any order.
The configuration below tells HAProxy to apply the hash to the query string parameter named ‘id’ (e.g. /image.php?width=512&id=12&height=256):
backend bk_static
  balance url_param id
  hash-type consistent
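To illustrate why this helps with randomly ordered parameters, the sketch below annotates the same backend with two request URLs that differ only in parameter order. The server names and addresses are placeholders, and the second URL is made up for the example.

```haproxy
# Sketch only: server names and addresses are placeholders.
backend bk_static
  # Only the value of the "id" parameter feeds the hash, so both
  #   /image.php?width=512&id=12&height=256
  #   /image.php?height=256&id=12&width=512
  # hash on "12" and reach the same Varnish server.
  balance url_param id
  hash-type consistent
  server varnish1 192.0.2.11:6081 check
  server varnish2 192.0.2.12:6081 check
```

Two points are worth keeping in mind. First, every variant of id=12 (any width or height) lands on the same server, which is the intended effect here. Second, according to our reading of the HAProxy documentation, requests which do not carry the parameter at all fall back to round robin, so they get no cache affinity.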
Hash on an HTTP header
HAProxy can apply the hash to a specific HTTP header field.
The example below applies it to the Host header. This can be useful for people hosting many domain names with only a few pages each, such as per-user dedicated pages.
backend bk_static
  balance hdr(Host)
  hash-type consistent
Compose your own hash: concatenation of Host header and URL
HAProxy keeps becoming more and more flexible, and we can take advantage of this flexibility in its configuration.
Imagine that your Varnish configuration uses a storage hash key based on the concatenation of the Host header and the URI. You may then want to apply the same logic to the load-balancing algorithm in HAProxy, to optimize your caches.
The configuration below creates a new HTTP header field named X-LB which contains the Host header (converted to lowercase) concatenated with the request URI (converted to lowercase too).
backend bk_static
  http-request set-header X-LB %[req.hdr(Host),lower]%[req.uri,lower]
  balance hdr(X-LB)
  hash-type consistent
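The configuration above relies on a fetch named req.uri. If your HAProxy version does not recognize that name, the same idea can be sketched with the url fetch, which returns the request URL as presented in the request. This is a variant under that assumption, not a replacement for the configuration above, and the server names and addresses are placeholders.

```haproxy
# Sketch only: variant using the "url" fetch; servers are placeholders.
backend bk_static
  # Build the hash key: lowercased Host header + lowercased request URL
  http-request set-header X-LB %[req.hdr(Host),lower]%[url,lower]
  # Hash on the composed header rather than on the URL alone
  balance hdr(X-LB)
  hash-type consistent
  server varnish1 192.0.2.11:6081 check
  server varnish2 192.0.2.12:6081 check
```

Lowercasing the URL assumes that your application treats paths as case-insensitive. If it does not, two distinct objects would still be stored separately by Varnish, and only their placement on the farm would be shared.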
HAProxy and Varnish work very well together. Each piece of software can benefit from the performance and flexibility of the other.