HAProxy Enterprise Documentation 2.2r1

Queue connections to servers

Rate limiting involves dropping requests when request rates or request counts exceed defined thresholds. While this approach can be tuned to provide significant benefits, you can often use connection queueing instead to achieve the desired level of fairness without resorting to rate limiting.

With connection queueing, the proxy stores excess connections until the web servers are free to handle them. HAProxy Enterprise is designed to hold a large number of queued connections without a sharp increase in memory or CPU usage.

Use the maxconn parameter to specify the maximum number of concurrent connections that HAProxy Enterprise will send to a given web server.

In the following example, up to 30 connections will be sent to each server. Once all servers reach their maximum, the connections queue up in HAProxy Enterprise:

backend servers
     server s1 192.168.30.10:80 check  maxconn 30
     server s2 192.168.31.10:80 check  maxconn 30
     server s3 192.168.32.10:80 check  maxconn 30

With this configuration, at most 90 connections can be active at a time. New connections will be queued on the proxy until an active connection closes.
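
For context, a backend like the one above is typically paired with a frontend that accepts client connections and routes them to it. A minimal sketch (the frontend name and bind port here are illustrative assumptions, not part of the example above):

frontend www
     bind :80
     default_backend servers

Client connections accepted on port 80 are dispatched to the servers backend, where the per-server maxconn limits and queueing described above apply.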

To limit how long clients can wait in the queue, add the timeout queue directive:

backend servers
     timeout queue 10s
     server s1 192.168.30.10:80 check  maxconn 30
     server s2 192.168.31.10:80 check  maxconn 30
     server s3 192.168.32.10:80 check  maxconn 30

If a connection still cannot be dispatched to a server within the timeout period, the client receives a 503 Service Unavailable error. This is generally more desirable than allowing servers to become overwhelmed: from the client's perspective, a prompt error that can be handled programmatically is better than an extended wait that may end in failures that are more difficult to resolve.
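
If the default 503 response is not suitable, you can serve a custom error page with the errorfile directive, which replaces the response for a given status code with the contents of a file containing a complete raw HTTP response. A sketch, assuming such a file exists at the path shown:

backend servers
     timeout queue 10s
     errorfile 503 /etc/haproxy/errors/503.http
     server s1 192.168.30.10:80 check  maxconn 30
     server s2 192.168.31.10:80 check  maxconn 30
     server s3 192.168.32.10:80 check  maxconn 30

With this in place, clients whose connections time out in the queue receive your custom page instead of the default 503 body.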


Next up

Limit HTTP requests per day