Queue connections to servers
Rate limiting involves dropping requests when request rates or request counts exceed defined thresholds. While this approach can be tuned to provide significant benefits, connection queueing can often achieve the desired level of fairness without dropping any requests at all.
With connection queueing, the proxy stores excess connections until the web servers are freed up to handle them. HAProxy Enterprise is designed to hold lots of connections without a sharp increase in memory or CPU usage.
Use the maxconn parameter to specify the maximum number of concurrent connections that will be sent to a web server.
In the following example, up to 30 connections will be sent to each server. Once all servers reach their maximum, the connections queue up in HAProxy Enterprise:
backend servers
    server s1 192.168.30.10:80 check maxconn 30
    server s2 192.168.31.10:80 check maxconn 30
    server s3 192.168.32.10:80 check maxconn 30
With this configuration, at most 90 connections can be active at a time. New connections will be queued on the proxy until an active connection closes.
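To verify that queueing behaves as expected, you can inspect the qcur (current queued requests) field that HAProxy's Runtime API reports for each server in its show stat CSV output. The sketch below parses that CSV in Python; the sample string is illustrative and trimmed to a few columns, and the exact way you invoke the Runtime API (for example, a command piped to the stats socket) depends on your installation.

```python
import csv
import io

def queued_per_server(show_stat_csv: str) -> dict:
    """Parse HAProxy 'show stat' CSV and return {svname: qcur} for servers.

    The first line is a header beginning with '# pxname,svname,qcur,...'.
    """
    # Strip the leading '# ' so csv.DictReader can use the header row.
    cleaned = show_stat_csv.lstrip().replace("# pxname", "pxname", 1)
    reader = csv.DictReader(io.StringIO(cleaned))
    queued = {}
    for row in reader:
        svname = row.get("svname", "")
        # Skip the FRONTEND/BACKEND aggregate rows; keep real servers.
        if not svname or svname in ("FRONTEND", "BACKEND"):
            continue
        queued[svname] = int(row.get("qcur") or 0)
    return queued

# Illustrative sample trimmed to the first columns (real output has many more).
sample = """# pxname,svname,qcur,qmax,scur,smax,
servers,s1,4,10,30,30,
servers,s2,0,8,30,30,
servers,BACKEND,4,10,90,90,
"""
print(queued_per_server(sample))  # {'s1': 4, 's2': 0}
```

A steadily growing qcur indicates that the servers are saturated and connections are accumulating in the queue.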
To define how long clients should be queued, add the timeout queue directive:
backend servers
    timeout queue 10s
    server s1 192.168.30.10:80 check maxconn 30
    server s2 192.168.31.10:80 check maxconn 30
    server s3 192.168.32.10:80 check maxconn 30
If a queued connection still cannot be dispatched to a server within the timeout queue period, the client receives a 503 Service Unavailable error. From the client's perspective, a prompt error that can be handled programmatically is preferable to waiting an extended time and possibly hitting failures that are harder to diagnose, and it keeps the servers from becoming overwhelmed.
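One way a client can handle that 503 programmatically is to retry with exponential backoff. The sketch below is an illustrative client-side pattern, not part of HAProxy itself; the send argument is a hypothetical stand-in for whatever HTTP call your client makes, assumed to return a (status_code, body) pair.

```python
import time

def request_with_retry(send, retries=3, backoff=0.5):
    """Call send() and retry on 503, doubling the delay each attempt.

    'send' is any zero-argument callable returning (status_code, body).
    """
    delay = backoff
    for attempt in range(retries + 1):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < retries:
            time.sleep(delay)
            delay *= 2  # exponential backoff between retries
    return status, body

# Hypothetical stand-in for an HTTP call: 503 twice, then success.
calls = {"n": 0}
def fake_send():
    calls["n"] += 1
    return (503, "") if calls["n"] < 3 else (200, "ok")

print(request_with_retry(fake_send, backoff=0.01))  # (200, 'ok')
```

Backing off between retries matters here: retrying immediately would only lengthen the very queue that caused the 503.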
Next up
Limit HTTP requests per day