Administration
Performance tuning HAProxy Enterprise
In this guide, we’ll describe ways to get optimal performance from your HAProxy Enterprise load balancer under heavy load, ensuring fast, seamless, and scalable operation. We recommend that you implement these best practices before executing any performance testing or benchmarking, and before going live with your system:
- Configure the load balancer for high availability
- Grouping multiple load balancers into a cluster allows you to scale out your load balancing capacity. It also facilitates failover and reduces the risk of service interruption.
- Tune the load balancer settings for performance
- Tuning your operating system settings can provide the best performance for the load balancer. In addition, there are settings within the load balancer configuration that you should change to accompany the OS settings.
After you’ve tuned the load balancer for performance, you can then monitor it under high load to confirm that it and the kernel are performing optimally together.
Configure the load balancer for high availability
You can configure multiple load balancer instances for high availability in one of the following modes:
- Active/Active clustering
- In this mode, two load balancers run, and both are active and receive traffic at the same time.
- Active/Standby clustering
- In this mode, two load balancers run, but only one will be active and receive traffic at a time.
Tune your settings for performance
Your load balancer’s settings and your kernel’s settings must work together to achieve optimal performance. You will use `sysctl` to set the kernel settings, and you will set the load balancer settings in the HAProxy Enterprise configuration file.
Tune the operating system
We provide VM images for OpenStack, VMware, AWS, and Azure. These have the recommended kernel settings for the load balancer applied automatically. If you aren’t using one of these, you’ll need to manually configure the settings.
Use `sysctl`, a Linux program for reading and modifying the attributes of the system kernel, to set kernel settings to the values we recommend. To tune your system:
- Edit the file that begins with `30-hapee` in the `/etc/sysctl.d/` directory. This file contains kernel settings for HAProxy Enterprise that are, by default, disabled. Enable the recommended settings by un-commenting them (remove the leading hash sign). The example below is for version 3.1:

Info

The recommended settings can differ across versions. The settings present in the file on your system correspond to the recommended settings for your installed version. More information is located above each setting in the file.
/etc/sysctl.d/30-hapee-3.1.conf

```text
#### HAPEE-3.1 : recommended settings for best performance. Uncomment the
#### lines you'd like to enable.
#### To reload: systemctl restart systemd-sysctl
####         or service procps start
####         or sysctl -p /etc/sysctl.d/*.conf

# Limit the per-socket default receive/send buffers to limit memory usage
# when running with a lot of concurrent connections. Values are in bytes
# and represent minimum, default and maximum. Defaults: 4096 87380 4194304
#
# net.ipv4.tcp_rmem = 4096 16060 262144
# net.ipv4.tcp_wmem = 4096 16384 262144

# Allow early reuse of a same source port for outgoing connections. It is
# required above a few hundred connections per second. Defaults: 0
#net.ipv4.tcp_tw_reuse = 1

# Extend the source port range for outgoing TCP connections. This limits early
# port reuse and makes use of 64000 source ports. Defaults: 32768 61000
#net.ipv4.ip_local_port_range = 1024 65023

# Increase the TCP SYN backlog size. This is generally required to support very
# high connection rates as well as to resist SYN flood attacks. Setting it too
# high will delay SYN cookie usage though. Defaults: 1024
#net.ipv4.tcp_max_syn_backlog = 60000

# Timeout in seconds for the TCP FIN_WAIT state. Lowering it speeds up release
# of dead connections, though it will cause issues below 25-30 seconds. It is
# preferable not to change it if possible. Default: 60
#net.ipv4.tcp_fin_timeout = 30

# Limit the number of outgoing SYN-ACK retries. This value is a direct
# amplification factor of SYN floods, so it is important to keep it reasonably
# low. However, too low will prevent clients on lossy networks from connecting.
# Using 3 as a default value gives good results (4 SYN-ACK total) and lowering
# it to 1 under SYN flood attack can save a lot of bandwidth. Default: 5
#net.ipv4.tcp_synack_retries = 3

# Set this to one to allow local processes to bind to an IP which is not yet
# present on the system. This is typically what happens with a shared VRRP
# address, where you want both master and backup to be started eventhough the
# IP is not yet present. Always leave it to 1. Default: 0
#net.ipv4.ip_nonlocal_bind = 1
net.ipv6.ip_nonlocal_bind = 1

# Serves as a higher bound for all of the system's SYN backlogs. Put it at
# least as high as tcp_max_syn_backlog, otherwise clients may experience
# difficulties to connect at high rates or under SYN attacks. Default: 128
#net.core.somaxconn = 60000

# Number of unprocessed incoming packets that can be queued for later
# processing. This has minimal effect. Default: 1000
net.core.netdev_max_backlog = 10000

#### HAPEE-3.1 : end of recommended settings.
```

Caution
Be mindful of the port range you specify in `net.ipv4.ip_local_port_range`. The lower port number should be higher than the highest port on which other services on your system may be listening. For example, if the highest port your other services on the same machine listen on is 3306, the lower port number in the range should be higher than 3306.
- Reload the file using the command `systemctl restart systemd-sysctl`. Systemd will apply the changes, and these changes will persist after a reboot.

```nix
sudo systemctl restart systemd-sysctl
```

This command produces no output.
Tip

To test a `sysctl` setting without persisting it after a reboot, set it individually using the `sysctl -w` command. For example:

```nix
sudo sysctl -w net.ipv4.ip_local_port_range="1024 65023"
```
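To confirm the value currently in effect, you can query the setting by name; running `sysctl` without `-w` reads the value instead of writing it:

```nix
sysctl net.ipv4.ip_local_port_range
```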
Tip

To change a `sysctl` value when running a Docker container, add the `--sysctl` parameter to your `docker run` command with the name of the kernel setting and its value, for example:

```nix
--sysctl net.ipv4.tcp_fin_timeout=30
```
- Optionally, disable swap for best performance, as shown in the sketch below.
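How you disable swap depends on your distribution. As a minimal sketch for a typical Linux system (the `sed` pattern assumes standard swap entries in `/etc/fstab`):

```nix
# Turn off all swap devices immediately (does not persist across reboots).
sudo swapoff -a

# Make the change permanent by commenting out swap entries in /etc/fstab.
sudo sed -i.bak '/^[^#].*\sswap\s/ s/^/#/' /etc/fstab
```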
Tune the load balancer settings
In this section, we’ll show changes you can make to the load balancer configuration to improve performance.
If, after you have tuned your kernel settings and your load balancer settings, your monitoring shows that load balancer performance is still insufficient, you may be able to optimize the load balancer settings for your hardware to improve performance further.
Configure the load balancer for high traffic volume
Use these configuration directives to tune the load balancer’s behavior. Consider their values carefully, as they directly impact CPU and memory usage. These directives most often need adjustment, as their optimal values depend on your traffic volume.
| Option | Description |
|---|---|
| `maxconn` | This defines the maximum number of per-process, concurrent TCP connections. If the number of concurrent connections exceeds the value set by `maxconn`, the load balancer stops accepting new connections until existing connections close; excess connections queue in the kernel. If you see that the load balancer frequently approaches the default `maxconn` value, consider increasing it. See the overload protection configuration tutorial for more information about configuring maximum connection limits. Once you have configured your `maxconn` value, monitor the load balancer under load (for example, by watching `CurrConns`) to confirm that the value suits your traffic. |
| `fd-hard-limit` | This setting limits the number of file descriptors the load balancer can use. When your monitoring shows that the load balancer is accepting fewer connections than expected, it may be because you’ve specified a value for `fd-hard-limit` that is too low. Be aware that you must take into account the file descriptor limits of your system when setting a value for `fd-hard-limit`. |
| `tune.pool-high-fd-ratio` | Use `tune.pool-high-fd-ratio` to set the maximum percentage of file descriptors the load balancer may devote to idle connections kept open for reuse. Reusing connections is beneficial, as reusing a connection uses less CPU than creating a new connection. The drawback is that more idle connections available for reuse ultimately means that the load balancer uses more file descriptors, which uses more memory. The tradeoff here is that you use more memory in exchange for using less CPU. This number of file descriptors often translates to the number of simultaneous connections, though with many threads, this isn’t 1:1. For a more detailed explanation of how the load balancer manages sockets across multiple threads, see the configuration tutorial for performance optimization for large systems. If you see high CPU usage, and the load balancer is creating many more new connections than it’s reusing, you may see improvement by setting `tune.pool-high-fd-ratio` to a higher value. |
| `tune.pool-low-fd-ratio` | Use `tune.pool-low-fd-ratio` to set the lower threshold, as a percentage of available file descriptors, at which the load balancer begins limiting the number of idle connections it keeps open for reuse. It works together with `tune.pool-high-fd-ratio`; see the reference guide for performance tuning for details. |
| `http-reuse` | You can specify how you want connections to backend servers to be shared between HTTP requests. Keeping idle connections open for reuse reduces the amount of overhead required for initiating new connections to backend servers. This defaults to `safe`, in which the first request of a session always gets its own connection and only subsequent requests may reuse idle connections. Setting this to `aggressive` also allows first requests to reuse idle connections that have proven reusable by carrying more than one request. Setting this to `always` allows first requests to reuse any idle connection, which maximizes reuse but risks failures if the server has silently closed the connection. |
| `tune.bufsize` | We recommend that you don’t change the default for `tune.bufsize`, which sets the buffer size in bytes. If you increase this value, you should decrease your `maxconn` value by the same factor that you increase `tune.bufsize`. This setting has a direct impact on the amount of memory the load balancer uses, but you may need to increase it if large HTTP requests cause errors. |
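To illustrate how these directives fit together, here is a sketch of a configuration that uses several of them. The values, names, and addresses are placeholders for illustration, not recommendations; size them to your own traffic, hardware, and system limits:

```haproxy
global
    # Cap concurrent connections per process; size to your expected peak.
    maxconn 100000
    # Cap total file descriptor usage; must fit within system limits.
    fd-hard-limit 250000
    # Let idle (reusable) connections consume up to 30% of file descriptors.
    tune.pool-high-fd-ratio 30

defaults
    mode http
    timeout connect 5s
    timeout client 30s
    timeout server 30s

backend servers
    # Reuse idle server-side connections when safe to do so (the default).
    http-reuse safe
    server web1 192.168.0.10:80
```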
For more information about these settings and other tunables that impact performance, see the reference guide for performance tuning.
Tip
Also consider enabling HTTP/3 (QUIC) on your frontends, as this will enhance performance for connections with clients that support it.
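As a sketch, a frontend that accepts HTTP/3 over QUIC alongside HTTP/1.1 and HTTP/2 might look like the following; the certificate path and backend name are placeholders:

```haproxy
frontend www
    # TCP listener for HTTP/1.1 and HTTP/2
    bind :443 ssl crt /path/to/cert.pem alpn h2,http/1.1
    # UDP (QUIC) listener for HTTP/3 on the same port
    bind quic4@:443 ssl crt /path/to/cert.pem alpn h3
    # Advertise HTTP/3 so clients know they can switch to it
    http-response set-header alt-svc "h3=\":443\"; ma=900"
    default_backend servers
```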
Optimize the load balancer settings for your hardware
As of version 2.4, the load balancer automatically optimizes its operations specifically for your hardware, and generally you don’t need to tune how it interacts with your CPUs, though there may be cases where you should consider enabling some additional settings. Starting in version 3.2, there are some new options for tuning these behaviors. See our configuration tutorial for performance optimization for large systems for more information and to determine which settings are best for your hardware.
Monitor the load balancer under high load
You can use the HAProxy Runtime API to gather statistics about the load balancer’s performance.
To benchmark the load balancer:
- Make sure that you have tuned your settings for high performance by implementing the recommended settings for the load balancer and for the kernel, as described above.
- On a regular interval (such as one second), call the `show info` command to retrieve the current load balancer statistics, and record the results of each call. Here, we use the `watch` utility to make this call once per second; for a persistent record, see the logging sketch after these steps:

```nix
watch -n 1 'echo "show info" | \
sudo socat stdio tcp4-connect:127.0.0.1:9999'
```

The following statistics provide insights into how the load balancer is performing:
  - `Idle_pct`: This is the percentage of CPU that is still idle; the portion in use is usually the amount the load balancer is using across all of its threads. A value greater than `50` generally indicates that performance issues are not the result of the hardware. A value less than `50`, and even more so a value less than `20`, indicates that there may be CPU limitations. This is one case where you may need to consider optimizing the load balancer settings for your hardware.
  - `CurrConns`: This is the number of currently established connections the load balancer is processing. Monitor this value to make sure it’s less than your `maxconn` value.
  - `SessRate`: This is the number of requests per second.
  - `SslRate`: This is the number of requests over SSL/TLS per second.
  - `SslFrontendKeyRate`: This is the number of SSL/TLS keys calculated per second.
The example below shows the output of `show info` on version 3.1. The statistics present in the output may vary per version.

```text
Name: hapee-lb
Version: 3.1.0-1.0.0-348.519
Release_date: 2025/08/26
Nbthread: 22
Nbproc: 1
Process_num: 1
Pid: 8
Uptime: 0d 0h08m16s
Uptime_sec: 496
Memmax_MB: 0
PoolAlloc_MB: 636
PoolUsed_MB: 636
PoolFailed: 0
Ulimit-n: 1000429
Maxsock: 1000429
Maxconn: 500001
Hard_maxconn: 500001
CurrConns: 263462
CumConns: 1594300
CumReq: 7025819
MaxSslConns: 0
CurrSslConns: 189296
CumSslConns: 1399631
Maxpipes: 0
PipesUsed: 0
PipesFree: 0
ConnRate: 2278
ConnRateLimit: 0
MaxConnRate: 22813
SessRate: 2277
SessRateLimit: 0
MaxSessRate: 23383
SslRate: 2062
SslRateLimit: 0
MaxSslRate: 22889
SslFrontendKeyRate: 1098
SslFrontendMaxKeyRate: 8125
SslFrontendSessionReuse_pct: 46
SslBackendKeyRate: 0
SslBackendMaxKeyRate: 0
SslCacheLookups: 92
SslCacheMisses: 89
CompressBpsIn: 0
CompressBpsOut: 0
CompressBpsRateLim: 0
Tasks: 265114
Run_queue: 4
Idle_pct: 50
node: 7de1a8dbdbbf
Stopping: 0
Jobs: 263546
Unstoppable Jobs: 1
Listeners: 77
ActivePeers: 2
ConnectedPeers: 2
DroppedLogs: 0
BusyPolling: 0
FailedResolutions: 0
TotalBytesOut: 22954321242
TotalSplicedBytesOut: 0
BytesOutRate: 112438176
DebugCommandsIssued: 0
CumRecvLogs: 0
Build info: 3.1.0-1.0.0-348.519
Memmax_bytes: 0
PoolAlloc_bytes: 666931408
PoolUsed_bytes: 666931408
Start_time_sec: 1746127957
Tainted: 0
TotalWarnings: 0
MaxconnReached: 0
BootTime_ms: 37
Niced_tasks: 0
CurrStreams: 1
CumStreams: 2
BlockedTrafficWarnings: 0
```
- Run your traffic load test.
- After your load test has finished, you can discontinue collecting the statistics.
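If you want a persistent record rather than the live view that `watch` provides, a small shell loop can append selected counters from `show info` to a file for later analysis. This is a minimal sketch, assuming the Runtime API is listening on 127.0.0.1:9999 as above; adjust the field names, file name, and interval to suit:

```nix
#!/bin/sh
# Append a timestamped sample of selected "show info" counters every second.
while true; do
  printf '%s ' "$(date -Is)" >> haproxy-stats.log
  echo "show info" \
    | sudo socat stdio tcp4-connect:127.0.0.1:9999 \
    | grep -E '^(Idle_pct|CurrConns|SessRate|SslRate|SslFrontendKeyRate):' \
    | tr '\n' ' ' >> haproxy-stats.log
  echo >> haproxy-stats.log
  sleep 1
done
```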
See also
- For complete information about performance tuning configuration options, see the performance tuning reference.
- For a complete guide on multithreading in HAProxy and performance tuning for your hardware, see performance optimization for large systems.
- For more information about `maxconn`, see the maxconn reference.
- For more information about `tune.bufsize`, see the tune.bufsize reference.
- For more information about `http-reuse`, see the http-reuse reference.
- For more information about the `show info` HAProxy Runtime API command, see show info. Related commands include:
  - show sess for streams
  - show dev, which shows file descriptor limits
  - show pools, which shows information about the load balancer’s memory pools
- For more information about high availability, see the tutorials for active/active clustering and active/standby clustering.