Performance tuning HAProxy Enterprise

In this guide, we’ll describe ways to get optimal performance from your HAProxy Enterprise load balancer when it’s under heavy load. This ensures fast, seamless, and scalable operation of the load balancer. We recommend that you implement these best practices before executing any performance testing or benchmarking, and before going live with your system:

  • Configure the load balancer for high availability
    • When you group multiple load balancers into a cluster, it allows you to scale out your load balancing capacity. It also facilitates failover and reduces the risk of service interruption.
  • Tune the load balancer settings for performance
    • Tuning your operating system settings can provide the best performance for the load balancer. In addition, there are settings within the load balancer configuration that you should change to accompany the OS settings.

After you’ve tuned the load balancer for performance, you can then monitor it under high load to confirm that it and the kernel are performing optimally together.

Configure the load balancer for high availability

You can configure multiple load balancer instances for high availability in one of the following modes:

  • Active/Active clustering
    • In this mode, two or more load balancers run, and all of them are active and receive traffic at the same time.
  • Active/Standby clustering
    • In this mode, two load balancers run, but only one will be active and receive traffic at a time.

Tune your settings for performance

Your load balancer’s settings and your kernel’s settings must work together to achieve optimal performance. You will use sysctl to set the kernel settings, and you will set the load balancer settings in the HAProxy Enterprise configuration file.

Tune the operating system

We provide VM images for OpenStack, VMware, AWS, and Azure. These have the recommended kernel settings for the load balancer applied automatically. If you aren’t using one of these, you’ll need to manually configure the settings.

Use sysctl, a Linux program for reading and modifying the attributes of the system kernel, to set kernel settings to the values we recommend. To tune your system:

  1. Edit the file that begins with 30-hapee in the /etc/sysctl.d/ directory. This file contains kernel settings for HAProxy Enterprise that are disabled by default. Enable the recommended settings by uncommenting them (remove the leading hash sign). The example below is for version 3.1:

    Info

    The recommended settings could be different across versions. The settings present in the file on your system correspond to the recommended settings for your installed version. More information is located above each setting in the file.

    /etc/sysctl.d/30-hapee-3.1.conf
    text
    #### HAPEE-3.1 : recommended settings for best performance. Uncomment the
    #### lines you'd like to enable.
    #### To reload: systemctl restart systemd-sysctl
    #### or service procps start
    #### or sysctl -p /etc/sysctl.d/*.conf
    # Limit the per-socket default receive/send buffers to limit memory usage
    # when running with a lot of concurrent connections. Values are in bytes
    # and represent minimum, default and maximum. Defaults: 4096 87380 4194304
    #
    # net.ipv4.tcp_rmem = 4096 16060 262144
    # net.ipv4.tcp_wmem = 4096 16384 262144
    # Allow early reuse of a same source port for outgoing connections. It is
    # required above a few hundred connections per second. Defaults: 0
    #
    net.ipv4.tcp_tw_reuse = 1
    # Extend the source port range for outgoing TCP connections. This limits early
    # port reuse and makes use of 64000 source ports. Defaults: 32768 61000
    #
    net.ipv4.ip_local_port_range = 1024 65023
    # Increase the TCP SYN backlog size. This is generally required to support very
    # high connection rates as well as to resist SYN flood attacks. Setting it too
    # high will delay SYN cookie usage though. Defaults: 1024
    #
    net.ipv4.tcp_max_syn_backlog = 60000
    # Timeout in seconds for the TCP FIN_WAIT state. Lowering it speeds up release
    # of dead connections, though it will cause issues below 25-30 seconds. It is
    # preferable not to change it if possible. Default: 60
    #
    net.ipv4.tcp_fin_timeout = 30
    # Limit the number of outgoing SYN-ACK retries. This value is a direct
    # amplification factor of SYN floods, so it is important to keep it reasonably
    # low. However, too low will prevent clients on lossy networks from connecting.
    # Using 3 as a default value gives good results (4 SYN-ACK total) and lowering
    # it to 1 under SYN flood attack can save a lot of bandwidth. Default: 5
    #
    net.ipv4.tcp_synack_retries = 3
    # Set this to one to allow local processes to bind to an IP which is not yet
    # present on the system. This is typically what happens with a shared VRRP
    # address, where you want both master and backup to be started eventhough the
    # IP is not yet present. Always leave it to 1. Default: 0
    #
    net.ipv4.ip_nonlocal_bind = 1
    net.ipv6.ip_nonlocal_bind = 1
    # Serves as a higher bound for all of the system's SYN backlogs. Put it at
    # least as high as tcp_max_syn_backlog, otherwise clients may experience
    # difficulties to connect at high rates or under SYN attacks. Default: 128
    #
    net.core.somaxconn = 60000
    # Number of unprocessed incoming packets that can be queued for later
    # processing. This has minimal effect. Default: 1000
    net.core.netdev_max_backlog = 10000
    #### HAPEE-3.1 : end of recommended settings.

    Caution

    Be mindful of the port range you specify in net.ipv4.ip_local_port_range. The lower port number should be higher than the highest port on which other services on your system may be listening. For example, if the highest port your other services on the same machine are listening on is 3306, the lower port number in the range should be higher than 3306.
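
    For example, you can list the TCP ports that other services are currently listening on with a command like this (ss is part of the iproute2 package on most distributions):

    nix
    # Show listening TCP sockets with numeric port numbers
    ss -ltn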

  2. Reload the file using the command systemctl restart systemd-sysctl. This applies the changes, and they will persist after a reboot.

    nix
    sudo systemctl restart systemd-sysctl

    This command produces no output.
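
    Since there’s no output, you can confirm that a setting took effect by reading it back, for example:

    nix
    sysctl net.ipv4.ip_local_port_range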

    Tip

    To test a sysctl setting without persisting it after a reboot, set it individually using the sysctl -w command. For example:

    nix
    sudo sysctl -w net.ipv4.ip_local_port_range="1024 65023"

    Tip

    To change a sysctl value when running a Docker container, add the --sysctl parameter to your docker run command with the name of the kernel setting and its value, for example:

    nix
    --sysctl net.ipv4.tcp_fin_timeout=30
  3. Optionally, disable swap for best performance.
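
    For example, one common approach (a sketch to adapt to your distribution) is to turn swap off immediately and comment out the swap entries in /etc/fstab so it stays off after a reboot:

    nix
    # Turn off all swap devices and files now
    sudo swapoff -a
    # Comment out swap entries so swap stays disabled after a reboot
    sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab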

Tune the load balancer settings

In this section, we’ll show changes you can make to the load balancer configuration to improve performance.

If, after you have tuned your kernel settings and your load balancer settings, your monitoring shows that the load balancer’s performance is still insufficient, you may be able to further optimize the load balancer settings for your hardware.

Configure the load balancer for high traffic volume

Use these global configuration directives to tune the load balancer’s behavior. Consider their values carefully, as they directly impact CPU and memory usage. These directives most often need adjustment because their optimal values depend on your traffic volume. A combined configuration sketch follows the descriptions below.

maxconn

This defines the maximum number of concurrent TCP connections per process. If the number of concurrent connections exceeds the value set by maxconn, the load balancer stops accepting new connections and allows them to queue up.

If you see that the load balancer frequently approaches the default maxconn value, consider increasing it. Keep in mind that you should monitor the number of file descriptors the load balancer uses, as this affects memory usage; you also want to prevent the load balancer from using up all available file descriptors, since other processes need them as well.

See the overload protection configuration tutorial for more information about configuring maximum connection limits. Once you have configured your maxconn values, you can implement queues in the load balancer, which help prevent overwhelming your servers.

fd-hard-limit

This setting limits the number of file descriptors the load balancer can use. When your monitoring shows that the load balancer is accepting fewer connections than expected, it may be because you’ve specified a value for fd-hard-limit. Be aware that you must take into account the file descriptor limits of your system when setting a value for fd-hard-limit.
tune.pool-high-fd-ratio

Use tune.pool-high-fd-ratio to define what percentage of the maximum number of available file descriptors the load balancer can use before it kills idle connections in order to create new connections. In other words, this defines a threshold after which the load balancer kills an idle connection in the pool of connections available for reuse, freeing its associated file descriptor and replacing it with a new connection. The default is 25, meaning 25%. If the load balancer must kill a large number of idle connections, it can’t reuse them.

Reusing connections is beneficial, as reusing a connection uses less CPU than creating a new one. The drawback is that keeping more idle connections available for reuse means the load balancer uses more file descriptors, which uses more memory. The tradeoff is that you spend more memory in exchange for using less CPU.

The number of file descriptors often translates to the number of simultaneous connections, though with many threads, this isn’t 1:1. For a more detailed explanation of how the load balancer manages sockets across multiple threads, see the configuration tutorial for performance optimization for large systems.

If you see high CPU usage, and the load balancer is creating many more new connections than it’s reusing connections, you may see improvement by setting tune.pool-high-fd-ratio to 90 and tune.pool-low-fd-ratio to 80, but be mindful of the number of file descriptors the load balancer uses under high load. If that results in the load balancer using too many file descriptors for your system, set tune.pool-high-fd-ratio to 50 and tune.pool-low-fd-ratio to 40.

tune.pool-low-fd-ratio

Use tune.pool-low-fd-ratio to define the percentage of the maximum number of available file descriptors the load balancer will use before it stops pooling idle connections. The default is 20, meaning 20%; raising this value ultimately raises the number of connections that can be reused.

http-reuse

You can specify how you want connections to backend servers to be shared between HTTP requests. Keeping idle connections open for reuse reduces the overhead of initiating new connections to backend servers. This defaults to safe, where the first request of a session is always sent over a new connection, and subsequent requests may be sent over existing connections. By default, the load balancer runs in keep-alive mode, which keeps connections to both the client and the server open when idle to facilitate connection reuse.

Setting this to aggressive improves connection reuse, and therefore requires less CPU, but can result in failed requests if the client can’t retry requests. This is safer than always, but imposes more risk than safe.

Setting this to always can significantly improve performance over the default, safe, and can even potentially outperform aggressive, but it carries a greatly increased risk of connection failures, since requests will always travel over existing connections. Make sure that your servers don’t close connections quickly after releasing them, as this will result in connection failures.

tune.bufsize

We recommend that you don’t change the default for tune.bufsize, which sets the buffer size in bytes. If you increase this value, you should decrease your maxconn values by the same factor that you increase tune.bufsize. This setting has a direct impact on the amount of memory the load balancer uses, but you may need to increase it if large HTTP requests cause errors.
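
To tie these directives together, here’s a minimal sketch of how they might appear in a configuration. The values and the http-reuse policy shown are illustrative placeholders, not recommendations; choose values based on your traffic, your monitoring, and your system’s file descriptor limits:

haproxy
global
    # Cap concurrent connections per process; size this from observed CurrConns.
    maxconn 100000
    # Keep the load balancer from exhausting the system's file descriptors.
    fd-hard-limit 250000
    # Start killing idle pooled connections once 50% of available file
    # descriptors are in use, and stop pooling idle connections at 40%.
    tune.pool-high-fd-ratio 50
    tune.pool-low-fd-ratio 40

defaults
    mode http
    # "safe" is the default reuse policy; consider "aggressive" or "always"
    # only if your clients and servers tolerate the added risk.
    http-reuse safe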

For more information about these settings and other tunables that impact performance, see the reference guide for performance tuning.

Tip

Also consider enabling HTTP/3 (QUIC) on your frontends, as this can enhance performance for connections with clients that support it.
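
As a sketch, assuming you already have an HTTPS frontend, adding a QUIC bind line and an alt-svc response header enables HTTP/3 alongside HTTP/1.1 and HTTP/2 (the certificate path and backend name below are placeholders):

haproxy
frontend www
    # TCP listener for HTTP/1.1 and HTTP/2
    bind :443 ssl crt /etc/ssl/certs/site.pem alpn h2,http/1.1
    # UDP (QUIC) listener for HTTP/3 on the same port
    bind quic4@:443 ssl crt /etc/ssl/certs/site.pem alpn h3
    # Advertise HTTP/3 availability to clients
    http-response set-header alt-svc 'h3=":443"; ma=900'
    default_backend webservers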

Optimize the load balancer settings for your hardware

As of version 2.4, the load balancer automatically optimizes its operations specifically for your hardware, and generally you don’t need to tune how it interacts with your CPUs, though there may be cases where you should consider enabling some additional settings. Starting in version 3.2, there are some new options for tuning these behaviors. See our configuration tutorial for performance optimization for large systems for more information and to determine which settings are best for your hardware.

Monitor the load balancer under high load

You can use the HAProxy Runtime API to gather statistics about the load balancer’s performance.

To benchmark the load balancer:

  1. Make sure that you have tuned your settings for high performance by implementing the recommended load balancer and kernel settings for high traffic load.

  2. Enable the HAProxy Runtime API.
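
    For example, a stats socket directive in the global section similar to this sketch (the address and privilege level are examples to adapt) exposes the Runtime API on the loopback address used in the next step:

    haproxy
    global
        # Expose the Runtime API on localhost TCP port 9999 with admin rights
        stats socket ipv4@127.0.0.1:9999 level admin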

  3. On a regular interval (such as one second), call the show info command to retrieve the current load balancer statistics and record the results of each call. Here, we use the watch utility to make this call once per second:

    nix
    watch -n 1 'echo "show info" | \
    sudo socat stdio tcp4-connect:127.0.0.1:9999'

    The following statistics provide insights into how the load balancer is performing:

    • Idle_pct: This is the percentage of CPU that is still idle; the portion in use is usually the amount the load balancer is using across all of its threads. A value greater than 50 generally doesn’t point to the hardware as the source of performance issues. A value less than 50, and even more so a value less than 20, indicates that there may be CPU limitations. This is one case where you may need to consider optimizing the load balancer settings for your hardware.
    • CurrConns: This is the number of currently established connections that the load balancer is processing. Monitor this value to make sure it’s less than your maxconn value.
    • SessRate: This is the number of requests per second.
    • SslRate: This is the number of requests over SSL/TLS per second.
    • SslFrontendKeyRate: This is the number of SSL/TLS keys calculated per second.
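
    To focus on just these fields during a test, you can filter the output of show info, for example:

    nix
    watch -n 1 'echo "show info" | \
    sudo socat stdio tcp4-connect:127.0.0.1:9999 | \
    grep -E "Idle_pct|CurrConns|SessRate|SslRate|SslFrontendKeyRate"'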

    The example below shows the output of show info on version 3.1. The statistics present in the output may vary per version.

    output
    text
    Name: hapee-lb
    Version: 3.1.0-1.0.0-348.519
    Release_date: 2025/08/26
    Nbthread: 22
    Nbproc: 1
    Process_num: 1
    Pid: 8
    Uptime: 0d 0h08m16s
    Uptime_sec: 496
    Memmax_MB: 0
    PoolAlloc_MB: 636
    PoolUsed_MB: 636
    PoolFailed: 0
    Ulimit-n: 1000429
    Maxsock: 1000429
    Maxconn: 500001
    Hard_maxconn: 500001
    CurrConns: 263462
    CumConns: 1594300
    CumReq: 7025819
    MaxSslConns: 0
    CurrSslConns: 189296
    CumSslConns: 1399631
    Maxpipes: 0
    PipesUsed: 0
    PipesFree: 0
    ConnRate: 2278
    ConnRateLimit: 0
    MaxConnRate: 22813
    SessRate: 2277
    SessRateLimit: 0
    MaxSessRate: 23383
    SslRate: 2062
    SslRateLimit: 0
    MaxSslRate: 22889
    SslFrontendKeyRate: 1098
    SslFrontendMaxKeyRate: 8125
    SslFrontendSessionReuse_pct: 46
    SslBackendKeyRate: 0
    SslBackendMaxKeyRate: 0
    SslCacheLookups: 92
    SslCacheMisses: 89
    CompressBpsIn: 0
    CompressBpsOut: 0
    CompressBpsRateLim: 0
    Tasks: 265114
    Run_queue: 4
    Idle_pct: 50
    node: 7de1a8dbdbbf
    Stopping: 0
    Jobs: 263546
    Unstoppable Jobs: 1
    Listeners: 77
    ActivePeers: 2
    ConnectedPeers: 2
    DroppedLogs: 0
    BusyPolling: 0
    FailedResolutions: 0
    TotalBytesOut: 22954321242
    TotalSplicedBytesOut: 0
    BytesOutRate: 112438176
    DebugCommandsIssued: 0
    CumRecvLogs: 0
    Build info: 3.1.0-1.0.0-348.519
    Memmax_bytes: 0
    PoolAlloc_bytes: 666931408
    PoolUsed_bytes: 666931408
    Start_time_sec: 1746127957
    Tainted: 0
    TotalWarnings: 0
    MaxconnReached: 0
    BootTime_ms: 37
    Niced_tasks: 0
    CurrStreams: 1
    CumStreams: 2
    BlockedTrafficWarnings: 0
  4. Run your traffic load test.

  5. After your load test has finished, you can discontinue collecting the statistics.
