Performance tuning HAProxy Enterprise

In this guide, we’ll describe ways to get optimal performance from your HAProxy Enterprise load balancer when it’s under heavy load. This ensures fast, seamless, and scalable operation of the load balancer. We recommend that you implement these best practices before executing any performance testing or benchmarking, and before going live with your system:

  • Configure the load balancer for high availability
    • When you group multiple load balancers into a cluster, it allows you to scale out your load balancing capacity. It also facilitates failover and reduces the risk of service interruption.
  • Tune the load balancer settings for performance
    • Tuning your operating system settings can provide the best performance for the load balancer. In addition, there are settings within the load balancer configuration that you should change to accompany the OS settings.

After you’ve tuned the load balancer for performance, you can then monitor it under high load to confirm that it and the kernel are performing optimally together.

Configure the load balancer for high availability

You can configure multiple load balancer instances for high availability in one of the following modes:

  • Active/Active clustering
    • In this mode, two or more load balancers run, and all of them are active and receive traffic at the same time.
  • Active/Standby clustering
    • In this mode, two load balancers run, but only one will be active and receive traffic at a time.

Tune your settings for performance

Your load balancer’s settings and your kernel’s settings must work together to achieve optimal performance. You will use sysctl to set the kernel settings, and you will set the load balancer settings in the HAProxy Enterprise configuration file.

Tune the operating system

We provide VM images for OpenStack, VMware, AWS, and Azure. These have the recommended kernel settings for the load balancer applied automatically. If you aren’t using one of these, you’ll need to manually configure the settings.

Use sysctl, a Linux program for reading and modifying the attributes of the system kernel, to set kernel settings to the values we recommend. To tune your system:

  1. Edit the file that begins with 30-hapee in the /etc/sysctl.d/ directory. This file contains kernel settings for HAProxy Enterprise that are disabled by default. Enable the recommended settings by uncommenting them (remove the leading hash sign). The example below is for version 3.1:

    Info

    The recommended settings could be different across versions. The settings present in the file on your system correspond to the recommended settings for your installed version. More information is located above each setting in the file.

    /etc/sysctl.d/30-hapee-3.1.conf
    text
    #### HAPEE-3.1 : recommended settings for best performance. Uncomment the
    #### lines you'd like to enable.
    #### To reload: systemctl restart systemd-sysctl
    #### or service procps start
    #### or sysctl -p /etc/sysctl.d/*.conf
    # Limit the per-socket default receive/send buffers to limit memory usage
    # when running with a lot of concurrent connections. Values are in bytes
    # and represent minimum, default and maximum. Defaults: 4096 87380 4194304
    #
    # net.ipv4.tcp_rmem = 4096 16060 262144
    # net.ipv4.tcp_wmem = 4096 16384 262144
    # Allow early reuse of a same source port for outgoing connections. It is
    # required above a few hundred connections per second. Defaults: 0
    #
    net.ipv4.tcp_tw_reuse = 1
    # Extend the source port range for outgoing TCP connections. This limits early
    # port reuse and makes use of 64000 source ports. Defaults: 32768 61000
    #
    net.ipv4.ip_local_port_range = 1024 65023
    # Increase the TCP SYN backlog size. This is generally required to support very
    # high connection rates as well as to resist SYN flood attacks. Setting it too
    # high will delay SYN cookie usage though. Defaults: 1024
    #
    net.ipv4.tcp_max_syn_backlog = 60000
    # Timeout in seconds for the TCP FIN_WAIT state. Lowering it speeds up release
    # of dead connections, though it will cause issues below 25-30 seconds. It is
    # preferable not to change it if possible. Default: 60
    #
    net.ipv4.tcp_fin_timeout = 30
    # Limit the number of outgoing SYN-ACK retries. This value is a direct
    # amplification factor of SYN floods, so it is important to keep it reasonably
    # low. However, too low will prevent clients on lossy networks from connecting.
    # Using 3 as a default value gives good results (4 SYN-ACK total) and lowering
    # it to 1 under SYN flood attack can save a lot of bandwidth. Default: 5
    #
    net.ipv4.tcp_synack_retries = 3
    # Set this to one to allow local processes to bind to an IP which is not yet
    # present on the system. This is typically what happens with a shared VRRP
    # address, where you want both master and backup to be started eventhough the
    # IP is not yet present. Always leave it to 1. Default: 0
    #
    net.ipv4.ip_nonlocal_bind = 1
    net.ipv6.ip_nonlocal_bind = 1
    # Serves as a higher bound for all of the system's SYN backlogs. Put it at
    # least as high as tcp_max_syn_backlog, otherwise clients may experience
    # difficulties to connect at high rates or under SYN attacks. Default: 128
    #
    net.core.somaxconn = 60000
    # Number of unprocessed incoming packets that can be queued for later
    # processing. This has minimal effect. Default: 1000
    net.core.netdev_max_backlog = 10000
    #### HAPEE-3.1 : end of recommended settings.

    Caution

    Be mindful of the port range you specify in net.ipv4.ip_local_port_range. The lower port number should be higher than the highest port on which other services on your system may be listening. For example, if the highest port your other services on the same machine are listening on is 3306, the lower port number in the range should be higher than 3306.
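
    For example, you can list the TCP ports that other services are currently listening on with a command like this (ss is part of the iproute2 package on most distributions):

    nix
    # Show listening TCP sockets with numeric port numbers
    ss -ltn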

  2. Reload the file using the command systemctl restart systemd-sysctl. This applies the changes, and they will persist after a reboot.

    nix
    sudo systemctl restart systemd-sysctl

    This command produces no output.
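
    Since there’s no output, you can confirm that a setting took effect by reading it back, for example:

    nix
    sysctl net.ipv4.ip_local_port_range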

    Tip

    To test a sysctl setting without persisting it after a reboot, set it individually using the sysctl -w command. For example:

    nix
    sudo sysctl -w net.ipv4.ip_local_port_range="1024 65023"

    Tip

    To change a sysctl value when running a Docker container, add the --sysctl parameter to your docker run command with the name of the kernel setting and its value, for example:

    nix
    --sysctl net.ipv4.tcp_fin_timeout=30
  3. Optionally, disable swap for best performance.
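
    For example, one common approach (a sketch to adapt to your distribution) is to turn swap off immediately and comment out the swap entries in /etc/fstab so it stays off after a reboot:

    nix
    # Turn off all swap devices and files now
    sudo swapoff -a
    # Comment out swap entries so swap stays disabled after a reboot
    sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab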

Tune the load balancer settings

In this section, we’ll show changes you can make to the load balancer configuration to improve performance.

If, after you have tuned your kernel settings and your load balancer settings, your monitoring shows that the load balancer’s performance is still insufficient, you may be able to further optimize the load balancer settings for your hardware.

Configure the load balancer for high traffic volume

Use these global configuration directives to tune the load balancer’s behavior. Consider their values carefully, as they directly impact CPU and memory usage. These directives most often need adjustment because their optimal values depend on your traffic volume. A combined configuration sketch follows the descriptions below.

maxconn

This defines the maximum number of concurrent TCP connections per process. If the number of concurrent connections exceeds the value set by maxconn, the load balancer stops accepting new connections and allows them to queue up.

If you see that the load balancer frequently approaches the default maxconn value, consider increasing it. Keep in mind that you should monitor the number of file descriptors the load balancer uses, as this affects memory usage; you also want to prevent the load balancer from using up all available file descriptors, since other processes need them as well.

See the overload protection configuration tutorial for more information about configuring maximum connection limits. Once you have configured your maxconn values, you can implement queues in the load balancer, which help prevent overwhelming your servers.

fd-hard-limit

This setting limits the number of file descriptors the load balancer can use. When your monitoring shows that the load balancer is accepting fewer connections than expected, it may be because you’ve specified a value for fd-hard-limit. Be aware that you must take into account the file descriptor limits of your system when setting a value for fd-hard-limit.
tune.pool-high-fd-ratio

Use tune.pool-high-fd-ratio to define what percentage of the maximum number of available file descriptors the load balancer can use before it kills idle connections in order to create new connections. In other words, this defines a threshold after which the load balancer kills an idle connection in the pool of connections available for reuse, freeing its associated file descriptor and replacing it with a new connection. The default is 25, meaning 25%. If the load balancer must kill a large number of idle connections, it can’t reuse them.

Reusing connections is beneficial, as reusing a connection uses less CPU than creating a new one. The drawback is that keeping more idle connections available for reuse means the load balancer uses more file descriptors, which uses more memory. The tradeoff is that you spend more memory in exchange for using less CPU.

The number of file descriptors often translates to the number of simultaneous connections, though with many threads, this isn’t 1:1. For a more detailed explanation of how the load balancer manages sockets across multiple threads, see the configuration tutorial for performance optimization for large systems.

If you see high CPU usage, and the load balancer is creating many more new connections than it’s reusing connections, you may see improvement by setting tune.pool-high-fd-ratio to 90 and tune.pool-low-fd-ratio to 80, but be mindful of the number of file descriptors the load balancer uses under high load. If that results in the load balancer using too many file descriptors for your system, set tune.pool-high-fd-ratio to 50 and tune.pool-low-fd-ratio to 40.

tune.pool-low-fd-ratio

Use tune.pool-low-fd-ratio to define the percentage of the maximum number of available file descriptors the load balancer will use before it stops pooling idle connections. The default is 20, meaning 20%; raising this value ultimately raises the number of connections that can be reused.

http-reuse

You can specify how you want connections to backend servers to be shared between HTTP requests. Keeping idle connections open for reuse reduces the overhead of initiating new connections to backend servers. This defaults to safe, where the first request of a session is always sent over a new connection, and subsequent requests may be sent over existing connections. By default, the load balancer runs in keep-alive mode, which keeps connections to both the client and the server open when idle to facilitate connection reuse.

Setting this to aggressive improves connection reuse, and therefore requires less CPU, but can result in failed requests if the client can’t retry requests. This is safer than always, but imposes more risk than safe.

Setting this to always can significantly improve performance over the default, safe, and can even potentially outperform aggressive, but it carries a greatly increased risk of connection failures, since requests will always travel over existing connections. Make sure that your servers don’t close connections quickly after releasing them, as this will result in connection failures.

tune.bufsize

We recommend that you don’t change the default for tune.bufsize, which sets the buffer size in bytes. If you increase this value, you should decrease your maxconn values by the same factor that you increase tune.bufsize. This setting has a direct impact on the amount of memory the load balancer uses, but you may need to increase it if large HTTP requests cause errors.
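
To tie these directives together, here’s a minimal sketch of how they might appear in a configuration. The values and the http-reuse policy shown are illustrative placeholders, not recommendations; choose values based on your traffic, your monitoring, and your system’s file descriptor limits:

haproxy
global
    # Cap concurrent connections per process; size this from observed CurrConns.
    maxconn 100000
    # Keep the load balancer from exhausting the system's file descriptors.
    fd-hard-limit 250000
    # Start killing idle pooled connections once 50% of available file
    # descriptors are in use, and stop pooling idle connections at 40%.
    tune.pool-high-fd-ratio 50
    tune.pool-low-fd-ratio 40

defaults
    mode http
    # "safe" is the default reuse policy; consider "aggressive" or "always"
    # only if your clients and servers tolerate the added risk.
    http-reuse safe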

For more information about these settings and other tunables that impact performance, see the reference guide for performance tuning.

Tip

Also consider enabling HTTP/3 (QUIC) on your frontends, as this can enhance performance for connections with clients that support it.
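
As a sketch, assuming you already have an HTTPS frontend, adding a QUIC bind line and an alt-svc response header enables HTTP/3 alongside HTTP/1.1 and HTTP/2 (the certificate path and backend name below are placeholders):

haproxy
frontend www
    # TCP listener for HTTP/1.1 and HTTP/2
    bind :443 ssl crt /etc/ssl/certs/site.pem alpn h2,http/1.1
    # UDP (QUIC) listener for HTTP/3 on the same port
    bind quic4@:443 ssl crt /etc/ssl/certs/site.pem alpn h3
    # Advertise HTTP/3 availability to clients
    http-response set-header alt-svc 'h3=":443"; ma=900'
    default_backend webservers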

Optimize the load balancer settings for your hardware

As of version 2.4, the load balancer automatically optimizes its operations specifically for your hardware, and generally you don’t need to tune how it interacts with your CPUs, though there may be cases where you should consider enabling some additional settings. Starting in version 3.2, there are some new options for tuning these behaviors. See our configuration tutorial for performance optimization for large systems for more information and to determine which settings are best for your hardware.

Monitor the load balancer under high load

You can use the HAProxy Runtime API to gather statistics about the load balancer’s performance.

To benchmark the load balancer:

  1. Make sure that you have tuned your settings for high performance by implementing the recommended load balancer and kernel settings for high traffic load.

  2. Enable the HAProxy Runtime API.
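
    For example, a stats socket directive in the global section similar to this sketch (the address and privilege level are examples to adapt) exposes the Runtime API on the loopback address used in the next step:

    haproxy
    global
        # Expose the Runtime API on localhost TCP port 9999 with admin rights
        stats socket ipv4@127.0.0.1:9999 level admin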

  3. On a regular interval (such as one second), call the show info command to retrieve the current load balancer statistics and record the results of each call. Here, we use the watch utility to make this call once per second:

    nix
    watch -n 1 'echo "show info" | \
    sudo socat stdio tcp4-connect:127.0.0.1:9999'

    The following statistics provide insights into how the load balancer is performing:

    • Idle_pct: This is the percentage of CPU that is still idle; the portion in use is usually the amount the load balancer is using across all of its threads. A value greater than 50 generally doesn’t point to the hardware as the source of performance issues. A value less than 50, and even more so a value less than 20, indicates that there may be CPU limitations. This is one case where you may need to consider optimizing the load balancer settings for your hardware.
    • CurrConns: This is the number of currently established connections that the load balancer is processing. Monitor this value to make sure it’s less than your maxconn value.
    • SessRate: This is the number of requests per second.
    • SslRate: This is the number of requests over SSL/TLS per second.
    • SslFrontendKeyRate: This is the number of SSL/TLS keys calculated per second.
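
    To focus on just these fields during a test, you can filter the output of show info, for example:

    nix
    watch -n 1 'echo "show info" | \
    sudo socat stdio tcp4-connect:127.0.0.1:9999 | \
    grep -E "Idle_pct|CurrConns|SessRate|SslRate|SslFrontendKeyRate"'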

    The example below shows the output of show info on version 3.1. The statistics present in the output may vary per version.

    output
    text
    Name: hapee-lb
    Version: 3.1.0-1.0.0-348.519
    Release_date: 2025/08/26
    Nbthread: 22
    Nbproc: 1
    Process_num: 1
    Pid: 8
    Uptime: 0d 0h08m16s
    Uptime_sec: 496
    Memmax_MB: 0
    PoolAlloc_MB: 636
    PoolUsed_MB: 636
    PoolFailed: 0
    Ulimit-n: 1000429
    Maxsock: 1000429
    Maxconn: 500001
    Hard_maxconn: 500001
    CurrConns: 263462
    CumConns: 1594300
    CumReq: 7025819
    MaxSslConns: 0
    CurrSslConns: 189296
    CumSslConns: 1399631
    Maxpipes: 0
    PipesUsed: 0
    PipesFree: 0
    ConnRate: 2278
    ConnRateLimit: 0
    MaxConnRate: 22813
    SessRate: 2277
    SessRateLimit: 0
    MaxSessRate: 23383
    SslRate: 2062
    SslRateLimit: 0
    MaxSslRate: 22889
    SslFrontendKeyRate: 1098
    SslFrontendMaxKeyRate: 8125
    SslFrontendSessionReuse_pct: 46
    SslBackendKeyRate: 0
    SslBackendMaxKeyRate: 0
    SslCacheLookups: 92
    SslCacheMisses: 89
    CompressBpsIn: 0
    CompressBpsOut: 0
    CompressBpsRateLim: 0
    Tasks: 265114
    Run_queue: 4
    Idle_pct: 50
    node: 7de1a8dbdbbf
    Stopping: 0
    Jobs: 263546
    Unstoppable Jobs: 1
    Listeners: 77
    ActivePeers: 2
    ConnectedPeers: 2
    DroppedLogs: 0
    BusyPolling: 0
    FailedResolutions: 0
    TotalBytesOut: 22954321242
    TotalSplicedBytesOut: 0
    BytesOutRate: 112438176
    DebugCommandsIssued: 0
    CumRecvLogs: 0
    Build info: 3.1.0-1.0.0-348.519
    Memmax_bytes: 0
    PoolAlloc_bytes: 666931408
    PoolUsed_bytes: 666931408
    Start_time_sec: 1746127957
    Tainted: 0
    TotalWarnings: 0
    MaxconnReached: 0
    BootTime_ms: 37
    Niced_tasks: 0
    CurrStreams: 1
    CumStreams: 2
    BlockedTrafficWarnings: 0
  4. Run your traffic load test.

  5. After your load test has finished, you can discontinue collecting the statistics.
