Health checks
Health checks ensure that only healthy servers are kept in the load balancing rotation. They check the status of each server by using one of the health checking modes described in this section.
Active health checks
An active health check attempts to connect to a server or send it an HTTP request at a regular interval. If the connection cannot be established or the HTTP request fails, the health check fails.
If the number of consecutive failed checks meets the failure threshold, the server is taken out of rotation; however, health checks continue while the server is down. If the server resumes service and responds successfully to the health checks, and the number of consecutive successful responses meets the success threshold, the server is restored to rotation.
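The threshold behavior described above can be sketched as a small counter model. This is a simplified illustration only: the `serverState` type and `record` method are hypothetical names, and the code does not reproduce the load balancer's internal implementation.

```go
package main

import "fmt"

// serverState tracks consecutive check results for one server.
// Illustrative model only; not the load balancer's internal logic.
type serverState struct {
	up        bool
	fall      int // consecutive failures needed to take the server out of rotation
	rise      int // consecutive successes needed to restore it
	failures  int
	successes int
}

// record applies one health check result and reports whether the
// server is in rotation afterwards.
func (s *serverState) record(ok bool) bool {
	if ok {
		s.failures = 0
		s.successes++
		if !s.up && s.successes >= s.rise {
			s.up = true
		}
	} else {
		s.successes = 0
		s.failures++
		if s.up && s.failures >= s.fall {
			s.up = false
		}
	}
	return s.up
}

func main() {
	s := &serverState{up: true, fall: 3, rise: 2}
	// Three failures followed by two successes.
	for _, ok := range []bool{false, false, false, true, true} {
		fmt.Println(s.record(ok))
	}
}
```

With a failure threshold of 3 and a success threshold of 2, the server leaves rotation on the third consecutive failure and returns on the second consecutive success. Checks keep running in between.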
TCP health checks
A basic TCP-layer health check tries to connect to the server’s TCP port. The check is valid when the server answers with a SYN/ACK packet. Enable it by adding a check argument to each server line that you would like to monitor.
In the following example, the load balancer tries to connect to port 80 on each server:
```haproxy
backend servers
  server srv1 10.0.0.1:80 check
  server srv2 10.0.0.2:80 check
```
To send health check probes to a port other than the one to which normal traffic is sent, add the port argument. In the following example, the health check is sent to port 8080.
```haproxy
backend servers
  server srv1 10.0.0.1:80 check port 8080
  server srv2 10.0.0.2:80 check port 8080
```
Define a send/expect sequence
Use option tcp-check to define a sequence of messages to send and responses to expect back. Below, we send the string PING and expect to receive the string PONG:
```haproxy
backend servers
  option tcp-check
  tcp-check send PING\r\n
  tcp-check expect string PONG
  server srv1 10.0.0.1:80 check
```
HTTP health checks
An HTTP-layer health check sends an HTTP OPTIONS request to the server and expects to get a successful response. To enable it, add option httpchk to the backend section:
```haproxy
backend servers
  option httpchk
  server srv1 192.168.1.5:80 check
```
Checks send an OPTIONS request to the URL / by default.
You can change the HTTP method and URL by specifying them on the option httpchk line. In the following example, we send checks using GET instead of OPTIONS to the URL /healthz:
```haproxy
backend servers
  option httpchk GET /healthz
  server srv1 10.0.0.1:80 check
  server srv2 10.0.0.2:80 check
```
If the response status code is in the 2xx or 3xx range, the server is healthy.
Expect a response status
Use http-check expect to specify which HTTP status code indicates a healthy server. In the following example, the server must return a 200 OK response:
```haproxy
backend servers
  option httpchk
  http-check expect status 200
  server srv1 10.0.0.1:80 check
  server srv2 10.0.0.2:80 check
```
As an alternative to http-check expect status, where you specify one explicit status value, you can use rstatus to specify a regular expression to match multiple status codes. In the next example, the health check uses rstatus in conjunction with the negation operator (!) to consider all statuses as valid except for 5xx responses:
```haproxy
backend servers
  option httpchk
  http-check expect ! rstatus ^5
  default-server inter 3s fall 3 rise 2
  server srv1 10.0.0.1:80 check
  server srv2 10.0.0.2:80 check
```
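To make the two status rules concrete, the following Go sketch evaluates a few status codes against the default rule (2xx or 3xx is healthy) and against the negated regular expression ^5. The function names `defaultHealthy` and `not5xx` are hypothetical, and the code only models how the rules read. It is not how the load balancer evaluates them internally.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// defaultHealthy models the default rule: any 2xx or 3xx status passes.
func defaultHealthy(status int) bool {
	return status >= 200 && status < 400
}

// fiveXX matches any status code whose text begins with the digit 5.
var fiveXX = regexp.MustCompile(`^5`)

// not5xx models a negated regular expression match: the check passes
// when the status code, read as text, does NOT match ^5.
func not5xx(status int) bool {
	return !fiveXX.MatchString(strconv.Itoa(status))
}

func main() {
	for _, code := range []int{200, 301, 404, 503} {
		fmt.Printf("%d default=%v not5xx=%v\n", code, defaultHealthy(code), not5xx(code))
	}
}
```

Note the difference for a 404 response: the default rule treats it as a failure, while the negated ^5 rule accepts it because only 5xx responses are excluded.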
Expect a string in the response
To specify a string to search for in the body of an HTTP or TCP response, add the expect string directive to http-check or tcp-check. In the next example, the response must contain the string OK:
```haproxy
backend servers
  option httpchk
  http-check expect string OK
  server srv1 10.0.0.1:80 check
  server srv2 10.0.0.2:8080 check
```
Use the expect rstring argument to specify a regular expression instead of an explicit string.
Customize with the send directive
Available since
- HAProxy 2.2
- HAProxy Enterprise 2.2r1
- HAProxy ALOHA 12.5
Another way to change the HTTP method and URL is by adding the http-check send line and specifying the new values there. In the following example, checks send HEAD requests to the URL /healthz with the Host header set to test.local:
```haproxy
backend servers
  option httpchk
  http-check send meth HEAD uri /healthz ver HTTP/1.1 hdr Host test.local
  server srv1 192.168.1.5:80 check
```
You can send POST requests too:
```haproxy
backend servers
  option httpchk
  http-check send meth POST uri /health hdr Content-Type "application/json;charset=UTF-8" hdr Host www.mwebsite.com body "{\"id\": 1, \"field\": \"value\"}"
  server srv1 192.168.1.5:80 check
```
Customize connect arguments
Available since
- HAProxy 2.2
- HAProxy Enterprise 2.2r1
- HAProxy ALOHA 12.5
Use the connect directive to enable SNI, connect over SSL/TLS, perform health checks over SOCKS4, and choose the protocol, such as HTTP/2 or FastCGI. Here’s an example where health checks are performed using HTTP/2 and SSL:
```haproxy
backend servers
  option httpchk
  http-check connect ssl alpn h2
  http-check send meth HEAD uri /health ver HTTP/2 hdr Host www.test.local
  server srv1 192.168.1.5:443 check
```
To close a connection cleanly instead of sending a RST, use the linger option.
Check multiple HTTP endpoints
Available since
- HAProxy 2.2
- HAProxy Enterprise 2.2r1
- HAProxy ALOHA 12.5
Additional power comes from the ability to query several endpoints during a single health check. In the following example, we make requests to two distinct services: one listening at port 8080 and the other at port 8081. We also use different URIs. If either endpoint fails to respond, the entire health check fails.
```haproxy
backend servers
  option httpchk
  http-check connect port 8080
  http-check send meth HEAD uri /health
  http-check connect port 8081
  http-check send meth HEAD uri /up
  server server1 127.0.0.1:80 check
```
Change the interval
By default, the load balancer sends a health check every two seconds. Change this by adding the inter argument to the server line. In the next example, we send a health check every four seconds:
```haproxy
backend servers
  server srv1 10.0.0.1:80 check inter 4s
  server srv2 10.0.0.2:80 check inter 4s
```
Use any of the following time suffixes:
- us: microseconds
- ms: milliseconds
- s: seconds
- m: minutes
- h: hours
- d: days
Other arguments that affect the check interval are defined below:
| Argument | Description |
|---|---|
| inter | Sets the interval between two consecutive health checks. If not specified, the default value is 2s. |
| fastinter | Sets the interval between two consecutive health checks when the server is in any of the transition states: UP - transitionally DOWN or DOWN - transitionally UP. If not set, then inter is used. |
| downinter | Sets the interval between two consecutive health checks when the server is in the DOWN state. If not set, then inter is used. |
Change the failure threshold
Use the fall argument to change the number of failed health checks that will trigger removing the server from the load balancing rotation. By default, this is set to 3. In the following example, 5 failed checks will put the server into the DOWN state:
```haproxy
backend servers
  server srv1 10.0.0.1:80 check fall 5
  server srv2 10.0.0.2:80 check fall 5
```
Use the rise argument to set how many successful checks are needed to bring a down server back up. The default is 2. In the following example, 10 successful health checks are needed before the server will return to the load balancing rotation:
```haproxy
backend servers
  server srv1 10.0.0.1:80 check fall 5 rise 10
  server srv2 10.0.0.2:80 check fall 5 rise 10
```
Set check-scoped variables
Available since
- HAProxy 2.2
- HAProxy Enterprise 2.2r1
- HAProxy ALOHA 12.5
Use either tcp-check set-var or http-check set-var to set a variable scoped to a health check session. For example:
```haproxy
backend servers
  tcp-check set-var(check.port) int(1234)
  tcp-check connect port var(check.port)
  server srv1 10.0.0.1:80 check
```
Passive health checks
A passive health check monitors live traffic for errors. You can watch for either failed TCP connections or bad HTTP responses. Passive checks will detect errors returning from any part of your proxied service, but they require active traffic to monitor.
Monitor for TCP connection errors
To monitor live traffic for TCP connection errors, follow these steps:
- Add the check argument to the server lines you want to monitor.
- Add the observe layer4 argument to each server line to activate passive health checking.
- Add the error-limit and on-error arguments to set the threshold for failed passive health checks and the action to take when errors exceed that threshold.
In the following example, we monitor for TCP connection errors. When there are at least 10 of these errors, we mark the server as down by using the mark-down value for the on-error argument:
```haproxy
backend servers
  server server1 192.168.0.10:80 check inter 2m observe layer4 error-limit 10 on-error mark-down
```
The check argument enables an active health check probe that connects to the server’s TCP port at a regular interval. The interval is 2 seconds by default, which you can change using the inter argument. Once failed passive health checks have removed a server from the load balancing rotation, a set number of successful active probes brings it back online. In the example above, the interval is increased to 2 minutes so that the server must remain healthy for a longer period before it returns to service.
Monitor for HTTP response errors
To monitor live traffic for HTTP response errors, follow these steps:
- Add the check argument to the server lines you want to monitor.
- Add the observe layer7 argument to each server line to activate passive health checking.
- Add the error-limit and on-error arguments to set the threshold for failed passive health checks and the action to take when errors exceed that threshold.
In the following example, we monitor for HTTP response errors. When there are at least 10 of these errors, we mark the server as down by using the mark-down value for the on-error argument:
```haproxy
backend servers
  server server1 192.168.0.10:80 check observe layer7 error-limit 10 on-error mark-down
```
The check argument enables an active health check probe that connects to the server’s TCP port at a regular interval. Once failed passive health checks have removed a server from the load balancing rotation, a set number of successful active probes brings it back online.
Set the on-error action
The on-error argument on the server line determines what action to take when errors exceed the threshold you set with the error-limit. It accepts any of the following values:
| Action | Description |
|---|---|
| fastinter | Forces fastinter mode, which causes the active health check probes to be sent more rapidly. |
| fail-check | Increments one failed active health check and forces fastinter mode. |
| sudden-death | Simulates a pre-fatal failed check. One more check will mark the server as down. It also forces fastinter mode. |
| mark-down | Marks the server as down and forces fastinter mode. |
Set a server’s initial state
This section applies to:
- HAProxy 3.1 and newer
- HAProxy Enterprise 3.1 and newer
- HAProxy ALOHA 17.0 and newer
You can control how quickly each server can begin handling traffic after restarting, coming out of maintenance mode, or being added through service discovery. On a server directive in a backend, set an init-state argument to one of the following values. Each value changes how the load balancer determines the server’s readiness in relation to health checks:
| init-state value | Description |
|---|---|
| up | Up initially and able to receive traffic, but it will be marked as down if it fails the initial health check. |
| fully-up | Up initially and able to receive traffic, but it will be marked as down if it fails all of its health checks. |
| down | Down initially and unable to receive traffic until it has passed the initial health check. |
| fully-down | Down initially and unable to receive traffic until it has passed all of its health checks. |
In the example below, we use init-state fully-down so that the server remains unavailable until it has passed all five of its health checks, set by the rise argument.
```haproxy
backend servers
  balance roundrobin
  server web1 172.16.0.11:8080 check maxconn 30 init-state fully-down rise 5
```
You can also use the init-state argument on server-template directives.
Use idle connections for checks
This section applies to:
- HAProxy 3.2 and newer
- HAProxy Enterprise 3.2r1 and newer
To send health checks over existing, idle connections instead of opening new connections to servers, add the check-reuse-pool argument to your server directives. This lowers the number of new connections the load balancer needs to establish with the servers and works with mode tcp and mode http.
```haproxy
backend servers
  balance roundrobin
  server web1 172.16.0.11:8080 check maxconn 30 check-reuse-pool
```
Agent checks
An agent check is one where the load balancer connects to an agent program running on a backend server. In response to the agent check probe, the agent program sends back a string of ASCII text that triggers a change in the load balancer.
The program running on the server can send back to the load balancer a string containing any of the following commands.
| Text | Effect on the load balancer |
|---|---|
| <number>% | Changes the server’s weight to a percentage of its current value. Specifying 0% is equivalent to drain. For example, change server weight to one-half its current value with 50%. |
| down | Marks the server as down due to critical condition such as missing process or port not responding. Optionally, you can append a number sign (#) followed by a description string. |
| drain | Puts the server into drain mode, where it won’t accept new connections other than those accepted via persistence. |
| fail | Marks the server as down and can indicate that a validity test has failed. Optionally, you can append a number sign (#) followed by a description string. |
| maint | Puts the server into maintenance mode, where it won’t accept any new connections, and health checks will be stopped. |
| maxconn:<number> | Changes the server’s maxconn value to the given number. Don’t issue a space in between maxconn: and the number. For example, set maximum connections value to 30: maxconn:30. |
| ready | Takes the server out of maintenance mode. |
| stopped | Marks the server as down due to intentional halt. Optionally, you can append a number sign (#) followed by a description string. |
| up | Marks the server as up. |
The string is formatted as one or more of these commands separated by spaces, tabs, or commas. The string must end with a carriage return (\r) or new line (\n) character. Example: ready 50% maxconn:30.
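As an illustration of this format, the Go sketch below builds a reply from individual commands and splits a reply back into commands. Both `buildAgentReply` and `splitAgentReply` are hypothetical helper names. The splitting logic follows only the separator rules stated above and is not the load balancer's own parser.

```go
package main

import (
	"fmt"
	"strings"
)

// buildAgentReply joins commands with spaces and terminates the reply
// with a newline, as the format requires.
func buildAgentReply(cmds ...string) string {
	return strings.Join(cmds, " ") + "\n"
}

// splitAgentReply strips the trailing \r or \n and splits the reply on
// spaces, tabs, and commas.
func splitAgentReply(reply string) []string {
	reply = strings.TrimRight(reply, "\r\n")
	return strings.FieldsFunc(reply, func(r rune) bool {
		return r == ' ' || r == '\t' || r == ','
	})
}

func main() {
	reply := buildAgentReply("ready", "50%", "maxconn:30")
	fmt.Printf("%q\n", reply)
	fmt.Println(splitAgentReply(reply))
}
```

A helper like buildAgentReply makes it harder to forget the terminating newline, without which the reply does not match the format described above.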
Create an agent program
The agent program can be written in any programming language, as long as it allows you to listen on a TCP port. When the program detects that the load balancer has connected, it should return a string of ASCII text that makes a change to the load balancer or keeps it at its current state.
The program below is written in the Go programming language. It returns the string 50%\n if the server’s CPU idle time is less than 10%, which indicates that the CPU is close to fully utilized. Otherwise, it returns the string 100%\n. Notice that the string must end with a line feed character (\n):
agent-program.go
```go
package main

import (
	"fmt"
	"time"

	"github.com/firstrow/tcp_server"
	"github.com/mackerelio/go-osstat/cpu"
)

func main() {
	// Listen for agent check connections on TCP port 9999.
	server := tcp_server.New(":9999")
	server.OnNewClient(func(c *tcp_server.Client) {
		fmt.Println("Client connected")
		cpuIdle, err := getIdleTime()
		if err != nil {
			fmt.Println(err)
			c.Close()
			return
		}
		if cpuIdle < 10 {
			// Set server weight to half
			c.Send("50%\n")
		} else {
			c.Send("100%\n")
		}
		c.Close()
	})
	server.Listen()
}

// getIdleTime samples CPU counters one second apart and returns the
// percentage of that interval the CPU spent idle.
func getIdleTime() (float64, error) {
	before, err := cpu.Get()
	if err != nil {
		return 0, err
	}
	time.Sleep(time.Duration(1) * time.Second)
	after, err := cpu.Get()
	if err != nil {
		return 0, err
	}
	total := float64(after.Total - before.Total)
	cpuIdle := float64(after.Idle-before.Idle) / total * 100
	return cpuIdle, nil
}
```
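If you prefer to avoid third-party packages, the reply logic can be separated from the networking code and exercised with the standard library alone. The sketch below is one possible arrangement, not a replacement for the program above: `weightFor` and `handle` are hypothetical names, the CPU idle value is passed in rather than measured, and net.Pipe stands in for the load balancer's TCP connection so that the example runs without opening a port.

```go
package main

import (
	"bufio"
	"fmt"
	"net"
)

// weightFor returns the agent reply for a CPU idle percentage, using
// the same 10% threshold as the program above.
func weightFor(cpuIdle float64) string {
	if cpuIdle < 10 {
		return "50%\n"
	}
	return "100%\n"
}

// handle writes a single reply to the connection and closes it.
func handle(conn net.Conn, cpuIdle float64) {
	defer conn.Close()
	fmt.Fprint(conn, weightFor(cpuIdle))
}

func main() {
	// An in-memory connection pair: lb plays the load balancer,
	// agent plays the agent side of the connection.
	lb, agent := net.Pipe()
	go handle(agent, 5.0) // pretend the CPU is only 5% idle

	reply, err := bufio.NewReader(lb).ReadString('\n')
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Printf("agent replied %q\n", reply)
}
```

In a real agent, the net.Pipe pair would be replaced by a listener created with net.Listen and a loop that calls handle for each accepted connection.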
Configure the backend
Configure the servers in the backend to send agent checks.
Here, the load balancer sends an agent-based check probe every five seconds to a program listening at 192.168.0.10 at port 8080:
```haproxy
backend servers
  server server1 192.168.0.10:80 check weight 100 agent-check agent-addr 192.168.0.10 agent-port 8080 agent-inter 5s agent-send ping\n
```
Use the following arguments on each server line to enable agent-based checks:
| Argument | Description |
|---|---|
| check | Enables health checking. |
| agent-check | Enables agent checks for the server. |
| agent-addr | Identifies the IP address where the agent is listening. |
| agent-port | Identifies the port where the agent is listening. |
| agent-inter | Defines the interval between checks. |
| agent-send | A string that the load balancer sends to the agent upon connection. Be sure to end it with a newline character. |
LDAP health checks
You can health check LDAPv3 servers. The load balancer uses the Anonymous Authentication Mechanism of Simple Bind to connect. The check is valid if the server responds with a successful result message.
- Configure the LDAP servers to allow anonymous binding. You can do this with an IP alias on the server side that allows only the load balancer’s IP addresses to bind to it.
- Add option ldap-check to your backend section. In this example, we send the health check probes to alternative IP addresses specified with the addr argument on the server lines:
  ```haproxy
  backend servers
    option ldap-check
    server srv1 10.0.0.1:389 check addr 10.0.0.11
    server srv2 10.0.0.2:389 check addr 10.0.0.12
  ```
MySQL health checks
You can health check MySQL database servers. The check is valid if the server responds with a successful result message. Two modes exist:
- check the MySQL handshake packet
- test Client Authentication
In the following example, we check a MySQL handshake by adding the option mysql-check directive:
```haproxy
backend servers
  option mysql-check
  server srv1 10.0.0.1:3306 check
  server srv2 10.0.0.2:3306 check
```
Add a user argument to option mysql-check for the health check probe to send a Client Authentication packet:
```haproxy
backend servers
  option mysql-check user hapee-lb
  server srv1 10.0.0.1:3306 check
  server srv2 10.0.0.2:3306 check
```
PostgreSQL health checks
You can perform a simple PostgreSQL check by sending a StartupMessage. The check is valid if the server responds with a successful Authentication request message rather than an error response. Add the option pgsql-check directive to your backend section and include a check argument on each server line.
```haproxy
backend servers
  option pgsql-check
  server srv1 10.0.0.1:5432 check
  server srv2 10.0.0.2:5432 check
```
Optionally, include the username that will be used to connect to the PostgreSQL server (here, hapee-lb):
```haproxy
backend servers
  option pgsql-check user hapee-lb
  server srv1 10.0.0.1:5432 check
  server srv2 10.0.0.2:5432 check
```
Redis health checks
You can monitor a Redis service by sending the PING command. The check is valid if the server responds with the string +PONG. Add the option redis-check directive to your backend section and include a check argument on each server line.
```haproxy
backend servers
  option redis-check
  server srv1 10.0.0.1:6379 check
  server srv2 10.0.0.2:6379 check
```
SMTP health checks
You can monitor a Simple Mail Transfer Protocol (SMTP) service. Add the option smtpchk directive to your backend section and include a check argument on each server line. The check is valid if the server’s response code is in the 2xx range.
```haproxy
backend servers
  option smtpchk
  server srv1 10.0.0.1:25 check
  server srv2 10.0.0.2:25 check
```
You can also monitor an Extended Simple Mail Transfer Protocol (ESMTP) service. Add the hello command to use, which is HELO for SMTP and EHLO for ESMTP. Follow this with the domain name to present to the server:
```haproxy
backend servers
  option smtpchk EHLO mydomain.com
  server srv1 10.0.0.1:25 check
  server srv2 10.0.0.2:25 check
```
See also
Use the following directives and parameters to specify behavior related to health checks for servers and other services.
- To enable an auxiliary agent check independent of a regular check, see agent-check.
- To enable a regular TCP-based server health check, see check.
- To open a new connection to perform a health check, see http-check connect.
- To specify what type of server response is considered healthy or not, see http-check expect.
- To specify the headers and body sent for a health check, see http-check send.
- To enable HTTP-based server health checks, see option httpchk.
- To enable LDAP-based server health checks, see option ldap-check.
- To enable MySQL-based server health checks, see option mysql-check.
- To enable PostgreSQL-based server health checks, see option pgsql-check.
- To enable Redis-based server health checks, see option redis-check.
- To enable SMTP-based server health checks, see option smtpchk.
- To enable server health checks using TCP-check send/expect sequences, see option tcp-check.
- To specify data to be collected and analyzed in a generic health check, see tcp-check expect.
- To specify a string or log format to be sent in a generic health check, see tcp-check send.