When it comes to operationalizing your log data, HAProxy provides a wealth of information. In this blog post, we demonstrate how to set up HAProxy logging, target a Syslog server, understand the log fields, and suggest some helpful tools for parsing log files.
HAProxy sits in the critical path of your infrastructure. Whether used as an edge load balancer, a sidecar, or as a Kubernetes ingress controller, getting meaningful logs out of HAProxy is a must-have.
Logging gives you insights about each connection and request. It enables observability needed for troubleshooting and can even be used to detect problems early. It’s one of the many ways to get information from HAProxy. Other ways include getting metrics using the Stats page or Runtime API, setting up email alerts, and making use of the various open-source integrations for storing log or statistical data over time. HAProxy provides very detailed logs with millisecond accuracy and generates a wealth of information about traffic flowing into your infrastructure. This includes:
- Metrics about the traffic: timing data, connections counters, traffic size, etc.
- Information about HAProxy decisions: content switching, filtering, persistence, etc.
- Information about requests and responses: headers, status codes, payloads, etc.
- Termination status of a session and the ability to track where failures are occurring (client side, server side?)
In this post, you’ll learn how to configure HAProxy logging and how to read the log messages that it generates. We’ll then list some tools that you’ll find helpful when operationalizing your log data.
HAProxy can emit log message for processing by a syslog server. This is compatible with familiar syslog tools like Rsyslog, as well as the newer systemd service journald. You can also utilize various log forwarders like Logstash and Fluentd to receive Syslog messages from HAProxy and ship them to a central log aggregator.
If you are working in a container environment, HAProxy supports Cloud Native Logging which allows you to send the log messages to stdout and stderr. In that case, skip to the next section where you’ll see how.
Before looking into how to enable logging via the HAProxy configuration file, you should first make sure that you have a Syslog server, such as rsyslog, configured to receive the logs. On Ubuntu, you’d install rsyslog using the apt package manager, like so:
Once rsyslog is installed, edit its configuration to handle ingesting HAProxy log messages. Add the following either to /etc/rsyslog.conf or to a new file within the rsyslog.d directory, like /etc/rsyslog.d/haproxy.conf:
Then, restart the rsyslog service. In the example above, rsyslog listens on the IP loopback address, 127.0.0.1, on the default UDP port 514. This particular config writes to two log files. The file chosen is based on the severity level with which the message was logged. In order to understand this, take a closer look at the last two lines in the file. They begin like this:
The Syslog standard prescribes that each logged message should be assigned a facility code and a severity level. Given the example rsyslog configuration above, you can assume that we’ll be configuring HAProxy to send all of its log messages with a facility code of local0.
The severity level is specified after the facility code, separated by a dot. Here, the first line captures messages at all severity levels and writes them to a file called haproxy-traffic.log. The second line captures only notice-level messages and above, logging them to a file called haproxy-admin.log.
HAProxy is hardcoded to use certain severity levels when sending certain messages. For example, it categorizes log messages related to connections and HTTP requests with the info severity level. Other events are categorized using one of the other, less verbose levels. From most to least important, the severity levels are:
|Severity Level||HAProxy Logs|
|emerg||Errors such as running out of operating system file descriptors.|
|alert||Some rare cases where something unexpected has happened, such as being unable to cache a response.|
|err||Errors such as being unable to parse a map file, being unable to parse the HAProxy configuration file, and when an operation on a stick table fails.|
|warning||Certain important, but non-critical, errors such as failing to set a request header or failing to connect to a DNS nameserver.|
|notice||Changes to a server’s state, such as being UP or DOWN or when a server is disabled. Other events at startup, such as starting proxies and loading modules are also included. Health check logging, if enabled, also uses this level.|
|info||TCP connection and HTTP request details and errors.|
|debug||You may write custom Lua code that logs debug messages|
Modern Linux distributions are shipped with the service manager systemd, which introduces journald for collecting and storing logs. The journald service is not a Syslog implementation, but it is Syslog compatible since it will listen on the same /dev/log socket. It will collect the received logs and allow the user to filter them by facility code and/or severity level using the equivalent journald fields (SYSLOG_FACILITY, PRIORITY).
HAProxy Logging Configuration
The HAProxy configuration manual explains that logging can be enabled with two steps: The first is to specify a Syslog server in the
global section by using a
log directive instructs HAProxy to send logs to the Syslog server listening at 127.0.0.1:514. Messages are sent with facility local0, which is one of the standard, user-defined Syslog facilities. It’s also the facility that our rsyslog configuration is expecting. You can add more than one
log statement to send output to multiple Syslog servers.
You can control how much information is logged by adding a Syslog level to the end of the line:
The second step to configuring logging is to update the different proxies (
listen sections) to send messages to the Syslog server(s) configured in the
global section. This is done by adding a
log global directive. You can add it to the
defaults section, as shown:
log global directive basically says, use the
log line that was set in the
global section. Putting a
log global directive into the
defaults section is equivalent to putting it into all of the subsequent proxy sections. So, this will enable logging on all proxies. You can read more about the sections of an HAProxy configuration file in our blog post The Four Essential Sections of an HAProxy Configuration.
By default, output from HAProxy is minimal. Adding the line
option httplog to your
defaults section will enable more verbose HTTP logging, which we will explain in more detail later.
A typical HAProxy configuration looks like this:
Using global logging rules is the most common HAProxy setup, but you can put them directly into a
frontend section instead. It can be useful to have a different logging configuration as a one-off. For example, you might want to point to a different target Syslog server, use a different logging facility, or capture different severity levels depending on the use case of the backend application. Consider the following example in which the
frontend sections, fe_site1 and fe_site2, set different IP addresses and severity levels:
When logging to a local Syslog service, writing to a UNIX socket can be faster than targeting the TCP loopback address. Generally, on Linux systems, a UNIX socket listening for Syslog messages is available at /dev/log because this is where the syslog() function of the GNU C library is sending messages by default. Target the UNIX socket like this:
However, you should keep in mind that if you’re going to use a UNIX socket for logging and at the same time you are running HAProxy within a chrooted environment—or you let HAProxy create a
chroot directory for you by using the chroot configuration directive—then the UNIX socket must be made available within that chroot directory. This can be done in one of two ways.
First, when rsyslog starts up, it can create a new listening socket within the chroot filesystem. Add the following to your HAProxy rsyslog configuration file:
The second way is to manually add the socket to the chroot filesystem by using the
mount command with the
Be sure to add an entry to your /etc/fstab file or to a systemd unit file so that the mount persists after a reboot. Once you have logging configured, you’ll want to understand how the messages are structured. In the next section, you’ll see the fields that make up the TCP and HTTP-level logs.
If you need to limit the amount of data stored, one way is to sample only a portion of log messages. Set the log level to silent for a random number of requests, like so:
Note that, if possible, it’s better to capture as much data as you can. That way, you don’t have missing information when you need it most. You can also modify the ACL expression so that certain conditions override the rule.
NOTE: A new way to sample logs is available. Read about HAProxy Log Sampling.
Another way to limit the number of messages logged is to set
option dontlog-normal in your
frontend. That way, only timeouts, retries and errors are captured. You probably would not want to enable this all of the time, but only during certain times, such as when performing benchmarking tests.
If you are running HAProxy inside of a Docker container and you’re using HAProxy version 1.9, then instead of sending log output to a Syslog server you can send it to stdout and/or stderr. Set the address to
stderr, respectively. In that case, it’s also preferable to set the format of the message to
raw, like so:
HAProxy Log Format
The type of logging you’ll see is determined by the proxy mode that you set within HAProxy. HAProxy can operate either as a Layer 4 (TCP) proxy or as Layer 7 (HTTP) proxy. TCP mode is the default. In this mode, a full-duplex connection is established between clients and servers, and no layer 7 examination will be performed. If you’ve set your rsyslog configuration based on our discussion in the first section, you’ll find the log file at /var/log/haproxy-traffic.log.
When in TCP mode, which is set by adding
mode tcp, you should also add option tcplog. With this option, the log format defaults to a structure that provides useful information like Layer 4 connection details, timers, byte count, etc. If you were to re-create this format using
log-format, which is used to set a custom format, it would look like this:
Descriptions of these fields can be found in the TCP log format documentation, although we’ll describe several in the upcoming section.
When HAProxy is run as a Layer 7 proxy via
mode http, you should add the option httplog directive. It ensures that HTTP requests and responses are analyzed in depth and that no RFC-compliant content will go uncaptured. This is the mode that really highlights the diagnostic value of HAProxy. The HTTP log format provides the same level of information as the TCP format, but with additional data specific to the HTTP protocol. If you were to re-create this format using
log-format, it would look like this:
Detailed descriptions of the different fields can be found in the HTTP log format documentation.
You can also define a custom log format, capturing only what you need. Use the
log-format-sd for structured-data syslog) directive in your
frontend. Read our blog post HAProxy Log Customization to learn more and see some examples.
In the next few sections, you’ll become familiar with the fields that are included when you use
option tcplog or
Within the log file that’s produced, each line begins with the frontend, backend, and server to which the request was sent. For example, if you had the following HAProxy configuration, you would see lines that describe requests as being routed through the http-in frontend to the static backend and then to the srv1 server.
This becomes vital information when you need to know where a request was sent, such as when seeing errors that only affect some of your servers.
Timers are provided in milliseconds and cover the events happening during a session. The timers captured by the default TCP log format are Tw / Tc / Tt. Those provided by the default HTTP log format are TR / Tw/ Tc / Tr / Ta. These translate as:
|TR||The total time to get the client request (HTTP mode only).|
|Tw||The total time spent in the queues waiting for a connection slot.|
|Tc||The total time to establish the TCP connection to the server.|
|Tr||The server response time (HTTP mode only).|
|Ta||The total active time for the HTTP request (HTTP mode only).|
|Tt||The total TCP session duration time, between the moment the proxy accepted it and the moment both ends were closed.|
You’ll find a detailed description of all of the available timers in the HAProxy documentation. The following diagram also demonstrates where time is recorded during a single end-to-end transaction. Note that the purple lines on the edges denote timers.
Session State at Disconnection
Both TCP and HTTP logs include a termination state code that tells you the way in which the TCP or HTTP session ended. It’s a two-character code. The first character reports the first event that caused the session to terminate, while the second reports the TCP or HTTP session state when it was closed.
Here are some termination code examples:
|—||Normal termination on both sides.|
|cD||The client did not send nor acknowledge any data and eventually
|SC||The server explicitly refused the TCP connection.|
|PC||The proxy refused to establish a connection to the server because the process’ socket limit was reached while attempting to connect.|
There’s a wide variety of reasons a connection may have been closed. Detailed information about all possible termination codes can be found in the HAProxy documentation.
Counters indicate the health of the system when a request went through. HAProxy records five counters for each connection or request. They can be invaluable in determining how much load is being placed on the system, where the system is lagging, and whether limits have been hit. When looking at a line within the log, you’ll see the counters listed as five numbers separated by slashes: 0/0/0/0/0.
In either TCP or HTTP mode, these break down as:
- The total number of concurrent connections on the HAProxy process when the session was logged.
- The total number of concurrent connections routed through this
frontendwhen the session was logged.
- The total number of concurrent connections routed to this
backendwhen the session was logged.
- The total number of concurrent connections still active on this
serverwhen the session was logged.
- The number of retries attempted when trying to connect to the backend server.
HAProxy doesn’t record everything out-of-the-box, but you can tweak it to capture what you need. An HTTP request header can be logged by adding the
http-request capture directive:
The log will show headers between curly braces and separated by pipe symbols. Here you can see the Host and User-Agent headers for a request:
A response header can be logged by adding an
http-response capture directive:
In this case, you must also add a
declare capture response directive, which allocates a capture slot where the response header, once it arrives, can be stored. Each slot that you add is automatically assigned an ID starting from zero. Reference this ID when calling
http-response capture. Response headers are logged after the request headers, within a separate set of curly braces.
Cookie values can be logged in a similar way with the
http-request capture directive.
Anything captured with
http-request capture, including HTTP headers and cookies, will appear within the same set of curly braces. The same goes for anything captured with
You can also use
http-request capture to log sampled data from stick tables. If you were tracking user request rates with a
stick-table, you could log them like this:
So, making a request to a webpage that contains the HTML document and two images would show the user’s concurrent request rate incrementing to three:
You can also log the values of fetch methods, such as to record the version of SSL/TLS that was used (note that there is a built-in log variable for getting this called %sslv):
Variables set with
http-request set-var can also be logged.
ACL expressions evaluate to either true or false. You can’t log them directly, but you can set a variable based on whether the expression is true. For example, if the user visits /api, you could set a variable called req.is_api to a value of Is API and then capture that in the logs.
Enabling HAProxy Profiling
With the release of HAProxy 1.9, you can record CPU time spent on processing a request within HAProxy. Add the
profiling.tasks directive to your
There are new fetch methods that expose the profiling metrics:
||The microseconds part of the date.|
||The number of calls to the task processing the stream or current request since it was allocated. It is reset for each new request on the same connection.|
||The average number of nanoseconds spent in each call to the task processing the stream or current request.|
||The total number of nanoseconds spent in each call to the task processing the stream or current request.|
||The average number of nanoseconds spent between the moment the task handling the stream is woken up and the moment it is effectively called.|
||The total number of nanoseconds between the moment the task handling the stream is woken up and the moment it is effectively called.|
Add these to your log messages like this:
This is a great way to gauge which requests cost the most to process.
Parsing HAProxy Logs
As you’ve learned, HAProxy has a lot of fields that provide a tremendous amount of insight about connections and requests. However, reading them directly can lead to information overload. Oftentimes, it’s easier to parse and aggregate them with external tools. In this section, you’ll see some of these tools and how they can leverage the logging information provided by HAProxy.
HALog is a small but powerful log analysis tool that’s shipped with HAProxy. It was designed to be deployed onto production servers where it can help with manual troubleshooting, such as when facing live issues. It is extremely fast and able to parse TCP and HTTP logs at 1 to 2 GB per second. By passing it a combination of flags, you can extract statistical information from the logs, including requests per URL and requests per source IP. Then, you can sort by response time, error rate, and termination code.
For example, if you wanted to extract per-server statistics from the logs, you could use the following command:
This is useful when you need to parse log lines per status code and quickly discover if a given server is unhealthy (e.g. returning too many 5xx responses). Or, a server may be denying too many requests (4xx responses), which is a sign of a brute-force attack. You can also get the average response time per server with the
avg_rt column, which is helpful for troubleshooting.
With HALog, you can get per-URL statistics by using the following command:
The output shows the number of requests, the number of errors, the total computation time, the average computation time, the total computation time of successful requests, the average computation time of successful requests, the average number of bytes sent, and the total number of bytes sent. In addition to parsing server and URL statistics, you can apply multiple filters to match logs with a given response time, HTTP status code, session termination code, etc.
HAProxy Stats Page
Parsing the logs with HALog isn’t the only way to get metrics out of HAProxy. The HAProxy Stats Page can be enabled by adding the
stats enable directive to a
listen section. It displays live statistics of your servers. The follow
listen section starts the Stats page listening on port 8404:
The Stats Page is very useful for getting instant information about the traffic flowing through HAProxy. It does not store this data, though, and displays data only for a single load balancer.
HAProxy Enterprise Real-Time Dashboard
If you’re using HAProxy Enterprise, then you have access to the Real-Time Dashboard. Whereas the Stats page shows statistics for a single instance of HAProxy, the Real-Time Dashboard aggregates and displays information across a cluster of load balancers. This makes it easy to observe the health of all of your servers from a single screen. Data can be viewed for up to 30 minutes.
The dashboard stores and displays information about service health, request rates and load. It also makes it easier to perform administrative tasks, such as enabling, disabling and draining backends. At a glance, you can see which servers are up and for how long. You can also view stick table data, which, depending on what the stick table is tracking, may show you error rates, request rates, and other real-time information about your users. Stick table data can be aggregated as well.
The Real-Time Dashboard is one of many add-ons available with HAProxy Enterprise.
In this blog post, you learned how to configure HAProxy logging to get observability over your load balancer, which is a critical component within your infrastructure. HAProxy emits detailed Syslog messages when operating in either TCP and HTTP mode. These can be sent to a number of logging tools, such as rsyslog.
HAProxy ships with the HALog command-line utility, which simplifies parsing log data when you need information about the types of responses users are getting and the load on your servers. You can also get a visual of the health of your servers by using the HAProxy Stats Page or the HAProxy Enterprise Real-Time Dashboard.
Want to know when content like this is published? Subscribe to this blog or follow us on Twitter. You can also join the conversation on Slack! HAProxy Enterprise combines HAProxy with enterprise-class features, such as the Real-Time Dashboard, and premium support. Contact us to learn more or sign up for a free trial today!