Boosting the transparency of your load balancer traffic is advantageous. Web applications continually pass information back and forth, yet some of this important data is often hard to get during transit. And while the perceived “black box” nature of networking seems overwhelming, what if you could peek behind the curtain to better understand your traffic?
This starts with preserving crucial data such as the client's source IP address. Clients often use the PROXY protocol to transmit the original client IP address, and also to embed additional information in the request. In fact, many applications embed additional fields such as PP2_TYPE_ALPN, which PROXY protocol Version 2 carries as type-length-value (TLV) entries. We developed this protocol in-house at HAProxy, and it's now widely supported by modern web infrastructure. However, there's some minor work involved in extracting and interpreting this header information. That's where popular external tools like TShark can help.
In this guide, we'll first explain why connection data matters. Next, you'll learn how the PROXY protocol works, how to use the TShark analyzer to capture and inspect packets, and how to view the extraction results.
What makes traffic data so valuable?
Theoretically, each request flowing through a load balancer like HAProxy can contain a client IP address, port information, a destination IP address, and even a virtual private cloud (VPC) subnet ID. These bits of data enable connection validation. For example, parsing the header can help us identify good hosts with legitimate identities. Equally, HAProxy can use that information to flag bad hosts and halt traffic accordingly.
At the highest level, these parameters provide important context behind requests and boost the transparency of client-server communication pathways. It’s easier to understand how traffic is flowing and possibly identify configuration errors.
Finally, data packaged via the PROXY protocol lets us chain multiple layers of NAT or TCP proxies together while keeping the original IP address. For instance, the PROXY protocol header lets traffic pass through successive firewalls and proxies without losing that address.
What is the PROXY protocol?
The PROXY protocol provides a convenient way to transport connection information safely between client and server. Without a load balancer in the middle, the server could retrieve this data directly from the connection itself. In HAProxy, this information includes the following:
An address family, such as AF_INET6 for IPv6
Socket protocols for TCP and UDP
Layer 3 source and destination addresses
Any Layer 4 source and destination ports
We designed the
PROXY protocol to be quickly parseable. While Version 1 focused on human readability, Version 2 adds binary encoding support to accelerate this process even further. The
PROXY protocol is the successor to our
How the PROXY protocol works
We know that clients use the PROXY protocol to embed connection information. But how does everything work under the hood?
We've designed the protocol to reduce information processing overhead without requiring users to make sweeping changes to their backend components. Plus, the protocol prevents the typical connection parameter losses that occur when relaying TCP connections through a proxy.
This sidesteps what we call the “dumb proxy” problem, where a load balancer processes protocol-agnostic data without knowing which protocol is transported atop the connection. HAProxy can run in pure TCP mode and fall into this category.
This isn’t a negative aspect in and of itself. However, technical challenges can arise when using the
keep-alive directive. Only the first request on a new connection will pass the
X-Forwarded-For header or the
Forwarded extension, which is a problem on long-lived connections that handle multiple requests. Since the client information within those headers often remains unchanged, sending a header with each request is unnecessary.
Packaging information via the
PROXY protocol solves this problem. HAProxy can prepend each connection with a header that reports insightful connection characteristics for the other side. This is easy to implement without protocol-specific knowledge. Plus, we can eliminate any dangers and limitations of caching.
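To make the "prepended header" idea concrete, here is a minimal Python sketch of what a receiving server might do with the first bytes of a connection that carries a Version 1 header. The names `ProxyInfo` and `split_proxy_v1` are hypothetical and chosen for illustration; this is not HAProxy's implementation, only a sketch of the human-readable line format (`PROXY TCP4 <src> <dst> <sport> <dport>` followed by CRLF).

```python
import ipaddress
from typing import NamedTuple, Optional, Tuple


class ProxyInfo(NamedTuple):
    """Connection details carried by a PROXY protocol Version 1 header."""
    family: str      # "TCP4" or "TCP6"
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int


def split_proxy_v1(data: bytes) -> Tuple[Optional[ProxyInfo], bytes]:
    """Split the first bytes of a connection into (header info, payload).

    Returns (None, payload) when the sender reported "PROXY UNKNOWN".
    Raises ValueError if the data does not start with a valid v1 header.
    """
    end = data.find(b"\r\n")
    # A Version 1 header is at most 107 bytes long, CRLF included.
    if not data.startswith(b"PROXY ") or end == -1 or end + 2 > 107:
        raise ValueError("no PROXY protocol v1 header at start of stream")

    parts = data[:end].decode("ascii").split(" ")
    payload = data[end + 2:]
    if parts[1] == "UNKNOWN":
        return None, payload
    if len(parts) != 6 or parts[1] not in ("TCP4", "TCP6"):
        raise ValueError("malformed PROXY protocol v1 header")

    _, family, src, dst, sport, dport = parts
    expected_version = 4 if family == "TCP4" else 6
    for addr in (src, dst):
        # ip_address() itself raises ValueError on unparseable input.
        if ipaddress.ip_address(addr).version != expected_version:
            raise ValueError("address does not match declared family")
    return ProxyInfo(family, src, dst, int(sport), int(dport)), payload
```

Because the header arrives once at the start of the connection, everything after the CRLF is untouched application data, whatever protocol it happens to be.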
So, we now have this conveniently prepackaged data. How do users unpack it into something usable?
Introducing the TShark packet analyzer
The TShark tool, part of Wireshark's open source packet-analysis software, lets us quickly and easily capture, read, and print PROXY protocol packets. It writes the PROXY protocol information it gathers to either terminal output or a destination file.
TShark also offers the following advantages:
CLI-based Wireshark filter integration
Direct packet capture
Packet capture (PCAP) analysis from a tcpdump
In fact, TShark's default configuration closely emulates tcpdump's functionality. The tcpdump tool also ships with most Linux-based operating systems, so many users may already be familiar with it. While HAProxy can interpret and understand the contents of these packets for routing purposes, it can't natively perform a tcpdump-style packet capture.
TShark’s manual states that it grabs data “from the first available network interface and displays a summary line on the standard output for each received packet.” Luckily, TShark is a user-friendly tool. Let’s dive into a technical example that uses TShark to unearth data from
PROXY protocol packets.
Set up TShark and test your packet capture
Before combing through your packets, you’ll have to install TShark using the CLI. This is necessary if your Linux distribution doesn’t automatically come with TShark:
For Red Hat Enterprise Linux (RHEL) and RHEL clones, enter the following command:
For Ubuntu and Debian, enter the following:
TShark should now be available! However, we recommend testing basic packet capture to ensure that everything is working correctly. Use the following command to capture TCP traffic for five seconds on every port except 22, which avoids grabbing your own SSH traffic:
|tshark -f "tcp port not 22" -a "duration:5"|
If you want to analyze your output later, you can also specify a destination file using the -w FILENAME argument. You can then read its contents with the tshark -r FILENAME command.
Next, the process for solely targeting
PROXY protocol packets is a little different. Here’s how to do it.
Finding and unpacking PROXY protocol packets
Capturing, identifying, and unpacking PROXY protocol packets is easy using Wireshark's display filters. For example, you can filter for only the relevant packets, and there are multiple proxy traffic filters to choose from.
In most cases, a
PROXY protocol packet has an embedded Source IP. You can find it with the following command:
|tshark -r FILENAME -Y proxy.src.ipv4|
This will only list packets that contain
PROXY protocol information. So, what if you want to view the contents of those packets? Simply add the
-V flag, which gives you the following CLI command:
|tshark -r FILENAME -Y proxy.src.ipv4 -V|
As a result, your Terminal will display something similar to what you see below:
|ubuntu@bk:/etc/hapee-2.6# tshark -Y proxy.src.ipv4 -V|
|0010 .... = Version: 2|
|.... 0001 = Command: 1|
|Address Family Protocol: TCP over IPv4 (0x11)|
|0001 .... = Address Family: IPv4 (0x1)|
|.... 0001 = Protocol: 0x1|
|Source Address: 192.168.64.1|
|Destination Address: 192.168.64.120|
|Source Port: 49344|
|Destination Port: 8081|
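The fields in this output follow the fixed layout of a Version 2 header: a 12-byte signature, one byte holding the version and command, one byte holding the address family and protocol, a two-byte length, and then the addresses and ports. As a rough illustration, the Python sketch below decodes that layout for the TCP-over-IPv4 case only (0x11). The function name `parse_proxy_v2_ipv4` is hypothetical, and the sample bytes are rebuilt from the values shown above rather than taken from a real capture.

```python
import socket
import struct

# Fixed 12-byte signature that opens every Version 2 header.
SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"


def parse_proxy_v2_ipv4(data: bytes) -> dict:
    """Decode a Version 2 header carrying TCP-over-IPv4 addresses (0x11)."""
    if len(data) < 16 or data[:12] != SIGNATURE:
        raise ValueError("not a PROXY protocol v2 header")

    # Byte 13: version (high nibble) and command (low nibble).
    # Byte 14: address family (high nibble) and protocol (low nibble).
    # Bytes 15-16: length of everything that follows, in network byte order.
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd >> 4 != 2:
        raise ValueError("unsupported PROXY protocol version")
    if fam_proto != 0x11:
        raise ValueError("this sketch only handles TCP over IPv4")
    if length < 12 or len(data) < 16 + length:
        raise ValueError("truncated address block")

    src, dst, sport, dport = struct.unpack("!4s4sHH", data[16:28])
    return {
        "version": ver_cmd >> 4,
        "command": ver_cmd & 0x0F,
        "family_protocol": fam_proto,
        "src_addr": socket.inet_ntoa(src),
        "dst_addr": socket.inet_ntoa(dst),
        "src_port": sport,
        "dst_port": dport,
    }


# Rebuild a header from the values in the TShark output and decode it again.
sample = (
    SIGNATURE
    + bytes([0x21, 0x11])                 # version 2 + command 1, TCP over IPv4
    + struct.pack("!H", 12)               # 4 + 4 + 2 + 2 bytes follow
    + socket.inet_aton("192.168.64.1")
    + socket.inet_aton("192.168.64.120")
    + struct.pack("!HH", 49344, 8081)
)
print(parse_proxy_v2_ipv4(sample))
```

Any bytes beyond the 12-byte address block but within the declared length would be TLV entries, which this sketch ignores.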
That’s it! The entire contents of your
PROXY protocol packet are now viewable and ready for inspection. You can repeat this same process for other
Inspecting traffic while using TLS
In most cases, your traffic will use TLS. However, these TLS packets can confuse tools like Wireshark and prevent them from correctly identifying PROXY protocol packets. This is especially true for PROXY protocol v2. Version 2 begins with a fixed binary signature that's designed to cause immediate failures when read as SSL/TLS (among other protocols), so that a receiver that isn't expecting the PROXY protocol rejects the connection instead of misreading it.
Luckily, you can use the following command to easily disable TLS protocol detection (for TShark testing purposes) during packet capture and restore
PROXY packet detection:
|tshark --disable-protocol tls -Y proxy.src.ipv4 -V|
This produces a similar output to our original command.
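If you run these commands often, it can help to assemble them in a script. The Python sketch below builds the argument list for the TShark invocations used in this guide. The helper names `build_tshark_command` and `run_tshark` are hypothetical; the only TShark options used are `-r`, `-Y`, `-V`, `--disable-protocol`, and the standard `-T fields -e FIELD` output mode, and `proxy.src.ipv4` is the only field name assumed.

```python
import subprocess
from typing import List, Optional, Sequence


def build_tshark_command(
    display_filter: str = "proxy.src.ipv4",
    pcap_file: Optional[str] = None,
    fields: Sequence[str] = (),
    disable_tls: bool = False,
) -> List[str]:
    """Assemble a tshark argument list for inspecting PROXY protocol packets."""
    cmd = ["tshark"]
    if pcap_file is not None:
        cmd += ["-r", pcap_file]              # read a saved capture
    if disable_tls:
        cmd += ["--disable-protocol", "tls"]  # stop the TLS dissector from claiming packets
    cmd += ["-Y", display_filter]
    if fields:
        cmd += ["-T", "fields"]               # print only the selected fields
        for field in fields:
            cmd += ["-e", field]
    else:
        cmd.append("-V")                      # print full packet details
    return cmd


def run_tshark(cmd: List[str]) -> str:
    """Run tshark and return its standard output.

    Intended for commands that read a file with -r; a live capture
    without a stop condition would not return.
    """
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
```

For example, `build_tshark_command(pcap_file="capture.pcap", fields=["proxy.src.ipv4"])` yields a command that prints one source address per matching packet, which is easier to post-process than the verbose `-V` output.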
Testing new PROXY protocol requests
While you can now grab important data, you'll sometimes need to generate
PROXY protocol requests to test your data capture. You can send these requests with the following cURL utility command:
|curl --haproxy-protocol http://<your load balancer>/...|
You'll need to add the accept-proxy keyword to the bind line of your configuration (for example, bind :80 accept-proxy) for this to work properly. This enables HAProxy to accept the PROXY protocol (both Version 1 and Version 2).
Running this command will verify that your connection is working, and that nothing is preventing
PROXY protocol requests from flowing through HAProxy. Here's an example output using the
curl --haproxy-protocol -v http://localhost command:
|* Trying ::1:80...|
|* connect to ::1 port 80 failed: Connection refused|
|* Trying 127.0.0.1:80...|
|* Connected to localhost (127.0.0.1) port 80 (#0)|
|> PROXY TCP4 127.0.0.1 127.0.0.1 53606 80|
|> GET / HTTP/1.1|
|> Host: localhost|
|> User-Agent: curl/7.74.0|
|> Accept: */*|
|* Mark bundle as not supporting multiuse|
|< HTTP/1.1 302 Found|
|< content-length: 0|
|< location: /hapee-stats|
|< cache-control: no-cache|
|* Connection #0 to host localhost left intact|
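The `> PROXY TCP4 ...` line in this output is the Version 1 header that curl sends ahead of the HTTP request. If curl isn't available, something similar can be done by hand. The Python sketch below opens a TCP connection, writes a Version 1 header built from the socket's own addresses, and then sends a plain GET request. The function name `fetch_with_proxy_header` is hypothetical, and the sketch assumes the listener has accept-proxy enabled and speaks unencrypted HTTP.

```python
import socket


def fetch_with_proxy_header(host: str, port: int = 80, path: str = "/",
                            timeout: float = 5.0) -> bytes:
    """Send a PROXY protocol v1 header followed by an HTTP GET; return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # Describe this connection's own endpoints in the header.
        src_ip, src_port = sock.getsockname()[:2]
        dst_ip, dst_port = sock.getpeername()[:2]
        family = "TCP6" if sock.family == socket.AF_INET6 else "TCP4"

        header = f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        # The header must be the very first bytes on the connection.
        sock.sendall(header.encode("ascii") + request.encode("ascii"))

        # Read until the server closes the connection.
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)
```

Calling `fetch_with_proxy_header("localhost")` against a listener without accept-proxy will typically produce an HTTP error or a dropped connection, which is itself a quick way to confirm which mode a port is in.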
Important header information at your fingertips
PROXY protocol is an incredibly useful mechanism for transporting important connection information through HAProxy. This data is valuable to users who want more insight into their traffic, without having to make major functional concessions.
And while some steps are required to unpack that information, tools like TShark make the process pretty painless. The next time you want to inspect the contents of your
PROXY protocol packets, you can do so without jumping through hoops or possessing niche technical knowledge.