Traffic mirroring makes it possible to stream production traffic to a test or staging environment. Use the HAProxy Traffic Shadowing agent to enable mirroring.
The HAProxy Stream Processing Offload Engine (SPOE) lets you stream data to an external agent in real time, where it can be processed by a program written in the language of your choice, including C, .NET Core, Go, Lua, and Python. This opens the door to extending HAProxy in many ways. We described the architecture of the SPOE in our blog post Extending HAProxy with the Stream Processing Offload Engine.
As part of the HAProxy 2.0 release, a new agent was introduced that uses the SPOE to mirror traffic to another environment. The Traffic Shadowing agent captures traffic, or a percentage of it, and sends a copy to another URL. You’d use it to send production traffic to a QA testing environment in order to validate a new version of a feature before it’s made public. That way, you reduce the risk of discovering a bug only after the feature is released.
In this blog post, we’ll demonstrate how to use mirroring to send samples of production traffic to your QA environment.
Real-world Traffic Without the Real-world Impact
Imitating real users is hard. Real users stress an application in ways that are difficult to reproduce artificially. For example, two users may perform unrelated tasks in different parts of the application simultaneously, which then triggers an unknown bug caused by the confluence of their actions. Race conditions, deadlocks, and other threading problems often surface under realistic load.
Traffic mirroring, or traffic shadowing, is a technique in which live production traffic is copied and sent to two places: the original production servers and a staging or test environment. That test environment may be segregated into a separate network that is not publicly accessible. As long as the requested URLs and parameters match the new version of the feature being tested, it’s easy to validate that the new version is as close to bug-free as possible.
The value of traffic mirroring is that you can do this without impacting your users. Mirroring traffic using the Traffic Shadowing daemon is fire and forget: copying requests and sending them to the test environment has almost no impact on the time needed to process them, because the client does not need to wait for a response from the test server. You can also capture only a portion of the traffic so that your test environment doesn’t need the infrastructure necessary to handle production-level volumes of requests.
Setting up Traffic Mirroring
Clone or download the source code repository and follow its instructions for building and installing the agent on your platform, for example Ubuntu Bionic.
Add the --enable-debug flag when you call configure (that is, ./configure --enable-debug) if you want to see more verbose output from the agent.
After you’ve followed these steps, the spoa-mirror program will be available on your PATH. Start it by passing it the --runtime argument, which sets how long it should run before exiting (e.g. --runtime 1h for one hour or --runtime 0 for unlimited time), and the --mirror-url argument, which sets the URL where you want to send the mirrored traffic.
The agent listens on all IP addresses at port 12345 by default. You can change these settings with the --address and --port arguments. You can also pass it the --daemonize argument to run the program in the background.
Starting with HAProxy version 2.0, you can have HAProxy manage the lifetime of the agent. Use the HAProxy Process Manager to start the daemon when HAProxy starts. Add a program section that contains a command directive to your HAProxy configuration, as shown:
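A minimal sketch of such a section follows. The section label mirror and the test endpoint http://test.local are assumptions chosen only for illustration; substitute the URL of your own test environment.

```haproxy
# Sketch only: the section label and the mirror URL are illustrative assumptions.
program mirror
    # HAProxy's process manager starts this command alongside its own workers.
    # --runtime 0 keeps the agent running indefinitely.
    command spoa-mirror --runtime 0 --mirror-url http://test.local
```

Program sections only take effect when HAProxy runs in master-worker mode (started with -W or -Ws, or with master-worker in the global section). The --daemonize argument is left out here on the assumption that the agent should stay in the foreground where HAProxy can supervise it.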
When you start HAProxy, you’ll see the spoa-mirror agent running alongside it in the process list.
Now the daemon is set up to receive requests and forward them to the test server.
To have the agent write a log file, add the --logfile flag to the spoa-mirror command. Prefix the filename with either w: to overwrite the file if it exists or a: to append to the file, such as --logfile a:/var/log/spoa-mirror.log.
The next step is to configure HAProxy to send traffic to the agent. Add a filter spoe directive to your frontend that references a file named mirror.conf, as shown:
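A sketch of such a frontend, assuming a hypothetical frontend name of fe_main and the file path and backend name used elsewhere in this post:

```haproxy
# Sketch only: the frontend name fe_main is an illustrative assumption.
frontend fe_main
    mode http
    bind :80
    # Buffer the request body so that it can be captured and mirrored.
    option http-buffer-request
    # Attach the SPOE filter; "mirror" must match the engine label in mirror.conf.
    filter spoe engine mirror config /etc/haproxy/mirror.conf
    default_backend servers
```

The filter line only attaches the engine to the frontend. What is captured and where it is sent is decided in the SPOE configuration file.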
We will cover what goes into mirror.conf in the next section.
Next, in addition to the backend that holds your production servers, add a backend that contains the address of the spoa-mirror agent. Here’s an example:
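The two backends might be sketched as follows. The server names, the production server address, and the timeout values are assumptions for illustration; only the backend names and the agent address come from the text.

```haproxy
# Sketch only: server names, the production address, and timeouts are illustrative assumptions.
backend servers
    mode http
    server s1 192.168.0.10:80 check

backend mirroragents
    # SPOE speaks its own binary protocol over TCP, so this backend is not in HTTP mode.
    mode tcp
    balance roundrobin
    timeout connect 5s
    timeout server 5s
    # The spoa-mirror agent listens on port 12345 by default.
    server agent1 localhost:12345
```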
In this example, production traffic is received at port 80. It is then sent to the servers backend like normal, but it’s also mirrored to the mirroragents backend, which relays it to the agent listening at localhost:12345. For this to work, you have to set up mirror.conf, which you’ll see in the next section.
The mirror.conf file
The filter spoe directive in your frontend lists a config parameter that points to the SPOE configuration file to use. Create that file now at /etc/haproxy/mirror.conf. An engine parameter sets a label that must match a section in mirror.conf. We’ve arbitrarily named it mirror in this case. It’s only important that they match. Add the following to the file:
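Here is a sketch of what the file could contain, pieced together from the description in the rest of this section. The timeout values are placeholders, and the argument names (arg_method, arg_path, and so on) are assumed to follow the sample configuration that ships with the agent, so check them against the agent's README before relying on them.

```haproxy
# /etc/haproxy/mirror.conf
# Sketch only: timeout values and argument names are assumptions.
[mirror]
spoe-agent mirror
    log global
    messages mirror
    use-backend mirroragents
    timeout hello 500ms
    timeout idle 5s
    timeout processing 5s

spoe-message mirror
    # Method, URL, HTTP version, all headers in binary form, and the buffered request body.
    args arg_method=method arg_path=url arg_ver=req.ver arg_hdrs=req.hdrs_bin arg_body=req.body
    event on-frontend-http-request
```

The url fetch is used for the path argument in this sketch so that the query string travels with the mirrored request.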
With this file, you configure how HAProxy will communicate with the agent(s). The file begins with an engine name, mirror, in square brackets. As mentioned, this must match the engine value set on the filter line in the HAProxy configuration.
The log global line means that events, such as when HAProxy sends data, will be logged to the same output defined by the log statement in the global section of the HAProxy configuration. The messages line is a space-delimited list of labels that match up with spoe-message sections. The use-backend line specifies which backend in the HAProxy configuration holds the mirror agents.
You can also set timeouts for various parts of the HAProxy-to-agent communication. The timeout hello setting limits how long HAProxy will wait for an agent to acknowledge a connection. The timeout idle setting limits how long HAProxy will wait for an agent to close an idle connection. The timeout processing setting limits how long an agent is allowed to process an event.
The spoe-message section defines which HAProxy fetch methods will be used to capture data to send to the agents. The label here, mirror, is expected by this particular agent. For traffic mirroring, we capture the following:
- the HTTP method
- the URL path
- the version of HTTP
- all HTTP headers
- the request body (note that this requires option http-buffer-request in the HAProxy configuration)
Data is sent every time that the on-frontend-http-request event fires, which is before the evaluation of http-request rules on the frontend side. Once you have this file in place, restart HAProxy for it to take effect. You should see requests to the Traffic Shadowing daemon appear in the log at /var/log/haproxy.log.
Tuning the Mirrored Traffic
There are a few ways to tune the traffic that gets mirrored. For one thing, you can add an ACL that limits the requests that get captured. For instance, if you only wanted to mirror traffic for requests to the /search feature on your site, you would ignore all requests except those that have a URL path beginning with /search, as shown:
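A sketch of such a condition, written as an anonymous ACL on the event line of the spoe-message section. The argument names are assumed from the agent's sample configuration.

```haproxy
spoe-message mirror
    # Argument names are assumed from the agent's sample configuration.
    args arg_method=method arg_path=url arg_ver=req.ver arg_hdrs=req.hdrs_bin arg_body=req.body
    # Send the message only when the request path begins with /search.
    event on-frontend-http-request if { path_beg /search }
```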
You can also define named ACLs that do the same thing:
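With a named ACL, the same rule might be sketched like this. The ACL name is_search is a hypothetical label, and the argument names are again assumed from the agent's sample configuration.

```haproxy
spoe-message mirror
    # Argument names are assumed from the agent's sample configuration.
    args arg_method=method arg_path=url arg_ver=req.ver arg_hdrs=req.hdrs_bin arg_body=req.body
    # is_search is a hypothetical ACL name chosen for illustration.
    acl is_search path_beg /search
    event on-frontend-http-request if is_search
```

Named ACLs pay off once several conditions have to be combined or reused across messages.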
Or suppose you want to capture only a portion of the traffic rather than all of it. You would add an ACL that collects a random sample of requests. In the next example, we generate a random number and mirror the request only if it falls in the lowest tenth of the range, so that roughly 10% of requests are captured:
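A sketch of such a sampling rule, assuming the rand sample fetch is used. rand(100) returns an integer from 0 to 99, so comparing it with lt 10 matches about one request in ten.

```haproxy
spoe-message mirror
    # Argument names are assumed from the agent's sample configuration.
    args arg_method=method arg_path=url arg_ver=req.ver arg_hdrs=req.hdrs_bin arg_body=req.body
    # rand(100) yields 0..99; values 0..9 pass, which is roughly 10% of requests.
    event on-frontend-http-request if { rand(100) lt 10 }
```

Raising or lowering the threshold changes the sampled share without touching the agent.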
Your ACL statements can also check values from map files. For example, you can switch mirroring on or off by using a map file that contains a key-value pair like mirroring on. Then, check the map file from your mirror.conf file like this:
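A sketch of such a check, assuming the map file lives at the hypothetical path /etc/haproxy/mirror.map and that the argument names follow the agent's sample configuration:

```haproxy
# /etc/haproxy/mirror.map (hypothetical path) would contain the single line:
#   mirroring on
spoe-message mirror
    # Argument names are assumed from the agent's sample configuration.
    args arg_method=method arg_path=url arg_ver=req.ver arg_hdrs=req.hdrs_bin arg_body=req.body
    # Look up the key "mirroring" in the map and mirror only while its value is "on".
    event on-frontend-http-request if { str(mirroring),map(/etc/haproxy/mirror.map) -m str on }
```

If the key is missing from the map, the lookup returns nothing and the condition evaluates to false, so no traffic is mirrored.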
Use the HAProxy Runtime API to change the value in the map file to off.
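The Runtime API is reached through a stats socket, so the global section needs one. A sketch, assuming a socket path of /var/run/haproxy.sock:

```haproxy
global
    # Expose the Runtime API on a UNIX socket. The admin level permits commands that change state.
    stats socket /var/run/haproxy.sock mode 660 level admin
```

With the socket in place, sending the Runtime API command set map /etc/haproxy/mirror.map mirroring off would switch mirroring off without a reload, assuming the map file lives at that path. Changes made this way are held in memory and are not written back to the file on disk.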
You can also use the Data Plane API to add or remove filter spoe lines from the HAProxy configuration file dynamically. For example, you can list the existing filters, add a new one, and then remove it.
Use the Data Plane API to fully configure your load balancer using REST API commands.
Tips for Making the Most of Traffic Mirroring
I’ll leave you with a few ways to get the most out of traffic mirroring:
- Set up monitoring and compare the errors you get from your production servers with those you get from the new version to which you’re mirroring traffic. Having a monitoring strategy in place will be key to validating a release.
- Make use of HAProxy’s built-in metrics, which you can consume via the HAProxy Stats page or the Prometheus module, to see whether the new version of your feature performs better or worse.
- Make sure that the feature you’re testing has URL paths and parameters that match the existing feature so that it is forward compatible with mirrored traffic. Forward compatibility may be a valuable test in and of itself.
In this blog post, you got a tour of the new Traffic Shadowing daemon, which uses the Stream Processing Offload Engine to capture live traffic and mirror it to a secondary URL. This is especially useful for vetting new versions of features before they’re released to the public. The best thing about it is your production traffic won’t be impacted by the mirroring. It’s a fire-and-forget process where the downstream client doesn’t need to wait for the mirroring agent to respond.