Intent-driven, Fully Automated Deployment of Anycasted Load Balancers with HAProxy and Python
In this presentation, Sebastian Langenhorst and Johannes Kampmeyer describe building self-service load balancing using HAProxy at the University of Paderborn. Because HAProxy can be modified through configuration management, they were able to develop a system where each department has control of their own services’ IP and port mappings, TLS termination, redirects, and more through a declarative, intent-driven language. They use HAProxy for a wide range of services because it gives them a vendor-agnostic solution and allows them to future-proof their infrastructure.
Intent-driven, Fully Automated Deployment of Anycasted Load Balancers with HAProxy and Python: a long title, but excuse us, we are from a university, we have to do something like that. So, who are we?
To maintain all this we have quite some infrastructure, and of course we need a load balancer to do so. We had this quite pricey BIG-IP solution before that went out of everything, so we had to replace it. We took a look at the market and one product stood out: HAProxy. We try to use HAProxy for everything we do: we do LDAP, we do mail, we do web servers, like everything we have, with HAProxy.
We went for an active-active setup. The VIPs, around 150 of them by now, get anycasted to both load balancers, so it’s an anycast structure above HAProxy. How do we do it? We have an anycast health checker that checks whether HAProxy is up and running and then announces the VIPs via BIRD to the route reflectors, which in turn announce them to the network switches.
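The check-then-announce step described here can be sketched in Python. This is only an illustration under assumptions, not the speakers' actual health checker: the function names, the plain TCP-connect probe, and the example addresses are hypothetical, and the step that hands the resulting prefixes to BIRD is left out.

```python
import ipaddress
import socket


def haproxy_is_up(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to the local HAProxy succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def prefixes_to_announce(vips, healthy: bool):
    """Return host routes (/32 or /128) for the VIPs, or none when unhealthy."""
    if not healthy:
        # Withdrawing every route lets the other anycast node take the traffic.
        return []
    return [str(ipaddress.ip_network(vip)) for vip in vips]
```

A caller would run `prefixes_to_announce(vips, haproxy_is_up("127.0.0.1", 443))` on a timer and pass the result on to the routing daemon; returning an empty list on failure is what makes the node drop out of the anycast group.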
With our former solution, everything was routed through the load balancer. We wanted to get away from that, so that the routers are routing and the load balancers are load balancing, so we don’t have a single point of failure. We still have a single point of failure, but there’s a…we had a problem that our ACLs on the routers didn’t match on the load balancer. With HAProxy, we wanted to get a solution for this. We use termination on the HAProxy and we make heavy use of the PROXY protocol whenever it’s possible; if it’s not possible and it’s HTTP, we use X-Forwarded-For headers.
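Both mechanisms exist to pass the original client address to the backend. As a rough sketch (the helper names are hypothetical and this is not code from the talk), the human-readable version 1 of the PROXY protocol is a single text line sent ahead of the TCP payload, while X-Forwarded-For is an HTTP header that each proxy appends to:

```python
import ipaddress


def proxy_v1_header(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> bytes:
    """Build the PROXY protocol v1 line sent ahead of the proxied TCP payload."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    if src.version != dst.version:
        raise ValueError("source and destination must share an address family")
    family = "TCP4" if src.version == 4 else "TCP6"
    return f"PROXY {family} {src} {dst} {src_port} {dst_port}\r\n".encode("ascii")


def add_forwarded_for(headers: dict, client_ip: str) -> dict:
    """Append the client IP to X-Forwarded-For, keeping any earlier hops."""
    updated = dict(headers)
    previous = updated.get("X-Forwarded-For")
    updated["X-Forwarded-For"] = f"{previous}, {client_ip}" if previous else client_ip
    return updated
```

The PROXY protocol works for any TCP service (mail, LDAP) because it sits below the application protocol, which is why the header is only the fallback for HTTP.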
Another request from our web team, well, it’s more of a demand, was: “We don’t want to have anything to do with SSL, with TLS. Please terminate.” So we terminate everything that has something to do with TLS on the HAProxy. That is wonderful for us! For the first time we have a consistent set of SSL ciphers out there, and we have a consistent set of TLS versions that we are supporting or not supporting. It’s security, it’s great.
Still, active-active is nice, but we have some interesting products, I will call them, which still require that we are routing on the HAProxies. Thanks to the TPROXY extension it’s still possible to use HAProxy as a frontend and then transparently tunnel through HAProxy and route. There’s one product we have that uses the client IP as the session key. It’s a real good idea, but we can’t replace it, it belongs to another department, so we have to work around it.
So, we don’t have to have any access to the routers; we go through the route reflectors. The route reflectors, HAProxy, and the service backends, everything feeds out of one config deployment, and whenever something new appears or something changes, the anycast health checker will notice on the HAProxy machines and will then, via BIRD, tell the route reflectors, and those then announce a BGP route to the datacenter routers. So, that’s everything we need, and with the route reflectors we have an additional layer of security there.
It’s implemented in Python, and it supports common templating languages, which we make some use of, some heavy use of. Since it’s Python, it’s easily extensible to outside sources: if there’s some data we can source from anywhere, we can get it into our config management and make use of it. That gave us the idea that it could make possible a whole, automated approach to getting HAProxy to run, and that’s Johannes’ part.
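To illustrate what templating a configuration from data looks like, here is a minimal sketch that uses `string.Template` from the Python standard library as a stand-in for whatever templating language the config management provides. The service fields (`name`, `mode`, `vip`, `port`) and the `_backend` naming are assumptions made for this example; only the `frontend`, `mode`, `bind`, and `default_backend` keywords are HAProxy's own.

```python
from string import Template

# Hypothetical template for a single HAProxy frontend section.
FRONTEND = Template(
    "frontend $name\n"
    "    mode $mode\n"
    "    bind $vip:$port\n"
    "    default_backend ${name}_backend\n"
)


def render_frontend(service: dict) -> str:
    """Fill the frontend template from a service definition."""
    return FRONTEND.substitute(service)


if __name__ == "__main__":
    print(render_frontend(
        {"name": "www", "mode": "http", "vip": "192.0.2.10", "port": 80}
    ))
```

Because the service definition is plain data, it can just as well come from an external source such as an inventory database, which is the extensibility point made above.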
We monitor heavily with Icinga2. So if we know a service is running on HTTP, we can check its liveness with active checks on the HAProxy frontend, and also check whether the backends are alive. This also works for our mail service and whatever you like, because it’s extensible. The best thing is that it’s a single source of information, so we simply say: if it’s not there, it doesn’t exist for us.
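Deriving the monitoring from the same service definition could look roughly like the sketch below. The field names, the `"http"`/`"tcp"` check labels, and the output shape are assumptions for illustration; the talk does not show how the Icinga2 objects are actually generated.

```python
def checks_for(service: dict) -> list:
    """Derive one frontend check plus one check per backend from a service."""
    # An HTTP service gets an HTTP-level probe; anything else a plain TCP probe.
    kind = "http" if service.get("mode") == "http" else "tcp"
    checks = [{
        "role": "frontend",
        "target": service["fqdn"],
        "port": service["port"],
        "check": kind,
    }]
    for backend in service.get("backends", []):
        checks.append({
            "role": "backend",
            "target": backend["host"],
            "port": backend["port"],
            "check": kind,
        })
    return checks
```

A service that is missing from the definitions produces no checks at all, which matches the "if it’s not there, it doesn’t exist" rule.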
If the web team has another service, you create a new folder and add a new defaults file, and they have defaults for every type of service they want to deploy. The best thing about this is that you don’t have to specify all the fields that we support, because most things we can simply infer. If you have a service file like here on the right, and you see that we have SSL turned on and we say the SSL redirect is false, we know, okay, the HTTP server is doing the redirecting.
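The combination of per-folder defaults and inferred fields can be sketched as follows. The keys (`ssl`, `ssl_redirect`, `port`, `redirect_done_by`) and the default rules are hypothetical, chosen only to mirror the example in the talk; the real schema is not shown in this transcript.

```python
def resolve_service(defaults: dict, service: dict) -> dict:
    """Overlay a service file on its folder defaults, then infer what is missing."""
    resolved = {**defaults, **service}  # values in the service file win
    ssl = resolved.get("ssl", False)
    # Assumption: listen on 443 when TLS is on, otherwise on 80.
    resolved.setdefault("port", 443 if ssl else 80)
    # Assumption: with TLS on and no explicit choice, HAProxy does the redirect.
    resolved.setdefault("ssl_redirect", ssl)
    if not ssl:
        resolved["redirect_done_by"] = None
    elif resolved["ssl_redirect"]:
        resolved["redirect_done_by"] = "haproxy"
    else:
        # SSL on but redirect off: the HTTP server behind HAProxy redirects.
        resolved["redirect_done_by"] = "backend"
    return resolved
```

With folder defaults of `{"ssl": True}` and a service file containing only an FQDN and `"ssl_redirect": False`, the resolved service listens on 443 and leaves the redirect to the backend, without either value being spelled out by the department.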
Also, automation always has a price in business complexity. Getting someone up to the level where they understand how we are doing things, yeah, it takes some time. It’s not like, “Yeah, here you have your Ansible playbook; deploy this and you’re done.” But, in closing, things are not as simple as you might think.
The IPv6 migration is really easy because we can simply give the FQDN an IPv6 address and build the configuration once again so that HAProxy listens on that IP; it gets automatically announced by BIRD, and then we’re done. Because HAProxy acts as an IPv6-to-IPv4 gateway, we don’t even need IPv6 for the backend. We have fast reconfiguration, because if I change something in this defaults file, I can rebuild the whole batch of servers in seconds. It’s also really cheap if we compare it to our old BIG-IP solution, we have no vendor lock-in, and, what I like best, it’s future-proof. If we ever say, “Okay, we are moving to the cloud or we are doing Kubernetes or some other fancy technology,” we have a consistent definition of what a service is and we can simply apply this to our new technology; we have written it once, so we can simply reuse it. Also, security is always a nice benefit!