User Spotlight

From 1.5 into the Future: How HAProxy Rose from a Simple Load Balancer Replacement into our Swiss Army Knife

Christian Platzer
Willhaben

In this presentation, Christian Platzer explains why Willhaben chose HAProxy as their load balancer. Although they initially needed a way to add HTTPS to the site, they soon found that HAProxy gave them much more. It gave them the ability to pin SSL encryption work to specific CPU cores, use HTTP/2, extend functionality with Lua, and route traffic into Kubernetes. There were a few challenges along the way, which Christian says were much easier to solve with the help of the HAProxy Technologies support staff. Willhaben also uses the Enterprise edition’s Antibot module to protect against crawlers.


Transcript

Hey everyone. I wanted to give back some applause for the enormous and gorgeous organization of this event. I think a little applause is in order for everyone involved here, all the techs and staff. Thanks a lot for being here. I'm really excited. Yeah, my name is Christian Platzer.

So it's pronounced Willhaben, which, if you translate it into English, is something like "I want to have this", "wanna have". Yeah, it's a term the youth use. We are Austria's largest classified site, classified as in Craigslist, not as in top secret. We are also one of Austria's largest sites overall. We are top in Austria in terms of page requests and page impressions, which is kind of a contradiction: Austria is not a big country, but we are a large site. I'll give you some numbers and you can decide if we are a large company or not.

2015: The Journey Begins

All right, my task today during the next few minutes is to guide you through some of our decisions. Unfortunately, the first talk today essentially covered everything that we have running at our company. So, I'm quite happy that I didn't plan to detail our system setup. I'm going to talk about some quirks, how we made the decision to go for HAProxy in the first place, and how this could also apply to other companies. I'm going to give you some examples of challenges that we had to face and maybe you can get something out of it.

It all started in 2015, which still amazes me; I have to look up this date every time I do a presentation because it's just 4 ½ years ago. Back then we were around something like this. Maybe as background: Willhaben started as a startup. We started with very few people and grew over time, and now we have around 250 employees, and you can imagine that there were a lot of challenges we had to tackle to keep up with that growth.

Back then our traffic was in the range of 600 to 800 megabits per second. This is a decent amount of traffic, but still it's not that much, and 99% of it was HTTP. So, we were solely delivering HTTP requests and responses. That was it, except for login and payment. This was just 4 ½ years ago. Now we are running HSTS on our platform, so there is no plain HTTP anymore. Of course, we had a load balancer at the time, of an undisclosed name. We had just four pools: our web pool, one for the API, the desktop, and the caching image pool. That was it, and we were really limited in operating these pools. We just had the ability to drain them and disable them over the web frontend. There was no API access for us because we are not hosting the infrastructure ourselves; we operate from the operating system level up, and this also didn't change over time. And it cost a lot, this undisclosed load balancer solution.

Incentive

The first interesting thing I want to point out is our incentive. You might think that our incentive to go for a software load balancer solution would be something like gaining flexibility and giving the operators or SREs an easier day, but that was not it. The incentive to take a look at our load balancing solution came with the need to go all in on HTTPS. This was the major incentive, and at the time my manager approached me and said, "Hey, maybe we could just shop for the next bigger F5?" Okay, now I've said it.

I told him, "Yeah, we could, but it will cost a lot." You've seen the numbers from the first presentation. It really costs a lot, and besides, we weren't happy with it because we would still have to put up with the same limited functionality. So then this colleague of mine, let's call him Mr. O., was all like, "It's Linux, baby! We can do this in software."

I didn’t know HAProxy at the time, but he is one of these tech guys constantly pushing for new solutions and stuff, and he said that we can pull it off, and in my opinion, I always thought that if something is done in hardware it has to be way better than software because otherwise they would do it in software. So, I was really skeptical about replacing something like an F5 with a software solution that I didn’t even know at the time; but after some initial testing…yeah, I thought, maybe we can pull it off. We approached our manager. He gave his okay and that’s how HAProxy 1.5 came to pass.

First Iteration: Version 1.5

Our initial setup may look weird to you if you haven't used HAProxy in the 1.5 version. There was no multithreading; it was all multi-processing. To give you a quick overview of what the design does: we have this bunch of frontend cores that essentially just do the HTTPS offloading, and then they connect to the backends through localhost connections, plain HTTP localhost connections with the PROXY protocol version 2. From there we go to the backends. The scaling was pretty much as you see there: we had these two backend processes.

The funny thing is, with this kind of setup we kind of emulated a hitless reload, because you could drain one backend, reconfigure it, bring it up again, drain the second backend, reconfigure it, and bring it up again. We usually didn't touch our frontend process, because we only needed to touch or reload it when we really needed a new IP, and that was not a common case back in the day. So, this worked pretty well. It was perfectly fine to deal with our 600 to 800 megabits per second. We always had core stickiness; that's something we also decided to keep even for the new implementations, and we were pretty happy with it.

Of course, when we first switched the DNS there were some problems; we didn't account for the enormous number of connections that would hit our backends if you keep them in Keep-Alive mode. We had to switch to Server-Close there, but nothing that you couldn't get under control. Yeah, enough of that.
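
To make that first iteration more concrete, here is a minimal sketch of what such a 1.5-era multi-process layout can look like: TLS-offloading processes forwarding over localhost with the PROXY protocol v2, plus option http-server-close to avoid the keep-alive pile-up mentioned above. Process counts, ports, and paths are illustrative, not Willhaben's actual configuration.

```
global
    # One process per core: processes 1-4 terminate TLS, 5-6 speak plain HTTP.
    nbproc 6
    cpu-map 1 0
    cpu-map 2 1
    cpu-map 3 2
    cpu-map 4 3
    cpu-map 5 4
    cpu-map 6 5

defaults
    mode http
    timeout connect 5s
    timeout client  30s
    timeout server  30s
    # Close server-side connections after each response to avoid the
    # pile-up of idle keep-alive connections on the backends.
    option http-server-close

# TLS offloading, bound only in the "frontend" processes.
frontend fe_tls
    bind :443 ssl crt /etc/haproxy/certs/site.pem process 1-4
    default_backend be_local

# Hand over to the plain-HTTP processes via localhost, passing the
# client address with the PROXY protocol v2.
backend be_local
    server local1 127.0.0.1:8081 send-proxy-v2
    server local2 127.0.0.1:8082 send-proxy-v2

# Plain-HTTP listeners, one per "backend" process, accepting PROXY protocol.
frontend fe_http1
    bind 127.0.0.1:8081 accept-proxy process 5
    default_backend be_app

frontend fe_http2
    bind 127.0.0.1:8082 accept-proxy process 6
    default_backend be_app

backend be_app
    balance roundrobin
    server web1 10.0.1.11:8080 check
    server web2 10.0.1.12:8080 check
```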

Challenge 1: Bandwidth

Now, I want to guide you through some of the challenges. This is a mostly chronological line of challenges; there are five of them. First, bandwidth. We were pretty sure that we had to deal with bandwidth in the first place, because all these HAProxy servers, which were, by the way, old ESX hosts that we repurposed, were perfectly fine dealing with the traffic, but they only had bonded one-gigabit interfaces. We were already at 800 megabits per second peak, so we had to do something there, and since these prebuilt systems all had something like six NICs, we decided to go with an IEEE 802.3ad bond, which is essentially link aggregation on the OS level. But the thing is, this must also be configured on the switch, where we didn't have access, and it turned out that this point caused the most problems, because our infrastructure provider was challenged by it. Let's put it this way.

So how did we solve this? We ran with this setup for some time. Now we are running everything on 10-gigabit interfaces, so everything is scaled up now, but at the time that was simply too expensive. One problem that we ran into was something like this. I think you can read it, or decipher it. We had four of these interfaces connected into the link aggregation and one of them was overloaded. We hit the cap of one of the interfaces, because at that time we only had two cache servers with two IPs, so the bandwidth was not evenly distributed and we saturated one of the links, which had some really weird effects. At first we thought that HAProxy might have a bug, but then we looked at it and saw, "Okay, this is something completely different." This was also the reason why we decided not to go for link aggregation again now that we are approaching the 10 Gig limit. We are not hitting it yet, but we're getting close to it, we're still growing, and that's maybe something that we'll have to deal with in the future.

Challenge 2: Multithreading

The second challenge that we needed to tackle was going for multithreading. Why? We wanted to get rid of the multiple backend processes, first of all for scalability, because it would be nicer to just spawn threads and not care how many processes are actually running. We also have our configuration in Ansible and, theoretically, it was no problem to scale the backends or the frontends, but there are health check issues. For instance, if you do it like we did, the frontend processes have to health check their backend processes, and if you spawn another one, then the health check traffic multiplies.

So, there were a lot of good reasons for us to go to multithreading, and one of the major ones was Kubernetes: it was not strictly required for Kubernetes, but we really wanted to have it, because in Kubernetes you need an Ingress Controller and it would be nice to just touch one configuration file there. Multithreading was something that we really wanted. It worked flawlessly without HTTPS; remember, we had HTTP on the backend side, solely HTTP. There we were running it from version 1.8 without any issues. This was really nice, and by now it's stable in 1.9r1; but it proved to be challenging when we enabled HTTP/2, or h2, challenging in terms of undesired behavior. I'm going to get into the details of what happened there on a later slide.
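
For reference, here is a minimal sketch of the threaded, single-process equivalent (1.9-style syntax); the thread count and CPU numbers are illustrative rather than Willhaben's exact values.

```
global
    # A single process with many threads replaces the per-core processes;
    # cpu-map keeps the core stickiness we had before.
    nbproc 1
    nbthread 14
    # Pin each of the 14 threads of process 1 to its own CPU core.
    cpu-map auto:1/1-14 0-13
```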

Challenge 3: Kubernetes

Then we wanted to have Kubernetes. Kubernetes is cool. Everyone wants to use it, and there are certain requirements that your load balancing infrastructure has to fulfill to actually do that. One was the Ingress Controller; you need an Ingress Controller. We are not running HAProxy as a sidecar container within Kubernetes. This is bare metal: we have two machines, which are technically active-active, but consider it active-standby with Keepalived between them. We needed an Ingress Controller to actually do this. We also needed hitless reloads, because if you're dealing with Kubernetes you will get a lot of configuration changes if there are a lot of deployments, and if you're thinking about continuous delivery, then you really need hitless reloads to do that.

Also, one thing that I really took care to have is guaranteed syntax correctness. At any point in time I wanted to have a reloadable, restartable configuration on the host machines. You might think, how can it happen that it's not that way? The solution for these problems is actually quite simple. In HAProxy 1.9 hitless reloads were introduced. Yay! We immediately switched to 1.9. It was a good decision and we finally were able to put everything in one process, have one configuration file where everything is there, and go for Kubernetes.
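
As an illustration of the moving parts, here is a hedged sketch assuming the standard master-worker and socket-transfer mechanism; the paths and commands are illustrative.

```
global
    # A master process supervises the workers and survives reloads.
    master-worker
    # Expose the listening sockets over the stats socket so a newly started
    # process can take them over without dropping connections.
    stats socket /var/run/haproxy.sock mode 600 level admin expose-fd listeners

# Guaranteed syntax correctness: validate before touching the running process.
#   haproxy -c -f /etc/haproxy/haproxy.cfg
# Hitless reload: the new process fetches the listeners from the old one.
#   haproxy -W -f /etc/haproxy/haproxy.cfg -x /var/run/haproxy.sock \
#           -sf $(cat /var/run/haproxy.pid)
```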

In Kubernetes we have a self-written Ingress Controller. We are not on 2.0 yet, but it's definitely something we will look into. Until now we've been using our self-written Ingress Controller. It's written in Go and it works pretty nicely, but it does not support everything you could do in a Kubernetes Ingress itself; there are certain things missing there.

We're using optional backends. What does this mean? Usually, in a configuration you have some IP address. We do it like this: that's an ACL; we put all the IP addresses in a list, it's not that many. If the destination address is this one, then use this backend, k8sapp. If you just put it there, the configuration below is generated.

We include two configuration files: one is the main configuration; one is the configuration provided by the Ingress Controller. If you simply write use_backend k8sapp and someone decides to un-expose the service, then the backend is deleted and you cannot reload or restart HAProxy again, because it's a syntax error: you are using a backend that isn't there. If you do it like this instead, that one rule won't work, but everything else will, and that's what we go for. I don't know if there's a better solution. Probably there is, but that's what we came up with. I don't know, you know how it is. You have a problem, you find a solution, it works, you keep it that way.
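
Here is a minimal sketch of one way to express such an optional backend, assuming the trick of giving use_backend a runtime-evaluated name; a name containing a sample expression is not validated at startup, so a backend removed by the Ingress Controller no longer breaks a reload. The file names and backend names below are illustrative.

```
frontend fe_main
    bind :443 ssl crt /etc/haproxy/certs/
    # The Kubernetes service IPs live in a plain list file.
    acl to_k8s dst -f /etc/haproxy/k8s-vips.lst
    # Because the backend name is a runtime expression, HAProxy does not
    # require a backend named "k8sapp" to exist at config-check time; if the
    # Ingress Controller removed it, requests fall back to the default.
    use_backend %[str(k8sapp)] if to_k8s
    default_backend be_app

backend be_app
    server web1 10.0.1.11:8080 check
```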

Challenge 4: HTTP/2

Challenge 4, the second to last challenge that we had to tackle, was that we wanted to go for HTTP/2. HTTP/2 was really something that we had in mind to use, of course, because it's more efficient. If you don't know much about it, the major difference is that there are far fewer connections from a client to a server and multiple requests can be handled over a single TCP connection. That's the major point here. Of course, it also reduces latency because additional TCP connections do not have to be established in the first place. So, it's really efficient and it's the future, so we really wanted to go for it.

It had a huge impact on the number of TCP sessions on our servers. Of course, we also wanted to be cool and have h2 enabled as fast as possible, and we did enable it. Essentially, you just put the right things into HAProxy and HAProxy does the magic for you. It's done, and we got all that we wanted, so now we were cool. We had fewer connections; everything was really fine.
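
For context, the client-facing part of enabling h2 really is just advertising it via ALPN on the TLS bind line, while the backend side can stay HTTP/1.1; a small illustrative example:

```
frontend fe_main
    # Offer HTTP/2 to clients that support it and fall back to HTTP/1.1;
    # the backend connections can remain plain HTTP/1.1.
    bind :443 ssl crt /etc/haproxy/certs/ alpn h2,http/1.1
    default_backend be_app
```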

Challenge 4: HTTP/2 (continued)

But we also got some other things. For instance, these busy-looping processes that were mostly connected to HTTP/2 and HTTPS. Now, really, my kudos here to the HAProxy Enterprise support team (we are running Enterprise); the speed with which they produced fixes for all these problems was enormous, so really perfect.

We also had some segfaulting. It looks bad, but if you think about it, the busy looping is far worse. First of all, our prime time is not during working hours: people get home from work and scroll through our ads, so that's mainly in the evening, and all these problems were prone to happen during our prime time, during high-load scenarios.

Sometimes one of the connections just got stuck and it essentially just took the resources of one of the processes. This by itself is something that will not pop up in some monitoring because it’s just a higher load. I mean our load, normally, is in the range of six cores during rush hours and then it was seven cores. We do have monitoring on the load there and if more than one gets stuck, we get an alert and we can restart it there, but still it can be pretty annoying.

On the other hand, if the process segfaults, it stops and is immediately restarted. Sometimes it was so fast that Keepalived did not fail a single health check in that time. HAProxy starts really fast, so it produced a downtime of one to three seconds or so, which is still bad, but not that bad. A segfault, in my opinion, is something that we could live with more easily.

Then, we also had some very strange behavior that I want to go into and give an overview of. We saw a lot of these lines, and with "a lot" I mean something like 40,000 to 60,000 a day. The termination state cd means the client aborted the connection and it was in the data phase. So, the client sent the request, everything was okay, headers and everything, HAProxy directed it to the backend, started receiving the answer, and then the connection was cut. We also did a TCP dump and it was just like that: HAProxy sent the reset packet.

We also saw this in our applications. We are running Java and there it was, a lot of these lines: Error while sending response to client. Connection reset by peer. So, the peer reset the connection. This has to be bad, right? After hours of digging and reproducing this stuff, we found out that it’s a Firefox feature and it’s called Race Cache with Network. Who of you knows this Firefox feature? Raise your hands. One. Please keep it up, I can see it. Yeah that’s one geek. Two, three. Yeah, about the same happened at our company.

Race Cache with Network in Firefox works like this: If there’s a cached resource, an image or something, in our case a configuration, and Firefox requests this resource longer than 500 milliseconds after the initial page load, then Firefox decides, “Okay, this web page is apparently very, very slow and most probably it’s caused by a slow computer. A lot of laptops are still running spinning discs and if it takes longer than 500 milliseconds, you know what I do? I just request the file, which should be cached from the local cache, and simultaneously I produce a network request to the original source and whichever arrives first, I will take.”

That was the first thing that boggled us: why are these resources even queried in the first place, when they should be cached on the client? We thought we had messed up something there, but this was the case and it took us a really long time to get to the bottom of it. That was what I wanted to mention. Before we enabled HTTP/2, there was no way for the client to abort this request except terminating the TCP connection, which it wouldn't do. It wouldn't terminate the TCP connection; I don't know why. With HTTP/2, it could, because it just used the already established connection and immediately sent an abort for that stream. The TCP connection stayed established and that's it. This was a feature that impacted us because we enabled HTTP/2, but it was not really caused by it. Quite interesting. If you see something like this, remember my words.

Challenge 5: Crawler Protection

Finally, the next challenge, and my personal vendetta: crawler protection. Crawlers are bad for us. We are a classified site, we produce ads for our customers, and sometimes we are not fast enough as it is, but we do want to be crawled by Google because they index us and produce hits. The crawlers come in a range of varieties. For instance, there's the Googlebot and Bingbot mayhem every website has to deal with, but there are also badly written bots.

Just two weeks ago we had a student at a university trying some things, and he was producing searches from his mobile device as fast as the network could handle. It was enough that we really noticed him. I could tell you a lot of stories about these requests. It also impacts our business because it sometimes produces wrong access numbers: if you want to show how many requests went to your ad, for instance, that number could be heavily influenced by a crawler.

Right from the start we implemented something from HAProxy, which is the tarpitting functionality. This is quite easy; I really like it. What it essentially does is: you have a blacklist of IP addresses, and if such an IP address requests your service, then after a grace period of ten seconds it gets a 503 or a 502 and it looks like the service is down. It works pretty well, but you have to be careful because it's highly disruptive. You are blocking all of the requests to your site from a particular IP address, and this could be a bad idea if a company is running a NATed network or a web proxy, for instance. Then you always see the same IP address, and if it misbehaves and you block the whole range or the whole IP address, it could lead to bad side effects. Of course, it's also dangerous if you have a dynamic IP pool. Most mobile devices are assigned addresses from dynamic IP pools, and if you blacklist them it could easily happen that the IPs switch and then a new customer suddenly cannot access your site anymore and you don't know why.
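
A minimal sketch of that tarpit setup, with an illustrative blacklist path; the ten-second grace period is simply the tarpit timeout.

```
frontend fe_main
    bind :443 ssl crt /etc/haproxy/certs/
    # Hold blacklisted clients for this long before answering with an error,
    # so to the crawler the service simply looks down.
    timeout tarpit 10s
    acl blacklisted src -f /etc/haproxy/blacklist.lst
    http-request tarpit if blacklisted
    default_backend be_app
```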

Challenge 5: HAProxy Enterprise Antibot Module

We tried to get away from that and have something less disruptive, and that's where the HAProxy Enterprise edition Antibot module comes in; it's the main reason why we decided to use the Enterprise edition in the first place. Basically, it is quite simple. If a client requests your site, it doesn't get the site; on the first request it gets JavaScript. It needs to execute the JavaScript, the JavaScript on the client sets a cookie, and the client is redirected to the original page. If the cookie is correct, which is verified by HAProxy, then it gets the original site. It works pretty well.

It was really easy to implement the different strategies for how to produce this challenge. Either it's automated, or you can even include Google CAPTCHAs; there's a wide variety of things that you can do, and it looks like this. Sorry, I had to black out the secrets, but essentially all the logic to decide whether such a challenge is produced in the first place is contained in a Lua script. I decided to go with Lua there. Most probably you could do most of these things in the configuration itself, but it's pretty nice to have it contained in a Lua script; then you can code different triggers and so on. We came up with this.

Then those top secret rules check whether the cookie is correct, and long story short, the user gets a challenge if the Lua script disagrees or the cookie was bad. That's it, and it works pretty nicely. There are some caveats, some quirks that you have to take care of. For instance, this is very bad for Ajax requests, because if you're doing an Ajax request and you get a JavaScript challenge, it's usually not solved. You have to take care not to mess up your application.
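
To give an idea of the overall shape, and only as a hypothetical sketch (this is not the actual HAProxy Enterprise Antibot module configuration; the lua.should_challenge fetch, the cookie name, and the paths are invented for illustration), Lua-driven trigger logic can be wired into a frontend roughly like this:

```
global
    # Hypothetical script that registers a "should_challenge" fetch
    # containing the trigger logic described above.
    lua-load /etc/haproxy/challenge.lua

frontend fe_main
    bind :443 ssl crt /etc/haproxy/certs/
    # Ask the Lua script whether this request should be challenged.
    http-request set-var(txn.challenge) lua.should_challenge
    acl needs_challenge var(txn.challenge) -m bool
    # Stand-in for the redacted rules that verify the clearance cookie.
    acl cookie_ok req.cook(clearance) -m found
    use_backend be_challenge if needs_challenge !cookie_ok
    default_backend be_app

backend be_challenge
    # Serves the JavaScript challenge page in this sketch.
    server challenge 127.0.0.1:8081

backend be_app
    server web1 10.0.1.11:8080 check
```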

Miscellaneous

Now we are on HAProxy Enterprise 1.9r1 and we're pretty happy with it. Everything is working, Kubernetes is integrated, and I wanted to give you some miscellaneous quirks of what we are doing right now and what gives us food for thought. One thing is the balancing algorithms in Kubernetes. We are using an overlay in Kubernetes, which means all exposed nodes are in every backend that is generated. So, it's always the same nodes, and the overlay network takes care of how to distribute the traffic.

For this we usually use leastconn, because it's really perfect for Java applications where you occasionally have a garbage collection pause; with leastconn as the balancing algorithm you essentially don't overload a node, you redistribute the requests during small hiccups, and that's simply not possible in Kubernetes anymore with the overlay. That's why we had to switch Flannel, the Kubernetes overlay network, to LVS; it supports LVS, and there leastconn is supported again.
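
As a reminder of what that looks like on the HAProxy side, an illustrative backend using leastconn; the server entries are placeholders.

```
backend be_java_app
    # leastconn sends new requests to the server with the fewest open
    # connections, so a node stuck in a garbage collection pause naturally
    # receives less traffic until it catches up.
    balance leastconn
    server app1 10.0.1.21:8080 check
    server app2 10.0.1.22:8080 check
```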

Secondly, something that I wanted to show you: we just started with dynamic environments. The thought was that for every pull request we wanted to have something like a dynamic environment that can be accessed. A developer creates a pull request, a complete dynamic environment spins up, they test it, and if the test is okay, it's merged and the environment is torn down again. We managed to do this with Kubernetes, but what we needed was a URL.

The URL looks something like this. If you have a URL and a hostname that looks like this and you want to access a backend that looks like this, I knew that this is one of the things that, when you know HAProxy, you know it can do. It might be a hassle to implement, but you know it can be done, and that's how I did it. I know maybe it's not beautiful, but it works. It's dyna_ plus the Host header, and you read it from left to right: it takes the Host header, the whole thing, converts it to lowercase, substitutes dashes with dots, takes the second field, which is the hash, and appends it to build the Kubernetes backend name. Shame on me, but it works perfectly! I'm really happy with this solution and now we have implemented these dynamic environments. The only thing that you have to do is create a wildcard CNAME for the subdomain. That's essentially everything you have to do to get everything to the same IP address and then make this decision. That was it.
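
Reconstructing that rule as a hedged example, under the assumption of a hostname layout such as feature-abc123.dynamic.example.com (the real domain, prefix, and exact expression on the slide may differ): the converter chain lowercases the Host header, turns dashes into dots, picks the second dot-separated field, here abc123, and builds the backend name dyna_abc123 at runtime.

```
frontend fe_main
    bind :443 ssl crt /etc/haproxy/certs/
    # Host: feature-abc123.dynamic.example.com  ->  backend "dyna_abc123"
    # (lowercase, dashes to dots, then take the second dot-separated field).
    # The runtime-evaluated name also means a torn-down environment does not
    # break the configuration; requests just fall back to the default.
    use_backend dyna_%[req.hdr(host),lower,regsub(-,.,g),field(2,.)] if { req.hdr(host) -m end .dynamic.example.com }
    default_backend be_app
```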

Some Data Using HAProxy

I'm over time, but here are some numbers that I wanted to show you in terms of what we're dealing with. Maybe the most interesting: we have peak traffic of 5.1 gigabits per second external facing, 5.7 in total including our internal traffic. As you see, we are already approaching our 10 Gig limit and that's the next thing we have to think about. Maybe distribute it; I don't know how we'll tackle this, but it's something that will come up in the next year and I'm sure we are going to be able to solve it.

We're dishing out 700 to 750 terabytes of traffic in 30 days and, at the moment, we are running 55 backends. More than half of them are generated by Kubernetes; it's going in a microservice direction and there will be more. Everything is handled with just 14 threads. We have had no need to expand this; we can deal with this traffic easily with seven cores at peak time.

For the future, we are really excited about HAProxy 2.0 and, in my opinion, traffic shadowing is the thing I anticipate the most, because it gives us the opportunity to have production data in the test environment. How cool is that? Great. I won't go into the other things because I'm already over time. So, thanks for your attention and if you have questions I'm happy to answer them.

Christian Platzer, Product Site Reliability Engineer, Willhaben
