Willy Tarreau on HAProxy at Its 20 Year Anniversary (Interview)

Twenty years after its initial open source release, Willy Tarreau, the founder of the HAProxy load balancer, still guides the project, often submitting code patches and writing long, meticulous replies on the community forum. Over the years, he has been joined by a cast of regular contributors as well as newcomers, and this collaboration has kept the project evolving.

In this interview, Willy shares his views on the project’s success and how it has grown over the years. He also discusses how the open source model has evolved and other forces at play in the software industry.

See the HAProxy History Timeline here.

HAProxy’s popularity is evident from the number of organizations we hear from that are using it, as well as the rapid growth of our online communities. Are you surprised by this? Or did you foresee that it would be popular given the importance of load balancing?

I didn’t foresee that it would become popular, in part because when you start a project to serve your own needs, and you know of a number of limitations in your code, you can’t imagine why anyone would be fool enough to use it. Also, until 2005, HAProxy was used only by a few French people whom I knew directly or indirectly, because its main doc was written only in French. This suddenly changed when Roberto Nibali, a friend working with load balancers, told me, “I can barely read your doc. This hinders the development of your project.” That’s when I forced myself to translate it to English. Then I suddenly started to see users from other countries and languages, to the point that I had to set up a mailing list because it became too painful to keep in touch with all of them individually.

2005 is also the period when we started to get a lot of requests to assist in HAProxy deployments because the web was growing very fast by then. I remember that I couldn’t keep up with patching kernels and providing working Keepalived and HAProxy packages for enterprise distros. That’s also when we decided to create our appliance, the ALOHA.

By then we had to ask ourselves whether this was something serious and durable or not, and I remember saying that we had to prepare for a possibly short-lived project because load balancing would become commoditized like routing, switching or network cables, and that it would be so deeply integrated in applications that load balancers would quickly disappear. The world of containers and massive horizontal scalability decided slightly otherwise! These technologies have standardized the need for load balancing and elevated it to a dedicated component, which is great for us.

Another point made me at least partially believe in a durable project: it quickly appeared obvious to me that HTTP was becoming ubiquitous, largely because it brought metadata on top of TCP. Its ability to add typing, authentication and application-level sessions to transported data made it obvious that it would replace TCP for a lot of services; and having a proxy whose initial purpose was to rewrite and filter HTTP headers, I definitely felt that there was a huge, under-exploited potential there. Microservices sort of confirmed my intuitions.
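As a rough sketch of what that potential looks like in practice (the frontend, backend, header names and addresses below are invented for the example), a few lines of configuration are enough to rewrite, filter and route on HTTP metadata:

    frontend fe_www
        bind :80
        # rewrite: add metadata about the client connection for the application
        http-request set-header X-Forwarded-Port %[dst_port]
        # filter: strip a header that must never be trusted when it comes from outside
        http-request del-header X-Internal-Auth
        # route on application-level metadata carried by HTTP (the Host header here)
        use_backend be_api if { hdr(host) -i api.example.com }
        default_backend be_app

    backend be_api
        server api1 192.168.0.10:8080 check

    backend be_app
        server app1 192.168.0.20:8080 check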

When it comes to the communities, well, I guess every open source project passes through the same phases. First, I think that many of the people who spend their spare time on computers have some form of timidity and feel more at ease with computers than with people, so the notion of a community is not exactly something that first comes to mind, and it could even scare some to the point where they’d prefer to give up.

For some time, only your friends who hear you decline their weekend invitations know that you’re having fun working on your project. Some of them share some ideas, advice and criticism, and they start to constitute your community. At some point, when you see that the project could fulfill some real use cases at work, you’re a bit shy to propose it, but you start to discuss its suitability with some coworkers and they’re totally thrilled by the idea, which could possibly save a lot of time on certain projects. They become your strongest supporters and users—and they come with plenty of demands at once. Your community broadens.

Then a third step happens. You start to get users you don’t know and then it’s too late to step back. You realize that most of them are highly skilled—especially when trying to use an early project—and they have no problem with design issues and just want to discuss how to solve problems. Over time, you discover a wide variety of people from plenty of different backgrounds, histories and even motivations. But all of them have one thing in common: they want the project to succeed and improve. This is an exceptional driver for mutual respect and help because the project cannot work without them.

You progressively see a self-organizing team of people with complementary skills that eliminates all the roadblocks in your way, while making sure you don’t waste your time doing stuff you’re not good at. In return, they also get a lot of respect from the same community for the work they’re doing, to the point that some of them are not afraid of publicly telling you when you’re wrong, and this is critically important for the project’s health.

This all works amazingly well, but nobody can foresee this without actively having been part of it. I couldn’t have done a tenth of what we’ve reached without this helpful community, and I would never have imagined that I could have teamed up with plenty of people I didn’t know, who would occasionally force me to change some of my processes to help them help me deliver something better.

HAProxy is a well known, open source project and you’ve often shown your commitment to keeping it that way. How has the world of open source changed over the lifetime of the project? Do you see more acceptance of it as a model? Or have you seen skepticism grow?

Oh, it has amazingly evolved!

HAProxy was developed in an atmosphere where each deployed product had to have a contract with a vendor so that someone else was responsible in case of trouble. Initially, I remember that it was strictly forbidden to use Linux because of this. Later, some people understood the value of using some open source products, Linux and HAProxy included, but it was crucial that the top-level chief didn’t know! I’m pretty sure he did know, but acted as if he didn’t.

Many times, when there was a bug in the proxied application, HAProxy was used as an easy scapegoat by unscrupulous application developers, who emphasized its lack of vendor support to justify the likelihood of it being the source of their trouble. Some in the organization would have preferred to see it replaced, while others benefited from having it. During the same period, some application teams saw the benefit of having both open source software and a developer who could work on it inside the walls, because feature requests started to fall like rain to the point where it was hardly conceivable to replace it anymore. Instead of hiding behind “I was not aware”, official support contracts were set up. For me it was a huge victory of open source over the old model. It showed for once that open source doesn’t mean “deal with your problems on forums or by yourself”, but rather “tweak the code to suit you best, and if you don’t want to develop internal skills, find someone outside who will do it for you.”

Nowadays I think most companies do not even know whether what they are using is open source or not. They’re using pre-packaged solutions, deploying containers, popping up VMs, using software they don’t know, but which delivers the service well. At best, they know whether they have to pay per seat or not. I wouldn’t even be surprised if there were companies trying to make sure that they don’t use any non-open-source products, which in my opinion would be as stupid as the former approach. What matters is efficiency: use the best tool for the job. If two tools are equally efficient and one is free while the other is paid, this forces the paid one to distinguish itself by being significantly better to justify the price. This offers a wide spectrum of solutions to all users.

There is, however, something that worries me for the long term. Because many admins no longer have the time to maintain their own solutions, and with the need for agility as a recurring excuse, there’s a massive trend of moving a lot of services to hosted applications. Many hosted applications not only force you to adopt a single way of using them, which is totally contrary to the open source principle of “tweak it to suit you best”, but in addition progressively dictate your rights and your access, and others’ access, to your own data. With the help of delegated identity management, they have made it so convenient to exchange with unknown people hassle-free that this creates social pressure on those who would prefer to keep a bit of control over all this. This could diminish the relevance of open source, not because it becomes less interesting, but because there is not as much interest in software itself as there used to be.

Another factor that significantly evolved in the open source area over the last two decades is how developers want their code to be used. When open source was something new, developers didn’t want to completely give their code away; they felt like they were giving too much for too little in exchange. At the time, a lot of projects were choosing GPLv2 as their license because it provided balanced rights between the developer and the end user, and theoretically made sure that the developer’s code wouldn’t be stolen and used in a proprietary product. Then GPLv3 appeared to further protect end users, at the expense of even more effort from the developers. But by that time, open source had already spread widely and developers discovered the hard way that some of their code couldn’t be adopted by other projects due to license incompatibilities. For example, I learned about this when a guy working on a BSD-licensed project told me he couldn’t use the elastic binary trees library I wrote because it was licensed under the LGPL, which I thought was already very permissive. In HAProxy, which is GPL, we had to add the OpenSSL license exception to allow us to link with OpenSSL.

There is also the case of companies who had developed pieces of code under one license and could not reuse them in another project released under a different license! After facing this trouble myself, I figured that the simplest solution was to license my supposedly reusable code under the MIT license, which is as close to the public domain as possible. In short, it’s a license that says, “do what you want with it, but do not bother me and do not deny me my rights.”

I’m seeing the world evolve in the right direction on this front. For example, OpenSSL recently changed its license, so this exception is not needed anymore. Some FSF-driven projects that required contributors to do paperwork to give away their rights were simplified so that it’s no longer needed; and when you look at GitHub’s license statistics (since 2015), you see that MIT holds almost 45%, versus 13% for GPLv2 and 9% for GPLv3. Some more recent stats from WhiteSource indicate that the newer and fairly permissive Apache 2.0 license is now taking some share from MIT and the others and reaching the same level, probably in part due to being popularized by Kubernetes, while GPL continues to decline.

For me, this is a strong indicator that open source is now perceived as the normal way to develop, to the point that something that is not open source looks highly suspicious and is hardly trusted anymore. Look at how people share some Arduino code on blogs. They know it will be duplicated, and so what? Developers don’t want to be bothered by legal stuff they do not understand. They know that in a world dominated by centrally hosted solutions their license doesn’t guarantee them to get anything back, so they only want not to be bothered and to still be allowed to reuse their code after someone fixes something trivial in it. The rest is just uninteresting garbage used to feed lawyers.

I’m very happy to have witnessed this change over time. It was probably necessary to achieve higher levels of complexity in projects and bring more talents on board; and I know that for a significant number of our HAProxy Enterprise customers, knowing they’re using open source is an extremely important factor they are not going to give up anymore. I mean, we can hold discussions around code between tech people! No time wasted supposing stuff that ends up being wrong. It’s a huge time saver for them.

The one constant in technology is change. For example, cloud computing, microservices, and Docker containers each had a major impact on how people deploy and host applications. What are some of the most important technology shifts you’ve seen over the years and how have they affected HAProxy?

Without a doubt, dynamism. For a decade we knew solid and steady infrastructures where each server was known and registered in a configuration, and it wouldn’t change for 2-3 years. This allowed you to build up a huge amount of very advanced and unique load balancing features that rely on this perfect knowledge of the environment: slow start with dynamic weights, SLA-based prioritization in the server queues, self-aware servers that automatically advertise their conditions to modulate their load (e.g. slow down during backups), redundant paths, anomaly detection, etc.
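To give an idea of that static world, here is a minimal sketch (server names and addresses are invented for the example): slowstart ramps a returning server up gradually instead of flooding it, and an agent check lets the server itself report a weight to modulate its own load:

    backend be_app
        balance roundrobin
        # ramp this server up over 60 seconds after it comes back online
        server app1 192.168.1.11:8080 check weight 100 slowstart 60s
        # an agent running on the server can reply e.g. "50%" or "drain" to adjust its load
        server app2 192.168.1.12:8080 check weight 100 slowstart 60s agent-check agent-port 9999 agent-inter 5s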

Then, when cloud infrastructures appeared with servers changing their IP addresses on each reboot, people started to use DNS to provision their farms, and orchestrators allowed hosting providers’ customers to add and remove servers every second. The old, robust model didn’t hold anymore because it required a reload to apply each change; users were bothered by having several old processes finishing their job while the new ones served the new config, and that approach cannot work well if you do it every second.
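DNS-based provisioning looks roughly like this (the resolver, backend and hostname below are just placeholders): server slots are pre-allocated and then filled or updated from DNS answers at runtime, without a reload:

    resolvers clusterdns
        nameserver dns1 10.0.0.53:53
        hold valid 10s

    backend be_app
        balance roundrobin
        # ten server slots, populated and refreshed from the DNS records of the service
        server-template app 10 myapp.internal.example:8080 check resolvers clusterdns init-addr none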

We had to work hard on allowing users to perform dynamic changes. But in this fast-moving world, it was not easy to both try to design a reliable long-term solution and please users who were insisting at the same time on having something quick and dirty to start with. It took time, but now we’re there, after changing a huge amount of the internal architecture without end users noticing, since even configs from 15 years ago continue to load fine.

You can reconfigure your servers on the fly, add/remove them, enable/disable SSL and even change mutual auth certificates, etc. There’s always more stuff that users ask us to do and we’re still working on pleasing them, but we also know that the internal architecture continues to adapt to evolving end-user demands and that being increasingly dynamic makes us increasingly responsive to new trends.
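As an illustration, assuming a stats socket has been enabled with admin level in the global section, these on-the-fly changes go through the Runtime API; the backend, server and file names here are only examples:

    # in haproxy.cfg, global section:
    #     stats socket /var/run/haproxy.sock mode 660 level admin

    # inspect the current state of the servers in a backend
    echo "show servers state be_app" | socat stdio /var/run/haproxy.sock

    # drain a server, then put it into maintenance, without a reload
    echo "set server be_app/app1 state drain" | socat stdio /var/run/haproxy.sock
    echo "set server be_app/app1 state maint" | socat stdio /var/run/haproxy.sock

    # point a server slot at a new address and port
    echo "set server be_app/app2 addr 10.0.0.42 port 8080" | socat stdio /var/run/haproxy.sock

    # replace a loaded certificate and commit it (available since HAProxy 2.1)
    echo -e "set ssl cert /etc/haproxy/certs/site.pem <<\n$(cat /tmp/new-site.pem)\n" | socat stdio /var/run/haproxy.sock
    echo "commit ssl cert /etc/haproxy/certs/site.pem" | socat stdio /var/run/haproxy.sock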

Also, what is extremely motivating for us is that those who are asking for all these features are also the most demanding ones about performance. And we can proudly say that we have really not compromised on performance over time. Seeing millions of HTTPS requests parsed, analyzed, processed, and routed every second by a single process, or reaching speeds of 100 Gbps, provides a huge amount of margin that reassures our users in their choice, even if it took us time to get there. We’re not piling up random stuff pulled from random repositories to deliver features in an emergency, and that ultimately pays off, for users first, by always requiring fewer machines to do the same job; this is important in modern environments where you pay for machine time!

If you had a time machine, would you travel to the past and try to convince yourself to do anything differently?

It is always difficult to respond to this. In two decades on this project, I gathered a lot of experience, including in how to interact with people; that’s not something you can decide to have upfront, and it benefits from failures. Initially, I wanted to stay discreet about the project, and now I know from what I’m seeing that making a lot of noise could have saved years in establishing a community. But I wasn’t ready for this anyway. And I still am not, but now I can at least speak publicly about the project, which is better 🙂

I would possibly have considered a more permissive license such as BSD 2-Clause or Apache 2.0 so that I never ever have to hear the question “do you think I can steal your function there for my own project?”, to which I systematically have to respond, “I don’t know, run git log on it and ask the others if there are just a few contributors,” which almost always ends up in “OK, it’ll be quicker to rewrite one myself.” Whenever someone cannot reuse existing open source code in good faith, I see this as a heartbreaking failure.

Twenty years ago the choices were complicated. I spent two weekends studying the possibilities among about 100 licenses or so. Nowadays, I’m convinced that GPL doesn’t bring anything to the project anymore. It certainly did during the first 10 years, when the project could have been discreetly packaged into appliances by unscrupulous vendors trying to compete directly with those doing the job for free. But now, almost no new appliances are created, and the code is already used in hosted platforms if it’s needed.

In addition, the code evolves fast; modern protocols and the wide variety of features have made the project so technically complex that absolutely nobody wants to keep their local patches to themselves, because the only thing they can be certain of is that their patches are wrong and that they’d be keeping dangerous code in production. What is important for anyone carrying local patches is to get them reviewed, fixed, and merged as fast as possible so that they quickly become someone else’s problem. I know that a few users run a non-negligible number of local patches for their own infrastructure, and those are probably running on vulnerable end-of-life versions nowadays because it simply is impossible for them to keep up with the development pace, and the current licensing model doesn’t change anything about that anyway.

The thing I would probably do differently is to decide upfront to release code on a schedule and not “when it’s ready.” This was a huge mistake that caused us to stay in development for 4.5 years on version 1.5 when six months were expected. This is something you learn when you have a community: you don’t decide when patches or feature requests arrive, and you cannot slow down development. Linus Torvalds has said this very well many times. By the way, he also faced various challenges in his project that I faced later, so I’m pretty sure there is nothing specific to HAProxy or to me here; there are just critical steps any open source project has to go through, and the project passes the test thanks to its community.

Over the long term, if we can work more on topic branches, I’d like it if we could release what is ready every 3-4 months or so (i.e. 3-4 releases a year) and maintain the non-LTS releases for a shorter time (4-6 months maybe), with one LTS per year. This could provide something more dynamic to bleeding-edge users and reduce the amount of maintenance we’re doing on non-LTS versions. But that’s just a feeling and I could be proven wrong; doing this would probably imply quite a few changes to the development process that we’re not willing to go through at the moment.
