When a load balancer routes requests across multiple web servers, how does it decide which server should get each request? Load-balancing algorithms bring some method to the madness of processing (often) millions of requests, intentionally distributing them between servers based on configured rules.
There are multiple ways to accomplish this equitably: by analyzing information within the request itself, server performance metrics, connection volumes, and more. The overall goal is to reduce strain on backend servers and keep applications running smoothly.
How do load-balancing algorithms work?
Because the load balancer sits in front of the servers it safeguards, each instance has unique insight into individual server health. This encompasses not just "active" and "down" statuses, but also metrics like response times, resource capacity, and load.
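As a rough illustration of the per-server state described above, here is a minimal Python sketch. The `ServerHealth` class, its field names, and the smoothing factor `alpha` are all hypothetical choices made for this example; real load balancers track health in their own ways.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ServerHealth:
    """Hypothetical record of what a balancer might know about one server."""
    name: str
    up: bool = True                          # "active" vs. "down" status
    avg_response_ms: Optional[float] = None  # None until a sample is recorded

    def record(self, response_ms: float, alpha: float = 0.2) -> None:
        """Fold a new response time into an exponentially weighted moving average."""
        if self.avg_response_ms is None:
            self.avg_response_ms = response_ms
        else:
            self.avg_response_ms = alpha * response_ms + (1 - alpha) * self.avg_response_ms

def eligible(servers):
    """Only servers currently marked up should receive traffic."""
    return [s for s in servers if s.up]

def fastest(servers):
    """Pick the eligible server with the lowest average response time.

    Servers without any samples are treated as slowest (an assumption of this sketch).
    """
    return min(
        eligible(servers),
        key=lambda s: float("inf") if s.avg_response_ms is None else s.avg_response_ms,
    )

a, b, c = ServerHealth("web1"), ServerHealth("web2"), ServerHealth("web3", up=False)
a.record(100.0)
a.record(200.0)   # moving average is now between 100 and 200
b.record(150.0)
c.record(10.0)    # fast, but marked down, so never selected
choice = fastest([a, b, c])
```

The moving average is one simple way to smooth out noisy measurements; it still depends on the balancer observing responses continuously.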
Additionally, since the load balancer can unpack each request to view its contents, such as URLs and IP addresses, it's possible to use an array of dynamic algorithms.
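To show how request contents can drive a routing decision, here is a minimal Python sketch that hashes a request attribute to pick a backend. The backend names, the `pick_by_hash` helper, and the choice of SHA-256 are assumptions made for illustration, not how any particular load balancer implements it.

```python
import hashlib

# Hypothetical backend pool; the names are illustrative only.
BACKENDS = ["web1", "web2", "web3"]

def pick_by_hash(key, backends):
    """Map a request attribute (client IP, URL path, ...) to one backend."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and reduce it modulo the pool size.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

# Hashing the client IP: the same address always maps to the same backend.
by_ip = pick_by_hash("203.0.113.7", BACKENDS)

# Hashing the URL path: the same path always maps to the same backend,
# which helps when backends cache responses.
by_path = pick_by_hash("/images/logo.png", BACKENDS)
```

One caveat with plain modulo hashing is that changing the pool size remaps most keys; consistent hashing is a common technique for reducing that churn.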
Here are some common load-balancing algorithms:
- Random – Each request is routed to a randomly chosen server without taking metrics into account. This method is simple but can cause load imbalances over time. Variations such as the power of two random choices (pick two backends at random and use the one with the least load) can mitigate some of the downsides. Random selection is useful when the set of backends changes rapidly.
- Round robin – Requests are routed sequentially to each server in the pool until the cycle repeats itself. This method is simple and works well when servers have homogeneous capacity, but it doesn't consider server load.
- Weighted round robin – Similar to round robin, except that each server is assigned a "weight" based on its capacity. Servers with higher weights receive proportionally more requests in each cycle.
- Least connections – New requests are routed to servers with the fewest active connections. This works well when connection durations vary greatly, but can falter when connections are long-lived.
- Least response time – Requests are routed to servers with the quickest response times, provided they also have the least number of connections. This takes server performance into account, yet requires constant monitoring to be effective. 
- Source IP hashing – Requests are routed based on a hash of each client's IP address. This enables persistence by matching each client to the same server throughout their session, but can lead to uneven distribution.
- Hashing – Some part of the request (such as the file being requested) is hashed to select which backend should service a given request. This is useful for CDNs and other servers which cache responses but can respond to any request. 
- Dynamic – Requests are routed based on collective, continuous monitoring of server load, response times, and more. This approach is highly adaptable but difficult to implement and demands closer oversight. 
- First – Requests are routed to the first backend server with open slots. This is beneficial when servers have known capacities, like those supporting some online multiplayer games. 
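Several of the algorithms above can be sketched in a few lines of Python. This is a simplified illustration, not a production implementation: the `Backend` class, its fields, and the `max_connections` parameter are hypothetical, and the weighted variant uses a naive expansion rather than the smoother interleaving real load balancers often apply.

```python
import itertools
import random
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    weight: int = 1              # relative capacity (weighted round robin)
    active_connections: int = 0  # tracked by the balancer

def round_robin(backends):
    """Return an iterator that walks the pool in order, repeating forever."""
    return itertools.cycle(backends)

def weighted_round_robin(backends):
    """Naive weighting: each backend appears `weight` times per cycle."""
    expanded = [b for b in backends for _ in range(b.weight)]
    return itertools.cycle(expanded)

def least_connections(backends):
    """Pick the backend with the fewest active connections."""
    return min(backends, key=lambda b: b.active_connections)

def power_of_two_choices(backends, rng=random):
    """Sample two distinct backends at random and keep the less loaded one."""
    first, second = rng.sample(backends, 2)
    return first if first.active_connections <= second.active_connections else second

def first_available(backends, max_connections):
    """Return the first backend with an open slot, or None if all are full."""
    for backend in backends:
        if backend.active_connections < max_connections:
            return backend
    return None

pool = [
    Backend("web1", weight=3, active_connections=12),
    Backend("web2", weight=1, active_connections=4),
    Backend("web3", weight=1, active_connections=9),
]

rr = round_robin(pool)
order = [next(rr).name for _ in range(4)]   # web1, web2, web3, web1

wrr = weighted_round_robin(pool)
weighted_order = [next(wrr).name for _ in range(5)]  # web1 appears three times

least_loaded = least_connections(pool)      # web2 (4 active connections)
```

Note that power of two choices only compares two backends per request, so it avoids scanning the whole pool while still steering traffic away from the busiest servers.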
It's important to remember that no algorithm is necessarily the best. These options exist thanks to the wide variety of environments, deployments, server setups, and more. Each has its place within a unique load balancing strategy. If you're unsure which load balancing algorithm to try first, round robin is generally a good starting point; you can switch to another if your observations call for it.
What makes load-balancing algorithms useful?
Load balancing algorithms protect servers from becoming overloaded. By choosing the correct algorithm for a given application environment, organizations can prevent clients from collectively consuming too much memory, CPU/GPU, or network bandwidth while using web applications or APIs. Even distribution also ensures that available resources are used efficiently instead of sitting idle.
This is key to enabling high performance, security, and high availability, whether you're using a hardware appliance, virtual appliance, or software load balancer. Note that the list above is general: each load balancer offers its own set of algorithms that fit within these categories.
Does HAProxy support load balancing algorithms?
Yes! HAProxy products support over 13 algorithms to best match your infrastructure needs. These range from simple to complex, and static to dynamic. To learn more about intelligent request routing in HAProxy, check out our algorithms documentation.