Accelerate Your APIs by Using the HAProxy Cache

HAProxy’s cache helps boost API performance by serving saved messages to your users.

The age of rendering most of a web page’s contents on the server and then delivering it as a colossal HTML file is fading into the past. Modern web frameworks like Angular, React, and Vue push towards creating components instead: individual elements on the page that fetch their data in the background, poll for asynchronous updates, and can be reused across your site. Meanwhile, the major browsers have added support for Web Components, which may eventually cement components as an official web standard. The latest version of the HTTP protocol also makes components more attractive: HTTP/2 makes asynchronous communication with backend services more efficient by multiplexing many requests over a single connection.

Components call RESTful APIs to get data from the backend servers. Be cautious, though. It’s not always the best idea to have client-side code connect directly to these servers. Doing so tightly couples frontend code to specific endpoints, which makes it harder to shuffle servers, do maintenance on them, or deploy updates. You can gain many benefits from placing HAProxy in between to act as an API gateway. An API gateway is a proxy that relays messages back and forth. It also adds functions like authentication, TLS encryption, rate limiting, and observability.

Something else that HAProxy adds is the ability to cache API responses, which can boost how quickly clients receive data. In this blog post, you will learn how to set up HAProxy’s cache feature, which will improve how fast you can deliver messages and lessen the load placed upon your backend servers.

Why You Should Cache API Responses

There are two readily available caches: a client-side (browser) cache and a server-side (HAProxy) cache. A browser’s cache will boost performance for a single user. HAProxy’s cache, which is known as a proxy cache, will speed up responses for all users because once a resource is cached in the proxy, it’s available to anyone making the same request until it expires. It’s easy to enable, but you should know how to use it effectively.

HAProxy’s cache runs in memory, which makes it fast. Some other proxy caches need to read and write state on the filesystem, which incurs some I/O latency. Also, because it runs within HAProxy, you don’t need to contact an upstream cache server, which means you have one fewer transfer across the network. In some cases, however, you’ll want the extra features of a shared cache server like Varnish. HAProxy’s caching feature is modest in comparison, but it might be exactly what you need for caching API responses.

Why should you cache API responses?

For one thing, it will shorten the time components take to receive their data, which has a huge effect on how responsive your website seems overall. One of the biggest obstacles to adopting a component-based design is the fear that your webpage will render its initial HTML page quickly, but linger in an unusable state while the individual components wait to load. A lot of that waiting time is spent processing the request on the web server, pulling the requested data out of the database, and forming the JSON-encoded response. Caching allows you to perform those steps only once and then serve the saved message to other clients.

Another reason to love proxy caching is that it reduces load on your servers. They don’t need to process nearly as many requests, many of which are likely the same request they saw earlier. It’s perfectly fine to serve a slightly stale response for content that doesn’t change extremely often, such as daily news feeds, product descriptions, reviews, and comment boards. Even caching this content for five or ten seconds could have a worthwhile impact, depending on the number of users viewing that same data. By the way, caching for a very short period of time is known as microcaching.

API functions that return data, rather than modify it, are best suited for caching. This typically includes any function called with GET. Just be sure that your API responses do not include any information that is specific to a user, such as API keys, user profile data, and the like.

How to Cache with HAProxy

In your HAProxy configuration file, add a cache section. It goes at the same level as a global or defaults section. You can have more than one cache section to create multiple caches for different purposes, and each can set its own max-age and other attributes.
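A minimal sketch of such a section follows. The cache name mycache, the 100 MB total size, and the 30-second max-age are assumptions chosen for illustration; only the 10,000-byte object size comes from the discussion in this post.

```haproxy
cache mycache
    # Total memory for this cache, in megabytes (assumed value)
    total-max-size 100
    # Largest single response that may be stored, in bytes
    max-object-size 10000
    # Maximum time-to-live for a cached item, in seconds (assumed value)
    max-age 30
```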

The total-max-size directive sets the total amount of memory, in megabytes, that this cache can consume; it has a maximum value of 4095. The max-object-size directive sets the largest size, in bytes, of a single item you can store in the cache, and it can be at most half of the total-max-size value. In this example, I’ve set it to 10,000 bytes, which is 10 kilobytes. If a response is larger, it simply won’t be cached. The last directive, max-age, sets the time-to-live (TTL) in seconds for an item in the cache. After the TTL expires, the item will be removed from memory.

Next, add an http-request cache-use and an http-response cache-store directive to your backend section. The former uses a cached resource if it’s found and the latter adds it to the cache. Both take the name of a cache section.
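As a sketch, the two directives might sit in a backend like this. The backend name be_api, the cache name mycache, and the server address are hypothetical and should be replaced with your own.

```haproxy
backend be_api
    mode http
    # Serve the response from the cache when a matching item exists
    http-request cache-use mycache
    # Otherwise store the server's response for later requests
    http-response cache-store mycache
    # Hypothetical API server
    server s1 192.168.0.10:80 check
```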

You can also restrict which responses should be cached by appending an if statement to the end of the http-request cache-use directive. For instance, if you wanted to cache only when the requested URL path begins with /api/news_feed/, you would use the following:
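A sketch of that condition, again assuming a backend named be_api and a cache named mycache, uses the path_beg fetch in an inline ACL:

```haproxy
backend be_api
    mode http
    # Only consult the cache for paths under /api/news_feed/
    http-request cache-use mycache if { path_beg /api/news_feed/ }
    http-response cache-store mycache
```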

Notice that you add a condition to the http-request line, but you do not need one on the http-response line. HAProxy is designed to skip caching if there’s no chance the item will ever be used. Alternatively, the backend server can return a Cache-Control header with a no-store attribute to disable caching of a particular response.

Cache-Control: no-store

The Cache-Control header also supports the s-maxage attribute, which lets a response shorten the TTL that was set in HAProxy’s cache section. HAProxy uses the lower of the two values, so s-maxage cannot extend the TTL beyond max-age. Consider the following Cache-Control header, which allows the response to be cached, but sets its TTL to 10 seconds:

Cache-Control: public,s-maxage=10

To see the TTL that was set on an item in the cache, call the HAProxy Runtime API show cache command, which shows the TTL as the expire field.
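The Runtime API is only reachable if HAProxy exposes a stats socket. A minimal sketch of enabling one follows; the socket path and permissions are assumptions, not values taken from this post.

```haproxy
global
    # Expose the Runtime API on a UNIX socket (path is an assumption)
    stats socket /var/run/haproxy.sock mode 660 level admin
```

With that in place, a tool such as socat can send the command to the socket, for example: echo "show cache" | socat stdio /var/run/haproxy.sock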

You can also get metrics about your cache, which can be displayed in Grafana. If you’ve enabled Prometheus metrics in HAProxy, scrape the following metrics from HAProxy’s Prometheus endpoint (where the proxy label would be the name of your frontend or backend):

  • haproxy_frontend_http_cache_lookups_total{proxy="fe_api"}
  • haproxy_frontend_http_cache_hits_total{proxy="fe_api"}
  • haproxy_backend_http_cache_lookups_total{proxy="be_api"}
  • haproxy_backend_http_cache_hits_total{proxy="be_api"}

These metrics show you how many cache lookups were performed and how many resulted in a cache hit. You can use that to adjust your TTL values.
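If the Prometheus endpoint is not enabled yet, a configuration along these lines could expose it. This is a sketch that assumes an HAProxy build that includes the Prometheus exporter; the frontend name, port 8405, and /metrics path are illustrative choices.

```haproxy
frontend prometheus
    mode http
    bind :8405
    # Answer scrapes with HAProxy's built-in exporter service
    http-request use-service prometheus-exporter if { path /metrics }
    no log
```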

One last trick: You can return a response header that shows whether the requested resource was found in the cache. Currently, this method is a temporary solution until a future version of HAProxy adds a better way. Add two http-response set-header lines to your frontend that check whether the srv_id fetch method returns the ID of a server that was used to handle the request. If no value is returned, no server was involved, which means that HAProxy used the cache.
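A sketch of those two lines, assuming a frontend named fe_api with a hypothetical bind address and a default backend named be_api:

```haproxy
frontend fe_api
    mode http
    bind :80
    # No server ID means the response came from the cache
    http-response set-header X-Cache-Status HIT if !{ srv_id -m found }
    # A server ID means a backend server produced the response
    http-response set-header X-Cache-Status MISS if { srv_id -m found }
    default_backend be_api
```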

HAProxy will set the X-Cache-Status header to HIT if the item was found in the cache, or to MISS otherwise.


HAProxy’s cache helps boost the speed of your API services, resulting in a more responsive website. Define how long responses should be cached using the max-age directive, which a response can shorten with the s-maxage attribute of a Cache-Control header. If certain responses should not be cached at all, you can use an if statement to filter them out or you can set your Cache-Control header to no-store. The HAProxy Runtime API will show you how long items will live in the cache, and HAProxy’s Prometheus metrics endpoint exposes counters for lookups and cache hits. Now go and enjoy the benefits of proxy caching!
