Why Turning on HTTP/2 Was a Mistake

HTTP/2 is a great leap forward for HTTP. It increases bandwidth efficiency by using a binary compressed format for headers, decreases latency by multiplexing requests on the same TCP connection, and allows the client to specify priorities for requests. In many cases moving to HTTP/2 is the right thing to do, but for some applications, HTTP/2 can cause significant problems.

Last year, at Lucidchart, we enabled HTTP/2 on the load balancers for some of our services. Shortly thereafter, we noticed that the servers behind our HTTP/2 load balancers had higher CPU load and slower response times than our other servers. At first the cause was unclear: server traffic appeared normal, and the problems seemed unrelated to any code changes. On closer inspection, we realized that while the average number of requests was the same as usual, the actual flow of requests had become spikier. Instead of a steady stream of requests, we were getting short bursts of many requests at once. Although we had overprovisioned capacity based on previous traffic patterns, it wasn’t enough to handle the new request spikes, so responses were delayed and requests timed out.

What was happening? Well, most browsers limit the number of concurrent connections to a given origin (a combination of scheme, host, and port), commonly to around six, and on a single HTTP/1.1 connection requests must be processed in series. In effect, HTTP/1.1 browsers cap the number of concurrent requests to an origin, which means our users’ browsers were throttling requests to our servers and keeping our traffic smooth.

Remember how I said HTTP/2 adds multiplexing? Well, with HTTP/2, the browser can send all of its HTTP requests concurrently over a single connection. From the web client’s perspective, this is great: in theory, the client gets all the resources it needs sooner, since it no longer has to wait for responses from the server before issuing additional requests. In practice, however, multiplexing substantially increased the strain on our servers. First, they received requests in large batches instead of smaller, more spread-out ones. Second, because the requests were all sent together rather than staggered as with HTTP/1.1, their start times were closer together, which meant they were all likely to hit their timeouts at around the same time.
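
To make that concrete, here is a toy simulation (mine, not from the original incident) contrasting a browser capped at six HTTP/1.1 connections with an HTTP/2 browser that multiplexes everything at once. The resource count, connection limit, and service time are made-up numbers.

```python
# Toy model: how many requests are in flight on the server at once?
import asyncio

RESOURCES = 60        # hypothetical page with 60 subresources
H1_CONNECTIONS = 6    # typical per-origin browser connection limit
SERVICE_TIME = 0.05   # hypothetical seconds of server work per request

in_flight = 0
peak = 0

async def serve():
    """One request's worth of server work; tracks peak concurrency."""
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(SERVICE_TIME)
    in_flight -= 1

async def http1():
    # HTTP/1.1: at most six requests outstanding, like six serial connections.
    sem = asyncio.Semaphore(H1_CONNECTIONS)
    async def one():
        async with sem:
            await serve()
    await asyncio.gather(*(one() for _ in range(RESOURCES)))

async def http2():
    # HTTP/2: every request is multiplexed onto one connection immediately.
    await asyncio.gather(*(serve() for _ in range(RESOURCES)))

for label, scenario in (("HTTP/1.1", http1), ("HTTP/2", http2)):
    peak = 0
    asyncio.run(scenario())
    print(f"{label}: peak concurrent requests at the server = {peak}")
```

Under these assumptions the HTTP/1.1 run never exceeds six concurrent requests, while the HTTP/2 run peaks at all sixty at once; that peak is exactly the kind of spike our capacity planning hadn’t accounted for.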

How to fix it

Fortunately, it is possible to solve this problem without having to increase computing capacity. In fact, there are a few potential solutions, although they take a little more effort than just enabling HTTP/2 in the load balancer.

Throttle at the load balancer

Perhaps the most obvious solution is to have the load balancer throttle requests to the application servers, so that the traffic pattern the application servers see looks like it did with HTTP/1.1. How difficult this is depends on your infrastructure. For example, AWS ALBs don’t provide any mechanism to throttle requests at the load balancer (at least not at the time of writing). Even with load balancers like HAProxy and Nginx, getting the throttling right can be tricky. If your load balancer doesn’t support throttling, you can still put a reverse proxy between the load balancer and your application server and do the throttling there.
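
If you do roll your own throttling in a reverse proxy, the core of it can be as small as a counting semaphore in front of the backend. Below is a minimal sketch in Python with aiohttp (my choice of tooling, not something this post prescribes); the backend address, port, and concurrency cap are hypothetical, and a production proxy would also need timeouts, streaming, and error handling.

```python
# Minimal throttling reverse proxy: at most MAX_CONCURRENT requests are
# forwarded to the backend at once; the rest wait in the semaphore's queue.
import asyncio

from aiohttp import ClientSession, web

MAX_CONCURRENT = 8                    # hypothetical app-server capacity
BACKEND = "http://localhost:8080"     # hypothetical application server

# Hop-by-hop and auto-managed headers that shouldn't be copied verbatim.
SKIP = {"connection", "keep-alive", "transfer-encoding",
        "content-length", "content-encoding", "upgrade"}

semaphore = asyncio.Semaphore(MAX_CONCURRENT)

async def handle(request: web.Request) -> web.Response:
    body = await request.read()
    async with semaphore:             # excess requests queue up here
        async with request.app["client"].request(
            request.method,
            BACKEND + request.rel_url.path_qs,
            headers={k: v for k, v in request.headers.items()
                     if k.lower() not in SKIP},
            data=body,
        ) as upstream:
            return web.Response(
                status=upstream.status,
                headers={k: v for k, v in upstream.headers.items()
                         if k.lower() not in SKIP},
                body=await upstream.read(),
            )

async def make_app() -> web.Application:
    app = web.Application()
    app["client"] = ClientSession()
    app.router.add_route("*", "/{tail:.*}", handle)
    return app

if __name__ == "__main__":
    web.run_app(make_app(), port=8081)  # sits between the LB and the app
```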

Re-architect the application to better handle spiky requests

Another (and perhaps better) option is to change the application so that it can handle the traffic pattern coming from a load balancer that accepts HTTP/2. Depending on the application, this probably means introducing or tweaking a queueing mechanism so that the application accepts connections but only processes a limited number of requests at a time. We actually already had a queueing mechanism in place when we switched to HTTP/2, but because of some earlier code decisions, we were not properly limiting concurrent request processing. If you do queue requests, be careful not to process a request after the client has already timed out waiting for the response; there is no point wasting resources computing an answer no one will receive.
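
To sketch what that looks like (illustrative only; this is not our actual code), the two key ideas are a cap on concurrent processing and a deadline check before starting work. The handler shape, limits, and process_request stub below are all hypothetical.

```python
# In-application queueing: accept everything, process a bounded number at a
# time, and skip work whose client has already given up.
import asyncio
import time

MAX_IN_FLIGHT = 16       # hypothetical concurrent-processing limit
CLIENT_TIMEOUT = 30.0    # hypothetical seconds before the client gives up

slots = asyncio.Semaphore(MAX_IN_FLIGHT)

async def process_request(payload: str) -> str:
    await asyncio.sleep(0.1)          # stand-in for real request handling
    return f"processed {payload}"

async def handle(payload: str) -> str | None:
    arrived = time.monotonic()
    async with slots:                 # the de facto queue
        # Don't burn resources on a response nobody is waiting for.
        if time.monotonic() - arrived > CLIENT_TIMEOUT:
            return None               # or a 503, depending on your stack
        return await process_request(payload)

async def main() -> None:
    # Simulate an HTTP/2-style burst: 100 requests arriving together.
    results = await asyncio.gather(*(handle(f"req-{i}") for i in range(100)))
    print(sum(r is not None for r in results), "requests served")

if __name__ == "__main__":
    asyncio.run(main())
```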

Before you turn on HTTP/2, make sure that your application can handle the differences. HTTP/2 has different traffic patterns than HTTP/1.1, and your application may have been designed specifically for HTTP/1.1 patterns, whether intentionally or not. HTTP/2 has a lot of benefits, but it also has some differences that can bite you if you aren’t careful.

Postscript

Another “gotcha” to look out for is that software support for HTTP/2 is not fully mature yet. Although it is improving all the time, HTTP/2 software simply hasn’t had enough time to become as solid as existing HTTP/1.1 software.

In particular, server support for HTTP prioritization is spotty at best. Many CDNs and load balancers either don’t support it at all, or have buggy implementations, and even if they do, buffering can limit its effectiveness.

Also, several applications only support HTTP/2 over HTTPS. From a security point of view, this is great. However, it complicates development and debugging inside secure networks that don’t need or want TLS between components: you have to manage a CA and certificates for localhost in local development, you have to log TLS session secrets in order to inspect HTTP/2 requests with Wireshark or similar tools, and the encryption may require additional compute resources.
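
On the Wireshark point, the usual trick is a TLS key log file: browsers write one when the SSLKEYLOGFILE environment variable is set, and a Python 3.8+ client can do the same through ssl.SSLContext. A minimal sketch (the log path is arbitrary):

```python
# Log TLS session secrets so Wireshark can decrypt captured HTTP/2 traffic
# (Wireshark: TLS preferences -> "(Pre)-Master-Secret log filename").
import ssl

ctx = ssl.create_default_context()
ctx.keylog_filename = "/tmp/tls-keys.log"  # requires Python 3.8+

# Use `ctx` with any TLS client, for example:
#   http.client.HTTPSConnection("localhost", 443, context=ctx)
```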

25 Comments

  1. Dmitry Pashkevich, April 20, 2019 at 1:48 am

    Thanks for sharing! So it sounds like you went ahead with http/2 instead of going back to http 1.1. Why do you call the transition a mistake then? Can you talk more about how you made the call?

  2. Thayne McCombs, April 22, 2019 at 11:20 am

    We have not yet moved our services to http/2 (we are using http/2 for resources served through a CDN though). Primarily because we don’t currently have the resources to make the necessary changes to our application. It was specifically a mistake to turn on http/2 without first making changes to our application so it could handle the different traffic patterns.

  3. […] HTTP/2 is a great leap forward for HTTP, but turning on HTTP/2 was a mistake. Learn how to solve this problem without having to increase computing capacity. Read More […]

  4. Ismail Kably, April 22, 2019 at 4:31 pm

    Your comment is contradictory and unclear:
    – We have not yet moved our services to http/2
    – It was specifically a mistake to turn on http/2

    Is your post’s analysis based on testing HTTP/2 in a testing environment? Or is it based on turning on HTTP/2 for a short time on production servers and then migrating back to HTTP/1.1?

  5. Thayne McCombs, April 22, 2019 at 4:34 pm

    Sorry, yes, that is a little unclear. We did enable http/2 in production (after testing in preproduction environments without any problems). But because of the issues described we went back to http/1.1 to keep our prod environment stable. And we continue to use http/1.1 for requests that aren’t served by a CDN.

  6. lucas dicioccio, April 22, 2019 at 4:56 pm

    You should be able to configure the max concurrency of the HTTP/2 session to limit how many streams a client can open at the same time. The default is 100, which is probably too high for your capacity plan. Another tuning point is the flow-control mechanism, though this is harder to configure well and impossible to tune dynamically if your application has no direct access to flow control (e.g., because a load balancer terminates the HTTP/2 sessions).

  7. Thayne McCombs, April 22, 2019 at 5:02 pm

    That’s good to know. Although, at the time, we were using AWS ALBs to load balance HTTP/2 requests as HTTP/1.1 requests, and as far as I am aware there isn’t any option for customers to configure that setting on ALBs.

  8. Nathan Myers, April 22, 2019 at 5:15 pm

    Seems like getting multiple queries together in traffic spikes could give you better performance if you can organize your services right. In any case, if queries are timing out, that really suggests you are close to your limits.

    You can often get better performance by grouping similar queries to a server that specializes in that kind of query, so data needed by multiple queries is hot in cache.

    Getting multiple queries at once enables you to schedule work on them adaptively. The ones that need a database query can be started first; then, while waiting, you can process those that don’t. Those that need a lot of back-and-forth with one database can be grouped with others that talk to a different database.

  9. Just to note here that the number of concurrent requests, even over HTTP/2, is an evolving situation. For example, Chrome now limits the number of in-flight requests to 10 on slow connections – https://blog.josephscott.org/2019/04/16/chrome-delays-requests-on-slow-connections-even-with-http-2/

    Obviously, if most of your users are on fast connections, this isn’t something you are going to see.

  10. Frank de Wit, April 23, 2019 at 3:19 pm

    Multiplexing has been around since well before HTTP/2. It just created the problems you described above, so the techies never enabled it in their browsers.

    This situation could easily have been avoided by having old people in tech positions, but ageism in this industry is all too real.

  11. James McGregor, April 24, 2019 at 11:43 am

    I feel that this article blames the wrong things. Even if browsers throttled HTTP/2 requests, that wouldn’t change the fact that lacking server-side network shaping and relying on client-side implementations is pretty much insane (I struggle to find a better word for it). If that were my experience, on my system, the first thing I would do is start implementing server-side network shaping ASAP. What is happening with HTTP/2 and current browsers is also possible with HTTP/1.1, and the fact that DDoS is not common for you guys amazes me 😉

  12. […] really enjoyed this post by LucidChart engineering on some recent troubles they encountered when enabling HTTP/2 on their […]

  13. Would it not be better to do only TCP load balancing and let the servers themselves serve HTTP/2? That way, the control is all local, and the less efficient HTTP/1.1 would not swamp the backend servers?

  14. Thayne McCombs, August 15, 2019 at 10:49 am

    If that were possible, yes. But our “load balancer” does more than just load balancing: it also performs routing based on data from the HTTP request (like the Host header), modifies HTTP headers, etc. Also, load balancing at the TCP level means load balancing connections rather than requests, which can lead to uneven load distribution if some clients make more requests than others.

  15. Time to change the app architecture to a modern microservices/FaaS approach. By the way, HTTP/3 is already here.

  16. Very interesting! Thanks for sharing. I covered your case on https://wasteofserver.com/apache-archiva-behind-haproxy-with-ssl-termination/ by adding `tune.h2.max-concurrent-streams 5` to HAProxy’s config, which, in your case, wasn’t possible. But this is definitely something people should be aware of. Also, HAProxy’s default of 100 simultaneous streams per connection seems overkill. Maybe fine when hitting servers like Varnish, but otherwise it can flood any Apache.

    Thanks!

  17. David Lee, July 29, 2020 at 6:25 am

    For those who suggest (or believe) the ‘obvious’ answer is to ‘simply’ change the app architecture: at some point in your career you will discover that this is rarely ‘simple’. In *any* mature and successful product, for reasons both technical and organizational, architecture changes, or significant changes to core components, get increasingly difficult and *expensive*. And who pays for that? You do. What you are really suggesting is, “Why didn’t you just charge everyone 2x and get rid of the free plan instead of finding a way that cost less and let me keep my free plan?”

  18. Did you discover any new bugs when increasing the number of concurrent web requests per user? I imagine you might have more unexpected race conditions when dealing with real-time collaborative diagramming. There are many edge cases to cover when all users can read and write the same data in real time.

  19. Thayne McCombs, February 4, 2021 at 12:05 pm

    No, as far as I am aware this didn’t have any impact on race conditions in real time collaboration.

  20. If implementing a throttling mechanism gives traffic patterns that, from the application servers’ point of view, are similar to what they were with HTTP/1.1, then what is the benefit here? It may not reduce latency.

  21. I am unclear about something very basic when it comes to HTTP/2 and load balancing.
    Taking the typical example, say the browser sends “give me the image of a car,” “also give me the image of a dog,” “also give me the image of a house.” Does the load balancer send the three requests to just one application pool member because they are part of the same “session,” or does it load balance them across multiple pool members?

  22. Thayne McCombs, October 4, 2021 at 11:05 am

    That depends on the load balancer. By default, AWS ALBs and HAProxy will load balance to multiple pool members. I’m not sure about others.

  23. Super interesting post!

    I tend toward a slightly more optimistic conclusion:
    – If the back-end communication can all be made HTTP/2, then the spikes could be well managed. This seems hard, indeed, because HTTP/2 over non-TLS is tricky… but shouldn’t this be precisely the push to make it work (e.g., at big proxy servers)?
    – Also, since you say your CDNs were on HTTP/2: anything static can easily (and should) be made HTTP/2, and an easy way to do that is to make sure it is mostly served from the front proxy (caching, rewrite rules…).
    Paul

  24. God, thank you so, so much. Your efforts back in 2019 saved my life in 2023. I was getting 504 timeout errors out of nowhere until I read through your article, and yes, we have turned off HTTP/2 on our AWS ALB now. Thanks!

  25. By queueing mechanism, do you mean having a queue like Kafka or MQ? If an application has to make changes to handle HTTP/2, isn’t that just as bad as an application making changes such as adding sprites in the case of HTTP/1.1? If queueing is used, does that make your application work as if it were on HTTP/1.1? What does “enabling” HTTP/2 mean? Does it mean the browser is now aware the server talks HTTP/2?
