Why Turning on HTTP/2 Was a Mistake

HTTP/2 is a great leap forward for HTTP. It increases bandwidth efficiency by using a binary compressed format for headers, decreases latency by multiplexing requests on the same TCP connection, and allows the client to specify priorities for requests. In many cases moving to HTTP/2 is the right thing to do, but for some applications, HTTP/2 can cause significant problems.

Last year, at Lucidchart, we enabled HTTP/2 on the load balancers for some of our services. Shortly thereafter, we noticed that the servers behind our HTTP/2 load balancers had higher CPU load and slower response times than our other servers. At first the cause was unclear: server traffic appeared normal, and the problems seemed unrelated to any code changes. On closer inspection, we realized that the average number of requests was the same as usual, but the flow of requests had become spikier. Instead of a steady stream, requests arrived in short, dense bursts. Although we had overprovisioned capacity based on previous traffic patterns, it wasn’t enough to absorb the new spikes, so responses were delayed and requests timed out.

What was happening? Well, most browsers limit the number of concurrent connections to a given origin (a combination of scheme, host, and port), and requests on a single HTTP/1.1 connection are processed serially. In effect, HTTP/1.1 browsers cap the number of concurrent requests to an origin; Chrome and Firefox, for example, allow about six connections per origin. So our users’ browsers throttled their own requests and kept our traffic smooth.

Remember how I said HTTP/2 adds multiplexing? Well, with HTTP/2, the browser can send all of its HTTP requests concurrently over a single connection. From the web client’s perspective, this is great: in theory, it gets all the resources it needs more quickly, since it no longer has to wait for responses from the server before making additional requests. In practice, however, multiplexing substantially increased the strain on our servers. First, they received requests in large batches instead of smaller, more spread-out ones. Second, because the requests were all sent together rather than staggered as with HTTP/1.1, their start times were closer together, so they were all likely to time out at once.

How to fix it

Fortunately, it is possible to solve this problem without having to increase computing capacity. In fact, there are a few potential solutions, although they take a little more effort than just enabling HTTP/2 in the load balancer.

Throttle at the load balancer

Perhaps the most obvious solution is to have the load balancer throttle requests to the application servers, so that, from the application servers’ point of view, traffic looks much as it did with HTTP/1.1. How difficult this is depends on your infrastructure. For example, AWS ALBs don’t provide any mechanism to throttle requests at the load balancer (at least not at the moment). Even with load balancers like HAProxy and Nginx, getting the throttling right can be tricky. If your load balancer doesn’t support throttling, you can still put a reverse proxy between the load balancer and your application server and do the throttling there, as in the sketch below.
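
To make that concrete, here is a minimal sketch of the reverse-proxy approach in Go. The specifics are all assumptions for illustration: the application is presumed to listen on 127.0.0.1:8080, the proxy on :9090, and the cap of 32 concurrent requests is a placeholder you would tune to your own capacity.

```go
// throttleproxy.go: a tiny reverse proxy that caps how many requests
// are in flight to the application server at once. All addresses and
// limits are illustrative placeholders.
package main

import (
    "log"
    "net/http"
    "net/http/httputil"
    "net/url"
)

func main() {
    // Hypothetical upstream application server.
    target, err := url.Parse("http://127.0.0.1:8080")
    if err != nil {
        log.Fatal(err)
    }
    proxy := httputil.NewSingleHostReverseProxy(target)

    // A buffered channel used as a counting semaphore: at most 32
    // requests are forwarded at a time; the rest wait here instead
    // of piling up on the application server.
    sem := make(chan struct{}, 32)

    handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        sem <- struct{}{}        // acquire a slot (blocks when full)
        defer func() { <-sem }() // release it when the response completes
        proxy.ServeHTTP(w, r)
    })

    log.Fatal(http.ListenAndServe(":9090", handler))
}
```

The effect is the same smoothing the browser’s connection limit used to give us for free: bursts queue in the proxy rather than hitting the application all at once.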

Re-architect the application to better handle spiky requests

Another (and perhaps better) option is to change the application so that it can handle traffic from a load balancer that accepts HTTP/2. Depending on the application, this probably means introducing or tweaking a queueing mechanism in the application so that it can accept connections but only process a limited number of requests at a time. We actually already had a queueing mechanism in place when we switched to HTTP/2, but because of some previous code decisions, we were not properly limiting concurrent request processing. If you do queue requests, be careful not to process a request after the client has already timed out waiting for a response; there is no point wasting resources on an answer no one will read. A sketch of this pattern follows.
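
Here is a rough illustration of such a queue in Go, assuming a plain net/http server. The limit of 16 and the trivial handler body are placeholders; the part that matters is the select on the request’s context, which drops a queued request as soon as the client gives up.

```go
package main

import (
    "io"
    "log"
    "net/http"
)

// sem is a counting semaphore: at most 16 requests are processed at
// once; the rest wait in the select below. 16 is a placeholder.
var sem = make(chan struct{}, 16)

func handle(w http.ResponseWriter, r *http.Request) {
    select {
    case sem <- struct{}{}: // got a processing slot
        defer func() { <-sem }()
    case <-r.Context().Done(): // client gave up while we were queued
        return // don't waste work on an abandoned request
    }

    // Narrow race: the client may have disconnected just as a slot
    // opened up, so re-check before doing the expensive work.
    if r.Context().Err() != nil {
        return
    }

    io.WriteString(w, "ok\n") // stand-in for the real request handling
}

func main() {
    http.HandleFunc("/", handle)
    log.Fatal(http.ListenAndServe(":8080", nil))
}
```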

Before you turn on HTTP/2, make sure that your application can handle the differences. HTTP/2 has different traffic patterns than HTTP/1.1, and your application may have been designed specifically for HTTP/1.1 patterns, whether intentionally or not. HTTP/2 has a lot of benefits, but it also has some differences that can bite you if you aren’t careful.

Postscript

Another “gotcha” to look out for is that software supporting HTTP/2 is not fully mature yet. It is getting better all the time, but HTTP/2 implementations just haven’t had enough time to become as mature and solid as existing HTTP/1.1 software.

In particular, server support for HTTP prioritization is spotty at best. Many CDNs and load balancers either don’t support it at all, or have buggy implementations, and even if they do, buffering can limit its effectiveness.

Also, several implementations only support HTTP/2 over HTTPS. From a security point of view, this is great. However, it complicates development and debugging inside secure networks that don’t need or want TLS between components. It means managing a CA and localhost certificates for local development, logging TLS session secrets in order to inspect HTTP/2 requests with Wireshark or similar tools, and possibly spending additional compute resources on encryption.
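
On the session-secret point: browsers write their TLS secrets to the file named by the SSLKEYLOGFILE environment variable, and a Go server can do the equivalent with tls.Config.KeyLogWriter, which Wireshark can then use to decrypt a capture. A sketch, with the key-log path, port, and certificate filenames as placeholders; this should only ever be enabled for debugging:

```go
package main

import (
    "crypto/tls"
    "log"
    "net/http"
    "os"
)

func main() {
    // Append TLS session secrets in the NSS key-log format that
    // Wireshark understands. Debugging only: anyone with this file
    // can decrypt the captured traffic.
    keyLog, err := os.OpenFile("/tmp/tls-keys.log",
        os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0600)
    if err != nil {
        log.Fatal(err)
    }

    srv := &http.Server{
        Addr:      ":8443",
        TLSConfig: &tls.Config{KeyLogWriter: keyLog},
        Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            w.Write([]byte("hello over " + r.Proto + "\n"))
        }),
    }
    // net/http negotiates HTTP/2 automatically on TLS listeners.
    log.Fatal(srv.ListenAndServeTLS("cert.pem", "key.pem"))
}
```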

14 Comments

  1. Dmitry Pashkevich, April 20, 2019 at 1:48 am

    Thanks for sharing! So it sounds like you went ahead with http/2 instead of going back to http 1.1. Why do you call the transition a mistake then? Can you talk more about how you made the call?

  2. Thayne McCombs, April 22, 2019 at 11:20 am

    We have not yet moved our services to http/2 (we are using http/2 for resources served through a CDN though). Primarily because we don’t currently have the resources to make the necessary changes to our application. It was specifically a mistake to turn on http/2 without first making changes to our application so it could handle the different traffic patterns.

  3. […] HTTP/2 is a great leap forward for HTTP, but turning on HTTP/2 was a mistake. Learn how to solve this problem without having to increase computing. Read More […]

  4. Ismail Kably, April 22, 2019 at 4:31 pm

    Your comment is contradictory and unclear:
    – We have not yet moved our services to http/2
    – It was specifically a mistake to turn on http/2

    Is your post’s analysis based on testing HTTP/2 in a test environment? Or is it based on turning on HTTP/2 for a short time on production servers and then migrating back to HTTP/1.1?

  5. Thayne McCombs, April 22, 2019 at 4:34 pm

    Sorry, yes, that is a little unclear. We did enable http/2 in production (after testing in preproduction environments without any problems). But because of the issues described we went back to http/1.1 to keep our prod environment stable. And we continue to use http/1.1 for requests that aren’t served by a CDN.

  6. lucas dicioccio, April 22, 2019 at 4:56 pm

    You should be able to configure the max concurrency of the HTTP/2 session to limit how many streams a client can open at the same time. The default is 100, which is probably too high for your capacity plan. Another tuning point is the flow-control mechanism, though this will be harder to configure well and impossible to tune dynamically if your application has no direct access to flow control (e.g., because a load balancer terminates the HTTP/2 sessions).
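
    For illustration, that setting (SETTINGS_MAX_CONCURRENT_STREAMS) looks roughly like this with Go’s golang.org/x/net/http2 package; the limit of 16, the port, and the certificate files are placeholders, not recommendations:

    ```go
    package main

    import (
        "log"
        "net/http"

        "golang.org/x/net/http2"
    )

    func main() {
        srv := &http.Server{Addr: ":8443", Handler: http.DefaultServeMux}

        // Advertise a lower SETTINGS_MAX_CONCURRENT_STREAMS than the
        // common implementation default of ~100, so one client cannot
        // open an unbounded burst of parallel streams.
        if err := http2.ConfigureServer(srv, &http2.Server{
            MaxConcurrentStreams: 16,
        }); err != nil {
            log.Fatal(err)
        }

        log.Fatal(srv.ListenAndServeTLS("cert.pem", "key.pem"))
    }
    ```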

  7. Thayne McCombs, April 22, 2019 at 5:02 pm

    That’s good to know. Although at the time, we were using AWS ALBs to load balance HTTP/2 requests as HTTP/1.1 requests, and as far as I am aware, there isn’t any option for customers to configure that setting on ALBs.

  8. Nathan Myers, April 22, 2019 at 5:15 pm

    Seems like getting multiple queries together in traffic spikes could give you better performance if you can organize your services right. In any case, if queries are timing out, that really suggests you are close to your limits.

    You can often get better performance by grouping similar queries to a server that specializes in that kind of query, so data needed by multiple queries is hot in cache.

    Getting multiple queries at once enables you to schedule work on them, adaptively. The ones that need a database query can be started first, then, while waiting, you can process those that don’t. Those that need a lot of back and forth to one database can be grouped with others that talk to a different database.

  9. Just to note here that the number of concurrent requests even over HTTP/2 is an evolving situation. For example, Chrome now limits the number of inflight requests to 10 on slow connections – https://blog.josephscott.org/2019/04/16/chrome-delays-requests-on-slow-connections-even-with-http-2/

    Obviously, if most of your users are on fast connections, this isn’t something you are going to see.

  10. Frank de Wit, April 23, 2019 at 3:19 pm

    Multiplexing has been around since well before HTTP/2. It just created the problems you described above, so the techies never enabled it in their browsers.

    This situation could have easily been avoided by having old people in tech positions, but ageism in this industry is all too real.

  11. James McGregor, April 24, 2019 at 11:43 am

    I feel that this article blames the wrong things. Even if browsers did throttle HTTP/2 requests, it wouldn’t change the fact that the lack of server-side traffic shaping, and relying on client-side implementations instead, is pretty much insane (I struggle to find a better word for it). If that were my experience on my system, the first thing I would do is implement server-side traffic shaping ASAP. What is happening with HTTP/2 on current browsers is also possible with HTTP/1.1, and the fact that DDoS is not common for you guys amazes me 😉

  12. […] really enjoyed this post by LucidChart engineering on some recent troubles they encountered when enabling HTTP/2 on their […]

  13. Would it not be better to do only TCP load balancing and let the servers themselves serve HTTP/2? That way, the control is all local, and the less efficient HTTP/1.1 would not swamp the backend servers?

  14. Thayne McCombs, August 15, 2019 at 10:49 am

    If that was possible, yes. But our “load balancer” does more than just load balancing. It also performs routing based on data from the http request (like the Host header), modifies http headers, etc. Also, load balancing at the tcp level means load balancing connections rather than requests, which can lead to uneven load distribution if some clients make more requests than others.
