HTTP/2 Request Priorities - A Summary
Previously, I'd written about a particular use case we'd run into where we needed to make thousands of HTTP requests to download 3D assets to populate a WebGL scene.
Many of these assets (skeletons, meshes, lower-mip-level textures, poses) are higher priority than others (higher-mip-level textures, animations). In addition, objects closer to the camera should be prioritized higher than objects farther away.
The following table defines the priority of each asset download:
| Asset Type | Base Priority | 
|---|---|
| Object Description | 100 | 
| Skeleton | 90 | 
| Mesh | 90 | 
| Low-res texture | 90 | 
| Pose | 90 | 
| Animation | 50 | 
| High-res texture | 10 | 
Plus a modifier based on the requesting object:
| Object | Priority modifier | 
|---|---|
| Room | +9 | 
| My Avatar | +8 | 
| Other Avatars | +[0,7] -- distance to camera | 
Asset type determines the base priority. Within an asset type, the referencing 3D object adds a priority modifier. The room gets highest priority, your avatar next, and all remaining avatars are sorted by camera distance.
That is, object descriptions trump everything else. (They're small and contain links to other assets.) Skeletons, meshes, low-res textures, and poses must be loaded before the object to be rendered at all, so they claim the next-highest-level priority. Finally, animations and high-res textures can come in later.
Bandwidth-Delay Product
Let me take a brief digression to explain why priority is important.
The bandwidth-delay product measures the amount of data that must be in flight to fully utilize a pipe. In 2012, the average US Internet connection was 6.7 Mbps. Average round-trip to Google in US was, say, 55 ms. This places the bandwidth-delay product at 6.7 Mbps * 55 ms = 46 KB. If you cannot keep 46 KB in flight at a time, you are wasting bandwidth. Of course, not everyone has Google's data centers, and latency is jittery, and you likely want worldwide efficiency, so aim higher. Targeting a 150 KB bandwidth delay product, or even higher if you're worldwide, is not a bad idea.
Now, because the pipe should have a lot of data in flight at any one moment, a large number of requests should be sent to the server right away. Since the client doesn't know how big responses are, if the server returned low-priority responses first, the page would load more slowly than if low-priority requests weren't sent until all high-priority responses were downloaded. However, having the client wait to issue low-priority requests does not make good use of available bandwidth. The best option is to give the server enough information to prioritize responses, letting it send down high-priority responses first, making everything load faster. Experience with SPDY shows that it's critical that priority work well.
An ideal prioritization solution would satisfy the following objectives:
- make full use of network bandwidth
- retrieve responses for high-priority requests before low-priority requests
- immediately -- on the current or next frame -- retrieve any responses already cached by the browser
On the web, today, there is no way to accomplish all of these goals.
Again, for as much as people like to point out that priority is optional or advisory [1] [2], it is critically important. Multiplexed connections without priority are slower than HTTP/1! Browsers already prioritize HTTP/1 requests, so if they didn't prioritize SPDY or HTTP/2 requests, low-priority resources would contend for bandwidth with high-priority resources, causing initial load to be slower.
After whining about all of this to a friend, he casually said "Why don't you take it up with the standards bodies?" (Damn you, jfb!) I've not had good luck with web standards bodies in the past -- to the point that I have the impression some key people don't care a whole lot about real-world use cases -- but I thought I'd give it a shot.
It turns out that exposing a simple priority field on XMLHttpRequest (W3C WebApps WG), while easy, obvious, and an immediate solution to many use cases, does not satisfy everyone. In fact, it's been proposed before. The WebApps WG instead suggested that prioritization should be solved at the "Fetch" layer, so regular DOM fetches like <img> and >script< tags can benefit too. This increases the scope of the proposal, but fair enough...
So I went to WHATWG and discovered a proposal that, on first blush, looked absurdly complicated. It's based on HTTP/2's dependency tree priority model. I spent an evening or two wrapping my head around the dependency tree model and ran into some problems.
I now see that, for all of the varied use cases for prioritization on the web, there is no apparent solution that satisfies them all. (But one is very close, as you'll see.)
Use Cases
Let's work through a few illustrative use cases for prioritizing requests on a multiplexed connection.
Basic Page Load
The ideal scenario for a page load is that enough of the HTML arrives to link to the appropriate CSS and JS resources, after which the JS and CSS are downloaded (with resources completing serially so that response processing on the CPU can happen in parallel with transfer), followed by the remainder of the HTML.
This requires dynamic reprioritization of the HTML document mid-transfer.
Any referenced images are not critical to the initial load, thus have lower priority. They are also transferred in parallel rather than serially under the assumption that they can be progressively-rendered. Two half-complete JavaScript files is not useful at all, but two half-complete JPEGs certainly could be useful.
This use case illustrates sequential transfer (JS and CSS) and parallel transfer (images). Sequential transfer is ideal when a resource doesn't provide value until it's completely downloaded. Parallel transfer is ideal when resources support progressive display or when trying to be fair to competing requests.
Multiple Tabs
Consider a tabbed browser. Each tab may issue dozens of its own requests. The browser ought to prioritize requests from the focused tab. Background tabs receive the remainder of the network resources.
Switching tabs ought to efficiently reprioritize requests from the old and new tabs appropriately.
Proxy
Proxies take many requests from multiple clients and multiplex them on fewer backend connections. Proxies ought to be fair: if client A asks for low-priority requests and client B asks for high-priority requests, the proxy's resources should be split evenly between them.
Streaming Video
A video stream (e.g. DASH or M3U) consists of a sequence of short video segments. The current segment has the highest priority, followed by the next, and so on.
Reprioritization occurs on seek, though it's probably fine to disregard any outstanding requests upon seek.
Loading a 3D Scene
This is the use case described at the top. Unless the connection is otherwise idle, not a single byte of low-priority responses should come before high-priority responses. Requests should be periodically reprioritized as the objects move closer to the camera.
Large Page of Images
Image a web page that consists of 10,000 img tags. Ideally, they would download in priority order where priority is distance from current viewport. On scroll, requests would be reprioritized.
Server Push
I personally have no interest in server push and many smart people are hashing out how priority interacts with server-pushed data in various forums, so I will refrain from further discussion here.
Prioritization Primitives
We can take the uses cases above and break them down into the components that a prioritization algorithm should support.
- Sequential in specified order (A before B before C ...)
- Sequential in no particular order (A, B, C, but all of one before the next)
- Parallel (A, B, C, interleaved on the connection)
- Fairness (resources allocated evenly between {A, B, C} and {X, Y, Z}.)
- Reprioritization
- Parallel group dependencies (all of {A, B, C}, in parallel, before {D, E, F}, also in parallel)
- And because non-prioritized mux is worse than HTTP/1, advertised knowledge of prioritization support helps clients decide whether to blast all requests all at once or be more conservative :)
The Proposed Priority Models
HTTP/1
First, let's talk about the status quo: legacy HTTP served by N concurrent TCP connections.
Firefox maintains a priority queue on 32-bit signed integers fed from the nsISupportsPriority interface.
Chromium supports a handful of priorities but I can't find the code that implements the priority queue in there. All evidence says it exists though. :)
Most browsers also implement a first-paint optimization, where they distinguish between resources that are required for the first paint, like HTML, JS, and CSS, and download those first. Anything else: images or XMLHttpRequests is blocked until after the first paint. This is slightly inefficient: it leaves the connection idle during the first paint.
An HTTP/1 priority queue served by N TCP connections is inefficient for a couple reasons:
- round-trips and limited concurrency make it hard to keep the pipe full (especially on high-BDP connections)
- cannot reprioritize an in-flight request
- TCP flow control works better if you have one saturated connection rather than 8 unsaturated connections
SPDY/3
The version of SPDY currently in widespread use, SPDY/3, associates a 3-bit priority field with every stream. Streams cannot be reprioritized. "The sender and recipient SHOULD use best-effort to process streams in the order of highest priority to lowest priority."
Original SPDY/4 Proposal
Two years ago, Google proposed switching SPDY/4's dependency model to a system based on stream dependencies and weights. The idea is that, rather than using numeric priorities, streams form a DAG which represents a partial ordering relation.
At each depth "level" of the DAG, weights are used to distribute available resources across streams.
In this model, streams could have multiple parents. You could say, for example, C depends on A and B, placing C at a lower priority than both A and B.
This proposal also introduced reprioritization.
SPDY/4 advertises on the connection whether prioritization has an effect.
One of the gnarlier bits of the stream dependency model is that it introduces some number of race conditions. After a stream is closed, how long is its information, for the purposes of future reprioritization packets, retained?
Microsoft's HTTP/2 Proposal
In a simplified form, the SPDY/4 dependencies and weights proposal made it into the HTTP/2 draft. However, prioritization was still a somewhat controversial topic. Osama Mazahir, on behalf of Microsoft, proposed an alternate priority design that appeared to garner some support. See the replies in that thread.
The basic idea is that there are groups of streams. Each group is assigned a weight, and within each group, streams are assigned priority.
Osama does a better job describing the tradeoffs, but the main argument is that it's simpler and supports more use cases more naturally than dependencies and weights.
The large open questions seem to be: "So what priority values do people choose? What if you want to prioritize resource B between A and C, but A's and C's priorities differ by only 1?" If you use numeric priorities, you have to select arbitrary values for people to
After this thread, I never saw mention of it again. What happened?
HTTP/2 Draft 15
HTTP/2 Draft 15 goes into great detail about the stream dependencies and weights priority model.
Every stream is given one 31-bit parent and an 8-bit weight in [1, 256]. The server SHOULD process streams from the root of the tree down, dividing available resources among streams by their relative weights. If a parent stream closes, it is removed from the tree and its weight is divided among its children.
The default stream parent is zero - the root.
One repeated bit of confusion I've seen on the mailing lists is the difference between weights and priorities. Priorities are used to specify relative ordering: A before B. Weights are used to divide resources among many competing requests. If I wanted {A0,A1,A2,A3}[weight=256] before {B0,B1,B2,B3}[weights=128] before {C0,C1,C2,C3}[weights=1], is the expectation that the gateway proxy would open 12 simultaneous backend connections, and divide the connection with 2/3 of the frames solving for A, 1/3 solving for B, and a trickle for C? That would be silly: we want all As before any Bs. So then is the expectation that proxies would "tend to choose" streams of higher weight and run them to completion? If so, that violates at least the spirit of the algorithm described in HTTP/2.
In terms of server decision-making and client bandwidth utilization, weighting make a very poor substitute for true prioritization, so it's important to treat them separately. Priorities specify ordering, weights specify resource allocation.
However, there doesn't seem to be much practical implementation experience with the HTTP 2 priority model. Mozilla's current implementation ignores dependencies entirely, which would have unexpected results against a compliant connection, as low-priority resources would consume bandwidth while the page is waiting for high-priority resources. (To be fair, this implementation is actively in progress.)
Which Priority Protocol is Best?
Let's enumerate the prioritization primitives again. The goal is to list whether a particular primitive is expressible in a given protocol. We won't even consider HTTP/1 because it doesn't solve for efficient bandwidth usage on a high-latency connection. The three priority protocols under consideration are: SPDY/3, SPDY/4 as of 2012, Microsoft, and HTTP/2.
SPDY/3 is easy to eliminate from the running too: 3 bits of priority is way too limiting in many use cases, including, for example, in the use case of streaming video chunks. After 8 chunks, you would be out of priority levels.
| Use Case | SPDY/4 2012 | Microsoft | HTTP/2 | 
|---|---|---|---|
| Sequential in specified order (A before B before C ...) | 👠| 👠| 👠| 
| Sequential in no particular order^ | |||
| Parallel (A, B, C in parallel) | 👠| 👠| 👠| 
| Fairness (resources allocated evenly between {A, B, C} and {X, Y, Z}.) | 👠| 👠| 👠| 
| Reprioritization | 👠| 👠| 👠| 
| Parallel group dependencies ({A, B, C} in parallel before {D, E, F}, also in parallel) | 👠| 👠| |
| Advertisement of prioritization support | 👠| 
^ None of the protocols allow the requestor to specify that responses should be sent in their entirety before the next response is sent without also defining a specific sequence. I suppose the assumption is the server can make this determination by content type? I wonder whether a bit to indicate "I don't care about the order, but once you start transmitting a resource, please finish before starting the next." would be useful in practice.
Many of the desired use cases are supported in HTTP/2, but Microsoft's proposal appears to satisfy strictly more. I wonder what happened to it in committee? I will reach out to its authors and supporters and see.
The biggest issues with HTTP/2 are that:
- There is no way to express "download {A,B,C} in parallel then download {D,E,F} in parallel"
- If priority is truly optional, and there is no mechanism by which the server advertises support, then clients will need to play it safe, reducing the possible upside from a multiplexed connection.
Can HTTP/2 be fixed?
Can we modify HTTP/2 to add support for the {A,B,C} in parallel followed by {D,E,F} use case?
The idea of dummy placeholder streams has been bandied about. Let's see what that might look like:
- 0 is the root.
- P1 is the high-priority dummy stream.
- P2 is the low-priority dummy stream.
If we defined a tree as such:
P1 -> 0
{A, B, C, P2} -> P1
{D, E, F} -> P2
and changed the HTTP/2 specification such that dummy nodes never closed and streams closer to the root were serviced first, then the "Parallel groups" use case is satisfied. However, watch what happens to fairness in the proxy / multiple tabs use case.
Consider two inbound connections to a proxy that are being multiplexed on one outbound connection.
Inbound connection A has three priority levels, thus three dummy streams. Inbound connection B has two priority levels, thus two dummy streams.
0 <- A_P1 <- A_P2 <- A_P3 0 <- B_P1 <- B_P2
Now let's assume that inbound connection A has two requests: A_S1 at priority A_P1 and A_S2 at priority A_P3. Inbound connection B also has two requests: B_S1 at priority B_P1 and B_S2 at B_P2.
A_P1 <- A_S1 A_P3 <- A_S2 B_P1 <- B_S1 B_P2 <- B_S2
Note that A's priority levels go deeper than B's. Per the proposed spec modification above, A_S1 and B_S1 would have equal priority and be serviced fairly. However, if two streams are at unequal depth in the tree, the one closer to the root would win. B_S2 would unfairly starve A_S2. So it seems that fairness and group prioritization are at odds in a collapsing, tree-based dependency model.
What now?
It's surprising to me how hard it is to get priorities correct over multiplexed streams. Microsoft's proposal seems to be the most useful, but I have no insight into what caused it to fail in committee.
I don't have any ideas for how to save a tree-based model. We need either weighted stream groups or we need to convert the tree into a DAG -- that is, if nodes had multiple parents, the dependency model would work swimmingly.
Getting priority right is critical -- HTTP/2 will be the most popular protocol on the Internet, and the upside potential in both page-load efficiency and performance of new web applications is huge.
Pre-emptive Rebuttals
Why not define your own priority extension?
The big advantage to using HTTP/2 is that it's a standard. CDNs, caching proxies, gateways, and even load balancers will implement it. A custom protocol would be prohibitive in practice.
Just get over it and use weights as priorities.
I hope I demonstrated earlier why weights don't solve the prioritization problem. "But it's close enough!" The speed of light isn't getting any faster. Page load optimization is getting increasingly expensive. Projects like SPDY, QUIC, and HTTP/2 are ambitious, protocol-level efforts for arguably marginal wins. Being "close enough" is no longer sufficient. Defining a protocol that can specify the ideal transfer schedule has a huge return on investment, assuming the protocol is adopted and implemented well.
Well, I asked about your "don't let the second batch of images interfere with the first batch's delivery" case on Moznet's #necko channel, and Nicholas Hurley suggested that a dependency tree with the same form you present above with dummy streams P1 and P2 could do nicely: you just have to give P2 weight=0.
That way, the server would know not to bother with P2's subtree unless it couldn't make any progress on any of P1's other children.
... but evidently you already knew about this basic idea, and simply neglected to trackback this post in your follow-up, http://chadaustin.me/2014/11/update-on-http2-priority/