Tuesday, November 3, 2009

Idea post: low-latency caching proxies for mobile internet

This is just an idea post, since I seriously don't have time to implement this right now.

A problem that has been getting worse with mobile internet (on cell phones, etc) is latency. Rather, I should say, the bandwidth on mobile devices has been getting better and better, and is starting to rival slow DSL connections for throughput, but the latency is still atrociously bad. This could be addressed by mobile network operators, but since I'd like to see this problem solved before I am dead, I think it's fair to look into alternate solutions.

Quick explanation first: Connection speed is more complicated than people usually think. There are two main components to it: throughput (what usually gets called bandwidth), which is the number of bytes per second, and latency, which is the amount of time those bytes take to arrive. Internet connections are usually sold exclusively in terms of throughput, but latency can be really important, especially for applications that have to go back and forth between you and a server a lot. Wireless networks (cell networks and wi-fi both) generally have way higher latency than wired connections of any type. For a good wired connection, 30-150 milliseconds latency is normal, while for a wireless connection, it ranges from a few hundred to a few thousand milliseconds. Congratulations! You now know more about internet connections than a Level 2 AT&T tech support person.
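To put rough numbers on that, here's a back-of-the-envelope sketch. The page size, file count, and link speeds are made up for illustration, and the model ignores things like TCP setup, so treat it as a feel for the proportions rather than a measurement.

```python
def transfer_time(size_bytes, throughput_bytes_per_s, latency_s, round_trips=1):
    """Rough fetch time: time spent waiting plus time spent transmitting."""
    return round_trips * latency_s + size_bytes / throughput_bytes_per_s

# Assumed page: 20 files of 10 KB each, fetched one at a time.
PAGE_BYTES = 20 * 10_000
THROUGHPUT = 125_000  # 1 Mbit/s expressed in bytes per second (assumed)

wired = transfer_time(PAGE_BYTES, THROUGHPUT, 0.05, round_trips=20)   # 50 ms latency
mobile = transfer_time(PAGE_BYTES, THROUGHPUT, 0.5, round_trips=20)   # 500 ms latency

print(f"wired: {wired:.1f} s, mobile: {mobile:.1f} s")
```

Same throughput on both links, but the high-latency one spends most of its time just waiting.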

idea 0: low-latency caching proxies
This is actually a pretty trivial thing - anybody can set up a proxy server, and point their mobile web browser at it. (Well, theoretically.) This doesn't gain us that much, though. It'll mainly reduce latency caused by the website, not by the mobile network, so it doesn't really address the actual problem. There are two reasons for even mentioning this: it gives us compression, since that's a relatively straightforward thing that proxies can add, and it makes the rest of the ideas possible.
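As a tiny illustration of the compression part, here's a sketch using gzip from Python's standard library. The sample HTML is invented and very repetitive, so the ratio it gets is much better than a real page would see.

```python
import gzip

# Invented sample body; real pages are less repetitive than this.
html = b"<li class='item'>hello mobile web</li>\n" * 200

compressed = gzip.compress(html)

print(f"original: {len(html)} bytes, gzipped: {len(compressed)} bytes")
```

The client has to advertise that it accepts gzip (via the Accept-Encoding request header) for the proxy to send this, which browsers generally do.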

idea 1: aggressive HTTP pipelining
HTTP allows for an optimization called "pipelining". Basically, the client sends a bunch of requests back to back on the same connection, without waiting for each response, and the server responds to them in order. This can do wonders for reducing latency. The usual sequence of actions goes something like this: send request, wait, get response, send request, wait, get response, send request... Pipelining eliminates most of the waiting. Clients will usually limit the amount of pipelining they do to be kind to web servers, but if we're getting everything from our own proxy server from idea 0, we can send as many requests as we want.

It's not perfect, though. On the web, documents will usually pull in other files, like images, scripts, etc, and those can themselves pull in other files. So while we'd like to make all the requests up front, we usually don't know which files we'll need before the responses start coming back. Pipelining helps, but we still end up having to wait more than we'd like.
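Here's a rough sketch of why the dependency chain is the limiting factor. It only counts round trips (transmission time is ignored), and the example page layout is hypothetical.

```python
def sequential_time(dependency_levels, latency_s):
    # One request at a time: every single file costs a full round trip.
    return sum(dependency_levels) * latency_s

def pipelined_time(dependency_levels, latency_s):
    # With pipelining, all files discovered at the same moment share one
    # round trip, so we pay once per level of the dependency chain.
    return len(dependency_levels) * latency_s

# Hypothetical page: 1 HTML file, which pulls in 12 scripts/stylesheets/images,
# which in turn pull in 5 more files.
levels = [1, 12, 5]
LATENCY = 0.5  # seconds, assumed

print(f"sequential: {sequential_time(levels, LATENCY):.1f} s")
print(f"pipelined:  {pipelined_time(levels, LATENCY):.1f} s")
```

Pipelining collapses each level into one wait, but it can't do anything about the number of levels.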

idea 2: inlining images and scripts and stylesheets
What if instead of depending on the client to fetch everything, we had the server help a bit? A lot of files on the web can actually be either linked to as external documents, or rendered inline inside a page. JavaScript and CSS can be inlined pretty easily, and every web browser supports that. It turns out it's possible to also inline images this way, using something called a data URL.

The proxy server could fetch the web page, fetch all the other parts of the page that the client would normally have to request individually, and package them up into a giant page, before sending that to the client. If this works perfectly, then the client only has to make one request, and wait for one response.

This has some pretty serious disadvantages, though. For one thing, it bypasses caching on the client device, so anything included using this method would have to be fetched each time the page loads. This would be fine for small files, but there would have to be some kind of heuristic on the server for what to inline, and what to leave out. I'm also not entirely sure that JavaScript behaves exactly the same way when it's linked versus inline, but given how well mobile devices support JavaScript to begin with, that may not be such a big issue.
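A minimal sketch of what that heuristic could look like, using the simplest rule I can think of: inline anything under a size cutoff, link everything else. The cutoff value and the function names are my own assumptions, not anything standard.

```python
import base64

MAX_INLINE_BYTES = 4096  # assumed cutoff; would need tuning in practice

def to_data_url(content: bytes, mime_type: str) -> str:
    """Encode a file as a base64 data URL."""
    # Note: base64 makes the payload about a third larger than the raw bytes.
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"

def inline_or_link(url: str, content: bytes, mime_type: str) -> str:
    """Return a data URL for small files, or the original URL for large ones."""
    if len(content) <= MAX_INLINE_BYTES:
        return to_data_url(content, mime_type)
    return url

small = inline_or_link("http://example.com/dot.gif", b"hi", "text/plain")
large = inline_or_link("http://example.com/photo.jpg", b"x" * 100_000, "image/jpeg")
print(small)
print(large)
```

The proxy would then substitute the returned string into the src or href attribute of the page it's rewriting. Large files stay as normal links, so the client can still cache them.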

idea 3: image prescaling
Jumping back to reducing bandwidth usage here. It seems kind of a waste to download a full-size image to a mobile device when it's just going to scale it down before it's displayed. Why not scale the image down to the display size before it leaves the proxy? We'd need some way to control this from the client end, since otherwise the server won't know what size to scale to. A custom request header would work, probably. This would also save a lot of CPU time on the mobile device, which translates to faster rendering and less battery usage.
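Here's a sketch of the size calculation on the proxy side. The header name X-Display-Width is something I made up for this example; the actual resizing would be done by whatever image library the proxy uses, which isn't shown.

```python
def scaled_size(width, height, max_width):
    """Fit an image within max_width, keeping aspect ratio. Never scales up."""
    if width <= max_width:
        return width, height
    new_height = max(1, round(height * max_width / width))
    return max_width, new_height

def target_size(headers, width, height):
    """Pick the output size based on a hypothetical X-Display-Width header."""
    try:
        max_width = int(headers.get("X-Display-Width", ""))
    except ValueError:
        return width, height  # header missing or malformed: leave image alone
    if max_width <= 0:
        return width, height
    return scaled_size(width, height, max_width)

print(target_size({"X-Display-Width": "320"}, 1600, 1200))
print(target_size({}, 1600, 1200))
```

Falling back to the original size when the header is absent means ordinary clients going through the same proxy are unaffected.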

This may not work so well for zoomable browsers, though, such as the one on the iPhone. In this case, it would be interesting for the server to convert the image to a progressive format (a progressive JPEG, say), so that the client could receive a low-detail version first, and then the rest of it later. We could even imagine something fancy, with progressive images and HTTP Range headers, where the client first downloads the first progressive part of each image, enough to display a scaled version, and then goes back and fetches the rest of each image file.
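The two-pass fetch could be sketched like this. It only builds the request headers and the order to send them in; the fixed preview size is an assumption, since the number of bytes needed for the first progressive pass really depends on the image.

```python
def preview_range(preview_bytes):
    # Range byte positions are zero-based and inclusive.
    return {"Range": f"bytes=0-{preview_bytes - 1}"}

def remainder_range(preview_bytes):
    # An open-ended range means "from this offset to the end of the file".
    return {"Range": f"bytes={preview_bytes}-"}

def fetch_plan(image_urls, preview_bytes):
    """First ask for the start of every image, then go back for the rest."""
    first_pass = [(url, preview_range(preview_bytes)) for url in image_urls]
    second_pass = [(url, remainder_range(preview_bytes)) for url in image_urls]
    return first_pass + second_pass

# Hypothetical image URLs and an assumed 2 KB preview size.
plan = fetch_plan(["/a.jpg", "/b.jpg"], 2048)
for url, headers in plan:
    print(url, headers)
```

All the first-pass requests could go out in one pipelined burst, so the page becomes viewable after roughly one round trip, with the detail filling in afterwards.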

(And while we're at it, we can convert GIFs to PNGs. :p)

idea 4: speculative server-initiated prefetch (needs a better name)
We've been trying to work around the fact that clients have to fetch everything that's on a webpage to load it. What if we bypass that entirely? (this one is more me thinking out loud, actually)

What I'm proposing here is a way (that I haven't thought through properly) to have the server push files to the client, rather than waiting for the client to request them. This would allow the client to cache page elements properly, but still have the same performance as inlining page elements. The problem we'd run into then is that the server has no way of knowing what's in the client's cache (and really, shouldn't), so there will still be a lot of wasted bandwidth here. Maybe the server could send a list of page elements and -- nope, then we're back to client fetch anyway. Hmm. This one might be a lost cause. >_>

Existing stuff

Naturally, I'm not the first one to think of this. There are a few implementations of parts of this; the one that comes to my mind first is Opera Turbo. What I'm proposing here, though, is an open-source server that anybody can run, rather than an add-on controlled by one company. After all, the Internet has proven time and time again that the usefulness of a technology is proportional to how easy it is for an enterprising geek to roll their own.
