Sunday, November 21, 2010

The Man-in-the-Middle Defense

I'm not sure why this is, and it's probably a really bad idea, but for some reason as BitTorrent gets harder to use (with all the major trackers being targeted), I become more and more motivated to come up with a p2p protocol that doesn't have BitTorrent's weaknesses. So in that vein, here's something I've been thinking about.

One of BitTorrent's biggest problems is that every user knows about all the other users, and so sending nasty letters to everybody downloading a given file is as simple as grabbing a list of IP addresses and contacting a bunch of ISPs. What we need, then, is a way to anonymize connections, and transfer data between two computers without either of them knowing the identity of the other. There are already a few protocols that get us partway there: using a traditional proxy, you can mask the address of one party, but not both; using something like Tor (at least in its ordinary client mode), you can prevent any one proxy server from knowing both the client and server address, but the client still has to know the server address. So what can we do instead?

Repeaters

The golden rule of system design: When in doubt, go for the simplest thing that could possibly work!

So in this hypothetical protocol, two peers (Alice and Bob, why not) have been communicating pseudonymously. At some point, Alice wants to send Bob a file which is too big for the channel they've been using, so they both agree on an anonymous repeater. At some predetermined time (in most protocols, the time would just be "now"), they both connect to a server, and the server just repeats everything it sees on one connection to the other.
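To make "repeats everything it sees on one connection to the other" concrete, here's a minimal sketch in Python. It assumes the repeater already holds two connected sockets, one per peer; the names pump and repeat are made up for illustration and aren't part of any existing protocol.

```python
import socket
import threading


def pump(src: socket.socket, dst: socket.socket, bufsize: int = 4096) -> None:
    """Copy bytes from src to dst until src reaches end of stream."""
    while True:
        chunk = src.recv(bufsize)
        if not chunk:
            break
        dst.sendall(chunk)
    # Pass the end-of-stream along so the other peer sees it too.
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # the other side has already gone away


def repeat(conn_a: socket.socket, conn_b: socket.socket) -> list:
    """Relay bytes in both directions; returns the two worker threads."""
    threads = [
        threading.Thread(target=pump, args=(conn_a, conn_b), daemon=True),
        threading.Thread(target=pump, args=(conn_b, conn_a), daemon=True),
    ]
    for t in threads:
        t.start()
    return threads
```

The repeater never interprets the bytes it forwards, which is what lets other protocols be layered on top of it.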

(Once you've got the simplest thing, you may need to fix up a bunch of other problems.)

First, we need a way for a single server to distinguish between a lot of people connecting to it at once - which of them actually want to talk to each other? One solution is to have Alice and Bob agree on a shared secret ahead of time. When they both contact the repeater, they send the shared secret, and the repeater knows to establish a connection between them.
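Here's a rough sketch of that matching step. The Rendezvous class and its arrive method are hypothetical names, and hashing the secret before using it as a lookup key is my own assumption, not something the scheme requires.

```python
import hashlib


class Rendezvous:
    """Pairs up connections that present the same shared secret."""

    def __init__(self):
        # Maps a hash of the secret to the connection still waiting on it.
        self._waiting = {}

    def arrive(self, secret: bytes, conn):
        """Return (first, second) once both peers have arrived, else None."""
        # Assumption: keep only a hash, so the raw secret isn't held in memory.
        key = hashlib.sha256(secret).digest()
        partner = self._waiting.pop(key, None)
        if partner is None:
            self._waiting[key] = conn
            return None
        return partner, conn
```

Whoever arrives first simply waits. Once a pair is matched the entry is removed, so a third connection presenting the same secret starts waiting all over again rather than joining the existing pair.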

That's weak against eavesdroppers, though, and in today's world of relatively widespread DPI gear, you just can't trust the network. This can be solved with SSL, but only if Alice and Bob can agree on the repeater's public key ahead of time. (If they can't, an attacker can present a fake SSL cert and sit in the middle of the connection. Faking a cert takes a little extra effort, but if we were only defending against unmotivated eavesdroppers, this whole problem would be pretty easy!) Solving this properly will take a little more thought than I'm willing to put into it tonight, unfortunately. (Maybe passing the SSL layer through the proxy?)
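For the case where Alice and Bob did agree on something ahead of time, one common approach is certificate pinning: compare a fingerprint of the certificate the repeater presents against the value agreed on in advance. This sketch pins the whole certificate rather than just the public key, which is a simplification on my part. The function names are made up; in Python the DER bytes could come from ssl.SSLSocket.getpeercert(binary_form=True).

```python
import hashlib
import hmac


def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned_hex: str) -> bool:
    """True only if the presented certificate is the one agreed on earlier."""
    # compare_digest runs in constant time for equal-length inputs.
    return hmac.compare_digest(cert_fingerprint(der_cert), pinned_hex.lower())
```

A client would drop the connection whenever matches_pin returns False, no matter what a certificate authority says about the cert.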

How do we keep the repeaters from learning about Alice and Bob's identities? For that, we can use proxies - we only need to keep their identities safe from the repeater, not the other way around, so existing proxy mechanisms will work for this.

How will we handle the performance cost of going through so many layers? Alice and Bob can agree on some protocol ahead of time for dividing the data across several paths, or something like that. I'm explicitly not designing that layer here - proxies and repeaters are general enough mechanisms that more complex protocols can be implemented on top of them pretty easily. (That's one of the biggest advantages of using the simplest thing!)
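Just to show the kind of thing that could sit on top, here's one possible way to divide data across several paths: number the chunks, deal them out round-robin, and sort them back into order on the other end. This isn't a design for that layer, only an illustration, and the function names and chunk size are arbitrary.

```python
def split_across_paths(data: bytes, n_paths: int, chunk_size: int) -> list:
    """Deal numbered chunks of data out to n_paths lists, round-robin."""
    paths = [[] for _ in range(n_paths)]
    for seq, offset in enumerate(range(0, len(data), chunk_size)):
        chunk = data[offset:offset + chunk_size]
        paths[seq % n_paths].append((seq, chunk))
    return paths


def reassemble(paths) -> bytes:
    """Rebuild the original bytes from (seq, chunk) pairs in any order."""
    chunks = [pair for path in paths for pair in path]
    chunks.sort(key=lambda pair: pair[0])
    return b"".join(chunk for _, chunk in chunks)
```

Each list would be sent through a different proxy-and-repeater path. The sequence numbers mean it doesn't matter which path delivers first.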

As a completely unintended bonus, if anonymous repeaters like I've described here become widespread, that could be a solution to the problem of establishing a connection between two computers that are both stuck behind NAT.

Digital Dead Drops

A completely different solution to the original problem would be establishing "dead drops" - locations where you can drop a file for a certain amount of time, and somebody else can pick it up later. (I've already seen pastebin used like this, come to think of it!) If both parties use proxy chains, and the data is encrypted, then this is even more secure than using repeaters because you avoid the simultaneous connection problem: with a repeater, an eavesdropper has a hint about who you're talking to, because you're connected (however indirectly) to the other user at the same time.
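A dead drop server could be as small as this in-memory sketch. The DeadDrop class, the random drop IDs, and the rule that a drop disappears once it's picked up are all assumptions for illustration. The blob is expected to be encrypted before it ever gets here.

```python
import secrets
import time


class DeadDrop:
    """Holds opaque blobs for a limited time; each can be picked up once."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be tested
        self._drops = {}     # drop_id -> (expires_at, blob)

    def drop(self, blob: bytes, ttl_seconds: float) -> str:
        """Store blob and return an unguessable ID to share with the peer."""
        drop_id = secrets.token_urlsafe(16)
        self._drops[drop_id] = (self._clock() + ttl_seconds, blob)
        return drop_id

    def pick_up(self, drop_id: str):
        """Return the blob and forget it, or None if missing or expired."""
        entry = self._drops.pop(drop_id, None)
        if entry is None:
            return None
        expires_at, blob = entry
        if self._clock() >= expires_at:
            return None
        return blob
```

Alice would pass the drop ID to Bob over the existing pseudonymous channel, and the two of them never need to be connected at the same time.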

The Next Problem

The other major problem with p2p networks is that search is public - by making a file available for others to download, I'm also announcing to the world that I have that file, and some people might be upset about that.

I have some ideas about how to solve that, but I've probably spent too much time blogging about this topic already. Instead, I think it's time for me to start coding these things up, and see what works. Should be fun!
