Sunday, December 5, 2010

Adventures in UNIX: Pipe buffer edge cases, half-open sockets

So yesterday I had a really neat idea: exposing pipe-based programs as network services. You could open a connection to a program, send it data, and the remote computer would put it through some predefined command and send it back. Then I realized that with IPv6, you could give each program its own IP address, and give them all names in the DNS, so that you'd have a perfectly usable system without using any higher-level protocols than DNS and TCP. Then I realized that this would be simple enough that you wouldn't even need to write any code to make this work - it's all doable with simple shell commands! (I was wrong about this last one, and that's the topic of this post.)

(By way of comparison: This is sort of an inversion of the Plan 9 model of network computation. Instead of mounting a remote filesystem and piping it through a local program, you're piping local data through a remote program.)

Pipes are usually a one-way structure, but for this to work properly, I needed something a little more exotic. I needed to be able to take the output from a command and pipe it back around to the beginning of the pipeline, so that the pipeline has a loop in it. If I could do that, I could combine netcat with any command, and that'd be a one-liner that implements a server. :D

So here's the first thing I tried.

mkfifo t
while true; do nc -l 9999 < t | tr a-z A-Z >> t; done
nc localhost 9999 < testdata

In a perfect world, this would work! But here's where we get into the details of pipe buffers.

The first problem is in the server. When you use a pipe, data is actually buffered along the way, inside the commands being piped (tr, like most programs, buffers its output when it isn't writing to a terminal). Normally this is transparent, because the buffers are flushed when the previous command exits. This doesn't work when you have a loop through a fifo, though! The data that's buffered in the tr command doesn't get flushed to the fifo until netcat exits, so netcat never has the chance to send the tail end of the data. The only way I could think of to solve this was to write some code for the server - pretty disappointing, but probably necessary. (I'm not going to post that code here, because it's messier than even a prototype can justify. >_>)
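For a rough idea of what such a server could look like, here's a minimal sketch in Python. To be clear, this is not the code I actually wrote. The names serve_once and serve_forever are made up for illustration, and it assumes the client signals the end of its input by closing the write side of its connection. It sidesteps the buffering problem by running the command to completion for each connection, so the command's exit flushes everything.

```python
import socket
import subprocess


def serve_once(listener, command):
    """Accept one connection, run its input through `command`, send the output back."""
    conn, _ = listener.accept()
    with conn:
        # Read until the client stops writing (recv returns b"" at EOF).
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:
                break
            chunks.append(data)
        # Run the filter to completion; when it exits, its buffers are flushed.
        result = subprocess.run(command, input=b"".join(chunks),
                                stdout=subprocess.PIPE, check=True)
        conn.sendall(result.stdout)


def serve_forever(port, command):
    """Hypothetical entry point: handle one connection at a time, forever."""
    with socket.create_server(("", port)) as listener:
        while True:
            serve_once(listener, command)
```

Calling serve_forever(9999, ["tr", "a-z", "A-Z"]) would stand in for the shell loop above. The trade-off is that nothing is sent back until the whole input has arrived, so this isn't a streaming server.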

But that's not all - it turns out the client part is broken, for a completely different reason. When you give netcat an EOF (Ctrl-D), it doesn't know how to tell the remote side of the connection that there was an EOF. The server then doesn't have any way to know when to flush the buffers out and end the command, so the whole thing deadlocks waiting for more input that's never coming.

It turns out that TCP solves this problem; the bug is in netcat. With a TCP socket, you can close one direction of traffic, but keep using the other - for example, when you're done writing data to a socket, you can shut down the socket for writes, which signals to the remote side that you're done writing, and then read whatever the server sends back. This, unfortunately, required more code.
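
# Python 3. Raw bytes are read from stdin and written to stdout.
import select
import socket
import sys

BUF_SIZE = 4096
addr = sys.argv[1]

def attempt_read(s, buf_size):
    # Poll without blocking: return whatever is ready on the socket, if anything.
    if select.select([s], [], [], 0)[0]:
        return s.recv(buf_size)
    return b''

s = socket.create_connection((addr, 9999))

# Send stdin to the server, draining any replies that arrive along the way.
buf = sys.stdin.buffer.read(BUF_SIZE)
while buf:
    s.sendall(buf)
    sys.stdout.buffer.write(attempt_read(s, BUF_SIZE))
    buf = sys.stdin.buffer.read(BUF_SIZE)

# Half-close: tell the server we're done writing, but keep the read side open.
s.shutdown(socket.SHUT_WR)

# Read whatever the server sends back, until it closes its side.
buf = s.recv(BUF_SIZE)
while buf:
    sys.stdout.buffer.write(buf)
    buf = s.recv(BUF_SIZE)
sys.stdout.buffer.flush()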

(This is trivial enough that I'm planning to port it to C soon.)

Finally, some good news: this works perfectly! :D With this, you can open a connection and use it as a component in a pipe.
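If you want to see the half-close behavior in isolation, here's a small self-contained sketch. It uses socket.socketpair() instead of a real TCP connection, and the helper names (upcase_peer, half_close_roundtrip) are made up for the demo. One side writes, shuts down its write half, and can still read the reply.

```python
import socket
import threading


def upcase_peer(sock):
    """Stand-in for the server: read until EOF, then reply in uppercase."""
    received = b""
    while True:
        data = sock.recv(4096)
        if not data:  # b"" means the other side shut down its write half
            break
        received += data
    sock.sendall(received.upper())
    sock.close()


def half_close_roundtrip(payload):
    client, server = socket.socketpair()
    worker = threading.Thread(target=upcase_peer, args=(server,))
    worker.start()
    client.sendall(payload)
    client.shutdown(socket.SHUT_WR)  # done writing; reading still works
    reply = b""
    while True:
        data = client.recv(4096)
        if not data:
            break
        reply += data
    client.close()
    worker.join()
    return reply


print(half_close_roundtrip(b"half-open sockets"))  # b'HALF-OPEN SOCKETS'
```

Without the shutdown call, upcase_peer would sit in recv forever waiting for more input, which is the same deadlock described above.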

Wednesday, December 1, 2010

p2p DNS

Now that the US is considering forcing pirate domain names out of the DNS, one of the founders of The Pirate Bay is floating the idea of a p2p DNS alternative.

Okay, wow. This is an incredibly terrible idea.

I'll start with the obvious objections:

  • The DNS is meant to be authoritative
    In a p2p system, you don't know who you can trust, because everybody else is just a peer. The DNS is completely useless if the results you get back aren't authoritative. Some people are proposing web-of-trust type solutions, or other idiocy. NO. Web-of-trust doesn't scale, and requires too much human maintenance to ever work. Even being able to compute some kind of transitive trust metric is an open research question, and then there's the so-far-intractable problem of picking a trust metric. Any answer you get from a p2p DNS system will be unreliable.
  • The DNS is meant to be reliable
    DNS is meant to be a transparent layer, when you're using the Internet. It's something that you just sort of expect to work, and bad stuff happens when it doesn't. And the thing about p2p systems is, it's actually pretty near impossible to make any guarantees at all about their behavior. I've actually read a lot of papers about building distributed storage systems. And you know what? Nobody's ever actually managed to get anything better than a relatively weak statistical guarantee about any property of a p2p storage system. For the DNS, that's simply not good enough.
  • Performance
    The DNS has pretty tight performance constraints, and p2p systems (for all their advantages) are extremely vulnerable to DoS attacks. It's pretty much inherent in their design - any p2p system will require a peer to have fairly complex communications with a lot of other untrusted peers. And, as many people have shown over the years, when you manage to take down the DNS with a (D)DoS attack, people tend to flip out.
  • Secure decentralized systems are HARD
    Look, it's not like it's impossible for random people on the Internet to band together and write a program. It's not even that difficult; open source has proven that. What is hard is getting random people together to solve a fundamentally hard problem in computer science. Let me put it this way. If a well-respected professor of computer science were to propose a p2p DNS system, I would treat it with heavy skepticism. If Peter Sunde proposes it, and expects the Internet hivemind to just sort of blast through all the hard problems by sheer virtue of wanting torrents, then I just laugh. (And then, if it looks like people are taking him seriously, I write a blog post like this.)

There are some people whose first reaction to any data management problem is to try to stick it in a magic DHT and forget about it. In many cases it works - see BitTorrent, for example. A DHT will work in any application where you don't especially need data to be reliable or trustworthy; it's a perfect fit for BitTorrent peer exchange, where reliability is optional because the DHT is only a backup for the real tracker, and trustworthiness doesn't matter because the peers aren't trusted in the first place. For the DNS, though, a DHT is exactly the wrong solution.

It may be possible, someday, to fully decentralize the DNS. To do it will take some fundamental advances in computer science, though, and Peter Sunde isn't going to be able to make that happen by rallying the pirates to his cause.