Sunday, December 5, 2010

Adventures in UNIX: Pipe buffer edge cases, half-open sockets

So yesterday I had a really neat idea: exposing pipe-based programs as network services. You could open a connection to a program, send it data, and the remote computer would put it through some predefined command and send it back. Then I realized that with IPv6, you could give each program its own IP address, and give them all names in the DNS, so that you'd have a perfectly usable system without using any higher-level protocols than DNS and TCP. Then I realized that this would be simple enough that you wouldn't even need to write any code to make this work - it's all doable with simple shell commands! (I was wrong about this last one, and that's the topic of this post.)

(By way of comparison: This is sort of an inversion of the Plan 9 model of network computation. Instead of mounting a remote filesystem and piping it through a local program, you're piping local data through a remote program.)

Pipes are usually a one-way structure, but for this to work properly, I needed something a little more exotic. I needed to be able to take output from a command and pipe it back around to the beginning of the pipe, so that the pipe has a loop in it. If I could do that, I could combine netcat with any command, and that'd be a one-liner that implements a server. :D

So here's the first thing I tried.

Server:
mkfifo t
while true; do nc -l 127.0.0.1 9999 < t | tr a-z A-Z >> t; done
Client:
nc 127.0.0.1 9999 < testdata

In a perfect world, this would work! But here's where we get into the details of pipe buffers.

The first problem with this is in the server. When you use a pipe, data's actually buffered along the way, in the commands that are being piped. Normally, this is transparent, because each command flushes its buffers when it reaches the end of its input and exits. This doesn't work when you have a loop through a fifo, though! The data that's buffered in the tr command doesn't get flushed to the fifo until tr sees EOF, which only happens when netcat exits, and so netcat never actually has the chance to send the tail end of the data. The only way I could think of to solve this was to write some code for the server - pretty disappointing, but probably necessary. (I'm not going to post that code here, because it's messier than even a prototype can justify. >_>)

But that's not all - it turns out the client part is broken, for a completely different reason. When you give netcat an EOF (Ctrl-D), it doesn't know how to tell the remote side of the connection that there was an EOF. The server then doesn't have any way to know when to flush the buffers out and end the command, so the whole thing deadlocks waiting for more input that's never coming.

It turns out that TCP solves this problem; the bug is in netcat. With a TCP socket, you can close one direction of traffic, but keep using the other - for example, when you're done writing data to a socket, you can shut down the socket for writes, which signals to the remote side that you're done writing, and then read whatever the server sends back. This, unfortunately, required more code.

netpipe.py:
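import select
import socket
import sys

BUF_SIZE = 4096
addr = sys.argv[1]

def attempt_read(s, buf_size):
    # Poll without blocking: return whatever the server has sent so far, if anything.
    if select.select([s], [], [], 0)[0]:
        return s.recv(buf_size)
    return b''

s = socket.create_connection((addr, 9999))

# Use the raw byte streams, so this also runs on Python 3.
stdin = sys.stdin.buffer
stdout = sys.stdout.buffer

buf = stdin.read(BUF_SIZE)
while buf:
    s.sendall(buf)

    stdout.write(attempt_read(s, BUF_SIZE))
    stdout.flush()

    buf = stdin.read(BUF_SIZE)

# Half-close: tell the server we're done writing, but keep the read side open.
s.shutdown(socket.SHUT_WR)

# Drain whatever the server sends back until it closes the connection.
buf = s.recv(BUF_SIZE)
while buf:
    stdout.write(buf)
    stdout.flush()
    buf = s.recv(BUF_SIZE)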

(This is trivial enough that I'm planning to port it to C soon.)

Finally, some good news: this works perfectly! :D With this, you can open a connection and use it as a component in a pipe.
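For the server side, here's a minimal sketch of one way it could look. This is not the prototype mentioned above, just an illustration under a few assumptions: the names serve_once, make_listener and main are made up for this example, it assumes a POSIX-like system where a socket can be handed to a child process as its stdin and stdout, and it relies on the client half-closing the connection the way netpipe.py does. Because the command reads and writes the socket directly, there's no fifo loop, and the command's buffers get flushed when it sees EOF and exits.

```python
import socket
import subprocess

def make_listener(host="127.0.0.1", port=9999):
    """Create a listening TCP socket (port 9999 matches the examples above)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(1)
    return listener

def serve_once(listener, cmd):
    """Accept one connection and run cmd with the socket as its stdin and stdout."""
    conn, _ = listener.accept()
    with conn:
        # The child reads from the socket until the client shuts down its
        # write side, then flushes its output to the socket as it exits.
        subprocess.run(cmd, stdin=conn, stdout=conn, check=False)

def main():
    """Serve the tr command from the examples above, one client at a time."""
    listener = make_listener()
    while True:
        serve_once(listener, ["tr", "a-z", "A-Z"])
```

Calling main() and then running something like "python netpipe.py 127.0.0.1 < testdata" on the client should exercise the whole round trip. It handles one client at a time, which is probably fine for a toy like this.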

Wednesday, December 1, 2010

p2p DNS

Now that the US is considering forcing pirate domain names out of the DNS, one of the founders of The Pirate Bay is floating the idea of a p2p DNS alternative.

Okay, wow. This is an incredibly terrible idea.

I'll start with the obvious objections:

  • The DNS is meant to be authoritative
    In a p2p system, you don't know who you can trust, because everybody else is just a peer. The DNS is completely useless if the results you get back aren't authoritative. Some people are proposing web-of-trust type solutions, or other idiocy. NO. Web-of-trust doesn't scale, and requires too much human maintenance to ever work. Even being able to compute some kind of transitive trust metric is an open research question, and then there's the so-far-intractable problem of picking a trust metric. Any answer you get from a p2p DNS system will be unreliable.
  • The DNS is meant to be reliable
    DNS is meant to be a transparent layer, when you're using the Internet. It's something that you just sort of expect to work, and bad stuff happens when it doesn't. And the thing about p2p systems is, it's actually pretty near impossible to make any guarantees at all about their behavior. I've actually read a lot of papers about building distributed storage systems. And you know what? Nobody's ever actually managed to get anything better than a relatively weak statistical guarantee about any property of a p2p storage system. For the DNS, that's simply not good enough.
  • Performance
    The DNS has pretty tight performance constraints, and p2p systems (for all their advantages) are extremely vulnerable to DoS attacks. It's pretty much inherent in their design - any p2p system will require a peer to have fairly complex communications with a lot of other untrusted peers. And, as many people have shown over the years, when you manage to take down the DNS with a (D)DoS attack, people tend to flip out.
  • Secure decentralized systems are HARD
    Look, it's not like it's impossible for random people on the Internet to band together and write a program. It's not even that difficult; open source has proven that. What is hard is getting random people together to solve a fundamentally hard problem in computer science. Let me put it this way. If a well-respected professor of computer science were to propose a p2p DNS system, I would treat it with heavy skepticism. If Peter Sunde proposes it, and expects the Internet hivemind to just sort of blast through all the hard problems by sheer virtue of wanting torrents, then I just laugh. (And then, if it looks like people are taking him seriously, I write a blog post like this.)

There are some people whose first reaction to any data management problem is to try to stick it in a magic DHT and forget about it. In many cases it works - see BitTorrent, for example. A DHT will work in any application where you don't especially need data to be reliable or trustworthy; it's a perfect fit for BitTorrent peer exchange, where reliability is optional because the DHT is only a backup for the real tracker, and trustworthiness doesn't matter because the peers aren't trusted in the first place. For the DNS, though, a DHT is exactly the wrong solution.

It may be possible, someday, to fully decentralize the DNS. To do it will take some fundamental advances in computer science, though, and Peter Sunde isn't going to be able to make that happen by rallying the pirates to his cause.

Tuesday, November 30, 2010

Julian Assange is a Terrorist (and I mean that in a good way)

Politicians want to classify Julian Assange as a terrorist. Insane? Only at first glance.

I've been reading about something that Assange wrote a few years ago, which basically lays out his plans for Wikileaks. It's actually a pretty neat read. Summary: Assange sees today's American government as some kind of corporate conspiracy (can't argue there), and he wants to throw sand in the works of the conspiracy by increasing the cost of secret communication (without which any conspiracy dies). He intends to do this through random attacks on government secrecy, with the goal of forcing an expensive overreaction, which will end with governments being less secretive.

My first reaction: This dovetails perfectly with a blog post that I've been meaning to write (but will probably never get around to) about the tradeoff between trust and robustness in a networked system. It's actually a really cool tradeoff - trusting another entity in a decentralized system can be viewed as a dodgy optimization, which will usually work but occasionally crashes dramatically. (Bonus: the tradeoff even has a mathematical basis, in the FLP result!) Julian Assange is giving us a real-world demonstration of this principle, by poking at the relatively cosy relationships between governments and forcing them to shift into a less useful but more secure configuration.

My second reaction: You know how the .gov has been making a lot of noise about info-terrorists, even though they have no idea what that even means? DDoS kiddies are usually held up as an example of what to watch out for, but that stuff is so trivial that I'm surprised we waste our time talking about it. Julian Assange, on the other hand, is the real deal, and he's not even terribly sophisticated. He is using the power of the Internet, and the power of the (relatively) unrestricted flow of information, to do something radical to the state.

My third reaction: Oh, man. The government doesn't know how bad this could have been. If Wikileaks had wanted to publish this stuff anonymously, it wouldn't have been terribly difficult for them to do so. The technology already exists, and has for years; it's just a matter of using it effectively. They don't like their diplomatic cables being made public as it is; imagine how much it would suck for them to have a few thousand cables appearing every month, and to be completely unable to track where they were coming from. You know how I said that Assange wasn't terribly sophisticated? If he were, he'd be doing exactly what he's doing now - we'd just have no idea who he was.

Last reaction: I can't help but worry that Julian Assange is gearing up for a dramatic exit from this world. He is simultaneously making himself extremely visible, and making a lot of very powerful enemies. Wikileaks has already published an insurance file; that's not the sort of thing you do unless you expect to have a reason to use it. If Assange does end up assassinated, that may be all the proof we need that something like Wikileaks is desperately needed in today's world.

Monday, November 29, 2010

Rewind

So I read the Void Trilogy by Peter Hamilton a few weeks ago, and one of the subplots went like this: in a world of psychics, one young man has exceptionally powerful abilities. Throughout the books, he learns of increasingly incredible things he can do, until he realizes that he can turn back time itself. Specifically, he can think about any moment in his past that he can remember clearly, and rewind the universe back to that moment (but with all his memories intact). This is where things get a little bit nuts.

For the rest of the book, he tries to make everything right with the world, because he's that sort of character. It takes a terrible toll on his mind at times, but in the end, he lives a life such that there's nothing he wants to go back and fix, and he has reached fulfillment. Happy ending, right? And then he goes and, on his deathbed, gives the secret of turning back time to everybody else in his city - and this is where my brain implodes in dismay.

If zero people know how to turn back time, then things make sense, and history proceeds in a boring linear fashion. If one person knows how to turn back time, then things are still simple enough to wrap your head around, because you can trace a single thread of narrative throughout whatever they do - by designating them the "main character" in the story, the story makes sense. But if two or more people know the secret, then things get Terribly Complicated.

Here's one trivial example of how screwed up the universe would become: imagine a game of Rock-Paper-Scissors between two especially competitive people that know how to turn back time (Rewinders?). The entire universe would be locked in a loop until one of them got bored.

There are weird issues surrounding seniority. If two Rewinders are going back and forth on something, the winner is going to be the one that can go the farthest back - back to before the other one existed, perhaps. If we follow this train of thought, then the winner in any conflict is going to be whoever is the oldest.

On the flip side, there are weird edge cases around death. If I sneak up on someone and kill them before they can react, then that's it for them, I've won, no second chances. This is the only way I can see to break out of a loop without first going through the infinite regression tango, and giving the victory to the older person. A world full of Rewinders would have a lot of immortals trying to kill each other, really - sort of like Highlander but with more mindfuck.

I'm not really going anywhere with this post. Honestly, I just thought it'd be fun to actually think through some of the consequences of a world with Rewinders. :D

Sunday, November 28, 2010

Diaspora!

So at the beginning of this summer, a group of NYU CS students started on a project to build a decentralized social network, and then made waves when they raised over $200,000 on Kickstarter, a crowdsourced funding website. They then proceeded to disappear into a cave for the entire summer, which killed the buzz around Diaspora pretty effectively. Then they put the project up on GitHub, and people immediately jumped all over them for security flaws. (Personally, I would expect to find security holes of about that magnitude in a project this young. You fix them, and you move on.)

If I had to give my opinion of the project, it's somewhere around "cautious optimism". I'm not a Ruby or a Rails fan, but there are worse languages/frameworks they could have used. I think they're striking a reasonable balance between developing in secret and developing in public. On the one hand, they promised to make everything 100% open source, but on the other hand, the open source development model is pathologically incapable of making design decisions, and for the initial stage of a project you're making nothing but. I definitely like that they're piggybacking on existing protocols.

Apparently, they were inspired by Eben Moglen's idea of a "freedom box", which makes me sort of nervous, actually. Nervous, because the idea is good in principle, but completely unworkable and sort of silly in practice. Yes, it would be useful if we all had physical control of our own social media profile, but this has tremendous implications for the reliability of the network as a whole - if my Internet connection goes down, to what extent do I disappear from the web? And, of course, I'm glossing over all the real difficulties with hosting a website on a residential Internet connection. Quite simply, our infrastructure isn't up to the job, and I don't expect that to ever change. So, I kind of hope that the Diaspora devs aren't going to waste too much time on this particular use case.

There are also a ton of fundamentally hard problems that they are going to run into, and while I remain optimistic that they're thinking about them, we won't really know how they handle them until the software is in a more complete state. For example: how do you handle security updates in a worldwide distributed system? There are already a ton of insecure Diaspora instances running around in the wild, which people brought up as soon as the code landed on GitHub, and the problem is going to get worse unless they do something about it.

Overall, I have high hopes for Diaspora, but it's simply too early to make a call about the project. I'm expecting it to advance rapidly, though, and we may be looking at a 1.0 release within a year. Whether or not it's a "Facebook-killer", like people want it to be, it has a lot of potential to be a useful tool.

Saturday, November 27, 2010

Wallet

I seem to have lost my wallet! It is fucking with my head like you wouldn't believe.

I have looked everywhere. I have looked everywhere at least twice. I've crawled on the ground looking underneath maybe half the furniture in this house. I've torn my room apart - I don't think there's a square inch in there that I haven't looked at today, except for maybe spots that you have to disassemble furniture to get to. I have taken all the cushions off of all the couches, I think. I've called the last place I saw the wallet, and since I came home straightaway after that and haven't really gone anywhere since, there are no leads there. I am this close to trying to figure out a tactful way to call up everybody that was here for Thanksgiving and asking if they walked off with my wallet. Like I said, fucking with my head.

Why am I freaking out so much? This is my first time losing my wallet, and I suppose my first time finding out just how much of a pain it is. If I can't find it before my flight on Sunday, I'm going to have to cancel my credit card and debit card, replace my insurance card, replace my Social Security card, and replace my ORCA card (free bus rides, one of many Microsoft perks) so I can ride the bus to work. (By a sheer stroke of luck, I still have my driver's license; don't even ask. XD)

And then, I have to figure out how to keep this from ever happening again. Because, see, I can't just leave a problem like this alone, and deal with it when it comes up. I'm a pathological overthinker. My reaction when a hard drive fails is to create an increasingly elaborate system of redundant storage, culminating in what I built earlier this month. My reaction to almost losing my cell phone is to keep multiple backups of all the data on it, just in case. Now that I've lost my wallet once, I'm not sure that I'll be able to ignore the possibility of it happening again - my brain just doesn't work that way. I don't know how I'll solve the problem of randomly losing things that I need to carry around everywhere I go, but I know that I'll be kind of agitated and jittery until I do - it's just how I'm wired.

My mom, bless her heart, tried to help - not by helping me look, but by trying to make me feel better about losing my wallet. Frankly, it just made things worse. She asked me to imagine the worst-case scenario; thanks for the completely generic advice! I definitely feel inclined to sit and listen to you, when I know that you're taking this far less seriously than I am! Plus, I know on an almost subconscious level that she's just trying to make me feel better, and that's not what I want. I don't want to feel better about it, and I don't want to sit down and think about "how bad could it really be". I want to find my goddamn wallet. If all your help means to me is that I have to act calmer while I'm searching frantically, then please, just stop trying.

Anyway, I haven't given up yet. I still have another day to try and figure out where it went. I am convinced that it's still in this house somewhere, and that's the most incredibly frustrating thing - it's so close, but I may not have enough time to find it! Still, I've got until my flight on Sunday. Once I get on the plane, then I'll give up, and start figuring out what all I need to replace. Until then, there's still hope.

Friday, November 26, 2010

The Science of Code

Found this blog post today via reddit. It has a really cool insight: The way people work with code is evolving into the same patterns that exist in the sciences today. At the one end you have the "physicists" - people that work with code on the lowest levels (either machine code, or algorithms, depending on your interpretation), and that can expect mathematical certainty. At the other end, you have code "biologists", that mostly work with whole organisms/programs, which are messy things, but which mostly work in mostly predictable ways.

There are a few neat consequences that you can pull out of the analogy. First, while wizards slinging machine code and novices putting scripts together are both "programmers", we probably need new designations for them, in the same way that you can't always lump physicists, chemists, and biologists together as scientists. Second, even though scripting is perceived as easier than low-level programming today, that could be because of the relative immaturity of the field, and not because it's inherently easier. See this comic, for example: physicists can look down on biologists, but biology is hard! Physics can be seen as the ultimate reductionism, and other sciences are simpler in terms of the physics they use, but harder precisely because they can't afford to reduce everything to that degree.

Higher-level programming languages, then, aren't just about simplification - they're also about specialization. (Maybe this is why domain specific languages (DSLs) are a big deal today? By creating a new language, you're jumping ahead of the existing languages in terms of specialization, which is akin to opening up a new field of study in our analogy.) By leaving some of the complexity of the lower levels behind, you're able to create new abstractions and concepts, which are interesting in and of themselves.

I think the analogy actually outstrips modern programming practices by a bit. If you want to write "organic" code, for instance, you need a specialized language like Erlang, since as far as I know it's the only language designed to handle failures of different parts of the program, and keep on running smoothly. Current languages mostly have the assumption that any fault is reason to terminate the program, because the whole thing should be 100% correct. From a physicist's perspective, this is fine - if it's not 100% correct, you can't count on it doing anything right! I'm coming around to the "sloppy code" view the more I think about it, though.

The assumption that all code should be 100% correct is unreasonable in this day and age. It pains me to say it, because it goes against everything I've been taught (and quite a bit of what I've said in the past). All code is going to be a bit sloppy, simply because it's written by humans, and not by the faultless code-writing machines that those humans fancy themselves to be. What we need in the next generation of languages is more robust mechanisms for handling incorrect code; if we don't do that, we're not really designing languages to be used by human beings.
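To make the "organic" idea a bit more concrete, here's a rough sketch of fault isolation in Python. It's only an illustration of the general idea (contain a failure in one part, retry it a few times, and keep the rest running); it is not how Erlang actually does it, and the names supervise and reciprocal are made up for this example.

```python
def supervise(task, items, max_restarts=2):
    """Run task on each item, isolating failures so one bad item can't stop the rest."""
    results = {}
    failed = {}
    for item in items:
        for _attempt in range(max_restarts + 1):
            try:
                results[item] = task(item)
                failed.pop(item, None)  # a retry succeeded, so forget the earlier error
                break
            except Exception as exc:
                # Record the fault and try again, instead of terminating the program.
                failed[item] = repr(exc)
    return results, failed

def reciprocal(x):
    """A deliberately fragile task: raises ZeroDivisionError for x == 0."""
    return 1 / x

results, failed = supervise(reciprocal, [1, 0, 4])
print(results)       # {1: 1.0, 4: 0.25}
print(list(failed))  # [0]
```

The bad input still fails, but the failure is contained and reported while everything else completes, which is roughly the behavior the "biologist" view of code would want by default.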

Thursday, November 25, 2010

Layers of Fail

So here's an annoying multi-layered fail, notable because it affects three adjacent layers of the network stack! As any programmer will tell you, the most interesting bugs to diagnose are the ones that result from the interaction of other bugs. This particular one results in me losing messages on AIM.

First fail: U-Verse and my Mac

I don't know if anybody else has seen this problem, but my MacBook Pro cannot keep up a reliable connection to any U-Verse modem. I've seen this problem with multiple U-Verse modems, and only my Mac, so the possibilities are (in order of likelihood):

  1. Buggy AT&T software in the modem
  2. Buggy firmware for my wireless card
  3. Random hardware fault in my Mac (unlikely, because it only happens with U-Verse modems)

This is pretty annoying, because WiFi is supposed to be standardized! All implementations are supposed to be interoperable with all others. Either AT&T or Apple could have caught this (isn't it standard practice to test with other widely-used hardware?), so the fault could lie with either company. Luckily, I don't have U-Verse at home, so I don't have the need to diagnose this properly - it's only an issue when I'm visiting people that do, like my parents.

Second fail: Automatically dropping connections on interface down

This is such a widespread thing that I think it must be intentional, but I can't figure out any reason that it's not a terrible idea. On any OS that I've used, when a network interface goes down, all connections are severed automatically. The thing is, the IP protocol is explicitly designed to allow lost packets, and the TCP protocol on top of it is designed to handle it, so the dropping of connections is unnecessary. If operating systems just ignored the loss of the interface on the assumption that it'll come back up soon, everything would still work as designed, and a lot of situations involving intermittent connections would work much better!

In other words, effort went into a feature which makes things worse, which definitely counts as a failure in my book.

Third fail: AIM protocol doesn't handle dropped connections cleanly

This one's pretty simple: if a connection drops, and a message is sent during the timeout before the server decides that the connection is dead, that message seems to be lost. Seems like a simple bug to fix on the server, but it's been going on for a while now, so apparently that's not going to happen. What's more irritating about this one is, AIM already seems to save messages that are sent to somebody that's offline - it just can't detect that you're offline during the timeout period.

The end result of these three (or two, if you don't want to count the middle one) bugs is that AIM is nearly unusable for me when I'm using the WiFi at my parents' house. (Yet another reason to switch to GTalk? :D No idea if it has the same problem, though.)

Wednesday, November 24, 2010

"Content" and "Consumers"

So am I the only person that gets annoyed when people talk about "content"?

The phrase is aggressively bland and intolerably reductive. "Content" can be anything. "Content" is a book, an article, a song, a cool video, a funny joke, a clever program, a picture worth a thousand words - all glommed together into a single nondescript concept. Nobody who has created something worth creating, something that they're proud of, will refer to it as "content". "Content" is anonymous. "Content" is undifferentiated. "Content" is everything, and nothing in particular. "Content" is what you call it when you don't care what it is.

"Content" is what you feed to "consumers" - nothing more.

And there's another word that bugs the hell out of me: "consumers". A "consumer" isn't a person - it's a thing, a mindless black hole, which consumes whatever you put in front of it. People are complex and individual, but we know how to market to "consumers". A "consumer" isn't something you would ever have a conversation with. A "consumer" never produces anything of value. "Consumers" are all the same, and when they're not, they're amenable to market segmentation. "Consumers" exist because they're easier to deal with than people.

I think that the mindset embodied by these two words is a sickness. When you see the world in terms of "content", and "consumers" that want it, you're hiding away all the incredible richness of the world. "Content" is culture, viewed from the outside, with the intent to put it in boxes and sell it to people - sorry, "consumers" - and it honestly freaks me out a little that there are people out there that are willing to think in those terms.

So please. Can we all stop using these words? I think we'll all be happier in the long run.

Tuesday, November 23, 2010

The Shallows

So like I mentioned yesterday, I've just finished reading The Shallows, by Nicholas Carr. It's about the effect that the Internet is having on our thought processes, on a fundamental level, and it's a pretty amazing book.

We see the Internet as an incredibly powerful tool for organizing and retrieving information, and we're right - as a multiplier for our own abilities, it's an unprecedented achievement. What some people are realizing, however, is that it is not without its downsides. Ask yourself this: how many books do you read for fun these days, and how many did you read before you discovered the Internet? Carr contends that this is not just because our reading has shifted online; instead, our brains have been rewired by the constant cheap stimulus of the web, so that we find it more difficult to read extended prose than we used to.

Historical precedent

Believe it or not, this isn't the first time that a new technology has fundamentally changed how we think. Carr looks all the way back to the invention of written language, which represented the beginning of the shift from an oral culture to a written one. There's some interesting stuff here that I didn't know about. For example, written languages were originally written using only letters, with no punctuation, or even spacing between words. See, written language was originally just a transcription of what people said out loud, and only intended to be read out loud. It wasn't until hundreds of years later that these things were added to language, to make reading easier.

The revolution brought on by written language reached a fever pitch with Gutenberg's invention in the mid-1400s. All of a sudden, books (previously reserved for the wealthy) were cheap enough that everybody could have them. Simultaneously, the automation of printing meant that the cost of introducing a new book decreased dramatically, so that authors were free to experiment with radical new styles of writing. (Critics of the time decried the new styles, a product of new technology, as a massive dumbing down of literature - sound familiar?)

The introduction of reading to the population as a whole caused a fundamental change in the way we thought. Carr starts the book by just asserting this, but when he really gets into the meat of his argument, he backs it up with a lot of neurological evidence - mostly the dramatic changes that occur in our brains when we learn to read, as seen through an MRI.

Shallows

So how is the Internet changing things? Carr cites study after study that all point in the same direction - the Internet diminishes our ability to read deeply, by encouraging skimming and by allowing us to skip around to other pages easily. Furthermore, it does so on a fundamental neurological level. This is the core argument of the book, and Carr backs it up well, going all the way down to what we know about the mechanisms of human memory.

There are lots of counterarguments to be made here, of course. Even if the Internet makes it harder for us to absorb a book of information, doesn't it make up for it by giving us easy access to anything we might want to know? Doesn't the fact that looking up information is now instantaneous make it more efficient for us to know a wide variety of subjects shallowly, rather than learning a few subjects more deeply? (Personally, I'd say that it depends - some questions are harder to answer than others using a tool like Google, and it's an effect that's difficult to take into account when deciding what to learn.)

What should we do about this? It's hard to say. Really, it's hard to even say what we could do - people aren't even convinced that this is a problem today, and that would be a necessary prerequisite before we could even talk about steering the tech industry based on the effects that technology would have on us.

The technocrat in me thinks that we should just let technology happen, and deal with the consequences after the fact. Technology can solve technology's problems, and everything will work itself out, right? I used to believe that, anyway, but now I'm not so sure. (Maybe I don't have as much faith in humanity as I used to?)

In any case, though, everybody should read this book! Even if you don't agree with this author's premise, you'll probably find it an interesting read.

Monday, November 22, 2010

Nook!

I've so far neglected to mention it here, but I got a Nook!



Compared to the Sony Reader that I've had for a few years now, this is a huge improvement. I'll start with the most visible difference - while the Reader has buttons numbered 1-10 plus a few more for navigation, the Nook has a color touchscreen. They use it with the main E-Ink screen fairly effectively, too - the interface is generally divided between them sanely, with stuff being displayed on the big screen, and all the controls on the touchscreen.

The E-Ink screen on the Nook is noticeably better than the one on the Sony Reader. The contrast is much higher, and the resolution seems a bit better too (though that could also be the font, I guess). The Nook is also much smarter about updating the screen; where the Reader would have to refresh the whole screen with an annoying invert-everything action, the Nook can update individual pixels without disturbing the rest of the display.

The other most important difference is the inclusion of a wireless connection, like the Kindle. I'm pretty impressed by how well this feature works - I can browse the bookstore, buy a book, and download it to the device within a span of a few minutes. (Browsing over a cellular internet connection is a bit sluggish, though. If you can, it's much easier to buy the books through the B&N website.) There's also a WiFi-only version, but I think that's not as much fun. WiFi can be hard to find, and it's nice to be able to get new books from a moving vehicle. As far as ebook pricing goes, I don't have any complaints about the B&N store.

Battery life is a different story. While reading, the battery lasts pretty much forever (and by forever, I mean it should last through any one book, at least that I've found), as expected of an E-Ink device. When you put the device to sleep, though, it only lasts for a few days before needing to be charged. This is pretty disappointing, honestly, and I hope they fix it in a future software update.

Today on my flight to Houston, I brought the Nook and finished The Shallows, which is about how technology is changing our reading habits. (Wooo, irony!) I think I'll be talking more about that in tomorrow's post, but as far as the reading experience goes, it was at least as comfortable as reading a real book. Probably more so, in fact, because the Nook is easier to fit in my bag than the paperback (The Prefect, by Alastair Reynolds; somebody lent it to me and now I need to read it!) I brought along as a backup.

Summary: The Nook is kind of awesome, and I need to finish this post within the next minute or I'm not going to finish in time.

Sunday, November 21, 2010

The Man-in-the-Middle Defense

I'm not sure why this is, and it's probably a really bad idea, but for some reason as BitTorrent gets harder to use (with all the major trackers being targeted), I become more and more motivated to come up with a p2p protocol that doesn't have BitTorrent's weaknesses. So in that vein, here's something I've been thinking about.

One of BitTorrent's biggest problems is that every user knows about all the other users, and so sending nasty letters to everybody downloading a given file is as simple as grabbing a list of IP addresses and contacting a bunch of ISPs. What we need, then, is a way to anonymize connections, and transfer data between two computers without either of them knowing the identity of the other. There are already a few protocols that get us partway there: using a traditional proxy, you can mask the address of one party, but not both; using something like Tor, you can prevent any one proxy server from knowing both the client and server address, but the client still has to know the server address. So what can we do instead?

Repeaters

The golden rule of system design: When in doubt, go for the simplest thing that could possibly work!

So in this hypothetical protocol, two peers (Alice and Bob, why not) have been communicating pseudonymously. At some point, Alice wants to send Bob a file which is too big for the channel they've been using, so they both agree on an anonymous repeater. At some predetermined time (in most protocols, the time would just be "now"), they both connect to a server, and the server just repeats everything it sees on one connection to the other.

(Once you've got the simplest thing, you may need to fix up a bunch of other problems.)

First, we need a way for a single server to distinguish between a lot of people connecting to it at once - which of them actually want to talk to each other? One solution is to have Alice and Bob agree on a shared secret ahead of time. When they both contact the repeater, they send the shared secret, and the repeater knows to establish a connection between them.
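Here's a rough sketch of what that matching step might look like, in Python. None of this is code I've actually deployed: the Repeater class, register, pump, and join are all made-up names, and the sketch assumes the secret has already been read off each connection somehow. The first peer to present a secret waits; the second one gets joined to it, and from then on the server just copies bytes in both directions.

```python
import socket
import threading


class Repeater:
    """Pairs up connections that present the same shared secret (hypothetical design)."""

    def __init__(self):
        self._waiting = {}  # secret -> first connection that presented it
        self._lock = threading.Lock()

    def register(self, secret, conn):
        """Return the peer already waiting on this secret, or None if conn must wait."""
        with self._lock:
            peer = self._waiting.pop(secret, None)
            if peer is None:
                self._waiting[secret] = conn
            return peer


def pump(src, dst):
    """Copy bytes from src to dst until src closes, then half-close dst."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass  # the other side may already be gone


def join(a, b):
    """Repeat everything seen on one connection to the other, in both directions."""
    threads = [
        threading.Thread(target=pump, args=(a, b), daemon=True),
        threading.Thread(target=pump, args=(b, a), daemon=True),
    ]
    for t in threads:
        t.start()
    return threads
```

A real server would call register for each accepted connection and call join whenever it gets a peer back. Note that the repeater never needs to know who Alice and Bob are, only that they presented the same secret.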

That's weak against eavesdroppers, though, and in today's world of relatively widespread DPI gear, you just can't trust the network. This can be solved with SSL, but only if Alice and Bob can agree on the repeater's public key ahead of time. (If not, then SSL certs are easy enough to fake - it's a little extra effort to do so, but if we were only defending against unmotivated eavesdroppers, then this would be pretty easy!) Solving this problem properly will take a little more thought than I'm willing to put into this problem tonight, unfortunately. (Maybe passing the SSL layer through the proxy?)

How do we keep the repeaters from learning about Alice and Bob's identities? For that, we can use proxies - we only need to keep their identities safe from the repeater, not the other way around, so existing proxy mechanisms will work for this.

How will we handle the performance cost of going through so many layers? Alice and Bob can agree on some protocol ahead of time for dividing the data across several paths, or something like that. I'm explicitly not designing that layer here - proxies and repeaters are general enough mechanisms that more complex protocols can be implemented on top of them pretty easily. (That's one of the biggest advantages of using the simplest thing!)
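Just to show that the layer on top doesn't have to be complicated, here's one possible way to divide data across several paths, sketched in Python. This is only an illustration under my own assumptions (round-robin striping, a tiny chunk size, and the hypothetical function names split_across_paths and reassemble); Alice and Bob could agree on something completely different.

```python
def split_across_paths(data, n_paths, chunk_size=4):
    """Deal numbered chunks of data round-robin onto n_paths lists."""
    paths = [[] for _ in range(n_paths)]
    for seq, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        paths[seq % n_paths].append((seq, chunk))
    return paths


def reassemble(paths):
    """Merge chunks from every path and put them back in sequence order."""
    chunks = sorted(chunk for path in paths for chunk in path)
    return b"".join(payload for _, payload in chunks)
```

Because every chunk carries its sequence number, it doesn't matter which path delivers first or in what order the paths are read back.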

As a completely unintended bonus, if anonymous repeaters like I've described here become widespread, that could be a solution to the problem of establishing a connection between two computers that are both stuck behind NAT.

Digital Dead Drops

A completely different solution to the original problem would be establishing "dead drops" - locations where you can drop a file for a certain amount of time, and somebody else can pick it up later. (I've already seen pastebin used like this, come to think of it!) If both parties use proxy chains, and the data is encrypted, then this is even more secure than using repeaters because you avoid the simultaneous connection problem - with a repeater, an eavesdropper has a hint about who you're talking to because you're both connected (however indirectly) at the same time.
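A dead drop is simple enough to sketch in a few lines of Python. This is a toy in-memory version, assuming the blob is already encrypted before it's dropped off; the DeadDrop class and its leave and pick_up methods are names I made up for illustration, and making a drop single-use is my own assumption.

```python
import time


class DeadDrop:
    """Holds opaque blobs for a limited time; each blob can be picked up once."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so the expiry logic is easy to test
        self._drops = {}     # drop_id -> (expires_at, blob)

    def leave(self, drop_id, blob, ttl_seconds):
        """Store blob under drop_id until ttl_seconds from now."""
        self._drops[drop_id] = (self._clock() + ttl_seconds, blob)

    def pick_up(self, drop_id):
        """Return the blob and remove it, or None if it is missing or expired."""
        entry = self._drops.pop(drop_id, None)
        if entry is None:
            return None
        expires_at, blob = entry
        if self._clock() >= expires_at:
            return None
        return blob
```

The two parties are never connected at the same time, which is the whole point: the drop-off and the pickup can be hours apart.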

The Next Problem

The other major problem with p2p networks is that search is public - by making a file available for others to download, I'm also announcing to the world that I have that file, and some people might be upset about that.

I have some ideas about how to solve that, but I've probably spent too much time blogging about this topic already. Instead, I think it's time for me to start coding these things up, and see what works. Should be fun!

Saturday, November 20, 2010

Chaos Theory

The point is, ladies and gentlemen, that chaos, for lack of a better word, is good. Chaos is right, chaos works. Chaos clarifies, cuts through, and captures the essence of the evolutionary spirit. Chaos, in all of its forms; chaos in life, in money, in love, knowledge has marked the upward surge of mankind. And chaos, you mark my words, will not only save Android, but that other malfunctioning entity called the US cellular industry.

This completely unnecessary riff on Gordon Gekko was inspired by this article, which asserts that Windows Phone will have an advantage over Android because it's less "chaotic". How horrid! That kind of platform lockdown moves Windows Phone onto the iPhone's turf.

The iPhone model (one OS running on one device, all controlled by one company) works for Apple, but I personally think it's a fluke. By launching the first smartphone platform in the multitouch paradigm, Apple snatched up enough of the market to become entrenched, and by the time competitors showed up, Apple was on the second or third revision of the iPhone, giving them a lead that could take a decade to wear off. I would argue that they succeed today in spite of (and certainly not because of!) their locked-down ecosystem. The iPhone hasn't changed significantly since the first revision, and at the rate things are going right now, it might not be premature to label the iPhone a legacy platform.

Android, on the other hand, is chaos. Google all but throws the software out there, yells "Come and get it!", and lets the carriers and manufacturers do whatever they want with it. It's sometimes inconvenient, but it also leads to some really cool stuff. Can you imagine something like the Nook (which secretly runs Android) being built on top of the iPhone OS?

I sometimes joke about Windows Phone copying the iPhone to an astonishing degree, but it still worries me. The iPhone model is something that will work once, for the first company to come up with a revolutionary new paradigm, but after that a single entity can't keep up with the rest of the industry for very long. Microsoft does seem to be doing a lot of things better than Apple (C# doesn't make me want to kill myself the same way Objective-C does, for instance), so Windows Phone is still a compelling platform, but if Microsoft wants long-term viability it's going to have to look beyond what Apple is succeeding at today.

We must remember that Windows - and, hell, the entire concept of an operating system - came into being to manage chaos, not eliminate it. Back in the Bad Old Days, hardware platforms were fairly standardized, and introducing any changes was likely to break existing software. This is where turbo buttons originally came from, and it's not a model we should be aspiring to. Having a number of different hardware profiles may make Android harder for developers to test their apps on, but it also means that the platform is much more flexible, and by forcing developers to account for different hardware profiles, Google is creating a better platform for the long term.

Any Windows program will (with very few exceptions) run on any Windows computer. Microsoft's past success with that model wasn't a fluke - it was a testament to the fundamental power of a model that allows for a wide variety of future hardware improvements. It's interesting to see how closely the development of mobile phones is mimicking the development of PCs 25 years ago; some of the players (like Apple) are even moving into the same positions. Microsoft has already managed to invent the PC once, and I hope that some people at the company remember enough to be able to invent it again.

In the end, it comes down to how mature you think the smartphone market is. If you think most of the innovation has already happened, then bet on the vertically integrated platform (like the iPhone). If you think most of the innovation has yet to happen, then bet on the open, flexible platform (like Android). It's still early enough in Windows Phone's development for Microsoft to make a choice, and I really hope they choose the latter.

Friday, November 19, 2010

Thursday, November 18, 2010

New phone!

So I've started thinking about getting a new phone!

Windows Phone

On the one hand, there's Windows Phone. I haven't had great experiences with Windows Mobile in the past, but Windows Phone is a totally new platform, and it actually seems pretty nice. I get one for "free", as a Microsoft employee - free with a two-year contract, though, so really it's closer to half price. Everybody else around the office is pretty stoked about them, and the enthusiasm is a bit infectious. The development tools look pretty awesome, too.

On the other hand, it's a totally new and somewhat unproven platform, with a radical new user interface to boot, and hardly any apps written for it yet. There's also the Microsoft factor, as much as I hate to admit it - I just don't have a ton of confidence in Microsoft's ability to execute in this space. (Here's an example: I use Microsoft My Phone to sync my phone to the web right now. It's a really useful service, but Microsoft is killing it in favor of Windows Live, and I have yet to see any official solution for migrating my data to Windows Phone. No other company in the world could get away with this kind of lack of focus.)

Pros: Employee price, easy to code on, supporting my employer, shiny.
Cons: New; unproven; Microsoft

Android

In the other corner, we have Android. It's proven to be a solid competitor in the mobile space, and is usually mentioned in the same breath as iOS these days. It's also Linux-based and relatively hackable (in the good way), and I have to confess to a certain amount of nerdy glee at the thought of being able to ssh in to my cell phone and poke around. >_> The open source factor is also a plus, even if Android isn't going along with the spirit of open source at all.

There are also problems with Android, foremost among them being that manufacturers usually don't release updates to phones that aren't ridiculously popular. The Nexus One gets updates, the Droid family gets updates, and so do other phones of similar notoriety, but most aren't so lucky. I'd also be the only guy at Microsoft without a Windows Phone, which could get awkward.

Pros: Solid platform, open source(-ish), hackable, has momentum
Cons: Carriers have too much control, Google is slightly evil
Bonus: Microsoft's strategy against Android is pretty much the sketchiest thing possible (future blog topic), and actually inclines me to support Google. <_<

iPhone

The thought of carrying millions of lines of Objective-C code in my pocket just makes me shudder. No iPhone for me.

MeeGo

MeeGo is kind of the dark horse in this race. It's a joint thing between Intel and Nokia (and a few others?) to put together a proper open source Linux-based phone OS. (None of this Android-style "code drop" crap.) This would be a really compelling option for me if they'd gotten around to releasing any phones yet, but... well.

There is the Nokia N900, which you can run MeeGo on, but I don't get the impression that it's especially well supported, and the N900 is already a year old. Overall, it seems like MeeGo is at too early a stage in its development to consider, but if Nokia or Intel announced a phone tomorrow that had high-end specs and ran MeeGo natively, well, that'd certainly throw a wrench into my decision.

So. Anybody got opinions? :D

Wednesday, November 17, 2010

The Decline and Fall of the Facebook Empire (part II)

(click here for part I)

Yesterday, I blogged about a lesson that Facebook is soon going to learn. That lesson is this, and comes in two parts: when you integrate your services, you're actually competing with the Internet. And, when you compete with the Internet, you lose.

The Second Thing

So what does it mean to compete with the Internet, and how can you avoid it?

The Internet is really just a consensus built around a set of protocols; it only holds together as well as it does because everybody agrees to either work within the framework of all the protocols that exist, or to build new protocols and try to build consensus around those. When you build a new service that follows existing protocols, you're working with the Internet, and strengthening the whole network, because it makes it easier for people to benefit from your innovations. So, people innovate within the framework of the Internet because the users are there, and users are there because innovation happens there - it's a virtuous cycle, in other words.

(If you've ever heard people talk about "open standards" in reverent, near-religious terms, it's because they've realized just how powerful this virtuous cycle is.)

That's why competing against the Internet is a losing proposition. Its creators built a system where anybody can improve the whole by adding their bit, and by now it's built such a tremendous momentum that you have to be as large as Facebook to make any headway against the flow.

Facebook has sort of a mixed history when it comes to open protocols. The website (their main service) is pretty locked-down; things like RSS feeds exist, but they sure don't want you to find them! On the other hand, Facebook chat is built on XMPP, and as a result it was trivial for most instant messaging clients to add support for it - an example of the power of using open protocols.

It's possible that they'll somehow manage to map their new messaging service onto an existing protocol (IMAP, if we're lucky), but I have my doubts. If they were doing that, it seems like the sort of major feature that they'd want to mention! Far more likely that they'll expect you to use it through their web interface, and maybe have some limited integration with other services.

On the other hand, some companies actually get it. I may have misclassified Google the other day, for instance - they expose a lot of their services through their GData APIs, and tend toward using open standards without restrictions when possible. (For instance, they were one of the first free webmail services to give people IMAP and POP access at no extra charge.)

Tuesday, November 16, 2010

The Decline and Fall of the Facebook Empire (part I)

There is a hard lesson that Internet titans learn eventually. AOL was the first to learn it, and it led to their long, drawn-out demise. Yahoo learned it, and is now a shadow of its former self. Microsoft and Google have both come up against it, but been reasonably successful in avoiding it; the former by shoveling money at the problem, and the latter through sheer heroic engineering effort. And, with yesterday's announcement about a unified messaging system, Facebook is about to learn it - and doesn't have the resources to power through anyway, like some others.

That lesson is this, and comes in two parts: when you integrate your services, you're actually competing with the Internet. And, when you compete with the Internet, you lose.

The First Thing

Why does integration equate to competing with the Internet?

Let's say I think this new messaging platform is awesome. IM? Great, all my friends use Facebook chat anyway! SMS? Sounds pretty cool! Email? ...Wait, hold on a second. You mean I have to give up GMail to get all the benefits of this integration? :(

In other words, this kind of integration is a form of soft vendor lock-in: they make it more convenient to use all of their services, rather than using the best of what the 'Net has to offer. AOL is probably the strongest example of this - back in the day, they combined an ISP, web browser, and email service into one package. And, they quickly started hemorrhaging customers, because they couldn't compete with the best ISPs, browsers, and email services all at the same time.

See, savvy Internet users know how to mix and match services. Back in the '90s, AOL wasn't just competing against all the other companies in the same space - it was competing against every possible permutation of an ISP, a web browser, and an email service that people could come up with. And today, Facebook is trying something similar. By combining a bunch of similar services under the umbrella of "messaging", Facebook is competing with every possible permutation of messaging services available online.

And here's the kicker: They're not just competing with every permutation of services that do the things they do. They're also competing with every new service that lets people communicate. They don't have an answer for things like Skype, for instance. And the thing about new services is, there's never a right answer for how to deal with them. The only right answer is not to put yourself in a situation where you're pressured to mimic every new technology that comes along.

I mentioned earlier that the savvy Internet users can often find better combinations of services than any one company can provide. What about the non-savvy users? They stick around, sometimes for years after a service has lost its vitality, and you end up with digital ghettoes like Hotmail and AOL. That's how I see yesterday's announcement - it's Facebook's first step toward becoming yet another digital ghetto.

(continued in part II, tomorrow!)

Monday, November 15, 2010

Procrastination

So here's the thing about procrastination: I think I actually enjoy it. For some reason, having something that I need to do, and procrastinating by doing something else, is way more fun than just doing the something else. This worries me.

In that sense, of course, blogging for a month is a good thing, because it's something not to do while I do other stuff, but it does lead to situations like right now, where I have an hour to write something and absolutely nothing coming to mind. >_>

Sunday, November 14, 2010

headache

headache

cannot blog

sleep now

:(

EDIT: Okay rather than skipping a day, I will retroactively make this a post about what I spent most of yesterday on: Harry Potter fanfiction. >_> So here are two fics that I've read lately, that are worth linking!

Harry Potter and the Methods of Rationality: Most of you have already seen this, because I've been recommending it to everyone. The story branches off of canon with Petunia realizing that she can't stand Vernon Dursley, and marrying a college professor instead. Harry grows up immersed in the scientific method, and applies it to the magical world, with sort of awesome results. As a bonus, this story also has Professor Quirrell as a real character (and one of the more interesting ones, at that), instead of just a cardboard cutout. As a further bonus, Harry is slightly evil! As a third and completely gratuitous bonus, it's full of references to pretty much everything else - the latest chapter, for instance, has (after a few chapters worth of buildup) the greatest Army of Darkness homage possible, I think.

Anyway, I enjoy it tremendously (that's present tense; it's an ongoing story) but apparently not everybody does? Definitely not my problem, though. :D

Harry Potter and the Wastelands of Time: Found this one because it was recommended by the author of HP:MoR. It's... not like any Harry Potter fic you've read before, I think. It's incredibly dark, for one thing.

The premise is that Voldemort won the war (the second time around), by discovering Atlantis and using the knowledge that he found there. Harry can't even come close to stopping him, so when he finds himself alone in the world, he makes a deal with a demon for a second chance. This puts him in a timeloop (think Groundhog Day), which begins shortly after OotP and loops back whenever he dies. It takes a lot of tries for him to accumulate enough skills across his lifetimes, and figure out a plan - thousands of tries, you're left to infer, because the story never specifies. Again, that's where the story starts, with a jaded and tired Harry Potter in a sixteen-year-old body getting ready for another go.

It's an interesting take on first person omniscient - Harry mostly knows what's going to happen next, but he doesn't let on much to the reader, and he's still occasionally surprised by things. It's also a good way to keep the action going - because Harry's been refining his plan for so long, he always knows what to do next, and the action never really slows down. Be careful before you start reading this fic, because you might have a hard time stopping.

Anyway. On to a post for today. XD

Saturday, November 13, 2010

Sometimes, in the middle of a conversation, I'll spin off into a tangent which is somewhat factual sounding, and completely untrue. The example that comes to mind at the moment is convincing a friend of mine that there's no such thing as narwhals ("Sea unicorns? Really?"), while we were working on a project for some class. (Apparently it took a while for other people to convince him otherwise; I am vaguely proud of this, if not exactly proud of being proud of this, if that makes any sense.) This is a thing I do all the time.

Why? Part of it is because it's hilarious, sure. Another part is that I'm terrified that if I didn't tell blatant lies from time to time, people would just accept everything I said without questioning it. It sounds like a silly concern, but it actually happens more than I'd like, and it bothers me whenever it does. I've tried being that guy that always knows what he's talking about and never leads you astray, and it is no fun. :(

It's also a lot of fun to tell elaborate fictions, of course! You get to re-cast a piece of the world under an alternate system of rules, which isn't quite internally consistent, and try to run ahead of the other person to paper over the inconsistencies before they find them. If they catch up and figure out a hole in my story that I can't work around, then I lose; it's sort of like a mental game of tag.

On the other hand, there's also the possibility that I just inherited it, since my dad does the same thing sometimes. >_> There's a story my family loves to tell, about how one of my aunts asked my dad how he kept the lawn so nicely mown. He told her all about the Rent-a-Sheep service, where they'll bring a few sheep out to your house every once in a while to munch on your grass, and he told it so convincingly that everybody bought it. My dad: occasionally pretty awesome. :D

Friday, November 12, 2010

The Cloud Scalability Distraction

I have this sneaking suspicion that Amazon has done something extremely clever with the cloud. When they launched their cloud services a few years ago, they made a big deal about how well they scaled up - S3 can store unlimited data! You can start up as many EC2 instances as you want! They even drive the point home with their pricing - S3 pricing is tiered, and there's a tier for people storing more than 5 petabytes of data, "proving" that S3 will easily scale to that amount. (Having that tier listed on the website is a bit unnecessary - if you're going to be spending millions of dollars per year on S3, you're probably going to get to talk to a salesperson face-to-face, just saying.)

Why is this clever? Because all of Amazon's competitors in the cloud space followed suit, and talked about how well their services scaled up, too. Meanwhile, Amazon's been advancing in precisely the other direction - the ability to scale down. For example, they recently launched "micro instances", which are cloud servers that rival the cheapest VPS providers in price.

Why does this matter? Because scaled-down cloud services are going to be the next revolution in computing. Right now, "cloud" is more of a marketing victory than anything else; there's nothing there that a few competent sysadmins and devs couldn't put together in a week or two, for a few times the price. The biggest advantage is the pricing (and it's telling that Amazon is on the forefront of it; the expertise they had that contributed most to the cloud wasn't technology, it was payment processing on a massive scale). Cloud services are still pretty coarse-grained, compared to what they could be.

Right now, the technology that would make the cloud "revolutionary" hasn't been invented yet. (I may have said this before.) I think that that technology is going to be something that enables partitioning up cloud services a few orders of magnitude more efficiently than we can today. Imagine a system where the resources can be sliced finely (and cheaply) enough that it makes sense to integrate them into desktop applications, and have the user pay for it - that's when you'll really have something revolutionary on your hands.

(aside: the title of this post makes me think of the big bang theory T___T)

Thursday, November 11, 2010

Ice, Ice, Baby

The weather the past few days has reminded me that I just moved two thousand miles north. In the evenings it's getting down into the 40s, and there's a permanent light drizzle which just makes things unpleasant. I've also been reminded that I really, really hate the cold. Whoops!

On the other hand, I'm definitely doing better with the cold than I would have in the past. Somehow, I feel like I've begun to adapt - temperatures in the 50 to 60 range feel like 60-70 degrees used to feel, and I can usually get away with just wearing an extra layer of shirts, or maybe a sweater. Actually, a lot of the time, the weather here is downright pleasant. The only problem is when it gets dark and cools off, and that's only a problem because it gets dark by 5 now. :(

That's one thing that totally caught me off guard about moving north. I was here over the summer, and didn't catch on that the long daylight hours would reverse themselves later in the year. It's darker at 5 these days than it was by 9 o'clock most days during the summer, and I expect it's going to keep getting worse until the solstice. It's actually pretty surreal - when I see that it's completely dark outside, I know that I still need to stick around at work for another hour. :/

People are saying that this winter is going to be hellish; in the figurative sense, of course, because they actually mean a lot of snow and ice. I don't know how much snow is considered normal here, but "a lot more" will have to be a decent amount. It'll be fun for the first day or two, at least! I'm pretty glad that I can take public transit to work, because if I tried to drive to work in the snow, I would probably die.

Wednesday, November 10, 2010

shortened URLs = sadface

So on Twitter, I can understand URL shorteners. People pretend to like the 140-character limit, but are somehow totally willing to circumvent it when they really need to. I get that. URL shorteners are completely valid (if still annoying) in that context.

What I cannot for the life of me understand is why somebody would use shortened URLs in a blog. I was reading a post the other day (seem to have lost the link) which used shortened URLs exclusively. If you control the markup (as you do on a blog post) then there is literally no possible reason to use short URLs, ever, unless you are trying to annoy your readers for some reason. Remember how, five years ago, everybody knew to make your URLs as expressive as possible for SEO purposes? And how it ended up being really convenient, because a user could hover over a link, see the URL, and have a sense of what you were linking to? All that's out the window now, thanks to fucking Twitter! :D

I'm not even going to get into the fact that shortened URLs are a horrible idea technically, and occasionally place you at the mercy of the Libyan government. Everybody knows that they're terrible. Twitter is the only reason that we tolerate their proliferation.

(side rant here about how 140 characters is far too short. I am feeling lazy tonight so please imagine an appropriate rant. >_>)

Actually, short URLs are one thing that Identica gets completely right. When it detects a shortened URL, it sets the title attribute on the link to the real URL, so that you can see where the link goes by hovering over it. It is impossible to appreciate how useful that is, until you use another mblogging site like Twitter and become sad because you actually have to click the stupid links to see where they go, and then write a disjointed blog post on the topic.
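For the curious, the trick is about as simple as it sounds. Here's a minimal Python sketch of the idea, not Identica's actual code: link_with_title is a hypothetical helper, and it assumes the short URLs have already been resolved into a plain dictionary (the example URLs are made up too).

```python
from html import escape


def link_with_title(url, resolved):
    """Build an <a> tag; for a known short URL, expose the real URL as the title."""
    href = escape(url, quote=True)
    real = resolved.get(url)
    if real is None:
        # Not a known short link, so there is nothing extra to reveal.
        return '<a href="{}">{}</a>'.format(href, escape(url))
    return '<a href="{}" title="{}">{}</a>'.format(
        href, escape(real, quote=True), escape(url)
    )
```

Browsers show the title attribute as a tooltip, so hovering over the link is enough to see where it really goes.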

Tuesday, November 9, 2010

Tech Support Counter-scripts

So when you call in to tech support, they are reading off a script. The script walks you through the most common resolutions to people's problems ("Have you tried turning it off and then on again?"), and is entirely useless if you're at all technically competent. If you're like me, your support calls usually end up being exercises in absurdity - can I go along with the script long enough for them to become convinced that they should put me through to someone competent?

Randall Munroe had the right idea. He just didn't take it far enough. It would be really great if everybody implemented backdoors for techies, but really. If we want backdoors, we're going to have to make them ourselves.

What we should do is make a repository of tech support counter-scripts. Let's say (as in the linked xkcd) that you know exactly what the problem with your internet connection is, and you want to minimize the time it takes to convince the person on the other end of the line to just fix your problem. We could crowdsource it, and have everybody who tries the script try tweaking it on actual calls, and iterate toward the fastest way to resolve a given situation.

So here's the really cool bit. We also include some distinctive phrases in the scripts (backdoors, if you will), so that after a while the support people will catch on to what we're doing. At that point, they have a choice: they can either look up the backdoor phrase in the counter-script archive, and find out what specific problem we're trying to get them to solve; or, they could adjust their scripts to counter the counter-script.

Few people are stupid enough to choose a fight against a crowd (just ask /b/), so eventually they will add the backdoor phrases to their scripts, with the instructions that we wanted them to have in the first place. In effect, we'd be social engineering the backdoors into the support system ourselves. Pretty cool, huh? :D

Monday, November 8, 2010

The Terrorists are Running the Asylum

Actually, I sort of feel sorry for the TSA. Their job is to screen millions of airline passengers, with a 100% success rate. Since that's obviously a fool's errand, their real job is to keep us mollified that they're doing enough every time there's a terrorist attack, and meanwhile try to convince us that there is such a thing as going too far in the name of security.

They started out with only the first one, I think. After Richard Reid, they had everyone remove their shoes before getting on flights. It was a perfectly logical course of action for them - a reaction to exactly the attack that had occurred, so we could be confident that that would never happen again, and just enough inconvenience at security checkpoints that we felt safer. Their reasonableness, in fact, was their only mistake, even though they didn't realize that for several more years.

The analogy of a frog in a boiling pot of water is ridiculously overused, but also fits this situation. By taking a series of reasonable steps in response to individual threats, the TSA can slowly turn up the temperature of airport security, and we'll all go along with it and accept it as necessary. In fact, that's not in their best interests - they need us to realize that there are things we're just not willing to do for more security, or they're going to be in an awkward position in the long run. Looking at it from this point of view, the best course of action for the TSA would have been a dramatic overreaction, which would convince us that they were going too far, so that we could all take a breather and then tone things back down.

Banning liquids on flights was a good move, crazy and arbitrary enough (and in response to a narrow enough threat) that we should have all woken up to the ridiculousness of it all. I mean, you can still take several smaller bottles onto a flight; it's a huge inconvenience to you and me, but won't stop a motivated attacker. Who could possibly accept that? But nope, it wasn't far enough, and now we're all used to it. The bar has been raised, and they're going to have to really go over the top if they want a backlash that will let them finally restore sanity to airports.

So fast forward to a year ago. In response to the Underwear Bomber (a suicide bomber who failed to even kill himself. Terrifying!), they've begun installing new X-ray machines which can literally see through our clothes. They claim there are safeguards in place to keep airport employees from looking at our naughty bits, but it would seem that the safeguards aren't too hard to disable, and there's already been one case of security personnel saving nude pictures of people for their own use.

This is the TSA's Hail Mary pass. The government is literally paying people to look at pictures of naked children. If they have any sense, their next move is going to be to wait a few more months, and then uncover a pedophile ring that's been saving pictures from the new X-ray machines. With any luck, that will be a big enough jolt for people to realize that security is about tradeoffs, not absolutes, and that we'll never be 100% effective in preventing terrorist attacks. If we can get past that, we might even be able to have a meaningful national conversation about a reasonable level of security.

In the meantime, though, I should reiterate: The government is paying people to look at naked children. Courtesy of terrorists, even. I don't know when we're all going to get hit with a jolt of sanity, but it can't happen soon enough.

Sunday, November 7, 2010

Sherlock!

"I am the closest thing to a friend that Sherlock Holmes is capable of having."

"And what's that?"

"An enemy."


So I found out the other day that there's a new Sherlock Holmes miniseries being made - a modernized one, no less. Normally that would be the sort of thing that I would avoid like the plague, but I found out about this through a blog post that described it pretty favorably, so what the hell, right? Anyway, long story short, I'm halfway through the first episode and getting the next few and at this rate (and with 1.5 hour episodes) I probably won't have time to think about anything else to blog tonight. XD

(Mild spoiler alert for the rest of the episode. No major plot details, but some things that you wouldn't have expected.)

You know what surprised me about this show? Not the writing; it was entertaining in that dry fast-paced British way, but I was expecting that. Not the way they modernized (though I was pleasantly surprised by how well they did some things - instead of bolting technology onto a classic plot, they basically wrote a new story with things like cell phones playing a central role). Not the story: it was about on par with one of the better episodes of something like CSI, which actually makes it pretty decent, but then again you expect that these days.

What surprised me was the characters. They are far darker than those in any other Sherlock Holmes adaptation I've seen. Watson doesn't just have a vague background as a military surgeon - he has flashbacks, sees a therapist, and discovers halfway through the episode that he misses the danger and excitement of being in a war zone. Holmes, meanwhile, is practically amoral - in his own words, a "high-functioning sociopath" - and cares about nothing except for solving crimes to avoid boredom. He's even occasionally a jerk about being the smartest guy in the room (admittedly, the original Holmes did this from time to time). The character is reflected in the other characters, too. The police don't like him (which was also present in the original stories, I suppose, but not to this extent), with one of them referring to Holmes as "the freak" and even speculating that he might end up on the other side one day.

Inevitable comparison: How does this compare to the Sherlock Holmes movie that came out last year? I enjoyed the film, but more as a Robert Downey Jr. movie than as a Sherlock Holmes movie. RDJ drew comparisons to Hugh Laurie (who stars in another Sherlock Holmes spinoff), and it was a fun movie to see, but that's just it: it was more "fun" than "interesting".

The next episode is about to finish downloading, so I'll leave off on this note: I think this adaptation has the potential to be Interesting.

Saturday, November 6, 2010

My Desktop

So a few days ago I was trying to think of a topic to blog about, and checked the NaBloPoMo homepage, since they post a prompt every day. That day, it was to write about a piece of jewelry that I own, which was a little bit less than helpful. >_> But then, Kiriska suggested that I should write about a piece of hardware that I own. I don't think she was serious, but you know what? That'll totally work. XD

I've owned my desktop computer for about eight years now. None of the parts are original, but the spirit of it lives on!

Origins

So back in the eighth grade (around 2001 or 2002), I had heard about this newfangled "Linux" thing, and I wanted to try it out. With little or no thought as to the consequences, I downloaded an ISO (Mandrake Linux, if you're wondering) and installed it on our family computer. (This actually took several months - we take broadband for granted now, but have you ever tried to download a Linux ISO over dial-up?) Immediately, problems: I set up the dual boot properly, but nobody else in my family was able to figure out the GRUB boot menu, and as a result nobody was able to use the computer unless I was around to boot it up. (I didn't think it was too hard - you go down to the item labeled "Windows", and hit enter - but whatever.)

I'm not sure, but I think this episode contributed a lot to my parents agreeing to let me build my own computer a few months later. I went online, picked out parts from a ton of different websites (either Newegg didn't exist at the time, or I didn't know about it yet), and ended up spending about $600 on my first computer, if I recall correctly. Back then, that got you an Athlon XP 1500 processor (actual speed 1333 MHz - this was the beginning of AMD's labeling shenanigans), 256 MB of RAM, a 40 GB hard drive, and various other bits.

Ever since then, it's been my primary machine, and also a testbed for every crazy idea that I've ever wanted to try out. (When you're me, you have a lot of crazy ideas, apparently.) To be honest, I think that most of my current skill with Linux comes from trying out something a little bit too crazy, and then fixing everything that breaks as a result.

Operating systems

So I mentioned that I started off on Mandrake. After a while, I switched to Red Hat; Mandrake was never especially great, and Red Hat seemed pretty solid. (An aside: Red Hat 8 or 9 was when BitTorrent actually hit the mainstream - every official download mirror was crawling at dial-up speeds, but the BitTorrent download was working just fine. This was when a lot of people sat up and noticed it.) Red Hat (and later Fedora, a spinoff of Red Hat) was nice for a while, but around the middle of high school I decided to take it to the next level. I switched to Gentoo.

For those of you unfamiliar with Gentoo: it is hardcore. Graphical installers? Feh! Installing Gentoo involves downloading and extracting an extremely minimal base system (about 100 MB), and then building the rest of the OS as you see fit from the command line. Prebuilt packages? Decadent! On Gentoo, you build every single piece of software you install from source code. (Sometimes it's overkill, but it's become the defining characteristic of Gentoo.)

Gentoo has this nice property called "rolling releases" - basically, they don't release major versions every once in a while like other OSes. All updates to the system are distributed incrementally through the existing update mechanism. This is useful - it means you never have to reinstall your OS, unless you really want to (or, in my case, really screw it up). End result: After using Gentoo for something like six or seven years, I've only actually had to completely reinstall it once.

Current Incarnation

Today, my desktop has a dual-core 2.5GHz Athlon 64, 2 GB of RAM, several hard drives (the exact count depends on your definition of hard drive), and I think it's about due for another major upgrade. It's on its third CPU, its fourth motherboard, its third case, and in fact I don't think there are many parts that have only been replaced once. Yet somehow, it's still the same computer it was when I first built it all those years ago.

Friday, November 5, 2010

Pirate TV

BitTorrent is a pretty amazing way to get "completely legitimate" TV recordings. Using it, you can distribute bits around a network so efficiently that a several-hundred-megabyte file can be distributed to tens of thousands of people within an hour or two. It's not perfect, but it's so close to optimal that nobody's been able to improve upon it significantly in the past decade. That's pretty amazing, given the speed at which technology advances - ten years is an eternity.

So. What would it take to significantly improve upon BitTorrent in this space?

I have this crazy idea for a protocol that should allow for secure p2p-style live streaming video. Basically, you'd have a single video source, streaming through a tree of nodes that act as stream multipliers to get it widely distributed. Repeaters within the network would have the option of restricting client access based on whatever they wanted - invite-only access, or an ad-supported stream, or a micropayment scheme (let's pretend I didn't just open up a can of worms), or something else. All connections would be SSL-encrypted (using both server and client certs, since clients need to be authenticated as well), and some sort of forward error correction encoding would be useful too, I guess.
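To get a feel for how quickly a tree of stream multipliers fans out, here's a rough back-of-the-envelope sketch. It assumes every node relays the stream to the same number of children, which a real network wouldn't do; the function name tree_capacity and both of its parameters are made up for illustration and aren't part of any actual protocol.

```shell
#!/bin/sh
# tree_capacity FANOUT DEPTH
# Prints how many nodes sit at each level of a relay tree in which every
# node forwards the stream to FANOUT children, plus the running total.
# Assumption: uniform fanout at every level (purely illustrative).
tree_capacity() {
    fanout=$1
    depth=$2
    level=0
    at_level=1      # level 0 is the single video source
    total=1
    while [ "$level" -lt "$depth" ]; do
        level=$((level + 1))
        at_level=$((at_level * fanout))
        total=$((total + at_level))
        echo "level $level: $at_level nodes, $total total"
    done
}
```

Running tree_capacity 10 4 ends on "level 4: 10000 nodes, 11111 total", so with each node feeding ten others, nobody is more than four hops from the source even with ten thousand viewers at the edge.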

Inevitably, The Man is going to try to shut this down, so we need to think about defenses. One of the weaknesses of BitTorrent in this area is the tracker, which works by giving out users' IP addresses to the entire swarm to facilitate downloads. With connections managed by the user, the data on connections remains a lot more private, so users can't be easily tracked.

The video source needs extra protection, because it represents a single point of vulnerability for a network. The easiest way to track it down would be traffic analysis. This can be mitigated by having a network of a few dozen fast machines (the video source among them) all start sending data to each other at once; most of the data being passed around will be garbage data, but an eavesdropper can't know that because all connections are SSL-enabled. Thus, a dedicated attacker could narrow the source down to one of several nodes, but its location within that group of nodes can be effectively hidden.



I'm not arrogant enough to think this protocol is any good, to be honest. (The incentives are incredibly sloppy, for one thing - it's an area that I hadn't started thinking about until yesterday.) The point I really want to make here is that better p2p protocols than BitTorrent exist; they just aren't being deployed because BitTorrent is widely-used and good enough. It's something that anybody who works against piracy needs to keep in mind: the successor to BitTorrent is going to be a lot harder to shut down, and it's going to appear out of the woodwork as soon as BitTorrent is no longer an attractive option for pirates.

Thursday, November 4, 2010

An Increasingly Frantic Search for Novelty (or: Philosophical Segfaults)

...or at least that's how life feels some days.

I am, in general, a purpose-driven individual; as long as I have something to be working on I'm pretty contented. The flip side is that, when I actually don't have anything to be doing, I handle the listlessness pretty poorly.

About a year and a half ago, I went through a pretty dark spot when I started wondering about my overall purpose in life; what was I actually aiming for? Up until that point, I had sort of powered through school on a weird mixture of arrogance and obligation. That had started to crack as far back as high school, though, and so I decided: I would figure out my purpose in life!

As it turns out, this is a terrible idea. Because, after not finding an answer for a while, I started to wonder about my life so far. Maybe everything I was doing was actually meaningless, maybe everything I had done so far in life was meaningless, maybe there wasn't a point to doing anything anymore, because I couldn't find a reason for any of it... It was the philosophical equivalent of dereferencing a null pointer.

Eventually, I grabbed at a lifeline. Forget about the long term, I decided, because I could just do whatever seemed interesting in the short term, and keep going like that until I found a better answer. It was a total cop-out, but it also got me out of the pit I had dug for myself, and keeps me out when I wander too close to it.

This story has no resolution; I'm still basically operating in magpie-mode, chasing after whatever looks shiny when I find myself with nothing else to do. The alternative is to seriously think about the possibility that there is no purpose to what I'm doing, and that I'm just an empty puppet acting out a script that was written over the past billion years of evolution. (I'd rather not.) It's not exactly fulfilling, but then again, fulfillment isn't a thing I expect anymore.

Honestly, this blog post is about as close as I feel like coming to the topic, and even writing this much sent me on about an hour's worth of meandering melancholy mental tangents. I can't say that I've given up on the problem, that I'm satisfied with not having an answer, without setting off the intellectually honest part of my brain... but if I end up living my whole life without actually finding a reason for having lived it, well, I think I've made my peace with that.

oh god did I just write a blog post about the meaning of life, dammit I think I did

Wednesday, November 3, 2010

An obsession with unpopularity that is, frankly, a bit creepy

A few months ago, I picked up a copy of 2600. (Yes, an actual physical copy. Ironic, right?) I got it at Book People, if I recall correctly, and if you're ever in the Austin area looking for a cool bookstore, Book People is the place you want to be, I think. I could digress for paragraphs about why it is a cool place, but I think I will get to my point instead.

Anyway, it was a pretty neat issue, especially the article about hacking Bluetooth. There was another article, though, wherein the author just talked about how he was picked on in high school, and how now that he was a hacker, he was going to get back at all the people that had wronged him. And you know what? That article creeped me the hell out. Partly because the guy sounded like a total psychopath, but mostly because it crystallized my general unease with how easily geek culture has absorbed the Disneyfied high school narrative of geeks getting picked on. Hold up, that deserves some explanation.

An Aside on my Longstanding Objections to the Generic Disney Channel High School

Have you ever noticed how far divorced the Disney channel is from reality, when it comes to high school? I don't just mean the liberties that they've had to take to fit various shows into a TV format; I mean the fact that what they portray as high school is completely and tragically divorced from reality. I'm mostly referring to the concept of well-defined and insular cliques, and the social structure that implies - the rest of how they portray high school is equally messed-up, but tangential to my actual point. They've created this parallel universe that's sort of like high school, and they apply it so consistently to all their shows that I actually wonder sometimes if there's another country where school actually is like that, and all their writers just happen to be from that country. Then I realize that it's much more likely that all their writers grew up watching movies like Animal House and Revenge of the Nerds, and thought they were hilarious, and are unconsciously trying to emulate a parody. BUT WHATEVER.

So back on topic:

What is it with geeks and this particular kind of martyrdom?

Another example: various slashdot comments which have creeped me out in the same way. (Entirely too lazy to find links.) You will, with alarming frequency, find comments on "geeky" news sites where people assert that geeks are driven toward technology because they're picked on and unpopular with their peers. There are also plenty of creepy misogynistic comments about women only wanting "bad boys", at the expense of geeks, who are the "nice guys"; another narrative that falls more in line with the Disneyfied high school than any real one. I don't know where all this is coming from, but it worries me.

Another example: The second episode of the third season of Leverage, a show that I otherwise admire entirely too much. The plot: the team has to con a software executive, and the only way to get inside his head is through high school, because he's totally obsessed with high school! Because he was picked on in high school, and now fantasizes about showing off to everybody how successful he is now! Because he's a Geek!

(The worst part is, it was otherwise a good episode, but I couldn't enjoy it because of my all-consuming rage at the blatant stereotyping going on >_>)

So, stereotypes I can understand. They happen all the time, and it's not exactly mysterious why they happen. What I can't understand, though, is why geek culture has adopted this particular stereotype.

Attention, geeks everywhere! The "high school martyr" trope is a bad thing, and kind of creepy when you pair it with "but I'll show them all!" later on! It might even be worse than the "geeks = autism" trope, so let's stop perpetuating it, k?

Tuesday, November 2, 2010

Building a storage cluster (on the cheap)

So RAID is one thing, but (as I found out the hard way, recently) you're still screwed if the computer you put the RAID in starts going flaky. Now that I have a bit of cash, I decided it was time to take things to the next level. :O

My initial plan was to build a super-awesome NAS - basically an external hard drive that you plug into the network, so you can use it from multiple computers at once. After a lot of thought, I abandoned this plan. I don't trust commercial NAS systems (though the Drobo looks pretty sweet, I must admit), and building my own would have been too expensive. It'd also be kind of a pain to add storage - take it offline, install a new hard drive, mess around with cables (possibly forgetting to plug some in, which actually happens more often than you'd think >_>), and when you fill up the box, you either start throwing drives out (wasteful!) or upgrade the entire thing. Plus, a NAS is still a single point of failure, which is lame! I can totally do better than that.

So the new plan is to build a Ceph storage cluster. Ceph is a relatively new distributed filesystem that one of the Dreamhost guys did for his PhD thesis. It's Linux-based (and even included in recent kernels), actively maintained, well-designed (aside: I'm still pretty grateful to the distributed systems class I took at UT; without it, I probably wouldn't even know what well-designed meant in this context), and generally meets most of my standards for a distributed filesystem. At the time, it seemed like a pretty good option, and it still does! Now, if only I could find some ridiculously cheap computers to use as servers...

It was at about this point in my thought process that I discovered that Jetway makes some ludicrously cheap minimal servers. You know that plug computer that Marvell came out with the other year? Jetway's thing has much higher specs, runs x86, has a full complement of ports and room for a hard drive or two, and is well-reviewed on Newegg. The only thing Marvell's plug computers really have over it is that they're ridiculously tiny, and who really needs that?

So, long story short: I pick up two of those plus some RAM, add my desktop machine, and I have a three-way storage cluster that's infinitely expandable, open source, has no single points of failure (except for the network parts, maybe? but those things last forever), and ended up costing less than even the cheapest Drobo.

Belatedly, justification

Somebody is inevitably going to comment that I'm trying way too hard, and that backups to an external drive are cheaper. The trouble with doing your own backups is that you actually have to pay attention to them. My goal here is to get a system going where I don't actually have to think about it day-to-day - everything should just work.

Somebody else will probably say that I should just store everything in the cloud - then I wouldn't have to worry about anything, because they'll certainly take better care of my data than I could. The trouble with the cloud is that it's too expensive (about an order of magnitude more than I'm paying to buy my own storage cluster, which is a lot more than I'm willing to pay). Plus, there aren't any cloud storage providers that I actually trust not to snoop around in my emails, or whatever.

Monday, November 1, 2010

Dogs and Cats Living Together

It's November again! Time for another round of NaBloPoMo. So I recently found out that NaBloPoMo is no longer just a November thing, but you know what? I sort of don't really care. It is a Thing I Do In November, and I don't see any reason to change. :D Plus, there's also the fact that it coincides with NaNoWriMo. By blogging, I sort of feel like a casual jogger tagging along with people who are running a marathon - I know that there are people (hi, Kiriska! :3) that are doing a lot more work than I am this month, but I'm going at a pace that I find comfortable.

In past years, I've written a post or two ahead, so that I'd have a buffer in case something unexpected came up and I didn't have time to blog. I don't think I'm doing that this year. First, it sort of feels like cheating, and second, now that I'm actually working I have a much more predictable schedule.

Anyway, to kick off this month, I'll write about an idea that's been knocking around in my head for a while now.



Everybody loves puppies and/or kittens! This is an Inarguable and Incontrovertible Fact. However, a lot of people aren't in a position to own a pet (particularly those living in college dorms), and others don't really have the time to take care of one properly. What I'm proposing here is really, when you get right down to it, just a way to let college students play with cute animals; therefore, it cannot possibly fail, and might even end up being profitable.

I'm envisioning a big, lounge-type room, with comfortable chairs everywhere, and lots of dogs and cats. (Thus, the name: "Dogs and Cats Living Together".) The animals would all be adopted from local shelters, naturally. There would be chewy things for the dogs to play with, and random structures for cats to climb around on, and all the furniture would be the cheap stuff because there's no way it'll last very long. People would be able to come in and play with the animals whenever they wanted, without any of the worries that come along with taking care of them, or cleaning up after them.

(So who would take care of all the animals? That's the beauty of it - I can't imagine that you'd have any trouble finding applicants for the job on a college campus, when the job description is basically "taking care of dogs and cats".)

As far as actually making money, there are a lot of ways. You could charge a few dollars at the door. You could sell snacks (for both humans and pets!) once people are inside. You could let people adopt the animals, if they really wanted to. There are merchandising opportunities - cat calendars are a dime a dozen, but a cat calendar where you can actually go and play with the cats in the pictures might be enough to set it apart, I dunno. And, as a last resort, you could train the animals to rob banks - wait, pretend I didn't say that one. >_>

Anyway, yeah! I think that'd be pretty neat.

Friday, October 15, 2010

The hardest thing is to go to sleep at night, when there are so many urgent things needing to be done. A huge gap exists between what we know is possible with today's machines and what we have so far been able to finish.

--Donald Knuth

That said, having programming as a hobby is sort of troublesome when you also program 9-to-5. So I think I need a new hobby, preferably one that doesn't involve computers in any way, and ideally one that involves me getting some exercise. >_>

Anybody got suggestions?

Saturday, August 28, 2010

Adventures in the Linux storage stack

I recently got my Gentoo box back from storage after three months, and promptly upgraded all the software on it and tried to install a shiny new SSD. Here are the problems I've run into so far:

  • Media drive failed to mount
    My media drive is a btrfs RAID-0 between two 500GB drives that I've picked up over the years. One of them wasn't being detected. After ruling out the usual suspects (drive was still working, as confirmed by smartctl, and btrfs' drive detection tool (btrfsctl -a) had been run [but oh god why does it even need this]), I started looking at more exotic causes.

    Diagnosis: The problem turned out to be a regression in the md userspace tools (must be, since I didn't upgrade the kernel): if you have a device which used to belong to an md RAID array, but doesn't anymore, md will still pick it up as a RAID device and lock it. The recommended fix is to zero out the first part of the drive, which isn't an option for me because the drive ACTUALLY HAS DATA ON IT. :[

  • pvmove stalls at 100%
    When using pvmove to transfer my OS drive to my shiny new SSD, pvmove stalled at 100% completed. No data was being written, which I know thanks to gkrellm. Then my entire system locked up, or at least every process that tried to write data locked up. After rebooting, I tried to resume the transfer (which pvmove supports), and the system locked up again the same way.

    Diagnosis: None. Chalk this one up to LVM being completely fucked. It is a master of walking the thin line between being so useful that it can't be ignored, and so broken that it can't be used. (UPDATE: might have been caused by creating the pv on a whole disk and not a partition. Not sure why that'd be the problem, but it's the only thing I changed and now it works.)

  • Boot environment doesn't see shiny new SSD
    So I have a completely customized boot process. It's part of a 70% successful experiment from about 6 months back in making an operating system which could survive a hard drive failure. (Harder than you'd think!) Part of that is a heavily customized initrd - basically a micro-OS which is embedded into the kernel image itself, so that I can do funky stuff with the boot process itself. This is all well and good, until I find out that while my OS can see my new hard drive fine, my initrd can't. :(

    Diagnosis: This one was me getting sloppy. When I created the initrd, I deleted a few thousand miscellaneous device nodes from the static /dev directory (because udev is too complex, let's just embed dev nodes for every piece of hardware that could possibly be connected -_-), and it turns out this included the one used by the drive I just added. I only discovered this after a half hour of "oh shit oh shit did I just transfer my entire OS to a DOA drive", of course.
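For anyone else running a static /dev, here's a minimal sketch of the check that would have saved me that half hour. It compares the block devices the kernel reports against the nodes that actually exist in a /dev tree, and prints the mknod commands for whatever is missing. The function name missing_nodes is made up, and the initrd path in the usage note is hypothetical; this isn't the script I actually used.

```shell
#!/bin/sh
# missing_nodes PARTITIONS_FILE DEV_DIR
# Prints the mknod command for every block device listed in a file in
# /proc/partitions format (major minor #blocks name) that has no
# corresponding entry in DEV_DIR. It only prints; it creates nothing.
missing_nodes() {
    partitions=$1
    devdir=$2
    # Keep only data rows: four fields, the first one numeric. This skips
    # the header line and the blank line after it.
    awk 'NF == 4 && $1 ~ /^[0-9]+$/ { print $1, $2, $4 }' "$partitions" |
    while read -r major minor name; do
        if [ ! -e "$devdir/$name" ]; then
            echo "mknod $devdir/$name b $major $minor"
        fi
    done
}
```

Something like missing_nodes /proc/partitions /path/to/initrd/dev (path assumed) lists what needs creating; look the output over before running any of it as root.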

Monday, August 16, 2010

How [should?] we teach history

This blog post on Abd el-Kader came to me secondhand via my RSS reader today. It's pretty fascinating in and of itself, but it's also really long, so don't start on it unless you've got a good chunk of free time ahead of you.

The thing that depressed me about the post is that I have basically zero context for any of it. I never really took an interest in history, not enough to go beyond what they taught us in high school, so for me north Africa and the Middle East are basically blank areas of the map for large swaths of history. In Rumsfeld's terminology, it was an "unknown unknown" until about an hour ago - something I didn't even realize I didn't know. For somebody accustomed to being a know-it-all, this comes as kind of a shock.

The trouble with "all the stuff we didn't learn about in history class" is that there wouldn't be enough time to cover it all if we spent our entire lives studying history. We only spend a limited amount of time in school, and that's a good thing! It just means that we need to find ways to present more varied information in the time we do have. Standardized curricula have their advantages, but the biggest downside in my opinion is that anything which doesn't make the cut remains completely unknown to several consecutive crops of students.

Do I have a solution? Nope, not a chance. I just think it'd be neat if we'd had more structured opportunities for everybody to go out and learn about something different. We can't (and shouldn't have to) increase the amount of time each child spends learning about history, so if we don't want to end up in a situation where everybody knows about the same subset of history, we need to find some way to diversify the knowledge base that we're building in history classes.

After all, if everybody learns the same parts of history, then nobody has anything interesting to say to anyone else about history, right?