Saturday, December 26, 2009

The Mind-takers

If I were to tell you that there was a group, funded by billions of dollars annually, supported by every major corporation in the world, whose sole purpose was to convince you to think a certain way, you'd be a little bit worried, right? What if I then told you that, instead of eliminating this activity, governments of the world merely seek to regulate it, so that beyond outright lying they can use any means necessary to bring you around to their point of view? They have wormed their way onto nearly every page on the Internet, sometimes even altering the text of the page itself, in order to bend you to their will. They not only appear in every form of mass media; in a very real sense they control all forms of mass media. They aggressively target children of any age. They use whatever means they can to stake out some territory inside your mind, and leave their message there, so that it appears again when they want it to, like a posthypnotic suggestion.

At some point while you were reading the preceding paragraph, you probably realized that I was talking about the ad industry, and a switch flipped in your head. "Oh," you thought, "he's talking about the ad industry, they're actually harmless! I will read the rest of this paragraph with that in mind, and appreciate the joke instead of being worried." Frankly, it frightens me a little, how willing everybody is to trust their impression of an industry that spends billions of dollars every single year with the goal of manipulating our impressions.

The Attention Economy

It seems pretty clear that people's attention has value - it's the premise that the entire ad industry is founded on, really. If attention has value, we should be able to treat it as property; it belongs to each of us. It can be thought of as similar to intellectual property, since it's an inherent product of our minds. When you have people's attention, you can put ideas in their heads, and this is immensely valuable if you have a cause (or a product) you'd like to spread around.

(Why don't we spend every waking moment taking in messages from others, giving away our attention to all who seek it? Because attention has value to us individually. To actually do things, and not just passively absorb things that others have done, we need to reserve some of our attention for ourselves.)

Attention can be sold. Every time you turn on the television, you're engaging in an economic transaction: In exchange for letting me watch Hugh Laurie cure people's ills with sarcasm, I agree to give up X number of minutes of my attention to watch these advertisements which are conveniently interspersed within the show. This commercial arrangement (pun unintended) was forced on us, so most of us don't really take it seriously - we ignore ads most of the time, or switch channels, or edit them out entirely if we have a good DVR. The idea is still there, though, even when we're not holding up our end of the bargain.

Attention can also be stolen. Take billboards, for example - you're not getting a damn thing in exchange for having to look at billboards during your morning commute. They've taken your property without your freely-given consent. Why do they get away with this? People don't generally consider their own attention to be their property, so forcing them to sit through commercial messages against their will is usually seen as, at most, an annoyance.

Without realizing it, we've developed ways to get discounts on the attention we are charged. When commercials come on TV, we change the channel, and escape without paying attention. (Broadcasters see this as shoplifting, and would love to prevent you from changing the channel during commercials like DVDs already do - but they know how well that would be received. Pun unintended.) On the Web, our eyes slide right past increasingly distracting ads designed to hold our attention hostage - we've learned how not to pay them any attention. When we receive spam, we go several steps further, and have complex technological systems designed with the sole purpose of classifying and deleting spam before we even have to look at it. Marginal attention cost: zero.

Do Marketers Dream of Hypnotic Sheep?

(Yeah, that's not how hypnotic is used, but I really wanted it to rhyme :( )

Targeted advertising is incredibly primitive today. Marketers aim at hugely broad categories, and while that level of targeting produces some results for them, it's laughable compared to what they could be doing. We know that they have access to increasingly comprehensive and diverse data about each and every one of us, and it really is worrisome. Today, though, given the sorry state of the art in targeted advertising, we don't have anything to worry about yet.

There are really three pieces here. They need raw data to start with, but they already have that coming out their ears, with the advent of electronic point-of-sale systems, and tracking cookies on the Internet, and pervasive CCTV cameras, and RFID tags, and any number of other new technologies. They need computer scientists (specifically, data mining specialists) to sift through that data, tease out correlations and useful facts, and find out everything there is to know about each of their consumers. They're making progress on this one. Finally, they need psychologists, to optimize their marketing so as to maximize their chance of influencing their targets. I don't know what the state of the art is here, but I sincerely hope it's nowhere interesting.

Because the psychological side of this could get very interesting indeed. Significant research has already been done on decision making, and on obedience, and on hypnosis, and on other tricks you can play with the human mind that I don't even know about. What if some enterprising young psychologist combined all of that into a predictive model of the mind, where you could try out different inputs and figure out how a person would respond to them, based on an existing profile of the sort we already have? It would be the next best thing to mind control, one individual at a time.

Let's call this the manufactured meme. It's not a matter of if, but when - someday, people will be able to design viral ideas, and deploy them into an unsuspecting society. Viral advertising and catchy commercial jingles are the beginnings of this. Advertising is trying to evolve into meme manufacturing at this very moment, and the only thing holding it back is the limitation placed on it by our current understanding of the human mind.

The fear I have about advertising - the fear that makes me think we should dismantle the entire industry, while there's still time - is that they will connect the dots, and that they will pull in the specialists they need to sift through the mountains of data that they're collecting, and the psychologists to direct the use of that data and figure out what makes people obey, and they will create the most perfect form of population control that has ever been conceived of by man. Only a fool could think that, after creating a system that can influence populations with a high degree of accuracy, those in power would use it only for good.

History has repeatedly shown that, once a tool exists, it will eventually find its way into the worst possible hands. In this case, the worst hands I can imagine are those of a totalitarian government, or a corporate fiefdom. 1984 tried to predict exactly this, and Brave New World came somewhat closer to what's possible, but both these books depict some fairly coarse-grained population control. The possibilities that manufactured memes and individually-targeted advertising present are far more subtle and terrifying. Imagine an ad for a political candidate that changes what it says to be as convincing as possible to each person that looks at it. The candidate's actual views are irrelevant - as we've learned repeatedly, it doesn't matter a whole lot what they promise during an election. Imagine a government that enforced conformity by turning people against their neighbors when those neighbors didn't toe the party line. Imagine untraceable targeted assassinations through advertising - remember the killer Pokemon episode, that induced seizures? An ad platform that can show everybody a different message could do that to a select group for next to no cost. (Okay, that last one is a little farfetch'd. Pun intended this time.)

This is not science fiction. It may sound farfetched now, but I don't believe we're all that many breakthroughs away from this becoming reality. And it's not a reality I want any part of.

Monday, December 14, 2009

Late-night braindump!

Guys! We've totally missed the real potential with ChromeOS!

It's got potential as a netbook OS, sure. But, beyond that, appliance computing - embedding the OS in the hardware - is a huge deal! Especially if it's stateless. Put two copies of the OS on separate flash drives, glue them in, reduce the moving parts (how many do you really need, anyway? OLPC got it down to 0, and something like 3 internal plugs), and you've got a 100% self-contained system that never needs maintenance! As long as we're borrowing tricks from OLPC, you could put a battery on it and have it portable too! Add a hand-crank and you've got truly mobile computing. Everything else along this line of thinking is just implementation detail, but yeah!

An interface (monitor, input devices, whatev) would be nice, but even nicer would be a system that's implicitly based on clustering. You'd take two or three or four of these, plug them into a switch, plug a monitor and a hard drive or two into that same switch, and you've got a computer that's resistant against hardware faults and completely network-based. You could upgrade it just by plugging new components in and... that would be it, yeah! Probably wouldn't even need a restart. We already have the foundations for zero-configuration networking, so this would be the easy part.

Oh hey! as long as it's network based, you could add multiple user terminals and have several people using a shared cluster with full access to all resources. I think most of the problems here are solved; the interesting one I see is networked storage - you'd need encryption, because we have to assume that anybody can see any hard drive, and hmm. That might actually be it. Obviously you'd need a way to provide a private key - maybe adding a USB port to the input devices and plugging in a flash drive. Idea! The input device could be a thin client with a custom interface, which you could carry around and plug in to any cluster. That would take a lot of the burden off the OS on the cluster nodes (which would have to be pretty stripped-down and generic to begin with) and oooh. That'd be neat.
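The "encrypt everything before it touches the network" idea could be sketched like this. A toy only: the SHA-256 keystream below is a stand-in for real authenticated encryption (a real system would use something like AES-GCM from a vetted library), and every name and size here is invented.

```python
import hashlib
import hmac

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Stretch the key into a pad by hashing key + nonce + counter.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def seal(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Encrypt-then-MAC: the storage node only ever sees ciphertext."""
    ct = bytes(a ^ b for a, b in zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag

def open_sealed(key: bytes, blob: bytes) -> bytes:
    """Verify the MAC first, so tampering by a storage node is caught."""
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    if not hmac.compare_digest(tag, hmac.new(key, nonce + ct, hashlib.sha256).digest()):
        raise ValueError("storage node tampered with the data")
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, nonce, len(ct))))

# The key would come from the flash drive plugged into the input device:
key = hashlib.sha256(b"key from the flash drive").digest()
blob = seal(key, b"0123456789abcdef", b"my private files")
assert open_sealed(key, blob) == b"my private files"
```

The point of the sketch is the trust boundary: the cluster's disks hold only sealed blobs, and the MAC check means a snooping node can't silently alter your data either.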

Problem: untrusted clusters. Crypto-computation might be impossible (I have seen some laughably bad research attempting to solve it), so we might have to assume that any given node can potentially snoop on computations and alter them. It might be possible to finagle something with a signed system on the cluster nodes, but they'd be too much of a high-value target to trust that some other kind of manipulation wasn't going on. Trust is a fundamentally hard problem, let's ignore it for now and work around it. Either we implicitly trust the cluster (boring, stupid) or we figure out a way to run computations on untrusted nodes so that - dammit, back at crypto-computation. :[

If Light Peak lives up to its potential, it would be kickass for this. Everything would only need one plug. Problem: fiber is a lot less resilient than CAT-5. Ah, well, not like the layer-1 matters too much.

Appliances require maintenance sometimes! What we really want here is a computer that you can embed in the wall and forget about, and have as a seamless part of the network doing useful stuff, and be reasonably confident that it'll outlast the building. I can't think of a name for this right now, but it'd be pretty cool. Transparent computing? Ubiquitous computing? Ooooh! As long as we're sticking these in the walls, we could create a wireless mesh network with the neighbors and if this actually becomes widespread it'll be usable for stuff. Routing in mesh networks is hard and really outside the scope of this braindump but the potential payoff is too cool for me to ignore.

Anyway, back to transparent networked storage! I am imagining an interface where you can just dump data onto a network, and "pull" all data owned by you back onto your own system when you leave the network. This would be at least a billion times nicer than our current home network storage metaphor, which is complete crap. Network storage shows up as hard drives, which sometimes vanish, and may or may not be accessible based on remote configurations, which can change at any time, and concurrent access is a bad joke, and seriously who came up with this ridiculousness? But I digress. We want all storage to be accessed in a unified way, and always writable would be nice - a reciprocal protocol would guard against people just hogging all the storage, but dunno how that'd work yet. Future braindump topic! Maybe require people to provide all the storage they actually use, but distribute the data around the network so they get the benefit of insurance against disk failure.
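The dump/pull interface might look something like this. Everything here (the class, the methods, the owner strings) is made up for illustration - no such protocol exists yet.

```python
class NetworkStore:
    """Toy model of 'dump data onto the network, pull your own data back'."""

    def __init__(self):
        # (owner, name, data) triples, imagined as spread across the network.
        self._blobs = []

    def dump(self, owner: str, name: str, data: bytes) -> None:
        self._blobs.append((owner, name, data))

    def pull(self, owner: str) -> dict:
        """Retrieve everything you own and remove it from the network."""
        mine = {name: data for o, name, data in self._blobs if o == owner}
        self._blobs = [b for b in self._blobs if b[0] != owner]
        return mine

net = NetworkStore()
net.dump("alice", "notes.txt", b"braindump")
net.dump("bob", "song.ogg", b"...")
assert net.pull("alice") == {"notes.txt": b"braindump"}
assert len(net._blobs) == 1  # bob's data stays on the network
```

Contrast with the mounted-drive metaphor: there's no path, no mount point, no remote permission config - just "here's my data" and "give me my data back".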

(For any of this to be feasible, we have to be able to build 100% secure operating systems. Attention, programmers everywhere! Stop sucking. :( )

If we could set this up in such a way that you could charge people for using resources on your nodes, we'd have a completely distributed clone of EC2. Cool! (Oh, hey, that'd work for storage too. Two birds with one distributed ripoff! :D)

Back to the software side: to share compute resources, we'd need a virtual machine and a stable OS API so that different computers can use the cluster. There are a few good choices right now (JVM and CLR mainly, and LLVM is looking good). Randomly: it'd be nice to have a VM that supported distributed execution of a single program. Might get messy with failures -- no, wait, the VM should handle failures! And the program should just see a single (highly parallel) machine that never fails! Gosh, that'd be neat.
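The "VM handles failures, program sees a machine that never fails" idea reduces to a retry layer between the program and the nodes. A sketch with a simulated cluster - the node model and failure behavior are invented, not how any real VM works:

```python
class NodeDied(Exception):
    pass

def run_on_node(task, node_id, dead_nodes):
    # Simulate a cluster node that may have gone down.
    if node_id in dead_nodes:
        raise NodeDied(f"node {node_id} is down")
    return task()

def run_on_cluster(task, nodes, dead_nodes=frozenset()):
    """The 'VM' layer: keep retrying on other nodes until one succeeds."""
    for node_id in range(nodes):
        try:
            return run_on_node(task, node_id, dead_nodes)
        except NodeDied:
            continue  # invisible to the program -- just pick another node
    raise RuntimeError("entire cluster failed")

# Nodes 0 and 1 are dead, but the program never notices:
result = run_on_cluster(lambda: sum(range(1000)), nodes=4, dead_nodes={0, 1})
assert result == 499500
```

The hard part a real VM would add is restarting tasks that die *midway* without losing state - which is exactly why pushing failure handling below the program is attractive.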

Years of utterly worthless computer security have left us in a state where we can't even pass a file between two computers without bypassing six different "security features" designed to keep computers from ever communicating, just in case one side or the other is infected with a virus. This is pathetic! Hardware is improving at an incredible rate, but that hardly matters when software is completely and pathologically unable to keep up. The true cost isn't that some things are harder than they should be; sending files at all is a parlor trick. The true cost of poor security is all the cool stuff we could be doing if we didn't have to treat every damn network service as an attack vector.

Technology is pretty neat and all, but sometimes I wonder that we don't fully appreciate how primitive this all really is.

Saturday, December 12, 2009

Goodbye Twitter

It's been about eight months since I decided I'd start microblogging. How has it turned out? Actually, you can probably guess from the title that I'm ditching Twitter and just sticking with Identica.

So what's the difference? I started with both of them basically cold, knowing very few people on each. Twitter has the advantage of being a fad right now, so all sorts of people are using it. It's nearly achieved ubiquity; the API is simple enough that you can integrate Twitter support into just about anything. Identica, on the other hand, has some really, really nice features:

* Groups: Twitter has hashtags, where you can tag a notice as being about a particular topic. Groups take that to the next level, and let you subscribe to a topic. Twitterers: imagine being able to follow a hashtag and you'll understand. This makes it tremendously easier to get started with Identica. Instead of having to find interesting people to follow, you can subscribe to a group that interests you, and find the people you'd like to follow as they post interesting stuff to the group. Or, another way they're useful: on Twitter, if you want to ask a question, you have to either address it to somebody you know is knowledgeable (and hope they're around), or ask the world in general, and hope that somebody following you knows the answer. With Identica, you can address the question to a group (or even multiple groups), and get a good answer much more easily.

* XMPP: In all fairness, Identica's XMPP support doesn't always work (it tends to lag during upgrades, for instance), but when it's working, it's a thing of beauty. People sometimes describe Twitter as realtime, but I'm pretty sure that's just because they've lowered their standards; the latency is still measured in minutes even under ideal conditions. Using XMPP on Identica, I can use the site through my existing IM client (pidgin, yaaay), and the latency is measured in seconds. That's getting down to the speed of instant messaging - you can have actual realtime conversations with people on Identica through a chat client! Yeah, it's pretty badass.

* Context: Twitter includes a link on tweets to the tweet you're replying to. That's cute, Twitter! Identica has a similar link on updates, only instead of showing you the update it's replying to, it shows you the entire threaded conversation. Seriously, head over to my profile, and try one of the "in context" links. It's not perfect, since it's not always possible to infer which conversation an update is part of, but it works something like 90% of the time.

If I had to try to put my finger on the biggest difference between Identica and Twitter, it'd be this: Twitter seems to be focused on people, while Identica is focused on conversations. Twitter's social dynamics are interesting in an abstract, six-degrees-case-study sort of way ("how far can we get using just the social graph?"), but it seems to be remarkably resistant to the formation of communities. Conventions like retweets and Follow Fridays help here, but they don't exactly constitute a solution. Instead, they just serve to outline Twitter's real problems.

So farewell, Twitter. It's been fun, but Identica is just more interesting.

Monday, November 30, 2009

Against Sleep

I don't dislike sleep itself.

Right now I should be going to sleep, but I'm not because for some reason, some nights, the apex of my alertness for the entire day hits as soon as my head hits the pillow. Tonight already feels like one of those nights where I lie awake for hours waiting for my brain to shut itself off so I can stop staring at my clock and watching the hours tick by. And every ten minutes I remember something else I could be doing instead of waiting for sleep (like writing this blog post) and I either do it, or it bounces around inside my skull for the next half hour until I remember something more important that I've forgotten to do.

What's worse than the nights, though, is the mornings. I always feel incredibly groggy in the mornings, sometimes (rarely, luckily) to the point that I can't even get myself out of bed - and that's when I've had enough sleep. When I don't get enough sleep, it feels like microscopic ninjas have snuck under my skin while I was sleeping and sprinkled a layer of fine grit right under my epidermis. If I could switch the way I feel in the morning with the way I feel in the middle of the night, I might not be a happier person, but I'd probably at least be a lot more balanced.

And in between these two wonderfully inverted polar opposites, I dream. I'm really glad I don't remember the vast majority of my dreams, because based on the ones I do remember, I apparently dream in low-budget sci-fi horror flicks. Let's look at some recent dreams that I remember. In the most recent one, I had the power to travel back in time and redo parts of my life, but I was hunted and repeatedly killed by somebody with a similar power. In another recent dream, I die in some sort of zombie outbreak caused by a mind-controlling fungus. I wake up decades later after it revives me for some reason; it's taken over the world and nobody realizes it (because the fungus altered everybody's memories, natch). It's never bright, happy dreams; it's always stuff that freaks me the hell out.

So anyway, I've got nothing against sleep, really. It's everything associated with it that I can't stand.

Sunday, November 29, 2009

Decentralization VII: Definitions

I started this series of posts to try to get a better handle on what decentralization is, what it's good for, and what general situations require it. The level of centralization is an important characteristic of systems, but people don't always understand the tradeoffs that exist there. I think, at this point, that it all comes down to the division of control.

The definition I think I'm going to use: A system is decentralized in a certain aspect if control over that aspect is divided up over the components of the system. It's too vague to say that a system is a decentralized system without specifying what aspect you're talking about. Systems can be centralized in some aspects and decentralized in others, and in fact I think nearly all systems are a bit of both.

Mini case study: Blogspot has centralized ownership (Google) and code, along with centralized authentication (mostly Google accounts, I think?) and centralized execution (runs on Google's servers). On the other hand, it has decentralized authorship (contrast with a newspaper or something like that), and participates in several decentralized protocols (linkbacks, RSS, etc). Contrast with Wordpress, which optionally has decentralized authentication and execution (you can run it on your own servers), and somewhat decentralized (open-source) code. Because of this, we can say that, broadly speaking, Wordpress is less centralized than Blogspot.

Systems can vary in the degree that they're decentralized. Federated systems, which I've mentioned before, have a set of independent servers which are decentralized, but clients connecting to the system don't have to worry about this and can just pretend they're connecting to a centralized system. (Contrast with p2p networks like Gnutella, where every user on the network is simultaneously a client and a server.) Once we understand decentralization as a division of control, it becomes clearer what's going on here: control is being divided among only the participants in the system that want to run servers, and thus reflects the reality that only a few users will actually want the responsibility that comes with that control.

Can we divide up control in more complex ways? Yes, but there's not a whole lot of benefit to it, and it increases the complexity of the system significantly. Simplicity is inherently valuable when it comes to code; complexity is usually a problem to be eliminated. I can't think of any computer systems right now that have more complex control hierarchies than federated systems - they almost certainly exist, but they're not common.

So what are the tradeoffs? Control is important in systems for both technical and business reasons, which aren't always the same. Twitter would definitely be more reliable if they made it more decentralized (more on robustness in a bit), but it would also impact their (nebulous) ability to turn a profit if they gave up control.

Why are decentralized systems more robust? There are a lot of factors that come into play here, really, so I don't have a clean answer yet. Let's look at some failure modes of systems. System overload is less of a problem in decentralized systems, because they're necessarily designed in more scalable ways, which makes it easier to throw hardware at the problem until it goes away. Hardware failure is also less of a problem, because systems that execute in a decentralized manner can more easily be made immune to individual pieces of hardware failing. Software bugs are less harmful in systems with decentralized codebases, because different implementations of the same protocols rarely have the same bugs.

Maybe this is the answer: centralization implies control, and control implies an opportunity for the one in control to make a mistake. If we define robustness as the number of mistakes or hardware failures that would have to occur to take the whole system down, or better yet the probability of a sufficient number of mistakes happening, it's easy to see why decentralized systems are robust. Thought experiment: if we had a system that was decentralized in such a way that any failure would break the system, we would have a system where decentralization made it far more fragile, rather than robust. Luckily, you'd have to be crazy to design something like that.
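That definition lends itself to back-of-the-envelope arithmetic. Here's the comparison with made-up numbers - only the shape matters, not the values:

```python
p = 0.05  # assumed chance any one component fails this year

# Centralized: one component, and its failure takes the system down.
centralized = p

# Decentralized with redundancy: the system dies only if ALL n copies fail.
n = 5
redundant = p ** n

# The pathological thought-experiment: ANY of the n failing breaks it.
fragile = 1 - (1 - p) ** n

assert redundant < centralized < fragile
print(f"centralized={centralized:.4g} redundant={redundant:.4g} fragile={fragile:.4g}")
```

With these numbers, redundancy takes the yearly failure chance from 5% down to about 3 in 10 million, while the "any failure breaks it" design pushes it up past 22% - the same division of control, pointed in opposite directions.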

This is the last post in this series, I think - I've found what I was looking for.

Saturday, November 28, 2009


Is it just me, or is it kind of ridiculous the way every scandal in American politics is eventually referred to as something-gate?

For those of you that don't keep up with every little thing to happen in climate politics, somebody hacked into the servers of a climate research group the other week, and released an archive containing selected portions of the past 10 years of their email correspondence, and probably some other stuff too. Naturally, people were able to find all sorts of embarrassing stuff in there. The most talked-about email is one referring to a "trick" somebody used to combine two data sets, back in 1999 - this obviously proves that the entire field of climate research is a hoax, to hear people talk about it.

I won't be talking about any of the specifics of the leaked data, though, because I can't look at the source data in good conscience. Breaking into somebody's servers and putting over a decade of their correspondence online for their rivals to pore over is beyond reprehensible. It's especially bad in this case, when the rivals in question are playing politics and have very low standards for proof.

One particular issue that keeps cropping up is the removal of context. People will go and cherry-pick quotes (here is a recent and out-of-mainstream example), and somebody will eventually point out that the code in question only applied to one graph that was only used for "cover art". It's easy, when you're looking at correspondence over a span of many years, to find quotes that look bad out of context - but what does that actually prove?

Let's be uncharitable to the researchers here. Let's assume that some of these quotes really are as bad as they look out-of-context, that there really is a massive conspiracy to rewrite the data, which is somehow only evident in a few snippets. Even assuming the worst, what does this prove? This is only correspondence for a single institution, which is competing with several other groups - and yet the results all still point in the same direction. Should we posit that every group working on climate science is in cahoots, and engaged in a massive conspiracy? What scares me about this is that many people would say that we should.

Maybe I'm wrong here, but here's the impression I get watching the climate change debate. First, there are serious researchers, who (generally) agree that climate change is probably happening, and do actual work to figure out the extent of it. Second, there are the "green" cheerleaders on the political left, who say mostly inane, common-sensical things about global warming, and push for specific political/technical solutions (looking at you, Al Gore). Their hearts are generally in the right places, except for the ones that are only on this team because they've found a way to make a profit (I'm watching you, Al Gore).

Then there's the anti-global-warming crowd, made up of both right-wing political types and people reacting against the second group, who seem content to take random potshots at the first two groups indiscriminately, cry about conspiracies, and (like all conspiracy theorists) don't really care about having a consistent response to the people doing actual science. It's telling that there are almost no climate scientists in this group. (Members of this group will respond: "because of the conspiracy!", and thus prove my point.) This group is especially irritating to me, because they tend to be very anti-scientist.

Seriously, though - hacking people maliciously is bad enough, but calling it Whatevergate is just awful.

Friday, November 27, 2009

My room is not my room

In the tried and true tradition of people trying to come up with something under a deadline, I'm going to look around and write about the first thing that comes to mind.

I'm staying at home this weekend 'cos of Thanksgiving, and I'm realizing that my room no longer really feels like my room. Everything in the room, with the exception of a floppy red hat that I picked up this summer, is exactly as it was when I left a little over three years ago. Thanks to the ruthless forces of natural selection, everything I really care about has moved with me to my apartment, and everything else (a lot of plastic dinosaurs, for example) is stuff I simply don't need. I suppose you could say that my stuff has evolved.

After a series of ruthless throwings-away, just about everything is gone. Just like with archaeology, the only bits that are left from the 12-year or so span are the ones that happened to land or get stuck in favorable spots. Here's a jigsaw puzzle that's a National Geographic cover; we thought it looked neat so we framed it. I used to do huge jigsaw puzzles all the time with my mom, I wonder when that tradition died out. I kind of miss that.

Next to it is a set of collapsible shelves. I was never content with just having ordinary shelves, so I put them together in some weird way, with varying heights. The whole thing is kind of rickety since it was never supposed to work this way, but I kind of like it. I think it looks better this way, really - it's a pretty nondescript piece of white plastic furniture. There's a brown plastic Tyrannosaurus on the shelf. I've had it for as long as I can remember.

I should mention that all the posters on the wall are tilted by about 10-15 degrees. During some nonconformist phase (back in middle school, I think?) I went and made everything off-center, because I liked the way it looked. I probably would have done the furniture that way too, if it'd been feasible. They're all basically as they were when I left; the only poster I really cared to bring with me to college was a six- or seven-foot-tall picture of a Shuttle launch (it is pretty awesome).

Something I'm realizing as I look around: the summer after eighth grade is when I got my own computer, and it's also when I really stopped caring what my room looked like, since I was spending a lot more time online and a lot less time in my room. There's almost nothing in here that reflects me after I started high school. Maybe that's part of the reason the room feels so alien to me now - it's largely still my room from eighth grade. That's kind of a depressing thing to realize. :/

At some point (probably next May) I'm going to have to move out of this room properly. I'll take down all the posters, clean out all the drawers, throw out whatever I don't want to bring with me, and officially return the room to its status as a guest bedroom. After that, I'll come back and visit this house, but I'm never going to live here again. That's another depressing thing to realize, but one that I'm going to try to come to terms with before it becomes painfully relevant in about six months. I've already started to refer to my apartment as home; I think that's a good first step.

Wednesday, November 25, 2009

Taking Back Chrome OS

Chrome OS is a pretty interesting platform, all the more so now that it's open source. It's pretty strongly tied to Google right now, but there's nothing stopping me or anybody else from taking out all the Google-specific bits and building up something new in their place. That's the interesting thing about open source - rather than being controlled by whoever wrote the software, it can go in directions that the creators may not have anticipated, and may not want.

The obvious first change would be to switch out the default browser - Firefox OS? Opera OS? This would depend on putting the X stack back into Chrome OS, since if I recall correctly Google dropped that. It's definitely possible, though. I think the harder part would be making the necessary modifications to the browser to allow it to be used as the sole interface to the system. Chrome can apparently view folders, and supports pinning web apps to a menu; I don't know if other browsers would want to do things exactly the same way, but they'd have to do something similar.

We could also switch in other non-browser applications. The command-line addict in me would love to have an OS that boots straight to a command line, and can connect to the internet so I can ssh. Having that on a netbook that I can carry around would be insanely useful. Application vendors could also create single-application systems as demos (and if they do, people will be trying out their apps running on Linux!).

Some people are going to ask what the point is, when we already have much more capable Linux distros that can do the same things. The interesting thing that Chrome OS adds is completely safe, transparent upgrades, and a read-only root. This has two humongous implications for the end-user: they don't have to worry about software upgrades ever, because the system auto-updates, and they can always reboot the system to get it back to a known-good state. Yes, this also means that you lose a lot of flexibility with the system; it's a tradeoff, and in many cases, the benefits outweigh the costs.
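The known-good-reboot trick can be sketched with a toy A/B-slot model - updates only ever get written to the slot you're not running from, so a bad update can't brick you. (This is my own simplification in Python, not the actual Chrome OS updater.)

```python
# Toy model of A/B "safe upgrade": the updater only writes to the
# inactive slot, so the running system stays intact, and a bad update
# is rolled back just by booting the other slot.

class ABSystem:
    def __init__(self):
        self.slots = {"A": "v1", "B": None}  # slot -> installed version
        self.active = "A"

    def inactive(self):
        return "B" if self.active == "A" else "A"

    def apply_update(self, version):
        # Write the new image to the slot we're NOT running from.
        self.slots[self.inactive()] = version

    def reboot(self, new_slot_boots_ok=True):
        # Try the freshly updated slot; fall back if it doesn't boot.
        target = self.inactive()
        if self.slots[target] is not None and new_slot_boots_ok:
            self.active = target
        return self.slots[self.active]

sys_ = ABSystem()
sys_.apply_update("v2")
print(sys_.reboot(new_slot_boots_ok=False))  # bad update: still "v1"
sys_.apply_update("v2")
print(sys_.reboot())                         # good update: now "v2"
```

The user-visible upshot is exactly the paragraph above: the update happens silently in the background, and rebooting always lands you on an image that's known to boot.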

The core interesting idea of Chrome OS that we can use is the idea of "appliance computing". Looking at it a certain way, the OS really doesn't matter at all - what the user cares about is the application. Chrome OS, minus Chrome, gives us the necessary foundation to build completely stripped-down single-application systems. Obviously, this isn't going to be useful for everybody in all cases, but it's a pretty interesting idea to play around with.

I think the first step should be factoring out the "OS" part of Chrome OS. It'd be useful to have a core OS part, which automatically updates in the background, and an application part, as an image which is overlaid on the OS and contains the application. This way, there could be a single shared source for OS upgrades for all the different single-app systems. It would also make it pretty easy to make your own computing appliance, since you'd just have to package up the necessary files. There would need to be a dependency system so that version mismatches between the two parts don't cause problems, but this is a problem that Linux distros have explored pretty thoroughly.
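The dependency part could be as simple as the app image declaring which OS versions it was built against, and the boot process refusing to overlay a mismatched pair. A toy sketch (all the names and version formats here are invented for illustration):

```python
# Hypothetical compatibility check between a shared, auto-updating
# core OS and an overlaid application image. The app image declares
# an inclusive (min, max) range of OS versions it supports.

def compatible(os_version, app_requires):
    """Return True if os_version falls in the app's declared range."""
    lo, hi = app_requires
    return lo <= os_version <= hi  # tuples compare lexicographically

os_version = (0, 4, 22)  # updated independently of the app image
app_image = {
    "name": "ssh-appliance",
    "requires": ((0, 4, 0), (0, 5, 99)),
}

if compatible(os_version, app_image["requires"]):
    print("overlaying", app_image["name"])
else:
    print("refusing to boot: OS/app version mismatch")
```

Real distros solve this with much richer version constraints, but the core check is this simple.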

This is kind of drifting into some ideas I've had about new ways that distros could be structured, but that's a whole other topic for a whole other post. ;)

Free electronic stuff!

This is going to be kind of awesome.

It's a really interesting idea for a promotion, actually - everybody loves free stuff, so the news that this is happening is going to be all over the place by January 7th. I, for one, am going to start planning a shopping list for that day well in advance, just in case I actually get the free stuff (they're capping the giveaway at $100k, which will probably dry up really quickly). If I don't get it, I might actually still buy the stuff - it depends on how I feel about whatever project I come up with.

I really want to build some kind of flying robot, dirigible-style. No idea how that's going to work, yet - ideally it'd be able to float mostly unpowered, so I'd either have to buy a lot of helium, or learn how electrolysis works (hydrogen rocks! :D). That's my tentative plan, anyway. We'll see how long it lasts. No matter what, I'm going to have some kind of plan come January 7th.

This sort of promotion ("hey, guys! free stuff!") would be really neat if more retailers started doing it, actually. Imagine if Newegg or Amazon just announced that 1% of orders on the site would be totally free. People really love lotteries, as history has shown, so this could be really effective. If I'm choosing between a deal offered on two sites, and one of them has a chance of giving me the order for free, I know who I'm buying from.

Anyway, short post today, because Thanksgiving is tomorrow! I've made cornbread and cranberry sauce so far. :D

Tuesday, November 24, 2009

Terrorists we need to watch out for

With the terror alert level pegged at Yellow (or Orange, if you're flying), and thousands of American troops in the Middle East fighting "terrorists", I think it's important that we know who the enemy is. Here are some prominent terrorists of the past few years:

* Richard Reid, the Shoe Bomber - Despite the stringent security measures in place at airports, Reid did the unthinkable, and managed to smuggle a bomb aboard an airplane. After being admonished once by a flight attendant for lighting a match, he was later found struggling with a fuse, another match, and one of his shoes. Next time you have to take your shoes off at the airport, remember this: every air traveler has had to undergo this inconvenience for several years now, because of a terrorist that couldn't detonate his own bomb.

* Gary McKinnon, Super-hacker - The US government is currently trying to extradite this Extremely Dangerous man from the UK. McKinnon broke into nearly a hundred military computer systems during 2001 and 2002, using advanced hacker techniques such as guessing default or blank passwords. He used his access to top-secret computer systems to delete some files, and search for information on alien life, which he believed NASA and the US government were hiding. Gary McKinnon is one of the most credible cyber-terrorists discovered to date.

* Iyman Faris - Suspected of plotting to destroy the Brooklyn Bridge. This would have been a tremendous disaster, on several levels. Luckily, the plot involved cutting the bridge's support cables with an acetylene torch, which would have taken some pretty serious time and effort, since the cables are necessarily pretty sturdy, and consist of thousands of individual wires.

* Najibullah Zazi - One of the most recent terrorists uncovered in the US, Zazi received terrorist training directly from al-Qaeda, including instructions on bomb making. Had he managed to construct a bomb, he could have done some serious damage. Unfortunately for him, he failed to build a bomb despite nearly a year of repeated attempts. Nevertheless, this is hailed by some as the most serious terrorist plot uncovered in the US since September 11th.

* The Fort Dix attackers - Their entire plan was to show up at a military base and kill as many people as possible. This could have been a really serious plot, if they'd managed to acquire the weapons the plot required. However, they made a few fatal blunders: they videotaped themselves firing automatic weapons and talking about the plot, and then sent the footage to a Circuit City to get it made into a DVD. They also attempted to buy their weapons from an FBI informant.

* Star Simpson - While picking up a friend at the airport, wearing a homemade nametag with blinking lights, Simpson was arrested by officers carrying submachine guns. For a few days, the media was in a frenzy, gleefully reporting it as a threat with a "fake bomb". They stopped talking about the situation, naturally without any kind of retraction, once it became clear that she wasn't actually any kind of threat. I feel safer already!

So remember, folks: terrorists are people you need to be afraid of. The next time you see a major terrorist plot being reported on the news, pay attention to the details - given the media's track record in this area, it's far more likely that the alleged terrorist is incompetent, insane, or just plain innocent.

Monday, November 23, 2009

Cop-out post again

No time to write a proper post today :( Instead, I'll just point you to MLIA - hopefully it is enough of a distraction from this fail post >_>

Sunday, November 22, 2009

Monad documentation hate

Monads are a useful construct in Haskell. Even though it's a pure-functional language, monads allow you to "cheat" a little, or at least pretend to, and carry state through computations (or do other stuff). This is pretty important, since a truly pure functional language wouldn't be able to do useful things like generating output.

Unfortunately, just looking at the documentation that exists online about monads, I wouldn't be entirely convinced that they're not just an elaborate joke cooked up by functional programming partisans - the equivalent of a linguistic snipe hunt. Let's look at some examples.

First, the Haskell wiki article on monads, since that comes up pretty high in search results. This is pretty close to coming straight from the source; I bet they have some great, readable documentation, right?

This is the introductory image they use. The caption reads: "Representation of the Pythagorean monad". Thanks, Haskell wiki! Since you haven't yet explained what a monad is, or why there would be a Pythagorean one, this diagram means nothing at all to me. Actually, it's worse than that - even after putting some serious effort into learning what monads are, and gaining a basic understanding of what they do, I still can't make any sense of this diagram. Awesome!

Or, let's look at this tutorial called "A Gentle Introduction to Haskell". It's the first result when I search for "haskell using monads", so it must be pretty useful, right? Unfortunately, it begins by diving straight into the "mathematical laws" that "govern" monads, whatever that means. Actually, I'll just quote it:
Mathematically, monads are governed by set of laws that should hold for the monadic operations. This idea of laws is not unique to monads: Haskell includes other operations that are governed, at least informally, by laws. For example, x /= y and not (x == y) ought to be the same for any type of values being compared.
Which is great and all, but doesn't explain a whole lot to somebody who doesn't yet know what monads are. This is not how you write introductory documentation, guys! To borrow an example from the quote, you don't introduce the == operator by saying that it follows a mathematical law. You say what it actually does first!

So what is a monad, actually? As far as my (admittedly extremely rough) understanding goes, it's the functional equivalent of a wrapper class in OO languages. You can use it to wrap up some boilerplate code that would otherwise be really irritating, in a more or less standard way. Haskell also provides some syntactic sugar in the form of "do" blocks, which you can use for some monads to make your code look like imperative code.
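To make the wrapper-class analogy concrete, here's a rough sketch of a Maybe-style wrapper - in Python rather than Haskell, so the machinery is out in the open. (`Maybe`, `bind`, and `safe_div` are just names I picked for the sketch; Haskell's real Maybe monad does this with `>>=` and far less ceremony.)

```python
# A Maybe-style wrapper: bind() hides the "stop at the first failure"
# boilerplate, the same job Haskell's Maybe monad does.

class Maybe:
    def __init__(self, value, ok=True):
        self.value, self.ok = value, ok

    def bind(self, f):
        # If an earlier step already failed, skip f entirely.
        # This check is the boilerplate the wrapper saves you from.
        return f(self.value) if self.ok else self

def safe_div(divisor):
    def step(y):
        if divisor == 0:
            return Maybe(None, ok=False)  # failure propagates from here
        return Maybe(y / divisor)
    return step

# 100 / 5 / 2 succeeds; a division by zero short-circuits the chain.
result = Maybe(100).bind(safe_div(5)).bind(safe_div(2))
failed = Maybe(100).bind(safe_div(0)).bind(safe_div(2))
print(result.value, result.ok)  # 10.0 True
print(failed.value, failed.ok)  # None False
```

Without the wrapper, every step of the chain would need its own "did the previous step fail?" check - which is exactly the kind of repeated plumbing a monad packages up.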

See, now that wasn't so hard, was it?

Saturday, November 21, 2009

Building ChromeOS on Gentoo

Google released the source code to Chrome OS, and there are already some disk images floating around, but we Gentoo users know that if you want it done right, you have to compile it yourself. ;) Here's how I built a VMware image on Gentoo.

The build instructions say you need Linux, but Google, as usual, seems to assume that you're running Ubuntu when you're trying to build Chrome OS. Here are the steps I took to build it, following the documentation that starts here:

(If you don't really care about how I built it, and just want the image, click here.)

1. emerge debootstrap
(debootstrap is necessary to actually build the OS, but you can leave this running in the background while you do the next few steps.)

2. wget
3. tar xzvf depot_tools.tar.gz
4. export PATH=$PATH:$PWD/depot_tools
5. gclient config
6. gclient sync
(wait for a really long time while it downloads the source)
7. cd chromiumos.git/src/scripts

(at this point, you'll need debootstrap installed, it'll be in /usr/sbin so let's put that in PATH)
8. export PATH=$PATH:/usr/sbin:/sbin

(if this fails, and you have to rerun it, you might need to delete the repo/ directory between runs. also, a few of these scripts will sudo, mainly to set up the mounts in the chroot; read through the script if you don't trust it, it's pretty short)
9. ./
10. ./
11. cd ../..

(now, we download the browser. the link on the build instructions is broken, but this one should work:)
12. mkdir -p src/build/x86/local_assets
13. wget -O src/build/x86/local_assets/

(now, we enter the chroot and continue with the build)
14. cd src/scripts
15. ./

(at this point you should be in the chroot)
(grab a snack, this'll take a while)
16. ./
17. ./

(I ran into a conflict here: HAL on the host system somehow blocks ACPI from being installed in the chroot. Stopping hald on the host system worked around it successfully.)
18. ./

If you made it this far, you have an image built! Awesome.

To build a VMware image, use the script:
19. ./ --from=../../src/build/images/999.999.32509.204730-a1

Note that this script requires qemu-img. You can edit the script and replace qemu-img with kvm-img if (like me) you have kvm installed but not qemu.

I haven't tried building a USB image, but it should work something like this:
20. ./ --from=../../src/build/images/999.999.32509.204730-a1 --to=/dev/sdb

So, as of right now, I'm running Chrome OS in VirtualBox. It's pretty slow, being in a VM and all; I'm going to try to get it on my Eee later, when I have more time.

First impressions:
The boot time is pretty freaking ridiculous, especially since it's running in a VM. Something like ten seconds to a login screen. Can't wait to see how it does on real hardware.

There's some kind of bug where I get certificate errors for Google sites - but only the pinned ones. The pinned Gmail tab errors, for instance, but logging into Gmail in a new tab looks fine. Other people have reported similar problems for other builds, it looks like, so I'm expecting that it'll get fixed eventually. It might have something to do with how it tries to automatically log in based on your login credentials; that's complete speculation on my part, though.

The Chrome menu that was in the demo is missing in my build. Not really sure why. :( Could be because I used a different Chrome binary than the one they listed in the install docs. Will have to try that again once they fix the link. >_>

Friday, November 20, 2009

Decentralization VI: Package Management

I have issues with software installation on Linux distributions. Unlike Windows and Mac OS X, where (for better or worse) you can download a package from a website and install it, on Linux packages come through centralized repositories. There are advantages and disadvantages to this method, but Linux distros don't do it because of the advantages; they do it out of necessity. I'll explain why in a bit, but I want to get the pros and cons out of the way first.

Centralizing packaging has certain benefits from a QA standpoint. It allows distros to ensure compatibility between all the various components of the system, and enforce various standards across all the packages. It also allows them to smooth over incompatibilities between distros - if Ubuntu does things a certain way, but the person that wrote the software had Red Hat in mind when they wrote it, central packaging allows Ubuntu devs to sort that out before the software lands on people's machines. Distros also prefer to have control over packaging because there are a lot of different package formats used on Linux, and it would be kind of ridiculous if every software author out there had to support all of them. There aren't hard and fast rules about which parts of installation are distro responsibilities, but there are conventions, at least.

Distros also use centralized distribution, for the most part: when you install an Ubuntu package, you download it from an Ubuntu server, using an Ubuntu package manager. This simplifies finding and installing software, obviously. You don't have to look very far to find any given program, and you're assured that what you're installing is actually the program, and not some random virus. The organization behind the distro also has to provide servers, of course, but this isn't too much of a problem. Bandwidth is cheap these days, and for a distro of any significant size, there are plenty of people willing to donate a spare server or two.

As for the disadvantages, centralized software creates a barrier to entry. Anybody can write a program for Linux, but actually getting it into the distro repositories takes a certain amount of recognition, which is more difficult to gain without being in the repos in the first place. The result is that there's a lot of software out there that doesn't exist in any repository. Users generally don't like to install software outside of the distro package managers, because when you do, you don't get any of the nice features (such as, oh, being able to uninstall the software) that the package manager provides.

Distributions also get saddled with the unenviable job of cataloguing every useful piece of software for Linux that people have written. This takes a huge amount of developer effort; Gentoo, for instance, has hundreds of people (the vast majority of its contributors!) dedicated to just maintaining the programs you can install. We can really take a more general lesson from this: When you try to centralize something which is, in its natural state, decentralized, it's an expensive and ongoing job.

But pros and cons aside, I said earlier that Linux distributions do this out of necessity, not just because they want to. If you write a piece of software for Windows, Microsoft has a lot of people working pretty hard to ensure that it'll work on future versions of Windows. Backwards compatibility is intricate, uninteresting work. Since Linux is written mostly by volunteers, it's exactly the sort of work that never gets done, because nobody really wants to do it. The result is that backwards compatibility on Linux is a bad joke. Developers break compatibility, often gratuitously, often without warning, and so Linux software needs to be constantly maintained or it simply ceases to function.

In an environment like that, you absolutely need somebody doing the hard work and making sure a piece of software still works every once in a while. That job falls to the distros, because the original authors of the software don't always care that it works outside of the configuration that matters to them. Look at it from the distros' perspective; if you're trying to make a coherent system, but the components of the system are prone to randomly break when you upgrade them, you need to maintain centralized control if you want to have any hope of keeping things stable. In other words, the lack of backwards compatibility on Linux forces distros to centralize software distribution, and do a lot more work than they would otherwise.

These posts are supposed to be case studies in decentralization, so I'll summarize. The difference between Linux and commercial platforms is the degree of backwards compatibility in the system, and that degree of compatibility determines the amount of control you need to make sure the system works as a coherent whole. With Linux, the need for control is much higher, so distributions are pushed towards a centralized software distribution model.

Thursday, November 19, 2009

Chrome OS

I just watched the live webcast announcing Google Chrome OS. I was expecting a lot from Google with this, but they've gone even beyond that; this announcement is serious business. They're talking about fundamentally changing the way people use computers.

First, the basics: Chrome OS is exactly what it sounds like. It's an operating system that boots directly into Chrome. The OS is a stripped down Debian install, but that doesn't really matter, as we'll see in a bit. Everything happens through the browser window - there's a file browser built into the browser, for instance. The start menu equivalent (of course there's one) is a Chrome logo in the top left corner of the browser. There's no desktop, no My Computer, nothing else - just Chrome.

This brings us to the first major difference between Chrome OS and other OSes. There are no applications to install; everything you could conceivably want as an application is a web application. They make this a bit easier by pinning some shortened tabs ("application tabs", they call them) at the front of the tab list, so that you have one-click access to your Gmail, for instance. Obviously, this is a pretty radical design choice. The emphasis is definitely on shifting to online services for everything, rather than using desktop applications - and, not at all coincidentally, Google has spent years polishing their online versions of desktop applications. (They showed the online version of Office during the demo, but it looked terrible. Half the screen was taken up by the controls. I can't see how it's a serious competitor to Google Docs, in its current incarnation.)

Not only does this moot one of the usual objections to Linux ("I can't run my apps on it"), it dramatically simplifies securing the system. My understanding is that Chrome OS has a small number of signed and/or whitelisted programs that it runs, and the system can assume that anything outside that set is a virus. This is such a fundamentally powerful idea that I'm surprised it took this long for somebody to try it out in a consumer OS. Chrome OS then takes it to the next level by signing the kernel and OS, so that there's (probably) no way at all for malicious code to get in. Their goal is for you to be able to trust that if it boots Chrome OS, it's safe to use. As for updates, it automatically updates itself in the background - this is a difficult thing to get right, but Google is more likely than most to pull it off.
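The whitelist idea fits in a few lines of Python - this is a toy model only (the real verified-boot machinery involves signatures and a chain of trust, not a flat hash set), but it shows why "everything else is a virus" is such a cheap check to make:

```python
# Toy whitelist: the system knows the hashes of the few binaries it
# ships, and refuses to run anything else. No virus definitions, no
# heuristics - just "is this exact binary one of ours?"

import hashlib

def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

# Stand-ins for the contents of the binaries the OS ships with.
shipped_binaries = {b"chrome-binary-contents", b"updater-contents"}
whitelist = {sha256(b) for b in shipped_binaries}

def may_execute(binary: bytes) -> bool:
    return sha256(binary) in whitelist

print(may_execute(b"chrome-binary-contents"))  # True
print(may_execute(b"totally-legit-toolbar"))   # False
```

A general-purpose OS can't do this, because users install arbitrary software; it only works once you've decided there are no local applications to install.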

Because there are no local applications, they can get away with having no local storage. This bears repeating, because again, this is pretty radical: you can't save files to the hard drive. You can use flash drives and stuff like that, which makes sense, but the OS drive itself is locked down and mounted read-only. This will be one of the more controversial decisions, I'm sure. It forces you to store all your files and settings in the "cloud", which makes migration easier, but is probably going to be kind of a pain.

I'm not totally clear on how it handles user accounts, but my impression is that you'll be able to sign in to any Chrome OS machine using any Google account, and have it work the same. This is an incredibly powerful idea! It essentially means that you're completely independent of the hardware you're using. If your computer explodes, or you spill apple juice on it, or it's stolen by pirates, or whatever - no problem, it's easy to replace. This ties in with the no-local-files thing - if all of your files are already in the cloud, then it makes sense that you'd be able to log into any Chrome OS machine and have it work the same as your own.

A word on the cloud: This is going to be a sticking point for some people. When I say the "cloud", what I really mean is Google's servers - I doubt they'll allow you to switch to other providers, if that's even feasible. There are legitimate user interface reasons for doing it this way, but it still has the potential to be a privacy nightmare, not to mention the power Google is going to have once they hold everybody's user data. Again, this is one of those things that Google can get right if anybody can; the question is whether or not anybody actually can.

Another word on the cloud: When your internet connection is down, or when you're on an airplane, or when Google's servers go down (it's happened before, it'll happen again), your Chrome OS computer is going to be basically useless. People asked about this in the Q&A, and the answers boiled down to "HTML 5 lets you use web apps offline" and "Google has better uptime than most personal computers, so nyah!" It's not really a satisfactory answer, since HTML5 offline storage is only useful for web apps that have been specifically designed to work offline. In the end, Chrome OS won't be that useful if you're someplace without free and easy Internet access.

Random thought: It's accepted among security experts that local access = root access, for a sufficiently determined attacker, for lots of reasons. Google has taken some steps to prevent that (signed OS is a big one), but it remains to be seen whether it's enough. There are a lot of really devious attacks that you can use if you have unfettered local access.

So what's the big idea?

The paradigm that Google is aiming for with this is something called "appliance computing" - treating the computer like any other appliance. You have a basic expectation that a refrigerator, for example, will just work, without requiring constant maintenance. Appliance computing is when computers are that simple to use. This is something that people have wanted for a long time, but that the existing model of computing has made really difficult. Designing an OS that doesn't require administration is hard, but based on the info I've read so far, it seems like Google might have pulled it off. Designing a truly secure OS is really hard, but while I'm never willing to bet against hackers, I think Google has at least done better than anybody else who's tackled this problem.

Their goal with Chrome OS is netbooks and things like that, which makes sense. While Chrome OS looks nice, I wouldn't ever use it as my primary operating system. There's a certain level of control over my systems that I prefer to have, and one of the key goals of Chrome OS is shifting most of that control (and the associated responsibility) to Google. As a spare system, on the other hand, Chrome OS will be really useful, and that's how most netbooks are used these days anyway.

Wednesday, November 18, 2009

Death of Advertising, round 2: personal RFP

I've blogged on this topic once before, but I didn't have as much to say.

Advertising is annoying, isn't it? It sometimes feels like an overenthusiastic robot salesman that always follows you around, interjecting every few minutes about so-and-so product that people who fit your profile might enjoy. Not only is it annoying, it's remarkably inefficient - I don't care about 99.5% of the ads I see on a daily basis. This really sucks!

So, why not kill the ad industry?

I've wanted to say that for a long time, but until recently I couldn't back it up. If we want to get rid of advertising, we need to replace it with something better, otherwise nobody is going to listen. This is setting a remarkably low bar, actually: "we need to improve on a universally-reviled system." It's actually pretty surprising that nobody's managed to do this so far.

Enter Project VRM. You may have heard of CRM (customer relationship management); VRM is vendor relationship management - basically CRM in reverse. They've floated the idea of a "personal RFP" (request for proposals): instead of being advertised to, you put something up on a website when you need something, detailing your requirements, and vendors come to you. Think of it as a reverse eBay.

This is a pretty disruptive (and therefore awesome?) idea. It could completely change the dynamic between buyers and sellers, for the better on both sides. If I'm a buyer, I don't have to spend as much time on comparison shopping - options are brought to me, instead. If I'm a seller, I don't have to waste time and money on broadcast advertising - I can just search for people who want what I'm selling, and contact them directly. I can also customize my offer to each individual, something which companies would love to do right now, but can't because broadcast advertising is such a limiting medium.

One problem on the way to adoption could be the chicken-and-egg problem. There's not really any reason for buyers to use something like this until there are sellers, and vice versa. We could get out of this by crowdsourcing comparison shopping, at least initially - having people look at requests, and finding the best deal online that matches, in exchange for a cut of the sale. This would probably even happen spontaneously, as long as the service doesn't explicitly discourage it.

Would this actually kill advertising? Maybe not, but it would certainly change the nature of it. Right now, advertising has two major functions: convincing you that you need a product, and directing you to someplace you can get it. The former goal wouldn't really be affected at all; only the latter would change. All isn't lost, though. In a perfect world, people would see the advertisements, consider them, decide they want the product... and instead of clicking on them, go to a personal RFP site, and get it that way. If this takes off in any meaningful way, we could see advertising take a huge hit.

Tuesday, November 17, 2009

Decentralization V: Regulating thepiratebay

So I heard today that thepiratebay is shutting down their tracker for good, and shifting to using DHT exclusively. Hooray, this blog deals with current events for once!

There are plenty of arguments on both sides of centralized trackers versus trackerless DHT. Centralized trackers are much simpler for everybody: they let you keep statistics on your torrents, and they let you control who can access a torrent. DHT is more robust, since it can't be taken down except by extraordinary measures, and cheaper, since that's one less server you need to keep running. More importantly, in this case, it also lets you spread the legal liability around, diffusely enough that the network can't be taken down by lawsuits.
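For the curious: BitTorrent's trackerless mode is a Kademlia-style DHT, where nodes and torrent infohashes live in one shared ID space, and the nodes "responsible" for a torrent are simply the ones closest to its infohash by XOR distance. A toy sketch (real IDs are 160-bit, not 4-bit):

```python
# Kademlia-style lookup in miniature: "who stores the peer list for
# this torrent?" becomes "which node IDs are XOR-closest to the
# torrent's infohash?" No tracker needed - any node can compute this.

def xor_distance(a: int, b: int) -> int:
    return a ^ b

def closest_nodes(node_ids, infohash, k=3):
    # The k nodes that store (and serve) peers for this torrent.
    return sorted(node_ids, key=lambda n: xor_distance(n, infohash))[:k]

nodes = [0b0001, 0b0100, 0b0111, 0b1010, 0b1110]
infohash = 0b0110
print(closest_nodes(nodes, infohash))  # [7, 4, 1]
```

Because responsibility is a pure function of IDs, there's no single machine to sue or seize - which is the whole point of the switch.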

Let's be honest, that's the real reason TPB is doing this. For all their talk of this being the right technical decision, and not motivated by the current lawsuits, the fact is that they're potentially facing a lot of liability because they run a tracker in addition to indexing torrents. Switching to DHT exclusively means that they're free from that, and their defense that they're equivalent to a search engine has a better chance of working.

So here's the interesting point. If you want to shut down a website, it's relatively easy to do - there are several centralization points, such as the domain name, or the server it's hosted on, that are owned and paid for by people, and people are subject to the law. A fully decentralized service, on the other hand, is much more tricky. Take Freenet, for example. The system is specifically designed to be completely anonymous, secure, and deniable - if you're careful, it's next to impossible to prove that you downloaded or uploaded something from or to Freenet. I'll be blunt here - child pornography is traded occasionally on Freenet, and while there are a lot of people that would like to shut it down, for good reasons, it is basically impossible. If you want to shut down BitTorrent DHT, or the Freenet network, you're basically out of luck.

Here's the relevant question, then. How do you regulate a completely decentralized system? Is it even possible? I would argue that, with the Internet's current architecture, it's not. This is a huge deal - right now, it is possible to build completely secure and untraceable communication networks, at next to zero cost to yourself, which cannot be taken down by anything less than the scale of a massive military operation. It doesn't even have to have a lot of people using it. Take, as another example, any of the various mega-botnets running around these days. These are networks composed of millions of computers, next to impossible to shut down, where almost all of the participants in the network are there involuntarily.

What does it mean for society, now that communication has the potential to be completely unregulable? How do we shut down a terrorist cell, when they talk to each other over encrypted VoIP instead of cell networks? (Actually, that's not much of a problem. Despite what the government tells you, most terrorists would be lucky to set off a firecracker, much less talk about setting one off over VoIP.) Do laws about libel still mean anything, if speech can be truly, untraceably anonymous? What about copyright law? Now I'm drifting into old questions, though.

There are interesting parallels between legal regulation and hardware failure, actually. Decentralization, by protecting against the latter, turns out to prevent the former very effectively as well.

Monday, November 16, 2009

Things I nearly forgot today

* To get out of bed.

* To wake up, once I got out of bed. It took a few extra hours.

* Today is Monday. Crap.

* To finish writing my statement of purpose for grad school apps. I've had that 95% complete for way too long.

* To do other stuff with my grad school apps.

* What I'm doing here.

* To avoid like the plague.

* What we're doing for our stats project. Luckily, one of my group members remembered! (The other found out for the first time today, I am pretty sure.)

* That I needed to pick up my sister after I got done with the stats group meeting.

* That ice cream doesn't actually count as dinner, no matter how much I want it to.

* To eat dinner. (Fried rice from home, plus Sriracha, eventually!)

* To do laundry. No, wait, put that under things that I actually did forget.

* To update this blog.

Sunday, November 15, 2009

Decentralization IV: Nonresident societies

Wouldn't it be neat if we could be citizens of the Internet? And not just because it'd vindicate this xkcd. More and more, we can live our lives almost completely online (Second Life being the classic-if-somewhat-irrelevant example). So let's take that to its extreme: what would the consequences be if we really could become citizens of the Internet? Would it even make sense?

(There are obvious problems with treating the Internet as a place, of course, and I blame sci-fi for a lot of them, going back to Neuromancer, and probably well before that. Sci-fi has stuck us with the gigantic lie that is "cyberspace", and convinced a lot of people that would otherwise know better that the Internet is a place, filled with things, which are separated by distance, and separated by [fire]walls. All of this is complete nonsense.)

What I'm talking about here is decentralized governance, maybe, until I find a better word for it. All governments and countries today are centralized in a really important way: centralized jurisdiction. The government has control, imperfect though it may be, over your presence in the country at any given time. They can force you to stay or to leave, and it's difficult to stay or leave without the government's tacit approval. There's a rough balance between the government's jurisdiction over you, and your ability to travel off the grid. This is enforced by the laws of physics, on some level: if we could teleport, for instance, or become invisible, the balance here would be very different.

Whenever there's a limitation imposed by the laws of physics, we can expect technology to take a crack at it. High speed transportation tests the balance: with the advent of air travel, for example, you can pass through a country without ever being in it in any meaningful sense. In the end, though, the balance doesn't shift much, because governments can simply step up control of airports, and pull it back in the other direction. Fundamentally, it's still the same situation.

Jurisdiction is important. It forms the basis for both laws enforced by governments, and services provided by governments. A military, for instance, protects a territory. It wouldn't make sense for countries to have militaries, unless there was a clear-cut equivalence between a government and the territory it owns. With an Internet-based government, though, you would lose that equivalence. When every citizen of a country can leave at will, simply by disconnecting, there are many services that it's impossible to provide.

You should be objecting at this point that this is all moot, since any Internet citizen would also be a citizen of whatever country they happened to be living in offline. While this is true now, it's not a necessary condition. We could imagine a region, with minimal (or without any) governance, designed specifically for people from various Internet nations to coexist. This could become arbitrarily ludicrous; imagine being a police officer in such a place, and not knowing whether or not any given person you saw was within your jurisdiction. Or, imagine trying to collect taxes from citizens that suddenly switch their allegiance for a few weeks every April.

Since I'm running short on time now, I'll just go ahead and assert that an Internet nation can only regulate what happens on its own servers and systems. This is not entirely useless, and a lot of services (especially identity-related services, which every government in the world is currently sucking at) could be provided this way. Even so, at this point we're talking about such a watered-down and neutered conception of citizenship that an Internet nation ceases to have any meaning at all.

Exclusive Internet citizenship is pretty much a bust. All is not lost, though. We could imagine a form of Internet dual-citizenship, where an online "virtual country" provides some additional services, and works with governments to provide them across national borders. (Actually, I really hope the phrase "virtual country" doesn't ever catch on. I'm starting to hate it already. >_>) This would represent a hybrid decentralization of government - doing what can be done in a decentralized way, but falling back to the existing central government for everything else.

Saturday, November 14, 2009

College Puzzle Challenge non-post!

Today is the College Puzzle Challenge! Non-stop puzzle solving from 11 AM to 11 PM, it'll be awesome. But, if you're expecting a real post from me, you're crazy. :p

Instead, I have a fortuitous link or two!

These two blog posts look at corporate politics via The Office. Even if you don't watch The Office, they are filled with seriously awesome stuff! But you should maybe consider watching The Office anyway. <_<

Friday, November 13, 2009

Review: Sony Reader PRS-500

Somebody asked me to do a review of this, since I occasionally talk about how awesome it is. (My theory is that, because it's a relatively niche product, Sony forgot to have their crack team of anti-engineers work on it and cripple it in some way.)

So what's good about it? Mostly, it rides on the strengths of the e-ink screen. The battery life is simply ridiculous - I only have to charge it every few weeks, or every few days if I'm reading non-stop. In terms of books, the battery lasted long enough to get through Anathem (nearly 1000 pages) on a single charge. E-ink only uses energy when it updates the screen, so it's perfect for this kind of device. For people that say that an iPhone is all the ebook reader they'll ever need: yeah, let's see your iPhone last this long.

Physically, it's comparable in size to a really thin paperback - it's a bit larger, thinner, and heavier. It'll even fit in your pocket, if you have big pockets. It comes with a fairly sturdy cover, which is pretty nice - as long as you don't manage to snap it in half or something, it's almost as sturdy as a real book. There are multiple buttons you can use to flip pages, which is convenient if you want to hold it in different ways. It connects to a computer using a normal mini-USB plug.

The contrast on the screen could be higher, but it's still perfectly readable in most lighting. Bright light definitely helps, though - imagine a book printed on medium gray paper. That's about how reading the screen feels. You can adjust the font size, but I usually keep it on the smallest setting so I can see more text at a time. You can't change the font that it uses, which would be nice, but isn't really essential. The screen has 170 DPI, which I sometimes wish was higher, because the edges of the letters look jagged if you look really closely. It's not bad enough to be distracting when you're reading, though. Overall, the reading experience is pretty good - it's much easier on the eyes than an LCD screen.

It has about 100MB of usable storage built in, which is enough for several dozen ebooks - I haven't even come close to using up the space yet, since I usually delete books from it when I finish reading them. It also comes with an SD card slot, though, so if you wanted to you could put thousands of books on it at a time, and swap out SD cards for even more space. Basically, for all intents and purposes, you can treat it as having unlimited storage.

One thing I wish it had is a way to jump to a specific page. It's usually pretty good about remembering your page, but if you lose it somehow it's a huge pain to get back to it. It'd also be nice if you could do more from the interface: you can't delete books, for instance. For that, you have to use Sony's provided software, which is a gripe in its own right, because it is kind of shittastic. It took me forever to figure out how to even copy a book onto the device.

This is largely mitigated, though, because there's a pretty good open source replacement for it, called Calibre. Not only does it actually work (on Windows, Mac, and Linux, no less), it also handles conversions between different ebook formats pretty smoothly. The only thing you might conceivably need Sony's software for is firmware updates, and since I'm pretty sure they've stopped supporting the PRS-500, that's not a huge concern anymore. >_>

One more thing: Sony is launching a bookstore based on EPUB, but for some inexplicable reason, they've decided that the PRS-500 isn't important enough to update with EPUB support. I applaud their zeal in trying to retroactively screw over this device, but since EPUB DRM has already been broken, I'm not anticipating too much trouble converting to an older format.

Edit: Wow, I should complain about stuff more often! I just saw via mobileread that Sony will offer a free upgrade to PRS-500 owners. XD

Thursday, November 12, 2009

Decentralization III: B.Y.O.I.

There's one question you can ask of any new Internet technology that can usually predict whether or not it has any chance of succeeding as infrastructure: "Can I just run my own, and have it work the same way?"

This sort of decentralization is common to nearly all successful Internet-scale technologies. I can run my own Web server, and anybody can access it just fine - I don't have to get permission from the Web Corporation, or run my website on hardware provided by them, or anything like that. Same with e-mail - anybody can run their own mail server, and send messages to anybody (unless their ISP blocks it, which many do these days). Same with XMPP - anybody can run their own XMPP server, and use it to chat with anybody else in the world using XMPP. Same with Google Wave - Google knows what I'm talking about! They designed it to run in a federated model, so that people can set up their own Wave servers, and communicate with people using other ones. Same with dozens or hundreds of other protocols.

Of course, there are some notable exceptions, in every direction. Google is arguably Internet-scale, even though they're a centralized service. Internally, though, they've got decentralized infrastructure all over the world. Basically, they've gotten some of the benefits of decentralized infrastructure, by paying for all of it themselves. IRC, on the other hand, is sort of a counterexample: even though it has decentralized infrastructure, it scales pretty poorly - if you're on one IRC network, you can't talk to people on other networks. It's a tradeoff, really. IRC servers have enough problems to deal with even without having to relay messages between separate networks.

Twitter's a counter-counterexample, because I love to hate on Twitter: They fancy themselves to be an Internet-scale service, but I don't believe they can scale to match the demands that come with truly being Internet-scale. (They're still getting taken down by DDoS attacks every once in a while. Can you imagine the kind of DDoS it would take to knock Google offline?!) Clever engineering may save them yet, but I wouldn't count on it.

As a counter-counter-counterexample (can he do that? :O), we have Identica and the OpenMicroBlogging network. This is a Twitter-like service that actually can scale, because it's designed to be usable across multiple servers. For example, if I have an account on Identica and somebody else has an account on Brainbird, I can subscribe to them and everything just sort of works. I guess that instead of being a counter^3 example, this is just an example: anybody can run their own instance and have it work with everybody else's.

The BYOI (bring your own infrastructure) approach works. But what are the tradeoffs? Decentralization in these cases brings scalability and robustness, both of which are really important for systems that are trying to gain traction. On the other hand, you lose centralized control. This makes it more difficult (next to impossible, in some cases) to update the protocol, and it means that some unwanted uses of the system (such as spam) are impossible to control.

It also means you have to subdivide your namespace, and this can be a tricky problem. With Twitter or AIM, you can talk to people using just a username. With email or XMPP, you have to specify what server they're on as well, because people can use the same username across different servers. People have tried to design decentralized systems that use a centralized namespace before, but it's a fundamentally hard problem to solve without compromising your design. I would argue that DNS (the system that maps domain names to addresses) has managed it - whenever you type in a website address, you're pretty confident that anybody else in the world will see the same website, because it's a centralized namespace on a decentralized system. On the other hand, they only managed this by charging money for domain names, which isn't something that'd work well in other places.
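
To make the two-part namespace concrete, here's a minimal Python sketch of federated address routing (the server directory and the `route` helper are hypothetical; a real system would resolve the domain half with a DNS lookup): the username only has to be unique per server, and the domain tells you which server to deliver to.

```python
# Hypothetical directory of federated servers; in reality this lookup
# would be a DNS (or SRV record) query, not a dict.
SERVERS = {
    "example.org": "192.0.2.10",
    "brainbird.example": "192.0.2.20",
}

def route(address):
    """Split user@domain and decide which server gets the message."""
    user, _, domain = address.partition("@")
    if not user or domain not in SERVERS:
        raise ValueError("unroutable address: " + address)
    return user, SERVERS[domain]

# The same username on two different servers is two different people.
print(route("alice@example.org"))        # ('alice', '192.0.2.10')
print(route("alice@brainbird.example"))  # ('alice', '192.0.2.20')
```

A bare "alice" is ambiguous in this world, which is exactly why email and XMPP addresses carry the server name around.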

Finally, for a system based on a decentralized protocol, you get some level of additional security over time. There are two classes of security holes: holes in individual applications, and holes in the protocol itself (the latter being much rarer). With a centralized system, these have basically the same impact, since there's really only one instance and one implementation. With a decentralized system, there are usually a few major implementations, and a lot of minor ones around the edges. If tomorrow a major bug was found in BIND (the most common DNS server) and all BIND servers had to be taken down until it was fixed, the Internet would mostly continue to function. You can't get that with a centralized service.

Tuesday, November 10, 2009

PiCoWriMo

As some of you are intimately aware, November is National Novel Writing Month! One of these years, I'm actually going to do it, but right now is really a bad time I think. Maybe next year. Several of my friends are doing it, though, so this is a tribute to them!

A NaNoWriMo is a bit much for me right now, but thanks to the magic of the metric system, I can at least manage a PiCoWriMo. So here is a fifty word story:

The suburban lights sprawled beneath them, in jarringly uniform rows, like government-issue constellations. As they fell, they tumbled, and glimpsed for a moment their makeshift dirigible, scudding silently against the Moon's halo. They landed hand in hand, mad grins on their faces, as noiselessly as two trees in the forest.

Monday, November 9, 2009

Default applications on Linux

Guys, I'm interviewing with Microsoft today, wish me luck! To commemorate this, I'm going to do a tribute to the anonymous Linux Hater's excellent blog.

How to set default applications

On Windows: Right-click on a file of a given type, go to "Open With", and set the application that you want it to open with.

On a Mac: Pretty similar, except you use "Get Info" instead of "Open With".

On Linux: It depends on what desktop environment you're using. Gnome, KDE, XFCE, and every other desktop each do it in a completely different, completely uncoordinated way. Also, if you switch desktops, you lose all your settings.

Q: But doesn't that make people's lives more difficult?

A: Linux is about choice! Specifically, the choice we made to do whatever the hell we wanted.

Q: But what if I want my program to set the default application for a certain filetype?

A: Oh, you should never need to do that. We won't tell you what you should do instead, but we're very sure that you'll never want to do that.

Q: Wait, I thought Linux was about choice! What about my choice as an application developer?

A: Did we say that? Actually, it's just an excuse we use to avoid any of the hard work that might be involved in actually standardizing things. See, because when we don't, users have to make choices they otherwise wouldn't need to worry about, but this way they feel better about it. Choice, baby!

But wait, never fear! There's the Portland Project, which had the lofty goal of unifying Gnome, KDE, and all the other desktops out there! Surely, an effort this important to the success of the Linux desktop would receive the huge amount of developer attention it needs to become successful! Oh, wait, my bad, I'm thinking of a less dysfunctional operating system. Actually, xdg-utils (the result of the Portland project, which was supposed to unify all this) is buggy, mostly unmaintained, and hasn't actually seen a release in years. On top of that, instead of being an awesome unifying solution for the completely separate default application systems that exist, it's just a few shell scripts that support Gnome, KDE3, and sometimes XFCE. There's not even a generic fallback mechanism in there! It's completely useless for users of anything other than Gnome and KDE (and really, XFCE too).

Wouldn't it be nice if there was a tool that would let you just go to a command line and type "open whateverfile.whatever"? All the desktops could standardize on it, users of other desktops wouldn't be left out, and everything would be simpler. I have been thinking about this for a while, and I even took a few shots at writing such a tool. So you can imagine how dismayed I was when I discovered that this tool already exists on Macs and it works perfectly. You'd think the Linux community would have been all over that; ripping off Apple is what they do best!

Sunday, November 8, 2009

Decentralization II: currency

There are two major systems of currency in this country right now. (There are probably others that I'm overlooking, but we'll simplify.) I'm referring to cash, and credit cards. Ostensibly, these two are a single currency, since they're readily interchangeable and tied together, but they're clearly two separate systems.

Cash is a decentralized system. Any given bill holds its value more or less independently of its surroundings, assuming of course that the government is still around. Credit and debit cards, on the other hand, form a centralized system - your credit card may have a fancy picture on it, but unless whoever you're trying to pay has a dedicated communication channel with the credit card company, it's just a very pretty piece of plastic. This distinction is made clear in several ways.

First, authentication of value: Cash uses a relatively weak distributed authentication scheme - the bill or coin only has to look valid, which can be made difficult, but never impossible. This only works because breaking the authentication (counterfeiting) on a large scale can be expensive, and each individual break only nets you a relatively small payoff (the value of the coin or bill). Credit and debit cards, on the other hand, use relatively strong authentication for value - you contact a centralized server, which can keep track of all currency in the system. This is far more difficult than the decentralized cash model, but it's also next to impossible to break, assuming the server is written securely. (Note that authentication of value isn't the same as authenticating the owner of the money - both cash and credit cards are pretty bad at this, in different ways.)

There's also the issue of robustness. It's not unheard of (even if it's not exactly common) for credit card processing to "break" at a given venue, forcing them to process all transactions with cash, or by other means. This can happen because credit cards simply don't work without being able to contact a central location to get the balance on the card, and subtract from it. Cash, on the other hand, is available for use under a much wider range of circumstances. Because it holds its value in a fully decentralized manner, it's not subject to any kind of communication breakdown, and so it works as long as people are willing to accept that it has value. (Note, though, that the authentication is weaker because of this - thus, counterfeiting is possible for cash, while a fake credit card wouldn't get you very far.)

Cash also has the property that, since no centralized communication is required for its use, it can be used without being tracked. In other words, cash can be completely anonymous. With credit cards, there's a single point through which all transaction data flows, and this in turn means that data about when the card is used can be collected very efficiently. This is another property that applies in general to centralized systems, but it can be applied to decentralized systems too - it's just harder. We could imagine, for instance, a system in which people were required to scan all currency that they handled into some sort of centrally controlled device, to prevent counterfeiting. This could be circumvented, since the scanning step isn't required for the transaction, but it's a way to graft centralized notions of control onto an otherwise distributed system.

This is an important point, actually. In a lot of cases, centralized systems can have decentralized aspects added, or vice versa. With money, for instance, it's all printed in a few centralized locations, and this contributes to it having value: if anybody could print their own money, it'd be worthless. When it comes to distributed systems, introducing centralization can be a powerful technique, if the tradeoff is worth it.

In other news, I haven't figured out everything I want to do for the rest of these, so I just might take requests. XD

Saturday, November 7, 2009

Decentralization I

Which is better, centralization or decentralization? I've been giving this question a lot of thought lately, in the context of distributed systems, and I've come to realize a few things.

First, the question applies to more things than you'd expect. For once, I'm not going to spend the entire time talking about tech - command and market economies, for instance, can be understood as centralized and decentralized systems, respectively, and analyzed as such.

Second, there are fundamental limitations in both directions. For example, a fully centralized system will always undergo some sort of failure eventually, thanks to Murphy's law, while a completely decentralized system has to deal with malicious individuals, asynchrony, and other such issues that keep people that work on distributed systems awake at night.

Third, it's not an all-or-nothing question; it may not even be a smooth gradient. In addition to the tradeoff between centralization and decentralization, we need to consider hybrid decentralized systems, which have proven to be a good compromise. Popular examples include e-mail, and XMPP (aka Jabber, aka Google Talk).

Over the course of this month, I'm going to be writing a series of posts about decentralization. Many of them won't even be about computers, but about other systems that I'm looking at from this perspective. I think that there are a few core principles that account for the difference between centralized and decentralized systems, and I'm trying to tease out what those are.

Friday, November 6, 2009

The OOM Killer

There is a small but vocal contingent of Linux "advocates" that are only too happy to tell you that Linux is super-awesome and will solve all your problems. "Linux will do everything that wind0ze will do, only better, and it'll do it for free!" and on and on like that.

I'm not writing this post specifically to annoy people like that, but it certainly wouldn't hurt. :3

So what's this "OOM killer"? It's something that Linux fanboys generally don't like to talk about, assuming they even know it exists. Let's get some background first, on memory allocation.

When a new process is created on Linux (by calling fork()), the operating system needs to copy all the memory from the parent process into the new process. It does this because, following the traditional process abstraction, the child process needs to inherit all the data from the parent, but also needs to be able to modify it without messing up the parent. Linux uses a technique called "copy on write", or COW, to do this really quickly. The trick with COW is that, instead of actually copying the data, you just mark it read-only, and point both the parent and the child at the same copy. Then, if either of them tries to write to it, you copy it secretly and pretend that it was a writable copy all along. This works really, really well, since the vast majority of memory ends up never being written to for various reasons.
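
You can watch the process-copy semantics (which COW implements cheaply under the hood) from userspace. Here's a small Python sketch using os.fork - Unix-only, and fork_demo is just a name I made up for the demo: the child mutates its logical copy of the data, and the parent's copy is untouched, even though the kernel only physically copies the affected page at the moment of the write.

```python
import os

def fork_demo():
    data = ["original"]
    r, w = os.pipe()  # so the child can report what it sees
    pid = os.fork()
    if pid == 0:
        # Child: write to our (logical) copy of the parent's memory.
        # COW means the kernel physically copies the page only here.
        data[0] = "modified"
        os.write(w, data[0].encode())
        os._exit(0)
    # Parent: the child's write never touched our copy.
    os.close(w)
    child_view = os.read(r, 64).decode()
    os.close(r)
    os.waitpid(pid, 0)
    return data[0], child_view

print(fork_demo())  # ('original', 'modified')
```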

Normally, Linux handles running out of memory by returning an error to the process that tried to request more. This works reasonably well. There's an unfortunate tendency among programmers to ignore the result of malloc, though, which means that some programs will start to randomly crash when you get close to running out of memory. The point is, though, that at least there's a way to detect the situation - carefully written programs can avoid crashing by checking the return value of malloc and reacting properly if the system is out of memory.
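
For comparison, here's what that "check your allocations" discipline looks like in practice. In C it's testing malloc's return value for NULL; Python surfaces the same condition as a MemoryError exception, which a careful program can catch and handle instead of crashing at some random later point (load_buffer is a made-up name for this sketch).

```python
def load_buffer(size):
    # The Python analogue of checking malloc's return value:
    # report failure to the caller instead of letting the
    # program die somewhere unrelated.
    try:
        return bytearray(size)
    except MemoryError:
        return None

buf = load_buffer(1024)
if buf is None:
    print("out of memory - backing off gracefully")
else:
    print("allocated", len(buf), "bytes")
```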

But there's another problem here. Notice that with COW, the actual memory allocation happens at some random time, when you try to write to some random variable. If the system is out of memory, and has to make a copy, then you've got a problem. (Ideally, you'd make sure that there's enough free memory when you create the process, but then you're wasting a lot of memory - can't have that!) You can't just tell the program that you couldn't allocate memory, because the program didn't try to allocate memory in the first place! You have an error that can't be properly handled. So, Linux handles this situation with... the OOM killer.

The OOM (out of memory) killer does exactly what it sounds like: when you run out of memory, it kills a random process so that the system can keep going. They've developed some rather elaborate heuristics for how it selects the process to kill, so that it's less likely to be a process that you really care about, but as described in this awesome analogy, that's somewhat akin to an airline trying to decide which passengers to toss out of the airplane if they're low on fuel. No matter what you do, the fact remains that you've gotten yourself into a bad situation.

I've seen swap suggested as a solution to this, but that's basically just saying "don't run out of memory in the first place" - it's not terribly helpful. The fact is, no matter how carefully you write your program, and how meticulous you are about checking for errors, there's still a chance that your program will crash randomly, for no reason at all. Yay, Linux!

Thursday, November 5, 2009

An Egg-and-Chicken Situation

I was reminded today that, while it's perfectly obvious to me that the egg came first, there are a lot of people that still aren't convinced. So, without further ado...

Initially, we can trivially say that all chickens were preceded by dinosaur eggs. This is no fun though, so I'll go ahead and strengthen the "paradox": which came first, the chicken or the chicken egg?

We define a creature as a "chicken" based on its DNA being sufficiently close to the modern species of chicken.

First, we take it as a given that any creature sufficiently chicken-like to be called one will have hatched from an egg. (Indeed, if a chicken-looking creature was born by some other means, we would probably not accept it as a chicken - and this is a proof in and of itself, albeit a less interesting one.) Thus, every chicken is preceded by at least one egg. However, this does not preclude an infinite cycle, which is the source of the paradox.

We next note that, because a chicken does not have the same DNA as either of its parents, and we're defining chicken-ness based on DNA, it's possible that a chicken could be born where one or both of its parents are not-quite-chickens. Furthermore, I assert that, since there was a point in time at which chickens did not exist, this must have happened at least once. If we consider that first chicken, it had the same chicken DNA when it was an egg, so it was in fact preceded by a chicken egg. QED.