Sunday, November 29, 2009

Decentralization VII: Definitions

I started this series of posts to try and get a better handle on what decentralization is, what it's good for, and what general situations require it. The level of centralization is an important characteristic of systems, but people don't always understand the tradeoffs that exist there. I think, at this point, that it all comes down to the division of control.

The definition I think I'm going to use: A system is decentralized in a certain aspect if control over that aspect is divided up over the components of the system. It's too vague to say that a system is a decentralized system without specifying what aspect you're talking about. Systems can be centralized in some aspects and decentralized in others, and in fact I think nearly all systems are a bit of both.

Mini case study: Blogspot has centralized ownership (Google) and code, along with centralized authentication (Mostly Google accounts, I think?) and centralized execution (runs on Google's servers). On the other hand, it has decentralized authorship (contrast with a newspaper or something like that), and participates in several decentralized protocols (linkbacks, RSS, etc). Contrast with Wordpress, which optionally has decentralized authentication and execution (you can run it on your own servers), and somewhat decentralized (open-source) code. Because of this, we can say that, broadly speaking, Wordpress is less centralized than Blogspot.

Systems can vary in the degree that they're decentralized. Federated systems, which I've mentioned before, have a set of independent servers which are decentralized, but clients connecting to the system don't have to worry about this and can just pretend they're connecting to a centralized system. (Contrast with p2p networks like Gnutella, where every user on the network is simultaneously a client and a server.) Once we understand decentralization as a division of control, it becomes clearer what's going on here: control is being divided among only the participants in the system that want to run servers, and thus reflects the reality that only a few users will actually want the responsibility that comes with that control.

Can we divide up control in more complex ways? Yes, but there's not a whole lot of benefit to it, and it increases the complexity of the system significantly. Simplicity is inherently valuable when it comes to code; complexity is usually a problem to be eliminated. I can't think of any computer systems right now that have more complex control hierarchies than federated systems - they almost certainly exist, but they're not common.

So what are the tradeoffs? Control is important in systems for both technical and business reasons, which aren't always the same. Twitter would definitely be more reliable if they made it more decentralized (more on robustness in a bit), but it would also impact their (nebulous) ability to turn a profit if they gave up control.

Why are decentralized systems more robust? There are a lot of factors that come into play here, really, so I don't have a clean answer yet. Let's look at some failure modes of systems. System overload is less of a problem in decentralized systems, because they're necessarily designed in more scalable ways, which makes it easier to throw hardware at the problem until it goes away. Hardware failure is also less of a problem, because systems that execute in a decentralized manner can be made immune to individual pieces of hardware failing more easily, Software bugs are less harmful in systems with decentralized codebases, because different implementations of the same protocols rarely have the same bugs.

Maybe this is the answer: centralization implies control, and control implies an opportunity for the one in control to make a mistake. If we define robustness as the number of mistakes or hardware failures that would have to occur to take the whole system down, or better yet the probability of a sufficient number of mistakes happening, it's easy to see why decentralized systems are robust. Thought experiment: if we had a system that was decentralized in such a way that any failure would break the system, we would have a system where decentralization made it far more fragile, rather than robust. Luckily, you'd have to be crazy to design something like that.

This is the last post in this series, I think - I've found what I was looking for.

No comments: