Thursday, April 30, 2009

Blanket Licensing

There is a push, in some circles, for blanket licensing of digital music. Essentially, you would pay a small fee, and in exchange have the right to download as much music as you want. I have seen several of the most respected people in the "free culture" community, most recently Richard Stallman, endorse this approach, which is a shame. Not only is it a poorly-thought-out idea, but it would be far worse in many ways than the current system.

First, a necessary distinction - there are both voluntary and involuntary blanket licensing schemes being proposed, both idiotic. In a voluntary system, a few groups would collect payment in exchange for the right to download music, and consumers would choose one. Everything would be happy until some rightsholder got upset for some reason and denied one service the rights to their IP, at which point all hell would break loose, because this system only makes sense if every service has rights to every song. Can you imagine the confusion that would result if some services only had rights to some songs, and the set of songs they had access to wasn't even constant? Involuntary blanket schemes have the same problems, with the added downside that everybody, even people who don't like music, has to pay.

But, a music tax? That's kind of crazy, isn't it? Surely nobody would ever suggest doing something like that?

Here is a more troublesome problem. In a modern, decentralized filesharing network, it's impossible to accurately track what files are being downloaded. (BitTorrent is a notable exception here, but only because in its classic form it relies on central trackers that coordinate every swarm.) However, it's the job of the organization collecting the money to distribute it to artists based on popularity. Since popularity can't be accurately tracked, how are they going to decide who gets what? Since it's not really possible to do it in a fair and transparent way, what recourse will artists have if they believe they're being cheated? And then, worse yet, you end up with a situation where fans can artificially inflate the measured popularity of their favorite bands at no cost to themselves. Gaming the system would become an arms race, and eventually the measured download counts would have little to no bearing on reality. Furthermore, let's not forget that by centralizing, we've introduced a single point of failure. Any corrupt accountant could throw piles of money at whichever artists they happened to like, and no one would be the wiser. The fact is that it is completely impossible on multiple levels to ensure that the collected money will be distributed fairly to the artists.

Blanket licensing would also end up being far more monopolistic than the current system, which is actually a pretty impressive feat. Consider the plight of somebody trying to start a new collection organization. In order to be taken seriously, you'd need to acquire rights to every song available from other organizations; if the existing organizations have made deals which lock out newcomers (and they will, short of legislation to the contrary) then you're stuck. Only licensing some content won't cut it - if you have to give people a list of songs they can't legally download with your service, they'll just go elsewhere.

This would also be a death sentence for independent artists. An artist looking to strike out on their own would be in a difficult situation - everybody expects music to be free, so nobody would actually be willing to pay them. They would be forced to sign up with a label, which could then pay them next to nothing, since that's better than the nothing they would get on their own.

And then, this is just for music. Further on down the road, are we going to have another monthly fee for movie downloads? ebook downloads? software? accessing web pages that were previously free? Where does it end? And this is assuming that entire industries would be able to unite under a single banner; almost certainly, this is a ridiculously optimistic assumption.

At some point we, as a society, are going to have to come to terms with the fact that anything that can be represented as binary - all information, in other words - will be available for free to everybody on the Internet. We can hem and haw and decry the whole state of affairs, we can theorize about ridiculous payment schemes to assuage our own guilty consciences ("it's okay as long as somebody's getting paid!"), but at the end of the day, there's something we need to accept. The solution cannot possibly involve the current system of copyright, which is now so far out of touch with the way the world actually works that, were it suggested anew today, it would be regarded as a joke. Instead of molding society to the law, we need to find a system that actually works, and remake the law to match that.

Thursday, April 23, 2009

Notes from the switch to a command-line IRC client

For the past few years, I've been using XChat for all my IRC needs. It's been reliable so far, though I've had a few issues with the way it handles logs and configuration files. Every few months, though, somebody will tell me to switch to irssi or one of the other command-line IRC clients, because they're "better" in some vague, ineffable way. So, for the past week or so, I've been using WeeChat. Here are my thoughts on the transition.

WeeChat > XChat:
  • Because it's command-line, I can use it with screen, which is incredibly convenient when I'm out somewhere.
  • The control commands are a bit nicer than XChat's
  • Handles configuration really well - you can make changes through the interface, and save them back to disk with a single command
  • Lower memory usage
XChat > WeeChat:
  • Having a channel tree on the side of the screen is infinitely better than keyboard shortcuts for channel navigation. I've always said that the keyboard shortcut model for navigating between channels sucked hard, but now I have actual experience to back it up.
  • You can't click on URLs in a command-line client, unless the terminal has some special support for it, and the URL is short enough that it doesn't get wrapped in half
  • You can't copy/paste from a command-line client. I'm sure there's some kind of ugly hack to get around this, since it's a pretty basic thing to want to do, but there's no way to make it work cleanly.
  • It's impossible to get a command-line client to integrate with your desktop, so there's no way, for instance, to get a notification on highlights
  • I miss being able to scroll with a mouse wheel. Page Up/Down just isn't the same.
  • Many of the keyboard shortcuts are really obscure. Traditionally, this is where command-line client zealots say something inane about how I can change them to something better if I don't like them.

Friday, April 17, 2009

Decentralized Reputation II

(for latecomers - previous post)

I wasn't looking forward to coming up with rules for dealing with transitive trust, but as it turns out, I'm in luck - I stumbled upon subjective logic a few days ago. It basically adds a few new logical operators for dealing with opinions in a surprisingly clean way. This paper has a better explanation than the Wikipedia article, if you're curious. There are two key points that I'm going to make use of throughout this post: it's possible to reason about the amount of trust you should assign to opinions even through an arbitrarily deep trust chain, and in order to perform this reasoning accurately it's critical that you have access to the entire chain, not just the transitive opinions of people you trust.

Before I start, there are two things that you absolutely have to know about before this post will make any sense: cryptographic hashes, and public-key cryptography. (CS majors and other such people can probably skip this paragraph. :) A hash function takes some arbitrary data, and generates from it a fixed-size block of data. The hash needs to have some properties: given some data, you can easily generate its hash, but given a hash, you can't feasibly find some chunk of data which hashes to the same thing, or tell anything about the data that went into the hash. With public-key cryptography, you have two keys, a public one that you hand out and a private one that you keep secret, which correspond to two functions that are inverses of each other. Furthermore, you can assume that key pairs are unique. With public-key, you can do two useful things. If somebody encodes data with your public key, only you can decode it, which protects the data. If you encode data with your private key (that is, sign it), anybody can decode it with the public key you gave them; that protects nothing, but it proves that you are who you say you are, because only the holder of the private key could have produced something that decodes correctly.
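To make the hash half of that paragraph concrete, here is a minimal Python sketch using the standard library's hashlib. SHA-256 is just my pick for the example (nothing in this post depends on a particular hash function), and the `digest` helper is a made-up name.

```python
import hashlib

def digest(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hex string."""
    return hashlib.sha256(data).hexdigest()

short = digest(b"hi")
long_ = digest(b"x" * 1_000_000)   # a megabyte of input
near = digest(b"hj")               # one character different from b"hi"

# The output is the same size no matter how much data went in.
print(len(short), len(long_))

# Nearly identical inputs give unrelated-looking digests, so the hash
# tells you nothing useful about the data behind it.
print(short)
print(near)
```

The same input always gives the same digest, which is what lets a hash stand in for the data later on.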

Identity

This identifying property of public-key crypto is tremendously useful. It means that, for as long as you can keep the private key secret, you can prove that all messages signed with your private key are actually from you. Thus, a public/private keypair is at the core of the idea of an identity. Identity using public key crypto is largely a solved problem, so I won't spend all that much time on it.

One thing that's unfortunately missing from current implementations is a good method of key revocation. Rather than fiddling around with revocation keys that have to be broadcast somehow, I'm going to introduce the concept of an "identity server". An identity server in this scheme is a server which has your private key (or better yet, a proxy key) and can answer requests about it. Individuals will be able to run their own ID servers, of course, but somehow I think that not everybody will want to.

Reputation

So we have an identity system, with identity servers - great! Except, computers can generate keypairs pretty darn quickly these days, so identities are a dime a thousand. We need some way to distinguish between legitimate people and hordes of spambots. We'd also like a way to know if a given person's trustworthy, which turns out to be a very similar problem. Getting a meaningful measure of trust from raw data is kind of tricky, which is why I'm going to invoke Jøsang's work on subjective logic here. We can compute a meaningful measure of trust for an identity given a network of trust and opinion ratings from various users, such that there's at least one unbroken chain between us and the identity we're trying to evaluate.
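For a feel of how trust gets weaker along a chain, here is a rough Python sketch of the discounting operator as I understand it from Jøsang's papers. An opinion is a (belief, disbelief, uncertainty) triple summing to 1; I've left out the base-rate term to keep it short, and the class and function names are my own, so check the paper before relying on any of this.

```python
from dataclasses import dataclass
from functools import reduce

@dataclass(frozen=True)
class Opinion:
    belief: float
    disbelief: float
    uncertainty: float

    def __post_init__(self):
        total = self.belief + self.disbelief + self.uncertainty
        if abs(total - 1.0) > 1e-9:
            raise ValueError("belief + disbelief + uncertainty must equal 1")

def discount(trust_in_source: Opinion, source_opinion: Opinion) -> Opinion:
    """Weaken a source's opinion by how much we trust the source.

    Whatever we don't positively believe about the source (our disbelief
    and our uncertainty) ends up as uncertainty in the result.
    """
    b = trust_in_source.belief
    return Opinion(
        belief=b * source_opinion.belief,
        disbelief=b * source_opinion.disbelief,
        uncertainty=(trust_in_source.disbelief
                     + trust_in_source.uncertainty
                     + b * source_opinion.uncertainty),
    )

def chain_opinion(chain):
    """Collapse [me->A, A->B, ..., Y->target] into one opinion about target."""
    return reduce(discount, chain)

# Hypothetical ratings, invented for the example.
me_alice = Opinion(0.9, 0.0, 0.1)
alice_bob = Opinion(0.8, 0.1, 0.1)
bob_carol = Opinion(0.7, 0.2, 0.1)

about_bob = chain_opinion([me_alice, alice_bob])
about_carol = chain_opinion([me_alice, alice_bob, bob_carol])
print(about_bob)
print(about_carol)
```

Every extra hop can only add uncertainty, which matches the intuition that a friend of a friend of a friend deserves less confidence than a friend.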

The big question (how will this all work?) then becomes three much easier questions: where do these ratings come from, where do we go to get them, and how will we find unbroken chains?

Where the ratings come from is easy - people are already able to judge the trustworthiness of the people they interact with regularly, which should be enough data for the system to function. As we all know, everybody is at most six degrees away from Kevin Bacon, so the maximum number of hops we'll have to go to find somebody is twelve.

Where to get the ratings is a bit trickier, but still doable. The problem is that people tend to go offline at random times, meaning that we'll need to store the ratings on the Internet somewhere. A distributed service like DNS would be good, but we'll still need a stable source for ratings, that the distributed system could then be a cache of. (Aside: in fact, because of the nature of digital signatures, the distributed cache system wouldn't even need to be trusted. If you were walking down the street one day, and a shady-looking guy stepped out of an alley and handed you a digitally signed document on a floppy disk, you could still be certain that the document was authentic, provided you had the signer's public key and the signature checked out against it. Neat, eh?) An obvious candidate is the identity servers, since those already need to be up all the time. In my scheme, a person's identity server would be responsible for storing and giving out all the opinions that people have had about that person.

Alright, so here's one of the nifty bits - this came to me at about 3 AM while I was trying to fall asleep, as good ideas sometimes do. If the identity server is responsible for maintaining all opinions about you, good or bad, then it has no real reason to keep the bad ones. It could easily tell whoever asked that nobody has ever said anything bad about you, and nobody would be the wiser. What we need is a way for somebody leaving an opinion to prove that it was accepted. The obvious way to do this is to have the identity server sign incoming opinions and give back the signature, but there's still a vulnerability here - the identity server could pretend to be suddenly deaf when you're saying mean things. On the other hand, we can imagine a three-step exchange, where you hand the server a hash of the opinion you want to give it, the server signs that and gives it back, and then you give the server the opinion. Remember, cryptographic hashes are opaque, so the server can't tell whether it's about to sign for a good opinion or a bad one; it has to either sign everything or refuse everything. Once it has signed, you hold proof that the opinion was submitted, so the server is effectively forced to accept all opinions. The user creating the opinion can simultaneously submit it to the distributed cache - this would ensure that an identity server which drops negative opinions can be discovered.
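Here is a toy Python version of that three-step exchange, to show the order of operations. Two big assumptions: Python's standard library has no public-key signatures, so an HMAC with a server-side secret stands in for the server's signature (a real system would use a proper signature that anybody can verify with the server's public key), and I've added a random nonce to the hash so that short opinions can't be guessed by hashing candidates. All the names are invented for the sketch.

```python
import hashlib
import hmac
import os

def commit(nonce: bytes, opinion: bytes) -> bytes:
    """Step 1: hash the opinion (plus a nonce) into an opaque commitment."""
    return hashlib.sha256(nonce + opinion).digest()

class IdentityServer:
    """Toy identity server. HMAC is a STAND-IN for a real digital signature."""

    def __init__(self):
        self._key = os.urandom(32)   # stand-in for the server's private key
        self._pending = set()        # commitments signed but not yet revealed
        self.opinions = []           # opinions accepted so far

    def sign_commitment(self, commitment: bytes) -> bytes:
        """Step 2: sign the hash without knowing what opinion is behind it."""
        self._pending.add(commitment)
        return hmac.new(self._key, commitment, hashlib.sha256).digest()

    def accept_opinion(self, nonce: bytes, opinion: bytes) -> bool:
        """Step 3: take the opinion itself, if it matches a signed commitment."""
        commitment = commit(nonce, opinion)
        if commitment not in self._pending:
            return False
        self._pending.remove(commitment)
        self.opinions.append(opinion)
        return True

    def verify_receipt(self, commitment: bytes, receipt: bytes) -> bool:
        """Check a receipt. With real signatures anybody could do this offline."""
        expected = hmac.new(self._key, commitment, hashlib.sha256).digest()
        return hmac.compare_digest(expected, receipt)

server = IdentityServer()
opinion = b"bob: sold me a broken laptop"
nonce = os.urandom(16)

commitment = commit(nonce, opinion)                  # step 1
receipt = server.sign_commitment(commitment)         # step 2
accepted = server.accept_opinion(nonce, opinion)     # step 3
print(accepted, server.verify_receipt(commitment, receipt))
```

The receipt plus the nonce and the opinion is what you'd publish to the distributed cache: it shows the server signed for exactly this opinion, so a server that later claims never to have heard of it is caught out.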

As for the third question - how are trust chains discovered? - this one is a little difficult. Finding optimal paths through a graph is easy if you have the entire graph lying around; not so much if it's spread out over thousands of computers, some of which are down at any given time. Ideally, we'd like to be able to find the best trust path between ourselves and some random other identity. However, we don't really need the best path; if a lot of paths exist, then some pretty good path is usually good enough. In that case, our slightly unreliable distributed cache suddenly looks pretty good - it has all the necessary data already, and with our relaxed requirements can give us exactly what we need.
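As a sketch of the "some pretty good path is good enough" idea, here is a plain breadth-first search in Python over whatever part of the graph the cache happens to hold. The cache is modelled as a dict from an identity to the identities it has rated, which is purely my assumption about its shape; identities missing from it (server down, never cached) are simply dead ends. The hop limit of twelve is the Kevin Bacon bound from earlier in the post.

```python
from collections import deque

def find_trust_path(cache, source, target, max_hops=12):
    """Return a fewest-hops path from source to target, or None.

    `cache` maps an identity to the identities it has rated. It may be
    incomplete; unknown identities are treated as having rated nobody.
    """
    if source == target:
        return [source]
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if len(path) - 1 >= max_hops:
            continue  # this path already uses its full hop budget
        for neighbour in cache.get(path[-1], ()):
            if neighbour in seen:
                continue
            if neighbour == target:
                return path + [neighbour]
            seen.add(neighbour)
            queue.append(path + [neighbour])
    return None

# Hypothetical cache contents, invented for the example.
cache = {
    "me": ["alice", "carol"],
    "alice": ["bob"],
    "carol": ["dave"],
    "bob": ["eve"],
}

path_to_eve = find_trust_path(cache, "me", "eve")
path_to_stranger = find_trust_path(cache, "me", "zed")
print(path_to_eve)
print(path_to_stranger)
```

This finds the shortest chain, not the most trustworthy one; picking between several candidate chains would mean scoring each with the trust arithmetic rather than just counting hops.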

So, there you have it. As far as I can think through, this proposal is mostly solid; I've spotted a few weak points already, and I might try to paper them over in a future post, but the core structure should stay the same. Even though this was originally a thought experiment, to prove that something like this is possible, I have a strange urge to actually implement it now ._.

Wednesday, April 8, 2009

Decentralized Reputation I

This is the first post in a series on online reputation systems. I've been meaning to write this for a while, since I've been trying to design one that actually works.

The need for automated reputation

Identity is a troublesome concept on the Internet, where everybody is by default anonymous. In general, it's impossible (without user interaction) to say that two users on different sites are actually the same person. This leads to interesting consequences occasionally, such as the recently revealed social network attacks. It also means that when a new user is registering on a website, it's very difficult to verify that they're a real person, and not some kind of spambot.

The problem I'm trying to solve here is the problem of trust on the Internet - can this random user be trusted? Identity is only half the problem. Obviously, trust is useless without identity; it's impossible to trust someone if you can't even verify who they are. I contend that the converse is also true; that is, identity is useless if you can't establish somehow that you trust that identity. With careful application of public-key cryptography, identity can be considered a solved problem, but trust is far from it.

What we need here is some kind of distributed decentralized reputation system. At a high level, every user in the system needs to be able to see some measure (but not necessarily the same measure) of any other user's trustworthiness. Every user also needs to be able to influence any other user's trustworthiness. The system must also have some way to limit the amount of damage a set of malicious users can do.

Advogato's trust metric provides these properties, and follows my thinking about how to design such a system pretty closely. The only real issue I have with it, after a cursory reading, is that it's centralized, which severely limits its usefulness. On the other hand, finding paths in a potentially very large nonresident graph is kind of difficult. Nevertheless, I feel like it should be possible. In a future post, I'll outline my idea for such a system.

Friday, April 3, 2009

Macbook Pro

Yeah, I've joined the Cult of Mac. Here are my impressions after using my 15-inch MBP for a week or so.

Hardware

The aluminum shell is nice. The whole thing feels very sturdy (way more so than my last laptop), without being overly heavy. I think I would have to really exert myself to put a crack in this thing. The display is good, even though I'm not really a fan of glossy displays. My favorite thing about it is that it dynamically adjusts its brightness based on the ambient light - it's a parlor trick, sure, but it's a damn useful one. I tried it outdoors the other day, and I'm happy to report that it's still readable in direct sunlight. I wouldn't recommend it, since the backlight can't quite keep up with a G-type star, but if you're desperate you won't kill your eyes trying to read it.

Input devices are a mixed bag. The keyboard is a little disappointing overall - I like the backlit keys a lot, but the keys aren't especially responsive, and worst of all, I can't rearrange them. Once you've gotten used to a non-qwerty layout, going back to qwerty is just obnoxious. The multi-touch trackpad is brilliant, on the other hand, and I predict that everybody will soon be copying multi-touch gestures for scrolling and navigation, assuming there aren't any patents in the way. I didn't think it was possible for something to be more convenient than a scroll wheel on a mouse, but Apple has done it somehow. I only have two small gripes about the touchpad. First, it's nice that you can click anywhere on it, but it takes a ton of pressure to click relative to what you use just moving the cursor. If they could dial the resistance back by like a factor of five, it'd be really convenient to use. Second, tap-to-click has a noticeable delay (maybe a quarter to a third of a second) compared to actually clicking. This is just long enough to be an annoyance, so it seems like Apple could clean this one up with a little tweaking.

As for the system specs, they're pretty satisfactory. So far everything is snappy enough, even with all the random visual effects, that I don't really feel like I need the system to be faster. It'd be nice to have a little extra RAM, I guess, since 2 GB isn't all that much these days, but Apple charges a lot for RAM. :( If it ends up bothering me more than it does now, I'll probably pick up some extra from Crucial or somebody else. The battery life is excellent - I haven't tried running it down yet, but based on some quick maths I think it'd go for about 3-4 hours on a charge.

Software

Mac OS X is really unexpectedly nice.

You know how people say that OS X "just works"? Yeah, it used to annoy me too, but now I kind of see what they meant by it. Apple has put some serious effort into smoothing over rough corner cases, and it shows. Boot Camp, for instance, is a shining example of this - you can repartition the drive that the OS is running on by dragging a slider and clicking a button. This is the sort of thing which normally requires serious voodoo. Then, because apparently that wasn't impressive enough, they provide a complete set of drivers for the Windows side, so that Windows can work properly on the hardware in the laptop. Or, to take another example, printer configuration. I decided I wanted to install a shared network printer, so I started clicking around in system preferences. Twenty seconds later, the printer was installed. It Just Worked(tm).

The eyecandy is nice, but could be better - probably, I'm just spoiled by Compiz. The implementation of multiple workspaces, for instance, feels a bit sluggish compared to the Compiz desktop plane plugin. Overall, though, this isn't exactly a huge issue.

I know a bit about operating systems, and I can usually find plenty to criticize, but in terms of system architecture OS X actually isn't that bad. Software installation, for instance, is normally a sore point for me, but OS X is about halfway to what I'd consider a good implementation. (Linux, by way of contrast, is also about halfway, though it gets different things wrong and varies by distro; Windows has literally the worst possible software installation system I can imagine.) Programs go in a single folder, and by and large they stay there. It's not perfect, since it still allows programs to do boneheaded things, but at least the boneheaded way isn't the standard, accepted way.

Having a BSD-flavored system underneath everything is really convenient. I don't know how I'd ever get anything done without being able to drop down to a command line every once in a while. Plus, since it's UNIX, it integrates relatively nicely with the rest of my systems. It also means that most UNIX software is available for installation, via MacPorts.

Still not touching iTunes with a 10-foot pole, though. I can manage my own music kthx.

Overall

When my mom first saw me using this laptop, she said, and I quote, "Wow, that's a badass-looking laptop!" What more needs to be said?