Thursday, October 30, 2008

The fundamental uselessness of online identity

On the Internet, the very concept of identity is now so completely and utterly broken that it borders on irrelevant. The real surprise is that this still surprises some people. But maybe I should back up and explain.

Online identity, in the form of accounts, is used these days for two main purposes. First, it allows websites to keep track of various pieces of information about you, such as your name, your date of birth, those embarrassing pictures from the party last weekend, and most importantly what you're allowed to do on the website. Second, these same systems are used to ensure that people don't create an arbitrarily large number of extra identities, since presumably this would be a bad thing for various reasons.

The first use is broken; not through any inherent fault, but because current authentication systems are so mind-bogglingly awful that phishing has compromised a sizable number of accounts on any given website. (In extreme cases, usually due to a combination of SQL injection and an inexplicable failure to hash passwords, every single account on a website has been stolen.) Now, the million-dollar question: how do you keep a website running smoothly when an arbitrary number of your users are actually acting maliciously, and you have no way to detect it?
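
(As an aside on the "failure to hash passwords" part: here is a minimal sketch of salted password hashing using only Python's standard library. The function names and the iteration count are my own assumptions for illustration, not anything a particular site uses, and a real deployment should pick a work factor suited to its hardware.)

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration; tune it for your own hardware.
ITERATIONS = 200_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage instead of the plaintext password."""
    salt = os.urandom(16)  # fresh random salt for every account
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(digest, expected)
```

With something like this in place, a stolen database yields salts and digests rather than passwords. It does nothing about phishing, which is the bigger problem here.
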

The second use of identity on the web, though, is so completely broken that it's a wonder people even try anymore. Despite increasingly desperate measures by some site owners, it remains laughably trivial to create multiple accounts on any website that allows open registrations. Requiring a valid email address? There are temporary email sites that will let you generate a new email address in under a minute. Checking IPs? Not only is it dead wrong with the increasingly widespread use of NAT, it's also trivial to find an open proxy. CAPTCHA? Only prevents machine registrations; I can still sit down and keep making accounts by hand until I get bored. (OpenID only exacerbates this problem, incidentally: for the price of a domain name you can create an infinite (seriously!) number of OpenIDs.)

"But then," you may ask, "how do I prevent people from making a ton of accounts and spamming up my website?" Well, there's a simple solution, but you won't like it. Still want it: Here it is:

Build your website from the ground up with the assumption that every user has an infinite number of accounts.

See, I told you you wouldn't like it. If you were to design a website around this principle, there are two paths you can take.
  1. Design your site in such a way that it has no per-user quotas: Since everybody has infinite accounts, limits set on users are useless. This isn't perfect, since user moderation is still nigh-impossible, but it's an improvement over current practices.
  2. Require some kind of investment from your users before an account becomes useful: This can be a contribution of effort (as Stack Overflow does), some kind of monetary account fee, or something else entirely.
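
To make the second path a bit more concrete, here is a rough sketch of what "investment before an account becomes useful" could look like in code. Everything in it is hypothetical: the action names, the reputation thresholds, and the `Account` class are made up for illustration and are not how any real site implements this.

```python
from dataclasses import dataclass

# Hypothetical thresholds: how much earned reputation each action needs.
REQUIRED_REPUTATION = {"post": 0, "vote": 15, "flag": 50}


@dataclass
class Account:
    name: str
    reputation: int = 0  # earned through contributions, never granted at signup


def can(account: Account, action: str) -> bool:
    """Allow an action only once the account has earned enough reputation."""
    required = REQUIRED_REPUTATION.get(action)
    if required is None:
        return False  # unknown actions are denied by default
    return account.reputation >= required


fresh = Account("throwaway")
veteran = Account("regular", reputation=60)
print(can(fresh, "vote"))    # False
print(can(veteran, "flag"))  # True
```

The point of the design is that a freshly made account is worth almost nothing, so making a thousand of them gets you a thousand accounts that can't vote or flag.
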
Incidentally, since I mentioned OpenID earlier, I may as well point out that it's a definite improvement for both uses of identity, since it offers more security for the first use (if you're using a competent provider - myOpenID, for example, offers both key-based auth that you can install to your browser, and a service that will call you to verify logins), and drives home the impossibility of the second. OpenID has a few problems of its own, but that's really getting out of the scope of this post, so I digress. >_>


Kiriska said...

I think I want to argue the terminology used here. My first thought on seeing "online identity" was completely different, though that might be because my brain has shifted entirely from CS-related subjects to art-related subjects where "online identity" is essential for self-promotion and marketability. Obviously, that "online identity" is completely different from the online identity you're talking about here. Mostly.

Perhaps "online identity verification/limitation" is more apt. I never saw the point of trying to limit multiple user accounts. Identity theft on social networking sites like MySpace and Facebook may be of concern, but that's one of the reasons I thought Facebook's email limitation was clever. If it hadn't gone 100% public and had stayed limited to .edu and work domains, then presuming that all school and work emails contain some string variant of the real name, there is some degree of security. The downside is obviously those with their own domains that could claim to be a business but aren't really. Still, that's a lot of trouble to go to to prank some idiot friend of yours.

I guess the other problem is the obvious spammers, for which I have no real solution. The sad fact of it is that the persistent spammer just has no life and/or is being paid to be adamant, so you either need to match those resources or grow a higher tolerance.

I think I had something else to say, but I've had about twelve hours of sleep this week and my brain is laughing at me right now, sorry. :<

P. Static said...

Yeah, I kind of shifted from talking about identity to authentication at some point there, but from the POV of a website operator, they're kind of the same thing. Identity is meaningless without a way to verify it, authentication is meaningless without some underlying concept of identity, etc etc.

...though, you raise a good point. From a random internet user's point of view, identity is completely separate from the authentication the website uses. I should do a followup on that :x