I wrote recently about the paradox of identity management and how the easier it is to offer information, the more often it will be exchanged.
To address some of these issues, let me propose something different: the creation of an infrastructure that allows people to generate secure (effectively anonymous) pseudonyms in such a way that each person can have at most one such ID. (There would be various classes of these IDs, so people could have many IDs, but only one of each class.) I’ll call this a QID (the Q standing for “unique”).
The value of a unique ID is strong — it allows one to associate a reputation with the ID. Because you can only get one QID, you are motivated to carefully protect the reputation associated with it, just as you are motivated to protect the reputation on your “real” identity. With most anonymous systems, if you develop a negative reputation, you can simply discard the bad ID and get a new one which has no reputation. That’s annoying but better than using a negative ID. (Nobody on eBay keeps an account that gets a truly negative reputation. An account is abandoned as soon as the reputation seems worse than an empty reputation.) In effect, anonymous IDs let you demonstrate a good reputation. Unique IDs let you demonstrate you don’t have a negative reputation. In some cases systems try to stop this by making it cost money or effort to generate a new ID, but it’s a hard problem. Anti-spam efforts don’t really care about who you are, they just want to know that if they ban you for being a spammer, you stay banned. (For this reason many anti-spam crusaders currently desire identification of all mailers, often with an identity tied to a real world ID.)
I propose this because many web sites and services which demand accounts really don’t care who you are or what your E-mail address is. In many cases they care about much simpler things — such as whether you are creating a raft of different accounts to appear as more than one person, or whether you will suffer negative consequences for negative actions. To solve these problems there is no need to provide personal information to use such systems.
As I propose it, the unique ID system would allow you to generate exactly one sub-QID for a given string. In addition, some of the strings would be integers, and thus have a ranking. For example, your QID.1 would be a highly important sub-QID. You would be highly protective of its reputation. Your QID.3400 would be low ranked, and you might not care if it got a bit of negative reputation. (Of course, in a working system, it should not be possible to associate these two IDs together, except the owner could prove she owned both.)
More commonly, the strings would be meaningful. LiveJournal might ask you to provide your “QID.LiveJournal” which only they would use or care about. This would assure them that you can’t appear twice. If they banned you for abuse, you could not come back. Sites wanting to conduct internet votes and polls would be keen to use a QID, as would most message boards, blog comments and the like.
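The "exactly one sub-QID per string" property can be illustrated with a keyed hash. This is only a sketch under my own assumptions, not a worked-out design: the function name `derive_sub_qid`, the use of HMAC-SHA256, and the example secret are all hypothetical, and a real system would still need an issuer to certify the result.

```python
import hashlib
import hmac


def derive_sub_qid(master_secret: bytes, label: str) -> str:
    """Derive a deterministic pseudonym for one label from a master secret.

    Hypothetical helper: the same secret and label always give the same
    value, and without the secret two results cannot be linked.
    """
    digest = hmac.new(master_secret, label.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Illustrative secret only; a real master key would be randomly generated.
master = b"example master secret, never shared"

lj_first = derive_sub_qid(master, "LiveJournal")
lj_again = derive_sub_qid(master, "LiveJournal")
ranked = derive_sub_qid(master, "1")

print(lj_first == lj_again)  # asking twice yields the same sub-QID
print(lj_first == ranked)    # different strings yield unrelated sub-QIDs
```

Because the derivation is deterministic, asking twice for the same string cannot produce a second identity, and nobody without the master secret can tell that the two outputs belong to the same person. The flip side is that anyone who obtains the master secret can recompute every sub-QID.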
Reputation need not be attached to all sub-QIDs. For example, only LiveJournal itself need worry about what it thinks of a given user’s QID.LiveJournal. While one could have IDs which anybody can make reputation comments on, that allows for denial of service attacks with false comments. However, it should be noted that reputation marks are an excellent use for QIDs, since you usually want to collect all the reputation comments of a single person about another person together. People who participate in a reputation bombing could themselves gain a bad reputation for doing so, at least within that system, and their reputation votes could be corrected.
It is better if people can only leave reputation comments on you if you have entered into an interaction with them. Since QIDs will almost certainly be based on digital signatures, if I demand a sub-QID from you, I can demand one that includes a signed assertion that I, or some collection of parties, will be able to make reputation comments on that ID in the appropriate reputation database.
How to make it work
The simplest, though not sufficiently secure way to do this is to have a master QID issuing certificate authority. You would have to present something unique and verifiable about yourself to this authority to receive certification for a master key. A typical example might be an existing generally unique (and probably identifying) number such as a government ID number. They would need to record that this number has been “used” so that you can’t come and certify a different master key later. You would have to trust them to not associate this number in any way with your master key. This is difficult because it is hard to hide the date that a certificate was created, and any incremental backup of the database of used numbers will allow a connection to be drawn.
It is possible, with work and illegality, to get two governmental ID numbers, but this is probably rare enough to not present a major problem for most uses, even voting in national elections. It may also be possible to use some truly unique biometric to assure people don’t get two QIDs, though twins can be an issue. Biometrics tend to return a range of values rather than a unique integer, which often means that they must be stored rather than simply checked off, but this technology is improving. A biometric which is not readily extractable without your cooperation is of course preferred — if it becomes common to have a database mapping the biometric to real people (such as fingerprints or eventually DNA) then any storage of the biometric is effectively a storage of your “real” identity.
In the simple, but not very secure case, you would simply be trusting this central agency to protect your identity. This is not very good for true anonymity, as they would presumably disclose it due to a court order, a corrupt employee or a computer security breach. Fortunately, there is some help from blinding, a cryptographic technique which allows a party to digitally sign something without knowing (or being able to record) what they are signing. This would be the likely technique for the generation of individual string-associated QIDs. However, it is also necessary that these organizations keep no logs about the creation of such QIDs, because knowing when they are asked for, or from what IP address, may allow users to associate one sub-QID with another sub-QID for the same user. It may make sense to have two different CAs, using different levels of blinding, to get a certificate from a trusted master party that says “This key belongs to the holder of a unique QID for the string XYZ” and nothing else, in a way that no party can tie that certificate to a master key or another sub-QID.
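To make the blinding idea concrete, here is a minimal sketch of a textbook RSA blind signature. Everything in it is an assumption for illustration: the key is a toy built from two small Mersenne primes, there is no padding, and the message is just a hash of a made-up sub-QID key. It shows the mechanics only and is not suitable for real use.

```python
import hashlib
import math
import secrets

# Toy signer key (assumption): two Mersenne primes, far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def blind(message: int):
    """User side: hide the message behind a random factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            break
    return (message * pow(r, E, N)) % N, r


def sign_blinded(blinded: int) -> int:
    """Signer side: sign without ever seeing the underlying message."""
    return pow(blinded, D, N)


def unblind(blind_signature: int, r: int) -> int:
    """User side: strip the random factor to get a normal signature."""
    return (blind_signature * pow(r, -1, N)) % N


def verify(message: int, signature: int) -> bool:
    """Anyone can check the signature with the public key (N, E)."""
    return pow(signature, E, N) == message % N


# Hypothetical message: a hash of the public key for one sub-QID.
key_hash = hashlib.sha256(b"public key for QID.LiveJournal").digest()
message = int.from_bytes(key_hash, "big") % N

blinded, r = blind(message)
signature = unblind(sign_blinded(blinded), r)
print(verify(message, signature))
```

The signer only ever sees `blinded`, which is the message multiplied by a random value, so it cannot later match the finished signature to the signing request. That is the property wanted here, though as noted above it does nothing about timing or IP address logs.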
Multiple issuers of QIDs
Ideally it is desirable to have many competing parties who can issue QIDs that will be trusted, or indeed to have tree hierarchies of them certified by a modest number of root keys. However, it must not be possible to get two master QIDs by going to two different providers, or even to get two sub-QIDs with the same code string through the use of two different providers. There are a variety of ways to attack this problem, though they all largely require that all providers use the same “base” value (government ID number or biometric) to assure uniqueness. At a simple level there could just be a master database of who already has generated a master QID. While access to this database could be restricted to authorized providers, its contents should be presumed to be very secret, including information as to when each person got their master QID.
Unfortunately government ID numbers are not sparse. It’s almost always possible to map them, or even encrypted versions of them back to the original number, and thus person, through simple brute force attacks. Even a highly sparse system can be attacked this way by anybody with a database of known numbers that includes you, including of course, the government itself. Likewise anybody with a database of biometrics can usually search it, though it may be possible to provide more security here. There is a dilemma here — we don’t want to generate a large privacy invading database of biometric data in the interests of producing a privacy-protecting technology.
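The brute force problem is easy to demonstrate. The sketch below assumes a provider stores only a SHA-256 hash of each ID number, and uses a made-up six-digit number format so that it runs in about a second; the function names are mine.

```python
import hashlib
from typing import Optional


def hash_id(id_number: str) -> str:
    """What a naive provider might store instead of the raw number."""
    return hashlib.sha256(id_number.encode("ascii")).hexdigest()


def brute_force(target_hash: str, digits: int) -> Optional[str]:
    """Try every possible number of the given length until one matches."""
    for candidate in range(10 ** digits):
        text = f"{candidate:0{digits}d}"
        if hash_id(text) == target_hash:
            return text
    return None


# Hypothetical six-digit ID number, stored only in hashed form.
stored = hash_id("042917")
recovered = brute_force(stored, digits=6)
print(recovered)
```

A nine-digit number space is only a thousand times larger than this one, which is still well within reach of an ordinary computer. Hashing or encrypting the number hides nothing when the set of possible inputs is this small.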
One interesting approach derives from David Chaum’s work in anonymous digital cash. In this system, a person’s real identity is broken up into two “halves” (in a cryptographic way, so that one half alone is useless but both halves together reveal everything). When doing a unique transaction, the other party randomly requests either the first or second half, and publishes it. If you do this twice (i.e., cheat), the odds are 50% that both halves of your true identity will be published, unmasking you. If this is something you have to do regularly, it’s too risky to cheat. This only works, however, if you can’t simply walk away the second time if somebody requests the other half of one of these identity pairs. Unfortunately there are circumstances in this proposed system where this might happen, though ideally this can be solved if it is possible to attach a highly negative reputation to a QID or sub-QID which does this.
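The "two halves" trick can be sketched with a simple XOR split. This is a simplified illustration under my own assumptions, not Chaum's actual protocol: the identity string is made up, and a real scheme would also need commitments so that the halves cannot be faked.

```python
import secrets


def split_identity(identity: bytes):
    """Split an identity into two halves; either half alone is random noise."""
    pad = secrets.token_bytes(len(identity))
    masked = bytes(a ^ b for a, b in zip(identity, pad))
    return pad, masked


def combine(half_a: bytes, half_b: bytes) -> bytes:
    """XOR the two halves back together to recover the identity."""
    return bytes(a ^ b for a, b in zip(half_a, half_b))


# Hypothetical identity string for illustration.
identity = b"real-id:123-45-6789"
halves = split_identity(identity)

# An honest user answers one challenge (0 or 1) and reveals one half.
first = halves[0]
# A cheater answers a second challenge; if it asks for the other half...
second = halves[1]
print(combine(first, second) == identity)  # ...the identity is unmasked

# Of the four equally likely challenge pairs, count those that expose a cheater.
exposing = sum(1 for c1 in (0, 1) for c2 in (0, 1) if c1 != c2)
print(exposing / 4)
```

Two of the four possible challenge pairs ask for different halves, which is where the 50% figure comes from. Under this simple model, using k independent pairs per transaction would leave a cheater only a 1 in 2^k chance of escaping.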
Problems also exist if keys are compromised or lost. Since the goal of the system is to bar you from just getting another key, it is hard to deal with lost keys, and even harder to deal with stolen or revealed keys. In addition there is the problem of a revealed key allowing all your sub-QIDs to be tied together, or even worse, tied to a real identity. Certainly it will be common due to error or deliberate choice for sub-QIDs to be associated with a real identity, and this need not be particularly harmful in and of itself, but the accidental association of say the most important QID.1 with a real identity or others could be quite troublesome.
Again, if you simply totally trust the master entity or entities, these problems are not too hard to solve. If they know who you are, they can internally revoke master keys and allow the use of new ones. Normally it should not be necessary to revoke a set of sub-QIDs but this could happen if a PC with a collection of the private keys in the clear is compromised. This is bad because such a public revocation would make it easy to tie all the sub-QIDs together. Of course the thief could also do this publicly or privately.
With multiple issuers, the revocation of an old master key is trouble because the master issuers must know not to re-issue a sub-QID which has been issued before under the old master. They thus must be able to get a reliable list of all sub-QIDs issued under that master, something which in theory does not exist except perhaps with the user. (Generally, it is preferred that two attempts to generate the same sub-QID would generate something that is the same in both certificates without requiring any local storage at the CAs about what was done. However, this means that a compromise of the master key allows tracking what all the sub-keys are, even if they weren’t stored in the same place, as they commonly would be.)
To solve key loss problems, key escrow may be necessary. However, this does not solve key compromise.
Simpler, cheaper QIDs
To kick-start such a program, it would also be possible to assign QIDs in a cheaper way that is not nearly as accurate. In such a system, some people might get multiple master QIDs and thus be able to throw away reputations, but the system might still be better than what exists. Already some sites use techniques like these to generate their identities.
A simple example is to mail a letter to a person asking for an identity at their street address. The letter would contain a magic password. This can be suborned by having many addresses, or trying to pretend there is a large apartment building at your location unless a database can counter this. Another common technique is to send a text message to a cell phone, though again it is not hard for people to get multiple cell phone numbers cheaply. A slightly better system might involve a letter to a person via “general delivery” at the post office. This will cause the post office to verify a photo-ID on pickup, but only to match the name, usually. Combinations of these methods could also be used (you must pick up a letter at your post office, and at your street address, and a text message on your cell phone) which would make it harder, but not impossible to get multiple master QIDs.
Once such a system had kick-started the QID world, a move could be made to better-verified unique IDs. Those who had sneaked in more than one of the cheaper QIDs could only import the reputation of the best of them.
It is important to understand this is intended to be a user enabling technology — something that lets an otherwise anonymous user prove that they are unique and not another account for another user — but it could be used by some as a restricting technology if it becomes too mandatory. One would not want to demand even this level of identity for ordinary web use or most other applications. In addition, you don’t want to have to use this in places where it is perfectly acceptable to have multiple identities, such as E-mail. I have literally hundreds of E-mail addresses because I give a different one to each web site I register on, and I would not want to be forced to tie them all to a single sub-QID, but anti-spammers might call for this.
Instead, for spam, QIDs could be issued to non-persons (such as corporations) in a similar unique way. They could be distinguished from the QIDs issued to persons. (Of course you can own several corporations, but in many jurisdictions this is expensive.) Such a QID might be used by a mail server, rather than a mail user, to identify itself when sending mail, while allowing the server to authenticate its users as it likes.
Like other identity systems, there is also the risk of people using it because it’s there, rather than because assuring the uniqueness of users is important. As such it should not be trivially easy to use.
Clearly more design and research are needed to make such a system workable. There are many trade-offs between the risks of having a trusted central agency and the complexities required by not doing so. It may make sense to have more than one namespace with more than one system, the choice of system depending on the threat model. Voting in important elections is very different from posting comments on blogs or anti-spam, and the same approach may not be best for all these.
In addition, if use of this system will require large databases of biometrics or real identifying information, it should only be used when the alternative is just as bad. In many cases, truly anonymous IDs (which can be created in bulk) may be fine for the problem. This is usually the case when only positive reputation matters, which is quite common.
While it’s pretty clear how to do this with a single master agency that makes promises to protect and destroy data, I welcome proposals for how to make this work better without such a monopoly, while dealing with the issues of key loss and compromise. In particular, I would like to see ways to make it work with multiple agencies in different countries, so that no one country has special power to extract identity.
Of course, all of this requires more use of digital signatures by users than we’ve ever been able to see deployed en masse, which creates another major barrier to adoption.