Privacy

The Personal Cloud and Data Deposit Box

Last night I gave a short talk at the 3rd "Personal Clouds" meeting in San Francisco. The term "personal clouds" is a bit vague at present, but in part it describes what I had proposed in 2008 as the "data deposit box" -- a means to achieve the various benefits of corporate-hosted cloud applications in computing space owned and controlled by the user. Others interpret the phrase to mean mechanisms for users to host, control or monetize their own data, or to control their relationships with the vendors and others who will use that data; in its simplest form, some use it to refer to personal resources hosted in the cloud, such as cloud disk services like Dropbox.

I continue to focus on the vision of providing the advantages of cloud applications closer to the user, bringing the code to the data (as was the case in the PC era) rather than bringing the data to the code (as is now the norm in cloud applications.)

Consider the many advantages of cloud applications for the developer:

  • You write and maintain your code on machines you build, configure and maintain.
    • That means none of the immense support headaches of trying to write software to run on multiple OSs, with many versions and thousands of variations. (Instead you do have to deal with all the browsers, but that's easier.)
    • It also means you control the uptime and speed.
    • Users are never running old versions of your code or facing upgrade problems.
    • You can debug, monitor, log and fix all problems with access to the real data.
  • You can sell the product as a service, earning either continuing revenue or advertising revenue.
  • You can remove features or shut down products.
  • You can control how people use the product, and even what steps they may take to modify it or add plug-ins or third-party mods.
  • You can combine data from many users to make compelling applications, particularly in the social space.
  • You can track many aspects of single and multiple user behaviour to customize services and optimize advertising, learning as you go.

Some of those are disadvantages for the user of course, who has given up control. And there is one big disadvantage for the provider, namely they have to pay for all the computing resources, and that doesn't scale -- 10x users can mean paying 10x as much for computing, especially if the cloud apps run on top of a lower level cloud cluster which is sold by the minute.

But users see advantages too:


Speaking on Personal Clouds in SF, and Robocars in Phoenix

Two upcoming talks:

Tomorrow (April 4) I will give a very short talk at the meeting of the personal clouds interest group. As far as I know, I was among the first to propose the concept of the personal cloud in my essays on the Data Deposit Box back in 2007, and while my essays are not the reason for it, the idea is gaining some traction now as more and more people think about the consequences of moving everything into the corporate clouds.

Your session has expired. Forgot your password? Click Here!

We see it all the time. We log in to a web site but after not doing anything on the site for a while -- sometimes as little as 10 minutes -- the site reports "your session has timed out, please log in again."

And you get the login screen, which offers, along with the ability to log in, a link marked "Forgot your password?" that lets you reset (OK) or recover (very bad) your password via your E-mail account.

The same E-mail account you are almost surely logged into in another tab or another window on your desktop. The same e-mail account that lets you go a very long time idle before needing authentication again -- perhaps even forever.

So if you've left your desktop and some villain has come to your computer and wants to get into that site that oh-so-wisely logged you out, all they need to do is click to recover the password, go into the E-mail to learn it, delete that E-mail and log in again.

Well, that's if you don't, as many people do, have your browser remember passwords, in which case they can log in again without any trouble at all.

It's a little better if the site does only password reset rather than password recovery. In that case, they have to change your password, and you will at least detect they did that, because you can't log in any more and have to do a password reset. That is if you don't just think, "Damn, I must have forgotten that password. Oh well, I will reset it now."

In other words, a lot of user inconvenience for no security, except among the most paranoid who also have their E-mail auth time out just as quickly, which is nobody. Those who have their whole computer lock with the screen saver are a bit better off, as everything is locked out, as long as they also use whole disk encryption to stop an attacker from reading stuff off the disk.

Topic: 

Meter to show speakers when they are losing the audience

Any speaker or lecturer is familiar with a modern phenomenon. A large fraction of your audience is using their tablet, phone or laptop to do email or surf the web rather than paying attention to you. Some of them are taking notes, but it's a minority. And it seems we're not going to stop this; even speakers do it when attending the talks of others.

Don't count my old passwords as failed login attempts

Like most people, I have a lot of different passwords in my brain. While we really should have used a different system from passwords for web authentication, that's what we are stuck with now. A general good policy is to use the same password on sites you don't care much about and to use more specific passwords on sites where real harm could be done if somebody knows your password, such as your bank or email.
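The title's suggestion is easy to sketch: a site could keep hashes of your retired passwords, and when a login attempt matches one of them, treat it as the legitimate user misremembering rather than as a hostile guess that counts toward lockout. A minimal sketch in Python, assuming the bcrypt library (the class and all names are hypothetical, not any real site's code):

```python
import bcrypt

class Account:
    def __init__(self, password: str):
        self.current_hash = bcrypt.hashpw(password.encode(), bcrypt.gensalt())
        self.old_hashes = []        # hashes of retired passwords
        self.failed_attempts = 0    # drives lockout / CAPTCHA logic

    def change_password(self, new_password: str):
        self.old_hashes.append(self.current_hash)
        self.current_hash = bcrypt.hashpw(new_password.encode(), bcrypt.gensalt())

    def try_login(self, attempt: str) -> bool:
        if bcrypt.checkpw(attempt.encode(), self.current_hash):
            self.failed_attempts = 0
            return True
        # An attempt matching a retired password is almost certainly the
        # real user misremembering, so don't count it toward lockout.
        # (Keeping old hashes around is itself a mild security tradeoff.)
        if any(bcrypt.checkpw(attempt.encode(), h) for h in self.old_hashes):
            return False
        self.failed_attempts += 1
        return False
```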

Understanding when and how to be secure

Over the years I have come to the maxim that "everything should be as secure as is easy to use, and no more secure," to steal a theme from Einstein. One of my peeves has been the many companies who, feeling that E-mail is insecure, instead send you an E-mail telling you that you have a message waiting -- if you would only log onto their web site (often one you rarely log into) with the password you set up two years ago to read it.

The efficacy of trusted traveler programs

A new paper on trusted traveler programs from the RAND Corporation goes into some detailed mathematical analysis of various approaches to a trusted traveler program. In such a program, you pre-screen some people, and those who pass go into a trusted line where they receive a lesser security check. The resources saved by the lesser check are applied to give all other passengers a better security check. This was the eventual goal of the failed CLEAR card -- though while it operated it just got you to the front of the line; it didn't reduce your security check.

The analysis shows that with a "spherical horse" there are situations where the TT program could reduce the number of terrorists making it through security with some weapon, though it concludes the benefit is often minor, and sometimes negative. I say spherical horse because they have to idealize the security checks in their model, just declaring that an approach has an X% chance of catching a weapon, and that this chance increases when you spend more money and decreases when you spend less, though it has diminishing returns since you can't get better than 100% no matter what you spend.

The authors know this assumption is risky. It turns out there is a form of security check which does match this model: random intense checking. There the percentage of weapons caught is pretty closely tied to the frequency of the random check, and the TTs would just get a lower probability of random check. However, very few people seem to be proposing this model. The real approaches you see involve things like the TTs not having to take their shoes off, or somehow bypassing or reducing one of the specific elements of the security process compared to the public. I believe these approaches negate the positive results in the RAND study.
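To see what the idealized model says, here's a back-of-envelope version of it with invented numbers -- nothing below comes from the RAND paper itself, it just shows the shape of the tradeoff:

```python
# A "spherical horse" model: assume the catch probability simply equals
# the fraction of passengers given the intense random check.
def weapons_through(p_trusted, p_public, bad_in_trusted):
    """Expected fraction of attackers slipping a weapon through, where
    bad_in_trusted is the share of attackers who pass pre-screening."""
    return (bad_in_trusted * (1 - p_trusted)
            + (1 - bad_in_trusted) * (1 - p_public))

budget = 0.10          # resources to intensely check 10% of all passengers
f_trusted = 0.40       # 40% of travelers qualify for the trusted line

# Baseline: everyone faces the same 10% chance of a random intense check.
baseline = weapons_through(budget, budget, bad_in_trusted=0.05)

# TT program: trusted line checked at 2%; the savings go to the public line.
p_public = (budget - 0.02 * f_trusted) / (1 - f_trusted)    # about 15.3%
with_tt = weapons_through(0.02, p_public, bad_in_trusted=0.05)

print(f"baseline: {baseline:.3f}  with TT: {with_tt:.3f}")
```

The benefit here is modest, and it turns negative as soon as attackers can reliably get themselves pre-screened (raise `bad_in_trusted` and re-run), which is exactly the "sometimes negative" case the paper describes.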

This is important because while the paper puts its focus on whether TT programs can get better security for the same dollar, the reality, I think, is that a big motive for the TT approach is not more security but placation of the wealthy and the frequent flyer. We all hate security and the TSA; the airlines want to give better service, and even the TSA wants to be hated a bit less. When a grandmother or a 10-year-old girl gets a security pat-down, it is politically bad, even though it is the right security procedure. Letting important passengers get a less intrusive search has value to the airlines and the powerful, and not doing intrusive searches that seem stupid to the public has political value to the TSA as well.

We already have such a program, and it's not just the bypass of the nudatrons (X-ray scanners) that has been won by members of Congress and airline pilots. It's called private air travel. People with their own planes can board with no security at all for them or their guests. They could fly their planes into buildings if they wished, though most are not as big as the airliners from 9/11. Fortunately, the chance that the captains of industry who fly these planes would do this is tiny, so they fly without the TSA. The bypass for pilots seems to make a lot of sense at first blush -- why search a pilot for a weapon she might use to take control of the plane? The reality is that giving a pass to the pilots means the bad guy's problem changes from getting a weapon through the X-ray to creating fake pilot ID. It seems the latter might actually be easier than the former.

The "Forgetful Broker" is needed for Data Deposit Box

For some time I've been advocating a concept I call the Data Deposit Box as an architecture for providing social networking and personal data based applications in a distributed way that tries to find a happy medium between the old PC (your data live on your machine) and the modern cloud (your data live on 3rd party corporate machines) approach. The basic concept is to have a piece of cloud that you legally own (a data deposit box) where your data lives, and code from applications comes and runs on your box, but displays to your browser directly. This is partly about privacy, but mostly about interoperability and control.

This concept depends on the idea of publishing and subscribing to feeds from your friends (and other sources). Your friends are updating data about themselves, and you might want to see it -- i.e. things like the Facebook wall or Twitter feed. Feeds themselves would go through brokers just for the sake of efficiency, but would be encrypted so the brokers can't actually read them.
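As a sketch of what "encrypted so the brokers can't read them" might look like, assuming a symmetric feed key shared with your friends out of band (the function names and the use of the Python `cryptography` package are my illustration, not part of any spec):

```python
from cryptography.fernet import Fernet

feed_key = Fernet.generate_key()     # shared with friends, never with the broker

def publish(broker: list, update: str):
    """Hand the broker an opaque blob; it can route it but not read it."""
    broker.append(Fernet(feed_key).encrypt(update.encode()))

def read_feed(broker: list):
    """Friends holding the key decrypt; the broker sees only ciphertext."""
    return [Fernet(feed_key).decrypt(blob).decode() for blob in broker]

broker_queue = []                    # stands in for the relay service
publish(broker_queue, "Moved to San Francisco")
print(read_feed(broker_queue))
```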

There is a need for brokers which do see the data in certain cases, and in fact some types of data must never be shown directly to your friends at all.

Crush

One classic example is the early social networking application the "crush" detector. In this app you get to declare a crush on a friend, but this is only revealed when both people have a mutual crush. Clearly you can't just be sending your crush status to your friends. You need a 3rd party who gets the status of both of you, and only alerts you when the crush is mutual. (In some cases applications like this can be designed so the broker never learns your data, through the cryptographic technique known as blinding.)
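A toy version of the broker logic might look like this. It is deliberately naive -- this broker learns who declared what; a blinded design would have clients submit commitments rather than names, so even the broker learns nothing about one-sided crushes:

```python
class CrushBroker:
    def __init__(self):
        self.declared = set()

    def declare(self, who: str, target: str) -> bool:
        """Record a crush; return True only if the target has already
        declared one back, so one-sided crushes are never revealed."""
        self.declared.add((who, target))
        return (target, who) in self.declared

broker = CrushBroker()
print(broker.declare("alice", "bob"))   # False: nothing revealed to anyone
print(broker.declare("bob", "alice"))   # True: mutual, both can be told
```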


Working on Robocars at Google

As readers of this blog surely know, for several years I have been designing, writing and forecasting about the technology of self-driving "robocars" in the coming years. I'm pleased to announce that I have recently become a consultant to the robot car team working at Google.

Of course all that work will be done under NDA, so until such time as Google makes more public announcements, I won't be writing about what they (or I) are doing. I am very impressed by the team and their accomplishments; to learn more, I will point you to my blog post about their announcement and the article I added to my web site shortly after that announcement. It also means I probably won't blog in any detail about certain areas of technology, in some cases not commenting on the work of other teams because of conflict of interest. However, as much as I enjoy writing and reporting on this technology, I would rather be building it.

I have been delivering my philosophical message about robocars for years, but it should be clear that I am simply consulting on the project, not setting its policies or acting as a spokesman.

My primary interest at Google is robocars, but many of you also know my long history in online civil rights and privacy, an area in which Google is often involved in both positive and negative ways. Indeed, while I was chairman of the EFF I felt there could be a conflict in working for a company which the EFF frequently has to either praise or criticise. I will be recusing myself from any EFF board decisions about Google, naturally.

Banks: Give me two passwords

Passwords are in the news thanks to Gawker Media, which had its database of userids, emails and passwords hacked and published on the web. A big part of the fault is Gawker's, for saving user passwords (so it could email them) and thus being vulnerable. As I have written before, you should be very critical of any site that is able to email you your password if you forget it.

Some of the advice to users in the wake of this has been not to use the same password on multiple sites, and that's not at all practical in today's world. I have passwords for many hundreds of sites. Most of them are like Gawker -- accounts I was forced to create just to leave a comment on a message board. I use the same password for these "junk accounts." It's just not a big issue if somebody is able to leave a comment on a blog with my name, since my name was never verified in the first place. A different password for each site just isn't something people can manage. There are password managers that try to solve this, creating different passwords for each site and remembering them, but these systems often have problems when roaming from computer to computer, trying out new web browsers, or coping with sites that change their login pages.

The long-term solution is not passwords at all; it's digital signature (though that has all the problems listed above), and it's not even to have logins at all, but instead to use authenticated actions, so we are neither creating accounts to do simple actions nor using a federated identity monopoly (like Facebook Connect). This is better than OpenID too.
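For a rough idea of what an "authenticated action" could look like, here's a sketch using an Ed25519 signature from the Python `cryptography` package -- the message format is invented for illustration, not a real protocol:

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

my_key = Ed25519PrivateKey.generate()   # lives with the user, not the site
my_pub = my_key.public_key()

# Instead of logging in, sign the specific action you want to take.
action = b"post-comment|blog=example.org|post=4242|text=Nice article"
signature = my_key.sign(action)

# The site needs no stored account or password; it just verifies that
# whoever holds this public key authorized this exact action.
my_pub.verify(signature, action)        # raises InvalidSignature if forged
print("action accepted")
```

The appeal is that the site ends up storing nothing but (public key, action) pairs -- there is no password to steal, email or "recover."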


Can your computer be like your priest?

I've had a blogging hiatus of late because I was heavily involved last week with Singularity University, a new teaching institution about the future created by NASA, Google, Autodesk and various others. We've got 80 students, most from outside North America, here for the summer graduate program, and they are quite an interesting group.


Explicit interfaces for social media

The latest Facebook flap has caused me to write more about privacy of late, and that will continue as we head into the June 15 conference on Computers, Freedom and Privacy, where I will be speaking on the privacy implications of robots.

Social networks want nice easy user interfaces, and complex privacy panels are hard for users to navigate when they don't want to spend the time learning all the nuances of a system. People usually end up using the defaults.


When is "opt out" a "cop out?"

As many expected would happen, Mark Zuckerberg did an op-ed column with a mild about-face on Facebook's privacy changes. Coming soon, you will be able to opt out of having your basic information defined as "public" and exposed to outside web sites. Facebook has a long pattern of introducing a new feature with major privacy issues, being surprised by a storm of protest, and then offering a fix which helps somewhat, but often leaves things more exposed than they were before.

For a long time, the standard "solution" to privacy exposure problems has been to allow users to "opt out" and keep their data more private. Companies like to offer it, because the reality is that most people have never been exposed to a bad privacy invasion, and don't bother to opt out. Privacy advocates ask for it because compared to the alternative -- information exposure with no way around it -- it seems like a win. The companies get what they want and keep the privacy crowd from getting too upset.

Sometimes privacy advocates will say that disclosure should be "opt in" -- that systems should keep information private by default, and only let it out with the explicit approval of the user. Companies resist that for the same reason they like opt-out. Most people are lazy and stick with the defaults. They fear if they make something opt-in, they might as well not make it, unless they can make it so important that everybody will opt in. As indeed is the case with their service as a whole.

Neither option seems to work. If there were some way to have an actual negotiation between the users and a service, something better in the middle would be found. But we have no way to make that negotiation happen. Even if companies were willing to have negotiation of their "I Agree" click contracts, there is no way they would have the time to do it.

The peril of the Facebook anti-privacy pattern

There's been a well-justified storm about Facebook's recent privacy changes. The EFF has a nice post outlining the changes in privacy policies at Facebook, which inspired this popular graphic showing those changes.

But the deeper question is why Facebook wants to do this. The answer, of course, is money, but in particular it's because the market is assigning a value to revealed data. This force seems to push Facebook, and services like it, toward removing privacy from their users in a steadily rising trend. Social network services often begin with decent privacy protections, both to avoid scaring users (when gaining users is the only goal) and because they have little motivation to do otherwise. The old world of PC applications tended to have strong privacy protection (by comparison) because data stayed on your own machine. Software that exported it got called "spyware" and tools were created to root it out.

Facebook began as a social tool for students. It even promoted that those not at a school could not see in, could not even join. When this changed (for reasons I will outline below) older members were shocked at the idea their parents and other adults would be on the system. But Facebook decided, correctly, that excluding them was not the path to being #1.


Data Hosting architectures and the safe deposit box

With Facebook seeming to declare some sort of war on privacy, it's time to expand the concept I have been calling "Data Hosting" -- encouraging users to have some personal server space where their data lives, and bringing the apps to the data rather than sending your data to the companies providing interesting apps.

I think of this as something like a "safe deposit box" that you can buy from a bank. While not as sacrosanct as your own home when it comes to privacy law, it's pretty protected. The bank's role is to protect the box -- to let others into it without a warrant would be a major violation of the trust relationship implied by such boxes. While the company owning the servers that you rent could violate your trust, that's far less likely than 3rd party web sites like Facebook deciding to do new things you didn't authorize with the data you store with them. In the case of those companies, it is in fact their whole purpose to think up new things to do with your data.

Nonetheless, building something like Facebook using one's own data hosting facilities is more difficult than the way it's done now. That's because you want to do things with data from your friends, and you may want to combine data from several friends to do things like search your friends.

One way to do this is to develop a "feed" of information about yourself that is relevant to friends, and to authorize friends to "subscribe" to this feed. Then, when you update something in your profile, your data host would notify all your friends' data hosts about it. You need not notify all your friends, or tell them all the same thing -- you might authorize closer friends to get more data than you give to distant ones.
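Here's a small sketch of that fan-out, with made-up field names, tiers and addresses, just to show the shape of it:

```python
# Which profile fields each authorization tier is allowed to see.
TIER_FIELDS = {
    "close":   {"status", "location", "photos"},
    "distant": {"status"},
}

subscribers = {                        # friend's data-host inbox -> tier
    "https://alice.example/inbox": "close",
    "https://carol.example/inbox": "distant",
}

def send(inbox: str, payload: dict):
    print(f"-> {inbox}: {payload}")    # stands in for the real network call

def notify_all(update: dict):
    """Push a profile update, filtered down to each subscriber's tier."""
    for inbox, tier in subscribers.items():
        allowed = {k: v for k, v in update.items() if k in TIER_FIELDS[tier]}
        send(inbox, allowed)

notify_all({"status": "hiking", "location": "Yosemite"})
# alice's host gets both fields; carol's host gets only the status.
```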


Police robots everywhere?

It is no coincidence that two friends of mine have both founded companies recently to build telepresence robots. These are easy-to-drive remote-controlled robots which have a camera and screen at head height. You can inhabit the robot, drive it around a flat area and talk to people by videoconferencing. You can join meetings, go visit people or inspect a factory. Companies building these robots, initially at high prices, intend to sell them both to executives who want to tour remote offices and to companies who want to give cheaper remote employees a more physical presence back at HQ.

There are also a few super-cheap telepresence robots, such as the Spykee, which runs Skype video conferencing and can be had for as low as $150. It's not very good, and the camera is very low down, and there's no screen, but it shows just how cheap such a product can get.

"Anybots" QA telepresence robot

When they get down to a price like that, it seems inevitable to me that we will see an emergency services robot on every block, primarily for use by the police. When there is a police, fire or ambulance call to an address, an officer could immediately connect to the robot on that block and drive it to the scene, to be telepresent. The robot would live in a small, powered protective closet, perhaps paid for by the city but more likely donated by some neighbour on the block who wants the fastest possible emergency response. Called into action, the robot's garage door would open and the robot would drive out, probably arriving at the location of the emergency within 60 to 120 seconds, depending on how densely the robots are placed. In the meantime actual first responders would also be on the way.

What could such a robot do?

Towards a more secure web, and better TLS

Today an interesting paper (written with the assistance of the EFF) was released. The authors have found evidence that governments are compromising trusted "certificate authorities" by issuing warrants to them, compelling them to create a false certificate for a site whose encrypted traffic they want to snoop on.
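One way a client can at least notice such a substitution is to remember the certificate fingerprint it saw on a first visit and flag any change -- a crude form of pinning. A toy sketch in Python (certificates also rotate legitimately, so a change is a tripwire, not proof of snooping; the `pinned` store here is just an in-memory stand-in):

```python
import hashlib, socket, ssl

def cert_fingerprint(host: str) -> str:
    """Fetch the server's certificate and return its SHA-256 fingerprint."""
    ctx = ssl.create_default_context()
    with ctx.wrap_socket(socket.create_connection((host, 443)),
                         server_hostname=host) as s:
        der = s.getpeercert(binary_form=True)
    return hashlib.sha256(der).hexdigest()

pinned = {}                      # host -> fingerprint from the first visit
host = "www.eff.org"
fp = cert_fingerprint(host)
if pinned.setdefault(host, fp) != fp:
    print("certificate changed -- possible CA-issued impostor")
```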

The privacy risks of genetic genealogy (23andMe part 2)

Last week, I wrote about the interesting experience of finding cousins who were already friends via genetic testing. 23andMe's new "Relative Finder" product identifies the other people, in their database of about 35,000 customers, to whom you are related, and guesses how close. Surprisingly, 2 of the 4 relatives I made contact with were already friends of mine, but not known to be relatives.

Many people are very excited about the potential for services like Relative Finder to take the lid off the field of genealogy. Some people care deeply about genealogy (most notably the Mormons) and others wonder what the fuss is. Genetic genealogy offers the potential to finally link all the family trees built by the enthusiasts and to provably test already known or suspected relationships. As such, the big genealogy web sites are all getting involved, and the Family Tree DNA company, which previously did mostly worthless haplogroup studies (and more useful haplotype scans), is opening up a paired-chromosome scan service for $250 -- half the price of 23andMe's top-end scan. (There is some genealogical value to the deeper clade Y studies FTDNA does, but the mitochondrial and 12-marker Y studies show far less than people believe about living relatives. I have a followup post about haplogroups and haplotypes in genealogy.) Note that in March 2010, 23andMe is offering a scan for just $199.

The cost of this is going to keep decreasing and soon will be sub-$100. At the same time, the cost of full sequencing is falling by a factor of 10 every year (!) and many suspect it may reach the $100 price point within just a few years. (Genechip scanning only finds the SNPs, while full sequencing reads every letter of your genome, and perhaps in the future your epigenome.)

Discovery of relatives through genetics has one big surprising twist to it. You are participating in it whether you sign up or not. That's because your relatives may be participating in it, and as it gets cheaper, your relatives will almost certainly be doing so. You might be the last person on the planet to accept sequencing, but it won't matter.


Terror and security

One of the world's favourite (and sometimes least favourite) topics is the issue of terrorism and security. On one side, there are those who feel the risk of terrorism justifies significant sacrifices of money, convenience and civil rights to provide enough security to counter it. That side includes both those who honestly come by that opinion, and those who simply want more security and feel terrorism is the excuse to use to get it.

On the other side, critics point out a number of counter arguments, most of them with merit, including:

  • Much of what is done in the name of security doesn't actually enhance it; it just gives the appearance of doing so, and the appearance of security is what the public actually craves. This has been called "Security Theatre" by Bruce Schneier, who is a friend and advisor to the EFF.
  • We often "fight the previous war," securing against the tactics of the most recent attack. The terrorists have already moved on to planning something else. They did planes, then trains, then subways, then buses, then nightclubs.
  • Terrorists will attack where the target is weakest, so securing something just makes them attack something else. This has indeed been the case many times. Since everything can't be secured, most of our efforts are futile and expensive. If we do manage to secure everything, they will attack the crowded lines at security.
  • Terrorists are not out to kill random people they don't know. Rather, that is their tool to reach their real goal: sowing terror (for political, religious or personal ends). When we react with fear -- particularly public fear -- to their actions, this is what they want, and indeed what they plan to achieve. Many of our reactions to them are just what they planned to happen.
  • Profiling and identity checks seem smart at first, but careful analysis shows that they just give a freer pass to anybody the terrorists can recruit whose name is not yet on a list, making their job easier.
  • The hard reality is that, frightening as terrorism is, in the grand scheme we are far more likely to face harm and death from other factors that we spend much less of our resources fighting. We could save far more people by applying our resources in other ways. This is spelled out fairly well in this blog post.

Now Bruce's blog, which I link to above, is a good resource for material on the don't-panic viewpoint, and in fact he is sometimes consulted by the TSA and I suspect they read his blog, and even understand it. So why do we get such inane security efforts? Why are we willing to ruin ourselves, and make air travel such a burden, and strip ourselves of civil rights?

There is a mistake that both sides make, I think. The goal of counter-terrorism is not to stop the terrorists from attacking and killing people, not directly. The goal of counter-terrorism is to stop the terrorists from scaring people. Of course, killing people is frightening, so it is no wonder we conflate the two goals.

The odds of knowing your cousins: 23andMe Part 1

Bizarrely, Jonathan Zittrain turns out to be my cousin -- which is odd because I have known him for some time and he is also very active in the online civil rights world. How we came to learn this will be the first of my postings on the future of DNA sequencing and the company 23andMe.

(Follow the genetics topic for part two and other articles.)

23andMe is one of a small crop of personal genomics companies. For a cash fee (ranging from $400 to $1,000, but dropping with regularity) you get a kit to send in a DNA sample. They can't sequence your genome for that amount today, but they can read around 600,000 "single-nucleotide polymorphisms" (SNPs), which are single-letter locations in the genome that are known to vary among different people and are the subject of various research about disease. 23andMe began by hoping to let their customers know how their own DNA predicted their risk for a variety of different diseases and traits. The result is a collection of information -- some of which will just make you worry (or breathe more easily) and some of which is actually useful. However, the company's second-order goal is the real money-maker. They hope to get the sequenced people to fill out surveys and participate in studies. For example, the more people fill out their weight in surveys, the more likely they might notice, "Hey, all the fat people have this SNP, and the thin people have that SNP; maybe we've found something."
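That kind of survey-driven discovery is, at its core, a simple association test: tabulate a SNP against a reported trait and ask whether the split looks like chance. A sketch with invented counts (a real study would also correct for ancestry, and for the fact that 600,000 SNPs are being tested at once):

```python
from scipy.stats import chi2_contingency

# Rows: carries the variant / does not. Columns: heavier / lighter.
# These counts are made up purely for illustration.
table = [[130,  70],
         [ 90, 110]]

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2={chi2:.1f}, p={p:.4f}")
# A small p-value hints the SNP tracks the trait -- before corrections.
```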

However, recently they added a new feature called "Relative Finder." With Relative Finder, they will compare your DNA with that of all the other customers, and see if they can find long identical stretches which are very likely to have come from a common ancestor. The more of this they find, the more closely related two people are. All of us are related, often more closely than we think, but this technique, in theory, can identify closer relatives like 1st through 4th cousins. (It gets a bit noisy after that.)
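The core idea is simple enough to sketch: scan two strings of SNP calls for long matching runs. Real matching works on phased data and tolerates genotyping errors; this is just the skeleton, with toy-length strings (real segments span thousands of SNPs):

```python
def shared_segments(a: str, b: str, min_len: int = 8):
    """Yield (start, length) of runs where two SNP strings agree for at
    least min_len positions -- a crude proxy for a shared ancestor."""
    run_start = None
    for i, (x, y) in enumerate(zip(a, b)):
        if x == y:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            if i - run_start >= min_len:
                yield run_start, i - run_start
            run_start = None
    if run_start is not None and len(a) - run_start >= min_len:
        yield run_start, len(a) - run_start

me     = "AGTCCGTAAGCTTAGCCGTA"
cousin = "AGTCCGTAAGCTATGCCGTA"
print(list(shared_segments(me, cousin)))   # [(0, 12)]
```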

Relative Finder shows you a display listing all the people you are related to in their database, and for some people, it turns out to be a lot. You don't see the name of the person but you can send them an E-mail, and if they agree and respond, you can talk, or even compare your genomes to see where you have matching DNA.

For me it showed one third cousin, and about a dozen 4th cousins. Many people don't get many relatives that close. A third cousin, if you were wondering, is somebody who shares a great-great-grandparent with you, or more typically a pair of them. It means that your grandparents and their grandparents were "1st" cousins (ordinary cousins.) Most people don't have much contact with 3rd cousins or care much to. It's not a very close relationship.

However, I was greatly shocked to see the response that this mystery cousin was Jonathan Zittrain. Jonathan and I are not close friends; more accurately we might be called friendly colleagues in the cyberlaw field, he being a founder of the Berkman Center and I being at the EFF. But we had seen one another a few times in the prior month, and both lectured recently at the new Singularity University, so we are not distant acquaintances either. Still, it was rather shocking to see this result, and I was curious to try to figure out what the odds of it are.
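Here is the kind of back-of-envelope calculation I would start with, using invented round numbers. Under a uniform model -- cousins scattered randomly through the population, which is the assumption the real answer surely violates -- the coincidence is very unlikely:

```python
# All three numbers below are rough guesses for illustration only.
population    = 300_000_000   # pool your detectable relatives are drawn from
cousins       = 5_000         # living 3rd and 4th cousins, roughly
acquaintances = 1_000         # people you'd recognize by name

p_one = cousins / population               # a given acquaintance is a cousin
p_any = 1 - (1 - p_one) ** acquaintances   # at least one acquaintance is
print(f"{p_any:.1%}")                      # about 1.7%
```

Under those assumptions, even one acquaintance-cousin is a couple-percent event, so finding that 2 of 4 matched relatives were already friends says less about luck and more about how strongly communities cluster along family lines.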

