
Data Deposit Box pros and cons


Recently, I wrote about the data deposit box, an architecture where applications come to the data rather than copying your personal data to all the applications.

Let me examine some more of the pros and cons of this approach:

The biggest con is that it does make things harder for application developers. The great appeal of the Web 2.0 "cloud" approach is that you get to build, code and maintain the system yourself. No software installs, and much less portability testing (browser versions) and local support. You control the performance and how it scales. When there's a problem, it's in your system so you can fix it. You design it how you want, in any language you want, for any OS you want. All the data is there, and there are no rules. You can update the software at any time; the only pieces outside your control are the user's browser and plugins.

The next con is the reliability of users' data hosts. You don't control them. If a user's data host is slow or down, you can't fix that. If you want the host to serve data to the user's friends, it may be slow for those other people. The host may not be located in the same country as the person getting data from it, making things slower.

The last con is also the primary feature of data hosting. You can't get at all the data. You have to get permissions, and do special things to get at data. There are things you just aren't supposed to do. It's much easier, at least right now, to convince the user to just give you all their data with few or no restrictions, and just trust you. Working in a more secure environment is always harder, even if you're playing by the rules.

Those are pretty big cons, especially since the big "pro" -- stopping the massive and irrevocable spread of people's data -- is fairly abstract to many users. It is the fundamental theorem of privacy that nobody cares about it until after it's been violated.

But there's another big pro -- cheap scalability. If users are paying for their own data hosting, developers can make applications with minimal hosting costs. Today, building a large cloud app that will get a lot of users requires a serious investment in providing enough infrastructure for it to work. YouTube grew by spending money like water for bandwidth and servers, and so have many other sites. If you have VCs, it's relatively inexpensive, but if you're a small-time garage innovator, it's another story. In the old days, developers wrote software that ran on users' PCs. Running the software didn't cost the developer anything, but trying to support it on a thousand different variations of the platform did.

With a data hosting architecture, we can get the best of both worlds: a more stable platform (or so we hope) that's easy to develop for, with no duty to host most of its operations. Because there is no UI in the data hosting platform, it's much simpler to make it portable. People joked that Java became write-once, debug-everywhere for client apps, but for server code it's much closer to its original vision. The UI remains in the browser.

For applications with money to burn, we could develop a micropayment architecture so that applications could pay for your hosting expenses. Micropayments are notoriously hard to get adopted, but they do work in more restricted markets. Applications could send payment tokens to your host along with the application code, allowing your host to give you bandwidth and resources to run the application. It would all be consolidated in one bill to the application provider.
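To make the payment-token idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the names `issue_token` and `HostMeter`, the abstract "units" of resource, and above all the shared secret between application provider and data host. A real scheme would also need replay protection, expiry and a settlement step, none of which is shown.

```python
import hashlib
import hmac
import json


def issue_token(secret: bytes, app_id: str, budget_units: int) -> dict:
    """Application side (hypothetical): sign a promise to pay for up to
    budget_units of hosting resources."""
    payload = json.dumps({"app": app_id, "budget": budget_units}, sort_keys=True)
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}


class HostMeter:
    """Data host side (hypothetical): verify the token once, then debit
    resource usage against the budget it carries."""

    def __init__(self, secret: bytes, token: dict):
        expected = hmac.new(
            secret, token["payload"].encode(), hashlib.sha256
        ).hexdigest()
        # Constant-time comparison so the signature check leaks no timing hints.
        if not hmac.compare_digest(expected, token["sig"]):
            raise ValueError("invalid payment token")
        claims = json.loads(token["payload"])
        self.app_id = claims["app"]
        self.remaining = claims["budget"]
        self.billed = 0  # running total for the single consolidated bill

    def charge(self, units: int) -> bool:
        """Record the charge and return True if the budget covers it;
        otherwise change nothing and return False."""
        if units < 0 or units > self.remaining:
            return False
        self.remaining -= units
        self.billed += units
        return True
```

The host would call `charge` before granting bandwidth or CPU to the application's code, and the `billed` total per application is what would end up on the provider's bill.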

Alternately, we could develop a system where users allow applications to cache results from their data host for limited times. That way the application providers could pay for reliable, globally distributed resources to cache the results.
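A time-limited caching right like this could be sketched as follows. The `CacheGrant` and `GrantedCache` names, and the idea of expressing the user's permission as a number of seconds, are assumptions made for the example rather than part of any existing data host API.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class CacheGrant:
    """Hypothetical permission from a user: results fetched from their
    data host may be kept for at most ttl_seconds."""
    ttl_seconds: float


@dataclass
class GrantedCache:
    """Application-side cache that never serves an entry past its grant."""
    fetch_from_host: Callable[[str], bytes]      # possibly slow call to the user's data host
    clock: Callable[[], float] = time.monotonic  # injectable so the behaviour can be tested
    entries: Dict[str, Tuple[bytes, float]] = field(default_factory=dict)

    def get(self, key: str, grant: CacheGrant) -> bytes:
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now < entry[1]:
            return entry[0]  # still inside the window the user allowed
        # Missing or expired: the data host remains the master copy.
        value = self.fetch_from_host(key)
        self.entries[key] = (value, now + grant.ttl_seconds)
        return value
```

Once the grant runs out, the application has to go back to the data host, so the user keeps the final say over how long a copy lives elsewhere.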

For example, say you wanted to build Flickr in a data hosting world. Users might host their photos, comments and resized versions of the photos in their data host, much of it generated by code from the data host. Data that must be aggregated, such as a search index based on tags and comments, would be kept by the photo site. However, when presenting users with a page filled with photo thumbnails, those thumbnails could be served by each owner's data host, but this could produce unreliable or even missing results. To solve this, the photo site might get the right to cache the data where needed. It might cache only for users who have poor hosting. It might grant premium features to those who provide their own premium hosting, since they don't cost the site anything.

As such, well-funded startups could provide well-funded quality of service, while no-funding innovators could get going by relying on their users. If they became popular, funding would no doubt become available. At the same time, if more users buy high quality data hosting, it becomes possible to support applications that don't have and never will have a "business model." These would, in effect, be fee-paid apps rather than advertising or data harvesting funded apps, with the fee taking the form of users covering their own hosting costs.

And that's a pretty good pro.


An application that is hosted separate from its data will by the nature of the network have much higher latency and lower bandwidth access to the data.

This limits the user experiences such applications can provide.

The right to cache data locally at the application host is a partial workaround to this. In reality even the caching has high latency if you want to keep it coherent with other applications' views of the same data. Locking the data to be owned by the application to avoid cache sync and verification latencies is effectively the same as not using your own data host in the first place, as it essentially gives up having the data host be the master copy of the data.

I expect caching data locally with an application is likely just as expensive to bootstrap as providing the data hosting yourself in the first place. Caching = hosting.

Regardless, users being able to buy high quality data hosting with APIs such that applications external to the hosts can make great use of them sounds exciting. It leaves users with much more control over their data.

You misunderstand. The data is not divided from the app. The apps run on the data host. In fact, if the data host is your PC, you get superb latency.

In that mode it's a mixture of the old PC model and the new cloud model. Like the cloud but your PC becomes part of it and the browser is your window to the app running on your PC. Sounds silly at first, but it means you can get real portability.

Similar in some ways to what the Gears browser extensions allow people to do with their web apps... Allows them to run locally including storing data locally when the user is offline and resync with the cloud when back online.

This is for social apps that interact with your friends, so it won't be able to do much offline.
