Submitted by brad on Tue, 2009-11-17 16:18.
(Update: I had a formatting error in the original posting, this has been fixed.)
A few weeks ago when I wrote about the non deployment of SSL I touched on an old idea I had to make web transactions vastly more efficient. I recently read about Google’s proposed SPDY protocol which goes in a completely opposite direction, attempting to solve the problem of large numbers of parallel requests to a web server by multiplexing them all in a single streaming protocol that works inside a TCP session.
While calling attention to that, let me outline what I think would be the fastest way to do very simple web transactions. It may be that such simple transactions are no longer common, but it’s worth considering.
Today the way this works is pretty complex:
- You do a DNS request for www.example.com via a UDP request to your DNS server. In the pure case this also means first asking where “.com” is but your DNS server almost surely knows that. Instead, a UDP request is sent to the “.com” master server.
- The “.com” master server returns with the address of the server for example.com.
- You send a DNS request to the example.com server, asking where “www.example.com is.”
- The example.com DNS server sends a UDP response back with the IP address of www.example.com
- You open a TCP session to that address. First, you send a “SYN” packet.
- The site responds with a SYN/ACK packet.
- You respond to the SYN/ACK with an ACK packet. You also send the packet with your HTTP “GET” reqequest for “/page.html.” This is a distinct packet but there is no roundtrip so this can be viewed as one step. You may also close off your sending with a FIN packet.
- The site sends back data with the contents of the page. If the page is short it may come in one packet. If it is long, there may be several packets.
- There will also be acknowledgement packets as the multiple data packets arrive in each direction. You will send at least one ACK.
The other server will ACK your FIN.
- The remote server will close the session with a FIN packet.
- You will ACK the FIN packet.
You may not be familiar with all this, but the main thing to understand is that there are a lot of roundtrips going on. If the servers are far away and the time to transmit is long, it can take a long time for all these round trips.
It gets worse when you want to set up a secure, encrypted connection using TLS/SSL. On top of all the TCP, there are additional handshakes for the encryption. For full security, you must encrypt before you send the GET because the contents of the URL name should be kept encrypted.
A simple alternative
Consider a protocol for simple transactions where the DNS server plays a role, and short transactions use UDP. I am going to call this the “Web Transaction Protocol” or WTP. (There is a WAP variant called that but WAP is fading.)
- You send, via a UDP packet, not just a DNS request but your full GET request to the DNS server you know about, either for .com or for example.com. You also include an IP and port to which responses to the request can be sent.
- The DNS server, which knows where the target machine is (or next level DNS server) forwards the full GET request for you to that server. It also sends back the normal DNS answer to you via UDP, including a flag to say it forwarded the request for you (or that it refused to, which is the default for servers that don’t even know about this.) It is important to note that quite commonly, the DNS server for example.com and the www.example.com web server will be on the same LAN, or even be the same machine, so there is no hop time involved.
- The web server, receiving your request, considers the size and complexity of the response. If the response is short and simple, it sends it in one UDP packet, though possibly more than one, to your specified address. If no ACK is received in reasonable time, send it again a few times until you get one.
- When you receive the response, you send an ACK back via UDP. You’re done.
The above transaction would take place incredibly fast compared to the standard approach. If you know the DNS server for example.com, it will usually mean a single packet to that server, and a single packet coming back — one round trip — to get your answer. If you only know the server for .com, it would mean a single packet to the .com server which is forwarded to the example.com server for you. Since the master servers tend to be in the “center” of the network and are multiplied out so there is one near you, this is not much more than a single round trip. read more »
Submitted by brad on Wed, 2009-10-28 22:03.
I just returned from Jeff Pulver’s “140 Characters” conference in L.A. which was about Twitter. I asked many people if they get Twitter — not if they understand how it’s useful, but why it is such a hot item, and whether it deserves to be, with billion dollar valuations and many talking about it as the most important platform.
Some suggested Twitter is not as big as it appears, with a larger churn than expected and some plateau appearing in new users. Others think it is still shooting for the moon.
The first value in twitter I found was as a broadcast SMS. While I would not text all my friends when I go to a restaurant or a club, having a way so that they will easily know that (and might join me) is valuable. Other services have tried to do things like this but Twitter is the one that succeeded in spite of not being aimed at any specific application like this.
This explains the secret of Twitter. By being simple (and forcing brevity) it was able to be universal. By being more universal it could more easily attain critical mass within groups of friends. While an app dedicated to some social or location based application might do it better, it needs to get a critical mass of friends using it to work. Once Twitter got that mass, it had a leg up at being that platform.
At first, people wondered if Twitter’s simplicity (and requirement for brevity) was a bug or a feature. It definitely seems to have worked as a feature. By keeping things short, Twitter makes is less scary to follow people. It’s hard for me to get new subscribers to this blog, because subscribing to the blog means you will see my moderately long posts every day or two, and that’s an investment in reading. To subscribe to somebody’s Twitter feed is no big commitment. Thus people can get a million followers there, when no blog has that. In addition, the brevity makes it a good match for the mobile phone, which is the primary way people use Twitter. (Though usually the smart phone, not the old SMS way.)
And yet it is hard not to be frustrated at Twitter for being so simple. There are so many things people do with Twitter that could be done better by some more specialized or complex tool. Yet it does not happen.
Twitter has made me revise slightly my two axes of social media — serial vs. browsed and reader-friendly vs. writer friendly. Twitter is generally serial, and I would say it is writer-friendly (it is easy to tweet) but not so reader friendly (the volume gets too high.)
However, Twitter, in its latest mode, is something different. It is “sampled.” In normal serial media, you usually consume all of it. You come in to read and the tool shows you all the new items in the stream. Your goal is to read them all, and the publishers tend to expect it. Most Twitter users now follow far too many people to read it all, so the best they can do is sample — they come it at various times of day and find out what their stalkees are up to right then. Of course, other media have also been sampled, including newspapers and message boards, just because people don’t have time, or because they go away for too long to catch up. On Twitter, however, going away for even a couple of hours will give you too many tweets to catch up on.
This makes Twitter an odd choice as a publishing tool. If I publish on this blog, I expect most of my RSS subscribers will see it, even if they check a week later. If I tweet something, only a small fraction of the followers will see it — only if they happen to read shortly after I write it, and sometimes not even then. Perhaps some who follow only a few will see it later, or those who specifically check on my postings. (You can’t. Mine are protected, which turns out to be a mistake on Twitter but there are nasty privacy results from not being protected.)
TV has an unusual history in this regard. In the early days, there were so few stations that many people watched, at one time or another, all the major shows. As TV grew to many channels, it became a sampled medium. You would channel surf, and stop at things that were interesting, and know that most of the stream was going by. When the Tivo arose, TV became a subscription medium, where you identify the programs you like, and you see only those, with perhaps some suggestions thrown in to sample from.
Online media, however, and social media in particular were not intended to be sampled. Sure, everybody would just skip over the high volume of their mailing lists and news feeds when coming back from a vacation, but this was the exception and not the rule.
The question is, will Twitter’s nature as a sampled medium be a bug or a feature? It seems like a bug but so did the simplicity. It makes it easy to get followers, which the narcissists and the PR flacks love, but many of the tweets get missed (unless they get picked up as a meme and re-tweeted) and nobody loves that.
On Protection: It is typical to tweet not just blog-like items but the personal story of your day. Where you went and when. This is fine as a thing to tell friends in the moment, but with a public twitter feed, it’s being recorded forever by many different players. The ephemeral aspects of your life become permanent. But if you do protect your feed, you can’t do a lot of things on twitter. What you write won’t be seen by others who search for hashtags. You can’t reply to people who don’t follow you. You’re an outsider. The only way to solve this would be to make Twitter really proprietary, blocking all the services that are republishing it, analysing it and indexing it. In this case, dedicated applications make more sense. For example, while location based apps need my location, they don’t need to record it for more than a short period. They can safely erase it, and still provide me a good app. They can only do this if they are proprietary, because if they give my location to other tools it is hard to stop them from recording it, and making it all public. There’s no good answer here.
Submitted by brad on Wed, 2009-09-23 17:50.
It seems that with more and more of the online transactions I engage in — and sometimes even when I don’t buy anything — I will get a request to participate in a customer satisfaction survey. Not just some of the time in some cases, but with every purchase. I’m also seeing it on web sites — sometimes just for visiting a web site I will get a request to do a survey, either while reading, or upon clicking on a link away from the site.
On the surface this may seem like the company is showing they care. But in reality it is just the marketing group’s thirst for numbers both to actually improve things and to give them something to do. But there’s a problem with doing it all the time, or most of the time.
First, it doesn’t scale. I do a lot of transactions, and in the future I will do even more. I can’t possibly fill out a survey on each, and I certainly don’t want to. As such I find the requests an annoyance, almost spam. And I bet a lot of other people do.
And that actually means that if you ask too much, you now will get a self-selected subset of people who either have lots of free time, or who have something pointed to say (ie. they got a bad experience, or perhaps rarely a very good one.) So your survey becomes valueless as data collection the more people you ask to do it, or rather the more refusals you get. Oddly, you will get more useful results asking fewer people.
Sort of. Because if other people keep asking everybody, it creates the same burn-out and even a survey that is only requested from 1 user out of 1000 will still see high rejection and self-selection. There is no answer but for everybody to truly only survey a tiny random subset of the transactions, and offer a real reward (not some bogus coupon) to get participation.
I also get phone surveys today from companies I have actually done business with. I ask them, “Do you have this survey on the web?” So far, they always say no, so I say, “I won’t do it on the phone, sorry. If you had it on the web I might have.” I’m lying a bit, in that the probability is still low I would do it, but it’s a lot higher. I can do a web survey in 1/10th the time it takes to get quizzed on the phone, and my time is valuable. Telling me I need to do it on the phone instead of the web says the company doesn’t care about my time, and so I won’t do it and the company loses points.
Sadly, I don’t see companies learning these lessons, unless they hire better stats people to manage their surveys.
Also, I don’t want a reminder from everybody I buy from on eBay to leave feedback. In fact, remind me twice and I’ll leave negative feedback if I’m in a bad mood. I prefer to leave feedback in bulk, that way every transaction isn’t really multiple transactions. Much better if ebay sends me a reminder once a month to leave feedback for those I didn’t report on, and takes me right to the bulk feedback page.
Submitted by brad on Sun, 2009-06-07 16:29.
Twenty years ago (Monday) on June 8th, 1989, I did the public launch of ClariNet.com, my electronic newspaper business, which would
be delivered using USENET protocols (there was no HTTP yet) over the internet.
ClariNet was the first company created to use the internet as its platform for business, and as such this event has a claim at being the birth of the “dot-com” concept which so affected the world in the two intervening decades. There are other definitions and other contenders which I discuss in the article below.
In those days, the internet consisted of regional networks, who were mostly non-profit cooperatives, and the government funded “NSFNet” backbone which linked them up. That backbone had a no-commercial-use policy, but I found a way around it. In addition, a nascent commercial internet was arising with companies like UUNet and PSINet, and the seeds of internet-based business were growing. There was no web, of course. The internet’s community lived in e-Mail and USENET. Those, and FTP file transfer were the means of publishing. When Tim Berners-Lee would coin the term “the web” a few years later, he would call all these the web, and HTML/HTTP a new addition and glue connecting them.
I decided I should write a history of those early days, where the seeds of the company came from and what it was like before most of the world had even heard of the internet. It is a story of the origins and early perils and successes, and not so much of the boom times that came in the mid-90s. It also contains a few standalone anecdotes, such as the story of how I accidentally implemented a system so reliable, even those authorized to do so failed to shut it down (which I call “M5 reliability” after the Star Trek computer), stories of too-early eBook publishing and more.
There’s also a little bit about some of the other early internet and e-publishing businesses such as BBN, UUNet, Stargate, public access unix, Netcom, Comtex and the first Internet World trade show.
Extra, extra, read all about it: The history of ClariNet.com and the dawn of the dot-coms.
Submitted by brad on Sat, 2009-03-14 16:43.
As you may know, I allow anonymous comments on this blog. Generally, when a blog is small, you don’t want to do too much to discourage participation. Making people sign up for an account (particularly with email verification) is too much of a barrier when your comment volume is small. You can’t allow raw posting these days because of spammers — you need some sort of captcha or other proof-of-humanity — but in most cases moderate readership sites can allow fairly easy participation.
Once a site gets very popular, it probably wants to move to authenticated user posting only. In this case, once the comment forums are getting noisy, you want to raise the bar and discourage participation by people who are not serious. My sub blog on Battlestar Galactica has gotten quite popular of late, and is attracting 100 or more comments per post, even though it has only 1/10th the subscribers of the main blog. Almost all post using the anonymous mechanism which lets them fill in a name, but does nothing to verify it. Many still post under the default name of “Anonymous.”
Some sites let you login using external IDs, such as OpenID, or accounts at Google or Yahoo. On this site, you can log in using any ID from the drupal network, in theory.
However, drupal (which is the software running this site) and most other comment/board systems are not very good at providing an intermediate state, which I will call “casual comments.” Here’s what I would like to see:
- Unauthenticated posters may fill in parameters as they can now (like name, email, URL) and check a box to be remembered. They would get a long-term cookie set. The first post would indicate the user was new.
- Any future posts from that browser would use that remembered ID. In fact, they would need to delete the cookie or ask the site to do so in order to change the parameters.
- If they use the cookie, they could do things like edit their postings and several of the things that registered users can do.
- If they don’t pick a name, a random pseudonym would be assigned. The pseudonym would never be re-used.
- Even people who don’t ask to be remembered would get a random pseudonym. Again, such pseudonyms would not be re-used by other posters or registered users. They might get a new one every time they post. Possibly it could be tied to their IP, though not necessarily traceable back to it, but of course IPs change at many ISPs.
- If they lose the cookie (or move to another computer) they can’t post under that name, and must create a new one. If they want to post under the same name from many machines, create an account.
- The casual commenters don’t need to do more special things like create new threads, and can be quite limited in other ways.
In essence, a mini-account with no authorization or verification. These pseudonyms would be marked as unverified in postings. A posting count might be displayed. A mechanism should also exist to convert the pseudonym to a real account you can login from. Indeed, for many sites the day will come when they want to turn off casual commenting if it is getting abused, and thus many casual commenters will want to convert their cookies into accounts.
The main goal would be to remove confusion over who is posting in anonymous postings, and to stop impersonation, or accusations of impersonation, among casual posters.
I don’t think it should be too hard to make a module for drupal to modify the comment system like this if I knew drupal better.
Submitted by brad on Mon, 2008-04-07 14:58.
Ok, admit it, who likes blogging in to a vacuum. You want to know how many people are actually reading your blog.
I have created a simple Perl script that scans your blog’s log file and attempts to calculate how many people read the blog and the RSS feeds.
You can download the feed reader script. I release it under GPL2.
It’s a perl script, so you would go to your web server log in the shell, and type “perl feedreaders.pl logfilename”
or if you like just “tail -99999 blogfilename | perl feedreaders.pl -” because you only need to scan a couple of days worth of logs to get the figures.
Here are some notes:
- I take advantage of the fact that most blog aggregators now report how many people they are aggregating for. There is no standard but I have put in code to match the common patterns.
- I identify common RSS feed URLs, as well as the most common “main feed” names. If you have other feeds that it doesn’t pick up on, it’s easy to add them to the list at the start of the program.
- A reader has to fetch the feed or home page multiple times from the same IP to count
- On the other hand, people who change IPs regularly will count multiple times. People behind caches may count just once all together.
- I try to eliminate fetches from the most common non-RSS-aggregating spiders
- Based on my experiences, Google Reader and Bloglines are the most popular aggregators, then NewsGator.
- At least one aggregator identifies as Mozilla, custom code tags it.
- It also counts people who fetch your non-RSS blog page multiple times as readers.
- Programs that don’t say they handle multiple users get grouped among the singles.
- Programs with only a few fetches are not counted
I invite my 1146 main blog readers to give it a whirl. (The 53 readers of the new Battlestar blog feed won’t see this notice, nor the 72 reading the comments.
Submitted by brad on Sun, 2008-04-06 17:07.
Recently, while keynoting the Freedom 2 Connect conference in Washington, I spoke about some of my ideas for fiber networks being built from the ground up. For example, I hope for the day when cheap kits can be bought at local stores to fiber up your block by running fiber through the back yards, in some cases literally burying the fiber in the “grass roots.”
Doc Searls, while he was listening to the talk made up a clever term — “Glass Roots” to describe this, and other movements to deploy fiber bottom up, without waiting for telcos and city governments. Any time you can deploy a technology without permission and red tape, it quickly zooms ahead of other technology. Backyard fiber, — combined with cheaper, mass produced free-space-optics or gigabit EHF radio equipment to bridge blocks together across streets or make links to hilltops — could provide the bandwidth we want without waiting.
Because let’s face it. While wireless ISPs sound great and are indeed great for serving some types of customers, right now real bandwidth requires a wire or glass fiber in the ground, and that means monopoly telcos and cable companies as well as the hassles of city government. We want our gigabits (forget megabits) and we want them now.
There are other elements to this Glass Roots movement, though usually with city involvement. Several small towns have put in fiber based ISPs with good success. My friend Brewster Kahle, from the Internet Archive, has brought 100 megabit service to housing projects in San Francisco using some city-laid fiber and the Archive’s bandwidth. You go, Brewster.
Brough Turner has the right idea. We should get dark fiber under our streets, and lots of it, installed and leased by a company that is only in the fiber business, and not in the business of selling you video or phone service or internet. While this company might get a franchise, the important difference is that the franchised monopoly would not light the fiber. Instead, anybody could lease a fiber from their house to a major switching point, and light it any way they want. Darth Vader would tell us “you don’t understand the power of the dark fiber.”
Why is that important? While fiber and wire are basic, the technologies to “light them up” run on Moore’s law. They get obsolete very quickly. Instead of monopoly rents and long cost-plus amortization tables, you want lots of turnover in the actual electronics found at the ends. You want the option to get the latest stuff, which is usually faster and cheaper than the stuff from 2 years ago. Lots faster and lots cheaper.
If you get a lot of free market competition on what lights those endpoints, it gets even better. The result is plenty of choice in how you light it and who you get connectivity from. And that eliminates all the issues around network neutrality or walled gardens. The investment in the dark fiber can probably be amortized over a decade or two, which is long enough.
One might argue the monopoly should even just be at the level of a conduit which it’s easy to drag other things like fiber or wire through. And indeed, whoever does bury pipes under the streets should expect to pull other wires before too long. But having monopoly lockdown at any level above the glass is what slows down the advance of broadband. Get rid of that lockdown, and the real glass roots revolution can begin.
Submitted by brad on Mon, 2008-03-03 16:15.
Over the weekend I was at the [BIL conference]http://www.bilconference.com, a barcamp/unconference style justaposition on the very expensive TED conference. I gave a few talks, including one on self driving cars, privacy and AI issues.
The conference, being free, was at a small community center. This location did not have internet. Various methods were possible to provide internet. The easiest are routers which can take cellular network EVDO cards and offer an 802.11 access point. That works most places, but is not able to handle many people, and may or may not violate some terms of service. However, in just about all these locations there are locations very nearby with broadband internet which can be used, including hotels, businesses and even some private homes. But how to get the access in quickly?
What would be useful would be an “instant internet kit” with all you need to take an internet connection (or two) a modest distance over wireless. This kit would be packed up and available via courier to events that want internet access on just a couple of days notice.
What would you put in the kit? read more »
Submitted by brad on Fri, 2008-02-29 02:10.
As our devices get more and more complex, configuring them gets harder and harder. And for members of the non-tech-savvy public, close to impossible.
Here’s an answer: Develop a simple browser plug-in for all platforms that can connect a USB peripheral to a TCP socket back to the server where the plugin page came from. (This is how flash and Java applets work, in fact this could be added to flash or Java.)
Once activated, the remote server would be able to talk to the device like its USB master, sending and receiving data from it and talking other USB protocol commands. And that means it could do any configuration or setup you might like to do, under the control of a web application that has access to the full UI toolset that web applications have. You could upload new firmware into devices that can accept that, re-flash configuration, read configuration — do anything the host computer can do.
As a result, for any new electronics device you buy — camera, TV remote control, clock, TV, DVD player, digital picture frame, phone, toy, car, appliance etc. — you could now set it up with a nice rich web interface, or somebody else could help you set it up. It would work on any computer — Mac, Linux, Windows and more, and the web UIs would improve and be reprogrammed with time. No software install needed, other than the plug-in. Technicians could remotely diagnose problems and fix them in just about anything.
So there is of course one big question — security. Of course, the plug-in would never give a remote server access to a USB device without providing a special, not-in-browser prompt for the user to confirm the grant of access, with appropriate warnings. Certain devices might be very hard to give access to, such as USB hard drives, the mouse, the keyboard etc. In fact, any device which has a driver in the OS and is mounted by it would need extra confirmation (though that would make it harder to have devices that effectively look like standard USB flash drives into which basic config is simply read and written.)
One simple security technique would be to insist the device be hot plugged during the session. Ie. the plugin would only talk to USB devices that were not plugged in when the page was loaded, and then were plugged in as the app was running. The plugin would not allow constant reloading of the page to trick it on this.
For added security, smarter devices could insist on an authentication protocol with the server. Thus the USB device would send a challenge, which the server would sign/hash with its secret key, and the USB device could then check that using a public key to confirm its talking to its manufacturer. (This however stops 3rd parties from making better configuration tools, so it has its downsides.) It could also be arranged that only devices that exhibit a standard tag in their identification would allow remote control, so standard computer peripherals would not allow this. And the plugin could even maintain and update a list of vendors and items which do or don’t want to allow this.
There are probably some other security issues to resolve. However, should we resolve this it could result in a revolution of configuring consumer electronics, as finally everything would get a big screen, full mouse and keyboard web UI. (Non portable devices like cars and TVs would require a wireless laptop to make this work, but many people have that. Alternately they could use bluetooth, and the plugin could have a similar mode for working with paired bluetooth devices. Again, doing nothing without a strong user confirmation.)
This works because basic USB chips are very cheap now. Adding a small bit of flash to your electronics device and a mini-USB socket that can read and write the flash would add only a small amount to the cost of most items — nothing to many of them, as they already have it. Whatever new toy you buy, you could set it up on the web, and if the company provides a high level of service, you could speak to a tech support agent who could help you set it up right there.
Submitted by brad on Tue, 2008-02-19 18:43.
I’m a director of BitTorrent Inc. (though not speaking for it) and so the recent debate about P2P applications and ISPs has been interesting to me. Comcast has tried to block off BitTorrent traffic by detecting it and severing certain P2P connections by forging TCP reset packets. Some want net neutrality legislation to stop such nasty activity, others want to embrace it. Brett Glass, who runs a wireless ISP, has become a vocal public opponent of P2P.
Some base their opposition on the fact that since BitTorrent is the best software for publishing large files, it does get used by copyright infringers a fair bit. But some just don’t like the concept at all. Let’s examine the issues.
A broadband connection consists of an upstream and downstream section. In the beginning, this was always symmetric, you had the same capacity up as down. Even today, big customers like universities and companies buy things like T-1 lines that give 1.5 megabits in each direction. ISPs almost always buy equal sized pipes to and from their peers.
With aDSL, the single phone wire is multiplexed so that you get much less upstream than downstream. A common circuit will give 1.5mbps down and say 256kb up — a 6 to 1 ratio. Because cable systems weren’t designed for 2 way data, they have it worse. They can give a lot down, but they share the upstream over a large block of customers under the existing DOCSIS system. They also will offer upstream on near the 6 to 1 ratio but unlike the DSL companies, there isn’t a fixed line there. read more »
Submitted by brad on Mon, 2008-02-11 11:08.
Fast internet access at home has spoiled me. Like Manfred Macx in Tourist I feel like I’ve lost my glasses when I’m a tourist. I get annoyed that I can’t quickly and easily get at all the information that’s out there.
I would gladly rent the ultimate tourist mobile device. A large GPS equipped PDA (and also a cell phone for tourists roaming from other countries or from CDMA vs. GSM) that has everything. Every database that can be had on geo-data for the region I’m walking. It has mobile data service of course but also just pre-caches the region I’m in.
Not just the maps and the lists of tourist-related items like restaurants. I want reviews of those restaurants and ratings and even the menus, so I can easily ask “Where’s a the best place in the $15/plate range near here” and similar questions. I don’t just want every hotel in a town (not just the ones in the popular databases) I want their recently updated price offers. And with the data connection, I want something like Wotif for the hotels tied into the computer reservation networks.
I don’t just want to know where the museum is, I want all of its literature. I want its internal map, with all of the placards translated into my language. Indeed, I want just about everything I need to read in a geolocation translated into my language.
And I want opinions on everything, from travel writers, tourists and locals. I want every single major travel book on the area loaded and ready and searchable. (Because I will be searching I want this to be bigger than a typical PDA/phone and have a moderately usable keyboard, or a really big touchscreen keyboard.)
I want it to have a decent camera, both in case I forget to bring mine with me, but for something grander. I want to be able to photograph any sign, any menu, and have it upload the photo to a system that OCRs the text and translates it for me. This is no longer science fiction — decent camera based OCR is available, and while translation software still has its hiccups it’s starting to get decent. In fact, as this gets better, the need for a database of signs at locations becomes less. Of course it should also be able to let locals type messages for me on it which it translates.
It should be trainable to my voice as well, so I can enter text with speech recognition instead of typing. Both for using the device, and saying things that are translated for locals, either to the screen or output from today’s quality text to speech systems. This will get better as the translation software gets better. In some cases, the processing may be done in the cloud to save battery on my device. But as I’ve noted the normal portability requirements on this device are not the same as for my everyday PDA. I don’t mind if this is big and a bit heavy, sized more like a Kindle than an iPhone.
It should be able to take me on walking and driving tours, of course.
And finally, at additional cost, it should connect me to a person, via voice or IM, who can help me. That can be a travel agent to book me a room of course, but it can also be a local expert — somebody who perhaps even works sometimes as a tourist guide. Earlier I wrote of the ability to call a local expert where people with local expertise would register, and when they were online, they could receive calls, billed by the minute. Your device would know where you were, and might well connect you with somebody living one street over who speaks your language and can tell you things you want to know about the area.
Now some of the things I have described are expensive, though as such a device became popular the economies of scale kick in for popular tourist areas. But I’m imagining tourists paying $20 to $30 a day for such a device. Rented 2/3 of the year, that’s $5,000 to $7,000 of revenue in a single year — enough to pay for the things I describe — every travel guide, every database, high volume data service and more. And I want the real thing, not the advertising-biased false information found in typical tourist guides or the “I’m afraid to be critical of anything” information generated by local tourist bureaus.
Why would I pay so much? Travel costs for a party of tourists are an order of magnitude higher than this. I think it would be a rare day that such a device didn’t save you more than this by finding you better food at a better price, savings on hotels and more. And it would save you time. If you are paying $200 to $400/day to travel, including your airfare, your hours are precious. You want to spend them seeing the best things for your taste — not wondering where things are. Saving you an hour of futzing pays for the device.
With scale, it could come down under $10/day, making it crazy not to get it. In fact, locals would start to want some of these databases.
Of course, UI is paramount. You must not have to spend the time you save trying to figure out the UI of the device. That is non-trivial, but doable for a budget like this.
Submitted by brad on Thu, 2008-01-31 22:59.
eBay has announced sellers will no longer be able to leave negative feedback for buyers. This remarkably simple change has caused a lot of consternation. Sellers are upset. Should they be?
While it seems to be an even-steven sort of thing, what is the purpose of feedback for buyers, other than noting if they pay promptly? (eBay will still allow sellers to mark non-paying buyers.) Sellers say they need it to have the power to give negative feedback to buyers who are too demanding, who complain about things that were clearly stated in listings and so on. But what it means in reality is the ability to give revenge feedback as a way to stop buyers from leaving negatives. The vast bulk of sellers don’t leave feedback first, even after the buyer has discharged 99% of his duties just fine.
Fear of revenge feedback was hurting the eBay system. It stopped a lot of justly deserved negative feedback. Buyers came to know this, and know that a seller with a 96% positive rating is actually a poor seller in many cases. Whatever happens on the new system, buyers will also come to notice it. Sellers will get more negatives but they will all get more negatives. What matters is your percentile more than your percentage. In fact, good sellers may get a better chance to stand out in the revenge free world, because they will get fewer negatives than the bad sellers who were avoiding negatives by threat of revenge.
As such, the only sellers who should be that afraid are ones who think they will get more negatives than average.
To help, eBay should consider showing feedback scores before and after the change as well as total. By not counting feedback that’s over a year old they will effectively be doing that within a year, of course.
There were many options for elimination of revenge feedback. This one was one of the simplest, which is perhaps why eBay went for it. I would tweak a bit, and also take a look at a buyer’s profile and how often they leave negative feedback as a fraction of transactions. In effect, make a negative from a buyer who leaves lots and lots of negatives count less than one who never leaves negatives. Put simply, you could give a buyer some number, like 10 negatives per 100 transactions. If they do more than that, their negatives are reduced, so that if they do 20 negatives, each one only counts as a half. That’s more complex but helps sellers avoid worrying about very pesky buyers.
Feedback on buyers was always a bit dubious. After all, while you can cancel bids, it’s hard to pick your winner based on their feedback level. If your winner has a lousy buyer reptutation, there is not normally much you can do — just sit and hope for funds.
If eBay wants to get really bold, they could go a step further and make feedback mandatory for all buyers. (ie. your account gets disabled if you have too many feedbacks not left older than 40 days.) This would make feedback numbers much more trustable by other buyers, though the lack of fear of revenge should do most of this. eBay doesn’t want to go too far. It likes high reputations, they grease the wheels of commerce that eBay feeds on.
One thing potentially lost here is something that never seemed to happen anyway. I always felt that if the seller had very low reputation (few transactions) and the buyer had a strong positive reputation, then the order of who goes first should change. Ie. the seller should ship before payment, and the buyer pay after receipt and satisfaction. But nobody ever goes for that and they will do so less often. A nice idea might be that if a seller offers this, this opens up the buyer to getting negative feedback again, and the seller would not offer it to buyers with bad feedback.
Submitted by brad on Tue, 2008-01-29 10:21.
A couple of weeks ago many wrote about the mistakes of spock which made us call them the “evil spock” for the way they had you mass mail your friends by fooling you into thinking they were already users of Spock.
The newest company to make a similar mistake is called NotchUp. I am loathe to discuss their business, because this means they get publicity for being bad actors, but it involves companies paying candidates for the chance to interview them rather than just giving all the fees to the headhunters. (Something that could only work in a boom market, I expect.) But in this case, some of the fees go to the headhunters, of course, and in a particularly nasty turn, 10% of them go to the “friend” who “invited” you to sign up.
When I get a bunch of invites for something brand new in a short period, it’s either something really hot, or something fishy. In this case it’s the latter. And one person suggests they didn’t authorize NotchUp to email their entire linked-in contact list so there may be something really fishy.
Here are some of the mistakes:
- The offering of affiliate fees to spam your friends, effectively an Amway style marketing system, has been pernicious for some time. While this should be strongly discouraged, I am not calling for its total prohibition, but it should never be secret. Every such message should contain a note explaining the financial incentive.
- The ad comes with your friend’s name on it, but the reply address is a dummy “invite@notchup” which I presume doesn’t work. Any site that does this sort of mailing should put in the friend’s real e-mail, so I can complain to them.
- The ad comes as a combined HTML and plain text message. Which would be good except the plain text part is just “Go read the HTML part.” Seriously. Boy is that evil.
- The site contains no “contact us” information for users who have issues. Their FAQ is all about signing up.
- The site has no “opt out” to stop my friends from doing these mass mailings to me. These are not particularly useful, because I have many email addresses and in fact whole domains that come to me, but they are better than nothing.
- It may have some of these things if I sign up. Of course as somebody who wants to opt-out, I hardly want to create an account just to do that. A few other sites have had this flaw. (I have no idea if you can opt out by signing up, I presume it does give you the ability to at least not get mailings because you have already been fished by your friend.)
Whether their headhunting model sounds interesting or not, the company’s practices seem slimy enough that I would wait for a nicer competitor to come along if you want to get headhunted this way.
Submitted by brad on Tue, 2008-01-15 13:10.
Bruce Schneier has made a fuss by writing about how he leaves his wireless internet open. As a well regarded security expect, how can he do this. You’ll see many arguments for and against in his posting. I’ll expand on one of mine.
Part of Bruce’s argument is one I express different. I sometimes say “Firewalls are a hoax.” They are the wrong choice for security, but we sell them as a good choice. Oddly, however, this very fact does make them a valid choice. I will explain the contradiction.
Firewalls, I should say, are a form of network security — creating an internal network which is “trusted” and protected from the outside world. In an obscure way, encrypting your wireless net is in this class of security. Note that the “firewall” programs that run on PCs are not network firewalls so they are generally not in this class of security, though they are called Firewalls.
The right way to do things, in the ideal world, is to secure each PC, and to have that PC encrypt its traffic end-to-end with all the sites it communicates with. If you do this, you have almost no need for firewalls or encryption on the network. This is important because in many cases, the idea that your internal network is trustable is a dangerous one. That’s because many networks are populated with insecure consumer computers which frequently get infected with malware (viruses, trojans etc.) They can get infected because they are laptops that visit exposed networks they are not secured well enough for — because you thought you could get away with less on the home net — or because their owner is tricked into downloading malware, or going to a web site that exploits a browser bug, etc.
Once a local computer is infected, your trusted local net betrays you, as the malware now gets to take advantage of all that trust.
We don’t live in that ideal world. The same insecurity these consumer computers (and yes, I mean Windows but other OSs are not immune) have makes them unsuitable for general exposure. The firewall industry gets to sell firewalls because the workstations are so insecure.
In the real world, virus/trojan attacks are the most common. Up to 30% of PCs are “botted” — taken over by malware and acting as zombies under the control of some distant master. A significant number are just plain compromised in other ways, though botting seems the most popular motive today for taking control of systems. The volume of attacks coming in via outsiders sniffing or connecting to your wireless network is insignificant in comparison, I think research would show.
And sadly, while we would like all web traffic to be HTTPS and all E-mail to be secured over TLS, this is just not an option. Most web servers don’t over encrypted versions, and even the ones that do get rarely used because the UI was not set up correctly for it. (Ideally, http should have been designed so that you don’t have to put your encryption desires into the URL — https vs. http — so that it could be negotiated for each connection. Even then, it would be hard to do this though identity certificates could make it happen.)
So we must surf the web in the open, or at best through an encrypted tunnel to a proxy that surfs in the open. So this does call for encrypting one’s wifi. However, again, the number of people sniffing private homes wifi is tiny in comparison to the other threats.
One of the factors supporting Bruce’s choice is that most security continues to have bad UI. The computer and security industries regularly vastly underestimate the importance of good UI. The hard truth is that good security with bad (hard to use) UI simply doesn’t get deployed very much unless you force it and force it hard. This suggests that lesser security with good UI can actually deliver more real world results than better security with bad UI.
For encrypting networks, the UI is poor. Different vendors use different passphrase algorithms to input keys. For many devices (phones, digital picture frames etc.) even entering a passphrase is difficult. We’re starting to see some better UI but it’s slow to deploy and for now it is no surprise that people want to leave their nets open, both for their own devices, and to give access to guests in their home or office.
To my mind the ideal UI is a device tries to connect to the network, and the AP or a computer flashes a light that says that one, and exactly one device is asking to join the net. You then push a button to confirm that device. Also good is the ability to allow arbitrary devices to connect in a secured channel but with no special ability to route packets to one another or into general devices. A full configuration has an internal net (with routing), guest devices that can’t route to the internal net or to other guests, and host devices which can be seen by guests but not the outside world.
Oddly, as I said at the start, the choices we make affect the value of the choices. Because NATs and firewalls provide some security, people (and vendors) allow the computers behind these NATs and firewalls to be insecure in a way they never would or could if the NATs and firewalls weren’t there. This in turn makes the NATs and firewalls worthwhile. And yes, random attacks from outside will always be more probable than attacks from the inside from compromised machines, and they will be more probable than attacks from neighbours. So it’s not as simple as we like. However, computers are going to roam more and more. My PDA has wifi and roams. It also has EVDO and some day those networks will open and need more endpoint security.
So is Bruce right or wrong? Both. The real world risk of what he’s doing isn’t great. It’s not zero, either. The real question is whether the UI penalties of an encrypted network are worse than the risk. And that decision varies from person to person. Better UI and protocol design could mostly eliminate the tradeoff, which is the real lesson.
Submitted by brad on Thu, 2007-11-29 22:26.
If you have bought a home router or access point, you know it comes by default listening to some NAT based IP address, and the setup guide tells the user to type "http://192.168.1.1" or similar into their browser.
Instead, these companies should define a domain, like "setup.linksys.com" that points to a page that redirects to that IP address. In addition, the box, before it is set up, should have a mini DHCP server and DNS server that returns the right address for that domain for people who just plug a PC into the box. (I guess it could return that address for any domain you type in if the box is not configured,n ot just the official one.)
This would serve several purposes. The instructions to the unskilled user become less cryptic. Just plug your PC into the box, boot it and type this easy to remember name into the browser.
If the user is more sophisticated and changes the address of the router, a cookie could be set so the redirect goes to the valid address, but of course if the cookie is lost the user will have to remember, but that's always true. And the user who does not use DHCP from the router will also have to use the numeric address, so it must be printed as an alternative for such folks. But one value of the whole thing is that if it got standardized, it would make it easy to figure out the address for a box if you know the brand. The domain could and should be printed on it. Along with the default password (which should then be changed of course.)
Submitted by brad on Tue, 2007-10-09 01:56.
I may be on the extreme, but I use hundreds of different E-mail addresses. Since I have whole domains where every address forwards to me (or to my spam filters) I actually have an uncountable number of addresses, but I also have a very large number of real ones I use. That’s because I generate a new address for every web site I enter an E-mail address on. It lets me know who sells or loses my address, and lets me cut off or add filtering to mail from any party. (By the way, most companies are very good, and really don’t sell your E-mail.)
As I said, I’m on the extreme, but lots of people have at least a handful of addresses. They have personal ones and work ones. They have addresses given by ISPs, and ones from gmail, hotmail and the like. But I regularly run into sites that assume that you have only one.
One of the worst behaviours is when I mail customer service. That mail comes from my current “private” address. It’s an unfiltered address that only goes out in E-mails to people I mail, and so replies always work. But they usually write back “You must send mail from the E-mail address in our records.” Even when I have told them my account number or other such information. And in fact, even when I tell them what the E-mail address is, they insist it be in the “From” line.
With most E-mail clients, I can indeed put any address in the From line I want, including yours or any of mine. So this is a pointless form of security. Their software has been written to key off this, and won’t let their agents identify the user another way. Unfortunately some mail agents that I use on the road don’t make it easy to enter an arbitrary From, so this is a pain.
Another problem is contact databases and social networks. LinkedIn likes you to know the E-mail address of somebody you are contacting in advance. But which one did they use with LinkedIn? And which one have I used? The address I have registered with some of these sites is not the one you use to mail me, so I can direct that mail. So if you use their systems to check for people in your contact list, you won’t find me, and I may not find you. Not that there’s an easily solution to this, but they haven’t even really tried.
Now as I said, I create these emails on the fly, and from reading them, I can tell what site they are for. But that doesn’t mean I can remember what I created after the fact. Sadly, many sites are also demanding you log in using “your E-mail address” rather than a userid that you pick. While this assures that IDs are unique, it’s also not hard to come up with a unique ID to use that’s not an E-mail and can be the same over all the sites you wish it to be. Sometimes to log in or do certain functions, I have to remember what E-mail I generated for them. (If I can get them to mail me something, I can solve that.)
Of course, many of them will mail me my password. Which is hugely, terribly wrong. No site should be able to E-mail you your password, because that means they are storing it. They should at best be able to reset your password and send you an E-mail which will let you log in and create a new password. While you should keep unique passwords for sites where real damage can be done (like banks) most people keep common passwords for sites where compromise of your “account” is not particularly bothersome. But if sites store it, it means they all are getting access to all the rest, if they wish to, or if they are compromised. I wrote this blog post to give people something to point at when sites expect you to have just one E-mail. I probably need another to point sites at when they are storing my password and will mail it to me. (Especially ones that say they dare not send you messages by E-mail because it is not secure, but which will send you your password by E-mail.)
Submitted by brad on Sat, 2007-07-28 19:29.
I’m quite impressed with Google’s mobile maps application for smartphones. It works nicely on the iPhone but is great on other phones too.
Among other things, it will display live traffic on your map. And I recently saw, when asking it for directions, that it told me that there would be “7 minutes of traffic delay” along my route. That’s great.
But they missed the obvious extension from that. Due to the delay, 101 is no longer my fastest route. They should use the traffic delay data to re-plot my route, and in this case, suggest 280. (Now it turns out that 280 is always better anyway, because aside from the fact it has less traffic, people drive at a higher average speed on it than 101, and the software doesn’t know that. Normally it’s a win except when it’s raining in the hills and not down by the shore.)
Now I’ve been wanting mapping and routing software to get a better understanding of real road speeds for a while. It could easily get that by taking GPS tracklogs from cabs, trucks and other vehicles willing to give them. It could know the real average speed of travel on every road, in every direction, at any given hour of the day. And then it could amend that with live traffic data. (Among other things, such data would quickly notice map errors, like one-way streets, missing streets, streets you can’t drive etc.)
Now to get really smart, the software should also have a formula for “aging” traffic congestion based on history and day of the week. For example, while there may be slow traffic on a stretch of highway at 6:30 pm, if I won’t get there until 7:30 it should be expected to speed up. As I get closer it can recalculate, though of course some alternate roads (like 101 vs. 280) must be chosen well in advance.
And hey, Google Mobile maps, while your at it, could you add bookmarks? For example, I would like to make a bookmark that generates my standard traffic view, and remember areas I need maps of frequently. And of course since traffic data can make them different, bookmark routes such as one’s standard commute. For this, it might make sense to let people bookmark the routes in full google maps, where you can drag the route to your taste, and save it for use in the mobile product, even comparing the route times under traffic. One could also have the device learn real data about how fast I drive on various routes, though for privacy reasons this should not be store unencrypted on servers. (We would not want our devices betraying us and getting us speeding tickets or liability in accidents due to speeding, so only averages rather than specific superlimit speeds should be stored.)
Also — there are other places in a PDA/phone with an address, most notably events in the calendar. It would be nice while looking at an event in the calendar (or to-do list) to be able to click “locate on the map.”
Submitted by brad on Sat, 2007-07-14 23:30.
For various reasons, a wide variety of otherwise free wifi hotspots require you to go through a login screen. (This is also common of course with for-pay hotspots where you must enter an account or room number.)
These login screens sometimes exist to control how many people access the hotspot. Sometimes they are just there to make sure the user knows who is providing the hotspot so as to be thankful. Often they are there to get you to click agreement to a set of terms and conditions for use (which most people just ignore but click on anyway.) Whatever reason they are there, they create problems. For example, they block non-browser oriented devices, like wifi phones, from using the hotspots. They also interfere with non-browser applications that want to use the network before the user has gone through the procedure with the browser.
Since we’re not going to make them go away, can we improve things? There have been suggestions in the past for standardizing the login protocols, so that devices like wifi phones can still get in, as long as there is no typing or little typing. One could even standardize delivery of a short message or logo from the hotspot provider so you know who has provided the free service. Clicking agreement to terms remains a problem on such issues. I don’t know how far those efforts have gotten, but I hope they do well.
Until then however, it might make sense to build a giant database of hotspots along with information on how to log into them. In most cases it involves doing a web fetch and then posting a form with a box checked and possibly some text in a box. There are really only so many different classes of login system. The database could map from SSIDs (for non-default SSIDs) or even MAC addresses. Laptops could easily store a large MAC based database, while phones and PDAs would have more trouble. However there are techniques, using hash tables and bitmaps designed for spell checking, which can compress these tables, since false hits on unknowns are not a problem.
Better still would be a way to “fingerprint” the login pages, since again there are only so many basic types. Then just store a set of scripts to calculate the fingerprints and scripts to fill out the forms.
When a laptop user — anywhere — using this system encountered a hotspot whose login page did not match any fingerprint (or which matched but failed to login) the software could capture the attempted session and fire off an E-mail (to be sent later, when connected) to the people maintaining the scripts. This team, perhaps paid, perhaps volunteer, could quickly develop scripts so that the next person to use that hotspot gets automatic login. Of course this doesn’t help at a new conference hotspot where all the conference goers can’t update their lists until they get on, but that’s only the first time.
Now one problem is that these scripts would automate the checking of “I agree to the terms” buttons. And that does raise some interesting issues. First, over whether the user truly agreed. Next, over whether the script provider is liable for violations. And third, whether the hotspot owners will feel the need to make their login unscriptable (for example using CAPTCHAs or worse) to prevent people doing auto-logon. I mean they tried to make it hard to log on for some reason, we suppose.
Standardization would help here. Perhaps somebody could draw up a contract with the basic terms found in almost all these terms of service (no spam, prohibitions on various illegal uses) and users could agree to that (on behalf of all hotspots) and they would be satisfied. The scripts could be programmed to be able to extract the terms and offer the user the chance to see them. On a wifi phone, the phone could extract the terms and E-mail them to the phone’s owner (the phone would be configured with that E-mail) over SMTP over TLS (don’t want to reveal the E-mail address to sniffers) so the user has a copy and can at least review them later.
Of course, not having hotspot owners afraid of liability would be nice, too.
Submitted by brad on Fri, 2007-06-29 12:48.
Earlier I wrote about the frenzy buying Plastation 3s on eBay and lessons from it. There’s a smaller scale frenzy going on now about the iPhone, which doesn’t go on sale until 6pm today. With the PS3, many stores pre-sold them, and others lined up. In theory Apple/AT&T are not pre-selling, and limiting people to 2 units, though many eBay sellers are claiming otherwise.
The going price for people who claim they have one, either for some unstated reason, or because they are first in line at some store, is about $1100, almost twice the cost. A tidy profit for those who wait in line, time their auction well and have a good enough eBay reputation to get people to believe them. Quite a number of such auctions have closed at such prices with “buy it now.” If you live in a town without a frenzy and line it might do you well to go down to pick up two iPods. Bring your laptop with wireless access to update your eBay auction. None of the auctions I have seen have gone so far as to show a picture of the seller waiting in line to prove it.
eBay has put down some hard terms on iPhone sellers and pre-sellers. It says it does not allow pre-sales, but seems to be allowing those sellers who claim they can guarantee a phone. It requires a picture of the actual item in hand, with a non-photoshopped sign in the picture with the seller’s eBay name. A number of items show a stock photo with an obviously photoshopped tag. In spite of the publicised limit of 2, a number of people claim they have 4 or more.
It seems Apple may have deliberately tried to discourage this by releasing at 6pm on Friday, too late to get to Fedex in most places. Thus all most sellers can offer is getting the phone Monday, which is much less appealing, since that leaves a long window to learn that there are plenty more available Monday, and loses the all-important bragging rights of having an iPhone at weekend social events. Had they released it just a few hours earlier, I think sales like this would have been far more lucrative. (While Apple would not want to leave money on the table, it’s possible high eBay prices would add to the hype and be in their interest.)
As before, I predict timing of auctions will be very important. At this point even a 1 day auction will close after 18 hours of iPhone sales, adding a lot of rish. The PS3 kept its high value for much of the Christmas season, but the iPhone, if not undersupplied, may drop to retail in as little as a day. A standard 1 week auction would be a big mistake. Frankly I think paying $1200 (or a $300 wait-in-line fee) is pretty silly.
The iPhone, by the way, seems like a cool generalized device. A handheld that has the basic I/O tools including GSM phone and is otherwise completely made of touchscreen seems a good general device for the future. Better with a small bluetooth keyboard. Whether this device will be “the one” remains to be seen, of course.
Update: read more »
Submitted by brad on Sun, 2007-06-24 20:50.
At Supernova 2007, several of us engaged Andrew Keen over his controversial book "The Cult of the Amateur." I will admit to not yet having read the book. Reviews in the blogosphere are scathing, but of course the book is entirely critical of the blogosphere so that's not too unexpected.
However, one of the things Keen said he worries about is what he calls the "scarcity of talent." He believes the existing "professional" media system did a good enough job at encouraging, discovering and promoting the talent that's out there, and so the world doesn't get more than slush with all the new online media. The amount of talent he felt, was very roughly constant.
I presented one interesting counter to this concept. I am from Canada. As you probably know, we excel at Hockey. Per capita certainly, and often on an absolute scale, Canada will beat any other nation in Hockey. This is only in part because of the professional leagues. We all play hockey when we are young, and this has no formal organization. The result is more talented players arise. The same is true for the USA in Baseball but not in Soccer, and so on.
This suggest that however much one might view YouTube as a vaster wasteland of terrible video, the existence of things like YouTube will eventually generate more and better videographers, and the world will be richer for it, at least if the world wants videographers. One could argue this just takes them away from something else but I doubt that accounts for all of it.