Digital cameras should have built-in tagging

So many people today use tags to organize photos and to upload them to sites like Flickr for others to search. Most types of tagging are easiest to do on a computer, but certain types would make sense to add to photos right in the camera, as the photos are taken.

For example, if you take a camera to an event, you will probably tag all the photos at the event with a tag for the event. A menu item to turn on such a tag would be handy. If you are always taking pictures of your family or close friends, you could have tags for them preprogrammed to make it easy to add right on the camera, or afterwards during picture review. (Of course the use of facial recognition and GPS and other information is even better.)

Tags from a limited vocabulary can also be set with limited-vocabulary speech recognition, which cameras have the CPU and memory to do. Thus, when taking a picture of a group of friends, you could say their names right as you took the picture and have it tagged.

Of course, entering text on a camera is painful. You don't want to try to compose a tag with arrow buttons over a keyboard or the alphabet. Some tags would be defined when the camera is connected to the computer (or written to the flash card in a magic file from the computer). You would get menus of those tags. For a new tag, you would just select something like "New tag 5" from the menu, and later have an interface to rename the tag to something meaningful.
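To make the placeholder idea concrete, here is a minimal sketch of the rename step that would run once the photos reach a computer. The data layout (a mapping from photo filename to a list of tag names) and the function name are assumptions made for illustration; no camera or photo software is known to work this way.

```python
# Hypothetical sketch: photos come off the camera carrying placeholder tags
# such as "New tag 5"; the user supplies a real name once, and every photo
# using the placeholder is updated.

def rename_placeholder_tags(photo_tags, renames):
    """Return a copy of photo_tags with placeholder names replaced.

    photo_tags: dict mapping photo filename -> list of tag names (assumed layout)
    renames:    dict mapping placeholder name -> meaningful name
    """
    return {
        photo: [renames.get(tag, tag) for tag in tags]
        for photo, tags in photo_tags.items()
    }


photo_tags = {
    "IMG_0001.JPG": ["New tag 5", "family"],
    "IMG_0002.JPG": ["New tag 5"],
    "IMG_0003.JPG": ["family"],
}
renamed = rename_placeholder_tags(photo_tags, {"New tag 5": "birthday party"})
```

Tags without an entry in the rename table pass through unchanged, so preprogrammed tags like "family" are left alone.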

As a cute interface, tag names could also be assigned with pictures. Print the tag name clearly on paper and take a picture of it in "new tag" mode. While one could imagine OCR here, you don't actually need it on the camera, since it doesn't matter whether the text is recognized perfectly right away. Just display the cropped handwritten text box in the menus of tags. Convert them to text (via OCR or human typing) when you get to a computer. You could also record a spoken name to associate with such tags, or with generic tags.

Cameras have had the ability to record audio with pictures for a while, but listening to all that audio to transcribe it takes effort. Trained speech recognition would be great here, but in fact all we really have to identify is when the same word or phrase is spoken as a tag on several photos. The person then types what they said just once, and all the photos on which that word was said are tagged automatically. If the speech interface is done right, menu use would be minimal and might not even be needed.
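The "type it once" step can be sketched as follows. This assumes that some audio matcher (not shown, and entirely hypothetical here) has already decided which recorded clips contain the same spoken phrase and given each distinct phrase a numeric cluster ID. The function names and data layout are illustrative only.

```python
from collections import defaultdict

# Hypothetical sketch: an upstream audio matcher has assigned a cluster ID to
# each spoken phrase, so clips that sound alike share an ID. No transcription
# has happened yet.

def group_photos_by_phrase(clip_clusters):
    """Map each phrase cluster ID to the photos on which it was spoken."""
    groups = defaultdict(list)
    for photo, cluster_ids in clip_clusters.items():
        for cluster_id in cluster_ids:
            groups[cluster_id].append(photo)
    return dict(groups)


def apply_labels(groups, labels):
    """Tag every photo in a cluster with the text the user typed once."""
    tags = defaultdict(list)
    for cluster_id, photos in groups.items():
        if cluster_id in labels:  # clusters not yet labeled are skipped
            for photo in photos:
                tags[photo].append(labels[cluster_id])
    return dict(tags)


clip_clusters = {
    "IMG_0001.JPG": [0, 1],  # two names spoken on this shot
    "IMG_0002.JPG": [0],
    "IMG_0003.JPG": [1],
}
groups = group_photos_by_phrase(clip_clusters)
tags = apply_labels(groups, {0: "Alice", 1: "Bob"})
```

The user listens to one clip per cluster and types its text, rather than transcribing every recording. The hard part, deciding that two clips are the same phrase, is left to the assumed matcher.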


When I use a digital camera, I just want to take a picture. When I use a cell phone, I just want to make or receive a call. Why do things have to be so complicated? People who are too dumb to organize their digital pictures shouldn't own a camera, or a computer for that matter, to begin with. But I'm getting too old for all of those unnecessary buttons. Whose fingers were they designed for anyway? And I don't like having to pull out my reading glasses just to make a phone call or to take a picture. It seems that the designers of these devices care more about impressing us with fancy features than they do about the functionality of their products. Digital zoom is a good example of that. Having said these things, I'll also add that I'm not against technology, but some of these features should be reserved for high-end equipment only. So in general, Brad, I would have to disagree with you about adding another unnecessary feature.

John at:

There are features that get in the way, and those that don't bother you if you don't use them. Some people love to tag their photos, and they upload them all to places like Flickr where that's important. For them, tagging when shooting might be very handy. I agree that it should not be designed to get in the way of the ordinary shooter, and I can't imagine why it might, other than adding another entry to some menus.

Voice tagging, on the other hand, can be done without menus, though you do need a way to turn it on, or perhaps indeed yet another button to hold down while talking -- though many cameras already have this. But beyond that button it need not impinge much on the UI. The main difference is to have the camera (or even the post-xfer software) understand that the voice annotations are not single blocks of audio, but rather a list of tags, with tags in common among many photos.

Of course I'm the sort of shooter who uses the optical viewfinder, so speaking to a mic while shooting would be easy. For those who hold the camera away from their head to shoot, having to bring it up to get a good audio recording might not be so natural.

What people want as a grail is to just shoot 'em, transfer them and have them organized with little or no effort. I am a big fan of things that can help with that (GPS and compass in camera, face recognition after the fact, automatic groupings by shooting session, etc.) but also like things that can organize with minimal effort, as voice tagging might. A speech interface on cameras actually could have many applications. They have the CPU for it, but CPU takes battery power.
