Scanning table for old digital cameras


I have several sheetfed scanners. They are great in many ways -- though not nearly as automatic as they could be -- but they are expensive and have their limitations when it comes to real-world documents, which are often not in pristine shape.

I still believe in sheetfed scanners for the home. In fact, one of my first blog posts here was about the paperless home, and some products now on the market are similar to this design, though none has the concept I really wanted -- a battery-powered scanner which simply scans to flash cards, which you then take to a computer later for processing.

My multi-page document scanners will do a whole document, but they sometimes mis-feed. My single-page sheetfed scanner isn't as fast or fancy but it's still faster than using a flatbed because the act of putting the paper in the scanner is the act of scanning. There is no "open the top, remove old document, put in new one, lower top, push scan button" process.

Here's a design that might be cheap and just what a house needs to get rid of its documents. It begins with a table with an arm coming out from one side, and on the arm a tripod screw to hold a digital camera. Also running up the arm is a USB cable to the camera. Also on the arm, at enough of an angle to avoid glare and reflections, is lighting, either white LEDs or CCFL tubes.

In the bed of the table is a capacitive sensor able to tell if your hand is near the table, as well as a simple photosensor to tell if there is a document on the table. All of this plugs into a laptop for control.

You slap a document on the table. As soon as you draw your hand away, the light flashes and the camera takes a picture. Then you replace or flip the document and it happens again. There is no need to push a button; the removal of your hand with a document in place causes the photo. A button will be present to say "take it again" or "erase that" but you should not need to push it much. The light should be bright enough so the camera can shoot fairly stopped down, allowing a sharp image with good depth of field. The light might be on all the time in the single-sided version.
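The trigger rule above can be sketched as a tiny state machine. This is only an illustration under assumptions: `hand_near` and `doc_present` stand in for whatever the capacitive sensor and photosensor would report, and the class name is made up for the example.

```python
class ShutterTrigger:
    """Fires once each time a hand withdraws while a document is on the table."""

    def __init__(self):
        # Remember the previous hand reading so we can spot the moment it leaves.
        self.hand_was_near = False

    def update(self, hand_near, doc_present):
        """Feed one pair of sensor readings; return True if the camera should fire."""
        fire = self.hand_was_near and not hand_near and doc_present
        self.hand_was_near = hand_near
        return fire


trigger = ShutterTrigger()
# (hand_near, doc_present): place a page, withdraw, wait, flip it, withdraw again.
readings = [(True, False), (True, True), (False, True),
            (False, True), (True, True), (False, True)]
shots = [trigger.update(hand, doc) for hand, doc in readings]
```

Because the rule fires only on the transition from "hand near" to "hand away", leaving a page sitting on the table does not produce repeat shots.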

The camera can't be just any camera, alas, but many older cameras in the 6MP range would get roughly 250dpi colour from a typical letter-sized page, which is quite fine. The key is that the camera has macro mode (or can otherwise focus close) and can be made to shoot over USB. An infrared LED could also be used to trigger many consumer cameras. Another plus is manual focus. It would be nice if the camera could just be locked in focus at the right distance, as that means much faster shooting for typical consumer digital cameras. Ideally all of this (macro mode, manual focus) can be set over USB and thus done under the control of the computer.
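A rough way to check the resolution figure is to work out how many pixels land on each inch of the page. The sketch below assumes a 4:3 sensor and a letter page that fills the frame as fully as it can; the function name and defaults are only for illustration.

```python
import math


def effective_dpi(megapixels, page_w_in=8.5, page_h_in=11.0, aspect=4 / 3):
    """Approximate dpi when the page's long side runs along the sensor's long side."""
    long_px = math.sqrt(megapixels * 1e6 * aspect)
    short_px = long_px / aspect
    # The page has to fit in both directions, so the tighter one sets the scale.
    return min(long_px / page_h_in, short_px / page_w_in)


dpi_6mp = effective_dpi(6)
dpi_2mp = effective_dpi(2)
```

Under these assumptions 6MP works out to roughly 250 dpi and 2MP to roughly 145 dpi. Real results will be lower if the page does not fill the frame.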

Of course, 3-D objects can also be shot in this way, though they might get glare from the lights if they have surfaces at the wrong angles. A fancier box would put the lights behind cloth diffusers, making things bulkier, though it can all pack down pretty small. In fact, since the arm can be designed to be easily removed, the whole thing can pack down into a very small box. A sheet of plexi would be available to flatten crumpled papers, though with good depth of field, this might not strictly be necessary.

One nice option might be a table filled with holes and a small suction pump. This would hold paper flat to the table. It would also make it easy to determine when paper is on the table. It would not help stacks of paper much but could be turned off, of course.

A fancier and bulkier version would have legs and support a second camera below the table, which would now be a transparent piece of plexiglass. Double-sided shots could then be taken, though in this case the lights would have to be turned off on the other side when shooting, and a darkened room or shade around the bottom and part of the top would be a good idea, to avoid bleed through the page. Suction might not be such a good idea here. The software should work out whether the other side is blank and discard or highly compress that image. Of course the software must also crop images to size, and straighten rectangular items.
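One simple way to guess whether the reverse side is blank is to count dark pixels. This is a minimal sketch, assuming the image has already been reduced to rows of grayscale values from 0 to 255; both thresholds are guesses that would need tuning, especially if ink bleeds through from the other side.

```python
def is_blank(gray_rows, dark_below=128, max_dark_fraction=0.002):
    """Return True if almost no pixels are darker than `dark_below`."""
    total = sum(len(row) for row in gray_rows)
    if total == 0:
        return True
    dark = sum(1 for row in gray_rows for px in row if px < dark_below)
    return dark / total <= max_dark_fraction


# Synthetic 100x100 pages: one empty, one with a block of "ink".
blank_page = [[250] * 100 for _ in range(100)]
printed_page = [row[:] for row in blank_page]
for y in range(10, 20):
    for x in range(50):
        printed_page[y][x] = 0
```

Here `is_blank(blank_page)` is true and `is_blank(printed_page)` is false, since 5% of the second page is dark.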

There are other options besides the capacitive hand sensor. These include a button, of course, a simple voice command detector, and clever use of the preview video mode that many digital cameras now have over USB (i.e., the computer can look through the camera and see when the document is in place and the hand is removed). This approach would also allow gesture commands, little hand signals to indicate if the document is single-sided, or B&W, or needs other special treatment.
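The preview-video approach could be as crude as frame differencing: wait for motion, then fire once the picture has stopped changing. The sketch below assumes preview frames arrive as small grayscale grids of equal size; the thresholds and function names are illustrative, not taken from any camera API.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute per-pixel difference between two equal-sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count if count else 0.0


def first_stable_index(frames, motion_threshold=2.0, still_frames=3):
    """Index of the frame where the scene has been still for `still_frames`
    consecutive frame pairs after some motion was seen; None if that never happens."""
    seen_motion = False
    still_run = 0
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) > motion_threshold:
            seen_motion = True
            still_run = 0
        elif seen_motion:
            still_run += 1
            if still_run >= still_frames:
                return i
    return None


empty = [[200] * 4 for _ in range(4)]   # bare table
hand = [[50] * 4 for _ in range(4)]     # hand covering the view
page = [[220] * 4 for _ in range(4)]    # page in place, hand gone
shoot_at = first_stable_index([empty, empty, hand, page, page, page, page])
```

Requiring a few still frames in a row avoids firing while the hand is merely pausing over the page, at the cost of a short delay before each shot.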

The goal, however, is a table where you can just slap pages down, move your hand away slightly and then slap down another. For stacks of documents one could even put down the whole stack and take pages off one at a time, though this would surely bump the stack a bit, requiring a bit of cleverness in straightening and cropping. Many people would find they could do this as fast as some of the faster professional document scanners, and with no errors on imperfect pages. The scans would not be as good as true scanner output, but good enough for many purposes.

In fact, digital camera photography's speed (and ability to handle 3-D objects) led both Google Books and the Internet Archive to use it for their book scanning projects. This was of course primarily because they were unwilling to destroy books. Google came up with the idea of using a laser rangefinder to map the shape of the curved book page to correct any distortions in it. While this could be done here it is probably overkill.

One nice bonus here is that it's very easy to design this to handle large documents, and even to be adjustable to handle both small and large documents. Normally scanners wide enough for large items are very expensive.



I like the idea, but I don't think you need 6MP. From my experience 150dpi color is as good as 300dpi B&W for readable text. For that you can get away with 2MP. I've worked a bit with the high-end Logitech webcams and they might fit the bill. They have glass optics and autofocus.

Once you have the right settings, disable RightLight and autofocus, since they are slow. Also there isn't a 1/4-20 thread, which makes mounting a pain. There is a web site for people developing applications for Logitech webcams.

And they are not up to it. The optics are not as good as they advertise. I sent one back because it had bad optics, and they sent back one with similarly bad optics. They work fine as webcams for videoconferencing, but not as still cameras. Besides, old 6MP digital cams are not much more expensive.

I don't have access to a quickcam any more, but it looks like you are right. This webpage shot some resolution targets and if the center is in focus, the edges are soft focus.

I tried your idea on my digital camera and the results are pretty good. The only problem is that there isn't any way to trigger the camera externally, but the setup was stable enough that it didn't cause any issues. Thanks for the idea.

Yes, I also experienced non-flat fields on my quickcam orbit AF. However instead of circular not-flatness, which is common with cheaper designs, I got poor flatness left to right, which is a sign of low quality manufacture.

I hope to do this project with my own papers.

If I build any hardware at all, it'll probably just be a foot pedal. Presumably my feet are free, and a foot switch is a very fast, reliable, and cheap way to trigger. The switch might even be the spacebar of my keyboard, which brings the cost down to zero.

I'm thinking about the OCR problem, given that:
- my OCR will not be very good;
- I won't want to manually tag each page (quickly adds a lot of UI and slowdown);
- and my searches will be rare.
I'm thinking about writing code to cluster similar-looking pages together, so I can make one tag for "current PG&E bills" and pretty-reliably get all those pages tagged. Date-scanned will probably help a lot, too.
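One cheap way to try the clustering idea is an "average hash" on tiny thumbnails, grouping pages whose hashes differ in only a few bits. This is a sketch of one possible technique, not a finished design: it assumes each page has already been shrunk to a small grayscale grid, and the distance limit is a guess.

```python
def average_hash(thumb):
    """Bit tuple: 1 where a thumbnail pixel is brighter than the thumbnail's mean."""
    pixels = [px for row in thumb for px in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if px > mean else 0 for px in pixels)


def hamming(a, b):
    """Number of positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_pages(thumbs, max_distance=2):
    """Greedy clustering: each page joins the first cluster whose representative
    hash is within `max_distance` bits, otherwise it starts a new cluster."""
    clusters = []  # list of (representative_hash, [page indices])
    for index, thumb in enumerate(thumbs):
        h = average_hash(thumb)
        for rep, members in clusters:
            if hamming(rep, h) <= max_distance:
                members.append(index)
                break
        else:
            clusters.append((h, [index]))
    return [members for _, members in clusters]


# Toy 4x4 thumbnails: two near-identical "bills" and one differently laid out page.
bill_a = [[0, 0, 255, 255], [0, 0, 255, 255], [255] * 4, [255] * 4]
bill_b = [[0, 0, 255, 255], [0, 255, 255, 255], [255] * 4, [255] * 4]
receipt = [[255] * 4, [255] * 4, [0] * 4, [0] * 4]
groups = cluster_pages([bill_a, receipt, bill_b])
```

Each resulting group could then be tagged once, and the scan date used to narrow things further.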

I've set up foot switch systems and I've discovered that most users hate them. I suspect it would be less of a problem if you could operate it barefoot, but that isn't really an option in an industrial setting. If you plan to do your own image processing, I think buying some green felt will be helpful. I forget which color worked the best, but it was a fairly bright shade (kelly green?). When I went down to Michaels, they had some 1 ft square swatches for cheap. The right shade of green makes it easy to chroma key.
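With a green background, finding the document can be as simple as masking out green pixels and taking the bounding box of what remains. A minimal sketch, assuming the image is rows of (r, g, b) tuples; the `margin` value is a guess and would depend on the felt and the lighting.

```python
def is_background(pixel, margin=40):
    """Treat a pixel as green felt if green beats both red and blue by `margin`."""
    r, g, b = pixel
    return g - max(r, b) > margin


def document_bbox(rgb_rows, margin=40):
    """Inclusive bounding box (left, top, right, bottom) of non-green pixels,
    or None if the whole frame looks like background."""
    xs, ys = [], []
    for y, row in enumerate(rgb_rows):
        for x, pixel in enumerate(row):
            if not is_background(pixel, margin):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


GREEN = (30, 180, 60)
WHITE = (240, 240, 240)
# 6x5 frame of felt with a white "page" covering columns 2-4, rows 1-3.
frame = [[GREEN] * 6 for _ in range(5)]
for y in range(1, 4):
    for x in range(2, 5):
        frame[y][x] = WHITE
bbox = document_bbox(frame)
```

A bounding box only crops; straightening a skewed page would still need its edges or corners to be found.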

A few people in the document recognition field are aiming toward this. One of the leaders, Thomas Breuel, has written and spoken a fair amount about what he calls "Oblivious Scanning". There's a biennial workshop, the International Workshop on Camera-Based Document Analysis and Recognition. My group's paper at this past workshop was about how to decide when to trigger the camera based on crude motion sensing from a video-resolution stream. What we and several other groups have found is that we have *not* found a consumer camera that will trigger and ship the image quickly enough for a snappy response. I had a summer intern and a senior engineer work on this and they spent weeks hassling with the Canon SX100 API---and reportedly that's one of the friendlier ones. In our rig the maximum capture rate is about 2 seconds/page for processing, then another half second or so for the user to turn the page. This is frustrating and not at all acceptable for casual document scanning. Also check out the SilverCreations sceye. I haven't tried it, but the specs say only 12 pages/minute.

Once the hardware issue is resolved, the really interesting problem comes in processing the document images. This is true of scanned documents as well, of course. But the variety of documents you can capture with a camera and the greater image degradation due to lighting, skew, wrinkles, and resolution make the problem significantly harder. Neat Receipts is in airport kiosks now, but can only do a fraction of the document recognition you'd like. The document recognition problem goes far beyond OCR. You need doctype classification at multiple levels, then parsing of different items on the document to support organization and search. This is computer vision, really, and it is hard.

While I'd love to ramp up an R&D effort on the casual document scanner/organizer, one of our questions is, what is the market size for such a thing? The answer to that question will determine whether a turnkey solution gets built, or if hobbyists will have to forever cobble together their own custom solutions.

I don't know how important the OCR stuff is. I've had friends who loved the Paperport software and never used the OCR. Too bad they bundled it with junky overpriced hardware. For most people, something that is legible is good enough. Look at how many people still tolerate fax output and don't even use the fine mode.

If you look at the resolution target in the Logitech webcam link below, the center part of the image is actually pretty good. If Logitech spent another 5 bucks on the optics for the laptop version and bundled in some document management software, they could raise the price $20 and sell another million units a year. The laptop version has a little stand. You just need to add a right angle joint to shoot down.

I agree that the cheaper consumer cameras won't deliver the image very quickly. The best bet may be something like a used D60 or 10D or 20D in the Canon dept, with the 50mm f/1.8. Used, that is under $400, perhaps much less, and it can shoot multiple shots per second for a while, not multiple seconds per shot. 1080-line HD videocams do offer something: not quite the resolution I would want, but fast response and better quality than the QuickCam. Consumer cams like the QuickCam should arrive with better optics before too long.

I agree the software is nontrivial. Good document management software has a market at all levels, including pro scanners in offices and cameras in homes. Done with a fast camera or video camera, scanning like this can be faster on imperfect documents than any fancy document scanner, and so a market might be found in offices. In fact a typical office might find that the right combination is a document scanner for neat stacks of paper, and a camera scanner with suction table and nice lighting for irregular, bent or torn pieces of paper.

As for document recognition, one nice trick is that this can get better in the future for you. So it might not recognize a phone bill today, but before too long it will. (Picasa will even do faces for you now.)

One nice way to do document recognition would be with similarity, for which there are many algorithms. The program might say, "I see you have a lot of documents that match this template. What type of document is this?" and then you could say what it is and it would categorize them all. Thus the receipts from all the stores you shop at regularly would be grouped, phone bills etc.

I found an alternative firmware for Canon cameras that might be useful for this project. You load the firmware from the SD card. Another feature is that you can use the USB connection for a remote shutter release.

The cheapest new camera I could find to use the firmware is the A470.

Hi Brad,

In the world of cinema and digital video, we often use a spray that will make most objects non-reflective. It's sort of fatty; I could not find an English translation for its name.

We use it so it is possible, for example, to shoot videos of things like chromed pieces of metal, spoons, forks... ;-)

Another possibility for the scanner would be to use only diffuse light. You can obtain such a light by putting a directional lamp inside a box made of several layers of tracing paper. This is a cheap alternative to light boxes.

