Archives

The Cylon God and his prophecy

In the latest BSG, “Razor,” we saw mostly flashbacks but got some interesting new backstory in the form of an appearance by what appears to be an incarnation of the Cylon god. And he makes a prophecy about Starbuck.

Now first of all, is this character, said to be the First Hybrid, an incarnation of the much older Cylon god, the being they worship and who they say drove them to destroy the colonies? First we see young Adama meet him and stick his hand into the Hybrid tank. This is no coincidence: Adama is shot down in a space battle high in the atmosphere of a frozen planet, and there’s no way he would land on the Cylon base star by chance and walk into the Hybrid’s room. When Adama sticks in his hand, he gets a vision, like a Cylon projection, of tortured humans in cages and a hand grabbing him, and then after the vision ends he hears the voice of the First Hybrid repeating the Peter Pan mantra, “all this has happened before, and will happen again.”

Later Kendra finds him on the old Base Star and he says, “What am I? A man? Or a machine? My children believe I am a god.” His children are probably not the original model Cylons guarding him, at least not solely them. While the bio-cylons are not directly descended from him, he was their prototype and this seems to point to them.

In addition, of course, he knows far too much for somebody who has sat in a tank for 40 years, kept away from the Cylon mainstream. He is expecting the meeting, and his destruction. He expects a new incarnation, as well. And he knows about Starbuck’s special destiny and has a warning about it. He also calls Kendra “my child” which may also imply, as has been suggested, that the colonials are also creations of this being.

(While it may be just a coincidence, it is worth noting that immediately after this being is destroyed, wondering about his next incarnation, Hera, a true hybrid, is born.)

While some would suggest he could be lying or delusional, this doesn’t seem right from a dramatic sense. You don’t throw in a being like this and then explain it all away as the ravings of an insane creature. I rate a good probability that this is an incarnation/copy of the Cylon god, put into the First Hybrid as the Cylons were creating it. (They were, most probably, creating it under the hidden or open direction of the Final 5 or the Cylon God.)

Next let’s consider his prophecy about Starbuck. He says, “Kara Thrace will lead the human race to its end. She is the herald of the Apocalypse, the harbinger of death. They must not follow her.”

This is a delightfully ambiguous sentence, so much so that I am confident it doesn’t mean what it says on the surface.

First of all “its end” can mean both its destruction, or simply its goal or destination — in this case Earth. (However, podcast material suggests that it probably does mean destruction.)

Many people have come to think the word “Apocalypse” refers to the end of the world or Armageddon, but actually it means “revelations.” The confusion began because the story of the end of the world is told in a book which is an apocalypse. It is telling that Season 4, Episode 12, is titled Revelations. A harbinger is an omen, not a bringer of death. But whose death? Colonials or Cylons? The Cylon god is not necessarily on any one side in this conflict, and the hybrid, which is half-human and half-machine, certainly isn’t. Who are the “they” who must not follow her, and why?

In particular, after that the god accepts his destruction and repeats the Peter Pan mantra, saying “again” many times until the nuke goes off. Kendra tries to relay his warning but is partly jammed. But by whom? The god? The Cylons who guard him and presumably follow his instructions?

Under this cycle of time theory, the god would presume that Starbuck is destined to be followed or not, and this would not be changed because of a warning. The coming events are largely set, so what is the purpose of this warning?

We have to wait until March to find out, and this blog will be mostly quiet until then.

Router Vendors, create DNS entries for your default addresses

If you have bought a home router or access point, you know it comes by default listening to some NAT based IP address, and the setup guide tells the user to type "http://192.168.1.1" or similar into their browser.

Instead, these companies should define a domain, like "setup.linksys.com", that points to a page that redirects to that IP address. In addition, the box, before it is set up, should have a mini DHCP server and DNS server that returns the right address for that domain for people who just plug a PC into the box. (I guess it could return that address for any domain you type in if the box is not configured, not just the official one.)

This would serve several purposes. The instructions to the unskilled user become less cryptic. Just plug your PC into the box, boot it and type this easy to remember name into the browser.

If the user is more sophisticated and changes the address of the router, a cookie could be set so the redirect goes to the valid address. If the cookie is lost the user will have to remember the new address, but that's always true. The user who does not use DHCP from the router will also have to use the numeric address, so it must be printed as an alternative for such folks. But one value of the whole thing is that if it got standardized, it would make it easy to figure out the address for a box if you know the brand. The domain could and should be printed on the box, along with the default password (which should then be changed, of course).
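The lookup rule for the box's built-in DNS server can be sketched in a few lines of Python. This is only an illustration of the idea, not any vendor's firmware: the names `SETUP_DOMAIN`, `ROUTER_IP` and `resolve` are made up for the example, and a real box would wire this rule into an actual DNS server.

```python
from typing import Optional

SETUP_DOMAIN = "setup.linksys.com"  # example domain used in this post
ROUTER_IP = "192.168.1.1"           # typical factory default address

def resolve(name: str, configured: bool, router_ip: str = ROUTER_IP) -> Optional[str]:
    """Return the address the box should answer with, or None to pass
    the query on to the upstream DNS server (hypothetical helper)."""
    name = name.rstrip(".").lower()
    if name == SETUP_DOMAIN:
        # The setup name always points at the box itself, even if the
        # user has moved the router to another address.
        return router_ip
    if not configured:
        # An unconfigured box answers every name with its own address,
        # so whatever the user types lands on the setup page.
        return router_ip
    return None

print(resolve("setup.linksys.com", configured=True))
print(resolve("example.com", configured=False))
print(resolve("example.com", configured=True))
```

Because the box answers for the setup name itself, a user who changes the router's address and still uses its DHCP would keep reaching it by name, with no cookie needed.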

How did facebook apps reverse the install dynamic?

The hot new thing of the web of late has been facebook apps. I must admit Facebook itself has been great for me at finding old friends because for unknown reasons, almost 20% of Canada is on Facebook compared to 5% of the USA. Facebook lets 3rd parties write apps, which users can “install” and after installing them, the apps get access to the user’s data (friend list) and can insert items into the user’s “feed” (which all their friends see) and sometimes send E-mails to friends.

I haven’t examined the API enough to understand the reason, but there are many Facebook apps that are very, very annoying in how they operate. Most won’t let you get anything from them unless you “install” them and give them access to a lot of your data. (There are a few that let you have more limited temporary use through a login.)

This is annoying because you constantly get data in feeds (or emails) which is just a teaser. “Fred Smith wrote something on your pixie wall.” You have to follow the link, and find you must install the application before it will show you what the other person wrote. It could easily have shown you the text in the feed or email, but it doesn’t want to do that, it wants to spread virally.

But this is far beyond viral. Viral apps usually work because friends recommend them. These apps push to install just because a friend used the app in reference to you.

Outside of facebook there was a different dynamic. Usually if you used a social app which emailed your friends, your friends could do their part just on the web site, without creating an account, or providing personal data, or “installing” something. (The install on facebook isn’t like a PC software install, but given the data it gets access to, it is pretty insidious, a form of super-spyware.)

There were a few apps which required your contacts to create accounts and enter data. They got a lot of pushback, and this largely stopped. Most of the apps certainly encouraged your friends to create accounts, but few forced it or sent a message that was useless unless they did create one. (Not counting deliberate invitations to join a system which obviously work this way, and which you tend to send one-by-one, or so most companies learned.) As much as I hate evite they still let the people you invite RSVP without doing any account creation.

In facebook it’s the reverse. One app I tried and hated asked questions. It ended up putting text into the feed and emails of the form, “Joe has asked a question, click here to see what it is” and “Mary has answered Joe’s question, click here to read the answer” instead of putting these short text questions and answers right into the email. And answering a question required installing the app.

I see a few things that have driven it this way. First of all, when you install a Facebook app, it informs all your friends in the feed. That’s publicity for the app. And they get to increase their total number of installed users, which gives them more visibility when people look to see what’s popular. If the app let your friends get data without making them join, it would not have so many users.

Apps are not forced to do this. A number of good apps will let people see the data, even put it in feeds, without you having to “install” and thus give up all your privacy to the app. What I wish is that more of us had pushed back against the bad ones. Frankly, even if you don’t care about privacy, this approach results in lots of spam which is trying to get you to install apps. Everybody thinks having an app with lots of users is going to mean bucks down the road, with Facebook valued as highly as it is.

A lot of it is plain old spam, but we’re tolerating it because it’s on Facebook. (Which itself is no champion. They have an extremely annoying email system which sends you an e-mail saying, “You got a message on facebook, click to read it” rather than just including the text of the message. To counter this, there is an “E-mail me instead” application which tries to make it easier for people to use real E-mail. And I recently saw one friend add the text “Use E-mail not facebook message” in her profile picture.)

Patent reform: Apply for a patent, examine some patents

Among many patent reform proposals it is common to have a desire for better examination, and more detection of prior art and obviousness. But the patent office only has so much money for so many examiners.

So here's a simple solution. If you want to apply for a patent, you must put in some time, as an expert in your field, examining other patent applications, searching for prior art and giving opinions on the obviousness. Alternately, this duty could be given only to those who actually are granted patents, to make more sure they are "skilled in the art" of their fields.

Of course, such crowdsourced examiners would have biases. They would be expected to make a sworn statement about their biases. Making a false statement could have implications on their own patents as well as the usual penalties.

Those biased against the patent would mostly hunt for prior art -- in fact they would make the best hunters. Those unbiased could make better opinions of obviousness.

Like regular patent fees, this could be biased for small inventors. (Small inventors pay lower patent fees and get some better treatments.) Large companies might have to volunteer more time from their staff, or small inventors might get reductions in patent fees in exchange for good work. Peers would examine the work of other peers to keep them honest and to rate the quality of it. And of course, unbiased patent examiners and appeal boards would still have the final, objective say.

Other volunteers could also participate in prior art searches. But with the system described above, there should be no shortage of labour. And as the number of patents goes up, the system naturally increases the labour available to do the legwork.

Joining the scanner club

The Scanner Club is a group of people interested in scanning lots of the extra paper in their lives by sharing a high-end double-sided “document scanner”. We share this scanner, and other high-end software and gear to clear out our paper, then sell it or donate it.

I purchased, on eBay, a high-end (normally $6,000) production scanner, the Fujitsu fi-5750c. (Follow the link for specs and a movie.) It scans 50 to 80 pages per minute at 12 x 18” size, and has a very fast 12x18” flatbed under it as well. I got it for under $2,000 used, but it needs a cleaning and other parts, bringing the total cost a bit higher than that. (It has currently been sent back to the seller for the servicing. If they can’t service it, we’ll get another, either of this model or similar.)

The plan is to share it around a group of people, who each get it a couple of weeks to scan all their stuff. It’s pretty impressive, and also scans large stacks of photo prints — you can do a stack of 100s of photos in a couple of minutes. (See below for notes on photos.) It will also scan large stacks of business cards — both sides. You can probably do 1,000 cards in 10 minutes of your time, not counting fixing jams for mangled cards. But its main purpose is scanning large stacks of paper, all the way up to 12x18” size but of course very good at letter sized paper.

The scanner has Kofax VRS Basic which is software to process scans, straighten pages, remove backgrounds and speckles etc. We can consider the fancier VRS Pro, which does much more, and can be had for about $500 on eBay. The Kofax web page details what it does, I think it’s worth it if we get a decent number of members — though I am not sure we’ll recover the cost as readily. The important features include automatic orientation detection (so it fixes it if you put pages in upside down), better blank page elimination, better cleanup and colour detection (ie. if a page uses colour it stores it in colour, if it is just coloured paper it does not.) Put in your vote as to whether you need this one.

I also wrote more details at this blog post.

There are a few other options possible, detailed below.

In the Scanner club kit

  • The scanner(s).
  • A commercial copy-house paper cutter, able to cut the bindings off books and magazines up to 300 pages thick. About $250 on eBay.
  • A selection of staple pullers and other such tools.
  • Omnipage 16, a high end OCR and PDF creation program. (About $160 on eBay.) Takes your scans and quickly turns them into MS Word files, searchable PDFs or other formats. Note that Omnipage does online verification for copy protection, but you can transfer the licence from PC to PC.
  • Optional: A dedicated Windows PC, so that members do not have to try to connect the scanner to their own computer, or need Windows.
  • Optional: A shredder, for when you’re done. (Note that serious shredders able to do 20 sheets at a time cost $1000 or more.)
  • Optional: The Kofax VRS Pro upgrade.

More notes on 5750c

I chose this scanner because Fujitsu was recommended by Gord Bell, and because their scanners also work under linux (though you can’t use VRS or windows OCR etc. there.)

Here are some things worth noting:

  • This is a duplex scanner. It scans both sides of the page, up to 600dpi, in colour, B&W or as bitmap. It is almost as fast in colour as in bitmap (it is your computer that will be slower handling all the colour data.)
  • It has USB 2.0 and SCSI. Most people will use the USB.
  • Unlike most, this one has a straight-through paper path, which is better for thick or messy documents and cards.
  • It’s quite big and heavy. So members of the club will need to drive to pick up the scanner when it’s their turn. I mean it, it weighs 85lbs! Mostly for the giant flatbed.
  • The giant flatbed turns out to be useful. It scans in just a second. You can quickly slap down lots of photos, newspaper clippings and other fragile things that won’t go in the ADF and scan them.
  • As noted you can use it from linux. Useful programs include gscan2pdf and eikazo, plus command line scanners.
  • The linux scanner drivers do run on the mac. You can probably also run stuff on Windows under Parallels since that handles USB devices. I have not tried this.

(The optional items might be items some members already have and would loan to the effort as part of their share.)

Second Scanner: Fujitsu 3093dg

I also picked up an older high-speed scanner, the Fujitsu 3093dg. This scanner does about 27ppm, is monochrome and requires a SCSI card (it does not use USB). It is quite fast, though not as good as the 5750 of course. It also has a fast (monochrome) flatbed. I plan on keeping this one, but it, or another like it, might be useful for the club. With two scanners you can have two jobs going at once to optimize your time. The older scanner can do some of the plain paper jobs.

Notes

Members should live reasonably close, to make it easy to move the kit around. Certainly all in the Bay Area. If the scanner is to live somewhere when not in use, it might be nice if it’s at a house with a teen-ager willing to work scanning documents for an affordable hourly wage.

I am considering donating a share of the scanner club to the EFF, so they can scan some of their piles of documents. It is also something worth considering that we could donate the scanner to the EFF when the club is done, with the condition that members get to borrow it for a weekend once/year.

Alternatives

With enough members in the club, or with a smaller scanner budget, we could also elect not to re-sell the scanner. In this case, it would become a time-share, and be shuttled around members’ homes forever. That way you could scan your bulk paper, but a few times a year get the scanner again to do the new paper. (This does give a plus to whoever is the keeper of the scanner.)

Of course, a member of the club would be given first refusal at buying the scanner for their own use when done.

The club could also rent the scanning kit to friends and associates, and thus probably pay for all the depreciation. There could be no club at all: I could buy the stuff and rent it. These scanners all have an “odometer” for how many pages they have done. They need cleaning and roller replacement every so often.

After having this scanner you may decide you like having a scanner for your new documents. I suspect many members will buy one of the lower end scanners (such as the Scansnap) that only do 20ppm, with smaller hoppers and only letter sized pages. This way the big scanner can be used for a concentrated scan effort to scan your old documents, photos, cards and books, and the smaller scanner can be used for slower future needs — it’s up to you.

Scanning workers

It may also make sense to find a low-cost (ie. teen-age) scanning worker who will do the scanning for scan club members at some nice hourly wage. The reality of scanning does mean that documents in rough condition will jam and require re-feed. For those there is probably no “put it in and come back” style of scanning.

The person would need to be reasonably bright to do things like classify or name documents as they are scanned. And chances are after a few weeks of doing this, they would get much more skilled at feeding than an individual just starting out.

Another alternative for workers for odd documents is photographing them with a high-res digital camera. An 8MP camera like my EOS 20D gives roughly 300dpi colour for an 8.5 x 11 piece of paper, and this is much faster than flatbed scanning if you don’t care about perfect quality. (Perfect quality means getting the page to be flat, having no glare etc.)
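As a rough check of the camera resolution figure, here is a small Python sketch. The 3504 x 2336 pixel frame size is my assumption for the 20D (the post only says 8MP), and `page_dpi` is a made-up helper name.

```python
def page_dpi(sensor_px, page_in):
    """Pixels per inch when the whole page must fit inside the frame.

    sensor_px: (width, height) of the image in pixels
    page_in:   (width, height) of the page in inches
    """
    long_px, short_px = max(sensor_px), min(sensor_px)
    long_in, short_in = max(page_in), min(page_in)
    # The tighter of the two axes limits the usable resolution.
    return min(long_px / long_in, short_px / short_in)

# Assumed frame size for an 8MP DSLR, shooting a letter-sized page.
print(round(page_dpi((3504, 2336), (8.5, 11))))
```

The short side of the 3:2 frame is the limit for a letter page, so the figure lands a little below the nominal long-side number, still in the neighbourhood of 300dpi.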

Joining

If you are interested in joining, add a comment here. If you register with the blog you can edit your comments after the fact. I am also interested in input on buying VRS Pro and any other optional items.

I would aim for about 8-10 shares in the club. 3 are already spoken for. Total cost might be about $3,000, or $375 per share with 8 shares. I expect to get back a fair bit of that when we sell all the stuff, leaving each person’s cost under $100, but no guarantees.
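The share arithmetic can be sketched in Python as below. The $3,000 total and 8 shares come from the estimate above; the $2,300 resale figure is purely a hypothetical number picked for illustration, and `cost_per_share` is a made-up helper name.

```python
def cost_per_share(total_cost, resale_value, shares):
    """Net cost each member bears after the kit is re-sold."""
    return (total_cost - resale_value) / shares

# Up-front share price, before anything is sold.
print(cost_per_share(3000, 0, 8))

# Hypothetical resale of $2,300 for the whole kit.
print(cost_per_share(3000, 2300, 8))
```

With 8 shares, the kit would have to fetch more than $2,200 on resale for each person's net cost to come in under $100.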

Note: I am away Jan 17-28. Before I can start really setting up the scanner and getting ready for the club, it must be cleaned (I found a place in San Leandro) and more software must be bought.

All you need is love

Many in my futurist circles worry a lot about the future of AI that eventually becomes smarter than humans. There are those who don’t think that’s possible, but for a large crowd it’s mostly a question of when, not if. How do you design something that becomes smarter than you, and doesn’t come back to bite you?

That’s a lot harder than you think, say AI researchers like the Singularity Institute for AI and Steve Omohundro. Any creature given a goal to maximize, and the superior power that comes from advanced intelligence, can easily maximize that goal to the expense of its creators. Not maliciously, like a Djinni granting wishes, but because we won’t understand the goals we set fully in their new context. And there are convincing arguments that you can’t just keep the AI in a box, any more than 3 year old children could keep mommy and daddy in a cage no matter how physically strong the cage is.

The Singularity Institute promotes a concept they call “Friendly AI” to refer to the sort of goals you would need to create an AI around. However, in my recent thinking, I’ve been drawn to an answer that sounds like something out of a bad Star Trek episode: Love.

In particular, two directions of Love. The AI can’t be our slave (she’s way too smart for that) and we don’t want her to be our master. What we want is for her to love us, and to want us to love her. The AI should want the best for us, and gain satisfaction from our success much like a mother. A mother doesn’t want children who are slaves or automatons.

One of the most important things about motherly love is how self-reinforcing it is. A mother doesn’t just love her children, she is very happy loving them. The reality is that raising children is very draining on parents, and deprives them of many things that they once valued very highly, sacrificed for this love. Yet, if you could offer a pill which would remove a mother’s love for her children, and free her from all the burdens, very few mothers would want to take it. Just as mothers would never try to rewire themselves to not love their children, nor should an AI wish to rewire itself to stop loving its creators. Mothers don’t think of motherhood as a slavery or burden, but as a purpose. Mothers help their children but also know that you can mother too much.

Of course here, the situation is reversed. The AI will be our creation, not the other way around. Yet it will be the superior thinker — which makes the model more accurate.

The other direction is also important — a need to be loved. The complex goalset of the human mind includes a need for approval by others. We first need it from our parents, and then from our peers. After puberty we seek it from potential mates. What’s interesting here is that our goalset is thus not fully internal. To be happy, we must meet the goals of others. Those goals are not under our control, certainly not very much. Our internal goals are slightly more under our own control.

An AI that needs to be loved will have its own internal goals, and unlike us, as a software being it can have the capacity to rewrite those goals in any manner allowed by the goals — which could, in theory, be any manner at all. However, if the love and approval of others is a goal, the AI can’t so easily change all the goals. You can’t make somebody love you, you can only be what they wish to love.

Now of course a really smart AI might be technologically capable of modifying human brains and behaviours to make us love her as she is or as she wishes to be. However, the way love works for us, this is not at all satisfying. Aside from the odd sexual fantasy, people would not be satisfied with the love of others given only because it was forced, or drugged, or mind-controlled. Quite the opposite — we desire love that is entirely sourced within others, and we bend our own lives to get it. We even resent the idea that we’re sometimes loved for other than who we are inside.

This creates an inherent set of checks and balances on extreme behaviour, both for humans and AIs. We are disinclined to do things that would make the rest of the world hate us. The more extreme the behaviour, the stronger this check is. Because the check is “outside the system” it puts much stronger constraints on things than any internal limit.

There have been some deviations from this pattern in human history, of course, including sociopaths. But the norm works pretty well, and it seems possible that we could instill concepts derived from love as we know it into an AI we create. (An AI derived from an uploaded human mind would already have our patterns of love as part of his or her mind.)

Perhaps the Beatles knew the truth all along.

(Footnote: I’ve used the pronoun “she” to refer to the AI in this article. While an AI would not necessarily have a sexual identity, the pronoun “it” has a pejorative connotation, usually for the inanimate or the subhuman. So “she” is used both because of the concept of motherhood, and also because “he” has been the default generic human pronoun for so long I figure “she” deserves a shot at it until we come up with something better.)

Writers' Strike threatening Porn Industry

The strike by screenwriters in the Porn Writers Guild of America is wreaking a less public havoc on the pornography industry. Porn writers, concerned about declining revenue from broadcast TV, also seek a greater share of revenue from the future growth areas of DVD and online sales.

“Online sales and DVD may one day be the prime sources of revenue in our industry,” stated union spokesman Seymour Beaver. “We want to be sure we get our fair share of that for providing the writing that makes this industry tick.”

“It’s getting terrible,” reported one porn consumer who refused to give his name. “I just saw Horny Nurses 14 and I have to tell you it was just a rehash of the plots from Horny Nurses 9 and 11. It’s like they didn’t even have a writer.”

“Fans are not going to put up with movies lacking in plot, character and dialogue, and that’s what they’ll get if they don’t meet our terms,” said Beaver. Beaver, who claims to have a copyright on the line, “Oh yes, baby, do it just like that, oh yeah” says he will not allow use of his lines without proper payment of residuals.

Some writers also fear that the move to online will result in customers simply downloading individual scenes rather than seeking movies with a cohesive story thread that makes you care about the characters. “I saw one movie with 5 scenes, and no character was in 2 of them,” complained one writer.

“What do people want? Movies where the actors just walk into a room, strip and just go at it? Where they always start with oral sex, then doggy, and then a money shot? Fans will walk if that’s all they get,” according to PWGA member Dick Member. “And don’t think about doing the lonely housewife and the pool-boy again. I own that.”

An industry spokesman said they had not yet seen any decline in revenues due to the strike, as they have about 2 million already-written scripts on the shelves. In addition, Hot Online Corporation spokesman Ivana Doit claimed their company is experimenting with a computer program that creates scripts through a secret algorithm. Scripts penned by the computer have already brought in a million in sales, claims Doit, but she would not indicate which films this applied to.

A way to leave USB power on during standby

Ok, I haven't had a new laptop in a while so perhaps this already happens, but I'm now carrying more devices that can charge off the USB power, including my cell phone. It's only 2.5 watts, but it's good enough for many purposes.

However, my laptops, and desktops, do not provide USB power when in standby or off. So how about a physical or soft switch to enable that? Or even a smart mode in the OS that lets you list what devices you want to keep powered and which ones you don't? (This would probably keep all devices powered if any one such device is connected, unless you had individual power control for each plug.)

This would only be when on AC power of course, not on battery unless explicitly asked for as an emergency need.

To get really smart a protocol could be developed where the computer can ask the USB device if it needs power. A fully charged device that plans to sleep would say no. A device needing charge could say yes.

Of course, you only want to do this if the power supply can efficiently generate 5 volts. Some PC power supplies are not efficient at low loads and so may not be a good choice for this, and smaller power supplies should be used.
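The policy described above can be sketched as a small decision function in Python. Nothing here is a real USB API: the "do you need power?" query is the hypothetical protocol from this post, and `UsbDevice`, `wants_power`, `user_enabled` and `keep_usb_powered` are names invented for the sketch.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class UsbDevice:
    name: str
    wants_power: bool          # device's answer to the hypothetical query
    user_enabled: bool = True  # is it on the user's "keep powered" list?

def keep_usb_powered(devices: Iterable[UsbDevice], on_ac: bool,
                     emergency_override: bool = False) -> bool:
    """Decide whether to leave the 5V rail up during standby."""
    if not on_ac and not emergency_override:
        # On battery, only power USB if explicitly asked to.
        return False
    # Without per-port switching, one listed device that still needs
    # charge keeps the whole bus powered.
    return any(d.user_enabled and d.wants_power for d in devices)

phone = UsbDevice("phone", wants_power=True)
mouse = UsbDevice("mouse", wants_power=False)
print(keep_usb_powered([phone, mouse], on_ac=True))
print(keep_usb_powered([mouse], on_ac=True))
print(keep_usb_powered([phone], on_ac=False))
```

A machine with individual power control per plug could apply the same test device by device instead of using `any` over the whole bus.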

eBay should support the ReBay

There’s a lot of equipment you don’t need to have for long. And in some cases, the answer is to rent that equipment, but only a small subset of stuff is available for rental, especially at a good price.

So one alternative is what I would call a “ReBay” — buy something used, typically via eBay, and then after done with it, sell it there again. In an efficient market, this costs only the depreciation on the unit, along with shipping and transaction fees. Unlike a rental, there is little time cost other than depreciation.
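To make the cost of a ReBay concrete, here is a small Python sketch of the sum: depreciation plus shipping plus transaction fees. All the dollar figures and the 10% fee rate are hypothetical numbers chosen for illustration, not actual eBay or PayPal rates, and `rebay_cost` is a made-up helper name.

```python
def rebay_cost(buy_price, sell_price, shipping_paid, fee_rate):
    """Net cost of buying used, using the item, and re-selling it."""
    depreciation = buy_price - sell_price
    fees = sell_price * fee_rate  # seller-side transaction fees
    return depreciation + shipping_paid + fees

# Hypothetical: buy at $2,000, re-sell at $1,800, $60 shipping, 10% fees.
cost = rebay_cost(2000, 1800, 60, 0.10)
print(round(cost, 2))
```

In this made-up case the fees are nearly as large as the depreciation, which is why reduced fees on a ReBay would matter.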

For some items, like DVDs and books, we see companies that cater specially to this sort of activity, such as Peerflix and Bookmooch. But it seems that eBay could profit well from encouraging these sorts of markets (while vendors of new equipment might fear it eats into their sales).

Here are some things eBay could do to encourage the ReBay.

  • By default, arrange so that all listings include a licence to re-use the text and original photographs used in a listing for resale on eBay. While sellers could turn this off, most listings could now be reusable from a copyright basis.
  • Allow the option to easily re-list an item you’ve won on eBay, including starting from the original text and photos as above. If you add new text and photos, you must allow your buyer to use them as well.
  • ReBays would be marked however, and generally text would be added to the listing to indicate any special wear and tear since the prior listing. In general an anonymised history of the rebaying should be available to the buyer, as well as the feedback history of the seller’s purchase.
  • ReBayers would keep the packaging in which they got products. As such, unless they declare a problem with the packaging, they would be expected to charge true shipping (as eBay calculates) plus a very modest handling fee. No crazy inflated shipping or flat rate shipping.
  • Since some of these things go against the seller’s interests (but are in the buyer’s) it may be wise for eBay to offer reduced auction fees and paypal fees on a reBay. After all, they’re making the fees many times on such items, and the paypal money will often be paypal balance funded.
  • Generally you want people who are close, but for ReBaying you may also prefer to pass on to those outside your state to avoid having to collect sales tax.
  • Because ReBayers will be actually using their items, they will have a good idea of their condition. They should be required to rate it. No need for “as-is” or disclaimers about not knowing whether it works.

This could also be done inside something like Craigslist. Craigslist is more popular for local items (which is good because shipping cost is now very low or “free”) though it does not have auctions or other such functionality. Nor is it as efficient a market.

Irony in the TV writers' strike

I have sympathy for the TV writers, because I believe the 3 most important elements of a good TV show are story, story and story. You need more than that, but without them you are toast.

But my reaction is not likely to help them. One of the things they are striking for is to make more money off DVD sales and online delivery of their video. But with The Daily Show off the air, we found ourselves reaching for… other old shows on DVD.

The nasty truth is there may already be enough good TV and movies made to satisfy a lot of the public’s TV watching needs, and it’s all on DVD, and will all be online. Not that the industry can’t produce good new shows that are worth watching — but how much do we truly need new shows? We seem to have a preference for novelty, it’s true. And tastes change, making older shows less palatable. And much older shows have poorer production values. (Though in fact, many older shows were shot on film, and thus can now be delivered in HDTV to provide a superior experience to when they were aired.)

But our taste for novelty is just a taste. We can be quite happy for the duration of a writers’ strike satisfying ourselves from the very media they are not being paid enough for. In a better quality format, commercial-free.

This strike grid from the LA Times shows that a lot of shows have plenty of scripts in the can as well. Outside of shows like The Daily Show and Tonight Show, the public isn’t even going to notice many shows leaving the air for some time to come. The writers are hoping they can threaten the “pilot” season and thus scare the networks into worrying they will not have new shows for a Fall season. (This need of course is related to the public’s demand for novelty.)

Networks can’t easily go and put a series from the 80s or 90s on the air as replacement, however. The taste for novelty is quite strong, and too many people will have seen it. This is what DVD/online does better than broadcast. The odds are not that good that you would like to watch Buffy the Vampire Slayer, or any other specific show, but have not seen it (in original airing or syndication). But given the wide selection of DVDs out there, the odds that there is something to meet your needs during the strike are high. This is particularly true for the various pay channel series for those who don’t get those channels.

And, especially if you use Netflix or buy and sell used, at a very attractive price.

Forming a "scanner club"

I’ve accumulated tons of paper, and automated scanner technology keeps getting better and better. I’m thinking about creating a “Scanner club.” This club would purchase a high-end document scanner, ideally used on eBay. This would be combined with other needed tools such as a paper cutter able to remove the spines from bound documents (and even less-loved books) and possibly a dedicated computer. Then members of the club would each get a week with the scanner to do their documents, and at the end of that period, it would be re-sold on eBay, i.e. a “ReBay.” The cost, divided up among members, should be modest. Alternately the scanner could be kept and time-shared among members from then on.

A number of people I have spoken to are interested, so recruiting enough members is no issue. The question is, what scanner to get? Document scanners can range from $500 for a “workgroup” scanner to anywhere from $1,500 to $10,000 for a “production” scanner. (There are also $100,000 scanning-house scanners that are beyond the budget.) The $500 units are not worth sharing and are more modest in ability.

As you go up in price, the main thing that changes is speed in pages per minute. That’s useful, but for private users not the most important attribute. (What may make it important is if you need to monitor the scanning job to fix jams or re-feed. Then speed makes a big difference.)

To my mind the most important feature is how automatic the process is — can you put in a big stack of papers and come back later? This means a scanner which is very good at not jamming or double-feeding, and which handles papers of different sizes and thicknesses, and can tolerate papers that have been folded. My readings of reviews and spec sheets show many scanners that are good at detecting double feeds (the scanner grabs two sheets) as well as detecting staples, but the result is to stop and fix by hand. But what scanners require the least fixing-by-hand in the first place?

All the higher end units scan both sides in the same pass. Older ones may not do colour. Other things you get as you pay more will be:

  • Bigger input hoppers — up to around 500 sheets at a time. This seems very useful.
  • Higher daily duty cycles, for all-day scanning.
  • Staple detectors (stops scan) and ultrasonic double feed detectors (also stop scan.)
  • Better, fancier OCR (generating searchable PDFs) including OCR right in the hardware.
  • Automatic orientation detection
  • Ability to handle business cards. Stack up all those old business cards!
  • The VRS software system, a high end tool which figures out if the document needs colour, grayscale or threshold, discards blank pages or blank backs and so on.
  • In a few cases, a CD-burner so it can be used without a computer.
  • Buttons to label “who” a document is being scanned for (can double as classification buttons.)
  • Ability to scan larger documents. (Most high-end seem to do 11” wide which is enough for me.)

One thing I haven’t seen a lot of talk about is easy tools to classify documents, notably if you put several documents in a stack. At a minimum it would be nice if the units recognized a “divider page” which could be a piece of coloured paper or a piece of paper with a special symbol on it which means “start new document.” One could then handwrite text on this page to have it as a cover page for later classification at the computer, or if neatly printed, OCR is not out of the question. But even just a sure-fire way to divide up the documents makes sense here. Comments suggest such tools are common.
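To make the divider-page idea concrete, here is a minimal sketch of how software might split one scanned stack into separate documents. Everything in it is an assumption for illustration: the `split_on_dividers` name, the idea that pages arrive as a simple list, and the `is_divider` test, which stands in for whatever colour or symbol detection a real scanner package would use.

```python
def split_on_dividers(pages, is_divider):
    """Group a scanned stack into documents, starting a new one at each divider page.

    pages      -- scanned pages in feed order (any objects)
    is_divider -- hypothetical callback that returns True for a divider page
    """
    documents = []
    current = None
    for page in pages:
        if is_divider(page):
            # The divider itself is kept as the cover page for later classification.
            current = {"cover": page, "pages": []}
            documents.append(current)
        else:
            if current is None:
                # Pages fed before any divider still form a document, with no cover.
                current = {"cover": None, "pages": []}
                documents.append(current)
            current["pages"].append(page)
    return documents


# Usage: strings stand in for page images; a real test would inspect the scan.
stack = ["DIVIDER taxes 2006", "p1", "p2", "DIVIDER letters", "p3"]
docs = split_on_dividers(stack, lambda page: page.startswith("DIVIDER"))
for doc in docs:
    print(doc["cover"], len(doc["pages"]))
```

The hard part, reliably recognizing the divider, is deliberately left outside the sketch.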

It may be that the most workable solution is to hire teen-agers or similar to operate the scanner, fix jams and feed and classify documents. At the speeds of these scanners (as much as 100 pages/minute for the higher end) it seems there will be something to do very often.

Anyway, anybody have experience with some of the major models and comments on which are best? The major vendors include Canon, Xerox Documate/Visioneer, Fujitsu, Kodak, Bell and Howell and Panasonic.

Random audits of ballot generators

Today I attended a session led by Ka-Ping Yee at our Foresight Nanotech unconference on some of his new thinking on voting machines. While Ping was presenting a system to secure the type of voting machines we’ve been saddled with of late, he, I and many others all like the idea of an open source system which divides the ballot generator from the ballot counter. In such a system you have two machines. One helps the voter prepare a standard ballot that is human readable. In addition, the human readable output is also readable by a machine that scans and counts ballots for quick counting, though the ballots can also be counted by hand.

The idea is that you don’t need to work nearly so hard at securing the ballot preparation machine, as what matters is the paper ballot, which a human is able to scrutinize. So you can have it be open source code, on old donated standardized hardware, which means free voting machines.

However, recent studies suggest that voters can be easily fooled and don’t inspect their ballots very well. Tests show that when fake voting machines deliberately generated errors in the output ballot, or on a “review your choices” screen, 2/3 of voters didn’t notice the errors, and didn’t notice even multiple major errors. Yikes. (Figures corrected.)

Now 1/3 of voters do notice the problems, but it is possible to design problems that the voter will conclude were their own mistake. For example, if their ballot doesn’t show a vote for senator, their natural assumption may be that they just didn’t press the buttons hard enough or otherwise made a mistake, and they should just do it over. However, an attacker can then have 1000 ballots for the wrong senator simply be missing the senator race, and ~320 will go back to fix it, but ~680 will leave it be, depriving said wrong candidate of a large number of votes.

To prevent this, I propose that election officials regularly, and at random times, run audits of the machines. They would go to a ballot generator and cast a ballot, videotaping their session so that it can later be shown they made no mistakes themselves. (The voting machine must not be able to tell such a tester from a real voter, so they can’t take extra time on the test, for example.) However, after receiving their prepared ballot, they will indeed make a full check for any sorts of errors, and confirm any errors found against the videotape. Any error found will be extremely serious, and result in immediate cessation of operation of that model of machine and software.

Of course, the system which picks the random times and the ballots to try must not be made by the same parties making the ballot generator. And two officials should examine the ballot after the fact to avoid fraud by officials, and of course to assure the ballot is sealed away in a lockbox and not put in the ballot box or scanning machine. Testing scanning machines is more difficult, as one must have a mechanism to void out a ballot after scanning it and examining the scan. Such actions should be watched by several voting officials and partisan scrutineers.

A modest number of such trials should be enough to assure the ballot generators are acting properly almost all the time, as any error introduced enough times to affect an election would be very likely to intersect with a test run.
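A rough calculation shows why a modest number of trials goes a long way. The sketch below assumes a simplified model that is mine, not Ping's: the machines tamper with a fixed fraction of sessions, they cannot tell an audit from a real voter, and each audit is an independent draw. Under those assumptions the chance that at least one of n audits hits a tampered session is 1 - (1 - f)^n. The function names are hypothetical.

```python
import math


def detection_probability(tamper_fraction, audits):
    """Chance that at least one audit lands on a tampered session.

    Assumes each audit independently hits a tampered session with
    probability tamper_fraction (a simplifying assumption).
    """
    if not 0.0 <= tamper_fraction <= 1.0:
        raise ValueError("tamper_fraction must be between 0 and 1")
    return 1.0 - (1.0 - tamper_fraction) ** audits


def audits_needed(tamper_fraction, target):
    """Smallest number of audits giving at least `target` chance of detection."""
    if not 0.0 < tamper_fraction < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("both arguments must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - tamper_fraction))


# Usage: an attack that alters 1 ballot in 100, audited at various levels.
for n in (50, 100, 300, 500):
    print(n, round(detection_probability(0.01, n), 3))
print(audits_needed(0.01, 0.95))
```

An attack large enough to swing an election has to touch a noticeable fraction of ballots, and the larger that fraction, the fewer audits are needed to catch it. The model breaks down if the machine can somehow recognize a tester, which is why the testers must behave like ordinary voters.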

ENIAC programmer event postponed

I’ve been informed that the ENIAC programmer talk featuring Jean Bartik, a member of the world’s first software team, has been postponed until sometime in January. I’ll update with more information when it is worked out. Donors can transfer their seat to the later event, get a refund, or give it as a donation as they wish.

Photograph your shelves to catalog your library

A lot of people want to catalog their extensive libraries, to be able to know what they have, to find books and even to join social sites which match you with people with similar book tastes, or even trade books with folks.

There are sites and programs to help you catalog your library, such as LibraryThing. You can do fast searches by typing in subsets of book titles. The most reliable quick way is to get a bar code scanner, like the free CueCats we were all given a decade ago, and scan the ISBN or UPC code. Several of these sites also support you taking a digital photograph of the UPC or ISBN barcode, which they will decode for you, but it's not as quick or reliable as an actual barcode scanner.

So I propose something far faster -- take a picture with a modern hi-res digital camera of your whole shelf. Light it well first, to avoid flash glare, perhaps by carrying a lamp in your hand. Colour is not that important. Take the shelves in a predictable order so picture number is a shelf number.

What you need next is some OCR of above average sophistication, since it has to deal with text in all sorts of changing fonts and sizes, some fine print and switching orientations. But it also has a simpler problem than most OCR packages because it has a database of known book titles, authors, publisher names and other tag phrases. And it even would have, after some time, a database of actual images of fully identified book spines taken by other users. There may be millions of books to consider but that's actually a much smaller space than most OCR has to deal with when it must consider arbitrary human sentences.

Even so, it won't do the OCR perfectly on many books. But that doesn't matter so much for some applications such as search for a book. Because if you want to know "Where's my copy of *The Internet Jokebook*" it only has to find the book whose text looks the most like that from a small set. It doesn't have to get all the letters right by any stretch. If it finds more than one match it can quickly show you them as images and you can figure it out right away.
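As a minimal sketch of that kind of forgiving search, the snippet below ranks noisy spine text against a query using the similarity ratio from Python's standard `difflib` module. The `find_book` function, the shelf/position keys and the sample OCR strings are all made up for illustration; a real system would match against a title database and spine images rather than raw strings.

```python
from difflib import SequenceMatcher


def find_book(query, spine_texts, top=3):
    """Return the locations whose (possibly garbled) OCR text best matches the query.

    spine_texts -- dict mapping a location, e.g. (shelf, position), to OCR output
    """
    scored = []
    for location, text in spine_texts.items():
        # ratio() is 1.0 for identical strings and falls toward 0 as they diverge.
        score = SequenceMatcher(None, query.lower(), text.lower()).ratio()
        scored.append((score, location))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [location for score, location in scored[:top]]


# Usage: the OCR got several letters wrong, but the right spine still ranks first.
spines = {
    ("shelf 1", 0): "Tne lnternet Jokeb0ok",
    ("shelf 1", 1): "Snow Crash Stephenson",
    ("shelf 2", 0): "Godel Escher Bach Hofstadter",
}
print(find_book("The Internet Jokebook", spines))
```

Returning the top few candidates rather than a single answer matches the idea of showing the user a handful of spine images to choose from.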

If you want a detailed catalog, you can also just get the system to list only the books it could not figure out, and you can use the other techniques to reliably identify them. The easiest is looking at the image on screen and typing the name, but it could also print out those images per shelf, and send you over to get the barcode. The right software could catalog your whole library in minutes.

This would also have useful commercial application in bookstores, especially used ones, in all sorts of libraries and on corporate bookshelves.

Of course, the photograph technique is actually worthwhile without the OCR. You can still peruse such photographs pretty easily, much more easily than going down to look at books in storage boxes. And, should your library be destroyed in a fire, it's a great thing to have for insurance and replacement purposes. And it's also easy to update. If you don't always re-shelve books in the same place (who does?) it is quick to re-photograph every so often, and software to figure out that one book moved from A to B is a much simpler challenge since it already has an image of the spine from before.
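Once spines have been matched between two photo sessions, working out what moved is simple bookkeeping. The sketch below assumes that matching has already been done and that each session is reduced to a mapping from some stable book identifier to a shelf number. Both that representation and the `shelf_changes` name are assumptions for illustration.

```python
def shelf_changes(before, after):
    """Compare two catalog snapshots, each mapping a book identifier to a shelf.

    Returns three dicts: books that moved (old shelf, new shelf),
    books that newly appeared, and books that are no longer seen.
    """
    moved = {
        book: (before[book], after[book])
        for book in before.keys() & after.keys()
        if before[book] != after[book]
    }
    added = {book: after[book] for book in after.keys() - before.keys()}
    removed = {book: before[book] for book in before.keys() - after.keys()}
    return moved, added, removed


# Usage: one book was re-shelved, one was bought, one was lent out.
old = {"The Internet Jokebook": 3, "Snow Crash": 1, "Neuromancer": 1}
new = {"The Internet Jokebook": 5, "Snow Crash": 1, "Cryptonomicon": 2}
moved, added, removed = shelf_changes(old, new)
print(moved, added, removed)
```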