Archives

The comment spammers are going manual, it seems

Some time ago I modified this blog software (Drupal) to ask a very simple question of people without accounts posting comments. It generally works very well at stopping robot posting. However, the volume of spam has been increasing, so I changed the question. Volume may have dropped a touch but I still got a bunch, which suggests the spammers are actually live humans, not robots.

It’s also possible that asking natural language questions (rather than captcha style entry of text from a graphic) has gotten common enough that spammers have modified their software so they can figure out the answer once and easily code it, but I don’t think this is the case.

What’s curious is that my comment form also clearly explains that any links in comments will be done with the rel=nofollow tag, which tells Google and other search engines not to treat the link as a valid one when ranking pages. This means that, other than readers of the blog clicking on the links, which should be very rare, these spams should be unproductive for the spammer. But they’re still doing them.
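For the curious, here is a minimal sketch of what marking a commenter's link as nofollow looks like. It is not Drupal's actual code; the function name `comment_link` is my own invention for illustration, and it only uses Python's standard `html.escape`.

```python
from html import escape

def comment_link(url, text):
    """Render a commenter-supplied link with rel="nofollow".

    Hypothetical helper, not Drupal's implementation. The rel attribute
    asks search engines not to count the link when ranking pages.
    """
    # Escape both values so the commenter cannot inject markup.
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(url, quote=True), escape(text)
    )

print(comment_link("http://example.com/?a=1&b=2", "my site"))
```

The link still works for a human who clicks it, which is why a few clicks from readers are the only payoff left for the spammer.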

The change however was prompted by a new breed of comment spam, where the spammers were copying other comments from inside large threads, but inserting their link on the author’s name. (This also uses rel=nofollow.) Indeed, such a technique does not automatically trigger my instincts to delete the spam, but they chose one of my own comments, so I recognized it. Right now my methods cut the spam enough that it is productive to manually delete what gets posted, though if the volume got high enough I would have to find other automated techniques.

(Drupal could of course help by having a much easier to use delete, including a ‘delete all from this IP address’ option.)

The VoIP world needs a pay-per-call E911 service

As most people in the VoIP world know, the FCC mandated that “interconnected” VoIP providers must provide E911 (which means 911 calling with transmission of your location) service to their customers. It is not optional; they can’t allow the customer to opt out to save money.

It sounds good on the surface: if there’s a phone there, you want to be able to reach emergency services with it.

The meaning of interconnected is still being debated. It was mostly aimed at the Vonages of the world. The current definition applies to service that has a phone-like device that can make and receive calls from the PSTN. Most people don’t think it applies to PBX phones in homes and offices, though that’s not explicit. It doesn’t apply to the Skype client on your PC, one hopes, but it could very well apply if you have a more phone-like device connecting to Skype, which offers Skype-in and Skype-out services on a pay-per-use basis and thus is interconnected with the PSTN.

Here’s the kicker. There are a variety of companies which will provide E911 connectivity services for VoIP companies. This means you pay them and they will provide a means for you to route your users’ calls to the right public safety answering point (PSAP), and pass along the address the user registered with the service. Seems like a fine business, but as far as I can tell, all these companies are charging per customer per month, with fees between $1 and $2 per month.

This puts a lot of constraints on the pricing models of VoIP services. There’s a lot of room for innovative business models that include offering limited or trial PSTN connection for free, or per-usage billing with no monthly fees. (All services I know of do the non-PSTN calling for free.) Or services that appear free but are supported by advertising or other means. You’ve seen that Skype decided to offer free PSTN services for all of 2006. AIM Phoneline offers a free number for incoming calls, as do many others.

Read on…

RSS aggregator to pull threads from multiple intertwined blogs

It’s common in the blogosphere for bloggers to comment on the posts of other bloggers. Sometimes blogs show trackbacks to let you see those comments with a posting. (I turned this off due to trackback spam.) In some cases we effectively get a thread, as might appear in a message board/email/USENET, but the individual components of the thread are all on the individual blogs.

So now we need an RSS aggregator to rebuild these posts into a thread one can see and navigate. It’s a little more complex than threading in USENET, because messages can have more than one parent (i.e. link to more than one post) and may not link directly at all. In addition, timestamps only give partial clues as to position in a thread, since many people read from aggregators and may not have read a message that was posted an hour ago in their “thread.”
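Here is a rough sketch of how an aggregator might rebuild such threads from links alone. Everything in it is an assumption for illustration: the `build_threads` name, and the idea that each post arrives as a dict with `url`, `time` and `links` keys. It treats links as undirected connections so a post with several parents still lands in one thread, and it only uses timestamps to order the result.

```python
from collections import defaultdict

def build_threads(posts):
    """Group posts into threads by following links between them.

    `posts` is a list of dicts with hypothetical keys:
      "url"   - permalink of the post
      "time"  - any sortable timestamp
      "links" - URLs the post links to (zero, one or many parents)
    Returns a list of threads, each a list of URLs in time order.
    """
    known = {p["url"] for p in posts}
    by_url = {p["url"]: p for p in posts}

    # Record a connection only when both ends are posts we know about.
    neighbors = defaultdict(set)
    for p in posts:
        for target in p["links"]:
            if target in known and target != p["url"]:
                neighbors[p["url"]].add(target)
                neighbors[target].add(p["url"])

    seen = set()
    threads = []
    for p in sorted(posts, key=lambda q: q["time"]):
        if p["url"] in seen or p["url"] not in neighbors:
            continue  # already threaded, or not linked to anything known
        # Collect everything reachable from this post.
        stack, members = [p["url"]], []
        seen.add(p["url"])
        while stack:
            url = stack.pop()
            members.append(by_url[url])
            for n in neighbors[url]:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        members.sort(key=lambda q: q["time"])
        threads.append([m["url"] for m in members])
    return threads
```

Posts that respond without linking at all would be missed by this approach; catching those would need something fuzzier, such as matching quoted text.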

At a minimum, existing aggregators (like bloglines) could spot sub-threads existing entirely among your subscribed feeds, and present those postings to you. You could also define feeds which are unsubscribed but which you wish to see or be informed of postings from in the event of a thread. (Or you might have a block-list of feeds you don’t want to see contributions from.) They could just have a little link saying, “There’s a thread including posts from other blogs on this message” which you could expand, and that would mark those items as read when you came to the other blog.

Blog search tools, like Technorati, could also spot these threads, and present a typical thread interface for perusing them. Both readers and bloggers would be interested in knowing how deep the threads go.

Please don't videoblog (vlog)

At the blogger panel at Fall VON (repurposed to be both video on the net as well as voice) Vlogger and blip.tv advocate Dina Kaplan asked bloggers to start vlogging. It’s started a minor debate.

My take? Please don’t.

I’ve written before on what I call the reader-friendly vs. writer-friendly dichotomy. My thesis is that media make choices about where to be on that spectrum, though ideal technology reduces the compromises. If you want to encourage participation, as in Wikis, you go for writer friendly. If you have one writer and a million readers, like the New York Times, you pay the writer to work hard to make it as reader friendly as possible.

When video is professionally produced and tightly edited, it can be reader (viewer) friendly. In particular if the video is indeed visual. Footage of tanks rolling into a town can convey powerful thoughts quickly.

But talking head audio and video has an immediate disadvantage. I can read material ten times faster than I can listen to it. At least with podcasts you can listen to them while jogging or moving where you can’t do anything else, but video has to be watched. If you’re just going to say your message, you’re putting quite a burden on me to force me to take 10 times as long to consume it — and usually not be able to search it, or quickly move around within it or scan it as I can with text.

So you must overcome that burden. And most videologs don’t. It’s not impossible to do, but it’s hard. Yes, video allows better expression of emotion. Yes, it lets me learn more about the person as well as the message. (Though that is often mostly for the ego of the presenter, not for me.)

Recording audio is easier than writing well. It’s writer friendly. Video has the same attribute if done at a basic level, though good video requires some serious work. Good audio requires real work too — there’s quite a difference between “This American Life” and a typical podcast.

Indeed, there is already so much pro quality audio out there like This American Life that I don’t have time to listen to the worthwhile stuff, which makes it harder to get my attention with ordinary podcasts. Ditto for video.

There is one potential technological answer to some of these questions. Anybody doing an audio or video cast should provide a transcript. That’s writer-unfriendly but very reader friendly. Let me decide how I want to consume it. Let me mix and match by clicking on the transcript and going right to the video snippet.

With the right tools, this could be easy for the vlogger to do. Vlogger/podcaster tools should all come with trained speech recognition software which can reliably transcribe the host, and with a little bit of work, even the guest. Then a little writer-work to clean up the transcript and add notes about things shown but not spoken. Now we have something truly friendly for the reader. In fact, speaker-independent speech recognition is starting to almost get good enough for this, but it’s still obviously the best solution to have the producer make the transcript. Even if the transcript is full of recognition errors, at least I can search it and quickly click to the good parts, or hear the mis-transcribed words.

If you’re making podcaster/vlogger tools, this is the direction to go. In addition, it’s absolutely the right thing for the hearing or vision impaired.

VAD (Video After Demand) instead of VoD

In an earlier blog post I attempted to distinguish TVoIP (TV over internet) from IPTV, a buzzword for cable/telco live video offerings. My goal was to explain that we can be very happy with TV, movies and video that come to us over the internet after some delay.

The two terms aren’t really very explanatory, so now I suggest VaD, for Video-after-Demand. Tivo and Netflix have taught us that people are quite satisfied if they pick their viewing choices in advance, and then later (sometimes weeks or months later) get the chance to view them. The key is that when they sit down to watch something, they have a nice selection of choices they actually want to see.

The video on demand dream is to give you complete live access to all the video in the world that’s available. Click it and watch it now. It’s a great dream, but it’s an expensive one. It needs fast links with dedicated bandwidth. If your movie viewing is using 4 of your 6 megabits, somebody else in the house can’t use those megabits for web surfing or other interactive needs.

With VaD you don’t need much in your link. In fact, you can download shows that you don’t have the ability to watch live at all, or get them at higher quality. You just have to wait. Not staring at a download bar, of course, nobody likes that, but wait until a later watching session, just as you do when you pick programs to record on a PVR like the Tivo.
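To put rough numbers on the waiting, here is a small back-of-envelope sketch. The function names and the example bitrates are my own assumptions, not measurements; the arithmetic is just size divided by speed.

```python
def can_stream_live(video_mbps, link_mbps):
    """True if the link is fast enough to watch the video as it arrives."""
    return video_mbps <= link_mbps

def download_hours(duration_minutes, video_mbps, link_mbps):
    """Hours needed to fetch a show in the background.

    duration_minutes - running time of the show
    video_mbps       - bitrate the show is encoded at (megabits/second)
    link_mbps        - bandwidth you are willing to give the download
    """
    size_megabits = duration_minutes * 60 * video_mbps
    return size_megabits / link_mbps / 3600

# A 2-hour movie encoded at 4 megabits, trickling in on 1 spare megabit.
print(download_hours(120, 4, 1))   # 8.0 hours, i.e. overnight

# A higher quality 8 megabit encode cannot be watched live on a
# 6 megabit link, but it can still be downloaded for later.
print(can_stream_live(8, 6))       # False
print(download_hours(120, 8, 6))
```

In other words, a movie that would hog most of the link if watched live can arrive overnight on a fraction of it, and a version too big to stream at all is still only a few hours away.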

I said these things before, but the VaD vision is remarkably satisfying and costs vastly less, both to the consumer, and those building out the networks. It can be combined with IP multicasting (someday) to even be tremendously efficient. (Multicasting can be used for streaming but if packets are lost you have only a limited time to recover them based on how big your buffer is.)

Trade show giveaway: Toothpaste

Trade show booths are always searching for branded items to hand out to prospects. Until they fix the airport bans, how about putting your brand on a tube of toothpaste and/or other travel liquids now banned from carry-on bags?

(Yeah, most hotels will now give you these, but it’s the thought that counts and this one would be remembered longer than most T-shirts.)

Medical adhesive that sticks to skin, but not hair?

As a hirsute individual, I beg the world’s makers of medical tapes and band-aids to work on an adhesive that is decent at sticking to skin, but does not stick well to hair.

Not being versed in the adhesive chemistries of these things, I don’t know how difficult this is, but if one can be found, many people would thank you.

Failing that, the next best thing would be an adhesive with a simple non-toxic solvent that unbinds it, which could be swabbed on while slowly undoing the tape.

Some early panoramas of the burn itself

While it will be a while before I get the time to build all my panoramas of this year’s Burning Man, I did do some quick versions of some of those I shot of the burn itself. This year, I arranged to be on a cherry picker above the burn. I wish I had spent more time actually looking at the spectacle, but I wanted to capture panoramas of Burning Man’s climactic moment. The entire city gathers, along with all the art cars for one shared experience. A large chunk of the experience is the mood and the sound which I can’t capture in a photo, but I can try to capture the scope.

This thumbnail shows the man going up, shooting fireworks and most of the crowd around him. I will later rebuild it from the raw files for the best quality.

Shooting panoramas at night is always hard. You want time exposures, but if any exposure goes wrong (such as vibration) the whole panorama can be ruined by a blurry frame in the middle. On a boomlift, if anybody moves — and the other photographer was always adjusting his body for different angles — a time exposure won’t be possible. It’s also cramped and if you drop something (as I did my clamp knob near the end) you won’t get it back for a while. In addition, you can’t have everybody else duck every time you do a sweep without really annoying them, and if you do you have to wait a while for things to stabilize.

It was also an interesting experience riding to the burn with DPW, the group of staff and volunteers who do city infrastructure. They do work hard, in rough conditions, but it gives them an attitude that crosses the line some of the time regarding the other participants. When we came to each parked cherry picker, people had leaned bikes against them, and in one case locked a bike on one. Though we would not actually move the bases, the crew quickly grabbed all the bikes and tossed them on top of one another, tangling pedal in spoke, probably damaging some and certainly making some hard to find. The locked bike had its lock smashed quickly with a mallet. Now the people who put their bikes on the pickers weren’t thinking very well, I agree, and the DPW crew did have to get us around quickly but I couldn’t help but cringe with guilt at being part of the cause of this, especially when we didn’t move the pickers. (Though I understand safety concerns of needing to be able to.)

Anyway, things “picked up” quickly and the view was indeed spectacular. Tune in later for more and better pictures, and in the meantime you can see the first set of trial burn panoramas for a view of the burn you haven’t seen.

Better handling of reading news/blogs after being away

I’m back from Burning Man (and Worldcon), and though we had a decently successful internet connection there this time, you don’t want to spend time at Burning Man reading the web. This presents an instance of one of the oldest problems in the “serial” part of the online world: how do you deal with the huge backup of stuff to read from tools that expect you to read regularly?

You get backlogs of your E-mail of course, and your mailing lists. You get them for mainstream news, and for blogs. For your newsgroups and other things. I’ve faced this problem for almost 25 years as the net gave me more and more things I read on a very regular basis.

When I was running ClariNet, my long-term goal list always included a system that would attempt to judge the importance of a story as well as its topic areas. I had two goals in mind for this. First, you could tune how much news you wanted about a particular topic in ordinary reading. By setting how important each topic was to you, a dot-product of your own priorities and the importance ratings of the stories would bring to the top the news most important to you. Secondly, the system would know how long it had been since you last read news, and could dial down the volume to show you only the most important items from the time you were away. News could also simply be presented in an importance order and you could read until you got bored.
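The dot-product idea is simple enough to sketch. This was never built at ClariNet as far as this post says, so treat everything here as illustrative assumptions: the `score` and `catch_up` names, the per-topic rating dicts, and the fixed reading `budget` standing in for "dialing down the volume" after time away.

```python
def score(story_importance, reader_priorities):
    """Dot product of a story's per-topic importance ratings and the
    reader's per-topic priorities. Topics the reader has not rated
    count as zero."""
    return sum(
        weight * reader_priorities.get(topic, 0.0)
        for topic, weight in story_importance.items()
    )

def catch_up(stories, reader_priorities, budget=3):
    """Return the titles of the `budget` highest-scoring stories.

    Keeping the budget fixed means the longer you were away (the
    bigger the backlog), the smaller the fraction you are shown.
    """
    ranked = sorted(
        stories,
        key=lambda s: score(s["importance"], reader_priorities),
        reverse=True,
    )
    return [s["title"] for s in ranked[:budget]]

# Made-up backlog and reader profile, purely for illustration.
backlog = [
    {"title": "Election result", "importance": {"politics": 0.9}},
    {"title": "New VoIP ruling", "importance": {"telecom": 0.6, "politics": 0.3}},
    {"title": "Celebrity gossip", "importance": {"entertainment": 0.8}},
]
me = {"telecom": 1.0, "politics": 0.5}

print(catch_up(backlog, me, budget=2))
# ['New VoIP ruling', 'Election result']
```

Dropping the budget and reading the whole ranked list from the top gives the "read until you got bored" mode.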

There are options to do this for non-news, where professional editors would rank stories. One advantage you get when items (be they blog posts or news) get old is you have the chance to gather data on reading habits. You can tell which stories are most clicked on (though not as easily with full RSS feeds) and also which items get the most comments. Asking users to rate items is usually not very productive. Some of these techniques (like using web bugs to track readership) could be privacy invading, but they could be done through random sampling.

I propose, however, that one way or another popular, high-volume sites will need to find some way to prioritize their items for people who have been away a long time and regularly update these figures in their RSS feed or other database, so that readers can have something to do when they notice there are hundreds or even thousands of stories to read. This can include sorting using such data, or in the absence of it, just switching to headlines.

It’s also possible for an independent service to help here. Already several toolbars like Alexa and Google’s track net ratings, and get measurements of net traffic to help identify the most popular sites and pages on the web. They could adapt this information to give you a way to get a handle on the most important items you missed while away for a long period.

For E-mail, there is less hope. There have been efforts to prioritize non-list e-mail, mostly around spam, but people feel that any real mail actually sent to them has to be read, even if there are 1,000 messages waiting, as there can be after two weeks away.